Potentially raising the bar on SQL scalability, Facebook has released as open source Presto, a SQL query engine it developed to work with petabyte-sized data warehouses.
Currently, more than 1,000 Facebook employees use Presto daily to run 30,000 interactive queries, involving over a petabyte of processing, according to a post authored by Facebook software engineer Martin Traverso. The company has scaled the software to run on a 1,000-node cluster.
Now, Facebook wants other data-driven organizations to use and, it hopes, refine Presto. The company has posted the software’s source code and is encouraging contributions from other parties. The software is already being tested by a number of other large Internet services, namely AirBnB and Dropbox.