Barak Regev is head of cloud EMEA and big data guru at Google UK
As we approach the ten-year anniversary of the Google File System whitepaper, the catalyst for MapReduce, the fundamental technology underlying Apache Hadoop, Business Cloud News sat down with Barak Regev, Google's head of cloud EMEA and big data guru, to discuss the evolution of big data, its intertwining with cloud, and how it will force enterprises to shift their business intelligence strategies.
While many businesses have only recently been swept up in the big data frenzy engulfing the commercial world, Google can legitimately claim its place as a key enabler of the revolution in data storage and processing that led to it.
“As an organisation we handle big data daily, and that’s what led to the Google File System whitepaper, which led to MapReduce,” Regev says. Indeed, it would be fair to say that many organisations developing big data solutions have been inspired by technologies or approaches outlined by Google at one point or another.
The Google File System whitepaper was released in the fall of 2003 and outlined the fundamentals of what would become the company’s distributed file system, in which files are split into fixed-size chunks and stored redundantly across multiple servers.
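The chunk-and-replicate idea can be illustrated with a minimal sketch. This is purely illustrative, not GFS itself: the chunk size, replication factor, and round-robin placement below are arbitrary choices for the example (GFS used much larger chunks and a master-directed placement scheme).

```python
# Illustrative sketch only: split data into fixed-size chunks and assign
# each chunk to several "servers" for redundancy. The chunk size and
# replication factor here are arbitrary, not GFS's actual values.
def split_into_chunks(data: bytes, chunk_size: int):
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def assign_replicas(num_chunks: int, servers: list, replication: int = 3):
    # Simple round-robin placement: each chunk lives on `replication`
    # distinct servers, so losing one server loses no data.
    placement = {}
    for chunk_id in range(num_chunks):
        placement[chunk_id] = [servers[(chunk_id + r) % len(servers)]
                               for r in range(replication)]
    return placement

chunks = split_into_chunks(b"hello distributed world", chunk_size=8)
placement = assign_replicas(len(chunks), ["s1", "s2", "s3", "s4"])
```

The point of the design is that no single machine holds a whole file, and any chunk survives the loss of a server because copies live elsewhere.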
In early 2004, two now-famous Google engineers, Jeffrey Dean and Sanjay Ghemawat, released a paper outlining MapReduce, a programming model for processing and generating large data sets that would become the cornerstone of Apache Hadoop, a framework now synonymous with big data. In 2010 the company released the Dremel whitepaper (named after the popular American power-tool brand), describing Google’s model for an interactive, low-latency, ad-hoc query system that can handle petabytes of data and scale to thousands of CPUs, and the model used for key
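The MapReduce programming model described above boils down to three phases: a map function emits key/value pairs, a shuffle groups pairs by key, and a reduce function aggregates each group. A minimal single-machine sketch of the classic word-count example follows; this is an illustration of the model only, not Google's or Hadoop's distributed implementation, and the function names are our own.

```python
# Minimal single-process sketch of the MapReduce programming model.
from collections import defaultdict

def map_words(document: str):
    # Map phase: emit a (word, 1) pair for every word in the document.
    for word in document.split():
        yield (word, 1)

def shuffle(pairs):
    # Shuffle phase: group all values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(groups):
    # Reduce phase: aggregate each key's values into a single count.
    return {key: sum(values) for key, values in groups.items()}

docs = ["big data big cloud", "cloud data"]
pairs = [pair for doc in docs for pair in map_words(doc)]
counts = reduce_counts(shuffle(pairs))
```

Because the map calls are independent per document and the reduce calls are independent per key, a real system can run both phases across thousands of machines, which is the property that made the model the foundation of Hadoop.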