In the last few years we have seen Big Data generate a lot of buzz along with the launch of several successful big data products. The big data ecosystem has now reached a tipping point where the basic infrastructural capabilities for supporting big data challenges and opportunities are easily available. Now we are entering what I would call the next generation of big data — big data 2.0 — where the focus is on three key areas:
Data is growing at an exponential rate, and the ability to analyze it faster is more important than ever. Almost every big data vendor is coming out with product offerings, like in-memory processing to process data faster. Hadoop also launched its new release, Hadoop 2.0 / YARN, which can process data in near real-time. Another big data technology gaining traction is Apache Spark, which can run 100 times faster than Hadoop. Leading Silicon Valley venture capital firm Andreessen Horowitz led a $14 million investment to start a company named Databricks around Apache Spark.