Faster, more capable: What Apache Spark brings to Hadoop

Apache Spark is an execution engine that broadens the types of computing workloads Hadoop can handle, while also improving the performance of the big data framework.

Hadoop specialist Cloudera recently announced that it will offer commercial support for Apache Spark, which is available as part of Cloudera’s Hadoop-powered Enterprise Data Hub. But why should businesses care about Spark?

Apache Spark has numerous advantages over Hadoop’s MapReduce execution engine, in both the speed with which it carries out batch processing jobs and the wider range of computing workloads it can handle.

Spark can execute batch-processing jobs 10 to 100 times faster than the MapReduce engine, according to Cloudera, primarily by reducing the number of writes to and reads from disc.

“You have map and reduce tasks and after that there’s a synchronisation barrier and you persist all of the data to disc,” said Mark Grover, Hadoop engineer for Cloudera.
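To make the point about persisting data between steps concrete, here is a toy sketch in plain Python. It uses neither the Spark nor the Hadoop API, and every function name in it is hypothetical, chosen only for illustration. It runs the same two-stage word count twice: once writing the intermediate result to disc and reading it back, as Grover describes for MapReduce, and once handing the intermediate result straight to the next stage in memory, which is roughly the saving attributed to Spark above.

```python
import json
import os
import tempfile
from collections import Counter


def count_words(lines):
    """Stage 1: count how often each word appears."""
    return Counter(word for line in lines for word in line.split())


def keep_frequent(counts, minimum):
    """Stage 2: keep only the words seen at least `minimum` times."""
    return {word: n for word, n in counts.items() if n >= minimum}


def run_with_disk_barrier(lines, minimum, workdir):
    """Persist the stage 1 output to disc, then read it back for stage 2."""
    path = os.path.join(workdir, "stage1.json")
    with open(path, "w") as f:
        json.dump(count_words(lines), f)  # intermediate write
    with open(path) as f:
        counts = json.load(f)  # intermediate read
    return keep_frequent(counts, minimum), 1  # one disc round trip


def run_in_memory(lines, minimum):
    """Pass the stage 1 output directly to stage 2 without touching disc."""
    return keep_frequent(count_words(lines), minimum), 0  # no disc round trip


lines = ["spark hadoop spark", "hadoop spark"]

with tempfile.TemporaryDirectory() as workdir:
    disk_result, disk_round_trips = run_with_disk_barrier(lines, 2, workdir)

memory_result, memory_round_trips = run_in_memory(lines, 2)
```

Both runs produce the same answer; only the number of intermediate trips to disc differs. A real job chains many such stages over far larger data, so this sketch shows the mechanism only and says nothing about the size of the speed-up.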
