Hadoop vendor Cloudera is singing the praises of its own SQL query engine, releasing on Monday the results of a benchmark that shows how Cloudera Impala compares to Apache Hive and a mystery proprietary database. As one might expect, Impala easily bested its competitors in the benchmarks (no vendor has ever, to my knowledge, released results highlighting its product’s inferiority), but Hive and SQL databases probably aren’t Impala’s real rivals.
Its more direct competition comes from other Hadoop vendors doing their own things to try to make Hadoop queries faster and more interactive. That’s because the choice right now isn’t to Hadoop or not to Hadoop; it’s which flavor of Hadoop to do. Companies that are using Hive are already using Hadoop, so that decision has been made. And even Cloudera, unless its stance has shifted drastically, acknowledges that Impala isn’t yet a replacement for a purpose-built data warehouse or relational database system.