10 reasons why Hadoop is NOT the best solution for all purposes

10th Dec `14, 01:09 PM in Hadoop

BDMS
Guest Contributor
 

Hadoop has become the backbone of many applications, and Big Data can hardly be imagined without it. Hadoop offers distributed storage, scalability and high performance, and it is widely considered the standard platform for high-volume data infrastructures. Still, there are several reasons why Hadoop is not always the best solution. Let’s discuss ten disadvantages of Hadoop here:

1. Pig vs. Hive:

Hive UDFs cannot be used in Pig, and Pig UDFs cannot be used in Hive either. HCatalog is required to access Hive tables from Pig. So when a little extra functionality is needed in Hive, writing a Pig script to get it is seldom the preferred route.

2. Security concerns:

Managing a complex application on Hadoop is a huge challenge, and security is a prime example. Hadoop’s security model is hard to recommend: it is complex, and it is disabled by default. Data is also at real risk because encryption at the storage and network levels is not switched on out of the box, and unencrypted data can be compromised easily.
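
One way to see whether a cluster is still running with security switched off is to inspect its core-site.xml. The sketch below is a minimal illustration in Python, not a full audit. It assumes the stock defaults from core-default.xml (hadoop.security.authentication = simple, hadoop.security.authorization = false, hadoop.rpc.protection = authentication). The sample configuration, the host name and the function names are invented for this example, and settings kept in hdfs-site.xml are not covered.

```python
import xml.etree.ElementTree as ET

# Assumed stock defaults from core-default.xml; a property missing from
# the site file falls back to these values.
DEFAULTS = {
    "hadoop.security.authentication": "simple",
    "hadoop.security.authorization": "false",
    "hadoop.rpc.protection": "authentication",
}


def read_properties(xml_text):
    """Return the security-related properties, with defaults filled in."""
    root = ET.fromstring(xml_text)
    props = dict(DEFAULTS)
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name in props and value is not None:
            props[name] = value.strip()
    return props


def security_findings(props):
    """List the security features that are effectively switched off."""
    findings = []
    if props["hadoop.security.authentication"].lower() != "kerberos":
        findings.append("authentication is not Kerberos, so identities are not verified")
    if props["hadoop.security.authorization"].lower() != "true":
        findings.append("service-level authorization is off")
    # Only the 'privacy' level encrypts RPC traffic on the wire.
    levels = [v.strip() for v in props["hadoop.rpc.protection"].lower().split(",")]
    if "privacy" not in levels:
        findings.append("RPC traffic is not encrypted")
    return findings


# Hypothetical site file that sets nothing security-related.
SAMPLE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example:8020</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for finding in security_findings(read_properties(SAMPLE)):
        print("WARNING:", finding)
```

A site file that leaves these properties untouched triggers every warning, which is the out-of-the-box situation described above.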

3. Big Data cravings:

Hadoop is most in demand when a business is built on a big data set. But before adopting Hadoop, you need answers to a few questions: How many terabytes of data do you have? Do you have a steady, large flow of incoming data? And how much of that data will actually be operated on?
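
These questions can be turned into rough back-of-the-envelope arithmetic. The Python sketch below is only an illustration: the function name and every input figure are hypothetical, 1 TB is taken as 1000 GB, and the replication factor of 3 is the usual HDFS default (dfs.replication).

```python
def size_estimate(current_tb, daily_ingest_gb, days, active_fraction, replication=3):
    """Rough sizing: projected raw data, replicated footprint, and working set."""
    # Data on hand today plus what arrives over the planning horizon.
    projected_tb = current_tb + daily_ingest_gb * days / 1000
    return {
        "projected_tb": projected_tb,
        # HDFS stores each block 'replication' times.
        "hdfs_footprint_tb": projected_tb * replication,
        # The share of the data that jobs will actually touch.
        "working_set_tb": projected_tb * active_fraction,
    }


if __name__ == "__main__":
    # Hypothetical figures: 2 TB today, 5 GB per day, one year, 10% actively used.
    estimate = size_estimate(2.0, 5.0, 365, 0.1)
    for key, value in estimate.items():
        print(f"{key}: {value:.2f}")
```

If the working set comes out at a fraction of a terabyte, that is a hint that the data volume alone may not justify a Hadoop cluster.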

4. Shared libraries forcefully stored in HDFS:

Hadoop keeps repeating this issue. If a Pig script is stored in HDFS, it is assumed that its JAR files will be there too. The same theme recurs in Oozie and other tools. Storing shared libraries in HDFS is not such a bad idea in itself, but doing it across a huge organisation is painful.
