5 controversies and debates around Big Data

21st Aug '14, 10:44 PM in Resources

Nidhish Alex Contributor

Big Data is growing in popularity, and with popularity comes controversy. In this post, we summarize five of the most common controversies and debates around Big Data.

1. There is no need to distinguish Big Data analytics from traditional data analytics. Are they not just two different names for the same approach to solving a data problem? Read Traditional vs Big Data Analytics and How ‘Big Data’ Is Different to learn more about this debate.

2. Hadoop is not always the best tool. There are occasions when Hadoop is not sufficient for an organization's big data needs: when crunching real-time data, when the organization has multiple data sources and business processes that cannot fit into a single data infrastructure, and when the organization lacks the talent pool for complex tools such as MapReduce. This article discusses why organizations should think beyond the Hadoop hype.

3. The size of the data does not matter, because in real-time analytics the data keeps changing. What matters more is its recency. Big data tooling may not be the right choice for crunching real-time data, because Hadoop-like tools process data in batches. When the data varies over time, the total quantity of data matters only insofar as it ties in with time. What actually matters is the sample size at a particular instant (which is itself a dimension), not the overall sample size. Here is an interesting read on the role of time in real-time analytics: Time Means Everything In Programmatic Display.

4. Bigger data are not always better. Data sufficiency plays a critical role when we slice samples across different dimensions. The quality of the data being crunched determines the quality of the insights. If the signal-to-noise ratio is low, results computed from dirty samples can be inaccurate. Read more.

5. Big Data involves an ethical issue: privacy. In the era of big data, the debate between privacy and personalization will be ongoing. Big Data is a big deal today, but conforming to privacy guidelines is equally important. It is unethical to analyse people's data without their being aware of it. An in-depth study that weighs the rewards of big data against its privacy risks, published in the Stanford Law Review, is here for you to read.
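
To make point 3 a little more concrete, here is a minimal Python sketch of the idea that the sample size at a given instant can matter more than the overall sample size. Everything in it is an assumption made for illustration: the `SlidingWindow` class, the 10-second window, and the made-up `stream` of (timestamp, value) pairs do not come from any of the articles linked above.

```python
from collections import deque


class SlidingWindow:
    """Keeps only the events seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp, value):
        # Assumes events arrive in non-decreasing timestamp order.
        self.events.append((timestamp, value))
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events that have fallen out of the window relative to `now`.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            self.events.popleft()

    def sample_size(self):
        return len(self.events)

    def mean(self):
        if not self.events:
            return None
        return sum(value for _, value in self.events) / len(self.events)


# Hypothetical stream: (timestamp in seconds, observed value).
stream = [(0, 10.0), (5, 12.0), (20, 11.0), (58, 30.0), (61, 32.0), (65, 31.0)]

window = SlidingWindow(window_seconds=10)
for ts, value in stream:
    window.add(ts, value)

total_seen = len(stream)
overall_mean = sum(value for _, value in stream) / total_seen

print("events seen overall:", total_seen, "overall mean:", overall_mean)
print("events in current window:", window.sample_size(), "window mean:", window.mean())
```

In this toy stream only the most recent events remain in the window, and their mean differs noticeably from the mean over everything seen so far. A batch job over the full history would report the overall figure, which is one way to read the claim that recency can outweigh raw volume.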