Top big data tools used to store and analyse data


Big data is a term for collections of data sets so large and complex that they are difficult to process with traditional applications and tools, typically data running to terabytes and beyond. Because of the variety of data it encompasses, big data brings challenges relating to both volume and complexity. A recent survey says that 80% of the data created in the world is unstructured. One challenge is how to structure this unstructured data before we attempt to understand and capture the most important parts of it. Another challenge is how to store it. Here are the top tools used to store and analyse big data. They fall into two categories: storage and querying/analysis.
1. Apache Hadoop
Apache Hadoop is a free, Java-based software framework that can effectively store large amounts of data in a cluster. The framework runs in parallel across the cluster and lets us process data on all nodes. The Hadoop Distributed File System (HDFS) is Hadoop's storage system; it splits big data into blocks and distributes them across many nodes in the cluster. It also replicates the data within the cluster, which provides high availability.
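The split-and-replicate idea can be sketched in a few lines of Python. This is only an illustration, not how HDFS is implemented: the function names, the node names and the round-robin placement are assumptions made for the example (real HDFS uses much larger blocks, 128 MB by default in recent versions, and a rack-aware placement policy).

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int) -> dict[int, list[str]]:
    """Assign every block to `replication` distinct nodes.

    Simple round-robin placement, assumed here for illustration only.
    """
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def readable_after_failure(placement: dict[int, list[str]], failed_node: str) -> bool:
    """True if every block still has at least one replica on a live node."""
    return all(any(n != failed_node for n in holders) for holders in placement.values())


data = b"x" * 1000                      # stand-in for a large file
blocks = split_into_blocks(data, block_size=256)
nodes = ["node1", "node2", "node3", "node4"]   # hypothetical node names
placement = place_replicas(len(blocks), nodes, replication=3)

for block_id, holders in placement.items():
    print(block_id, len(blocks[block_id]), holders)
```

Because each block lives on three different nodes in this sketch, losing any single node still leaves every block readable, which is the availability property described above.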
2. Microsoft HDInsight
HDInsight is a big data solution from Microsoft, powered by Apache Hadoop and available as a service in the cloud. It uses Windows Azure Blob storage as its default file system, and it provides high availability at low cost.
3. NoSQL
While traditional SQL databases handle large amounts of structured data effectively, we need NoSQL (Not Only SQL) to handle unstructured data. NoSQL databases store unstructured data with no fixed schema: each row can have its own set of columns. NoSQL databases often give better performance when storing massive amounts of data. Many open-source NoSQL databases are available for analysing big data.
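As a rough picture of what "no fixed schema" means, the sketch below models records as plain Python dictionaries, each with its own set of fields. The records, field names and the `find` helper are made up for the example and do not correspond to any particular NoSQL product's API.

```python
# Three records in the same "collection", each with a different shape.
records = [
    {"id": 1, "name": "Asha", "email": "asha@example.com"},
    {"id": 2, "name": "Ben", "tags": ["admin", "beta"]},
    {"id": 3, "sensor": {"temp_c": 21.5}},
]


def find(collection, field):
    """Return the records that contain the given field (hypothetical helper)."""
    return [record for record in collection if field in record]


# The set of fields is discovered from the data rather than declared up front.
all_fields = sorted({key for record in records for key in record})

print(all_fields)
print([record["id"] for record in find(records, "name")])
```

A relational table would need every row to fit one declared set of columns; here a new field can appear on a single record without touching the others.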
4. Hive
Hive is a distributed data management layer for Hadoop. It supports a SQL-like query language, HiveQL (HQL), for accessing big data, and is used primarily for data mining. It runs on top of Hadoop.
5. Sqoop
Sqoop is a tool that connects Hadoop with various relational databases to transfer data. It can be used effectively to transfer structured data into Hadoop or Hive.
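The kind of transfer Sqoop performs, reading rows from a relational table and writing them out as delimited text that Hadoop can store, can be imitated on a small scale. The sketch below is not Sqoop: it uses an in-memory SQLite database as a stand-in for the relational source and an in-memory buffer as a stand-in for a file in HDFS, and the `orders` table and `export_table` function are invented for the example.

```python
import csv
import io
import sqlite3


def export_table(conn, table, out):
    """Write every row of `table` to `out` as comma-delimited text.

    `table` is interpolated into the SQL, so it must be a trusted name.
    Returns the number of rows written.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


# Stand-in relational source with a small, made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 19.99), (2, "bob", 5.0)],
)

buffer = io.StringIO()          # stand-in for a file in the cluster
rows_written = export_table(conn, "orders", buffer)
print(rows_written)
print(buffer.getvalue())
```

Sqoop itself runs the transfer in parallel across the cluster; this single-process version only shows the shape of the data movement.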
6. PolyBase
PolyBase works on top of SQL Server 2012 Parallel Data Warehouse (PDW) and is used to access data stored in PDW. PDW is a data warehousing appliance built for processing any volume of relational data; it integrates with Hadoop, allowing us to access non-relational data as well.
7. Big data in Excel
Many people are comfortable doing analysis in Excel, the popular tool from Microsoft, and Excel 2013 can connect to data stored in Hadoop. Hortonworks, which focuses on providing enterprise Apache Hadoop, offers a way to access big data stored in its Hadoop platform from Excel 2013. You can use the Power View feature of Excel 2013 to summarise the data easily.
Similarly, Microsoft's HDInsight lets us connect to big data stored in the Azure cloud using Power Query.
8. Presto
Facebook developed and recently open-sourced Presto, its SQL-on-Hadoop query engine, which is built to handle petabytes of data. Unlike Hive, Presto does not depend on MapReduce and can retrieve data quickly.
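For readers unfamiliar with the MapReduce technique mentioned here, a single-process word count shows its three stages: map, shuffle and reduce. This is a toy illustration with invented function names, not Hadoop's API. On a real cluster each stage runs as a distributed batch job, which helps explain why engines that avoid it can answer interactive queries faster.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}


lines = ["big data tools", "big data storage"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)
```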
