How to create a real Hadoop cluster in 10 minutes?

We recently launched demo.gethue.com, which in one click lets you try out a real Hadoop cluster. We followed the exact same process as building a production ready cluster. Here is how we did it.

Before getting started, you will need to get your hands on some machines. Hadoop runs on commodity hardware, so any regular computer with a major linux distribution will work. To follow along with the demo, take a look at Amazon Cloud Computing service. If you already have a server or two, or don’t mind running Hadoop on your local linux box, then go straight to Machine Setup!

Here is a video demoing how easy it is to boot your own cluster and start crunching data!

We picked AWS and started 4 m3.large instances with Ubuntu 12.04 and 100 GB storage (instead of the default 8GB). If you need less performance, one xlarge instance is enough or you can install less services on an even smaller instance.

Then configure the security group like below. We allow everything between the instances (the first row, don’t forget it on multi machine cluster!) and open up Cloudera Manager and Hue ports to the outside.

Leave a Comment

Your email address will not be published.

You may also like

Crayon Yoda

Pin It on Pinterest