
Map and Reduce in Python without Hadoop

15th Jan '14, 11:44 AM in Hadoop


BDMS
Guest Contributor
 

MapReduce is not a new programming model, but Google’s paper on MapReduce made it popular. A map is usually used for transformation, while a reduce (or fold) is used for aggregation. Both are built-in primitives in functional programming languages such as Lisp and ML. More about the functional programming roots of the MapReduce paradigm can be found in Section 2.1 of Data-Intensive Text Processing with MapReduce.
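To make the transformation/aggregation split concrete, here is a minimal sketch. It is not the program from the original post; the word list and the variable names are assumptions chosen for illustration. It imports reduce from functools so that the same code runs on both Python 2.6+ and Python 3.

```python
# reduce is a built-in in Python 2; importing it from functools
# works on Python 2.6+ and is required on Python 3.
from functools import reduce

# Hypothetical input data, chosen only for illustration.
words = ["map", "reduce", "hadoop"]

# map: transform each element independently (word -> its length).
# list() is needed on Python 3, where map returns a lazy iterator.
lengths = list(map(len, words))

# reduce: fold the transformed values into a single aggregate,
# starting from an initial accumulator of 0.
total = reduce(lambda acc, n: acc + n, lengths, 0)

print(lengths)
print(total)
```

The map step never looks at more than one element at a time, which is what lets frameworks such as Hadoop spread it across machines; the reduce step is where the per-element results are combined.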

Below is a simple Python 2 program using the map/reduce functions. In Python 2, map and reduce are functions in the __builtin__ module. More about functional programming in Python here. For those using Python 3, the reduce function has been removed from the built-in functions. According to the Python 3.1 release notes:
