MapReduce is a programming model originally introduced by Google for processing and generating large data sets on clusters of computers. It serves two essential functions: first, it parcels out work to the various nodes within the cluster (the map step); second, it organizes and reduces the results from each node into a cohesive answer to a query (the reduce step). The primary objective of MapReduce is to split the input data set into independent chunks that are processed in a completely parallel manner. The Hadoop MapReduce framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically, both the input and the output of the job are stored in a file system. The MapReduce framework is beneficial because library routines can be used to create parallel programs without any worries about intra-cluster communication, task monitoring, or failure handling.
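The map, shuffle/sort, and reduce phases described above can be illustrated with a minimal single-process word-count sketch in Python. This is only an illustration of the programming model, not of Hadoop's actual API; the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample input are assumptions chosen for clarity.

```python
from collections import defaultdict

def map_phase(documents):
    # Map: each input chunk independently emits (word, 1) pairs
    for doc in documents:
        for word in doc.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle/sort: group all emitted values by key, as the framework
    # does between the map and reduce phases
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: combine the grouped values into one result per key
    return {word: sum(counts) for word, counts in groups.items()}

# Each string stands in for an independent input chunk
documents = ["the map step", "the reduce step"]
result = reduce_phase(shuffle(map_phase(documents)))
print(result)  # → {'the': 2, 'map': 1, 'step': 2, 'reduce': 1}
```

In a real cluster, `map_phase` would run in parallel on many nodes, and the framework (not user code) would perform the shuffle and handle node failures, which is precisely the convenience the model provides.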