Optimization of Hadoop MapReduce Model in Cloud Computing Environment
In recent years, data analysis has become one of the trending topics among researchers. Information is now the baseline on which organizations grow faster and bigger. Relevant information reveals the likes and dislikes of customers, and obtaining it requires analyzing huge volumes of data stored in various formats. Hadoop consists of two basic components, the Hadoop Distributed File System (HDFS) and MapReduce: HDFS is used for storing huge amounts of data, whereas MapReduce is used for processing it. Hadoop MapReduce is one of the best platforms for processing huge data sets efficiently, for example web log data. In this paper, we propose an optimized HPMR (Hadoop MapReduce) model, which maximizes memory utilization for each task and balances performance between the I/O system and the CPUs. Like any other Hadoop MapReduce model, HPMR consists of three phases (map, shuffle, and reduce); however, HPMR optimizes all three. To optimize the memory model, HPMR adopts a dynamic approach, and input/output optimization is achieved through a dual operation. To evaluate the performance of our model, we ran the Word-Count application on Wikipedia data of size 128 MB, 256 MB, 512 MB, 1 GB, and 2 GB. The comparative analysis shows that our model performs nearly 30% better than the existing one.
Optimized HPMR (Hadoop MapReduce) model, Hadoop MapReduce, HDFS (Hadoop Distributed File System).
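For readers unfamiliar with the three phases named in the abstract, the following is a minimal single-process sketch of Word-Count expressed as map, shuffle, and reduce steps. It is plain Python, not Hadoop code, and it does not implement the HPMR optimizations; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key (the word)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases (illustrative only, runs in one process)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["Hadoop map reduce", "map map"]
    print(word_count(sample))
```

In an actual Hadoop job the map and reduce steps run as distributed tasks and the shuffle moves intermediate pairs between nodes; the sketch only shows the data flow between the phases.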