MEST: A Model-Driven Efficient Searching Approach for MapReduce Self-Tuning
Hadoop is the most popular implementation of the MapReduce programming model, and it exposes a number of performance-critical configuration parameters. Manually setting these parameters to their optimal values requires in-depth knowledge of both Hadoop and the job itself, as well as a large amount of time and effort. Automatic approaches have therefore been proposed, but their use remains limited because their search times are intolerably long. In this paper, we introduce MapreducE Self-Tuning (MEST), a framework that accelerates the search for the optimal configuration of a given Hadoop application. MEST is built on a novel mechanism that integrates the model trees algorithm with the genetic algorithm. As a result, MEST significantly reduces the search time by removing unnecessary profiling, modeling, and searching steps that are mandatory in existing approaches. Our experiments using five benchmarks, each with two input data sets (DS1 and 2×DS1), show that MEST improves the searching efficiency (SE) by factors of 1.37× and 2.18× on average, respectively, over the state-of-the-art approach.
MapReduce, Hadoop, self-tuning, model trees, genetic algorithm.
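
To make the idea of a model-guided genetic search concrete, the following is a minimal illustrative sketch, not the MEST implementation. It assumes a small hypothetical search space over three Hadoop parameters (the value ranges are invented for illustration) and uses a hand-written stand-in function, `predicted_runtime`, in place of a trained model tree. The genetic operators (uniform crossover, random-reset mutation, elitism) are generic choices and may differ from those used by MEST.

```python
import random

# Hypothetical search space: integer (low, high) ranges chosen for illustration only.
SPACE = {
    "mapreduce.task.io.sort.mb": (50, 800),
    "mapreduce.task.io.sort.factor": (10, 100),
    "mapreduce.job.reduces": (1, 64),
}


def predicted_runtime(cfg):
    """Stand-in for a learned performance model (assumption: NOT a real model tree).

    Returns a synthetic cost that is smallest near an arbitrary target point.
    """
    return (
        abs(cfg["mapreduce.task.io.sort.mb"] - 400) / 10.0
        + abs(cfg["mapreduce.task.io.sort.factor"] - 60)
        + abs(cfg["mapreduce.job.reduces"] - 32) * 2.0
    )


def random_config(rng):
    """Draw one configuration uniformly at random from the search space."""
    return {name: rng.randint(lo, hi) for name, (lo, hi) in SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each parameter is inherited from either parent."""
    return {name: (a[name] if rng.random() < 0.5 else b[name]) for name in SPACE}


def mutate(cfg, rng, rate=0.2):
    """Random-reset mutation: each parameter is resampled with probability `rate`."""
    out = dict(cfg)
    for name, (lo, hi) in SPACE.items():
        if rng.random() < rate:
            out[name] = rng.randint(lo, hi)
    return out


def genetic_search(generations=40, pop_size=30, elite=5, seed=0):
    """Search for a low-cost configuration, scoring candidates with the model only."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=predicted_runtime)
        parents = population[:elite]  # elitism: the best candidates always survive
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    return min(population, key=predicted_runtime)


if __name__ == "__main__":
    best = genetic_search()
    print(best, predicted_runtime(best))
```

Because every candidate is scored by the model rather than by running the job, the cost of one generation is independent of the job's actual run time; this is the general reason a model-driven search can be much faster than one that profiles each candidate configuration.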