Heterogeneous Job Allocation Scheduler for Hadoop MapReduce Using Dynamic Grouping Integrated Neighboring Search
MapReduce is a crucial framework in cloud computing architectures and is implemented by Apache Hadoop and other cloud computing platforms. The resources required to execute jobs in a large data center vary by job type. In general, there are two types of jobs, CPU-bound and I/O-bound, which require different resources but run simultaneously in the same cluster. Hadoop's default job scheduling policy is first-come-first-served, which may cause unbalanced resource utilization. To account for varied job workloads, numerous job allocation schedulers have been proposed in the literature. However, those schedulers encounter the data locality problem or deliver unsatisfactory job execution performance. This study proposes a job scheduler based on a dynamic grouping integrated neighboring search strategy, which can balance resource utilization and improve performance and data locality in heterogeneous computing environments.
Hadoop, heterogeneous computing environments, heterogeneous workloads, MapReduce, scheduling.