BE/BTech & ME/MTech Final Year Projects for Computer Science | Information Technology | ECE Engineer | IEEE Projects Topics, PHD Projects Reports, Ideas and Download | Sai Info Solution | Nashik |Pune |Mumbai
director@saiinfo | 02536644344 | +919270574718 | +919096813348 | +917447889268


SAI INFO SOLUTION


Diploma | BE | B.Tech | ME | M.Tech | PhD

Project Development and Training



Energy-Efficient Task Scheduling for CPU-Intensive Streaming Jobs on Hadoop


Abstract


Hadoop, especially Hadoop 2.0, has been a dominant framework for real-time big data processing. However, Hadoop is not optimized for energy efficiency. To address this problem, we propose a new framework that improves the energy efficiency of Hadoop 2.0. We focus on YARN, the resource manager in Hadoop 2.0, and propose energy-efficient task scheduling mechanisms on it. In particular, we focus on CPU-intensive streaming jobs and classify them into two types: batch streaming jobs (i.e., a set of jobs submitted simultaneously) and online streaming jobs (i.e., jobs submitted continuously, one by one). We devise a separate energy-efficient task scheduling algorithm for each type. Specifically, we first build an abstract model of performance and energy consumption that considers the characteristics of tasks as well as the computational resources in YARN. Based on this model, we study the energy efficiency of streaming tasks, which combines the performance model and the energy consumption model of a task. We propose two key principles for improving energy efficiency: 1) CPU-usage-aware task allocation, which partitions tasks among NodeManagers (NMs) based on each task's CPU usage; and 2) resource-efficient task allocation, which reduces idle resources. We then propose a D-based binning algorithm for batch task scheduling and a K-based binning algorithm for online task scheduling that can adapt to continuously arriving tasks. We conduct extensive experiments on a real Hadoop 2.0 cluster and use two kinds of workloads to evaluate the performance and energy efficiency of our proposal. Compared with Storm (the streaming data processing tool in Hadoop 2.0) and other approaches, including TAPA and DVFS-MR, our proposal is more energy efficient. The batch task scheduling algorithm reduces energy consumption by up to 10% while maintaining comparable performance. In addition, the online task scheduling algorithm reduces energy consumption by up to 7% compared with the existing algorithms.
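
The two allocation principles can be illustrated with a minimal sketch. The abstract does not specify the D-based or K-based binning algorithms, so the Python below uses plain first-fit bin packing as a stand-in and is not the authors' method. The `Node` class, the 100% CPU capacity, the task demands, and the function names `place_online` and `place_batch` are all hypothetical, and each task is assumed to demand no more CPU than one node offers.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical NodeManager with a fixed CPU capacity (in percent)."""
    capacity: float = 100.0
    tasks: list = field(default_factory=list)

    @property
    def used(self) -> float:
        # Total CPU demand of the tasks currently placed on this node.
        return sum(self.tasks)

    def fits(self, cpu: float) -> bool:
        return self.used + cpu <= self.capacity


def place_online(nodes: list, cpu: float) -> int:
    """Place one arriving task on the first node with room (first fit).

    A new node is opened only when no active node can host the task,
    which keeps idle capacity low. Returns the index of the chosen node.
    """
    for i, node in enumerate(nodes):
        if node.fits(cpu):
            node.tasks.append(cpu)
            return i
    nodes.append(Node(tasks=[cpu]))
    return len(nodes) - 1


def place_batch(cpu_demands: list) -> list:
    """Place a whole batch: sort by CPU demand, largest first, then first fit."""
    nodes = []
    for cpu in sorted(cpu_demands, reverse=True):
        place_online(nodes, cpu)
    return nodes


if __name__ == "__main__":
    demands = [20, 60, 50, 40, 30]  # assumed CPU demands, in percent

    batch_nodes = place_batch(demands)

    online_nodes = []
    for demand in demands:  # tasks arrive one by one, in submission order
        place_online(online_nodes, demand)

    print("batch nodes used:", len(batch_nodes))
    print("online nodes used:", len(online_nodes))
```

With these assumed demands, the batch variant packs the five tasks onto two fully used nodes, while the online variant, which cannot reorder arrivals, needs three. Fewer active nodes with less idle CPU is the intuition behind the resource-efficient allocation principle; the energy model itself is not reproduced here.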

Keywords
Energy Efficiency, Scheduling Algorithms, Hadoop, YARN



Call us : 09096813348 / 02536644344
Mail ID : developer.saiinfo@gmail.com
Skype ID : saiinfosolutionnashik