TaskTracker aware scheduler with resource availability control for Hadoop MapReduce
Abstract
Schedulers play a vital role in task assignment for Hadoop MapReduce. In some scenarios, the default Hadoop schedulers spawn tasks on a TaskTracker without checking external dependencies, and those tasks may fail. As a result, Hadoop must rerun the tasks on another TaskTracker. To address this issue, the TaskTracker-aware scheduler was introduced. This paper focuses on the resource availability control of the TaskTracker-aware scheduler. The proposed scheduler does not allow a task to run and fail when the load of the TaskTracker has reached its threshold for the job. The performance of this scheduler may increase if the scheduler is aware of the status of the resources present in the TaskTracker nodes. The main features of this scheduler are user control over jobs and configuration-based control of resource utilisation for task allocation.
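
As an illustration only, the threshold check described in the abstract can be sketched as follows. This is a minimal sketch, not the scheduler's actual implementation: the names TrackerStatus, JobConfig, can_assign and first_assignable_job are hypothetical, the load is assumed to be a single normalised number between 0.0 and 1.0, and the per-job threshold is assumed to come from the job's configuration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TrackerStatus:
    """Hypothetical snapshot of one TaskTracker node."""
    name: str
    load: float  # assumed: normalised load, 0.0 (idle) to 1.0 (saturated)


@dataclass(frozen=True)
class JobConfig:
    """Hypothetical per-job setting supplied by the user."""
    job_id: str
    load_threshold: float  # assumed: tasks are refused at or above this load


def can_assign(tracker: TrackerStatus, job: JobConfig) -> bool:
    """Return True only while the tracker's load is below the job's threshold."""
    return tracker.load < job.load_threshold


def first_assignable_job(
    tracker: TrackerStatus, pending: Sequence[JobConfig]
) -> Optional[JobConfig]:
    """Return the first pending job this tracker may run, or None.

    Taking jobs in queue order is an assumption made for this sketch.
    """
    for job in pending:
        if can_assign(tracker, job):
            return job
    return None
```

With a tracker at load 0.8 and two pending jobs whose thresholds are 0.5 and 0.9, the sketch skips the first job and returns the second, so no task is launched on a node that is already past the first job's limit.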