Dependency-Driven Task Scheduling Scheme of Big Data Processing

Fund Project:

National High Technology Research and Development Program of China (863) (2013AA01A209); National Natural Science Foundation of China (61172048, 61303250)

    Abstract:

    Current big data processing gives little consideration to dependencies between data, which leads to large amounts of data transfer during task execution and low processing efficiency. To reduce data transfer and improve processing performance, this paper proposes a data-dependency-driven task scheduling scheme for big data processing, named D3S2. D3S2 consists of two main parts: a dependency-aware placement mechanism (DAPM) and a transfer-aware task scheduling mechanism (TASM). DAPM discovers dependencies between data so that strongly related data are clustered and assigned to nodes in the same rack, thereby reducing cross-rack data migration. After data placement, TASM schedules the tasks simultaneously according to the data locality constraint, so as to minimize the data transfer cost during task execution. DAPM and TASM each provide the basis for the other's decisions, and the two iterate, adjusting the scheduling scheme to minimize the execution cost, until an optimal solution is reached. The proposed scheme is verified in a Hadoop environment. Experiments show that, compared with native Hadoop, D3S2 reduces data transfer during job execution and shortens job running time.
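
The two-stage idea in the abstract can be illustrated with a small sketch. This is not the D3S2 algorithm itself, whose details the abstract does not give. It is a greedy stand-in under stated assumptions: dependency strength is taken to be the number of tasks that read two blocks together, each rack holds at most `capacity` blocks, and total rack capacity is sufficient for all blocks. The function names `dependency_weights`, `place_blocks`, and `schedule_tasks` are hypothetical, and the iterative feedback between DAPM and TASM is omitted.

```python
from collections import defaultdict
from itertools import combinations


def dependency_weights(tasks):
    """Assumed dependency measure: how many tasks read both blocks of a pair."""
    weights = defaultdict(int)
    for blocks in tasks.values():
        for a, b in combinations(sorted(set(blocks)), 2):
            weights[(a, b)] += 1
    return weights


def place_blocks(tasks, num_racks, capacity):
    """Placement stage (greedy stand-in): co-locate strongly related blocks."""
    weights = dependency_weights(tasks)
    placement = {}
    load = [0] * num_racks

    def put(block, rack):
        placement[block] = rack
        load[rack] += 1

    def emptiest():
        # Least-loaded rack; the lowest index wins ties.
        return min(range(num_racks), key=lambda r: load[r])

    # Visit the strongest pairs first; ties are broken by block name.
    for (a, b), _ in sorted(weights.items(), key=lambda kv: (-kv[1], kv[0])):
        for block, partner in ((a, b), (b, a)):
            if block not in placement:
                # Prefer the partner's rack when it still has room.
                rack = placement.get(partner)
                if rack is None or load[rack] >= capacity:
                    rack = emptiest()
                put(block, rack)

    # Blocks that never share a task with another block.
    for blocks in tasks.values():
        for block in blocks:
            if block not in placement:
                put(block, emptiest())
    return placement


def schedule_tasks(tasks, placement):
    """Scheduling stage: run each task on the rack holding most of its input."""
    schedule, cross_rack = {}, 0
    for task, blocks in sorted(tasks.items()):
        counts = defaultdict(int)
        for block in blocks:
            counts[placement[block]] += 1
        rack = max(sorted(counts), key=lambda r: counts[r])
        schedule[task] = rack
        # Every input block stored on another rack must be transferred.
        cross_rack += len(blocks) - counts[rack]
    return schedule, cross_rack


# Toy workload: task name -> data blocks it reads (illustrative data only).
tasks = {"t1": ["A", "B"], "t2": ["A", "B"], "t3": ["C", "D"], "t4": ["B", "C"]}
placement = place_blocks(tasks, num_racks=2, capacity=2)
schedule, cross_rack = schedule_tasks(tasks, placement)
print(placement, schedule, cross_rack)
```

In this toy workload, blocks A and B share a rack, as do C and D, so only task `t4` needs a cross-rack read. A closer model of the scheme would feed the scheduling cost back into the placement step and repeat until the cost stops improving.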

Get Citation

王玢, 吴雅婧, 阳小龙, 孙奇福. Dependency-Driven Task Scheduling Scheme of Big Data Processing. Journal of Software (软件学报), 2017, 28(12): 3385-3398

History
  • Received: July 4, 2016
  • Revised: December 7, 2016
  • Online: March 27, 2017