Task-Based Parallel Programming Model Supporting Fault Tolerance
Author:
Affiliation:

Clc Number:

Fund Project:

National Natural Science Foundation of China (61300011)

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    Task-Based parallel programming model has become the mainstream parallel programming model to improve the performance of parallel computer systems by exploiting task parallelism. This paper presents a novel task-based parallel programming model which supports hardware fault tolerance. This model incorporates fault tolerance mechanisms into the task-based parallel programming model and aim to improve system performance and reliability. It uses task as the basic unit of scheduling, execution, fault detection and recovery, and supports fault tolerance in the application level. A buffer-commit computation model is used for transient fault tolerance and application-level diskless checkpointing technique is employed for permanent fault tolerance. A work-stealing scheduling scheme supporting fault tolerance is adopted to achieve dynamic load balancing. Experimental results show that the proposed model provides hardware fault tolerance with low performance overhead.

    Reference
    Related
    Cited by
Get Citation

王一拙,陈旭,计卫星,苏岩,王小军,石峰.一种支持容错的任务并行程序设计模型.软件学报,2016,27(7):1789-1804

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:December 31,2014
  • Revised:March 02,2015
  • Adopted:
  • Online: December 31,2015
  • Published:
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063