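Support for Multi-Level Parallelism on Heterogeneous Multi-Core and Performance Optimization

Fund Project: National Natural Science Foundation of China (61303050); National Key Technology R&D Program of the 12th Five-Year Plan (2011BAK08B04); National High Technology Research and Development Program of China (863 Program) (2011AA01A205); Open Project of the Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences (CARCH201108)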
    Abstract:

    Owing to their low power consumption and low cost, heterogeneous multi-core processors account for a significant share of the computing resources in current supercomputers. However, because these processors feature high bandwidth and loose memory consistency, programmers must pay close attention to low-level hardware details to obtain ideal memory and compute performance. This paper presents CellMLP, a multi-level parallelism model for the Cell BE, a typical heterogeneous multi-core processor. By extending C with compiler directives, CellMLP supports the data-parallel, task-parallel, and pipeline-parallel programming models and improves parallel programming productivity. On the runtime side, data parallelism uses parallel data transfers across SPEs and double buffering to raise data transfer bandwidth. Task parallelism uses a novel hybrid task queue to support asynchronous work stealing, which reduces contention among SPE threads and improves scalability. Pipeline parallelism, for the first time, uses the blocking signal channels of the Cell BE to implement low-overhead synchronization between SPE threads. Experiments were conducted on Stream, the NAS Benchmark, BOTS, and other typical irregular applications. The results show that CellMLP supports a variety of typical parallel applications efficiently. Compared with the similar programming models SARC and CellSs, CellMLP shows clear advantages in achieved data transfer bandwidth and in its support for irregular applications.

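Cite this article

Li Shigang, Hu Changjun, Wang Jue, Li Jianjiang. Support for Multi-Level Parallelism on Heterogeneous Multi-Core and Performance Optimization. Journal of Software, 2013, 24(12): 2782-2796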

History
  • Received: 2012-10-02
  • Last revised: 2012-12-03
  • Published online: 2013-12-04