General SpMV Implementation in Many-Core Domestic Sunway 26010 Processor
Author:
Affiliation:

Clc Number:

Fund Project:

National Key R&D Program (2016YFB0200603); National Natural Science Foundation of China (91530323)

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    The fastest supercomputer in the world-Sunway TaihuLight with performance of more than 100P has been released. It makes use of heterogeneous many-core processors which is different from the existing pure CPU, CPU-MIC, CPU-GPU architecture. Each processor has 4 core groups (CGs), with each including one management processing element (MPE) and one computing processing element (CPE) cluster of 64 CPEs. The peak performance of single processor is 3TFlops/s, the memory bandwidth is 130GB/s. Sparse matrix-vector multiplication is a very important kernel in scientific and engineering computing, which is bandwidth limited and subject to indirect memory access. Implementing an efficient SpMV kernel is a big challenge in Sunway processor. This paper proposes a general SpMV heterogeneous manycore algorithm for the traditional sparse matrix storage format CSR, which divides the task and LDM space in detail, a cache mechanism of dynamic and static buffers to improve the hit rate of vector x, and a dynamic-static task scheduling method to achieve load balancing. In addition, several key factors affecting the performance of SpMV are analyzed, and adaptive optimization is carried out to further enhance the performance. Finally 16 matrix from matrix market collection are used to perform tests. The experimental results show that the algorithm achieves bandwidth of 86% and average bandwidth utilization of 47%. Compared with the implementation of the controller core, the speedup can be up to 10x, and average speedup is 6.51x.

    Reference
    Related
    Cited by
Get Citation

刘芳芳,杨超,袁欣辉,吴长茂,敖玉龙.面向国产申威26010众核处理器的SpMV实现与优化.软件学报,2018,29(12):3921-3932

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:January 11,2017
  • Revised:
  • Adopted:
  • Online: December 05,2018
  • Published:
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063