Optimization of GEMM on SW1621 Processors
Author:
Affiliation:

Clc Number:

TP303

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    General matrix multiply (GEMM) is one of the most used functions in scientific and engineering computation, and it is also the base function of many linear algebra libraries. Its performance usually has essential influence on the whole application. Besides, because of its intensity in computation, its efficiency is often considered as an important metric of the hardware platform. This study conducts systematic optimization to dense GEMM on the domestic SW1621 processor. Based on analysis of the baseline code and profiling of various overhead, as well as utilization of the architectural features and instruction set, optimization for DGEMM is carefully designed and performed, including blocking scheme, packing mechanism, kernel function implementation, data prefetch, etc. Besides, a code generator is developed, which can generate different assembly and C code according to the input parameters. Using the code generator, together with auto-tuning scripts, it is able to find the optimal values for the tunable parameters. After applying the optimizations and tuning, the proposed single thread DGEMM achieved 85% of the peak performance of a single core, and 80% of the performance of the entire chip of 16 cores. The optimization to DGEMM not only improves the performance of BLAS on SW1621, but also provides an important reference for optimizing dense data computation on SW series multi-core machines.

    Reference
    Related
    Cited by
Get Citation

闫昊,刘芳芳,马文静,陈道琨.申威1621处理器上矩阵乘法优化研究.软件学报,2023,34(7):3451-3463

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:June 07,2021
  • Revised:August 07,2021
  • Adopted:
  • Online: November 30,2022
  • Published:
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063