Journal of Software:2011.22(10):2509-2522

Optimizing Method for Improving the Performance of MPI Broadcast under Unbalanced Process Arrival Patterns
LIU Zhi-Qiang, SONG Jun-Qiang, LU Feng-Shun, XU Fen
(College of Computer, National University of Defense Technology, Changsha 410073, China)
Received: December 11, 2009    Revised: March 05, 2010
Abstract: This paper addresses the performance of MPI broadcast under unbalanced process arrival (UPA) patterns. Using a performance model, it analyzes the broadcast problem under UPA and proves that, on a multicore cluster, competition among the MPI processes within a node can effectively reduce the negative impact of UPA on broadcast performance. Building on this result, it proposes a new optimization method, the competitive and pipelined (CP) method. Through an intra-node competition mechanism, the CP method starts inter-node communication as early as possible during the broadcast. A broadcast algorithm optimized with CP uses shared memory for intra-node communication, while the leader processes selected by the competition execute the original algorithm for inter-node communication. Inter-node and intra-node communication overlap in a pipelined fashion, which exploits the multicore nodes of the cluster and reduces the influence of UPA on MPI broadcast. To verify the method, three typical MPI broadcast algorithms, each suited to a different message length, are optimized with CP and evaluated on a real system with a micro-benchmark and two practical applications. The results show that the CP method effectively improves the performance of traditional broadcast algorithms under UPA patterns. In the application workload tests, CP broadcast performs about 16% better than pipelined broadcast and 18% to 24% better than the broadcast operation in MVAPICH2 1.2.
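The two ideas named in the abstract, competitive leader selection and pipelined overlap, can be illustrated with a toy timing model. The sketch below is not the paper's performance model or implementation; every function name, parameter, and the assumption of constant per-chunk costs are hypothetical and chosen only for illustration. It compares when inter-node communication can begin on each node under a fixed leader (local rank 0) versus the first process to arrive, and it compares the completion time of a two-stage chunked broadcast with and without overlap.

```python
import random


def inter_node_start(arrivals, policy):
    """Return, per node, the time at which inter-node communication can begin.

    arrivals: one list of process arrival times per node (hypothetical input).
    policy:   "fixed" always uses local rank 0 as the leader;
              "competitive" uses whichever process on the node arrives first.
    """
    if policy == "fixed":
        return [node[0] for node in arrivals]
    if policy == "competitive":
        return [min(node) for node in arrivals]
    raise ValueError(f"unknown policy: {policy}")


def sequential_time(chunks, t_inter, t_intra):
    """Each chunk finishes its inter-node step and its intra-node step
    before the next chunk starts (no overlap)."""
    return chunks * (t_inter + t_intra)


def pipelined_time(chunks, t_inter, t_intra):
    """Two-stage pipeline with constant stage costs: after the first chunk
    has passed through both stages, each remaining chunk costs only the
    slower of the two stages."""
    return t_inter + t_intra + (chunks - 1) * max(t_inter, t_intra)


def random_arrivals(num_nodes, procs_per_node, max_skew, seed=0):
    """Draw arrival times uniformly in [0, max_skew] to mimic a UPA pattern."""
    rng = random.Random(seed)
    return [
        [rng.uniform(0.0, max_skew) for _ in range(procs_per_node)]
        for _ in range(num_nodes)
    ]


if __name__ == "__main__":
    arrivals = random_arrivals(num_nodes=4, procs_per_node=8, max_skew=1.0)
    fixed = inter_node_start(arrivals, "fixed")
    competitive = inter_node_start(arrivals, "competitive")
    for node, (f, c) in enumerate(zip(fixed, competitive)):
        print(f"node {node}: fixed leader starts at {f:.3f}, "
              f"competitive leader starts at {c:.3f}")
    print("sequential:", sequential_time(chunks=8, t_inter=2.0, t_intra=1.0))
    print("pipelined: ", pipelined_time(chunks=8, t_inter=2.0, t_intra=1.0))
```

In this model the competitive start time on a node can never be later than the fixed-leader start time, since the minimum of the arrival times is at most the arrival time of rank 0. How much of that gap turns into end-to-end speedup on a real cluster depends on factors the sketch ignores, such as network topology and shared-memory copy costs.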
Foundation items: Science Fund for Creative Research Groups of the National Natural Science Foundation of China (60621003)
Reference text:

LIU Zhi-Qiang, SONG Jun-Qiang, LU Feng-Shun, XU Fen. Optimizing Method for Improving the Performance of MPI Broadcast under Unbalanced Process Arrival Patterns. Journal of Software, 2011, 22(10): 2509-2522