Journal of Software, 2010, 21(zk): 284-289

An Iterative Clustering Based Approach for Parallel Performance Analysis
ZHU Peng (朱鹏), LI Wei (李巍), LI Yun-Chun (李云春)
(Beijing Key Laboratory of Network Technology, Beihang University, Beijing 100191, China)
Received:June 15, 2010    Revised:December 10, 2010
Abstract: As supercomputers grow, their CPU core counts have reached several hundred thousand, and the applications running on them are increasingly complex. Developers of parallel applications therefore need to measure and analyze performance so that they can optimize source code and improve execution efficiency. With so many cores, however, performance measurement produces massive amounts of data, and processing that data has become a central difficulty in parallel performance analysis. This paper proposes the Iterative based Clustering Approach for Parallel Performance Analysis (ICAPPA). The approach applies clustering algorithms from data mining to the massive performance data and, when certain conditions hold, runs clustering again on the result of the previous clustering, in order to identify the functions and processes that dominate parallel performance. The Bayesian Information Criterion (BIC) is used to evaluate each clustering result, and the BIC score determines whether iterative clustering of that result is reliable. Experiments verify the validity of the approach.
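The idea summarized in the abstract, clustering performance data repeatedly and using BIC to decide whether a further split can be trusted, can be illustrated with a minimal sketch. This is not the authors' ICAPPA implementation. It assumes each process is described by a vector of per-function times, uses plain 2-means at every step, and adopts one common BIC approximation, `n*ln(RSS/n) + k*d*ln(n)` (lower is better), which the abstract does not specify. All function names and the `max_depth` and minimum-size stopping conditions are hypothetical.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def kmeans(points, k, iters=50):
    """Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            new_centers.append(mean(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def bic(points, labels, centers):
    """Assumed BIC form: n*ln(RSS/n) + k*d*ln(n). Lower is better."""
    n, d, k = len(points), len(points[0]), len(centers)
    rss = sum(dist2(p, centers[l]) for p, l in zip(points, labels))
    return n * math.log(rss / n + 1e-12) + k * d * math.log(n)

def iterative_clustering(data, max_depth=3):
    """data maps a process id to its per-function times.

    Splits the processes in two, keeps the split only if BIC improves,
    then repeats on each half. Returns a list of groups of process ids.
    """
    ids = list(data)
    points = [data[i] for i in ids]
    if max_depth == 0 or len(ids) < 4:  # hypothetical stopping conditions
        return [ids]
    bic_one = bic(points, [0] * len(points), [mean(points)])
    labels, centers = kmeans(points, 2)
    bic_two = bic(points, labels, centers)
    if bic_two >= bic_one:  # the split is not supported by BIC
        return [ids]
    groups = []
    for j in (0, 1):
        sub = {i: data[i] for i, l in zip(ids, labels) if l == j}
        if sub:
            groups.extend(iterative_clustering(sub, max_depth - 1))
    return groups
```

Ranking the returned groups by their mean total time is one simple way to point at the processes that dominate run time. Note that this BIC approximation tends to favor extra splits on noisy data, so a real analysis would likely need a more careful criterion.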
Foundation item: Supported by the National High-Tech Research and Development Plan of China (863 Program) under Grant No.2007AA01A127
Reference text:

ZHU Peng, LI Wei, LI Yun-Chun. An Iterative Clustering Based Approach for Parallel Performance Analysis. Journal of Software, 2010, 21(zk): 284-289