Journal of Software, 2017, 28(6): 1418-1434

Auxiliary Method for Code Commit Comprehension Based on Core-Class Identification
HUANG Yuan, LIU Zhi-Yong, CHEN Xiang-Ping, XIONG Ying-Fei, LUO Xiao-Nan
(School of Data and Computer Science, Sun Yat-Sen University, Guangzhou 510006, China; National Engineering Research Center of Digital Life, Guangzhou 510006, China; Institute of Advanced Technology, Sun Yat-Sen University, Guangzhou 510006, China; Software Engineering Institute, School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China; Key Laboratory of High Confidence Software Technologies of Ministry of Education (Peking University), Beijing 100871, China)
Received: July 28, 2016; Revised: October 11, 2016
Abstract: Code commits are among the most important software evolution data and are widely used in code review and program comprehension. For developers, a commit becomes harder to understand as the number of affected classes and the amount of modified code grow. By analyzing a large amount of commit data, this study finds that identifying the core modified classes in a commit (core classes), as distinct from the classes changed only as dependent modifications needed to complete the core change (non-core classes), helps developers understand the commit. Inspired by the effectiveness of machine learning in classification, the paper proposes a machine-learning-based approach that models core-class identification as a binary classification problem (core versus non-core) and extracts discriminative features from the large volume of commits produced during software evolution to measure how core a class is. Experiments on multiple datasets show that the approach identifies core classes with an overall accuracy of 87%, and that, compared with understanding commits directly, using core-class hints significantly improves developers' efficiency and correctness.
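As a rough illustration of the binary-classification framing described in the abstract, the sketch below labels each modified class in a commit as core or non-core. Everything specific in it is an assumption made for illustration: the abstract does not name the paper's features or learning algorithm, so the two features, the toy training data, and the function names `train` and `predict_core` are hypothetical, and a small logistic regression is used only as a simple stand-in classifier.

```python
import math

# Hypothetical per-class features within one commit (NOT taken from the paper):
#   [fraction of the commit's changed lines that fall in this class,
#    fraction of the commit's other modified classes that reference this class]
# Label 1 = core class, label 0 = non-core class. The data is invented.
TRAIN = [
    ([0.70, 0.80], 1),
    ([0.55, 0.60], 1),
    ([0.60, 0.90], 1),
    ([0.10, 0.00], 0),
    ([0.05, 0.20], 0),
    ([0.15, 0.10], 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, features):
    """Linear score of one class; positive values lean towards 'core'."""
    return bias + sum(w * x for w, x in zip(weights, features))


def train(samples, epochs=2000, lr=0.5):
    """Fit logistic-regression weights with plain batch gradient descent."""
    n_features = len(samples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in samples:
            error = sigmoid(score(weights, bias, features)) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


def predict_core(model, features, threshold=0.5):
    """Return True if the class is predicted to be a core class."""
    weights, bias = model
    return sigmoid(score(weights, bias, features)) >= threshold


model = train(TRAIN)

# Hypothetical commit touching two classes (names are made up).
commit = {"OrderService": [0.65, 0.75], "OrderDto": [0.10, 0.10]}
for class_name, features in commit.items():
    label = "core" if predict_core(model, features) else "non-core"
    print(class_name, label)
```

In a review tool, the predicted core classes could be shown first so that a developer reads the central change before the dependent edits; how the paper actually presents this hint is not stated in the abstract.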
Foundation items: NSFC-Guangdong Joint Fund (U1201252); National Key Research and Development Program of China (2016YFB1000101); National Natural Science Foundation of China (61672545, 61672045); Science and Technology Planning Project of Guangdong Province (2015B040403005)
Reference text:

HUANG Yuan, LIU Zhi-Yong, CHEN Xiang-Ping, XIONG Ying-Fei, LUO Xiao-Nan. Auxiliary Method for Code Commit Comprehension Based on Core-Class Identification. Journal of Software, 2017, 28(6): 1418-1434