[Keywords]
[Abstract]
As developer communities and code-hosting platforms become the primary way for programmers to obtain code, the number of user comments on code has increased dramatically. Comments that users write after using code contain information about a variety of static and dynamic code quality attributes. However, because most user comments are complex sentences, the code quality attributes they contain are difficult to identify. Judging the code quality attributes in complex user comments helps to analyze the code quality information in those comments, and helps developers improve code quality in a targeted way once they understand how users use the code and which quality attributes users care about. This study proposes a method for judging code quality attributes in complex user comments. First, a complex user comment is split into clauses, and a directed graph of the dependency syntactic relations of the clauses is constructed. Next, the topics in each clause are extracted by applying topic judgment rules based on the dependency syntactic relations of the clause. Then, using an initial feature thesaurus of code quality attributes, the code quality attribute corresponding to each topic is identified, and the representation and the representation result of that attribute are obtained for each topic. Finally, the representations and representation results of the code quality attributes in the complex user comment are analyzed according to topic processing rules, producing the code quality attribute results for the comment, and the initial feature thesaurus is continuously expanded. Experimental results show that the proposed method can effectively judge the code quality attributes of complex user comments.
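The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the dependency-graph topic rules with plain keyword matching inside each clause, and the seed lexicon, polarity list, negation words, and all function names are hypothetical placeholders chosen only for illustration.

```python
import re

# Hypothetical seed lexicon: feature word -> code quality attribute.
SEED_LEXICON = {
    "fast": "performance",
    "slow": "performance",
    "crashes": "reliability",
    "readable": "maintainability",
    "documented": "maintainability",
}

# Hypothetical polarity list standing in for the "representation result".
POLARITY = {
    "fast": "positive",
    "readable": "positive",
    "documented": "positive",
    "slow": "negative",
    "crashes": "negative",
}

NEGATIONS = {"not", "never", "no"}


def split_clauses(comment):
    """Split a complex comment into clauses on punctuation and a few conjunctions."""
    parts = re.split(r"[,.;!?]|\b(?:but|and|although)\b", comment.lower())
    return [p.strip() for p in parts if p.strip()]


def judge_clause(clause):
    """Return (attribute, feature word, result) tuples found in one clause.

    Assumption: keyword lookup replaces the dependency-based topic rules,
    and any negation word in the clause flips the polarity of every match.
    """
    tokens = re.findall(r"[a-z']+", clause)
    negated = any(t in NEGATIONS for t in tokens)
    findings = []
    for tok in tokens:
        if tok in SEED_LEXICON:
            result = POLARITY[tok]
            if negated:
                result = "negative" if result == "positive" else "positive"
            findings.append((SEED_LEXICON[tok], tok, result))
    return findings


def judge_comment(comment):
    """Judge every clause of a comment and collect the per-clause findings."""
    results = []
    for clause in split_clauses(comment):
        results.extend(judge_clause(clause))
    return results


if __name__ == "__main__":
    text = "The code is fast, but it crashes on large inputs and is not documented."
    for attribute, word, result in judge_comment(text):
        print(attribute, word, result)
```

A real system along the lines of the abstract would obtain the topic of each clause from a dependency parser and would add newly observed feature words to the lexicon over time; the sketch omits both steps.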
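[CLC number]
[Funding]
National Key Research and Development Program of China (2018YFB1003904); National Natural Science Foundation of China (61462049, 61063006, 60703116); Applied Basic Research Program of Yunnan Province (2017FA033); Open Fund of the Yunnan Key Laboratory of Computer Technology Application (2020101)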