Automatic Code Semantic Tag Generation Approach Based on Software Knowledge Graph
Author:
Affiliation:

CLC Number: TP311

Fund Project: National Natural Science Foundation of China (61972098)

    Abstract:

    Code snippets in open-source and enterprise software projects, as well as those posted on software development websites, are important software development resources. However, developers' code search needs often reflect high-level intentions and topics, which are difficult to satisfy with code search techniques based on information retrieval. It is therefore highly desirable for code snippets to be accompanied by semantic tags that reflect their high-level intentions and topics, to facilitate code search and understanding. Existing tag generation techniques are mainly oriented toward text content or rely on historical data, and cannot meet the needs of large-scale code semantic annotation or of assisting code search and understanding. To address this issue, this study proposes KGCodeTagger, an approach based on a software knowledge graph that automatically generates semantic tags for code snippets. KGCodeTagger constructs a software knowledge graph from concepts and relations extracted from API documentation and software development Q&A text, and uses the knowledge graph as the basis for code semantic tag generation. Given a code snippet, KGCodeTagger identifies and extracts API invocations and concept mentions, and then links them to the corresponding concepts in the software knowledge graph. On this basis, the approach identifies other concepts related to the linked concepts as candidates, and selects semantic tags from these candidates based on diversity and representativeness. Both the knowledge graph construction steps of KGCodeTagger and the quality of the generated code tags are evaluated. The results show that KGCodeTagger can produce a high-quality, meaningful software knowledge graph and code semantic tags, which help developers quickly understand the intention of the code.
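
The pipeline described in the abstract (link API invocations to graph concepts, expand to related concepts, then select tags by representativeness and diversity) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy graph `API_TO_CONCEPTS`, the function names, the regex-based linking, and the greedy coverage-based scoring are hypothetical stand-ins and are not taken from KGCodeTagger, whose actual linking and ranking methods are not specified in the abstract.

```python
import re

# Hypothetical toy knowledge graph: API concept -> related high-level concepts.
# A real graph would be built from API documentation and Q&A text.
API_TO_CONCEPTS = {
    "BufferedReader": {"file reading", "buffering"},
    "FileReader": {"file reading", "character stream"},
    "HashMap": {"key-value mapping", "lookup table"},
}


def link_apis(snippet):
    """Return known API names mentioned in the snippet (crude stand-in for linking)."""
    tokens = set(re.findall(r"[A-Za-z_]\w*", snippet))
    return sorted(tokens & set(API_TO_CONCEPTS))


def candidate_concepts(linked):
    """Map each candidate concept to the set of linked APIs that support it."""
    support = {}
    for api in linked:
        for concept in API_TO_CONCEPTS[api]:
            support.setdefault(concept, set()).add(api)
    return support


def select_tags(support, k=2):
    """Greedily pick up to k tags.

    Assumed scoring: prefer concepts that cover linked APIs not yet covered
    by earlier tags (diversity), then concepts supported by more linked APIs
    overall (representativeness), then alphabetical order for determinism.
    """
    selected, covered = [], set()
    remaining = dict(support)
    while remaining and len(selected) < k:
        best = min(
            remaining,
            key=lambda c: (-len(remaining[c] - covered), -len(remaining[c]), c),
        )
        selected.append(best)
        covered |= remaining.pop(best)
    return selected


def generate_tags(snippet, k=2):
    """Run the three illustrative steps end to end."""
    return select_tags(candidate_concepts(link_apis(snippet)), k)


if __name__ == "__main__":
    code = (
        "BufferedReader r = new BufferedReader(new FileReader(path)); "
        "Map<String, Integer> m = new HashMap<>();"
    )
    print(generate_tags(code))  # ['file reading', 'key-value mapping']
```

In this sketch, "file reading" is chosen first because two linked APIs support it, and "key-value mapping" is chosen second because it covers an API that the first tag does not. This mirrors, in a simplified way, the trade-off between representativeness and diversity.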

Get Citation

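Xing Shuangshuang (邢双双), Liu Mingwei (刘名威), Peng Xin (彭鑫). Automatic Code Semantic Tag Generation Approach Based on Software Knowledge Graph. Journal of Software (软件学报), 2022, 33(11): 4027-4045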

History
  • Received: December 25, 2020
  • Revised: February 13, 2021
  • Adopted:
  • Online: May 21, 2021
  • Published: November 06, 2022