Survey on Application of Retrieval-augmented Generation in Software Engineering

Fund Project: National Natural Science Foundation of China (U24A20337, 62372228); Fundamental Research Funds for the Central Universities (14380029)

    Abstract:

    Retrieval-augmented generation (RAG) combines information retrieval with language generation models and significantly improves performance on downstream software engineering tasks such as code generation, code completion, and program repair. As RAG develops rapidly in software engineering, researchers find it difficult to keep track of its latest progress, open challenges, and potential opportunities. This study presents the first systematic review of the application of RAG in software engineering from 2021 to 2024, summarizing and analyzing 108 relevant high-quality studies from two perspectives: the core architecture of RAG and its applications in software engineering. First, it discusses the key architectural components of RAG in software engineering, summarizes the common categories of retrievers and generators in detail, and outlines how the two are integrated. Second, it analyzes the application of RAG to various downstream software engineering tasks, including code generation, code completion, test generation, and program repair, and reviews the practical methods and technical trends in each task scenario. Finally, it discusses the challenges facing current RAG applications across three stages (knowledge base construction, retrieval, and generation) and points out future research directions and potential development paths. Overall, this study offers the software engineering community a comprehensive review of RAG research, aiming to help researchers systematically understand existing results, identify key problems, and advance the field.
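
    The three stages named in the abstract (knowledge base construction, retrieval, and generation) can be illustrated with a minimal sketch. It is not taken from the surveyed studies: the bag-of-words cosine retriever, the function names, the sample snippets, and the placeholder `generate` function are all assumptions chosen for illustration, and a real system would typically use a stronger retriever and call an actual language model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_knowledge_base(snippets):
    """Stage 1: index each snippet together with its token counts."""
    return [(s, Counter(tokenize(s))) for s in snippets]


def retrieve(query, kb, k=1):
    """Stage 2: return the k snippets most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(kb, key=lambda entry: cosine(q, entry[1]), reverse=True)
    return [snippet for snippet, _ in ranked[:k]]


def build_prompt(query, retrieved):
    """Prepend the retrieved snippets to the task description."""
    context = "\n".join(f"# Retrieved example:\n{s}" for s in retrieved)
    return f"{context}\n# Task: {query}\n"


def generate(prompt):
    """Stage 3 (hypothetical placeholder): a real generator would call a
    language model here; this stub only reports the prompt size."""
    return f"<model output for a prompt of {len(prompt)} characters>"


# Hypothetical code snippets standing in for a knowledge base.
snippets = [
    "def read_json(path): return json.load(open(path))",
    "def sort_by_key(items, key): return sorted(items, key=key)",
    "def http_get(url): return requests.get(url).text",
]
kb = build_knowledge_base(snippets)
query = "write a function to read a json file from a path"
retrieved = retrieve(query, kb, k=1)
prompt = build_prompt(query, retrieved)
print(prompt)
print(generate(prompt))
```

    In this sketch the retrieved snippet is simply concatenated in front of the task description, which is only one of several possible ways to integrate a retriever with a generator.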

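Cite this article

张犬俊, 谢杨, 房春荣, 虞圣呈, 赵源, 陈振宇. Survey on Application of Retrieval-augmented Generation in Software Engineering. Journal of Software (软件学报), 2026, 37(3): 1316-1339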

History
  • Received: 2025-03-14
  • Last revised: 2025-06-11
  • Published online: 2025-12-24