CodeLLMTuner: Code LLM Selection and Decoding Parameter Tuning Framework Based on Sample Reuse
Authors: 曲慕子, 亢良伊, 刘杰, 王帅, 叶丹, 黄涛

CLC number: TP311

Fund project: Basic Research Program of the Institute of Software, Chinese Academy of Sciences (ISCAS-JCMS-202405)




    Abstract:

    With the rapid development of large language model (LLM) technology, many Code LLMs have emerged to support tasks such as code generation, code completion, code testing, and code refactoring. Different models may show significant performance differences on the same task, and the decoding parameters used at inference time also have a substantial influence on model performance. This study investigates how to efficiently select the best model, together with its optimal decoding parameters, for a specific code development task. Existing methods generally treat model selection and parameter tuning as two independent stages; because the sampling strategies differ across stages, sample data cannot be shared, and the computational cost of sampling and evaluation is high. Observing that different Code LLMs share the same decoding parameter space, this study proposes using the propensity score matching (PSM) algorithm to reweight and align sample data drawn from different distributions, thereby improving the reuse efficiency of sample data and reducing computational cost. On this basis, CodeLLMTuner, a framework for Code LLM selection and decoding parameter tuning based on sample reuse, is proposed. The framework consists of three stages: (1) an independent sampling stage, which runs decoding parameter tuning (e.g., Bayesian optimization) on multiple Code LLMs in parallel and collects sample data through sampling and evaluation; (2) a model selection stage, which uses PSM to align the sample data of the different models and selects the model with the best expected performance; and (3) a decoding parameter tuning stage for the selected model, which reuses that model's sample data and continues tuning on top of it, fully exploring the performance space while significantly reducing sampling cost. Experimental results show that, on three tasks (code generation, code summarization, and test case generation), CodeLLMTuner improves performance by 10%–15% over baseline methods at the same cost, or reduces cost by more than 20% at the same performance.
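
    The abstract compresses the whole pipeline into one paragraph; the sketch below makes the data flow concrete in Python. It is a minimal illustration under assumptions, not the authors' implementation: the two candidate models and the objective function are synthetic, a crude explore/exploit sampler stands in for the Bayesian optimization mentioned above, and inverse-propensity weighting (one technique from the PSM family) stands in for the paper's exact matching procedure.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    LO, HI = np.array([0.0, 0.5]), np.array([1.5, 1.0])  # bounds: temperature, top_p

    def propose(history):
        # Explore/exploit stand-in for the stage-1 Bayesian optimizer.
        if history and rng.random() < 0.5:               # exploit: perturb best point so far
            best_p, _ = max(history, key=lambda t: t[1])
            return np.clip(best_p + rng.normal(0, 0.1, 2), LO, HI)
        return rng.uniform(LO, HI)                       # explore: uniform draw

    def evaluate(model_id, params):
        # Hypothetical benchmark score (e.g., pass@1); each model peaks at
        # different decoding parameters, which is what makes selection matter.
        peaks = {0: np.array([0.2, 0.95]), 1: np.array([0.8, 0.7])}
        return float(np.exp(-np.sum((params - peaks[model_id]) ** 2))
                     + rng.normal(0, 0.02))

    # Stage 1: independent sampling -- tune each candidate model in parallel
    # and keep every (model, params, score) triple instead of discarding it.
    samples = []
    for model_id in (0, 1):
        history = []
        for _ in range(30):
            p = propose(history)
            s = evaluate(model_id, p)
            history.append((p, s))
            samples.append((model_id, p, s))

    # Stage 2: model selection -- each tuner concentrated its samples in a
    # different region of the shared parameter space, so raw score means are
    # not comparable. A propensity model P(model | params) yields
    # inverse-propensity weights that align the two sample distributions.
    X = np.stack([p for _, p, _ in samples])
    m = np.array([mid for mid, _, _ in samples])
    y = np.array([s for _, _, s in samples])
    probs = LogisticRegression().fit(X, m).predict_proba(X)

    expected = {}
    for model_id in (0, 1):
        mask = m == model_id
        w = 1.0 / np.clip(probs[mask, model_id], 1e-3, None)
        expected[model_id] = float(np.average(y[mask], weights=w))
    best = max(expected, key=expected.get)
    print(f"weighted expected scores: {expected} -> selected model {best}")

    # Stage 3: reuse -- warm-start further tuning of the selected model with
    # the samples it already produced, spending new budget only on that model.
    history = [(p, s) for mid, p, s in samples if mid == best]
    for _ in range(20):
        p = propose(history)
        history.append((p, evaluate(best, p)))
    best_params, best_score = max(history, key=lambda t: t[1])
    print(f"best decoding params: {best_params}, score: {best_score:.3f}")

    The point of stage 2 is that raw score averages are biased toward wherever each model's tuner happened to look; the propensity weights down-weight over-sampled regions, making the per-model expected scores comparable before any new samples are spent, which is what allows the stage-1 samples to be reused in stage 3.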

Cite this article:

曲慕子, 亢良伊, 刘杰, 王帅, 叶丹, 黄涛. CodeLLMTuner: Code LLM Selection and Decoding Parameter Tuning Framework Based on Sample Reuse. 软件学报 (Journal of Software), 2026, 37(5): 2131–2150.

History
  • Received: 2024-11-18
  • Revised: 2025-03-03
  • Published online: 2025-12-10
  • Published: 2026-05-06