BinDec: LLM and Symbolic Execution Collaborative Decompilation Method for RISC-V

doi:10.13328/j.cnki.jos.007616

Journal of Software, 2026, Vol. 37, No. 6, pp. 2327-2345

CSTR: 32375.14.jos.007616
CLC number: TP311

Abstract:

Decompilation is a fundamental technique in software reverse engineering. It aims to recover high-level language code from hardware-oriented binary code to support human reading, analysis, and re-engineering. Although the technique has been studied extensively, traditional rule-based decompilers often produce code that is hard to read and hard to reuse. Moreover, because such decompilers take a long time to develop, their support for emerging instruction set architectures such as RISC-V typically lags behind. Large language models (LLMs) are now widely applied to automated software engineering tasks with notable success. Against this background, and targeting decompilation for the RISC-V architecture, this study proposes BinDec, a decompilation method in which an LLM and symbolic execution work together. BinDec alternates between LLM-based generation and symbolic-execution-based verification: it uses the LLM's code understanding and generation capabilities to produce decompiled code that is easier to understand and reuse, and uses the code analysis and verification capabilities of symbolic execution to ensure the reliability of the generated results. A series of experiments evaluates the effectiveness of BinDec. The results show that it achieves semantic accuracy close to that of traditional decompilers while significantly improving code readability.
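The alternation between generation and verification described in the abstract can be pictured as a simple loop. The sketch below is only an illustration under stated assumptions, not BinDec's implementation: the abstract does not specify the prompts, the feedback format, or the stopping rule, so `decompile_loop`, `Verdict`, `toy_generate`, `toy_verify`, and `max_rounds` are all hypothetical names. The two toy functions are hard-coded stand-ins for an LLM call and a symbolic-execution check, included only so the loop runs end to end.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    ok: bool
    feedback: str = ""  # e.g. a counterexample, passed to the next round


def decompile_loop(
    assembly: str,
    generate: Callable[[str, str], str],
    verify: Callable[[str, str], Verdict],
    max_rounds: int = 5,  # assumed round budget, not taken from the paper
) -> Optional[str]:
    """Alternate generation and verification until a candidate is accepted."""
    feedback = ""
    for _ in range(max_rounds):
        candidate = generate(assembly, feedback)
        verdict = verify(assembly, candidate)
        if verdict.ok:
            return candidate
        feedback = verdict.feedback
    return None  # no candidate passed within the round budget


# --- Toy stand-ins (hypothetical) so the sketch is runnable ---

def toy_generate(assembly: str, feedback: str) -> str:
    # A real generator would prompt an LLM with the assembly and the feedback.
    if not feedback:
        return "int f(int a, int b) { return a - b; }"
    return "int f(int a, int b) { return a + b; }"


def toy_verify(assembly: str, candidate: str) -> Verdict:
    # A real verifier would compare binary and candidate semantics symbolically;
    # this placeholder only inspects the candidate text.
    if "a + b" in candidate:
        return Verdict(ok=True)
    return Verdict(ok=False, feedback="mismatch for a=1, b=2: expected 3")


result = decompile_loop("add a0, a0, a1\nret", toy_generate, toy_verify)
print(result)
```

In this toy run the first candidate is rejected, its feedback is fed into the second round, and the second candidate is accepted. Returning `None` when the budget runs out keeps an unverified candidate from being mistaken for a verified one.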

Cite this article

李玉璋, 张熙, 徐涛. BinDec: LLM and Symbolic Execution Collaborative Decompilation Method for RISC-V. Journal of Software (软件学报), 2026, 37(6): 2327-2345

History

Received: 2025-09-07
Last revised: 2025-10-20
Published online: 2025-12-26
Publication date: 2026-06-06
