Journal of Software, 2020, 31(5): 1276-1293

Data Flow Analysis for Sequential Storage Structures
WANG Shu-Dong, YIN Wen-Jing, DONG Yu-Kun, ZHANG Li, LIU Hao
(College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China)
Received:August 31, 2019    Revised:October 24, 2019
Abstract: Sequential storage structures, such as arrays and contiguous memory blocks dynamically allocated by malloc, are widely used in C programs, but most traditional data flow analyses do not adequately describe these structures or the operations on them. In particular, when pointers are used to access sequential storage structures, traditional analyses track only points-to relationships; they do not track the numerical offset a pointer may acquire, nor do they consider the out-of-bounds accesses such offsets may cause, which makes the analysis of sequential storage structures imprecise. To address these shortcomings, this paper first builds an abstract model of sequential storage structures and represents the points-to relationships and offsets that arise when pointers are used with them, yielding an abstract memory model for sequential storage structures called SeqMM. Second, it summarizes three kinds of operations on sequential storage structures in C programs: pointer-related transfer operations, predicate operations, and loop operations that traverse sequential storage structures; a safe-range check is proposed to ensure that these operations are safe. Third, interprocedural inference rules are designed for mapping formal pointer parameters that reference sequential storage structures to the corresponding actual arguments at function calls. Finally, based on this analysis, a memory leak detection algorithm is proposed and applied to five open-source C projects. The experimental results show that SeqMM effectively describes sequential storage structures in C programs and the operations on them, and that its data flow analysis results can be used for memory leak detection while achieving a reasonable trade-off between efficiency and precision.
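To make the idea of tracking both a points-to target and a numerical offset concrete, the following is a minimal sketch in C. It is not the SeqMM model from the paper: the `AbsPtr` type, the interval representation of offsets, and all function names are assumptions introduced here purely for illustration. It shows a transfer function for `q = p + k`, a join at control-flow merge points, and a safe-range check before a dereference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical abstract value for a pointer into a sequential block:
 * the block it points into plus an interval of possible element offsets.
 * This is an illustrative assumption, not the paper's SeqMM definition. */
typedef struct {
    int  block_id;  /* identifies the array or malloc'd region */
    long size;      /* number of elements in the block */
    long lo, hi;    /* possible offsets, inclusive: lo <= offset <= hi */
} AbsPtr;

/* Abstract value right after `p = a` or `p = malloc(size * sizeof *p)`. */
AbsPtr abs_base(int block_id, long size) {
    AbsPtr p = { block_id, size, 0, 0 };
    return p;
}

/* Transfer function for `q = p + k`: shift the offset interval by k. */
AbsPtr abs_add(AbsPtr p, long k) {
    p.lo += k;
    p.hi += k;
    return p;
}

/* Join at a control-flow merge: both values must refer to the same block;
 * the result covers every offset possible on either incoming path. */
AbsPtr abs_join(AbsPtr a, AbsPtr b) {
    assert(a.block_id == b.block_id);
    AbsPtr r = a;
    r.lo = (a.lo < b.lo) ? a.lo : b.lo;
    r.hi = (a.hi > b.hi) ? a.hi : b.hi;
    return r;
}

/* Safe-range check for `*p`: every possible offset must be in bounds. */
bool abs_deref_safe(AbsPtr p) {
    return p.lo >= 0 && p.hi < p.size;
}

/* True only if p definitely points at the start of its block, which is
 * what free() requires for a malloc'd region. */
bool abs_is_base(AbsPtr p) {
    return p.lo == 0 && p.hi == 0;
}
```

For a 10-element block, `abs_add(abs_base(1, 10), 9)` still passes `abs_deref_safe`, while adding one more element does not. An analysis that recorded only the points-to target could not tell these two cases apart, nor could it tell whether a pointer handed to `free` still refers to the start of the block.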
Foundation items:Fundamental Research Funds for the Central Universities (19CX02028A); National Natural Science Foundation of China (61873281)
Reference text:


WANG Shu-Dong, YIN Wen-Jing, DONG Yu-Kun, ZHANG Li, LIU Hao. Data Flow Analysis for Sequential Storage Structures. Journal of Software, 2020, 31(5): 1276-1293