Journal of Software:2018.29(S1):83-91

Open Source Component Version Recognition Without Version Strings
ZHANG Wei-Dong, YIN Li-Bo, LI Hong, WEN Hui, SUN Li-Min
(School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China; Beijing Key Laboratory of IoT Information Security Technology (Institute of Information Engineering, The Chinese Academy of Sciences), Beijing 100093, China; National Research Center for the Development of Industrial Information Security, Beijing 100040, China)
Received: May 01, 2018
Abstract: Because of widespread code reuse and third-party SDKs, open source components are ubiquitous in IoT device firmware. The vulnerabilities that threaten firmware often exist only in specific versions of a component, so identifying the versions of open source binary components in IoT firmware is important for the security assessment and emergency response of IoT devices. Existing extraction methods rely on version strings and do not work when those strings are missing. This paper designs and implements Protues, a version identification method for open source components that does not depend on version strings. Its core idea is to build a version difference chain from the source-code differences between adjacent versions of a component, which turns version identification into a sliding query of the component under test along that chain. To improve recognition accuracy, each node on the version difference chain is represented by conditional judgment expressions. To evaluate the practicality of the method, version identification experiments were performed on 428 binary files from four open source components: Samba, Msmtp, Nginx, and Libgcrypt. The method correctly identified the version of 418 of them, a recognition accuracy of about 98%.
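The idea of a version difference chain and a sliding query can be illustrated with a small sketch. This is not the Protues implementation: the names `DiffNode`, `build_chain`, and `identify`, the toy condition strings, and the greedy stopping rule are all assumptions made for illustration. The sketch also assumes that the conditional expressions of each version are already available as plain sets of strings, and that a condition, once removed, is not reintroduced in a later version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffNode:
    """One node of the chain: the difference between two adjacent versions."""
    older: str
    newer: str
    added: frozenset    # conditions found only in the newer version
    removed: frozenset  # conditions found only in the older version


def build_chain(versions):
    """Build the chain from an ordered list of (name, set_of_conditions)."""
    chain = []
    for (old_name, old), (new_name, new) in zip(versions, versions[1:]):
        chain.append(
            DiffNode(old_name, new_name, frozenset(new - old), frozenset(old - new))
        )
    return chain


def identify(chain, observed):
    """Slide along the chain and return the last version the evidence supports.

    At each node, count evidence that the binary is at least as new as
    node.newer versus evidence that it is still at node.older, and stop
    at the first node where the newer side does not win (hypothetical rule).
    """
    if not chain:
        raise ValueError("chain must contain at least one node")
    observed = frozenset(observed)
    current = chain[0].older
    for node in chain:
        newer_evidence = len(node.added & observed) + len(node.removed - observed)
        older_evidence = len(node.removed & observed) + len(node.added - observed)
        if newer_evidence <= older_evidence:
            break  # the binary does not look newer than `current`
        current = node.newer
    return current


# Toy data: hypothetical conditional expressions for three versions.
versions = [
    ("1.0", {"len > 0", "buf != NULL"}),
    ("1.1", {"len > 0", "buf != NULL", "flags & 1"}),
    ("1.2", {"len >= 0", "buf != NULL", "flags & 1"}),
]
chain = build_chain(versions)
```

With this toy chain, `identify(chain, {"len > 0", "buf != NULL", "flags & 1"})` advances past the first node and stops at the second, returning `"1.1"`. No version string is consulted at any point; only the presence or absence of conditions decides the result.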
Foundation items: National Key R&D Program of China (2018YFB0803402); National Natural Science Foundation of China (U1536107); Fundamental Theory and Cutting Edge Technology Research Program of Institute of Information Engineering, CAS (Y7Z0311104)
Reference text:
ZHANG Wei-Dong, YIN Li-Bo, LI Hong, WEN Hui, SUN Li-Min. Open Source Component Version Recognition Without Version Strings. Journal of Software, 2018, 29(S1): 83-91