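Jiangsu Provincial Special Project for Frontier Leading Technology Basic Research (BK202002001); National Natural Science Foundation of China (61702041); "Qin Xin Talents" Cultivation Program of Beijing Information Science and Technology University (QXTCP C201906)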
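With the rise of blockchain technology, the security of smart contracts has drawn increasing attention from researchers and industry, and a number of studies on smart contract defect detection already exist. Software defect prediction is an effective complement to defect detection, since it can optimize the allocation of testing resources and improve testing efficiency. However, no defect prediction research has yet targeted smart contracts. To address this gap, a defect prediction method for Solidity smart contracts is proposed. First, a metrics suite is designed for the variables, functions, structures, and language features specific to Solidity smart contracts (smart contract-Solidity, SC-Sol), and it is combined with a metrics suite that focuses on object-oriented features (code complexity and features of object-oriented program, COOP) to form the COOP-SC-Sol metrics suite. Then, metric information is extracted from Solidity smart contract code and combined with defect detection results to build a Solidity smart contract defect data set. On this basis, seven regression models and six classification models are applied to Solidity smart contract defect prediction in order to compare how different metrics suites and models perform in predicting the number of defects and defect proneness. Experimental results show that, relative to the COOP metrics suite, COOP-SC-Sol raises the F1-score of the defect prediction models by 8%. The class imbalance problem in smart contract defect prediction is also studied. The results show that preprocessing the data set with sampling techniques improves the performance of the defect prediction models, with random under-sampling raising the F1-score by 9%. In predicting proneness to specific defect types, model performance is affected by class imbalance in the data set, and good performance is achieved on data sets in which defective modules account for more than 10% of all modules.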
With the rise of blockchain technology, researchers and companies are paying increasing attention to the security of smart contracts, and several studies on smart contract defect detection and testing techniques have appeared. Software defect prediction is an effective supplement to defect detection: it can optimize the allocation of testing resources and improve the efficiency of software testing. However, no existing research addresses software defect prediction for smart contracts. To address this problem, this study proposes a defect prediction method for Solidity smart contracts. First, it designs a metrics suite (smart contract-Solidity, SC-Sol) that covers the variables, functions, structures, and language features specific to Solidity smart contracts, and combines SC-Sol with a traditional metrics suite (code complexity and features of object-oriented program, COOP), which focuses on object-oriented features, into the COOP-SC-Sol metrics suite. Then, it extracts the relevant metric information from Solidity code and combines it with defect detection results to construct a Solidity smart contract defect data set. On this basis, seven regression models and six classification models are applied to predict the defects of Solidity smart contracts, in order to compare the performance of different metrics suites and different models in predicting the number of defects and defect proneness. Experimental results show that, compared with COOP, COOP-SC-Sol improves the F1-score of the defect prediction model by 8%. In addition, the problem of class imbalance in smart contract defect prediction is studied further. The results show that preprocessing the data set with sampling techniques improves the performance of the defect prediction model, and that random under-sampling improves its F1-score by 9%. In predicting proneness to specific types of defects, model performance is affected by the class imbalance of the data sets. Better performance is achieved on data sets in which defective modules account for more than 10% of all modules.
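
As a minimal illustration of the class-imbalance step described above, the following Python sketch shows random under-sampling of the majority class and the computation of the F1-score for a binary defect-proneness label. It is not the study's implementation: the function names, the two toy features per contract, and the toy labels are all assumptions made for this example, and the study's actual metrics and models are not reproduced here.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly discard samples so that every class keeps as many
    samples as the smallest class has."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    n_min = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sample without replacement from each class.
        for features in rng.sample(rows, n_min):
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: each row is [feature_a, feature_b] for one contract;
# label 1 marks a defective module, label 0 a clean one (2 defective of 10).
X = [[i, i % 3] for i in range(10)]
y = [1, 1] + [0] * 8

X_balanced, y_balanced = random_undersample(X, y)
print(Counter(y_balanced))

# Toy predictions, only to exercise the metric.
print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

Under-sampling is applied to the training data only; evaluating on a resampled test set would distort the reported F1-score.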