Deep Learning Test Optimization Method Using Multi-objective Optimization

doi:10.13328/j.cnki.jos.006583

Journal of Software (软件学报), 2022, Vol. 33, No. 7, pp. 2499-2524

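About the authors: Mu Yanzhou (沐燕舟, 1996-), male, master's degree, CCF student member; main research interests: machine learning, concurrent program analysis, and deep learning testing.
Chen Junjie (陈俊洁, 1992-), male, Ph.D., associate professor, doctoral supervisor, CCF professional member; main research interests: software analysis and testing.
Wang Zan (王赞, 1979-), male, Ph.D., professor, doctoral supervisor, CCF professional member; main research interests: software testing and machine learning.
Zhao Jingke (赵静珂, 1997-), male, master's student; main research interests: security and quality assurance for deep learning.
Chen Xiang (陈翔, 1980-), male, Ph.D., associate professor, CCF senior member; main research interests: software defect prediction, software defect localization, regression testing, and combinatorial testing.
Wang Jianmin (王建敏, 1986-), male, Ph.D., assistant researcher; main research interests: intelligent software testing and system simulation.
Corresponding author: Wang Zan (王赞), E-mail: wangzan@tju.edu.cn
CLC number: TP311
Funding: National Natural Science Foundation of China (61872263); Foundation Strengthening Program Technology Field Fund (2020-JCJQ-JJ-490); 2020 Tianjin Intelligent Manufacturing Special Fund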

Deep Learning Test Optimization Method Using Multi-objective Optimization


Abstract:

With the rapid development of deep learning technology, research on its quality assurance is attracting increasing attention. Advances in sensors and related technologies have made test data easy to collect, but labeling the collected data remains costly. To reduce the labeling cost, existing work selects a test subset from the original test set. Such subsets preserve an overall accuracy (the accuracy of the deep learning model under test on all test inputs in the test set) close to that of the original test set, but they do not guarantee similarity in other test properties; for example, they may not adequately cover the test inputs of every category in the original test set. This study proposes DMOS (deep multi-objective selection), a test input selection method for deep learning based on multi-objective optimization. DMOS first uses the HDBSCAN (hierarchical density-based spatial clustering of applications with noise) clustering method to obtain a preliminary analysis of the data distribution of the original test set, then designs multiple optimization objectives from the characteristics of the clustering results, and finally applies multi-objective optimization to find a suitable selection scheme. Extensive experiments are carried out on 8 pairs of classic deep learning test sets and models. The results show that the best test subset selected by DMOS (the subset corresponding to the best-performing Pareto optimal solution) not only covers more test input categories of the original test set, but also yields per-category accuracy estimates that are very close to those on the original test set. Its estimates of overall accuracy and test adequacy are also close to those of the original test set: the average error of overall accuracy estimation is only 1.081%, which is 0.845 percentage points lower than that of the state-of-the-art method PACE (practical accuracy estimation), a relative improvement of 43.87%; the average error of accuracy estimation for each category of test inputs is only 5.547%, which is 2.926 percentage points lower than PACE, a relative improvement of 34.53%; and the average estimation error over five test adequacy measures is only 8.739%, which is 7.328 percentage points lower than PACE, a relative improvement of 45.61%.
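
The evaluation quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the DMOS implementation: the abstract does not give the exact error definitions or the optimization objectives, so the function names, the treatment of uncovered categories, and the objective tuples passed to `pareto_front` are all assumptions made for illustration. The last function only reproduces the arithmetic relating each reported error reduction to its relative improvement.

```python
from collections import defaultdict


def accuracy(labels, preds, indices=None):
    """Fraction of correct predictions over `indices` (all inputs if None)."""
    indices = list(range(len(labels)) if indices is None else indices)
    if not indices:
        raise ValueError("cannot compute accuracy on an empty selection")
    return sum(labels[i] == preds[i] for i in indices) / len(indices)


def per_class_accuracy(labels, preds, indices=None):
    """Map each class label to the accuracy on the selected inputs of that class."""
    indices = range(len(labels)) if indices is None else indices
    hits, totals = defaultdict(int), defaultdict(int)
    for i in indices:
        totals[labels[i]] += 1
        hits[labels[i]] += int(labels[i] == preds[i])
    return {c: hits[c] / totals[c] for c in totals}


def estimation_errors(labels, preds, subset):
    """Return (overall error, mean per-class error, class coverage) of a subset.

    Assumption: a class absent from the subset is estimated at accuracy 0,
    so it contributes its full-set accuracy as error.
    """
    overall_err = abs(accuracy(labels, preds, subset) - accuracy(labels, preds))
    full = per_class_accuracy(labels, preds)
    sub = per_class_accuracy(labels, preds, subset)
    class_err = sum(abs(sub.get(c, 0.0) - a) for c, a in full.items()) / len(full)
    return overall_err, class_err, len(sub) / len(full)


def pareto_front(points):
    """Keep the objective tuples not dominated by any other (all minimized)."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    return [p for p in points if not any(dominates(q, p) for q in points)]


def relative_improvement(error, reduction):
    """Percent improvement when `error` is `reduction` below the baseline error."""
    return 100.0 * reduction / (error + reduction)


if __name__ == "__main__":
    # Toy data: 8 labelled inputs in 2 classes, and a 4-input subset.
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    preds = [0, 0, 0, 1, 1, 1, 0, 0]
    print(estimation_errors(labels, preds, [0, 3, 4, 6]))
    # Reported overall-accuracy figures: error 1.081, reduction 0.845.
    print(round(relative_improvement(1.081, 0.845), 2))  # 43.87
```

Under this reading, the baseline error is the reported error plus the reduction, so 0.845 / (1.081 + 0.845) gives 43.87%; the same calculation gives 34.53% and 45.61% for the other two pairs of figures.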

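Cite this article

Mu Yanzhou, Wang Zan, Chen Xiang, Chen Junjie, Zhao Jingke, Wang Jianmin. Deep learning test optimization method using multi-objective optimization. Journal of Software, 2022, 33(7): 2499-2524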

History

Received: 2021-09-05
Last revised: 2021-10-14
Published online: 2022-01-28
Publication date: 2022-07-06
