Journal of Software, 2020, 31(5): 1255-1275

Survey on Testing of Deep Neural Networks
WANG Zan, YAN Ming, LIU Shuang, CHEN Jun-Jie, ZHANG Dong-Di, WU Zhuo, CHEN Xiang
(College of Intelligence and Computing, Tianjin University, Tianjin 300350, China; International Engineering Institute, Tianjin University, Tianjin 300350, China; School of Information Science and Technology, Nantong University, Nantong 226019, China)
Received:September 01, 2019    Revised:October 24, 2019
Abstract: With the rapid development of deep neural network techniques, the emergence of big data, and significant advances in computational power, deep neural networks (DNNs) are increasingly applied in safety-critical domains such as autonomous driving, face recognition, and aircraft collision avoidance. In traditional software systems, developers hand-write the code that implements the internal decision logic, and test cases are designed according to corresponding test coverage criteria. In contrast, deep learning defines a new data-driven programming paradigm: developers write code only to specify the network structure, while the internal logic is determined by the neuron connection weights learned during training. Consequently, testing methods and metrics designed for traditional software cannot be applied directly to DNN systems. In recent years, a growing body of research has addressed DNN testing, for example by proposing new test evaluation criteria and test case generation methods. This study surveys 92 academic papers from related fields and systematically reviews existing results from three perspectives: DNN testing metrics, test input generation, and test oracles. It also analyzes existing work on DNN testing in image processing, speech processing, and natural language processing, and introduces the datasets and tools used in DNN testing. Finally, it outlines future research directions for DNN testing, with the aim of providing a reference for researchers in this area.
Foundation items: National Natural Science Foundation of China (61872263, 61802275, 71502125); Intelligent Manufacturing Special Fund of Tianjin (20191012); Innovation Research Project of Tianjin University (2019XZC-0073, 2020XZC-0042)
Reference text:
WANG Zan, YAN Ming, LIU Shuang, CHEN Jun-Jie, ZHANG Dong-Di, WU Zhuo, CHEN Xiang. Survey on Testing of Deep Neural Networks. Journal of Software, 2020, 31(5): 1255-1275