Mutation-based Generation Algorithm of Negative Test Strings from Regular Expressions

doi:10.13328/j.cnki.jos.006925


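CLC number: TP311
Funding: National Natural Science Foundation of China (61872339); Natural Science Foundation of Fujian Province (2021J01316, 2021J01320); Fundamental Research Funds for the Central Universities (ZQN-1010); Natural Science Foundation of Xiamen (3502Z20227191); Natural Science Foundation of Shanghai (22ZR1422200)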

Mutation-based Generation Algorithm of Negative Test Strings from Regular Expressions



Abstract:

Regular expressions are widely used in many areas of computer science. However, because their syntax is complex and permits a large number of metacharacters, developers easily make mistakes when defining and using them. Testing is a practical and effective way to ensure the semantic correctness of regular expressions; the common approach is to generate a set of strings from the expression under test and check whether they conform to the intended language. Most existing test data generation focuses only on positive strings. However, empirical research shows that most errors in actual development lie in the defined language being smaller than the intended one, and such errors can be detected only by negative strings. This paper studies mutation-based generation of negative test strings from regular expressions. First, defects are injected into the expression under test through mutation to obtain a set of mutants; then, negative strings are selected from the complement of the language defined by the expression under test to reveal the errors simulated by the corresponding mutants. To simulate complex defect types, and to avoid the case in which mutant specialization makes it impossible to obtain a negative string, a second-order mutation mechanism is introduced. In addition, optimization techniques such as redundant mutant elimination and mutation operator selection are used to reduce the set of mutants and thereby control the size of the final test set. Experimental results show that, compared with existing tools, the proposed algorithm generates negative test sets of moderate size with strong error-detection ability.
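The core idea in the abstract, picking a string outside the language of the expression under test that a mutant nevertheless accepts, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `find_negative_string`, the brute-force enumeration over a small alphabet, the length bound, and the example expressions are all assumptions made for illustration.

```python
import re
from itertools import product


def find_negative_string(original, mutant, alphabet="ab", max_len=4):
    """Return the first string (shortest first) that the mutant accepts
    but the original expression rejects, or None if none is found.

    Hypothetical helper: exhaustive enumeration is used here only to keep
    the sketch short; it is not the selection strategy of the paper.
    """
    orig_re = re.compile(original)
    mut_re = re.compile(mutant)
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            # A negative string lies in the complement of the original
            # language; it reveals the mutant if the mutant accepts it.
            if mut_re.fullmatch(candidate) and not orig_re.fullmatch(candidate):
                return candidate
    return None


# Assumed example: the mutant relaxes '+' to '*', so its language is larger.
print(find_negative_string(r"a+b", r"a*b"))

# Assumed example of a specialized mutant: its language is contained in the
# original's, so no negative string can distinguish the two.
print(find_negative_string(r"a+b", r"aab"))
```

In this sketch the second call returns `None`, which mirrors the situation the abstract describes, where a specialized mutant yields no negative string and a further mutation step is needed.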


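Cite this article

Zheng Lixiao (郑黎晓), Yu Lilin (余李林), Chen Haiming (陈海明), Chen Zuxi (陈祖希), Luo Xiangyu (骆翔宇), Wang Xiaoyong (汪小勇). Mutation-based Generation Algorithm of Negative Test Strings from Regular Expressions. Journal of Software (软件学报),,():1-22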


History

Received: 2022-05-24
Last revised: 2022-10-26
Published online: 2023-08-30
