Abstract: With the rapid development of technologies such as deep learning, together with significant breakthroughs in computer hardware and cloud computing, increasingly mature artificial intelligence (AI) techniques are being applied to software systems across a wide range of fields. Software systems that incorporate AI models as core components are collectively referred to as intelligent software systems. Based on the application domains of AI technologies, these systems can be categorized into image processing, natural language processing, speech processing, and other applications. Unlike traditional software systems, AI models adopt a data-driven programming paradigm in which decision logic is learned from large-scale datasets rather than explicitly coded. This paradigm shift renders traditional code-based test case generation methods ineffective for evaluating the quality of intelligent software systems. Consequently, numerous testing methods tailored to intelligent software systems have been proposed in recent years, including novel approaches to test case generation and evaluation that address the unique characteristics of such systems. This survey reviews 80 relevant publications, classifies existing methods according to the types of systems they target, and systematically summarizes test case generation methods for image processing, natural language processing, speech processing, point cloud processing, multimodal data processing, and deep learning models. Potential future directions for test case generation in intelligent software systems are also discussed, providing a reference for researchers in this field.