XIE Cheng-Wang , GUO Hua , WEI Wei , JIANG Lei
2023, 34(4):1523-1542. DOI: 10.13328/j.cnki.jos.006702
Abstract: It is difficult to solve many-objective optimization problems (MaOPs) effectively with traditional multi-objective evolutionary algorithms (MOEAs) based on the Pareto dominance relation. This study first proposes a dominance relation by combining the double distances of the PBI utility function, without introducing extra parameters. Secondly, a diversity maintenance method based on double distances is defined, which not only considers the double distances of an individual but also adaptively adjusts the weight of diversity according to the number of objectives of the MaOP, so as to better balance the convergence and diversity of the solution set in the many-objective space. Finally, the proposed dominance relation and diversity maintenance method are embedded into the NSGA-II framework, yielding a many-objective evolutionary algorithm based on double distances (MaOEA/d2). MaOEA/d2 is compared with five other representative many-objective evolutionary algorithms on the 5-, 10-, 15-, and 20-objective DTLZ and WFG benchmark functions in terms of the IGD and HV indicators. The empirical results show that MaOEA/d2 obtains better convergence and diversity. Therefore, the proposed MaOEA/d2 is a promising many-objective evolutionary algorithm.
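The two PBI distances the abstract builds on can be illustrated with a short sketch. This is not the paper's MaOEA/d2 code; the function name is hypothetical, and the snippet only shows the standard PBI decomposition of an objective vector into a convergence distance d1 (along a reference direction) and a diversity distance d2 (perpendicular offset from it):

```python
import numpy as np

def pbi_distances(f, w, z_star):
    """Decompose objective vector f, relative to ideal point z_star,
    along unit reference direction w into:
      d1: convergence distance (projection length onto w)
      d2: diversity distance (perpendicular distance from the w line)."""
    diff = f - z_star
    w = w / np.linalg.norm(w)          # ensure w is a unit vector
    d1 = float(np.dot(diff, w))        # distance along the reference direction
    d2 = float(np.linalg.norm(diff - d1 * w))  # offset from the direction
    return d1, d2
```

A dominance relation built from both quantities can trade off convergence (small d1) against diversity (small d2), which is the balance the abstract describes.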
ZHAI Zhi-Nian , LU Ya-Hui , LIU Guan-Jun , LEI Jing-Sheng , XIANG Jian , WU Ming-Wei
2023, 34(4):1543-1569. DOI: 10.13328/j.cnki.jos.006682
Abstract: The workflow satisfiability problem is a fundamental issue in the security planning of business processes, and it faces a performance challenge caused by high resource ratios (the number n of resources is significantly greater than the number k of steps). Under resource-independent constraints, the most efficient existing approach is incremental pattern backtracking (IPB) in the pattern space. To overcome the performance bottleneck of verifying whether a node is authentic, IPB computes the k-assignment (bipartite) graph of a pattern and its (left-complete) matching incrementally, which requires O(kn) and O(k^2) time respectively. This study computes a full-assignment graph incrementally in only O(n) time by exploiting the atomic difference between a sub-pattern and its super-pattern; in particular, its actual performance improves rapidly with the size of a block in the pattern. However, the O(kn) size of such a graph would lead to the same incremental matching time. This study therefore introduces the concept of a complete k-core matching, shows that its existence is equivalent to that of a left-complete matching, and shows that its incremental computation costs only O(k^2) time. On this basis, a minimum incremental pattern backtracking (MIPB) algorithm is proposed that is superior to IPB in time complexity. Experiments are conducted on an extended public instance set with constraints of two global value-cardinality types and of mutual exclusion, and with an authorization ratio of about 1/4. The results show that when k varies at n/k=10 (n/k=100, respectively), MIPB achieves on average more than a 2-fold (5-fold, respectively) and at least a 1.5-fold (2.9-fold, respectively) performance advantage over IPB; when k=18 (k=36, respectively) and n/k ranges over 2-4096 (2-2048, respectively), MIPB achieves on average more than a 2.6-fold (3.6-fold, respectively) advantage. Compared with OR-Tools CP-SAT, the champion solver of the 2018-2021 MiniZinc Challenges, MIPB achieves at least a 3-fold advantage.
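The left-complete matching that IPB and MIPB maintain (every step matched to a distinct authorized resource) can be checked with the classical augmenting-path method. The sketch below is a generic illustration with hypothetical names, not the paper's incremental algorithm; left vertices are workflow steps and right vertices are resources:

```python
def has_left_complete_matching(adj, n_right):
    """Return True if every left vertex (step) can be matched to a
    distinct right vertex (resource). adj[u] lists the right vertices
    that step u may be assigned to (its authorizations)."""
    match_r = {}  # right vertex -> the left vertex currently matched to it

    def augment(u, seen):
        # Try to find an augmenting path starting from left vertex u.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is free, or its current partner can be re-matched elsewhere.
            if v not in match_r or augment(match_r[v], seen):
                match_r[v] = u
                return True
        return False

    return all(augment(u, set()) for u in range(len(adj)))
```

Each augmenting-path search costs at most O(E) time; the paper's contribution is maintaining such a matching incrementally across pattern changes rather than recomputing it from scratch.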
TAO Xiao-Han , ZHU Yu , PANG Jian-Min , ZHAO Jie , XU Jin-Long
2023, 34(4):1570-1593. DOI: 10.13328/j.cnki.jos.006688
Abstract: Heterogeneous architectures are dominating the realm of high-performance computing. However, these architectures also complicate programming, since their hardware and memory hierarchies are increasingly complex compared with homogeneous architectures. One of the most promising solutions to this issue is the use of optimizing compilers, which help programmers develop high-performance code executable on target machines and thereby reduce the difficulty of programming. The polyhedral model is widely studied due to its ability to generate effective code and its portability across targets: a program is first converted into an intermediate representation, on which compositions of loop transformations are combined with hardware binding strategies. This paper presents a source-to-source parallel code generator based on the polyhedral model that targets the domestic heterogeneous architecture of the Sunway machine. In particular, computation is deployed onto the Sunway architecture automatically, and memory management is handled so as to minimize the amount of data movement between the management processing element and the computing processing elements of the target. Experiments are conducted on 13 linear algebra applications extracted from the PolyBench benchmarks. The results show that the proposed approach generates effective code executable on the Sunway heterogeneous architecture, providing a mean speedup of 539.16× on 64 threads over the sequential implementation executed on a management processing element.
WANG Wen-Xiang , GAO Qing , XU Ke , ZHANG Shi-Kun
2023, 34(4):1594-1612. DOI: 10.13328/j.cnki.jos.006691
Abstract: A software crash is a serious kind of software failure, so testing for crashes is extremely important in the process of software iteration. In recent years, because they can automatically generate a large number of test inputs that trigger crashes, fuzzing techniques (such as AFL) have been widely used in software testing. Nevertheless, many of the crashes generated by this technique share the same root cause, so software developers have to classify the test inputs one by one, which brings a lot of redundant work. At present, there are many automated methods for classifying test inputs, mainly including classification algorithms based on program repair and classification algorithms based on software crash information. The former analyzes program semantics, re-runs each test input after applying repair templates to the program at runtime, and then classifies the inputs. Since this method requires repair templates to be prepared manually, its classification efficiency is closely tied to the quality of the templates; at the same time, repair efficiency suffers because each crash must be both repaired and classified. Given certain advantages of the latter, this study proposes CICELY, a lightweight and efficient test input classification algorithm that uses software crash information. Building on classification by stack information at the crash point, CICELY introduces dynamic link library information into the analysis. By distinguishing system dynamic link libraries from user dynamic link libraries and combining them with the location information of user code, CICELY obtains the set of functions that programmers focus on and defines the crash based on the user function during classification.
In the end, this study also compares CICELY with existing classification tools based on program repair and on software crash information. The experimental data sets comprise 19 projects and 42 test sets in total. On the same data set, compared with Honggfuzz and CERT BFF, whose main classification algorithms are based on software crash information, the numbers of classification results of the two are 2112.89% and 135.05% worse than that of CICELY, showing that CICELY is greatly improved and more accurate than similar algorithms. Compared with the program-repair-based classification algorithm Semantic Crash Bucketing on the test data set provided in its article, CICELY is 4.42% worse; on a test set consisting of test inputs corresponding to multiple crashes, CICELY achieves 3% higher repeatability. However, Semantic Crash Bucketing can only classify crashes caused by two kinds of crash inputs, null pointer dereference and buffer overflow, while CICELY is not subject to such restrictions.
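The core idea of stack-based crash classification, which CICELY refines with dynamic link library information, can be sketched as grouping crashes by their innermost user-code frame. The function names and the user-frame predicate below are hypothetical illustrations, not CICELY's actual implementation:

```python
def bucket_crashes(crash_stacks, is_user_frame):
    """Group crash reports by the innermost frame belonging to user code.
    crash_stacks: list of call stacks, innermost frame first.
    is_user_frame: predicate separating user code from system/library code
    (the distinction CICELY draws using dynamic link library information)."""
    buckets = {}
    for stack in crash_stacks:
        # Fall back to the innermost frame if no user frame is found.
        key = next((f for f in stack if is_user_frame(f)),
                   stack[0] if stack else None)
        buckets.setdefault(key, []).append(stack)
    return buckets
```

Skipping over system-library frames before choosing the bucket key is what lets two crashes that merely surface in the same libc routine still be separated by the distinct user functions that caused them.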
WANG Ya-Wen , WANG Jun-Jie , SHI Lin , WANG Qing
2023, 34(4):1613-1629. DOI: 10.13328/j.cnki.jos.006697
Abstract: App reviews are considered a communication channel between users and developers for perceiving user satisfaction. Users usually describe buggy features (i.e., user actions) and abnormal App behaviors (i.e., abnormal behaviors) in the form of key phrases (e.g., "send a video" and "crash"), which can be buried among other trivial information (e.g., complaints) in the review texts. A fine-grained view of this information can facilitate developers' understanding of feature requests or bug reports from users and improve the quality of Apps. Existing pattern-based approaches to extracting target phrases can only summarize the high-level topics/aspects of reviews and suffer from low performance due to insufficient semantic understanding of reviews. This study proposes Arab, a semantic-aware and fine-grained App review bug mining approach that extracts user actions and abnormal behaviors and mines the correlations between them. A novel neural network model is designed for extracting fine-grained target phrases, which combines textual descriptions and review attributes to better represent the semantics of reviews. Arab also clusters the extracted phrases based on their semantic relations and provides a visualization of the correlations between user actions and abnormal behaviors. 3,426 reviews from six Apps are used for evaluation, and the results confirm the effectiveness of Arab in phrase extraction. A case study is further conducted with Arab on 301,415 reviews of 15 popular Apps to explore its potential application and examine its usefulness on large-scale data.
ZHONG Yi , SHI Meng-Yu , FANG Chun-Rong , ZHAO Zhi-Hong , CHEN Zhen-Yu
2023, 34(4):1630-1649. DOI: 10.13328/j.cnki.jos.006701
Abstract: Automated testing tools are the primary means of quality assurance for Android applications. With the increase in Android version diversity, underlying hardware variability (fragmentation), and logical complexity, automated testing faces new challenges. Numerous automated testing tools have been developed in recent years to address these issues; however, the tools are many and their testing focuses vary, making it hard for testers to choose the right one. To help testers select the best tool and to achieve a unified evaluation of automated testing tools, this study proposes a comprehensive multi-characteristic evaluation method for Android automated testing (CEAT) and implements an easy-to-use platform for testers. CEAT adopts three widely accepted evaluation metrics: code coverage, exception detection rate, and a fused multi-version compatibility score; it further introduces a mutation kill rate based on the concept of mutation testing, and UI control widget coverage from the user's perspective. The five metrics constitute the whole CEAT system, realizing a comprehensive multi-dimensional evaluation of Android automated testing tools. To verify the effectiveness of CEAT, a set of 1,089 mutated applications is generated for testing, the experiments are deployed in a real-world cluster containing six mobile devices, and 5,040 test tasks are executed for the testing tools.
The results suggest that: (i) the five indicators evaluate the automated testing tools from different perspectives, reflecting the testing performance of different tools in a more multi-dimensional way and validating the effectiveness of CEAT; (ii) CEAT allows testers to assign different weights to the five metrics and obtain comprehensive evaluation results depending on practical testing requirements, which gives it a degree of flexibility; (iii) CEAT automatically reconstructs an app to obtain mutant apps and sets up a dedicated platform for testing the tools, making it convenient to operate. CEAT thus provides an effective reference for testers selecting the best Android automated testing tool for different testing requirements.
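The weighted aggregation described in (ii) can be sketched as a simple weighted sum over the five metrics, shown alongside a mutation kill rate in the sense of (i). The metric names and weights below are illustrative assumptions, not CEAT's actual configuration:

```python
def mutation_kill_rate(kill_flags):
    """Fraction of mutant apps whose injected fault the tool detected."""
    return sum(kill_flags) / len(kill_flags)

def weighted_score(metrics, weights):
    """Combine per-metric scores (each in [0, 1]) into one overall score
    using tester-chosen weights that sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(metrics[m] * weights[m] for m in weights)
```

Letting testers supply the weights dictionary is what gives a scheme like this its flexibility: a compatibility-focused team and a coverage-focused team can rank the same tools differently from the same raw measurements.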
YU Pu , SHU Hui , XIONG Xiao-Bing , KANG Fei
2023, 34(4):1650-1665. DOI: 10.13328/j.cnki.jos.006719
Abstract: In current research on code protection technology, traditional obfuscation methods have obvious obfuscation characteristics, and analysts can perform customized de-obfuscation based on these characteristics. For this reason, this study proposes a code protection technique based on code slice fusion. The technique slices the target code into fragments according to grammatical rules at the source code level and inserts the fragments into different positions of another program according to the execution order and grammatical rules. After repairing the function call process and the data relationships, fusion code is formed that runs the functions of both programs normally. A comparative experiment on the fusion code is carried out from three perspectives: resource overhead, impact on code complexity, and code similarity. The results demonstrate that implicit code obfuscation based on code slice fusion effectively obfuscates code semantics and changes control flow characteristics, without exhibiting obvious obfuscation characteristics. The fusion technique therefore has clear advantages in defeating multiple similarity comparison algorithms.
WANG Hong-Bing , YAN Jia , ZHANG Dan-Dan , LU Rong-Rong
2023, 34(4):1666-1694. DOI: 10.13328/j.cnki.jos.006721
Abstract: As a novel schema of software development, software crowdsourcing has been widely studied by academia and industry. Compared with traditional software development, software crowdsourcing makes full use of developers all over the world to complete complex development tasks, which can effectively reduce costs and improve efficiency. Nevertheless, because there are a large number of complex tasks on current crowdsourcing platforms and inaccurate task matching affects the progress and quality of task solutions, studying the matching problem between developers and tasks is very important. Therefore, this study utilizes developers' dynamic preferences and competitiveness features and proposes a task recommendation model that recommends appropriate software development tasks to developers. First, an attention-based long short-term memory network is adopted to predict a developer's current preference and screen out the top-N tasks that match this preference from the candidate tasks. On this basis, according to the developer's competitiveness, differential-evolution-based extreme gradient boosting is used to predict the developer's scores on the top-N tasks, further filtering out the top-K tasks with the highest scores to recommend to the developer. Finally, to verify the validity of the proposed model, a series of experiments is carried out against existing methods. The experimental results illustrate that the proposed model has significant advantages in task recommendation for software crowdsourcing.
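The two-stage filtering the abstract describes (preference-based top-N, then score-based top-K) can be sketched generically. The two scoring functions below stand in for the paper's LSTM preference predictor and gradient-boosting score predictor and are hypothetical placeholders:

```python
def recommend_tasks(tasks, preference_score, rating_score, n, k):
    """Two-stage task recommendation:
    1. keep the n candidate tasks best matching the developer's
       predicted preference;
    2. among those, return the k tasks with the highest predicted score."""
    top_n = sorted(tasks, key=preference_score, reverse=True)[:n]
    return sorted(top_n, key=rating_score, reverse=True)[:k]
```

The design point of such a pipeline is that the cheap preference filter prunes the large candidate pool before the more expensive per-task score prediction runs, so only N tasks ever reach the second model.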
HU Tian-Xiang , XIE Rui , YE Wei , ZHANG Shi-Kun
2023, 34(4):1695-1710. DOI: 10.13328/j.cnki.jos.006723
Abstract: Code summarization generates brief natural language descriptions of source code pieces, which can assist developers in understanding code and reduce documentation workload. Recent research on code summarization mainly adopts deep learning models, most of which are trained on large datasets consisting of independent code-summary pairs. Despite the technical advances, most of these works, referred to as general code summarization models, ignore the project-level contextual information of code pieces and summaries, on which developers heavily rely when writing documentation. This study investigates project-specific code summarization, a scenario that is much more consistent with human behavior and tool implementation in code summarization. Specifically, a novel deep learning approach is proposed that leverages highly relevant code pieces and their corresponding summaries to characterize contextual semantics, and integrates common knowledge learned from a large-scale cross-project dataset via transfer learning. A dataset for project-specific code summarization is created and released, consisting of 800k method-summary pairs along with their lifecycle information for reproducing accurate code context. Experimental results on this dataset demonstrate that the proposed technique not only gains a large improvement over general code summarization models but also generates more consistent summaries within a project.
CONG Run-Min , ZHANG Chen , XU Mai , LIU Hong-Yu , ZHAO Yao
2023, 34(4):1711-1731. DOI: 10.13328/j.cnki.jos.006700
Abstract:Inspi