LI Yi , CAI Tian-Xun , FAN Jian-Feng , WU Wen-Yuan , FENG Yong
2019, 30(7):1903-1915. DOI: 10.13328/j.cnki.jos.005748
Abstract:Synthesizing ranking functions of polynomial loop programs is the dominant method for checking their termination. In this study, the synthesis of ranking functions of a class of polynomial loop program is reduced to the binary problem. The support vector machine (SVM) technique then is applied to solve such the binary problem. This naturally relates detection of ranking functions to SVM. Different from the CAD-based method for synthesizing ranking functions, the proposed method can get more expressive polynomial ranking functions in an acceptable time.
WUNIRI Qi-Qi-Ge , LI Xiao-Ping , MA Shi-Long , LÜ Jiang-Hua , ZHANG Si-Qing
2019, 30(7):1916-1938. DOI: 10.13328/j.cnki.jos.005749
Abstract:According to the authoritative statistics, more than 70% of software errors during the test are introduced in requirements gathering and analysis or architectural design. The design and verification of the software architecture is essential to improve the quality of application software. The existing application software design methods do not support the verification of requirements, and they usually need the support of other verification tools. In this study, with the background of Web application architecture design and verification, a software architecture modelling and verification language (SAML) and a software architecture modelling and verification method (SAMM), which are based on the higher-order type theory, are proposed. In the SAML language, the syntax and semantics of the types, the ordinary terms as well as the type terms are defined to describe the structure of types and objects, the typing rules are defined to process the judgments of Γ|-t:T and Γ|-R(T1,T2). In the SAMM method, the software architecture modelling paradigm is presented, of which is consisted of the five layers of modelling including Mcls (interface type layer), Mcmpt (component layer), Mcont (container layer), Mfrm (frame layer), and Mfrwk (framework layer). In each layer, modelling of the types and the relations of the types are needed, while the typing rules corresponding to the type relations are automatically generated. Furthermore, the method invocation graph (GSA) is defined to describe the design requirements and the type sequences and its correctness are defined to describe the properties of user requirements, and the related checking algorithms are given. The prototype of the software architecture modelling and verification system as a modelling and verifying tool is developed, to which support to the design process by modelling and the automatic verification of the design regarding to the requirements. Finally, the method is applied to a real case of large scale by the design of software architecture and its verification and evaluation.
2019, 30(7):1939-1952. DOI: 10.13328/j.cnki.jos.005750
Abstract:In recent years, the verification technology based on Petri net coverability has been successfully applied to the verification and analysis of concurrent programs. However, due to the complexity of Petri net coverability, such technology has great limitations in application. For large scale input model, they all have timeout problem. While a subsystem of Petri net-communication-free Petri net, whose reachability and coverability are both NP-complete, can be used as a verification model of certain class of concurrent program because of its expression. In this study, a tool called CFPCV is designed and implemented, which can verify coverability of communication-free Petri net very efficiently. A constraint-based approach is used to extract the constraints from the model and solve the constraints with Z3 SMT solver, and then the candidate solutions are verified with subnet markable method to ensure that each solution is correct. The success rate, iteration times, and performance of the tool are experimentally analyzed, and it is found that the proposed algorithm has not only a high success rate but also excellent performance.
2019, 30(7):1953-1965. DOI: 10.13328/j.cnki.jos.005752
Abstract:Duration calculus is a kind of interval temporal logic, which is designed for specifying and reasoning the properties about the embedded real-time systems and hybrid systems. Extend linear duration invariants (ELDI) is an important subset of duration calculus. In this study, a bounded model checking algorithm of ELDI formulas against real-time automaton is proposed. The bounded model checking problem of ELDI is reduced into the validity problem of Quantified linear real arithmetic (QLRA) formulas, which can be solved by Quantifier elimination (QE) technique. Firstly, by using deep first search algorithm, the real-time automaton is searched to find all the symbolic paths segments which satisfy the constraints of observation time length. Secondly, the paths segments are transformed into QLRA formulas. Finally, the QLRA formulas are solved by QE tools. Thus, compared with the related works, a verification algorithm of ELDIs is proposed against real-time automaton with lower complexity. In addition, the practical speed of verifying process is accelerated.
LUO Chen-Xia , WANG Rui , GUAN Yong , LI Xiao-Juan , SHI Zhi-Ping , Xiaoyu SONG
2019, 30(7):1966-1979. DOI: 10.13328/j.cnki.jos.005753
Abstract:Cyber-physical systems (CPS) is a system that integrates physical and computational elements based on context-awareness. It can intelligently respond to dynamic changes in the real world and has important and broad application prospects. However, CPS works in a complex physical environment, and changes around it can affect the behavior of CPS. Therefore, ensuring the safety and reliability of CPS in complex environments is critical. This study proposes an integrated modeling method for real-time data. By defining a series of rules, the domain environment model is combined into the runtime verification process to ensure the safety and reliability of CPS in an uncertain environment. First, the method builds a mathematical model for the environment. Then, design the merge rules to merge the mathematical models with only one environmental factor under the same system parameter into a mathematical model with one or more environmental factors under the same system parameter. Next, the transformation rules are defined to convert the mathematical model into an environment model represented by pseudocode. Finally, the environment model is combined into the runtime monitoring model to perform verification according to the combination rules. The method makes the verification process more complete and accurate. When the environment changes, it ensures that the safety properties in the CPS are still satisfied by dynamically adjusting the parameter range. Finally, the method is applied to the mobile robot obstacle avoidance experiment, to model the temperature and humidity physical environment and then, to combine it into the monitoring model, eventually, the life time safety reminder is accurately given in different environments.
YANG Zhi-Bin , YUAN Sheng-Hao , XIE Jian , ZHOU Yong , CHEN Zhe , XUE Lei , Jean-Paul BODEVIX , Mamoun FILALI
2019, 30(7):1980-2002. DOI: 10.13328/j.cnki.jos.005754
Abstract:Multi-core processors are being widely used in safety-critical systems, due to the demands of higher computation performance when designing these systems, and strengths of multi-core processors, such as faster computation and SWaP (size, weight, and power) properties. Synchronous languages are suitable for modeling and verification of safety-critical software due to their abilities, e.g. the description of concurrency behaviors and precise timing semantics. At present, the SIGNAL compiler supports to generate the sequential code from synchronous specification. The existing studies pay a little attention to the generation of parallel code from SIGNAL specification. The paper presents a multi-threaded code generation tool for synchronous language. Firstly, the SIGNAL specification is transformed into the intermediate program S-CGA and is carried out the clock calculus. After that, the S-CGA program is transformed into CDDG (clock data dependency graph). Then, the CDDG is partitioned by topological sort, after which an optimized algorithm and a partition algorithm are respectively proposed based on pipeline-style. Finally, the partition results are transformed to VMT (virtual multi-threaded) code which is then transformed into executable multi-threaded C/Java program. The experiment running on multi-core CPUs is given to verify the effectiveness of the proposed methodology.
KANG Yue-Xin , GAN Yuan-Ke , WANG Sheng-Yuan
2019, 30(7):2003-2017. DOI: 10.13328/j.cnki.jos.005755
Abstract:Synchronous data-flow languages, such as Lustre and Signal, have been widely used in safety-critical industrial areas, such as airplane, high-speed railways, nuclear power plants, and so on, for example, Scade, a tool suitable for modeling and developing a real-time control systems in those areas, is based on a Lustre-like language. Naturally, the safety of development tools, especially compilers, for such languages, has been paid highly attentions. In recent years, the construction of a trustworthy compiler based on formal verification has become one of the research focuses in the field of programming language, and remarkable achievements have also been achieved, for example, a product level trustworthy C compiler has been developed in the CompCert project. Similarly, this method has been used to develop the trustworthy compilers of a synchronous data flow language. Two long term projects for such research work, called Vélus and L2C respectively in this study, are focused here. Either of them has been developing a compiler for a Lustre-like synchronous data-flow language. The analysis and comparison are deeply carried out in terms of various points between Vélus and L2C.
YANG Kang , WANG Rui , GUAN Yong , LI Xiao-Juan , SHI Zhi-Ping , Xiaoyu SONG
2019, 30(7):2018-2032. DOI: 10.13328/j.cnki.jos.005756
Abstract:Cyber-physical systems (CPS) are next-generation intelligent systems based on environment-aware computing, communication, and physical elements. They are widely used in security-critical systems and industrial control. The interaction of information technology and the physical world makes CPS vulnerable to various malicious attacks, thereby undermining its security. This work mainly studies the attack detection problem of sensors in CPS systems with transient faults. This study considers a system with multiple sensors measuring the same physical variables, and some sensors may be malicious attacked and provide erroneous measurements. In addition, this study uses an abstract sensor model where each sensor provides the controller with an interval of possible values for the true value. Existing methods for detecting sensor malicious attacks are conservative. When a professional attacker manipulates the sensor's output slightly or infrequently over a period of time, existing methods are difficult to capture attacks, such as stealth attacks. In order to solve this problem, this study designs a sensor attack detection algorithm based on fusion intervals and historical measurements. First, the algorithm constructs different fault models for different sensors, integrates historical measurements into the attack detection method using system dynamics equations, and analyzes sensor measurements from different aspects. In addition, combined with historical measurement and fusion interval, the problem of whether there are faults when the two sensors intersect is solved. The core idea of this method is to detect and identify attack by using pairwise inconsistency between sensors. This study obtains real measurement data from EV3 ground vehicles to verify the performance of the algorithm. The experimental results show that the proposed method is superior to the state-of-the-art algorithm, and has better detection and recognition performance for various attack types. Especially for stealth attacks, the detection rate and recognition rate are increased by more than 90%.
ZHU Kai , WU Guo-Qing , WU Li-Hua , YUAN Meng-Ting
2019, 30(7):2033-2051. DOI: 10.13328/j.cnki.jos.005757
Abstract:The reset sequences of finite automata, also known as the synchronizing sequences, have a characteristic:a finite automaton can reach a certain state qw by running a reset sequence from any unknown or unobservable state q0. This is dependent only on the reset sequence w itself, not on the state q0 of the finite automaton at the beginning of running the sequence w. It can be used to restore the running partially observable and complex systems automatically that needs no resetting, and sometimes even that cannot reset. Therefore, the reset problem has been paid attention to since it emerged and has been studied continuously. Recently, it has extended to infinite state models that can describe the complex systems, including distributed and embedded real-time systems, such as timed automata, register automata, etc. In this work, the computational complexity of several problems for the resetting timed automata is studied, and the strong connection between resettability problem and reachability problem for timed automata is found. The main contribution includes:(1) the complexity of the problem for resetting the complete and deterministic timed automata is updated more precisely with the recent achievements in reachability problem for timed automata; (2) the complexity of the problem for resetting the partially specified timed automata is studied. Even if the size of input alphabet is decreased to 2, it is still PSPACE-complete, and in the case of single clock, it is NLOGSPACE-complete; (3) for the complete and nondeterministic timed automata, Di-resetting problems (i=1,2,3) are all undecidable. The resetting problem for nondeterministic register automata and nondeterministic timed automata can be inter-reduced in exponential time, and the reduction in exponential time is closed for relatively high computational complexity classes. Therefore, it concludes that the problem for resetting it in single clock case is Ackermann-complete, and that bounded version is NEXPTIME-complete from the results on corresponding nondeterministic register automata. These conclusions show that most of resetting problems for timed automata are intractable. On the one hand, they make a solid theoretical foundation for checking and solving the resettability of the timed systems, on the other hand, they guide to seek for some subclasses of real time system which have particular structure and effective algorithms for solving it.
KONG Fang , WANG Hong-Ling , ZHOU Guo-Dong
2019, 30(7):2052-2072. DOI: 10.13328/j.cnki.jos.005834
Abstract:Natural language is usually understood from discourse perspective. With the success of research at the lexical and sentence levels, the focus of natural language processing research has shifted to the discourse level. The object of discourse analysis is to analyze the text structure and the semantic relationship between discourse units, and thus understand the text. According to different purposes, discourse units and their relationships can be expressed as different textual structures, and the study of them can provide different levels of text comprehension. Currently, there are few studies on the inherent laws of Chinese texts and the lack of theory and method system for effective analysis and in-depth understanding of Chinese discourse has seriously restricts the relevant research and application. This study focuses on two basic features of a text, namely cohesion and coherence. From three aspects of theoretical research, resource construction, and computational model of discourse analysis, it explores the rhetorical structure (reflecting text coherence) and topic structure (reflecting text cohesion) respectively. It summarizes the current research, and presents the main problems and research trends.
ZHANG Jian , DING Shi-Fei , ZHANG Nan , DU Peng , DU Wei , YU Wen-Jia
2019, 30(7):2073-2090. DOI: 10.13328/j.cnki.jos.005840
Abstract:The Probabilistic graph is a research hotspot in machine learning at present. Generative models based on probabilistic graphs model have been widely used in image generation and speech processing. The restricted Boltzmann machines (RBMs) is a probabilistic undirected graph, which has important research value in modeling data distribution. On the one hand, the RBMs model can be used to construct deep neural network, and on the other hand, it can provide statistical support of deep nets. This paper mainly summarizes the related research of RBMs based probability graph model and their applications in image recognition. Firstly, this paper introduces the basic concepts and training algorithms of RBMs. Secondly, this paper summarizes the applications of RBMs in deep learning; and then, this paper discusses existing problems in research of neural nets and RBMs. Finally, this paper gives a summary and prospect of the research on the RBMs.
ZHU Su-Yang , LI Shou-Shan , ZHOU Guo-Dong
2019, 30(7):2091-2108. DOI: 10.13328/j.cnki.jos.005838
Abstract:Emotion analysis, which aims to determine the emotion contained in a piece of text via training a machine learning model, is a fine-grained sentiment analysis task. Emotion analysis can be divided into two tasks:Emotion classification and emotion regression. In this paper, an adversarial neural network is proposed for multi-dimensional emotion regression task. The proposed network consists of three modules:Feature extractors, regressors, and a discriminator. The network aims to train multiple feature extractors and regressors to score for multiple emotion dimensions for a textual input. Feature extractors take a text as inputs, and extract different feature vectors for different emotion dimensions. Regressors take extracted feature vectors as inputs to score for multiple emotion dimensions. The discriminator take an extracted feature vector as its input, and discriminate for which emotion dimension the feature vector is extracted. The proposed approach conducts adversarial training between different feature extractors via the discriminator in order to training feature extractors which can extract more generalized features for multiple emotion dimensions. Empirical studies on EMOBANK corpus demonstrate the notable improvements in r-value achieved by the proposed approach on EMOBANK readers' and writers' emotion regression in news domain and fictions domain compared to all baseline systems, including several state-of-the-art text regression systems.
2019, 30(7):2109-2123. DOI: 10.13328/j.cnki.jos.005489
Abstract:With the fast development of open source software and wide application of development supporting tools, there have been a great many of open software development data on the Internet. To improve the software development efficiency and product quality, more and more practitioners and researchers attempt to obtain insights of software development from the data. To facilitate the data analyses and their reproduction and comparison, building and using shared datasets are proposed and practiced. However, the existing datasets are lack of traceability of dataset construction process, application scope, and consideration of data variation over time and with environment changes, which threat the data quality and analysis validity. To address these problems, an advanced approach is proposed for sharing and using the software development datasets. It constructs datasets with multiple levels and multiple versions. Through multiple levels, the datasets remain the raw data, intermediate data, and final data to possess data traceability. Meanwhile, by multiple versions, users can compare and observe the data variety to verify and improve data quality and analysis validity. Based on the previously constructed Mozilla issue tracking dataset, it is demonstrated that how to build and use multi-level and multi-version software development dataset and verified that the proposed approach can help users efficiently use the dataset.
XU Lei , LIU Rui-Cheng , CHEN Gui-Mei , ZHAO Chen , ZHANG Wei-Feng
2019, 30(7):2124-2138. DOI: 10.13328/j.cnki.jos.005490
Abstract:Online advertisement (short as ad) has become one of the most important business patterns, with the rapid development of Internet. Online advertisements are main economic sources of Web applications, but the negative affect is that ads may leak users' privacy, or increase loads of browsers' performance. In order to study online ads systematically, it is necessary to obtain a complete call path in the whole generating process. However, since the sizes of the loaded JavaScript files are usually large, the function call path is long, and even worse, the JavaScript code in the path is compressed and confused, it is difficult to get the call path of the online ads through static analysis method. This study tracks the call path of online ads dynamically, namely instruments the relevant codes at first, then uses the function parameters to transmit the call information and records the internal call information in each iframe, finally, by matching and merging the information in multiple iframes, a complete ad call path about the generating process of online ads is generated. The experiment focused on 21 real websites, and the results show that:the proposed method can obtain the dynamic loading information of ads and generate the whole call paths, which are impossible for static methods, and the overhead is acceptable.
WANG Xin , ZOU Lei , WANG Chao-Kun , PENG Peng , FENG Zhi-Yong
2019, 30(7):2139-2174. DOI: 10.13328/j.cnki.jos.005841
Abstract:Knowledge graphs have become the cornerstone of artificial intelligence. The construction and publication of large-scale knowledge graphs in various domains have posed new challenges on the data management of knowledge graphs. In this paper, in accordance with the structural and operational elements of a data model, the current theories, methods, technologies, and systems of knowledge graph data management are surveyed. First, the paper introduces knowledge graph data models, including the RDF graph model and the property graph model, and also introduces 5 knowledge graph query languages, including SPARQL, Cypher, Gremlin, PGQL, and G-CORE. Second, the storage management schemes of knowledge graphs are presented, including relational-based and native approaches. Third, three kinds of query operations are discussed, which are graph pattern matching, navigational, and analytical queries. Fourth, the paper introduces mainstream knowledge graph database management systems, which are categorized into RDF triple stores and native graph databases. Meanwhile, the state-of-the-art distributed systems and frameworks that are used for processing knowledge graphs are also described, and benchmarks are presented for knowledge graphs. Finally, the future research directions of knowledge graph data management are put forward as well.
FENG Shuo , SHEN De-Rong , NIE Tie-Zheng , KOU Yue , YU Ge
2019, 30(7):2175-2187. DOI: 10.13328/j.cnki.jos.005831
Abstract:With the popularization of Internet, plenty of social networks come into lives. To enjoy different services, users usually take part in multiple social networks simultaneously. Therefore, user identification across social networks has become a hot research topic. In this study, social network structure is used to solve the problem of network alignment. Firstly, the problem of network alignment is formalized as the problem of maximum common subgraph (α-MCS). A method is proposed to determine parameter α adaptively. Compared with the other heuristic methods on determiningα, the proposed method can distinguish matched users and unmatched users effectively on different kinds of social networks. Secondly, in order to fast answer α-MCS, algorithm MCS_INA (α-MCS based iterative network alignment algorithm) is proposed. MCS_INA mainly contains two steps in each iteration. In the first step, MCS_INA aims at selecting candidates in the two networks respectively. In the second step, a mapping algorithm is proposed to match candidates. Compared with other methods, MCS_INA has lower time complexity and higher identification accuracy on different networks. At last, experiments are conducted on real-world and synthetic datasets to demonstrate the effectiveness of the proposed algorithm MCS_INA.
CAO Meng , TIAN Qing , MA Ting-Huai , CHEN Song-Can
2019, 30(7):2188-2207. DOI: 10.13328/j.cnki.jos.005837
Abstract:Over the past decades, human facial attributes (e.g. gender, age, and race) estimation has received large amount of attention and research due to its potential applications, and variety of methods have been proposed to address it. This article is devoted to review related works comprehensively and give references for researchers. Firstly, in accordance with whether exploiting the potential correlations between these facial attributes, the existing approaches are classified into naïve and natural groups and they are reviewed within each group. Then, in terms of incompleteness of annotated labels, considered attributes, and correlations utilization, the drawbacks of existing methods are analyzed. Finally, future research directions are provided at the end of this work.
SONG Chuan-Ming , MIN Xin , YAN Xiao-Hong , WANG Xiang-Hai , YIN Bao-Cai
2019, 30(7):2208-2226. DOI: 10.13328/j.cnki.jos.005487
Abstract:Elastic motion estimation is an effective temporal predictive coding technique of video proposed in recent years. But its optimization solution based on Gauss-Newton method still exhibits the problem of high computational complexity and unstable convergence yet. Thus an elastic motion estimation algorithm is addressed based on an improved Levenberg-Marquardt (L-M) method. First, a fast implementation of the L-M Hessian matrix is designed according to the numerical symmetry of elastic basis function and the Hessian matrix, which reduces its computational complexity by 62.5%. Second, it is found that the update factor of L-M diagonal matrix's damping coefficient has obvious influence on the performance of elastic motion estimation through theoretical and experimental analyses. The squared ratio of the step size in the latest two iterations is used to adaptively determine the update factor, by which the damping coefficient is updated positively and negatively in turn. Experimental results show that the proposed algorithm is able to obtain stable performance for the video sequences with various spatial resolution and scene characteristics. It gains 2.54 dB and 1.77 dB higher average motion-compensated peak signal-to-noise ratio (PSNR) than those of the full search based on block-wise translational model and the elastic motion estimation based on modified Gauss-Newton method, respectively. Furthermore, the proposed algorithm converges fast. Only 1~2 iterations are needed before it achieves higher PSNR than the conventional elastic motion estimation and the block-wise translational full search.