YU Heng-Biao, YI Xin, FAN Xiao-Kang, TANG Tao, HUANG Chun, YIN Bang-Hu, WANG Ji
2025, 36(12):5387-5401. DOI: 10.13328/j.cnki.jos.007406 CSTR: 32375.14.jos.007406
Abstract: The compiler is one of the performance tuning tools that program developers rely on most. However, because floating-point numbers are encoded with limited precision, many compiler optimization options can alter the semantics of floating-point calculations and make results inconsistent. Locating the program statements responsible for such optimization-induced inconsistency is crucial for performance tuning and result reproducibility. The state-of-the-art approach uses a precision enhancement-based binary search to locate the offending code snippets, but it offers insufficient support for multi-source localization and searches inefficiently. This study proposes FI3D, a Delta-Debugging localization method guided by floating-point instruction differences. FI3D uses the backtracking mechanism of Delta-Debugging to better support localization when several code regions jointly cause the problem, and it exploits the differences between the floating-point instruction sequences generated under different compiler optimization options to guide the search. FI3D is evaluated on 6 applications from the NPB benchmark, 10 programs from the GNU Scientific Library, and 2 programs from the floatsmith mixed-precision benchmark. Experimental results show that FI3D succeeds on the 4 applications where PLiner fails and achieves an average 26.8% performance improvement on the 14 cases that PLiner also locates successfully.
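As a rough illustration of the search strategy this abstract builds on, the following is a minimal sketch of classic Delta-Debugging (ddmin) applied to a multi-source case. It is not FI3D itself: the instruction-difference guidance is omitted, and the predicate `inconsistent`, the statement indices, and the culprit set `{3, 11}` are hypothetical stand-ins for "compile only these statements with the aggressive optimization and compare the result against the reference build".

```python
from typing import Callable, List, Sequence


def ddmin(items: Sequence[int], fails: Callable[[List[int]], bool]) -> List[int]:
    """Shrink `items` (unique values) to a 1-minimal subset for which `fails` holds."""
    current = list(items)
    if not fails(current):
        raise ValueError("the full set must reproduce the inconsistency")
    n = 2  # current granularity: number of chunks to split into
    while len(current) >= 2:
        chunk = max(1, len(current) // n)
        subsets = [current[i:i + chunk] for i in range(0, len(current), chunk)]
        reduced = False
        # Step 1: try each chunk on its own.
        for s in subsets:
            if fails(s):
                current, n, reduced = s, 2, True
                break
        # Step 2: otherwise try removing one chunk at a time (complements).
        # This is what lets the search keep several culprits together.
        if not reduced:
            for s in subsets:
                complement = [x for x in current if x not in s]
                if complement and fails(complement):
                    current, n, reduced = complement, max(n - 1, 2), True
                    break
        # Step 3: nothing worked, so refine the granularity or stop.
        if not reduced:
            if n >= len(current):
                break
            n = min(len(current), n * 2)
    return current


# Hypothetical setup: 16 statements, and the inconsistency only shows up
# when statements 3 and 11 are BOTH compiled with the aggressive option.
CULPRITS = {3, 11}


def inconsistent(optimized: List[int]) -> bool:
    """Stand-in for 'build with these statements optimized, then compare results'."""
    return CULPRITS.issubset(optimized)


located = ddmin(list(range(16)), inconsistent)
print(located)
```

A plain binary search would discard one half at the first split and lose one of the two culprits; the complement step is what lets ddmin keep both candidates alive in this toy setting.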
SHEN Li, ZHOU Wen-Hao, WANG Fei, LI Bin, TAN Jian, SHANG Hong-Hui, AN Hong, QI Feng-Bin
2025, 36(12):5402-5422. DOI: 10.13328/j.cnki.jos.007407 CSTR: 32375.14.jos.007407
Abstract: With the increasing adoption of heterogeneous integrated architectures in high-performance computing, it has become essential to harness their potential and explore new strategies for application development. Traditional static compilation methodologies are no longer sufficient to meet complex computational demands, so dynamic programming languages, known for their flexibility and efficiency, are gaining prominence. Julia, a modern high-performance language characterized by its JIT compilation mechanism, has demonstrated strong performance in fields such as scientific computing. To target the unique features of the Sunway heterogeneous many-core architecture, this work introduces the ORCJIT engine together with an on-chip storage management approach designed specifically for dynamic modes. Building on these, swJulia is developed as a Julia dynamic language compiler tailored to the new generation of the Sunway supercomputer. The compiler inherits the flexibility of the Julia compiler and also provides robust support for the SACA many-core programming model and runtime encapsulation. Using the swJulia compilation system, the NNQS-Transformer quantum chemistry simulator is successfully deployed on the new generation of the Sunway supercomputer. Validation across multiple dimensions demonstrates the efficacy and efficiency of swJulia. Experimental results show strong performance in single-threaded benchmark tests and in many-core acceleration, and a significant improvement in ultra-large-scale parallel simulations with the NNQS-Transformer quantum chemistry simulator.
ZHANG Yang-Yang, ZHANG Hao, GAN Tao, LENG Chang, HUANG Cheng-Chao, ZHANG Li-Jun
2025, 36(12):5423-5437. DOI: 10.13328/j.cnki.jos.007408 CSTR: 32375.14.jos.007408
Abstract: With the rapid development of autonomous driving technology, vehicle control takeover has become a prominent research topic. A car equipped with an assisted driving system cannot handle every driving scenario. When the actual scenario exceeds the operational design domain of the assisted system, the human driver must still intervene to control the vehicle and complete the driving task safely. Takeover performance, which comprises takeover reaction time and takeover quality, is a key metric for evaluating a driver’s behavior during the takeover process. Takeover reaction time is the time from the system’s takeover request to the moment the driver takes control of the steering wheel. Its length not only reflects the driver’s current state but also affects the subsequent handling of complex scenarios. Takeover quality is the quality of the driver’s manual operation of the vehicle after regaining control. Based on the CARLA driving simulator, this study constructs 6 typical driving scenarios, simulates the vehicle control takeover process, and collects physiological signals and eye movement data from 31 drivers using a multi-channel acquisition system. Based on the drivers’ takeover performance and with reference to international standards, an objective takeover performance evaluation metric is proposed. Derived from multiple vehicle data, it incorporates the driver’s takeover reaction time, the maximum horizontal and vertical accelerations, and the minimum time to collision. A deep neural network (DNN) model combines driver data, vehicle data, and scenario data to predict takeover performance, and SHAP is used to analyze the impact of each feature, which improves the model’s interpretability and transparency. The experimental results show that the proposed DNN model outperforms traditional machine learning methods in predicting takeover performance, achieving an accuracy of 92.2% and demonstrating good generalization. The SHAP analysis reveals the impact of key features such as heart rate variability, driving experience, and minimum safe distance on the prediction results. This research provides a theoretical and empirical foundation for the safety optimization and human-computer interaction design of autonomous driving systems, and it can help improve the efficiency and safety of human-vehicle cooperation in autonomous driving.
LI Chun-Yi, MA Zhi, WU Qiang, WANG Xiao-Bing, ZHAO Liang
2025, 36(12):5438-5455. DOI: 10.13328/j.cnki.jos.007409 CSTR: 32375.14.jos.007409
Abstract: Temporal logic has been extensively applied in domains such as formal verification and robotics co