Code to Policy Consistency Detection for Mini Program Based on Semantic Analysis

doi:10.13328/j.cnki.jos.007387

Journal of Software (软件学报), 2025, Vol. 36, No. 11, pp. 5102-5117

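CLC number: TP311
Funding: National Natural Science Foundation of China (62172027, U24B20117); National Key Research and Development Program of China (2020YFB1005601); Zhejiang Provincial Natural Science Foundation (LZ23F020013)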

Abstract:

Mini programs are required to provide a privacy policy that tells users which types of private information will be used and for what purpose. A mini program whose code is inconsistent with its privacy policy may deceive users and lead to privacy leakage. Among existing consistency detection methods, those that convert both the code and the policy into predefined labels for comparison lose information during the conversion, which causes missed detections, while methods that rely on code analysis alone struggle with obfuscated mini program code. To address these problems, a semantic-analysis-based method for detecting consistency between mini program code and privacy policies is proposed. Code behaviors are extracted from the results of a taint analysis customized to mini program coding paradigms, and a code language processing model represents the code that uses sensitive resources as natural language descriptions. Expert reviewers then manually check these descriptions against the resource usage purposes stated in the privacy policy. Experimental results show that the taint analysis module covers all three data return methods and four common data flow patterns of mini program APIs, and discovers more sensitive behaviors than comparable methods. Semantic analysis of tens of thousands of mini programs reveals privacy leakage risks in some behaviors of frequently called APIs. Case studies using the MiniChecker tool further identify real-world mini programs whose code is inconsistent with their privacy policies.
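
The pipeline summarized in the abstract (taint analysis, then a natural language description of each sensitive data flow for a human reviewer) can be illustrated with a toy sketch. This is not the paper's implementation: the `Stmt` statement form, the `SOURCES` and `SINKS` tables, and the `describe` template are all hypothetical, and the fixed template only stands in for the code language processing model mentioned above. The API names `wx.getLocation` and `wx.request` are used purely as example strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Stmt:
    """Hypothetical simplified statement: target = callee(args) or target = args."""
    target: Optional[str]   # variable assigned, or None for a bare call
    callee: Optional[str]   # API called, or None for a plain assignment
    args: Tuple[str, ...]   # variables read by the statement


# Assumed, illustrative tables; a real tool would use a curated API list.
SOURCES: Dict[str, str] = {"wx.getLocation": "location"}
SINKS: Set[str] = {"wx.request"}


def find_flows(stmts: List[Stmt]) -> List[Tuple[str, str]]:
    """Forward taint propagation over straight-line statements.

    Returns (resource, sink) pairs for every sensitive resource
    that reaches a sink call through assignments.
    """
    tainted: Dict[str, Set[str]] = {}
    flows: List[Tuple[str, str]] = []
    for s in stmts:
        # Collect the labels carried by the variables this statement reads.
        labels: Set[str] = set()
        for arg in s.args:
            labels |= tainted.get(arg, set())
        # A source call introduces a new label.
        if s.callee in SOURCES:
            labels.add(SOURCES[s.callee])
        # A sink call that reads tainted data is reported as a flow.
        if s.callee in SINKS:
            flows.extend((label, s.callee) for label in sorted(labels))
        # Update (or clear) the taint of the assigned variable.
        if s.target is not None:
            if labels:
                tainted[s.target] = labels
            else:
                tainted.pop(s.target, None)
    return flows


def describe(flow: Tuple[str, str]) -> str:
    """Template stand-in for a model-generated behavior description."""
    resource, sink = flow
    return f"Sensitive resource '{resource}' reaches sink {sink}"


# Example: a location value is wrapped in a payload and sent over the network.
program = [
    Stmt("loc", "wx.getLocation", ()),
    Stmt("payload", None, ("loc",)),
    Stmt(None, "wx.request", ("payload",)),
]
flows = find_flows(program)
descriptions = [describe(f) for f in flows]
```

A reviewer would then compare each description with the purposes stated in the privacy policy. The sketch handles only straight-line code and a single synchronous return style, so it does not reflect the three data return methods or four data flow patterns the abstract refers to.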

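Cite this article

Liu Lipei, Mao Jian, Lin Qixiao, Lü Yusong, Li Jiawei, Liu Jianwei (刘力沛, 毛剑, 林其箫, 吕雨松, 李嘉维, 刘建伟). Code to Policy Consistency Detection for Mini Program Based on Semantic Analysis. Journal of Software (软件学报), 2025, 36(11): 5102-5117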

History

Received: 2024-06-06
Last revised: 2024-08-29
Published online: 2025-07-17
Publication date: 2025-11-06
