Journal of Software, 2013, 24(8): 1852-1867

Fast Clustering-Based Anonymization Algorithm for Data Streams
(College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350108, China; College of Management, Fuzhou University, Fuzhou 350108, China)
Received: July 29, 2011    Revised: March 23, 2012
Abstract: To prevent the disclosure of sensitive information and protect users' privacy, generalization and suppression techniques are often used to anonymize the quasi-identifiers of data before it is shared. Unlike static datasets, data streams are potentially infinite and highly dynamic, so anonymizing them raises more complicated problems, and the methods designed for static datasets cannot be applied directly. Based on an analysis of existing anonymization methods for data streams, this paper proposes a clustering-based anonymization approach for data streams. The approach scans the data only once, recognizing and reusing clusters that satisfy the anonymization requirements in order to speed up the anonymization process. Experimental results on real-world data show that the proposed method satisfies the anonymization requirements while reducing the information loss caused by generalization and suppression, and that it has low time and space complexity.
Key words: data anonymization; data stream; clustering
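The single-pass, cluster-reuse idea summarized in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm, whose details are not given on this page. It assumes k-anonymity as the anonymization requirement, purely numeric quasi-identifiers, and generalization to a bounding interval per attribute. The names `StreamAnonymizer`, `Cluster`, `covers`, and the `max_enlargement` threshold are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Record = Tuple[float, ...]            # quasi-identifier values of one tuple
Box = Tuple[Tuple[float, float], ...]  # one (low, high) interval per attribute


@dataclass(eq=False)
class Cluster:
    """An open cluster: a bounding box plus the records waiting inside it."""
    lo: List[float]
    hi: List[float]
    members: List[Record] = field(default_factory=list)

    def enlargement(self, r: Record) -> float:
        # Total growth of the interval widths if r were added (a simple
        # stand-in for an information-loss measure; an assumption here).
        return sum(
            (max(h, x) - min(l, x)) - (h - l)
            for l, h, x in zip(self.lo, self.hi, r)
        )

    def add(self, r: Record) -> None:
        self.lo = [min(l, x) for l, x in zip(self.lo, r)]
        self.hi = [max(h, x) for h, x in zip(self.hi, r)]
        self.members.append(r)

    def box(self) -> Box:
        return tuple(zip(self.lo, self.hi))


def covers(box: Box, r: Record) -> bool:
    """True if every attribute of r lies inside the box's interval."""
    return all(l <= x <= h for (l, h), x in zip(box, r))


class StreamAnonymizer:
    """Hypothetical one-pass anonymizer that reuses published clusters."""

    def __init__(self, k: int, max_enlargement: float) -> None:
        self.k = k
        self.max_enlargement = max_enlargement  # assumed tuning parameter
        self.open_clusters: List[Cluster] = []
        self.reusable: List[Box] = []  # boxes already published with >= k records

    def process(self, r: Record) -> List[Tuple[Record, Box]]:
        """Consume one record; return the (record, generalization) pairs released now."""
        # Step 1: reuse. A box that already hides at least k records can
        # absorb the new record immediately, with no buffering.
        for box in self.reusable:
            if covers(box, r):
                return [(r, box)]

        # Step 2: otherwise join the open cluster that grows least,
        # or start a new cluster if every candidate would grow too much.
        best = min(self.open_clusters, key=lambda c: c.enlargement(r), default=None)
        if best is None or best.enlargement(r) > self.max_enlargement:
            best = Cluster(lo=list(r), hi=list(r))
            self.open_clusters.append(best)
        best.add(r)

        # Step 3: once a cluster holds k records, publish it and keep its box.
        if len(best.members) < self.k:
            return []
        self.open_clusters.remove(best)
        box = best.box()
        self.reusable.append(box)
        return [(m, box) for m in best.members]
```

In this sketch each record is examined once, and later records that fall inside an already published box are released without waiting for k new neighbours, which is where the speed-up comes from. A practical stream anonymizer would also need to bound the delay of buffered records (suppressing or force-publishing stale ones) and to cap the number of stored boxes. Both are omitted here.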
Foundation items: National Natural Science Foundation of China (70871024); Natural Science Foundation of Fujian Province, China (2010J01358); Science and Technology Development Foundation of Fuzhou University (201-xy-16)
Reference text:
GUO Kun, ZHANG Qi-Shan. Fast Clustering-Based Anonymization Algorithm for Data Streams. Journal of Software, 2013, 24(8): 1852-1867