| P.O.Box 8718, Beijing 100080, China | Journal of Software, May 2006,17(5):1222-1231 |
| E-mail: jos@iscas.ac.cn | ISSN 1000-9825, CODEN RUXUEW, CN 11-2560/TP |
| http://www.jos.org.cn | Copyright © 2006 by Journal of Software |
K-Anonymization Approaches for Supporting Multiple Constraints
Yang Xiaochun, Liu Xiangyu, Wang Bin, Yu Ge
Abstract
K-anonymization is an important approach to protecting data privacy in data publishing. Existing approaches mainly handle a single constraint, but real applications often involve multiple constraints, which makes K-anonymization more complex. Simply applying single-constraint approaches to the multiple-constraint problem may cause high information loss and low efficiency. Building on the idea of Classfly, this paper proposes Classfly+, a family of K-anonymization approaches that support multiple constraints and exploit their features. Three approaches are proposed: the naïve approach, the complete IndepCSet approach, and the partial IndepCSet Classfly+ approach. Experimental results show that Classfly+ can decrease the information loss and improve the efficiency of K-anonymization.
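To make the multiple-constraint setting concrete, the following minimal sketch checks whether an already generalized table satisfies several K-anonymity constraints at once. It is not Classfly+ itself. Representing a constraint as a pair (quasi-identifier attributes, k) is an assumption made for illustration, and the names `satisfies` and `violated_constraints` and the sample table are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, str]
# Assumed representation: (quasi-identifier attribute names, required k).
Constraint = Tuple[Sequence[str], int]


def satisfies(rows: List[Row], attrs: Sequence[str], k: int) -> bool:
    """True if every combination of values over `attrs` occurs at least k times."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    return all(c >= k for c in counts.values())


def violated_constraints(rows: List[Row],
                         constraints: List[Constraint]) -> List[Constraint]:
    """Return the constraints that the table does not yet satisfy."""
    return [c for c in constraints if not satisfies(rows, c[0], c[1])]


# Hypothetical generalized table with three quasi-identifier attributes.
rows = [
    {"age": "2*", "zip": "1100*", "sex": "F"},
    {"age": "2*", "zip": "1100*", "sex": "F"},
    {"age": "3*", "zip": "1100*", "sex": "M"},
    {"age": "3*", "zip": "1100*", "sex": "M"},
]
constraints = [
    (("age", "zip"), 2),
    (("zip", "sex"), 2),
    (("age",), 3),
]
remaining = violated_constraints(rows, constraints)
print(remaining)
```

In this sample the first two constraints hold, while the third fails because each age group contains only two tuples. A table that must satisfy all constraints needs further generalization on `age`, which illustrates why handling each constraint in isolation can lead to repeated processing and extra information loss.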
Yang XC, Liu XY, Wang B, Yu G. K-Anonymization approaches for supporting multiple constraints.
Journal of Software, 2006,17(5):1222-1231.
DOI: 10.1360/jos171222
http://www.jos.org.cn/1000-9825/17/1222.htm
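Abstract (translated from the Chinese)
K-anonymization is an important method for protecting data privacy in data publishing environments. Current K-anonymization methods mainly handle a single constraint, whereas real applications involve a large number of multiple-constraint cases, which makes the K-anonymization problem more complex. Simply applying single-constraint K-anonymization methods to the multiple-constraint case causes heavy information loss and very low processing efficiency. Based on the relationships among the constraints, and inheriting the tuple generalization and filtering idea of the Classfly algorithm, this paper proposes the multiple-constraint K-anonymization method Classfly+ together with three corresponding algorithms: the naïve algorithm, the complete IndepCSet algorithm, and the partial IndepCSet Classfly+ algorithm. Experimental results show that Classfly+ substantially reduces the information loss of multiple-constraint K-anonymization and improves the efficiency of anonymization.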
Funding: This paper was recommended as an outstanding paper of the 2005 China National Computer Conference. Supported by the National Natural Science Foundation of China under Grant Nos. 60503036 and 60573090; the University Key Teacher Award Program for Outstanding Young Teachers in Higher Education Institutions of the Ministry of Education of China; the Natural Science Foundation for Doctoral Career of Liaoning Province of China under Grant No. 20041016; and the National Research Foundation for the Doctoral Program of the Ministry of Education under Grant No. 20030145029.