Journal of Software, 2015, 26(9): 2326-2338

Diversifying Tag Selection Result by Improving Both Coverage and Dissimilarity
WANG Mei-Ling, ZHOU Xiang, TAO Qiu-Ming, ZHAO Chen
(Institute of Software, The Chinese Academy of Sciences, Beijing 100190, China; Graduate University, The Chinese Academy of Sciences, Beijing 100049, China)
Received: September 22, 2013; Revised: July 09, 2014
Abstract: Tag clouds are a popular mechanism that social websites use to summarize and navigate online resources. Tag selection, which picks a limited number of representative tags from a large set of tags, is the core task in creating a tag cloud. The diversity of the tag selection result is an important factor affecting user satisfaction. Information coverage and tag dissimilarity are the two major perspectives for introducing diversity into tag selection. To further improve both the information coverage and the tag dissimilarity of tag selection results, this paper proposes three tag selection approaches. In each approach, an objective function is defined to quantify both the information coverage and the tag dissimilarity of a tag set, and an approximation algorithm is designed to solve the corresponding maximization problem. The approximation ratio of each algorithm is also analyzed. The proposed approaches are compared with existing approaches on tagging datasets extracted from the CiteULike and Last.fm websites. The experimental results show that the proposed approaches perform better in terms of both information coverage and tag dissimilarity.
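The abstract does not give the paper's objective functions or its approximation algorithms, so the following is only a generic sketch of the idea of scoring a tag set by coverage and dissimilarity together and selecting tags greedily. Everything in it is an assumption made for illustration: the function names, the use of Jaccard distance between the resource sets of two tags, the mean-pairwise form of the dissimilarity term, and the weight `lam`. It should not be read as any of the three approaches proposed in the paper.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Dissimilarity of two tags, measured on the sets of resources they annotate."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def objective(selected, tag_resources, lam=0.5):
    """Hypothetical objective: coverage fraction + lam * mean pairwise dissimilarity."""
    all_resources = set().union(*tag_resources.values())
    covered = set()
    for tag in selected:
        covered |= tag_resources[tag]
    coverage = len(covered) / len(all_resources) if all_resources else 0.0

    pairs = list(combinations(selected, 2))
    if pairs:
        dissimilarity = sum(
            jaccard_distance(tag_resources[a], tag_resources[b]) for a, b in pairs
        ) / len(pairs)
    else:
        dissimilarity = 0.0  # a single tag has no pairwise term
    return coverage + lam * dissimilarity


def greedy_select(tag_resources, k, lam=0.5):
    """Pick up to k tags, each time adding the tag that yields the highest objective."""
    selected = []
    candidates = sorted(tag_resources)  # sorted for deterministic tie-breaking
    while candidates and len(selected) < k:
        best_tag, best_score = None, float("-inf")
        for tag in candidates:
            score = objective(selected + [tag], tag_resources, lam)
            if score > best_score:
                best_tag, best_score = tag, score
        selected.append(best_tag)
        candidates.remove(best_tag)
    return selected
```

With a toy input in which `programming` and `python` annotate nearly the same resources while `music` annotates different ones, this greedy procedure prefers `programming` and `music` over the two overlapping tags, which is the kind of diversification the abstract describes. A plain greedy loop like this carries no approximation guarantee by itself; the paper analyzes the approximation ratios of its own algorithms.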
Foundation items: National Natural Science Foundation of China (61100067); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06010600)
Reference text:

WANG Mei-Ling, ZHOU Xiang, TAO Qiu-Ming, ZHAO Chen. Diversifying Tag Selection Result by Improving Both Coverage and Dissimilarity. Journal of Software, 2015, 26(9): 2326-2338