Identification of Misleading Product Description in E-Commerce Website
LONG Yin, LIU Hong-Yan, HE Jun, HU He, DU Xiao-Yong
(Key Laboratory of Data Engineering and Knowledge Engineering of the Ministry of Education (Renmin University of China), Beijing 100872, China; School of Information, Renmin University of China, Beijing 100872, China; School of Economics and Management, Tsinghua University, Beijing 100084, China)
Received: May 7, 2014    Revised: August 19, 2014
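> Chinese abstract (translated): Online shopping has been accepted by a growing number of consumers. As mainstream shopping platforms, C2C websites offer consumers tens of thousands of product listings, a certain number of which carry misleading product descriptions. A description is misleading when it does not match the product's actual price; typically the described product appears cheaper than it should be, which attracts consumers and misleads them to the seller's shopping page. This both distorts consumers' judgment and damages the shopping website's credibility. To find such misleading product descriptions, a method is proposed that combines the probabilistic model HMM with statistical outlier detection and can effectively identify misleading product descriptions. From a probabilistic perspective, the HMM effectively determines the product that a description refers to, offering a practical solution to the ambiguous product references caused by non-standard descriptions on C2C websites. The statistical outlier detection method is relatively effective when the product information on C2C websites is fairly limited. The method was tested on a data set from a real e-commerce website, and the experimental results demonstrate its effectiveness.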
Keywords: misleading description; HMM; outlier detection
Abstract: Online shopping has been accepted by more and more consumers. As mainstream e-commerce platforms, C2C websites offer consumers thousands of product offers. When customers search for products on a C2C website, some of the returned offers carry misleading descriptions. A misleading description is one that does not match the actual price of the product; typically it suggests a price well below what the product should cost in order to attract more consumers. Misleading offers distort consumers' judgment and damage the website's reputation. This paper proposes an approach that combines a probabilistic model, the hidden Markov model (HMM), with a statistical outlier detection method to detect misleading offers. The HMM is built to determine the product that an offer description actually designates, which provides an effective way to resolve the ambiguity caused by irregular offer descriptions. The statistical outlier detection method is effective when the product information available for offers on C2C websites is limited. Experiments on a real data set from e-commerce websites demonstrate the effectiveness of the proposed approach.
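The abstract does not say which outlier statistic the paper uses, so the following is only an illustrative sketch of the second stage. It assumes offers have already been grouped by the product they designate (the role the HMM plays in the paper) and flags offers whose price is unusually low within a group. The function name `flag_low_price_outliers`, the median/MAD modified z-score, and the 3.5 threshold are all assumptions chosen for illustration, not the authors' method.

```python
from statistics import median


def flag_low_price_outliers(prices, threshold=3.5):
    """Return indices of prices that are unusually low within one product group.

    Hypothetical helper: uses a modified z-score based on the median and the
    median absolute deviation (MAD). The paper's actual statistic is not
    given in the abstract.
    """
    med = median(prices)
    # MAD is a robust estimate of spread that a few extreme prices cannot distort.
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        # More than half of the prices are identical; no usable spread estimate.
        return []
    flagged = []
    for i, price in enumerate(prices):
        # 0.6745 rescales MAD so the score is comparable to a z-score
        # when prices are roughly normally distributed.
        score = 0.6745 * (price - med) / mad
        if score < -threshold:  # only suspiciously LOW prices are of interest
            flagged.append(i)
    return flagged


# Made-up prices for offers that all designate the same product.
offers = [5200.0, 5100.0, 5350.0, 4990.0, 5280.0, 99.0]
print(flag_low_price_outliers(offers))  # prints [5]
```

A median-based score is used here because a single very low price would pull a mean and standard deviation toward itself and could mask the very offer being sought.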
Foundation items: National Natural Science Foundation of China (71272029, 71110107027); National Social Science Foundation of China (12&ZD220); National High Technology Research and Development Program of China (863 Program) (2014AA015204)
Reference text:
LONG Yin, LIU Hong-Yan, HE Jun, HU He, DU Xiao-Yong. Identification of Misleading Product Description in E-Commerce Website. Journal of Software, 2014, 25(S2): 127-135