Journal of Software, 2017, 28(12): 3183-3205

中文微博情感分析研究与实现
李勇敢,周学广,孙艳,张焕国
(武汉大学 计算机学院, 湖北 武汉 430079;海军工程大学 信息安全系, 湖北 武汉 430033;中国人民解放军 92941 部队, 辽宁 葫芦岛 125000)
Research and Implementation of Chinese Microblog Sentiment Classification
LI Yong-Gan, ZHOU Xue-Guang, SUN Yan, ZHANG Huan-Guo
(School of Computer Science, Wuhan University, Wuhan 430079, China; Department of Information Security, Navy University of Engineering, Wuhan 430033, China; Unit 92941, PLA, Huludao 125000, China)
Received: May 19, 2016    Revised: January 24, 2017
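> Chinese abstract: The big-data scale, exponential propagation, and cross-media nature of Chinese microblogs make it unrealistic to monitor and process them manually, so automatic, computer-based sentiment analysis of Chinese microblogs is urgently needed. This research is divided into three tasks: opinion sentence identification, sentiment polarity classification, and sentiment element extraction. To accomplish these tasks, an evaluation system was developed. A new word recognition method based on rules and statistics is given, built on multi-level lexicons, word-formation rules, and string-frequency statistics; on the basis of dependency patterns between sentiment words and evaluation objects, an opinion sentence identification algorithm based on word features is given. Using an LDA-Collocation model that represents text as a word-order stream, an algorithm is derived with Gibbs sampling to classify the sentiment polarity of Chinese microblogs automatically. To address the low recall of sentiment element extraction from Chinese microblogs, dependency parsing theory is used to divide dependency patterns into two classes, subject-type and object-type, and dependency patterns between evaluation objects and sentiment words are established at six priority levels; sentiment elements are then extracted automatically through an evaluation object merging algorithm. The experiments have two parts. First, the system took part in the NLP&CC2012 public evaluation, where the proposed method ranked second in precision on the microblog opinion sentence identification task and second in both precision and F-measure on the Chinese microblog sentiment element extraction task, which verifies the practicality of the algorithm. Second, based on an analysis of the public evaluation results, the efficiency of the various algorithms that took part in the evaluation is compared for Chinese microblog sentiment analysis, and the resulting conclusions are given.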
Abstract: This paper studies sentiment analysis of Weibo posts. The study focuses on three tasks: emotion sentence identification and classification, emotion tendency classification, and emotion expression extraction. An unsupervised topic sentiment model, UTSM, is proposed based on the LDA-Collocation model to facilitate automatic hashtag labeling. A Gibbs sampling implementation is presented, from which an algorithm for automatically classifying emotion tendency is derived. To address the low recall of emotion expression extraction in Weibo, dependency parsing is used to divide the dependency patterns into two categories, subject-type and object-type. Six prioritized dependency patterns between evaluation objects and emotion words are constructed, and a merging algorithm is proposed to extract emotion expressions accurately. Experimental results indicate that the presented method is innovative and of practical value.
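The Chinese abstract names string-frequency statistics as one ingredient of the new word recognition step. As a rough illustration only, the sketch below counts character n-grams across posts and keeps the frequent ones that are absent from a known lexicon. The function name `candidate_new_words`, its parameters, and the sample posts are assumptions made for this example; the paper's actual method also relies on multi-level lexicons and word-formation rules, which are not modeled here.

```python
from collections import Counter


def candidate_new_words(texts, lexicon, min_len=2, max_len=4, min_freq=2):
    """Return frequent character n-grams that are not in the known lexicon.

    Hypothetical helper: a plain string-frequency filter, not the paper's
    full rule-and-statistics method.
    """
    counts = Counter()
    for text in texts:
        for n in range(min_len, max_len + 1):
            # Slide a window of n characters over the post.
            for i in range(len(text) - n + 1):
                counts[text[i:i + n]] += 1
    # Keep strings that recur often enough and are not already known words.
    return {s: c for s, c in counts.items()
            if c >= min_freq and s not in lexicon}


# Illustrative posts and lexicon (made up for this sketch).
posts = ["给力的手机", "这个手机真给力", "不给力"]
known = {"手机", "这个"}
cands = candidate_new_words(posts, known, max_len=2)
print(cands)
```

With these sample posts, only bigrams are counted; "手机" recurs but is filtered out because it is already in the lexicon, so the recurring unknown string "给力" is the sole candidate. A real system would still need rules to reject frequent non-words.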
Foundation items:National Program on Key Basic Research Project (973) (2014CB340600); National Natural Science Foundation of China (61332019, 61672531); National Social Science Foundation of China (14GJ003-152)
Reference text:

李勇敢,周学广,孙艳,张焕国.中文微博情感分析研究与实现.软件学报,2017,28(12):3183-3205

LI Yong-Gan, ZHOU Xue-Guang, SUN Yan, ZHANG Huan-Guo. Research and Implementation of Chinese Microblog Sentiment Classification. Journal of Software, 2017, 28(12): 3183-3205