Journal of Software, 2018, 29(7): 2046-2070

Features Oriented Survey of State-of-the-Art Keyphrase Extraction Algorithms
CHANG Yao-Cheng, ZHANG Yu-Xiang, WANG Hong, WAN Huai-Yu, XIAO Chun-Jing
(School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China; School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
Received:July 19, 2017    Revised:November 02, 2017
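> Chinese abstract (translated): Automatic keyphrase extraction from text has long been a fundamental problem and an active research topic in natural language processing. In particular, the growing demand for applications built on text data has drawn further attention from researchers to keyphrase extraction techniques. Although these techniques have advanced considerably in recent years, extraction results remain far from satisfactory. To help address the keyphrase extraction problem, this paper systematically summarizes recent results by researchers in China and abroad, covering three main steps: candidate keyphrase generation, feature engineering, and keyphrase extraction. It also discusses possible directions for future research. Unlike surveys organized around extraction methods, this survey organizes existing work mainly around the feature information that each method uses. Examining existing work from this feature-driven perspective helps in combining existing features or proposing new ones, and in turn in proposing more effective keyphrase extraction methods.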
Abstract: Keyphrases, which concisely represent the main topics discussed in a document, are widely used in document processing tasks, and automatic keyphrase extraction is one of the fundamental problems and active research topics in natural language processing (NLP). Although automatic keyphrase extraction has received a great deal of attention and extraction techniques have developed quickly, state-of-the-art performance on this task is still far from satisfactory. To help address the keyphrase extraction problem, this paper surveys recent developments in the field, covering candidate keyphrase generation, feature engineering, and keyphrase extraction models. In addition, it lists some published datasets, analyzes the evaluation approaches, and discusses the challenges and trends of automatic keyphrase extraction techniques. Unlike existing surveys, which mainly focus on extraction models, this paper provides a feature-oriented survey of automatic keyphrase extraction. This perspective may help researchers make better use of existing features and propose new, more effective extraction approaches.
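The three stages named in the abstract (candidate keyphrase generation, feature engineering, and keyphrase extraction) can be illustrated with a minimal Python sketch. It is not a method taken from the survey: the stopword list, the three features (term frequency, relative first position, phrase length), the hand-written scoring formula, and all function names are assumptions chosen only to make the pipeline concrete.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the survey).
STOPWORDS = {"a", "an", "and", "are", "for", "from", "in", "is",
             "of", "on", "the", "to", "with"}


def generate_candidates(text, max_len=3):
    """Stage 1: collect word n-grams (up to max_len words) that do not cross
    punctuation and that neither start nor end with a stopword.
    Returns a list of (phrase, word_offset) pairs and the total word count."""
    candidates = []
    offset = 0
    for chunk in re.split(r"[.,;:!?()]", text.lower()):
        words = re.findall(r"[a-z][a-z\-]*", chunk)
        for i in range(len(words)):
            for n in range(1, max_len + 1):
                gram = words[i:i + n]
                if len(gram) < n:
                    break  # ran past the end of this chunk
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                candidates.append((" ".join(gram), offset + i))
        offset += len(words)
    return candidates, offset


def compute_features(candidates, total_words):
    """Stage 2: compute three simple, hypothetical features per candidate."""
    freq = Counter(phrase for phrase, _ in candidates)
    first = {}
    for phrase, off in candidates:
        first.setdefault(phrase, off)  # keep the earliest occurrence
    return {
        phrase: {
            "tf": freq[phrase],
            "first_pos": first[phrase] / max(total_words, 1),
            "length": len(phrase.split()),
        }
        for phrase in freq
    }


def score(features):
    """Hand-tuned score (an assumption): frequent, long, early phrases rank high."""
    return features["tf"] * features["length"] * (1.0 - features["first_pos"])


def extract_keyphrases(text, top_k=5):
    """Stage 3: rank candidates by score; ties are broken alphabetically."""
    candidates, total_words = generate_candidates(text)
    feats = compute_features(candidates, total_words)
    ranked = sorted(feats, key=lambda p: (-score(feats[p]), p))
    return ranked[:top_k]


if __name__ == "__main__":
    sample = ("Keyphrase extraction is a basic task. "
              "Keyphrase extraction uses features. Features matter.")
    print(extract_keyphrases(sample))
```

Keeping feature computation separate from scoring mirrors the feature-oriented view of the survey: the scoring function can be swapped for a supervised classifier or a graph-based ranker without changing how the features are produced.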
Foundation items:National Natural Science Foundation of China (U1533104, U1633110, 61603028); Fundamental Research Funds for the Central Universities (ZXH2012P009)
Reference text:
CHANG Yao-Cheng, ZHANG Yu-Xiang, WANG Hong, WAN Huai-Yu, XIAO Chun-Jing. Features Oriented Survey of State-of-the-Art Keyphrase Extraction Algorithms. Journal of Software, 2018, 29(7): 2046-2070