Journal of Software, 2020, 31(2): 321-343

Macro Discourse Structure Representation Schema and Corpus Construction
CHU Xiao-Min, XI Xue-Feng, JIANG Feng, XU Sheng, ZHU Qiao-Ming, ZHOU Guo-Dong
(Natural Language Processing Laboratory, Soochow University, Suzhou 215006, China; School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China)
Received: January 9, 2018    Revised: April 19, 2019
Abstract: Discourse structure analysis is an important research direction in natural language processing. It helps in understanding the structure and semantics of a discourse and provides strong support for natural language processing applications such as automatic summarization, information extraction, and question answering. At present, discourse structure analysis concentrates mainly on the micro level, focusing on the relations and structures within sentences or between sentences, while research at the macro level is comparatively scarce. This study therefore takes discourse structure as its research object and focuses on the representation schema and corpus resource construction for macro discourse structure. It discusses the importance of discourse structure analysis, reviews the state of research from three aspects (theoretical systems, corpus resources, and computational models), and proposes a unified macro-micro discourse structure representation framework that uses the primary-secondary relation as its medium. It then constructs the logical semantic structure and the functional pragmatic structure of macro discourse, respectively. On this basis, a macro Chinese discourse structure corpus of 720 newswire articles is annotated, and the annotation results are analyzed for consistency and annotation statistics.
Article ID:     CLC number: TP18    Document code:
Foundation items: National Natural Science Foundation of China (61773276, 61673290, 61836007)
Reference text:

CHU Xiao-Min, XI Xue-Feng, JIANG Feng, XU Sheng, ZHU Qiao-Ming, ZHOU Guo-Dong. Macro Discourse Structure Representation Schema and Corpus Construction. Journal of Software, 2020, 31(2): 321-343