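Visualization Guided Document Reading by Citation and Text Summarization

ZHANG Jia-Wan; YANG Si-Qi; LI Ze-Yu; YANG Wei-Qiang; WANG Jin-Dong; HE Rui-Fang; HUANG Mao-Lin

Cite this article: ZHANG Jia-Wan, YANG Si-Qi, LI Ze-Yu, YANG Wei-Qiang, WANG Jin-Dong, HE Rui-Fang, HUANG Mao-Lin. Visualization guided document reading by citation and text summarization. Ruan Jian Xue Bao/Journal of Software, 2016, 27(5): 1163-1173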

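ZHANG Jia-Wan¹, YANG Si-Qi¹, LI Ze-Yu¹, YANG Wei-Qiang¹, WANG Jin-Dong¹, HE Rui-Fang², HUANG Mao-Lin¹,³
1.School of Computer Software, Tianjin University, Tianjin 300350, China;2.School of Computer Science and Technology, Tianjin University, Tianjin 300350, China;3.Faculty of Engineering and Information Technologies, School of Software, University of Technology Sydney, Australia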

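Abstract:

In recent years, the number of published scientific papers has grown steadily, and the number of papers researchers need to read has grown rapidly with it. How to read a scientific paper quickly and effectively has gradually become an important research topic. Moreover, when reading a scientific paper, understanding its important references helps the reader better understand the paper's content. However, how to quickly find the few most important and most relevant references among many, and how to avoid getting lost in the multidimensional space of documents while reading, remain open problems. To address these problems, this paper proposes a visualization-assisted literature reading system based on text summarization and citation relations. The system uses a reading-objective-based text summarization technique to extract the important sentences of a paper and displays them with a multi-scale visualization; uses an LDA (latent Dirichlet allocation) topic model to extract the core topics of the references; and records the user's reading behavior to indicate the reading context, so that the user does not lose focus. In addition, the paper describes the use of the system in detail in a concrete case scenario and reports a user study conducted to verify the system's usability.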

Key words: document visualization; text summarization; citation network; reading behavior analysis; visual text analysis

DOI: 10.13328/j.cnki.jos.004962

Classification number:

Funding: National Social Science Foundation of China (12&ZD213); National Key Technology R&D Program of China (2013BAK01B05, 2014BAK09B04)

Visualization Guided Document Reading by Citation and Text Summarization

ZHANG Jia-Wan¹, YANG Si-Qi¹, LI Ze-Yu¹, YANG Wei-Qiang¹, WANG Jin-Dong¹, HE Rui-Fang², HUANG Mao-Lin¹,³

1.School of Computer Software, Tianjin University, Tianjin 300350, China;2.School of Computer Science and Technology, Tianjin University, Tianjin 300350, China;3.Faculty of Engineering and Information Technologies, School of Software, University of Technology Sydney, Australia

Abstract:

With the growing volume of publications in recent years, researchers have to read much more literature. How to read a scientific article efficiently has therefore become an important issue. When reading an article, it is often necessary to read its references in order to understand it better. However, differentiating relevant from non-relevant references and staying on topic in a large document collection are still challenging tasks. This paper presents GUDOR (GUided DOcument Reader), a visualization guided reader based on citation and summarization. It (1) extracts the important sentences from a scientific article with an objective-based summarization technique and visualizes the extraction results with a multi-resolution method; (2) identifies the main topics of the references with an LDA (latent Dirichlet allocation) model; and (3) tracks the user's reading behavior to keep him or her focused on the reading objective. In addition, the paper describes the functions and operations of the system in a usage scenario and validates its applicability through a user study.

Key words: document visualization; text summarization; citation network; reading behavior analysis; visual text analysis
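
The abstract mentions extracting important sentences with an objective-based summarization technique but gives no algorithmic detail. As a rough illustration only, the idea of objective-aware extraction might be sketched as below. This is not the paper's method: the frequency-based scoring, the `boost` weight, the stopword list, and all function names (`split_sentences`, `tokenize`, `rank_sentences`) are assumptions introduced here for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "the", "of", "for", "and", "to", "in", "on",
             "we", "our", "is", "was", "are", "during", "with", "by"}


def split_sentences(text):
    """Split text into sentences at whitespace following '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS]


def rank_sentences(text, objective_terms, top_k=2, boost=2.0):
    """Return the top_k sentences, in document order.

    Each sentence is scored by the mean document frequency of its tokens;
    tokens that match the reader's objective terms are weighted by `boost`.
    This is a hypothetical stand-in for an objective-based summarizer.
    """
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    freq = Counter(w for toks in tokens for w in toks)
    objective = {t.lower() for t in objective_terms}

    scored = []
    for idx, toks in enumerate(tokens):
        if not toks:
            continue  # skip sentences with no content words
        total = sum(freq[w] * (boost if w in objective else 1.0) for w in toks)
        scored.append((total / len(toks), idx))

    # Highest score first; ties broken by earlier position in the document.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:top_k]
    return [sentences[i] for _, i in sorted(best, key=lambda pair: pair[1])]


if __name__ == "__main__":
    sample = ("We propose a new method for topic extraction. "
              "The weather was nice during the conference. "
              "Our method improves topic extraction accuracy on two datasets.")
    for line in rank_sentences(sample, ["accuracy", "datasets"], top_k=2):
        print(line)
```

Changing `objective_terms` shifts which sentences rank highest, which loosely mirrors the notion of a summary that depends on what the reader is looking for. A real system would likely use richer features than raw term frequency.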