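National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830100); National Natural Science Foundation of China (61972186, 61732005, 61762056); Yunnan Provincial Major Science and Technology Special Project (202002AD080001); General Program of the Yunnan Provincial Basic Research Special Project (202001AT070047, 202001AT070046); Yunnan Provincial High-Tech Industry Special Project (201606)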
Generating short topic descriptions from the user comments on case-related topics plays an important role in quickly understanding public opinion on a case, and can be regarded as a multi-document summarization task over user comments. However, user comments are noisy, and the information needed for a summary is scattered across different comment sentences, so applying a sequence-to-sequence model directly tends to produce incorrect or irrelevant summaries. To alleviate these problems, this paper proposes a case-related topic summarization method based on a topic interaction graph, which organizes the noisy user comments into a graph. The motivation is that the graph can express the correlations between different user comments, which helps select the important information in them. Specifically, case elements are first extracted from the comment sentences, and a topic interaction graph is then constructed with the case elements as nodes and the sentences containing those case elements as node contents. A graph Transformer network then produces representations of the graph nodes, and finally a standard Transformer-based decoder generates the short topic description. Experimental results on a collected case-related topic summarization dataset show that the proposed method is an effective content selection method and can generate coherent and factually correct topic summaries.
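The graph construction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_topic_interaction_graph`, the substring matching that stands in for case-element extraction, and the rule that two elements are linked when they appear in the same sentence are all assumptions made for illustration, since the abstract does not specify how elements are extracted or how edges are defined.

```python
from collections import defaultdict
from itertools import combinations


def build_topic_interaction_graph(sentences, case_elements):
    """Group comment sentences under the case elements they mention.

    Returns (nodes, edges):
      nodes: dict mapping a case element to the sentences that contain it
      edges: dict mapping a sorted element pair to its co-occurrence count
    """
    nodes = defaultdict(list)
    edges = defaultdict(int)
    for sentence in sentences:
        # Assumption: plain substring matching stands in for a real
        # case-element extractor.
        present = sorted({e for e in case_elements if e in sentence})
        for element in present:
            nodes[element].append(sentence)
        # Assumption: two elements interact when they share a sentence.
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return dict(nodes), dict(edges)


# Hypothetical comments and case elements, invented for this example.
comments = [
    "the suspect was arrested by the police on Monday",
    "the police released a statement about the victim",
    "so sad, thoughts with the family",
]
elements = ["suspect", "police", "victim"]
nodes, edges = build_topic_interaction_graph(comments, elements)
```

In this toy input the third comment mentions no case element, so it is attached to no node and drops out of the graph, which mirrors the noise-filtering role the abstract attributes to the graph. Node encoding with a graph Transformer and decoding are outside the scope of this sketch.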