Title | LenC: A redundancy-aware length control framework for extractive summarization |
Authors | Li, Shuxin; Su, Weifeng; Liu, Jiming |
Publication date | 2021-08-20 |
Conference name | 2021 4th International Conference on Pattern Recognition and Artificial Intelligence (PRAI) |
Proceedings title | 2021 4th International Conference on Pattern Recognition and Artificial Intelligence (PRAI 2021) |
ISBN | 9781665413220 |
Pages | 1-7 |
Conference dates | August 20-22, 2021 |
Conference location | Yibin, China |
Publisher | The Institute of Electrical and Electronics Engineers, Inc. |
Abstract | While extractive summarization is an important approach of the NLP text summarization task, redundancy in the generated extractive summary is always a problem. Previous works usually set the length of the output summary to a fixed number, which might only be appropriate for some of the documents while too long for others. At the same time, though extractive summarization possesses high readability as it directly selects sentences from the document, the unimportant parts within sentences are also selected. These two scenarios result in redundancy in the extractive summaries. To solve this problem, we propose a length control framework for extractive summarization, named LenC, in a two-stage pipeline. We first use a pretrained BERT-based summarizer to select smaller units (i.e. EDUs) than original sentences to abandon the insignificant parts of a sentence. Then a portable length controller is implemented to prune the output summary to an appropriate length, and it can be attached to any extractive summarizer. Experiments show that the proposed model outperforms the state-of-the-art baseline models and successfully reduces the redundancy in the extractive summaries. |
Keywords | Redundant information; Single document summarization |
DOI | 10.1109/PRAI53619.2021.9550801 |
Language | English |
Scopus accession number | 2-s2.0-85117897473 |
Document type | Conference paper |
Item identifier | https://repository.uic.edu.cn/handle/39GCC9TT/6838 |
Collection | Faculty of Science and Technology |
Author affiliations | 1. BNU-HKBU United International College, Computer Science and Technology, Guangdong, China; 2. Hong Kong Baptist University, Computer Science and Technology, Hong Kong, China |
First author affiliation | Beijing Normal-Hong Kong Baptist University |
Recommended citation (GB/T 7714) | Li, Shuxin, Su, Weifeng, Liu, Jiming. LenC: A redundancy-aware length control framework for extractive summarization[C]: The Institute of Electrical and Electronics Engineers, Inc., 2021: 1-7. |
Files in this item | There are no files associated with this item. |
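
The two-stage pipeline described in the abstract (select sub-sentence units, then prune the selection to an appropriate length) can be illustrated with a small sketch. This is not the authors' implementation: the record does not say how the summarizer scores units or how the length controller decides where to stop. Everything below is an assumption made for illustration. `split_units` stands in for EDU segmentation with a punctuation split, `score_units` stands in for the BERT-based summarizer with a word-frequency score, and `control_length` is a hypothetical controller that stops at a relative score threshold (`min_ratio`) and skips units that overlap too much with units already kept (`max_overlap`).

```python
import re
from collections import Counter


def tokens(text):
    """Lowercased alphanumeric tokens of a text span."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_units(document):
    """Assumed stand-in for EDU segmentation: split on clause punctuation."""
    parts = re.split(r"[.;,!?]+", document)
    return [p.strip() for p in parts if p.strip()]


def score_units(units):
    """Assumed stand-in for the summarizer: mean document frequency of a unit's words."""
    counts = Counter(t for u in units for t in tokens(u))
    scores = []
    for u in units:
        toks = tokens(u)
        scores.append(sum(counts[t] for t in toks) / len(toks) if toks else 0.0)
    return scores


def overlap(a, b):
    """Jaccard overlap between the token sets of two units."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def control_length(units, scores, min_ratio=0.6, max_overlap=0.5):
    """Hypothetical length controller: the summary length is not fixed in advance.

    Units are visited from highest to lowest score. Selection stops once a
    unit scores below min_ratio times the top score, and a unit is skipped
    when it overlaps too much with a unit that was already kept.
    """
    ranked = sorted(range(len(units)), key=lambda i: scores[i], reverse=True)
    if not ranked:
        return []
    top = scores[ranked[0]]
    kept = []
    for i in ranked:
        if scores[i] < min_ratio * top:
            break  # remaining units are all weaker, so stop here
        if any(overlap(units[i], units[j]) > max_overlap for j in kept):
            continue  # redundant with something already selected
        kept.append(i)
    # Restore document order for readability.
    return [units[i] for i in sorted(kept)]


def summarize(document):
    """Run both stages on a raw document string."""
    units = split_units(document)
    return control_length(units, score_units(units))
```

Because `control_length` only needs a list of units and a parallel list of scores, it can be placed after any scorer, which mirrors the abstract's point that the controller is portable across extractive summarizers.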