Performance comparison of abstractive text summarization models on short and long text instances

Authors: Lateef, R. and Wani, M.A.

Journal: Proceedings of the 2021 8th International Conference on Computing for Sustainable Global Development, INDIACom 2021

Pages: 125-130

ISBN: 9789380544434

DOI: 10.1109/INDIACom51348.2021.00023

Abstract:

One of the desirable features of a text summarization system is its ability to generalize to unseen text. In this paper, models representing three main categories, viz., Pointer-Generator networks, Transformer, and Bidirectional Encoder Representations from Transformers (BERT), are analyzed for their performance on unseen text. The models are trained and tested on the benchmark CNN/Daily Mail dataset containing 287,226 training, 13,368 validation, and 11,490 test instances. To analyze their generalization capabilities, the models trained on the CNN/Daily Mail dataset are tested on two different datasets, GigaWord and DUC 2004, containing 1951 and 500 test instances respectively. The short and long text instances in the CNN/Daily Mail test set are partitioned into groups to compare the performance of the models on short and long text instances. The performance of the models is evaluated using the ROUGE metric.
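The ROUGE metric mentioned above scores a generated summary by its n-gram overlap with a reference summary. As a minimal illustration (not the paper's evaluation code, which likely uses a standard ROUGE implementation), the following sketch computes ROUGE-1 precision, recall, and F1 from clipped unigram counts:

```python
from collections import Counter

def rouge_n(candidate: str, reference: str, n: int = 1):
    """Compute ROUGE-N precision, recall, and F1 between two texts.

    N-gram overlap is counted with clipping (each n-gram contributes
    at most min(count in candidate, count in reference)), as in
    standard ROUGE-N.
    """
    def ngrams(text: str, n: int) -> Counter:
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # clipped overlap count
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: five of the six unigrams overlap, so P = R = F1 = 5/6
p, r, f = rouge_n("the cat sat on the mat", "the cat lay on the mat")
```

In practice, published ROUGE scores also apply stemming and report ROUGE-2 and ROUGE-L; this sketch covers only the unigram variant to show the underlying computation.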

Source: Scopus