
Liu & Lapata (2019)

Publication

Text Summarization with Pretrained Encoders

Relevant Repositories

https://github.com/nlpyang/PreSumm

Available Models

The original GitHub repository provides 4 pretrained models:

  • CNN/DM TransformerAbs

    • Description: Their baseline abstractive model trained on the CNN/DailyMail dataset
    • Name: liu2019-transformerabs
    • Usage:
      from repro.models.liu2019 import TransformerAbs
      model = TransformerAbs()
      summary = model.predict("document")
  • CNN/DM BertSumExt

    • Description: A BERT-based extractive model trained on the CNN/DailyMail dataset
    • Name: liu2019-bertsumext
    • Usage:
      from repro.models.liu2019 import BertSumExt
      model = BertSumExt()
      summary = model.predict("document")
  • CNN/DM BertSumExtAbs

    • Description: A BERT-based abstractive model trained on the CNN/DailyMail dataset
    • Name: liu2019-bertsumextabs
    • Usage:
      from repro.models.liu2019 import BertSumExtAbs
      model = BertSumExtAbs()  # or BertSumExtAbs("bertsumextabs_cnndm.pt")
      summary = model.predict("document")
  • XSum BertSumExtAbs

    • Description: A BERT-based abstractive model trained on the XSum dataset
    • Name: liu2019-bertsumextabs
    • Usage:
      from repro.models.liu2019 import BertSumExtAbs
      model = BertSumExtAbs("bertsumextabs_xsum.pt")
      summary = model.predict("document")
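The four models above can be summarized as a name → (class, checkpoint) map. This is only a convenience sketch built from the list above: the `-cnndm`/`-xsum` key suffixes are hypothetical disambiguators (the list gives the same name, liu2019-bertsumextabs, for both BertSumExtAbs variants), and None stands for the class's default checkpoint.

```python
# Convenience map of the models listed above: name -> (class, checkpoint).
# The "-cnndm"/"-xsum" key suffixes are hypothetical disambiguators added
# here, since both BertSumExtAbs variants are listed under the same name;
# None means the class's default checkpoint is used.
PRETRAINED_MODELS = {
    "liu2019-transformerabs": ("TransformerAbs", None),
    "liu2019-bertsumext": ("BertSumExt", None),
    "liu2019-bertsumextabs-cnndm": ("BertSumExtAbs", "bertsumextabs_cnndm.pt"),
    "liu2019-bertsumextabs-xsum": ("BertSumExtAbs", "bertsumextabs_xsum.pt"),
}
```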

Implementation Notes

  • The pretrained models expect input that has already been preprocessed. We therefore replicate the authors' preprocessing steps as closely as possible: all input documents are tokenized and sentence-split with the Stanford CoreNLP library inside the Docker container.

  • If you pass in a document that has already been sentence-split, the current implementation does not respect those sentence boundaries; the document is re-tokenized and re-split from scratch.
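Since sentence boundaries are re-derived internally anyway, a caller holding a pre-split document can simply rejoin it into the single string that predict() expects. A minimal sketch (join_sentences is a hypothetical helper, not part of repro):

```python
# Hypothetical helper: the model re-runs sentence splitting internally,
# so a pre-split document can simply be rejoined before prediction.
def join_sentences(sentences):
    """Collapse a pre-split document into the single string predict() expects."""
    return " ".join(s.strip() for s in sentences)

sentences = [
    "The first sentence of the document.",
    "The second sentence.",
]
document = join_sentences(sentences)

# The rejoined string can then be passed to any of the models above, e.g.:
# from repro.models.liu2019 import BertSumExt
# model = BertSumExt()
# summary = model.predict(document)
```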

Dockerfile Information

  • Image name: liu2019
  • Build command:
    repro setup liu2019 \
        [--transformerabs-cnndm] \
        [--bertsumext-cnndm] \
        [--bertsumextabs-cnndm] \
        [--bertsumextabs-xsum] \
        [--silent]
    
    Each of the flags indicates whether the corresponding model should be downloaded.
  • Requires network: No

Testing

repro setup liu2019 \
    --transformerabs-cnndm \
    --bertsumext-cnndm \
    --bertsumextabs-cnndm \
    --bertsumextabs-xsum

pytest -s models/liu2019/tests

Status

  • Regression unit tests pass
    See the latest successful tests on GitHub here

  • Correctness unit tests pass
    The authors provide their model outputs and instructions for processing the data from scratch. We did not attempt to perfectly reproduce their summaries.

  • Model runs on full test dataset
    See here

  • Predictions approximately replicate results reported in the paper
    The results for the abstractive models approximately replicate those reported in the paper, but the extractive model's do not. See this experiment for details. The ROUGE scores are calculated against reference summaries that were preprocessed in the same way as the input documents, not against the original references.

    TransformerAbs on CNN/DailyMail

                 R1     R2     RL
    Reported     40.21  17.76  37.09
    Ours         40.38  17.81  37.10

    BertSumExt on CNN/DailyMail

                 R1     R2     RL
    Reported     43.23  20.24  39.63
    Ours         41.93  18.98  38.07

    BertSumExtAbs on CNN/DailyMail

                 R1     R2     RL
    Reported     42.13  19.60  39.18
    Ours         42.08  19.43  38.95

    BertSumExtAbs on XSum

                 R1     R2     RL
    Reported     38.81  16.50  31.27
    Ours         38.88  16.41  31.31

    The abstractive models seem to be faithful reproductions of the original results, whereas the extractive model is not. It is not clear why.

  • Predictions exactly replicate results reported in the paper
    See above
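To quantify the extractive discrepancy noted above, the gap between the reported and reproduced BertSumExt scores can be computed directly from the tables (the values below are copied from them):

```python
# ROUGE gap for BertSumExt on CNN/DailyMail, using the values from the
# tables above (reported in the paper vs. this reproduction).
reported = {"R1": 43.23, "R2": 20.24, "RL": 39.63}
ours = {"R1": 41.93, "R2": 18.98, "RL": 38.07}
gaps = {metric: round(reported[metric] - ours[metric], 2) for metric in reported}
# gaps == {"R1": 1.3, "R2": 1.26, "RL": 1.56}
```

The reproduction falls short by roughly 1.3 to 1.6 ROUGE points on every metric, a consistent gap rather than noise, whereas the abstractive models are all within about 0.2 points of the reported numbers.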