NER-summarization

Named entity recognition pretraining for abstractive text summarization

Check out the paper: https://arxiv.org/abs/2307.02570

This repository contains code demonstrating the MNELM (Masked Named Entity Language Model) pretraining procedure for summarization models.

MNELM forces the network to concentrate on named entities during its pretraining phase, which helps increase the precision and recall of named entity inclusion in the generated summaries.
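
The idea can be shown with a short sketch: instead of masking random tokens, the spans an NER model tags as entities are masked, and the seq2seq model is trained to reconstruct the original text. The public NER checkpoint and model names below are assumptions standing in for the SCIERC-trained RoBERTa tagger used in this repository, not the repository's exact code.

```python
# Minimal sketch of the MNELM masking idea (illustrative, not the repository's exact code).
# The public NER checkpoint below is an assumption; the repo trains its own RoBERTa
# tagger on SCIERC (see SCIERC_NER.ipynb).
from transformers import BartTokenizer, pipeline

ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")
tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")

def mask_named_entities(text: str) -> str:
    """Replace every detected entity span with the tokenizer's mask token."""
    # Process spans right-to-left so earlier character offsets stay valid.
    for ent in sorted(ner(text), key=lambda e: e["start"], reverse=True):
        text = text[: ent["start"]] + tokenizer.mask_token + text[ent["end"] :]
    return text

source = "Researchers at Google released BERT in 2018."
masked = mask_named_entities(source)
# `masked` becomes the encoder input and `source` the reconstruction target,
# so the model must recover the entities it can no longer see.
print(masked)  # e.g. "Researchers at <mask> released <mask> in 2018."
```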

Features

  • Custom Pretraining Objective: Enhances the model's attention to named entities with the novel MNELM learning task.
  • Named Entity Recognition: Uses RoBERTa for NER.
  • Abstractive Summarization: Utilizes BART for summarization tasks.
  • Improved Precision and Recall: Higher named entity inclusion compared to standard masked language models.

Repository structure

  • SCIERC_NER.ipynb - builds and trains a RoBERTa NER model on the SCIERC dataset
  • bart_pretrain.ipynb - MNELM pretraining of the BART model followed by summarization fine-tuning
  • bart.ipynb - example of working with the CNN summarization dataset (a minimal usage sketch follows this list)
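
As a rough illustration of the summarization step, the sketch below generates a summary with a public BART checkpoint; the model name and generation settings are assumptions and are not taken from bart.ipynb.

```python
# Minimal sketch: abstractive summarization with a public BART checkpoint.
# "facebook/bart-large-cnn" is an assumed stand-in for the repository's fine-tuned model.
from transformers import BartForConditionalGeneration, BartTokenizer

model_name = "facebook/bart-large-cnn"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name)

article = (
    "The European Space Agency launched a new Earth-observation satellite on Tuesday. "
    "Officials said the mission will monitor sea-level rise over the next decade."
)

inputs = tokenizer(article, truncation=True, max_length=1024, return_tensors="pt")
summary_ids = model.generate(**inputs, num_beams=4, max_length=64, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```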
