This repository contains code introduced in the following paper:
Neural Mention Detection
Juntao Yu, Bernd Bohnet and Massimo Poesio
In Proceedings of the 12th Language Resources and Evaluation Conference (LREC), 2020
- The code is written in Python 2; compatibility with Python 3 is not guaranteed.
- Before starting, install all the required packages listed in requirements.txt using
pip install -r requirements.txt
- After that run
to compile the TensorFlow custom kernels.
- If you want to use
Lee MD
you need to uncomment the first few lines
to download the GloVe embeddings required by the system.
Pre-trained models can be downloaded from this link. We provide two pre-trained models trained with our best model (
Biaffine MD
):
- One trained on the CoNLL 2012 shared task data, in which singletons and non-referring expressions are not annotated.
- The other trained on the CRAC 2018 shared task data, which has both singletons and non-referring expressions annotated.
Choose the model you want to use and put the
files under the logs/biaffinmd
folder.
- Modify the test_path accordingly:
- The test_path is the path to a .jsonlines file; each line of the .jsonlines file must be in the following format:
{"clusters": [[[0,0],[5,5]],[[2,3],[7,8]]], "doc_key": "nw", "sentences": [["John", "has", "a", "car", "."], ["He", "washed", "the", "car", "yesterday", "."], ["Really", "?", "it", "was", "raining", "yesterday", "!"]]}
- If you only have mentions annotated, but not the coreference clusters, you can simply give every mention its own cluster. The reason we use
instead of mentions
is to allow the same data to also be used by our coreference resolution system.
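The mention-only case described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the repository: the helper names `mentions_to_document` and `to_jsonlines` are hypothetical, and the sketch assumes, as the example line suggests, that spans are inclusive [start, end] token offsets counted over the whole document rather than per sentence.

```python
import json


def mentions_to_document(doc_key, sentences, mentions):
    """Build one .jsonlines record, giving every mention its own cluster.

    Hypothetical helper. `mentions` is a list of [start, end] token offsets,
    assumed inclusive and counted over the whole document.
    """
    num_tokens = sum(len(sentence) for sentence in sentences)
    clusters = []
    for start, end in mentions:
        # Reject spans that are inverted or fall outside the document.
        if not 0 <= start <= end < num_tokens:
            raise ValueError("invalid mention span: [%d, %d]" % (start, end))
        clusters.append([[start, end]])  # one singleton cluster per mention
    return {"clusters": clusters, "doc_key": doc_key, "sentences": sentences}


def to_jsonlines(documents):
    """Serialise documents as .jsonlines text: one JSON object per line."""
    return "".join(json.dumps(doc) + "\n" for doc in documents)


sentences = [
    ["John", "has", "a", "car", "."],
    ["He", "washed", "the", "car", "yesterday", "."],
]
mentions = [[0, 0], [2, 3], [5, 5], [7, 8]]
document = mentions_to_document("nw", sentences, mentions)
jsonlines_text = to_jsonlines([document])
```

Writing `jsonlines_text` to a file and pointing test_path at that file should then give input in the format shown above, with no coreference links asserted between mentions.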
The model has two output modes
which can be configured in experiments.conf.
Then use
python config_name
to start your evaluation.
- If you plan to use
or Biaffine MD
, you need to run
python train.jsonlines dev.jsonlines
to store the ELMo embeddings on disk; this will speed up your training a lot.
- If you plan to use
you will additionally need to create the character vocabulary by using
python train.jsonlines dev.jsonlines
- If you plan to use
Bert MD
we would suggest you train on the CPU instead unless you have a very small data set; see the BERT page for more information about the GPU memory issue.
- Finally, you can start training by using
python config_name
and Biaffine MD
can be trained in just a few (4-6) hours on a GTX 1080Ti GPU. Bert MD,
however, takes about 1 week to finish on 48 CPU cores.