Co-reference resolution identifies spans of text that refer to the same entity and links them together. Some spans have zero length: in a pro-drop language such as Chinese, an overt pronoun or noun can be omitted, and the resulting zero pronoun still refers to an entity.
Input:
我的姐姐给我她的狗。很喜欢。(My older sister gave me her dog. (I) like (it) very much.)
Output:
[我]0的[姐姐]1给[我]0[她]1的[狗]2。[]0很喜欢[]2。
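In the output, each bracketed span is a mention and the number after it is its entity cluster: 我 (I) is cluster 0, 姐姐/她 (sister/she) cluster 1, 狗 (dog) cluster 2. The empty brackets []0 and []2 are zero-length mentions standing in for the dropped subject and object of the second sentence. Below is a minimal sketch of one way to represent such an annotation in code; the `Mention` dataclass and the character offsets are illustrative, not a standard format:

```python
from dataclasses import dataclass

@dataclass
class Mention:
    start: int    # character offset into the raw text
    end: int      # end offset; start == end marks a zero-length mention
    cluster: int  # entity cluster ID

text = "我的姐姐给我她的狗。很喜欢。"

# Mentions from the annotated example above; offsets index into `text`.
mentions = [
    Mention(0, 1, 0),    # 我  (the speaker)
    Mention(2, 4, 1),    # 姐姐 (the sister)
    Mention(5, 6, 0),    # 我
    Mention(6, 7, 1),    # 她
    Mention(8, 9, 2),    # 狗  (the dog)
    Mention(10, 10, 0),  # zero pronoun: dropped subject "(I)"
    Mention(13, 13, 2),  # zero pronoun: dropped object "(it)"
]

# Group mentions into entity clusters, the structure evaluation metrics score.
clusters: dict[int, list[Mention]] = {}
for m in mentions:
    clusters.setdefault(m.cluster, []).append(m)

for cid, ms in sorted(clusters.items()):
    print(cid, [(text[m.start:m.end] or "∅") for m in ms])
# 0 ['我', '我', '∅']
# 1 ['姐姐', '她']
# 2 ['狗', '∅']
```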
The standard score is the average of the F1-scores returned by the first three of these precision/recall metrics; the last two are alternative metrics (a minimal sketch of B-cubed follows the list):
- MUC.
- B-cubed.
- Entity-based CEAF.
- BLANC.
- Link-Based Entity-Aware metric (LEA).
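As a concrete illustration, here is a minimal sketch of B-cubed, assuming gold (key) and system (response) clusters are given as sets of mention IDs; the other metrics differ mainly in how they credit partial cluster overlap:

```python
def b_cubed(key: list[set], response: list[set]) -> tuple[float, float, float]:
    """B-cubed precision/recall/F1 over mention clusters.

    For each mention, precision (recall) is the fraction of mentions in its
    response (key) cluster that also share its key (response) cluster.
    """
    def score(clusters, other):
        total, n = 0.0, 0
        for c in clusters:
            for m in c:
                # Cluster containing m on the other side (empty if missing).
                o = next((oc for oc in other if m in oc), set())
                total += len(c & o) / len(c)
                n += 1
        return total / n if n else 0.0

    p = score(response, key)
    r = score(key, response)
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

# Example: the system wrongly merges m3 into the m1/m2 entity.
key = [{"m1", "m2"}, {"m3"}]
response = [{"m1", "m2", "m3"}]
print(b_cubed(key, response))  # precision ≈ 0.556, recall = 1.0, F1 ≈ 0.714
```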
CoNLL 2012 introduced a co-reference task in Chinese.
Data for this evaluation is part of OntoNotes, distributed by the Linguistic Data Consortium (LDC).
Test set | # of co-referring mentions | Genre |
---|---|---|
CoNLL 2012 co-reference | 144k (including 15k zero-length “dropped” subjects) | Newswire, broadcast news, broadcast conversation |
Average F1 of MUC, B-cubed, and CEAF
Scoring code: https://github.com/conll/reference-coreference-scorers
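Assuming the scorer repository has been cloned locally, one hedged way to drive it from Python and combine the three F1-scores is sketched below; the metric names passed to `scorer.pl` and the file paths are assumptions, so consult the repository README for the exact invocation and output format:

```python
import subprocess

# key.conll holds gold annotations, response.conll the system output,
# both in CoNLL-2012 format; the paths are placeholders.
for metric in ("muc", "bcub", "ceafe"):
    subprocess.run(
        ["perl", "scorer.pl", metric, "key.conll", "response.conll"],
        check=True,
    )

# The CoNLL score is the unweighted mean of the three F1-scores printed
# by the runs above; the values here are placeholders.
muc_f1, bcub_f1, ceafe_f1 = 0.62, 0.58, 0.55
conll_f1 = (muc_f1 + bcub_f1 + ceafe_f1) / 3
```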
System | Average F1 of MUC, B-cubed, CEAF |
---|---|
Clark & Manning (2016b) | 63.88 |
Kong & Jian (2019) | 63.85 |
Clark & Manning (2016a) | 63.66 |
A second evaluation targets anaphoric zero pronoun (AZP) resolution. Its data is likewise part of OntoNotes, distributed by the Linguistic Data Consortium (LDC).
The metric is an F1 score computed on resolution hits (Zhao & Ng, 2007).
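A minimal sketch of this metric under the usual reading of Zhao & Ng (2007), where a hit is an AZP that the system links to a correct antecedent; the counts in the usage example are made-up illustrations:

```python
def azp_resolution_f1(hits: int, system_azps: int, gold_azps: int) -> float:
    """Resolution F1 for anaphoric zero pronouns (AZPs).

    hits:        AZPs the system resolved to a correct antecedent
    system_azps: AZPs the system attempted to resolve
    gold_azps:   AZPs in the gold annotation
    """
    precision = hits / system_azps if system_azps else 0.0
    recall = hits / gold_azps if gold_azps else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative numbers: 1,200 hits out of 1,600 attempts, against the
# 1,713 gold AZPs in the dev split reported below.
print(round(azp_resolution_f1(1200, 1600, 1713), 3))  # 0.724
```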
System | Overall F1 (w/ gold syntactic info) | Overall F1 (w/o gold syntactic info) |
---|---|---|
Aloraini & Poesio (2020) | 63.5 | - |
Song et al. (2020) | 58.5 | 26.1 |
Yang et al. (2019) | 58.1 | - |
Yin et al. (2018) | 57.3 | - |
Liu et al. (2017) | 55.3 | - |
Yin et al. (2017) | 54.9 | 22.7 |
Training and testing are performed on the train and dev splits of OntoNotes 5.0, respectively (statistics as reported by Yin et al. (2018)):
Split | Documents | Sentences | Words | Anaphoric Zero Pronouns |
---|---|---|---|---|
Train | 1,391 | 36,487 | 756K | 12,111 |
Dev | 172 | 6,083 | 110K | 1,713 |
Suggestions? Changes? Please send email to [email protected]