wejradford edited this page Feb 25, 2014 · 6 revisions

Evaluate

The main script produces various [evaluation measures](Evaluation measures). It takes the gold-standard annotation and system output in [AIDA/CoNLL format](Data format):

cne evaluate -g GOLD.testb SYSTEM.testb

Extract dataset splits

The distributed gold standard includes three splits: train, testa and testb. To keep only the documents from one split (here, testb), run:

cne filter -k ".*testb.*" GOLD > GOLD.testb
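The effect of the pattern can be pictured with a small Python sketch. It assumes that each document has an ID string containing its split name and that the pattern is matched against that ID from its start. The function `keep_matching`, the dictionary layout and the sample IDs are hypothetical illustrations, not part of cne.

```python
import re


def keep_matching(docs, pattern):
    """Return only the documents whose ID matches `pattern`.

    `docs` maps a document ID to its content (a hypothetical layout).
    """
    regex = re.compile(pattern)
    return {doc_id: body for doc_id, body in docs.items() if regex.match(doc_id)}


# Made-up document IDs, for illustration only.
docs = {"doc1": "...", "doc2testa": "...", "doc3testb": "..."}
print(sorted(keep_matching(docs, ".*testb.*")))
```

With these made-up IDs, only `doc3testb` is kept.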

Mapping to current titles

Wikipedia (and other KBs) change over time, and page titles change with them. A system built on a more recent version of Wikipedia may lose points for returning a newer title even when it refers to the same entity as the gold standard. Luckily, Wikipedia redirects can often be used to map between titles in different versions.

Link titles in SYSTEM and GOLD can be mapped to a common version by passing a MAP file to the filter command with -m. For example, for SYSTEM:

cne filter -m MAP.testb SYSTEM.testb > SYSTEM.testb.mapped

Fetch a map from Wikipedia redirects

The MAP file should contain lines corresponding to titles from the newer version. The first column contains the newer title and any following tab-separated columns contain names that should map to the newer title (e.g., titles of redirect pages that point to the newer title).
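As a rough illustration of this layout, the Python sketch below reads MAP lines into a lookup table and applies it to a title. The names `load_map` and `map_title` and the placeholder titles are hypothetical, and leaving unknown titles unchanged is an assumption, not confirmed cne behaviour.

```python
def load_map(lines):
    """Build a {name: newer_title} lookup from tab-separated MAP lines."""
    lookup = {}
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        newer = columns[0]
        if not newer:
            continue  # skip blank lines
        lookup[newer] = newer  # the newer title maps to itself
        for alias in columns[1:]:
            if alias:
                lookup[alias] = newer  # e.g. a redirect page title
    return lookup


def map_title(title, lookup):
    """Return the newer title if known; otherwise leave the title as-is (assumption)."""
    return lookup.get(title, title)


# Placeholder titles, for illustration only.
lookup = load_map(["New_Title\tOld_Title\tAnother_Redirect\n"])
print(map_title("Old_Title", lookup))
```

Applying the same lookup to both SYSTEM and GOLD titles puts them in a common version before comparison.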

The fetch_map script can be used to generate a current redirect mapping using the Wikipedia API:

cne fetch_map GOLD.testb > MAP.testb

Error analysis

We can describe different types of errors:

  • wrong-link - where we link a mention to the wrong KB node
  • link-as-nil - where we fail to link a mention that should be linked to the KB
  • nil-as-link - where we link a mention that should not be linked (it is NIL in the gold standard)
  • missing - where we fail to detect a gold-standard mention exactly
  • extra - where we detect a mention that does not exactly match any gold-standard mention
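The five categories above can be sketched as a single decision function. This is a minimal illustration under stated assumptions, not cne's implementation: `classify` and the `MISSING` sentinel are hypothetical names, a KB link is represented as a title string, and NIL is represented as `None`.

```python
MISSING = object()  # sentinel: that side has no mention with exactly this span


def classify(gold, system):
    """Return the error type for one mention, or None if there is no error.

    `gold` and `system` are each a KB title (str), None for NIL,
    or MISSING when that side has no exactly matching mention.
    """
    if gold is MISSING and system is MISSING:
        raise ValueError("mention absent from both gold and system")
    if system is MISSING:
        return "missing"  # gold mention not detected exactly
    if gold is MISSING:
        return "extra"  # detected mention not exactly in the gold standard
    if gold == system:
        return None  # same link, or both NIL
    if system is None:
        return "link-as-nil"
    if gold is None:
        return "nil-as-link"
    return "wrong-link"


print(classify("Entity_A", "Entity_B"))
```

Checking the mention-level cases (missing, extra) before comparing links keeps the three linking errors restricted to mentions that both sides detected.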

Running

./cne analyze -g GOLD SYSTEM

gives us output like:

link-as-nil	<doc_id>	m"<mention>"	g"<entity>"	s"None"
wrong-link	<doc_id>	m"<mention>"	g"<entity_a>"	s"<entity_b>"
nil-as-link	<doc_id>	m"<mention>"	g"None"	s"<entity>"
extra	<doc_id>	m"<mention>"	s"None"
missing	<doc_id>	m"<mention>"	s"<entity>"

To count the errors of each type, we can keep only the first column and tally it:

./cne analyze -g GOLD SYSTEM | cut -f1 | sort | uniq -c

This gives us output like:

 652 extra
 114 link-as-nil
1306 missing
 333 nil-as-link
 606 wrong-link
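The same tally can be done in Python, which may be convenient when the counts feed into further analysis. This is a small sketch that assumes the analyze output is tab-separated with the error type in the first column, as in the output shown earlier. The function name `count_error_types` is hypothetical and the sample lines reuse the placeholder format rather than real results.

```python
from collections import Counter


def count_error_types(lines):
    """Count the first tab-separated field of each non-empty line."""
    return Counter(
        line.split("\t", 1)[0].strip() for line in lines if line.strip()
    )


# Placeholder lines in the analyze output format, for illustration only.
sample = [
    'wrong-link\t<doc_id>\tm"<mention>"\tg"<entity_a>"\ts"<entity_b>"',
    'extra\t<doc_id>\tm"<mention>"\ts"None"',
    'wrong-link\t<doc_id>\tm"<mention>"\tg"<entity_a>"\ts"<entity_b>"',
    "",
]
counts = count_error_types(sample)
for label, n in sorted(counts.items()):
    print(f"{n:4d} {label}")
```

In practice the lines would come from the output of the analyze command, for example by reading them from standard input.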