# Writing a Custom Prediction Reader
As an alternative to converting your predictions into one of the formats supported by ELEVANT, you can write your own prediction reader, such that you can use your prediction files with the `link_benchmark.py` script directly.

This requires three steps. **Note:** Make sure you perform the following steps outside of the Docker container, otherwise your changes will be lost when exiting the container.
1. Implement a prediction reader in `src/elevant/prediction_readers/` that inherits from `src.elevant.prediction_readers.abstract_prediction_reader.AbstractPredictionReader`. You must implement either the `predictions_iterator()` method or the `get_predictions_with_text_from_file()` method.

   Implement `predictions_iterator()` if you are sure that the order in which the predictions are read corresponds to the article order in the benchmark. Set `predictions_iterator_implemented = True` when calling `super().__init__()`. See here for an example.

   Implement `get_predictions_with_text_from_file()` if you are not sure that the order in which the predictions are read corresponds to the article order in the benchmark, and the prediction file contains the original article texts. Set `predictions_iterator_implemented = False` when calling `super().__init__()`. See here for an example.
2. Add your custom prediction reader name to the `src.elevant.linkers.linkers.PredictionFormats` enum, e.g. `MY_FORMAT = "my_format"`.
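As a self-contained illustration of this step (the real `PredictionFormats` enum lives in `src/elevant/linkers/linkers.py`; the existing member shown here is an assumption), adding the new member might look like:

```python
from enum import Enum

# Stand-in sketch of ELEVANT's PredictionFormats enum.
class PredictionFormats(Enum):
    NIF = "nif"              # assumed existing format, for illustration only
    MY_FORMAT = "my_format"  # your new custom format
```

The string value ("my_format") is what you will later pass to `link_benchmark.py` via the `-pformat` option.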
3. In `src.elevant.linkers.linking_system.LinkingSystem._initialize_linker` add an `elif` case in which you load necessary mappings (if any) and initialize the `LinkingSystem`'s `prediction_reader`. This could look something like this:

   ```python
   elif linker_type == Linkers.MY_FORMAT.value:
       self.load_missing_mappings({MappingName.WIKIPEDIA_WIKIDATA, MappingName.REDIRECTS})
       self.prediction_reader = MyCustomPredictionReader(prediction_file, self.entity_db)
   ```

   where `prediction_file` is the path to the prediction file. The `load_missing_mappings()` line is necessary if you predict Wikipedia entities and therefore have to convert them to Wikidata entities. The mappings are loaded into `self.entity_db`. You can then get a Wikidata QID from a Wikipedia title by calling

   ```python
   entity_id = KnowledgeBaseMapper.get_wikidata_qid(entity_reference, self.entity_db)
   ```
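To make the title-to-QID lookup concrete, here is a hedged, self-contained sketch. The `KnowledgeBaseMapper` class and the entity database below are tiny stand-ins for the real ELEVANT components (in ELEVANT, the entity database is filled by `load_missing_mappings()`), so treat the lookup behavior shown here as an assumption, not the mapper's actual implementation.

```python
# STAND-IN for ELEVANT's KnowledgeBaseMapper; the real mapper also handles
# redirects and other reference types, which this stub does not.
class KnowledgeBaseMapper:
    @staticmethod
    def get_wikidata_qid(entity_reference, entity_db):
        # Direct Wikipedia-title -> Wikidata-QID lookup; returns None if
        # the title is unknown.
        return entity_db.get(entity_reference)


# Toy entity database mapping Wikipedia titles to Wikidata QIDs.
entity_db = {"Douglas Adams": "Q42"}

entity_id = KnowledgeBaseMapper.get_wikidata_qid("Douglas Adams", entity_db)
```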
You can then convert your linking results into ELEVANT's internally used format by running

```
python3 link_benchmark.py <experiment_name> -pfile <path_to_linking_results> -pformat my_format -pname <linker_name> -b <benchmark_name>
```