NUBIA: NeUral Based Interchangeability Assessor for Text Generation
https://github.com/wl-research/nubia
- Nubia
- Description: A learned text generation evaluation metric
- Name: kane2020-nubia
- Usage:

      from repro.models.kane2020 import NUBIA

      model = NUBIA()
      # Each input pairs one candidate with a list holding exactly one reference.
      inputs = [
          {"candidate": "The candidate text", "references": ["The reference text"]}
      ]
      macro, micro = model.predict_batch(inputs)

  macro is the Nubia score averaged over the inputs, and micro is the Nubia score per-input.
- The implementation does not support using a GPU
- The metric only supports a single reference, so the length of references must be 1.
- Image name: danieldeutsch/kane2020:1.0
- Build command: repro setup kane2020 [--silent]
- Requires network: No
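
Because the metric accepts only one reference per input, it can help to check inputs before calling predict_batch. The following is a minimal sketch, not part of the repro library: the helper names validate_single_reference and split_references are hypothetical, and splitting a multi-reference input into several single-reference inputs is an assumption about how one might work around the limitation, not something the original implementation does.

```python
from typing import Dict, List, Union

# Shape of one input, matching the usage example: a candidate string
# and a list of reference strings.
InputDict = Dict[str, Union[str, List[str]]]


def validate_single_reference(inputs: List[InputDict]) -> None:
    """Raise ValueError if any input does not have exactly one reference."""
    for index, item in enumerate(inputs):
        references = item.get("references")
        if not isinstance(references, list) or len(references) != 1:
            raise ValueError(
                f"Input {index} must have exactly one reference, got {references!r}"
            )


def split_references(inputs: List[InputDict]) -> List[InputDict]:
    """Expand each input into one single-reference input per reference.

    Hypothetical workaround: the expanded list is longer than the original,
    so any averaged score is taken over candidate-reference pairs.
    """
    expanded: List[InputDict] = []
    for item in inputs:
        for reference in item["references"]:
            expanded.append(
                {"candidate": item["candidate"], "references": [reference]}
            )
    return expanded


inputs = [
    {"candidate": "The candidate text", "references": ["Reference A", "Reference B"]}
]
single = split_references(inputs)
validate_single_reference(single)  # passes: each expanded input has one reference
# `single` could then be passed to NUBIA().predict_batch(single).
```

Note that after splitting, the per-input scores correspond to candidate-reference pairs rather than to the original inputs, so they would need to be regrouped if a per-candidate score is wanted.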
repro setup kane2020
pytest models/kane2020/tests
- Regression unit tests pass
- Correctness unit tests pass
  See here. We replicated the features shown in an example from the original repository. However, there are additional features now and the overall score has changed.
- Model runs on full test dataset
  Not tested
- Predictions approximately replicate results reported in the paper
  Not tested
- Predictions exactly replicate results reported in the paper
  Not tested