Kane et al. (2020)

Publication

NUBIA: NeUral Based Interchangeability Assessor for Text Generation

Repositories

https://github.com/wl-research/nubia

Available Models

Nubia
- Description: A learned text generation evaluation metric
- Name: kane2020-nubia
- Usage: Include a small snippet for how to use the model
```
from repro.models.kane2020 import NUBIA
model = NUBIA()
inputs = [
    {"candidate": "The candidate text", "references": ["The reference text"]}
]
macro, micro = model.predict_batch(inputs)
```
  macro is the Nubia score averaged over the inputs, and micro is the Nubia score per-input.

Implementation Notes

The implementation does not support using a GPU
The metric only supports a single reference, so the length of references must be 1.

Docker Information

Image name: danieldeutsch/kane2020:1.0
Build command:
```
repro setup kane2020 [--silent]
```
Requires network: No

Testing

repro setup kane2020
pytest models/kane2020/tests

Status

Regression unit tests pass
Correctness unit tests pass
See here. We replicated the features show in an example from the original repository. However, there are additional features now and the overall score has changed.
Model runs on full test dataset
Not tested
Predictions approximately replicate results reported in the paper
Not tested
Predictions exactly replicate results reported in the paper
Not tested

Changelog