Releases · danieldeutsch/repro
v0.1.6
Added
- Added the `ParallelModel` class as an easy abstraction over the `joblib` library for parallel computation (a hedged usage sketch follows this list)
- Added an `aggregate_parallel_metrics` function to make using metrics in parallel easier
- Added MTEQE
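These notes name `ParallelModel` and `aggregate_parallel_metrics` but not their signatures, so the following is only a minimal sketch of how the pair might fit together; the import paths, constructor arguments, and the `QAEval` metric used here are assumptions, not confirmed by the release notes.

```python
# Hedged sketch only: ParallelModel and aggregate_parallel_metrics exist in
# repro per these notes, but every signature below is an assumption.
from repro.models import ParallelModel, aggregate_parallel_metrics
from repro.models.deutsch2021 import QAEval  # assumed import path

# Assumed: ParallelModel wraps a model class plus per-worker kwargs and fans
# predict_batch() inputs out over joblib workers.
model = ParallelModel(QAEval, model_kwargs=[{"device": 0}, {"device": 1}])

inputs = [
    {"candidate": "The cat sat on the mat.",
     "references": ["A cat was sitting on the mat."]},
    {"candidate": "Dogs bark loudly.",
     "references": ["The dog barked very loudly."]},
]

# Assumed: each worker returns its own metrics, which the helper merges
# into a single aggregate result.
per_worker_results = model.predict_batch(inputs)
metrics = aggregate_parallel_metrics(per_worker_results)
print(metrics)
```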
Changed
- Split `Prism` into the reference-based `Prism` and the reference-free `PrismSrc`. They now support multi-reference and multi-source inputs via averaging over the references/sources (a sketch of the two classes follows this list)
- Relaxed the dependency on `pytest` so it does not require a specific version
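A sketch of the split described above; the class names come from these notes, but the import path and the `predict` signature (including how references and sources are passed) are assumptions.

```python
# Hedged sketch: Prism scores a candidate against references, PrismSrc
# against sources. Import path and argument names are assumptions.
from repro.models.thompson2020 import Prism, PrismSrc

candidate = "The dog barked loudly."

# Reference-based: per these notes, the score is averaged over the references.
prism = Prism()
score = prism.predict(candidate, references=["A dog was barking loudly.",
                                             "The dog barked."])

# Reference-free: per these notes, the score is averaged over the sources.
prism_src = PrismSrc()
score_src = prism_src.predict(candidate, sources=["Der Hund bellte laut."])
```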
v0.1.5
Added
- Added BaryScore, InfoLM, and DepthScore
- Added ability to set the `beam_size` and `nbest` parameters for BART (a hedged sketch follows this list)
- Added GPU support for MoverScore
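The parameter names `beam_size` and `nbest` come from the entry above; where they are accepted (the constructor versus the predict call) and the BART import path are assumptions for illustration.

```python
# Hedged sketch: forwarding the new decoding parameters to BART.
# Import path and call signature are assumptions.
from repro.models.lewis2020 import BART  # assumed module for Lewis et al. (2020)

model = BART()
outputs = model.predict_batch(
    [{"document": "Some long article text to summarize ..."}],
    beam_size=4,  # parameter name from these notes
    nbest=1,      # parameter name from these notes
)
```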
Changed
- Changed the backend implementation of MoverScore to use a non-IDF-dict-based version.
- Changed the default BLEURT version to `"BLEURT-20"` instead of `"bleurt-base-128"` and enabled the length-batched optimization.
v0.1.4
Changed
- Relaxed the `datasets` version requirement to match the GEM Metrics library
- Moved some dependencies into `dev-requirements.txt`
Fixed
- Removed warnings that could appear if the Docker clients are not closed.
v0.1.3
Added
- Added CLIPScore
- Added a QA SRL Parser
- Added SUPERT
- Added BLANC
- Added METEOR
- Added a role question generator from Pyatkin et al. (2021)
- Added using Prism as an MT model
- Added COMET
Fixed
- Fixed an error in Lite3Pyramid by updating to a newer version of the code.
v0.1.2
Changed
- Changed backend of Lite3Pyramid to use our own fork of the official repo with some modifications.
v0.1.1 - 2021-10-05
Added
- Added Benepar
- Added Lite3Pyramid
- Added BARTScore
Changed
- Fixed silly variable name typo: `DOCKERHUB_REPRO` to `DOCKERHUB_REPO`
v0.1.0
Added
- Added DAE
- Added FactCC and FactCCX
- Added utilities to remove empty inputs and insert values at specific indices
- Added automatically building and publishing model images
- Added a command to pull default Docker images for each model
- Added SummaQA
- Added NUBIA
- Added Prism
Changed
- BERTScore now returns 0 for its metrics if the input is empty.
- BLEURT now returns the mean and max scores over the references.
- Changed Lewis et al. (2020) to download CNN/DM and XSum models by default
- Changed Liu et al. (2019) to download all models by default
v0.0.3
Added
- Added BLEURT
- Added BERTScore
- Added BLEU and SentBLEU
- Added QuestEval
- Added MoverScore
- Added FEQA
Changed
- Changed the QAEval interface to match other text generation metrics; the backend was also changed so it no longer relies on SacreROUGE. (A hedged sketch of the shared interface follows.)
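These notes do not spell out the shared metric interface that QAEval was aligned with, so the following is only a guess at its shape; the import path, method name, and return structure are all assumptions.

```python
# Hedged sketch of a common text-generation-metric call. Everything about
# the signature here is an assumption.
from repro.models.deutsch2021 import QAEval  # assumed import path

metric = QAEval()
results = metric.predict_batch(
    [{"candidate": "The cat sat on the mat.",
      "references": ["A cat was sitting on the mat."]}]
)
print(results)
```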
v0.0.2
Added
- Added a `RecipeGenerationModel` class
- Added a recipe generation model from Dugan et al. (2020)
- Added a `TruecasingModel` class
- Added an RNN-based truecaser from Susanto et al. (2016), based on an existing implementation
- Added the question-generation and question-answering models used in the QAEval metric
- Added ROUGE
- Added `--predict-kwargs` arguments to the `predict` command (see the CLI sketch after the Changed list below)
- Added support for running and writing evaluation metrics, for instance, ROUGE
- Added a jsonl dataset reader (`JSONLinesDatasetReader`)
- Added the `SQuADv2Evaluation` metric
- Added the BART-based sentence-guided models from Dou et al. (2021)
- Added the LERC model from Chen et al. (2020)
- Added the QAEval metric
- Added a wrapper around the original Perl implementation of ROUGE
Changed
- Renamed the `--model-args`, `--dataset-reader-args`, and `--output-write-args` arguments of `predict` to `--model-kwargs`, `--dataset-reader-kwargs`, and `--output-write-kwargs` (a hedged example follows this list)
- Renamed the `--output-file` argument in `predict` to `--output` to allow for output files or directories
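A hedged example of the flags above: `--model-kwargs`, `--dataset-reader-kwargs`, `--predict-kwargs`, and `--output` all appear in these notes, but the values shown are placeholders, and any other arguments `predict` requires (such as selecting the model and input) are omitted because their exact names are not given here.

```bash
# Hedged sketch: flag names come from these notes; values are placeholders.
repro predict \
    --model-kwargs '{"device": 0}' \
    --dataset-reader-kwargs '{"lowercase": true}' \
    --predict-kwargs '{"batch_size": 16}' \
    --output output.jsonl
```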
v0.0.1
Initial prototype of the library with `setup` and `predict` commands, as well as implementations of Gupta et al. (2020), Lewis et al. (2020), and Liu & Lapata (2019). (A hedged workflow sketch follows.)
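A hedged sketch of the two initial commands; the command names come from this entry, while the model identifier and remaining arguments are placeholders.

```bash
# Hedged sketch: `setup` prepares a model's environment, `predict` runs it.
# <model-name> is a placeholder; other required arguments are not given in
# these notes and are elided.
repro setup <model-name>
repro predict <model-name> --output output.jsonl
```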