feat(evaluate): add signwriting evaluation metrics #1104
Closed
This pull request fixes #1103
The changes here are complete and working, but we may want a different solution if this is too much hardcoding for you all: for example, an external evaluation script plus a single custom metric that calls that script.
Note that, unlike BLEU and other metrics, this evaluation takes all factors into account, because they matter for SignWriting. The evaluation also works in a non-factored setup.
What do you think? How can we integrate this into Sockeye?
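To make the external-script alternative concrete, here is a minimal sketch of what a single custom metric wrapping a script could look like. It is not part of this PR and does not use any Sockeye API. The function name `external_metric`, the calling convention (the script receives a hypothesis file and a reference file and prints one float), and the toy exact-match scorer are all assumptions for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List, Sequence


def external_metric(hypotheses: Sequence[str],
                    references: Sequence[str],
                    command: List[str]) -> float:
    """Score hypotheses against references with an external script.

    Assumed contract (hypothetical, not defined by Sockeye): the script is
    invoked as `command + [hyp_path, ref_path]`, reads one segment per line
    from each file, and prints a single float score on stdout.
    """
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")
    with tempfile.TemporaryDirectory() as tmp:
        hyp_path = Path(tmp) / "hyp.txt"
        ref_path = Path(tmp) / "ref.txt"
        hyp_path.write_text("\n".join(hypotheses) + "\n", encoding="utf-8")
        ref_path.write_text("\n".join(references) + "\n", encoding="utf-8")
        # check=True raises CalledProcessError if the script exits non-zero.
        result = subprocess.run(command + [str(hyp_path), str(ref_path)],
                                capture_output=True, text=True, check=True)
    return float(result.stdout.strip())


# Toy stand-in for a real SignWriting scorer: line-level exact-match accuracy.
TOY_SCORER = (
    "import sys\n"
    "hyp = open(sys.argv[1], encoding='utf-8').read().splitlines()\n"
    "ref = open(sys.argv[2], encoding='utf-8').read().splitlines()\n"
    "print(sum(h == r for h, r in zip(hyp, ref)) / max(len(ref), 1))\n"
)

if __name__ == "__main__":
    # Placeholder segments; real input would be SignWriting output lines.
    score = external_metric(["sign_a", "sign_b"], ["sign_a", "sign_c"],
                            [sys.executable, "-c", TOY_SCORER])
    print(score)
```

With this shape, all SignWriting-specific logic (including how factors are combined) would live in the script, and the toolkit would only need one generic hook that forwards the two files and reads back a number.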
This is very important to me: it would push forward the adoption of SignWriting as a written form of signed languages by allowing both transcription and translation models to be trained with Sockeye.
Tests fail with the following error, despite an NVIDIA Tesla T4 GPU being available:
Pull Request Checklist
- … until you can check this box.
- … (`pytest`)
- … (`pytest test/system`)
- … (`./style-check.sh`)
- … `sockeye/__init__.py`. Major version bump if this is a backwards incompatible change.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.