1.29.0 (2025-01-13)
CI
- ci: fix model loading test (#1775)
- pass base branch into the make command as an arg
- test a file that has custom wrapper
- what about overview
- just dont check overview
- revert instance check
- explicitly omit overview and init
- remove test change
- try on a lot of models
- revert test model file
Co-authored-by: Isaac Chung <[email protected]> (9b117a8)
Feature
- feat: Update task filtering, fixing bug which included cross-lingual tasks in overly many benchmarks (#1787)
- feat: Update task filtering, fixing bug on MTEB
- Updated task filtering, adding the exclusive_language_filter and hf_subsets arguments
- Fixed a bug in MTEB where cross-lingual splits were included
- Added missing language filtering to MTEB(europe, beta) and MTEB(indic, beta)
The following code outlines the problem:
import mteb
from mteb.benchmarks import MTEB_ENG_CLASSIC
task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
# this was previously equivalent to:
task = mteb.get_task("STS22", languages=["eng"])
task.hf_subsets
# which kept every subset containing English, including cross-lingual ones:
# ['en', 'de-en', 'es-en', 'pl-en', 'zh-en']
# however, it should be:
# ['en']
# with the changes it is:
task = [t for t in MTEB_ENG_CLASSIC.tasks if t.metadata.name == "STS22"][0]
task.hf_subsets
# ['en']
# which is equivalent to:
task = mteb.get_task("STS22", hf_subsets=["en"])
# you can also obtain this using exclusive_language_filter (though not if there were multiple English splits):
task = mteb.get_task("STS22", languages=["eng"], exclusive_language_filter=True)
- format
- remove "en-ext" from AmazonCounterfactualClassification
- fixed mteb(deu)
- fix: simplify in a few areas (4a70e5d)
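The difference between the old and the new STS22 filtering described in this entry can be illustrated with a small standalone sketch. It does not use or reproduce mteb's implementation: the `SUBSET_LANGUAGES` mapping, the plain language codes, and the `filter_subsets` helper are all hypothetical, and only the English-containing subsets listed above are included. It shows why an "any language matches" filter keeps cross-lingual subsets while an exclusive filter does not.

```python
# Hypothetical data: the STS22 subset names quoted in this changelog entry,
# mapped to the languages each subset contains. The language codes are an
# assumption made for illustration only.
SUBSET_LANGUAGES = {
    "en": ["eng"],
    "de-en": ["deu", "eng"],
    "es-en": ["spa", "eng"],
    "pl-en": ["pol", "eng"],
    "zh-en": ["zho", "eng"],
}


def filter_subsets(subset_languages, languages, exclusive=False):
    """Return subset names matching the requested languages.

    Hypothetical helper, not part of mteb.
    exclusive=False keeps a subset if ANY of its languages is requested.
    exclusive=True keeps a subset only if ALL of its languages are requested.
    """
    wanted = set(languages)
    if exclusive:
        return [
            name
            for name, langs in subset_languages.items()
            if set(langs) <= wanted
        ]
    return [
        name
        for name, langs in subset_languages.items()
        if wanted & set(langs)
    ]


# Non-exclusive filtering keeps the cross-lingual subsets as well.
print(filter_subsets(SUBSET_LANGUAGES, ["eng"]))
# ['en', 'de-en', 'es-en', 'pl-en', 'zh-en']

# Exclusive filtering keeps only the monolingual English subset.
print(filter_subsets(SUBSET_LANGUAGES, ["eng"], exclusive=True))
# ['en']
```

In this sketch, an exclusive filter would still return several subsets if more than one of them were purely English, which matches the caveat noted for exclusive_language_filter above.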