Releases: instructlab/eval
v0.4.0
What's Changed
- build(deps): bump rhysd/actionlint from 1.7.2 to 1.7.3 in /.github/workflows by @dependabot in #142
- Add missing comment for error_rate return by @danmcp in #141
- build(deps): bump rojopolis/spellcheck-github-actions from 0.42.0 to 0.43.0 by @dependabot in #147
- build(deps): bump actions/checkout from 4.2.0 to 4.2.1 by @dependabot in #146
- build(deps-dev): update pre-commit requirement from <4.0,>=3.0.4 to >=3.0.4,<5.0 by @dependabot in #145
- build(deps): bump pypa/gh-action-pypi-publish from 1.10.2 to 1.10.3 by @dependabot in #144
- chore: rename 'basic-workflow-tests' to 'e2e-custom' by @nathan-weinberg in #152
- build(deps): bump rojopolis/spellcheck-github-actions from 0.43.0 to 0.43.1 by @dependabot in #154
- Give nice error for empty taxonomy by @danmcp in #151
- ci: change small E2E CI job to medium by @nathan-weinberg in #155
- ci: add large-size E2E CI job by @nathan-weinberg in #157
- ci: use org variable for AWS EC2 AMI in E2E CI jobs by @nathan-weinberg in #159
- build(deps): bump rojopolis/spellcheck-github-actions from 0.43.1 to 0.44.0 by @dependabot in #160
- build(deps): bump actions/setup-python from 5.2.0 to 5.3.0 by @dependabot in #161
- ci: convert med E2E CI job to L4 GPU by @nathan-weinberg in #162
- build(deps): bump actions/checkout from 4.2.1 to 4.2.2 by @dependabot in #158
- build(deps): bump pypa/gh-action-pypi-publish from 1.10.3 to 1.11.0 by @dependabot in #164
- feat: use custom http_client by @leseb in #163
- build(deps): bump hynek/build-and-inspect-python-package from 2.9.0 to 2.10.0 by @dependabot in #166
- build(deps): bump machulav/ec2-github-runner from 2.3.6 to 2.3.7 by @dependabot in #167
- Add facilities for unit and functional tests by @danmcp in #165
- build(deps): bump rhysd/actionlint from 1.7.3 to 1.7.4 in /.github/workflows by @dependabot in #168
- build(deps): bump pypa/gh-action-pypi-publish from 1.11.0 to 1.12.0 by @dependabot in #170
- build(deps): bump rojopolis/spellcheck-github-actions from 0.44.0 to 0.45.0 by @dependabot in #171
- build(deps): bump pypa/gh-action-pypi-publish from 1.12.0 to 1.12.2 by @dependabot in #175
- Add check data unit tests by @danmcp in #169
- Undo commit of unit cov and add to gitignore by @danmcp in #172
- Remove functional test output and add to .gitignore by @danmcp in #173
- Add model adapter unit tests by @danmcp in #174
Full Changelog: v0.3.1...v0.4.0
v0.3.1
v0.3.0
What's Changed
- build(deps): bump pypa/gh-action-pypi-publish from 1.10.1 to 1.10.2 by @dependabot in #133
- build(deps): bump rojopolis/spellcheck-github-actions from 0.41.0 to 0.42.0 by @dependabot in #132
- docs: update README with more contextual eval info by @nathan-weinberg in #130
- github: add stale bot to eval repo by @nathan-weinberg in #136
- ci: fix lint action by @nathan-weinberg in #137
- build(deps): bump rhysd/actionlint from 1.7.1 to 1.7.2 in /.github/workflows by @dependabot in #134
- Bump sigstore/gh-action-sigstore-python from 2.1.1 to 3.0.0 by @dependabot in #76
- build(deps): bump actions/checkout from 4.1.7 to 4.2.0 by @dependabot in #139
- Remove max_workers and serving_gpus from constructor by @danmcp in #140
- return overall_score from MTBenchBranch.judge_answers() by @alimaredia in #138
Note: This release contains two backwards-incompatible changes:
- Remove max_workers and serving_gpus from constructor by @danmcp in #140
- return overall_score from MTBenchBranch.judge_answers() by @alimaredia in #138
Full Changelog: v0.2.1...v0.3.0
v0.2.1
What's Changed
- update README by @sallyom in #108
- Use single answer file and model list (backport #110) by @mergify in #112
- mergify: add mergify configuration by @nathan-weinberg in #114
- Bump step-security/harden-runner from 2.8.1 to 2.9.1 by @dependabot in #94
- ci: move E2E runner from github to AWS by @nathan-weinberg in #118
- docs: add initial release strategy doc and CHANGELOG by @nathan-weinberg in #91
- CI: Fix working directories to be relative by @danmcp in #120
- Bump actions/setup-python from 5.1.1 to 5.2.0 by @dependabot in #119
- Bump actions/checkout from 4.1.6 to 4.1.7 by @dependabot in #116
- build(deps): bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.0 by @dependabot in #122
- ci: add AWS tags to show github ref and PR num for all jobs by @nathan-weinberg in #123
- Bump rojopolis/spellcheck-github-actions from 0.38.0 to 0.41.0 by @dependabot in #96
- build(deps): bump pypa/gh-action-pypi-publish from 1.10.0 to 1.10.1 by @dependabot in #124
- build(deps): bump hynek/build-and-inspect-python-package from 2.6.0 to 2.9.0 by @dependabot in #125
- build(deps): bump DavidAnson/markdownlint-cli2-action from 16.0.0 to 17.0.0 by @dependabot in #126
- build(deps): bump step-security/harden-runner from 2.9.1 to 2.10.1 by @dependabot in #127
- Add comment to make it clear how the code is working by @danmcp in #105
- Allow for external serving to be used with mmlu by @danmcp in #99
- Better path and string handling by @danmcp in #106
- Improve logging by @danmcp in #111
- Cleanup usage of load model answers by @danmcp in #115
- add option to pass 'api_key' to gen_answers, judge_answers by @sallyom in #128
- e2e: only run PR job if certain files are changed by @nathan-weinberg in #131
- Allow max_workers to be passed in after evaluator is created by @danmcp in #107
- Remove fastchat dependency by @danmcp in #98
Full Changelog: v0.2.0...v0.2.1
v0.1.2
v0.2.0
What's Changed
- Changing few_shots default to 5 by @danmcp in #92
- Don't sleep on last retry attempt by @booxter in #84
- github: add action to free runner disk space for tox installs by @nathan-weinberg in #93
- Remove remaining print()s from the library by @booxter in #86
- Fix e2e by removing old option by @danmcp in #102
- Default to merge_system_user_message if mistral model detected by @danmcp in #100
- Dont retry on connection failure by @danmcp in #103
- Add optional auto tuning for max_workers by @danmcp in #101
Full Changelog: v0.1.1...v0.2.0
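The retry changes in this release (#84 and #103) describe a common pattern: sleep between attempts but not after the final one, and give up immediately on a connection failure. The following is a minimal sketch of that pattern, not the library's actual code. The function name `call_with_retries`, its parameters, and the use of the built-in `ConnectionError` are all assumptions made for illustration.

```python
import time


def call_with_retries(fn, max_attempts=3, delay=1.0, sleep=time.sleep):
    """Call fn() up to max_attempts times (hypothetical helper).

    Sleeps only between attempts, never after the last one, and does not
    retry at all when the failure is a connection error.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ConnectionError:
            # Assumption: a connection failure will not fix itself, so re-raise.
            raise
        except Exception as exc:
            last_error = exc
            # Skip the sleep on the final attempt: nothing follows it.
            if attempt < max_attempts:
                sleep(delay)
    raise last_error
```

With `max_attempts=3`, a call that always fails with a non-connection error is attempted three times and sleeps twice.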
v0.1.1
What's Changed
- Bump sigstore/gh-action-sigstore-python from 2.1.1 to 3.0.0 by @dependabot in #70
- Revert "Bump sigstore/gh-action-sigstore-python from 2.1.1 to 3.0.0" by @alinaryan in #72
- feat: add new InvalidModelError and handling by @nathan-weinberg in #79
- small docs update for clarity by @makelinux in #81
- fix: use the context correctly in mt_bench_branch by @bcrochet in #90
- fix: catch KeyError in mt_bench_branch by @bcrochet in #89
- fix: mt_bench_branch should ignore knowledge in generate by @bcrochet in #88
New Contributors
- @makelinux made their first contribution in #81
- @bcrochet made their first contribution in #90
Full Changelog: v0.1.0...v0.1.1
v0.1.0
What's Changed
- Fixing up test case after api changes to add error_rate by @danmcp in #63
- Inherit logging from caller rather than from vLLM by @danmcp in #66
- Update batch size description and allow for str by @danmcp in #67
- Don't set basicConfig from libraries by @danmcp in #69
Full Changelog: v0.0.9...v0.1.0
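The logging changes in this release (#66 and #69) follow the usual Python convention that a library should not configure logging itself and should leave that to the calling application. The following is a minimal sketch of that convention, not code from this repository. The logger name `example_eval_lib` and the function `run_evaluation` are hypothetical.

```python
import logging

# Library side: use a named logger and never call logging.basicConfig().
# The NullHandler keeps the library silent unless the caller configures logging.
logger = logging.getLogger("example_eval_lib")
logger.addHandler(logging.NullHandler())


def run_evaluation():
    """Hypothetical library function that logs through the library logger."""
    logger.info("starting evaluation")
    return "done"


if __name__ == "__main__":
    # Application side: the caller decides the level, format, and destination.
    logging.basicConfig(level=logging.INFO)
    run_evaluation()
```

Because the library logger has no level of its own, it inherits the effective level from whatever the caller configures on the root logger.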