Release 2.7.0

@RusticLuftig released this 17 Dec 16:47
de1346a

Announcing Release 2.7.0

In this release we introduce a new-and-improved IFEval evaluation task that will take effect at block 4_523_592. Details below!
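As a rough illustration of a block-gated rollout like this one, the sketch below picks the task version from the current block number. The constant and function names are hypothetical, and treating the activation block itself as already using V2 (`>=`) is an assumption; the validator's real switch-over logic may differ.

```python
# Activation block quoted in these release notes.
IFEVAL_V2_ACTIVATION_BLOCK = 4_523_592


def ifeval_version_for_block(block: int) -> str:
    """Return the IFEval task version assumed to be active at `block`.

    Hypothetical helper: assumes V2 applies from the activation block onward.
    """
    if block >= IFEVAL_V2_ACTIVATION_BLOCK:
        return "IFEvalV2"
    return "IFEval"
```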

IFEvalV2

  • Miners are currently performing very well on the initial set of rules in IFEval. In fact, it's too easy! In V2, we've introduced a new set of rules that are notably harder.
  • We've also increased the number of rules per prompt to 2-5, up from 1-4.
  • We've also changed how miners are scored on IFEval, to better reward models that correctly adhere to all the rules in a single prompt. Previously, the score was simply the ratio of correctly handled rules across all samples.
  • IFEvalV2 is worth 10% of the total score, up from 5%.
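To make the scoring change more concrete, here is a minimal sketch. `rule_level_score` follows the previous scoring as described above (ratio of correctly handled rules across all samples). `prompt_level_score` is only one possible way to reward full adherence within a prompt; it is an assumption for illustration, not the formula IFEvalV2 actually uses, and both function names are hypothetical.

```python
from typing import List


def rule_level_score(results: List[List[bool]]) -> float:
    """Previous scoring as described: passed rules / total rules over all samples."""
    total = sum(len(sample) for sample in results)
    if total == 0:
        return 0.0
    passed = sum(sum(sample) for sample in results)
    return passed / total


def prompt_level_score(results: List[List[bool]]) -> float:
    """Hypothetical stricter scoring: fraction of samples where every rule passed."""
    if not results:
        return 0.0
    fully_correct = sum(1 for sample in results if sample and all(sample))
    return fully_correct / len(results)


# Each inner list holds pass/fail results for the rules attached to one prompt.
results = [
    [True, True, True],
    [True, False],
    [True, True, False, True],
]
```

With this sample data, 7 of 9 rules pass, but only 1 of 3 prompts has every rule satisfied, so a prompt-level score penalizes partial adherence much more heavily than the rule-level ratio.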

Other updates

  • As promised, this release improves the logging, reducing a lot of the noise from bt 8.4.3.
  • Some miners were experiencing timeouts during model upload, so we have increased the TTL from 30 to 60 seconds.

Validators should update as soon as they can. Note that because the required dependency versions have changed, you will need to rerun:

python -m pip install -e .