Releases: macrocosm-os/finetuning

Release 2.8.0

04 Jan 22:32
e52210b

Announcing Release 2.8.0

In 2025, we'd like to both improve the quality of the existing instruct competition and add more specialized competitions. As a first step on that path, we're sunsetting the B7_MULTI_CHOICE competition at block 4_675_163 and giving full weight to INSTRUCT_8B.

Changes

  • The IFEval evaluation task's weight increases to 20% at block 4_675_163, and the weight on MMLU decreases to 65%.
  • Fixed a bug causing the random seeds to differ across validators (it turns out Python's hash() function isn't deterministic across sessions/machines...who knew? 🤔). This should reduce model evaluation variance and improve vTrust.
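
For the curious, here is a minimal sketch of the kind of session-stable seeding that avoids the hash() pitfall; the function name and sample input are illustrative, not the validator's actual code.

    import hashlib

    def stable_seed(text: str) -> int:
        """Derive a deterministic integer seed from a string.

        Unlike the built-in hash(), which is randomized per process via
        PYTHONHASHSEED, sha256 yields the same digest on every machine.
        """
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    # The same input now maps to the same seed on every validator.
    print(stable_seed("sync-block-4675163"))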

Release 2.7.2

23 Dec 21:02
20bc7d6

Announcing Release 2.7.2

In this release we greatly improved the consistency of our weight setting by moving it to a dedicated thread, and also updated to bittensor 8.5.1. CR3 will be enabled in a few days once we see validators have updated.

Other updates

  • Starting at block 4_552_883 we will change the competition weight distribution between comp 2 and comp 3 from 90/10 to 75/25 (sketched after this list).
  • Future changes to existing competitions (like IfEvalV1->V2) will no longer cause a long bottleneck in picking up new models.
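
As an illustration of how a block-gated split like this can be expressed, here is a minimal sketch; the function and constant names are ours, not the repository's actual schedule code.

    SPLIT_CHANGE_BLOCK = 4_552_883  # from the note above

    def competition_split(block: int) -> tuple[float, float]:
        """Weight split between comp 2 and comp 3 at a given block height."""
        # Before the change block: 90/10; from the change block on: 75/25.
        return (0.75, 0.25) if block >= SPLIT_CHANGE_BLOCK else (0.90, 0.10)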

Validators should update as soon as they can. Note that due to requirement version updates you will need to rerun

python -m pip install -e .

Release 2.7.1

19 Dec 16:34
32aba23

Announcing Release 2.7.1

This release contains prompt improvements for two of the upcoming IfEvalV2 tasks.

Validators, please update before the IfEvalV2 start block of 4,523,592.

Release 2.7.0

17 Dec 16:47
de1346a

Announcing Release 2.7.0

In this release we introduce a new-and-improved IFEval evaluation task that will take effect at block 4_523_592. Details below!

IFEvalV2

  • Miners are currently performing very well on the initial set of rules in IFEval. In fact, it's too easy! In V2, we've introduced a new set of rules that are notably harder.
  • We've also increased the number of rules per prompt to 2-5, up from 1-4.
  • We've also changed how miners are scored on IFEval to better reward models that correctly adhere to all the rules in a single prompt. Previously, the score was simply the ratio of correctly handled rules across all samples (see the sketch after this list).
  • IFEvalV2 is worth 10% of the total score, up from 5%.
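
To make the scoring change concrete, here is a minimal sketch contrasting the two approaches. The exact V2 formula isn't spelled out here, so treat the per-prompt variant as illustrative of the direction of the change rather than the actual implementation.

    def score_per_rule(results):
        """V1-style: ratio of correctly handled rules across all samples."""
        rules = [ok for sample in results for ok in sample]
        return sum(rules) / len(rules)

    def score_per_prompt(results):
        """V2-style (illustrative): a sample only counts if every rule passes."""
        return sum(all(sample) for sample in results) / len(results)

    # Two prompts with three rules each; the second prompt misses one rule.
    results = [[True, True, True], [True, True, False]]
    print(score_per_rule(results))    # 5/6 ≈ 0.83
    print(score_per_prompt(results))  # 1/2 = 0.5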

Other updates

  • As promised, this release improves the logging, reducing a lot of the noise from bt 8.4.3.
  • Some miners were experiencing timeouts during model upload, so we have increased the TTL from 30 to 60 seconds.

Validators should update as soon as they can. Note that due to requirement version updates you will need to rerun

python -m pip install -e .

Release 2.6.0

05 Dec 03:54
c6dce9d

Announcing Release 2.6.0

This release introduces a 2nd competition starting at block 4,451,695: INSTRUCT 8B. As we strive for SOTA models, we believe it's important that miners have as much flexibility as possible - hence, this competition allows you to bring-your-own-tokenizer!

Other updates

  • This change also updates the subnet to bittensor 8.4.3. As part of this update, we have found the logs are quite noisy. However, we wanted to get this release out to kick off the new competition! We will greatly improve validator logging in the next release.
  • Note: one benign log that shows up frequently is:

    File "/usr/lib/python3.10/multiprocessing/connection.py", line 383, in _recv
        raise EOFError

Validators should update as soon as they can. Note that due to requirement version updates you will need to rerun

python -m pip install -e .

Release 2.5.1

21 Nov 16:24
1237391

This is a small release that includes a fix for the upcoming IfEval task and an improvement for validator weight setting.

Subnet

  • IfEval rules regarding word and sentence counts have been adjusted to align better with token generation limits. Thanks @PawKanarek

Validators

  • Create a new subtensor instance for each set weights attempt.

Release 2.5.0

19 Nov 03:22
7f23ad9

This release adds a fourth evaluation task (IfEval) into the current competition starting on block 4,344,030. At this time the weighting of each task will be 85% MMLU, 5% Word Sorting, 5% Fineweb, and 5% IfEval.
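
As a rough illustration of how such a weighting combines per-task results into one score, consider the sketch below. The scores are made up, and the real combination may normalize or invert individual tasks (e.g., lower loss is better on Fineweb), so this only shows the weighted-sum idea.

    # Hypothetical normalized per-task scores (higher is better).
    task_scores  = {"mmlu": 0.62, "word_sorting": 0.40, "fineweb": 0.55, "ifeval": 0.30}
    task_weights = {"mmlu": 0.85, "word_sorting": 0.05, "fineweb": 0.05, "ifeval": 0.05}

    combined = sum(task_scores[t] * task_weights[t] for t in task_weights)
    print(round(combined, 4))  # 0.5895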

Subnet

  • Added new IfEval (Instruction Following) evaluation task.
  • This evaluation scores models based on how well they follow generated rules about their response. To start with, this will include rules about casing, comma usage, word count, and sentence count (see the sketch after this list).
  • Includes a check to make sure models are generating reasonable output, meaning they are not reusing the same response for the same rules when asked different questions.
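
For a flavor of what such rule checks can look like, here are two minimal, illustrative examples; the function names and signatures are ours, not taken from the subnet's code.

    def check_word_count(response: str, min_words: int, max_words: int) -> bool:
        """Illustrative rule: the response's word count must fall within bounds."""
        n = len(response.split())
        return min_words <= n <= max_words

    def check_all_lowercase(response: str) -> bool:
        """Illustrative casing rule: the response must contain no uppercase letters."""
        return response == response.lower()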

Validators

  • The expected time per evaluation cycle has increased due to the new evaluation task.

  • TTLs have been adjusted and each model is required to complete all evaluation tasks in 12 minutes.

  • Alpha has also been adjusted. Models should first receive weight after 2 cycles (~360 blocks) and will receive all weight after 17 cycles (~3060 blocks) of consecutive wins.

  • Output width is set explicitly to improve readability of pm2 rich tables in logging. Thanks coldint!

Miners

This release requires running pip install -e . to pick up the latest dependencies

Release 2.4.1

14 Nov 17:37
726fb93

Announcing Release 2.4.1

This release is focused on improving vTrust by adjusting the speed at which models receive (and lose) weight internally for each validator.

Subnet

  • Leaderboard has been updated to better handle cases where an old model becomes the top model due to a competition adjustment.

Validators

  • The alpha validators use for their weight moving average has been adjusted from 0.5 to 0.05.

    • This will improve vTrust when a new top model arrives, since validators will no longer shift their weights so rapidly.
  • The minimum internal weight before a validator starts setting weights on the chain for a miner has been adjusted to 0.1.

    • This will help avoid blips where one model gets lucky on a single set of samples.
    • It takes 3 cycles of winning in a row to go from 0 weight to 0.143 weight and cross the 0.1 threshold.
    • It takes 45 cycles of winning in a row to go from 0 weight to 0.901 weight and ensure one model receives all of the weight.
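
These cycle counts are consistent with a standard exponential moving average update, w ← alpha * 1 + (1 - alpha) * w, starting from 0. A minimal sketch, assuming that update rule (our formula, not the validator's code):

    def weight_after_wins(alpha: float, cycles: int) -> float:
        """Internal weight after `cycles` consecutive wins, starting from 0,
        under the update w = alpha * 1 + (1 - alpha) * w."""
        return 1.0 - (1.0 - alpha) ** cycles

    alpha = 0.05
    print(round(weight_after_wins(alpha, 3), 3))   # 0.143 -> crosses the 0.1 threshold
    print(round(weight_after_wins(alpha, 45), 3))  # 0.901 -> effectively all the weight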

Release 2.4.0

07 Nov 03:00
f9c499a

Announcing Release 2.4.0

This release incorporates a third evaluation task (Fineweb) into the current competition starting on block 4,250,808. At this time the weighting of each dataset will be 90% MMLU, 5% Word Sorting, and 5% Fineweb.

Subnet

  • Added new Fineweb evaluation task.
    • This evaluation scores models based on the computed average cross entropy loss on samples from Fineweb.
    • It is the same evaluation from subnet 9. Including it helps ensure the finetuned models do not lose too much of their original context.
    • Includes a check to make sure models are generating reasonable output, meaning they are not too repetitive within or across responses (see the sketch after this list).
  • Improved definition of the competition schedule to include eval tasks.
    • This makes it easier to add new evaluations to competitions at specific weights and makes it easier to view them as a miner.
    • See COMPETITION_SCHEDULE_BY_BLOCK in constants/__init__.py to view for yourself.
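
As a rough illustration of the kind of repetition check described above, here is a minimal sketch; the function name, the n-gram size, and the threshold are illustrative assumptions, not the subnet's actual check.

    def too_repetitive(response: str, n: int = 3, max_ratio: float = 0.5) -> bool:
        """Illustrative check: flag a response whose repeated n-grams
        make up more than `max_ratio` of all its n-grams."""
        words = response.split()
        ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
        if not ngrams:
            return False
        duplicates = len(ngrams) - len(set(ngrams))
        return duplicates / len(ngrams) > max_ratio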

Validators

  • Improved the logic around strategy selection for sharing files across subprocess boundaries. This will help avoid overflowing /dev/shm.

Miners

  • The new dataset loader for the fineweb task can be found at https://github.com/macrocosm-os/finetuning/blob/main/finetune/datasets/hugging_face/hugging_face_loader.py.

    • As mentioned, this will be incorporated into the existing competition starting at block 4,250,808, so please take this into consideration for your training.
    • Note that this supports general Hugging Face datasets. Currently constants are included for Falcon and Fineweb. The current competition is only using Fineweb data.

Validators should update as soon as they can. Note that due to requirement version updates you will need to rerun

python -m pip install -e .

Release 2.3.0

01 Nov 21:10
9598e92

This release addresses the current wandb sampling issue from SN 1 and adds functionality to improve vTrust.

vTrust improvements:

  • We've improved the PromptingDatasetLoader to more reliably and consistently fetch samples. Validators will now fetch 700 samples instead of 400.
  • Validators now align to "sync blocks" to use the same set of eval samples, as well as to pace how frequently evaluations are performed (see the sketch after this list). This should improve vTrust across the board, particularly in situations where the top model changes.
  • Miner weights are now fully converted to winner-takes-all, where exactly one model will receive weight. Previously a 2nd model could receive a small amount of weight (due to soft-maxing of weights) if enough models were evaluated in a batch.
  • Added better retry behavior for set_weights.
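
As a rough illustration of the sync-block idea, here is a minimal sketch; the cadence constant is an assumed placeholder, not the subnet's actual value.

    SYNC_CADENCE = 360  # assumed cadence in blocks; the real value may differ

    def last_sync_block(current_block: int) -> int:
        """Validators that round down to the same sync block can seed their
        sample draw identically and evaluate the same data."""
        return (current_block // SYNC_CADENCE) * SYNC_CADENCE

    print(last_sync_block(4_451_712))  # 4451400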