Replies: 3 comments 4 replies
-
You can check out the lengthy discussions in the related PRs (e.g. #2632).
-
llama.cpp (i.e.
It is a cheap progress bar whose length is proportional to the loss improvement over the first loss encountered during this training run.
Training stops when either the number of iterations (--adam-iter N) or the number of epochs (--epochs N) is reached, whichever happens first.
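In pseudocode, the logic is roughly like this (a simplified illustrative sketch, not the actual finetune source; the bar width, helper names, and fake loss values are made up):

```python
import random

BAR_WIDTH = 60       # illustrative maximum bar width
N_SAMPLES = 2308     # stands in for the 2308 training samples

def train_step(it):
    # stand-in for one optimizer step; returns a slowly decreasing fake loss
    return 13.55 * (0.97 ** it) + random.uniform(0.0, 0.1)

def train(max_iters, max_epochs, batch_size):
    first_loss = None
    seen = 0
    for it in range(max_iters):                  # --adam-iter limit
        if seen // N_SAMPLES >= max_epochs:      # --epochs limit, whichever comes first
            break
        loss = train_step(it)
        seen += batch_size
        if first_loss is None:
            first_loss = loss
        # bar length is proportional to the loss improvement over the first loss
        improvement = max(0.0, (first_loss - loss) / first_loss)
        bar = "-" * int(improvement * BAR_WIDTH) + ">"
        print(f"iter={it} sample={seen % N_SAMPLES}/{N_SAMPLES} loss={loss:.6f} |{bar}")

train(max_iters=30, max_epochs=1, batch_size=4)
```

So with the numbers above, the iteration limit is hit long before a full epoch over the samples, which matches what you are seeing.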
-
@jooray How is your fine-tuning with llama.cpp going so far? I am looking for a guide on this topic as well.
-
Hi,
I am playing with finetune on a MacBook M1 (although finetune does not seem to use the GPU; is GPU support planned?).
I generated a prompt-formatted dataset with a simple script.
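Roughly, the script looks like this (just an illustrative sketch; the field names, prompt template, and file names are examples, and as far as I can tell finetune only needs a plain text training file):

```python
import json

# Illustrative only: these are just the fields and template I happened to use.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n{response}\n\n"

def build_training_text(records_path, out_path):
    with open(records_path) as src, open(out_path, "w") as out:
        for line in src:                 # one JSON object per line
            rec = json.loads(line)
            out.write(PROMPT_TEMPLATE.format(
                instruction=rec["instruction"],
                response=rec["response"],
            ))

build_training_text("records.jsonl", "train.txt")
```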
I have a few questions:
train_opt_callback: iter= 11 sample=45/2308 sched=0.110000 loss=13.552320 dt=00:01:37 eta=00:31:00 |--------------->
(What does the "--->" mean? Where does it end? Is it a progress bar to infinity? :)
It says iter= 11, sample 45/2308 (2308 is my sample size, which is OK). It says it will end in 31 minutes, which it does, but it only processes a few of the samples. Should I rerun the command to continue finetuning? What are the stopping criteria? I would like to process all samples during finetuning, but it seems to have some preset number of iterations.
Should I increase --adam-iter? Should I train more than once?
Should I lower the learning rate (--adam-alpha)? It seems quite large compared to other how-tos.
Is there any discussion forum/Telegram/Discord/... where we could chat about this and collectively improve? Or is this a good place? I can help with dataset generation; I have done some experiments.