Since https://github.com/lm-sys/FastChat/ does not publish its training data, but mentions that it "enhanced the training scripts provided by Alpaca to better handle multi-round conversations and long sequences", I looked at the ShareGPT Vicuna datasets on Hugging Face, and they contain conversations.
Now I see that in this repo, `data/merge_sample.json` is used as the `data_path` for the script `supervised_finetune.py`, but it contains Alpaca-like `instruction`, `input`, `output` triples.
Can we use `supervised_finetune.py` to fine-tune on conversations, e.g. in the format of the ShareGPT Vicuna datasets on Hugging Face? If so, have you tried such fine-tuning? If not, do you know of a repo that offers Vicuna fine-tuning based on conversations? Do you think `supervised_finetune.py` could easily be adapted to allow fine-tuning on conversations?
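For what it's worth, one low-effort adaptation would be to preprocess the conversations into the triple format the script already accepts, rather than changing the script itself. Below is a minimal sketch of that idea. The field names (`conversations`, `from`, `value`, `human`, `gpt`) are my assumption based on the ShareGPT Vicuna datasets; `sharegpt_to_triples` is a hypothetical helper, not code from either repo.

```python
def sharegpt_to_triples(example):
    """Flatten one ShareGPT-style multi-turn conversation into Alpaca-like
    (instruction, input, output) triples: one triple per assistant turn,
    with the preceding turns packed into the `input` field as context."""
    triples = []
    history = []  # all turn texts seen so far, in order
    for turn in example["conversations"]:
        if turn["from"] == "gpt":
            if not history:
                continue  # a conversation should start with a human turn
            *context, last_human = history
            triples.append({
                "instruction": last_human,        # most recent human turn
                "input": "\n".join(context),      # earlier turns as context
                "output": turn["value"],          # assistant reply to learn
            })
            history.append(turn["value"])
        else:  # human (or system) turn
            history.append(turn["value"])
    return triples


# Toy conversation in the assumed ShareGPT layout.
example = {
    "conversations": [
        {"from": "human", "value": "What is LoRA?"},
        {"from": "gpt", "value": "A parameter-efficient fine-tuning method."},
        {"from": "human", "value": "Does it work for Vicuna?"},
        {"from": "gpt", "value": "Yes, Vicuna can be fine-tuned with LoRA."},
    ]
}

if __name__ == "__main__":
    for t in sharegpt_to_triples(example):
        print(t)
```

The obvious trade-off: this loses the single-loss-over-all-assistant-turns treatment that FastChat's script presumably applies, and repeats context across triples, so it is only a stopgap until the training script handles conversations natively.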