v0.6.2: ORPO and Qwen1.5-32B

@hiyouga released this 11 Apr 12:27

New features

  • Support ORPO algorithm by @hiyouga in #3066
  • Support inference with BNB 4-bit models on multiple GPUs via the quantization_device_map argument
  • Reorganize README files and move example scripts to the examples folder
  • Support saving & loading arguments quickly in LlamaBoard by @hiyouga and @marko1616 in #3046
  • Support loading alpaca-format datasets from the hub without dataset_info.json by specifying --dataset_dir ONLINE
  • Add a parameter moe_aux_loss_coef to control the coefficient of the auxiliary loss in MoE models
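
For readers new to ORPO, the objective added in this release can be illustrated with a small sketch. It follows the formulation in the ORPO paper (an SFT term on the chosen response plus a weighted odds-ratio term) and is not taken from this project's code. The function names, the `beta` default, and the use of length-normalized log-probabilities as plain floats are assumptions made for illustration.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_logp: float, rejected_logp: float, beta: float = 0.1) -> float:
    """Illustrative ORPO loss for one preference pair (hypothetical helper).

    chosen_logp / rejected_logp are length-normalized log-probabilities
    of the chosen and rejected responses under the model being trained.
    """
    # Standard SFT term: negative log-likelihood of the chosen response.
    sft_loss = -chosen_logp
    # Log odds ratio between the chosen and rejected responses.
    log_odds_ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    odds_ratio_loss = math.log1p(math.exp(-log_odds_ratio))
    return sft_loss + beta * odds_ratio_loss


# The penalty shrinks as the rejected response becomes less likely.
wide_margin = orpo_loss(chosen_logp=-0.5, rejected_logp=-2.0)
narrow_margin = orpo_loss(chosen_logp=-0.5, rejected_logp=-0.6)
print(wide_margin < narrow_margin)
```

Because the odds-ratio term only needs the policy's own log-probabilities, no separate reference model appears in the sketch, which is the main practical difference from DPO-style objectives.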

New models

  • Base models
    • Breeze-7B-Base
    • Qwen1.5-MoE-A2.7B (14B)
    • Qwen1.5-32B
  • Instruct/Chat models
    • Breeze-7B-Instruct
    • Qwen1.5-MoE-A2.7B-Chat (14B)
    • Qwen1.5-32B-Chat

Bug fix