This recipe is outdated; we now recommend symmetric quantization. To use it, drop `--asym` from the command below (a symmetric variant is shown after it).

```bash
auto-round \
    --model 01-ai/Yi-6B-Chat \
    --device 0 \
    --group_size 128 \
    --bits 4 \
    --iters 1000 \
    --nsamples 512 \
    --asym \
    --minmax_lr 2e-3 \
    --format 'auto_gptq,auto_round' \
    --output_dir "./tmp_autoround"
```
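
For reference, a minimal sketch of the recommended symmetric variant, identical except that `--asym` is dropped (the remaining hyperparameters were tuned for the asymmetric run and may warrant re-tuning):

```bash
auto-round \
    --model 01-ai/Yi-6B-Chat \
    --device 0 \
    --group_size 128 \
    --bits 4 \
    --iters 1000 \
    --nsamples 512 \
    --minmax_lr 2e-3 \
    --format 'auto_gptq,auto_round' \
    --output_dir "./tmp_autoround"
```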

Due to licensing restrictions, we are unable to release the quantized model. Install lm-eval-harness from source at git commit 96d185fa6232a5ab685ba7c43e45d1dbb3bb906d.
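
One way to pin that commit, assuming the upstream EleutherAI/lm-evaluation-harness repository:

```bash
# Clone the harness and check out the pinned commit before installing.
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
git checkout 96d185fa6232a5ab685ba7c43e45d1dbb3bb906d
pip install -e .
```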

We used the following command for evaluation. For reference, the results of the official AWQ-INT4 release (01-ai/Yi-6B-Chat-4bits) are listed as well.

```bash
lm_eval --model hf \
    --model_args pretrained="./",autogptq=True,gptq_use_triton=True,trust_remote_code=True \
    --device cuda:0 \
    --tasks ceval-valid,cmmlu,mmlu,gsm8k \
    --batch_size 16 \
    --num_fewshot 0
```
| Metric | BF16   | 01-ai/Yi-6B-Chat-4bits | INT4   |
|--------|--------|------------------------|--------|
| Avg.   | 0.6043 | 0.5867                 | 0.5939 |
| mmlu   | 0.6163 | 0.6133                 | 0.6119 |
| cmmlu  | 0.7431 | 0.7312                 | 0.7314 |
| ceval  | 0.7355 | 0.7155                 | 0.7281 |
| gsm8k  | 0.3222 | 0.2866                 | 0.3040 |
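
The Avg. row is the unweighted mean of the four task scores, which can be checked directly:

```bash
# Sanity-check the Avg. row: mean of the four INT4 task scores.
python -c "print((0.6119 + 0.7314 + 0.7281 + 0.3040) / 4)"  # 0.59385 -> 0.5939
```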