This recipe is outdated; we recommend using symmetric quantization instead. To do so, simply remove `--asym` from the command below.

A sample command to generate an INT4 model:

```bash
auto-round \
--model EleutherAI/gpt-j-6b \
--device 0 \
--group_size 128 \
--bits 4 \
--iters 1000 \
--nsamples 512 \
--asym \
--format 'auto_gptq,auto_round' \
--output_dir "./tmp_autoround"
```
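If you prefer the Python API over the CLI, the sketch below should produce a roughly equivalent asymmetric INT4 checkpoint. Constructor argument names can differ between auto-round versions, so treat this as an illustration and verify the exact signatures against the project README.

```python
# Sketch: Python-API counterpart of the CLI command above.
# Argument names (bits, group_size, sym, iters, nsamples) are assumed to match
# the current auto-round release; adjust if your installed version differs.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "EleutherAI/gpt-j-6b"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(
    model,
    tokenizer,
    bits=4,
    group_size=128,
    sym=False,      # asymmetric quantization, mirroring --asym
    iters=1000,
    nsamples=512,
)
autoround.quantize()
autoround.save_quantized("./tmp_autoround", format="auto_round", inplace=True)
```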

Install [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness) from source; we used git commit 96d185fa6232a5ab685ba7c43e45d1dbb3bb906d.

```bash
## pip install auto-gptq[triton]
## pip install triton==2.2.0
lm_eval --model hf --model_args pretrained="./",autogptq=True,gptq_use_triton=True --device cuda:0 --tasks lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,rte,arc_easy,arc_challenge,mmlu --batch_size 32
```
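Before launching the full harness sweep, it can be worth sanity-checking that the exported GPTQ-format checkpoint loads and generates. The snippet below is a minimal sketch; it assumes auto-gptq (and optimum) are installed so that transformers can load the GPTQ weights, and the path should be adjusted to wherever the GPTQ-format model was actually written.

```python
# Sketch: quick generation sanity check on the quantized checkpoint.
# `quantized_dir` is a placeholder; point it at the exported GPTQ directory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

quantized_dir = "./tmp_autoround"
tokenizer = AutoTokenizer.from_pretrained(quantized_dir)
model = AutoModelForCausalLM.from_pretrained(
    quantized_dir, device_map="auto", torch_dtype=torch.float16
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```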
| Metric | FP16 | INT4 |
| --- | --- | --- |
| Avg. | 0.5039 | 0.5034 |
| mmlu | 0.2694 | 0.2793 |
| lambada_openai | 0.6831 | 0.6790 |
| hellaswag | 0.4953 | 0.4902 |
| winogrande | 0.6409 | 0.6401 |
| piqa | 0.7541 | 0.7465 |
| truthfulqa_mc1 | 0.2020 | 0.2179 |
| openbookqa | 0.2900 | 0.2900 |
| boolq | 0.6544 | 0.6554 |
| rte | 0.5451 | 0.5271 |
| arc_easy | 0.6692 | 0.6734 |
| arc_challenge | 0.3396 | 0.3387 |