Commit c431576: fix typo
wenhuach21 committed Jun 7, 2024
1 parent: 49c0ac1
Showing 1 changed file: README.md (2 additions, 3 deletions)
@@ -57,7 +57,7 @@ pip install auto-round
## Model quantization

### Gaudi2/ CPU/ GPU
-By default, we export to the AutoRound format, which supports both CUDA and CPU backends and ensures asymmetry accuracy. To export in a format compatible with Transformers, save it in the auto_gptq format.
+We export to the AutoRound format by default, which supports both CUDA and CPU backends and ensures asymmetry accuracy. To export in a format compatible with Transformers, save it in the auto_gptq format.


```python
@@ -74,7 +74,7 @@ bits, group_size, sym = 4, 128, False
autoround = AutoRound(model, tokenizer, bits=bits, group_size=group_size, sym=sym, device=None)
autoround.quantize()
output_dir = "./tmp_autoround"
-autoround.save_quantized(output_dir) ## tsave_quantized(output_dir,format=="auto_gptq")
+autoround.save_quantized(output_dir) ## save_quantized(output_dir,format=="auto_gptq")
```
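Per the inline comment in the diff above, a Transformers-compatible export presumably goes through the same save_quantized call with a format argument. A minimal sketch, assuming the keyword is spelled format (with a single =, not the == shown in the comment) and accepts the value "auto_gptq":

```python
## Sketch only: assumes save_quantized takes a `format` keyword,
## as the inline comment above suggests (note `=`, not `==`).
autoround.save_quantized(output_dir, format="auto_gptq")
```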

<details>
@@ -136,7 +136,6 @@ Please run the quantization code first.
### CPU/GPU

```python
-## install auto-round first, for auto_gptq format, please install auto-gptq
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round.auto_quantizer import AutoHfQuantizer ## comment it for models with auto_gptq format

```
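The diff view truncates the remainder of this snippet. A minimal sketch of the inference flow the imports above set up, assuming the quantized model was saved to ./tmp_autoround as in the quantization example (the prompt and generation settings here are illustrative, not taken from the README):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round.auto_quantizer import AutoHfQuantizer  ## registers AutoRound support; per the comment above, omit for auto_gptq-format models

quantized_model_path = "./tmp_autoround"  ## directory written by save_quantized earlier
model = AutoModelForCausalLM.from_pretrained(quantized_model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(quantized_model_path)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```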
