How to use SmoothQuant and FP16 or weight-only quantization in the same LLM at the same time? #1846
focusunsink started this conversation in General
We know that SmoothQuant has little precision loss. Could we choose some layers to quantize with SmoothQuant while keeping the others in float16? Would that work, and how would we configure it?
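Conceptually, what I'd like is something like the sketch below: keep the whole model in FP16 and replace only a chosen set of Linear layers with their SmoothQuant INT8 equivalents. The `smoothquant_int8_linear` call is just a placeholder for whatever per-layer conversion the framework exposes, not a real API, and the layer names are made up for illustration.

```python
import torch
import torch.nn as nn


def smoothquant_int8_linear(linear: nn.Linear) -> nn.Module:
    """Placeholder: swap in the framework's actual SmoothQuant INT8 conversion here."""
    raise NotImplementedError("replace with the real SmoothQuant conversion call")


def mixed_precision_convert(model: nn.Module, quantize_names: set[str]) -> nn.Module:
    """Quantize only the named Linear submodules; keep everything else in FP16."""
    model = model.half()  # baseline: entire model in FP16
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear) and name in quantize_names:
            parent_name, _, child_name = name.rpartition(".")
            parent = model.get_submodule(parent_name) if parent_name else model
            setattr(parent, child_name, smoothquant_int8_linear(module))
    return model


# Hypothetical usage: quantize only the MLP down-projections of the first 16 layers.
# quantize_names = {f"model.layers.{i}.mlp.down_proj" for i in range(16)}
# model = mixed_precision_convert(model, quantize_names)
```

Is there a supported config option that expresses this kind of per-layer mix, or does it have to be done manually like this?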