How to use SmoothQuant and FP16 or weight-only quantization in the same LLM at the same time? #1846
focusunsink started this conversation in General
We know that SmoothQuant has little precision loss. Could we choose some layers to quantize with SmoothQuant while keeping the others in float16? Would that work, and how would we configure it?
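Conceptually, what I'd like is something like the sketch below: keep the whole model in FP16 and replace only a chosen set of Linear layers with their SmoothQuant INT8 equivalents. The `smoothquant_int8_linear` call is just a placeholder for whatever per-layer conversion the framework exposes, not a real API, and the layer names are made up for illustration.

```python
import torch
import torch.nn as nn


def smoothquant_int8_linear(linear: nn.Linear) -> nn.Module:
    """Placeholder: swap in the framework's actual SmoothQuant INT8 conversion here."""
    raise NotImplementedError("replace with the real SmoothQuant conversion call")


def mixed_precision_convert(model: nn.Module, quantize_names: set[str]) -> nn.Module:
    """Quantize only the named Linear submodules; keep everything else in FP16."""
    model = model.half()  # baseline: entire model in FP16
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear) and name in quantize_names:
            parent_name, _, child_name = name.rpartition(".")
            parent = model.get_submodule(parent_name) if parent_name else model
            setattr(parent, child_name, smoothquant_int8_linear(module))
    return model


# Hypothetical usage: quantize only the MLP down-projections of the first 16 layers.
# quantize_names = {f"model.layers.{i}.mlp.down_proj" for i in range(16)}
# model = mixed_precision_convert(model, quantize_names)
```

Is there a supported config option that expresses this kind of per-layer mix, or does it have to be done manually like this?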