
Load an AWQ model via Python API #1849

Open
GhostXu11 opened this issue Dec 3, 2024 · 2 comments
Labels
question Further information is requested

Comments

@GhostXu11

Hi, I recently wanted to deploy a service with LitServe and load the model with LitGPT. However, due to GPU memory constraints, I wanted to use an AWQ model, and I did not see a way to load an AWQ model in the Python API documentation. Is there a way to do this?

GhostXu11 added the question label Dec 3, 2024
@Andrei-Aksionov
Collaborator

If you want to quantize an original model to lower VRAM consumption, you can use bitsandbytes quantization, which does it on-the-fly: https://github.com/Lightning-AI/litgpt/blob/main/tutorials/quantize.md
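
For reference, a minimal sketch of what on-the-fly quantization looks like through the Python API; the `distribute()` call and the `"bnb.nf4"` / precision values mirror the CLI options described in quantize.md, but double-check the exact argument names against your installed LitGPT version:

```python
# Minimal sketch: on-the-fly bitsandbytes quantization via the LitGPT
# Python API. Assumes distribute() accepts the same quantize/precision
# values as the CLI (see tutorials/quantize.md); verify against your
# installed version.
from litgpt.api import LLM

# Load the checkpoint without placing it on a device yet, then
# distribute it with 4-bit NormalFloat quantization to cut VRAM usage.
llm = LLM.load("microsoft/phi-2", distribute=None)
llm.distribute(devices=1, quantize="bnb.nf4", precision="bf16-true")

print(llm.generate("What do Llamas eat?"))
```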

But if you already have weights in AWQ format and want to load them into LitGPT, that's not supported.
There was an attempt to support AutoGPTQ (which should also cover the AWQ format) in #924, but it was never merged.
(Bringing that PR up to date could be a cool contribution 😉.)

@GhostXu11
Author

Thanks for your reply :)
