Threw together a really slapdash, continuously built, semi-pre-installed/compiled Docker image for running 4-bit GPTQ Llama on a provider like vast.ai, runpod.io, etc. #302
nelsonjchen started this conversation in Ideas
https://github.com/nelsonjchen/docker-quick-llama
I genuinely probably do not have enough time to keep up with this amount of hot action. That said, I was able to run Llama on vast.ai pretty well with it. But with all the activity around llama.cpp, dalai, and so on, I'm not sure if what I'm doing is relevant beyond my weekend escapades. Anyway, if the ball goes back to GPUs, this is here, I guess.
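For anyone curious, using it on a rented GPU box looks roughly like this (a sketch only; the exact image name, tag, and entrypoint here are assumptions, so check the repo's README for the real ones):

```sh
# Pull the prebuilt image from GitHub Packages (GHCR) and run it with GPU access.
# Image name/tag are assumptions; the host needs the NVIDIA Container Toolkit.
docker pull ghcr.io/nelsonjchen/docker-quick-llama:latest
docker run --gpus all -it ghcr.io/nelsonjchen/docker-quick-llama:latest
```

Since the GPTQ dependencies are already compiled into the image, the rental box mostly just pulls gigabytes instead of spending its billed minutes building CUDA extensions.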
The real value may be in the GitHub Actions workflow, its pre-setup of the GPTQ stuff, and its use of GitHub Packages to distribute the multi-gigabyte environment.
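In essence, the workflow automates something like the following on every push (a hedged sketch; the actual workflow is in the repo, and the image name, tag, and login details below are assumptions):

```sh
# Build the image with the GPTQ toolchain precompiled, then publish it to GHCR,
# so the multi-gigabyte environment is built once in CI rather than on each GPU rental.
docker build -t ghcr.io/nelsonjchen/docker-quick-llama:latest .
echo "$GITHUB_TOKEN" | docker login ghcr.io -u nelsonjchen --password-stdin
docker push ghcr.io/nelsonjchen/docker-quick-llama:latest
```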