Replies: 8 comments 1 reply
-
Wow, that's nice! Thanks for the heads-up, I had not heard of this model.
-
Forgot to mention: here's the online test app with a bunch of example questions & answers: https://huggingface.co/spaces/togethercomputer/OpenChatKit
-
This model is amazing. It's also Apache 2.0 licensed, so you can build commercial apps out of it.
-
So this works out of the box? No need to add any code to the UI?
-
Only LLaMA (and maybe OPT) works in 4-bit. EDIT: Although other people are working on getting GPTQ to work for other models, so hopefully we'll be able to use this in 4-bit very soon.
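For anyone wondering what "4-bit" actually means for weights: here's a toy sketch of plain round-to-nearest 4-bit quantization. This is not GPTQ (GPTQ additionally minimizes layer output error); it just shows the basic idea of mapping float weights to 16 integer levels plus a scale and offset.

```python
# Naive round-to-nearest 4-bit quantization of a weight array.
# Illustrative only -- NOT GPTQ, which picks rounding to minimize
# layer output error instead of per-weight error.
import numpy as np

def quantize_rtn_4bit(w):
    """Map float weights onto 16 levels (0..15) with a scale and offset."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 15.0  # 4 bits -> 2**4 = 16 levels
    q = np.clip(np.round((w - lo) / scale), 0, 15).astype(np.uint8)
    return q, scale, lo

def dequantize_4bit(q, scale, lo):
    """Reconstruct approximate float weights from the 4-bit codes."""
    return q.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)
q, scale, lo = quantize_rtn_4bit(w)
w_hat = dequantize_4bit(q, scale, lo)
# Round-to-nearest bounds the per-weight error by half a quantization step.
print(float(np.abs(w - w_hat).max()) <= scale / 2 + 1e-6)
```

The 8x memory saving versus fp32 (4 bits vs 32 per weight, ignoring the small per-group scale overhead) is why 20B-class models become feasible on a single consumer GPU.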
-
GPTQ is implemented for OPT, but I have not managed to get good results with quantizing GALACTICA (a variant of OPT) :(
-
For my taste, the answer from llama-13b is even more ChatGPT-like: "Do mussels have muscles? Give a detailed answer." "Yes, mussels are animals that have muscle tissue. They live in salt water and filter food from it through their gills."
-
Together just released a new model finetuned from GPT-NeoX that can act as a chatbot, similar to ChatGPT.
https://huggingface.co/togethercomputer/GPT-NeoXT-Chat-Base-20B
I got it running on my 3090 in bf16, although the VRAM is maxed out.
python server.py --model GPT-NeoXT-Chat-Base-20B --bf16 --listen-port 7861 --cai-chat --no-stream
Quality seems pretty good, but obviously not as good as the real ChatGPT. I'm excited to see if they'll do another one based on LLaMA (probably not because of the license, but one can dream).
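If you want to poke at the model outside the UI, the model card describes a simple turn format with `<human>:` and `<bot>:` markers. Here's a minimal sketch of building such a prompt; `build_prompt` is just an illustrative helper of mine, not part of OpenChatKit, so double-check the exact format against the model card before relying on it.

```python
# Sketch of the <human>/<bot> turn format described on the
# GPT-NeoXT-Chat-Base-20B model card. build_prompt is a hypothetical
# helper for illustration, not an OpenChatKit API.
def build_prompt(turns, user_msg):
    """turns: list of (human_text, bot_text) pairs from earlier exchanges."""
    parts = []
    for human, bot in turns:
        parts.append(f"<human>: {human}\n<bot>: {bot}")
    # Leave the final <bot>: open so the model completes the reply.
    parts.append(f"<human>: {user_msg}\n<bot>:")
    return "\n".join(parts)

prompt = build_prompt(
    [("Hi!", "Hello, how can I help?")],
    "Do mussels have muscles?",
)
print(prompt)
```

You would then feed `prompt` to the model (e.g. via `transformers`) and generate until the next `<human>:` marker.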