Replies: 8 comments 1 reply
-
Wow, that's nice! Thanks for the heads-up, I had not heard of this model.
-
Forgot to mention: here's the online test app with a bunch of example questions & answers: https://huggingface.co/spaces/togethercomputer/OpenChatKit
-
This model is amazing. It's also Apache 2.0 licensed, so you can build commercial apps out of it.
-
So this works out of the box? No need to add any code to the UI?
-
Only LLaMA (and maybe OPT) works in 4-bit. EDIT: Although other people are working on getting GPTQ to work for other models, so hopefully we'll be able to use this in 4-bit very soon.
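For anyone wondering what "4-bit" actually means for weights: here's a toy sketch of plain round-to-nearest 4-bit quantization. This is not GPTQ (GPTQ additionally minimizes layer output error); it just shows the basic idea of mapping float weights to 16 integer levels plus a scale and offset.

```python
# Naive round-to-nearest 4-bit quantization of a weight array.
# Illustrative only -- NOT GPTQ, which picks rounding to minimize
# layer output error instead of per-weight error.
import numpy as np

def quantize_rtn_4bit(w):
    """Map float weights onto 16 levels (0..15) with a scale and offset."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 15.0  # 4 bits -> 2**4 = 16 levels
    q = np.clip(np.round((w - lo) / scale), 0, 15).astype(np.uint8)
    return q, scale, lo

def dequantize_4bit(q, scale, lo):
    """Reconstruct approximate float weights from the 4-bit codes."""
    return q.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
w = rng.normal(size=256).astype(np.float32)
q, scale, lo = quantize_rtn_4bit(w)
w_hat = dequantize_4bit(q, scale, lo)
# Round-to-nearest bounds the per-weight error by half a quantization step.
print(float(np.abs(w - w_hat).max()) <= scale / 2 + 1e-6)
```

The 8x memory saving versus fp32 (4 bits vs 32 per weight, ignoring the small per-group scale overhead) is why 20B-class models become feasible on a single consumer GPU.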
-
GPTQ is implemented for OPT, but I have not managed to get good results with quantizing GALACTICA (a variant of OPT) :(
-
For my taste, the answer from llama-13b is even more ChatGPT-like: "Do mussels have muscles? Give a detailed answer." "Yes, mussels are animals that have muscle tissue. They live in salt water and filter food from it through their gills."
-
Together just released a new model finetuned from GPT-NeoX that can act as a chatbot, similar to ChatGPT.
https://huggingface.co/togethercomputer/GPT-NeoXT-Chat-Base-20B
I got it running on my 3090 in bf16, although the VRAM is maxed out.
python server.py --model GPT-NeoXT-Chat-Base-20B --bf16 --listen-port 7861 --cai-chat --no-stream
Quality seems pretty good, but obviously not as good as the real ChatGPT. I'm excited to see if they'll do another one based on LLaMA (probably not because of the license, but one can dream).
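If you want to poke at the model outside the UI, the model card describes a simple turn format with `<human>:` and `<bot>:` markers. Here's a minimal sketch of building such a prompt; `build_prompt` is just an illustrative helper of mine, not part of OpenChatKit, so double-check the exact format against the model card before relying on it.

```python
# Sketch of the <human>/<bot> turn format described on the
# GPT-NeoXT-Chat-Base-20B model card. build_prompt is a hypothetical
# helper for illustration, not an OpenChatKit API.
def build_prompt(turns, user_msg):
    """turns: list of (human_text, bot_text) pairs from earlier exchanges."""
    parts = []
    for human, bot in turns:
        parts.append(f"<human>: {human}\n<bot>: {bot}")
    # Leave the final <bot>: open so the model completes the reply.
    parts.append(f"<human>: {user_msg}\n<bot>:")
    return "\n".join(parts)

prompt = build_prompt(
    [("Hi!", "Hello, how can I help?")],
    "Do mussels have muscles?",
)
print(prompt)
```

You would then feed `prompt` to the model (e.g. via `transformers`) and generate until the next `<human>:` marker.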