-
I use LLMs for storytelling and role-playing adventures, but I'm curious why I keep getting different outputs even when I set a specific seed. It only happens the first two times with the same prompt; from the third time onwards, the output is exactly the same, as expected. I've tried Koboldcpp, llama.cpp, and most recently OobaBooga, but I always get the same issue. After reading about seeds on this https://github.com/oobabooga/text-generation-webui/wiki/03-%E2%80%90-Parameters-Tab , I finally understood why. I had been using Koboldcpp, llama.cpp, and OobaBooga to run GGUF models, and also EXL2 on OobaBooga. With Koboldcpp, though, I once got the same output using CLBLAST instead of CuBLAS. So, is there a way to run a quantized model and still get the same output with a set seed? Right now I use AWQ, which works with the Transformers loader in OobaBooga, but there aren't as many AWQ models on Huggingface compared to GGUF and EXL2. I run LLMs on Kaggle using their 2xT4 GPUs.
-
Instead of using AWQ, I discovered that I can load the model without quantization using the Transformers loader. This actually works out better, since I can choose to run the model in full precision, 8-bit, or 4-bit. For 8-bit or 4-bit, you just need to add the `--load-in-8bit` or `--load-in-4bit` flag, respectively. So yeah, problem solved 🤍
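For anyone scripting this outside the web UI, here is a minimal sketch of the same idea using the `transformers` library directly: load the unquantized weights in 4-bit (which is what the `--load-in-4bit` flag does under the hood, via bitsandbytes) and call `set_seed` before each generation so sampling is reproducible. The model ID, function name, and generation parameters below are illustrative, not from the original posts.

```python
# Sketch: reproducible sampling from a 4-bit-loaded Hugging Face model.
# Assumes `transformers`, `torch`, `accelerate`, and `bitsandbytes` are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed


def generate_reproducibly(model_id: str, prompt: str, seed: int = 42) -> str:
    """Load `model_id` in 4-bit and sample with a fixed seed.

    Calling this twice with the same arguments on the same hardware
    should produce the same text, because set_seed() resets the
    Python, NumPy, and Torch RNGs before sampling.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        load_in_4bit=True,   # same effect as OobaBooga's --load-in-4bit flag
        device_map="auto",
    )
    set_seed(seed)  # seed all RNGs immediately before generation
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=50, do_sample=True)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

Note that newer `transformers` versions prefer passing a `BitsAndBytesConfig` via `quantization_config` instead of the bare `load_in_4bit=True` kwarg, but both route through the same bitsandbytes 4-bit path.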