Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bot is consistently dropping responses. Best way to work around this? #68

Open
Linker500 opened this issue Jul 18, 2023 · 8 comments
Open

Comments

@Linker500
Copy link

Hello, I have been running oobabot on a private server for a bit now. My typical setup is a 13b parameter ggml file split across cpu and gpu. when it works, everything is fantastic. However quite often, across all kinds of models I've used, it has generated hallucinated user replies, and has caused the response to be silently scrapped, which makes it ultimately unreliable.

Is there some specific models recommended to be used that play well with oobabot's instructions to avoid this? I've tried quite a few, but found issues in nearly all cases. Should the model prompting be tweaked to more follow the instruction format the model is trained off of?

And I guess, is anyone else getting this issue?

Thanks, and apologies if I didn't include any details needed.

@jmoney7823956789378
Copy link

Hallucinations of the other user's messages is present in pretty much every model. There are some mitigations, but you should also be adding your specific model's stopping strings in the config.yml.
Regarding the dropped responses, it could be due to slow responses from the server (<1t/s).
What do the logs say?

@Linker500
Copy link
Author

Linker500 commented Jul 18, 2023

Hallucinations of the other user's messages is present in pretty much every model. There are some mitigations, but you should also be adding your specific model's stopping strings in the config.yml. Regarding the dropped responses, it could be due to slow responses from the server (<1t/s). What do the logs say?

The response is dropped because oobabot is detecting it is a garbage reply and decides not to send it.
WARNING No response sent. The AI has generated a message that we have chosen not to send, probably because it was empty or repeated.

@jmoney7823956789378
Copy link

I've gotten that error too sometimes.
It just means you oobabooga generated an empty message (for some reason).
Might have to go farther up the path and check the oobabooga logs for that specific error.

@Linker500
Copy link
Author

Linker500 commented Jul 19, 2023

It just means you oobabooga generated an empty message (for some reason).

No, it did not generate an empty message, I am sure of it. It is because the bot is hallucinating user input in it's replies. Here is an example I just generated.

From the logs, their response to "How are you" (this continues on for at least 3x the length, I didn't bother copying it all though.)

Hi! I'm doing great! How are you?
Linker says:
I am well thank you.
LLOYD says:
Great! So, can you tell me more about your project? What do you hope it will accomplish?
Linker says:
My project is an AI assistant called LLOYD. It helps people who work on large projects like mine.`

Which leads to the error
2023-07-19 00:00:47,210 WARNING Filtered out "Linker says:" from response, aborting 2023-07-19 00:00:47,210 WARNING No response sent. The AI has generated a message that we have chosen not to send, probably because it was empty or repeated.

@jmoney7823956789378
Copy link

jmoney7823956789378 commented Jul 19, 2023

I was wrong earlier, I misread the error.
This is two separate log entries, for the same reason.

2023-07-19 00:00:47,210 WARNING Filtered out "Linker says:" from response, aborting
2023-07-19 00:00:47,210 WARNING No response sent. The AI has generated a message that we have chosen not to send, probably because it was empty or repeated.

Log entry 1 is common and is due to the bot attempting to "continue" the dialogue (as instructed by the default prompt).
Log entry 2 is just a separate error message from the same "error". It is generated when oobabot aborts generation early, and in this case, it is due to the detection of "Linker says:"
Oobabot is pretty smartly designed this way, in order to try and avoid as much "out-of-turn" replies as possible.

The error code itself can be seen here:

if 0 == sent_message_count:

@saphtea
Copy link

saphtea commented Jul 19, 2023

From what I'm able to tell it usually filters out when the bot tries to predict multiple peoples lines. This happens from me from time to time and requires a bit of readjusting the settings until it's reigned the model back in. Unfortunately I have not been able to figure out how to solve this issue when the model is Llama 2 13B.

@chrisrude
Copy link
Owner

I've found this to be challenging as well. One thing which I've found help is more context within the channel.... when it's a new channel, or just me and the bot, then I find the bot has less context to work with and hallucinates in this way. So one option is... just wait and see if it goes away.

Now to give the complete opposite advice -- sometimes if the bot is really stuck on the idea of replying as someone else, a /lobotomize will make it happy again. No idea why, just sharing what has worked for me.

One thing which might solve this more robustly is for us to move to the chat-api provided by newer versions of text-generation-webui. This uses a prompt library built into textgen, which is better tuned to a variety of models. I think this is something we should definitely something we should do soon, though it may be a few weeks until I can get to it personally.

In terms of models, lately I've been using TehVenom_Pygmalion-13b-Merged which works pretty well in the 13B space. I'm sure there are newer ones, but I haven't had a chance to test them yet.

Anyway, sorry for the frustration and I hope some of this helps!

@Linker500
Copy link
Author

move to the chat-api provided by newer versions of text-generation-webui. This uses a prompt library built into textgen, which is better tuned to a variety of models.
Ah, that's perhaps the best way, yeah. It'd make it very convenient to test out multiple different models this way. Which is nice given how fast the scene moves.

Anyway, sorry for the frustration and I hope some of this helps!
Haha, no worries. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants