I discovered this when I created a script to test speed with various prompt lengths.
What's interesting is that when feeding prompts of 28k, 30k, and 32k tokens, it has the same problem: it generates only 27 tokens, with the same phrase each time. With prompts of 26k tokens or fewer, it didn't have the problem.
I suspect something might be going on with long context. It's like the opposite of the issue I created for the looping problem with long context and llama-3.1-8b-instruct-4bit.
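A minimal sketch of the kind of length sweep described above, for anyone who wants to reproduce it. This is not the original script: `generate` is a hypothetical callable standing in for whatever backend is under test, whitespace-separated words are used as a rough stand-in for real tokenizer counts, and `fake_generate` is a dummy that only mimics the reported cutoff so the harness can be run without a model.

```python
from typing import Callable, List, Tuple


def build_prompt(words: List[str], n_tokens: int) -> str:
    """Repeat and truncate `words` to exactly n_tokens whitespace tokens.

    Assumption: whitespace tokens approximate model tokens well enough
    for a sweep; a real test would count with the model's tokenizer.
    """
    reps = n_tokens // len(words) + 1
    return " ".join((words * reps)[:n_tokens])


def sweep(
    generate: Callable[[str], str],
    words: List[str],
    lengths: List[int],
    min_tokens: int = 50,
) -> List[Tuple[int, int, bool]]:
    """Run `generate` at each prompt length.

    Returns (prompt_len, output_len, suspiciously_short) per length.
    """
    results = []
    for n in lengths:
        output = generate(build_prompt(words, n))
        n_out = len(output.split())
        results.append((n, n_out, n_out < min_tokens))
    return results


def fake_generate(prompt: str) -> str:
    """Dummy backend (hypothetical): short output at 28k+ tokens, mimicking the report."""
    n_in = len(prompt.split())
    return " ".join(["tok"] * (27 if n_in >= 28000 else 200))


if __name__ == "__main__":
    for n_in, n_out, short in sweep(fake_generate, ["lorem", "ipsum"], [26000, 28000, 30000]):
        flag = "SHORT" if short else "ok"
        print(f"prompt={n_in:>6} output={n_out:>4} {flag}")
```

To test a real model, replace `fake_generate` with a function that calls the actual generation API and returns the generated text.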
I'll test some more with what you suggested, and report back.
My prompt looks like this:
However, if I run the following,
I only get this:
"I hope this information has been helpful. If you have any further questions or need more information, please don't hesitate to ask."
I'm attaching the full prompt below.
28000.txt
Thanks!