
Feature Request: Gracefully handle when conversation contexts become too long #89

Open
a3957273 opened this issue Nov 21, 2023 · 3 comments

Comments

@a3957273

Summary

At the moment the plugin seems to fail with a generic error message when the conversation is too long. Ideally, some context pruning would take place to keep the conversation within a context limit defined by the plugin.

@a3957273 a3957273 changed the title Doc: Gracefully handle when conversation contexts become too long Feature Request: Gracefully handle when conversation contexts become too long Nov 21, 2023
@a3957273
Author

We have an internal truncateRequest function to resolve this on our instance. It tries to fit as many tokens as possible, up to about half the token limit. This is a super simple implementation that we threw together in under an hour.

func getSubstring(message string, characters int) string {
	// Keep the first and last halves of the message and drop the middle,
	// so the model still sees how the message opens and closes.
	if len(message) <= characters {
		return message
	}
	return message[:characters/2] + message[len(message)-characters/2:]
}

func (s *OpenAI) truncateRequest(request openaiClient.ChatCompletionRequest) openaiClient.ChatCompletionRequest {
	var messages []openaiClient.ChatCompletionMessage
	tokenCount := 0
	limit := s.TokenLimit() / 2

	// Walk the conversation from newest to oldest, keeping whole messages
	// while they fit under the limit.
	for i := len(request.Messages) - 1; i >= 0; i-- {
		message := request.Messages[i]

		// Add a few extra tokens for the role, separators, etc.
		tokens := s.CountTokens(message.Content) + 10

		// Can we fit the entire message in the window?
		if tokenCount+tokens < limit {
			tokenCount += tokens
			messages = append([]openaiClient.ChatCompletionMessage{message}, messages...)
			continue
		}

		// We can't fit the whole message, so include just its start and end,
		// prepend it as the oldest surviving message, and drop anything older.
		remaining := limit - tokenCount
		characters := remaining * 4 // Estimate 4 characters per token.

		message.Content = getSubstring(message.Content, characters)
		messages = append([]openaiClient.ChatCompletionMessage{message}, messages...)
		break
	}

	request.Messages = messages
	return request
}
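
For reference, the call site is trivial. Here's a hypothetical sketch of where the truncation slots in; the createChatCompletion wrapper and the s.client field are illustrative rather than our actual code, with s.client assumed to be a go-openai client:

// Hypothetical call site, just to show where truncation fits in. The
// createChatCompletion wrapper and the s.client field are illustrative;
// s.client is assumed to be a go-openai *openaiClient.Client.
func (s *OpenAI) createChatCompletion(ctx context.Context, request openaiClient.ChatCompletionRequest) (openaiClient.ChatCompletionResponse, error) {
	// Prune the conversation before it reaches the API, so the request
	// no longer fails outright with a context-length error.
	request = s.truncateRequest(request)
	return s.client.CreateChatCompletion(ctx, request)
}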

@crspeller crspeller added this to the 1.0 milestone Nov 28, 2023
@crspeller
Member

This makes a lot of sense, as running out of context is a pretty bad experience at the moment. The annoying part is that (at least last time I checked) there is no precise token-counting library in Go.
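
A rough character-based estimate might be good enough in practice, though. A minimal sketch of that kind of heuristic, assuming roughly 4 characters per token for English text (a rule of thumb, not a measured constant):

// approximateTokenCount is a heuristic stand-in for a real tokenizer,
// assuming roughly 4 characters per token for English text. Rounding up
// overestimates slightly, which errs on the safe side when pruning.
func approximateTokenCount(text string) int {
	const charsPerToken = 4
	return (len(text) + charsPerToken - 1) / charsPerToken
}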

@crspeller crspeller removed this from the 1.0 milestone Feb 7, 2024
@a3957273
Author

a3957273 commented Mar 31, 2024

We've had a lot of success with fairly basic heuristics. We've generated millions of values, and the existing built-in token counter has worked every time. I realise we're possibly losing a small amount of the context window, but the trade-off is worthwhile for us.
