
Feature Request: Gracefully handle when conversation contexts become too long #89

Open
a3957273 opened this issue Nov 21, 2023 · 3 comments

Comments

@a3957273

Summary

At the moment the plugin seems to fail with a generic error message when the conversation is too long. Ideally, some context pruning would take place to keep the conversation within a context limit defined by the plugin.

@a3957273 a3957273 changed the title Doc: Gracefully handle when conversation contexts become too long Feature Request: Gracefully handle when conversation contexts become too long Nov 21, 2023
@a3957273
Author

We have an internal truncateRequest function to resolve this on our instance. It tries to fit as many tokens as possible, up to about half the token limit. This is a super simple implementation that we threw together in under an hour.

func getSubstring(message string, characters int) string {
	// Keep the first and last halves of the message and drop the middle,
	// so the model still sees how the message opens and closes.
	if len(message) <= characters {
		return message
	}
	return message[:characters/2] + message[len(message)-characters/2:]
}

func (s *OpenAI) truncateRequest(request openaiClient.ChatCompletionRequest) openaiClient.ChatCompletionRequest {
	var messages []openaiClient.ChatCompletionMessage
	tokenCount := 0
	limit := s.TokenLimit() / 2

	// Walk the conversation from newest to oldest, keeping whole messages
	// while they fit under the limit.
	for i := len(request.Messages) - 1; i >= 0; i-- {
		message := request.Messages[i]

		// Add a few extra tokens for the role, separators, etc.
		tokens := s.CountTokens(message.Content) + 10

		// Can we fit the entire message in the window?
		if tokenCount+tokens < limit {
			tokenCount += tokens
			messages = append([]openaiClient.ChatCompletionMessage{message}, messages...)
			continue
		}

		// We can't fit the whole message, so include just its start and end,
		// prepend it as the oldest surviving message, and drop anything older.
		remaining := limit - tokenCount
		characters := remaining * 4 // Estimate 4 characters per token.

		message.Content = getSubstring(message.Content, characters)
		messages = append([]openaiClient.ChatCompletionMessage{message}, messages...)
		break
	}

	request.Messages = messages
	return request
}
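
For reference, the call site is trivial. Here's a hypothetical sketch of where the truncation slots in; the createChatCompletion wrapper and the s.client field are illustrative rather than our actual code, with s.client assumed to be a go-openai client:

// Hypothetical call site, just to show where truncation fits in. The
// createChatCompletion wrapper and the s.client field are illustrative;
// s.client is assumed to be a go-openai *openaiClient.Client.
func (s *OpenAI) createChatCompletion(ctx context.Context, request openaiClient.ChatCompletionRequest) (openaiClient.ChatCompletionResponse, error) {
	// Prune the conversation before it reaches the API, so the request
	// no longer fails outright with a context-length error.
	request = s.truncateRequest(request)
	return s.client.CreateChatCompletion(ctx, request)
}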

@crspeller crspeller added this to the 1.0 milestone Nov 28, 2023
@crspeller
Member

This makes a lot of sense, as running out of context is a pretty bad experience at the moment. The annoying part is that (at least last time I checked) there is no precise token-counting library in Go.
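
A rough character-based estimate might be good enough in practice, though. A minimal sketch of that kind of heuristic, assuming roughly 4 characters per token for English text (a rule of thumb, not a measured constant):

// approximateTokenCount is a heuristic stand-in for a real tokenizer,
// assuming roughly 4 characters per token for English text. Rounding up
// overestimates slightly, which errs on the safe side when pruning.
func approximateTokenCount(text string) int {
	const charsPerToken = 4
	return (len(text) + charsPerToken - 1) / charsPerToken
}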

@crspeller crspeller removed this from the 1.0 milestone Feb 7, 2024
@a3957273
Author

a3957273 commented Mar 31, 2024

We've had a lot of success with fairly basic heuristics. We've generated millions of values, and the existing built-in token counter has worked every time. I realise we're possibly losing a small amount of the context window, but the trade-off is worthwhile for us.
