cohere update #155
Conversation
sanderland commented Nov 2, 2023 (edited)
- Refreshes Cohere outputs to reflect the most recent model
- Uses the new `max_tokens=None` feature in our API to avoid unneeded truncation
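The kwargs change above can be sketched as follows. This is a minimal illustration, not the evaluator's actual implementation: `build_generation_kwargs` is a hypothetical helper, and the `model_name` → `model` key mapping is an assumption about how the config keys translate to API parameters. The point it shows is that a YAML `null` arrives as Python `None` and is passed through rather than replaced with a fixed token budget.

```python
def build_generation_kwargs(completions_kwargs: dict) -> dict:
    """Hypothetical helper: turn config kwargs into API call kwargs.

    A ``max_tokens`` of None (``null`` in YAML) is passed through
    unchanged, so generation runs up to EOS or the model's context
    length instead of truncating at a fixed budget like 2048.
    """
    kwargs = dict(completions_kwargs)
    # Assumed mapping from the config key to the API parameter name.
    kwargs["model"] = kwargs.pop("model_name")
    return kwargs


# Old config truncated at 2048 tokens; the new one leaves the limit unset.
old_config = {"model_name": "command-nightly", "max_tokens": 2048}
new_config = {"model_name": "command-nightly", "max_tokens": None}
```

With `new_config`, the resulting kwargs carry `max_tokens=None`, which is the signal to generate until EOS or the context limit.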
fn_completions: "cohere_completions"
completions_kwargs:
  model_name: "command-nightly"
-  max_tokens: 2048
+  max_tokens: null # up to EOS or context length
what's the context length?
Our current context length is 4096
@@ -5,6 +5,7 @@ llama-2-70b-chat-hf,92.66169154,0.911762258,743,57,4,804,minimal,1790
 ultralm-13b-v2.0-best-of-16,92.29813665,0.940299807,743,62,0,805,community,1720
 xwinlm-13b-v0.1,91.76029963,0.968139439,734,65,2,801,community,1894
 ultralm-13b-best-of-16,91.54228856,0.981927769,736,68,0,804,community,1980
+cohere,91.49068322981367,0.9781229071866879,735,67,3,805,community,2012
impressive jump, is the model updated or is it because you removed the context length limit (the model output seems 300 characters longer)?
This is a new model. The context length limit change gives a small additional bump (small enough that it could just be evaluator noise), but it's also the currently recommended way to call our models.
Impressive results @sanderland, is it on purpose that the PR is marked as draft?
@YannDubs just double checking, should be all ready now :)