
[Bugfix] Fix the default value for temperature in ChatCompletionRequest #11219

Merged
merged 1 commit into vllm-project:main from yansh97:patch-1
Dec 16, 2024

Conversation

yansh97
Contributor

@yansh97 yansh97 commented Dec 16, 2024

FIX #10930

Set the default value for temperature in ChatCompletionRequest to 1.0.

References:

temperature: float = 1.0

https://platform.openai.com/docs/api-reference/chat
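For context, a minimal sketch of what the one-line change looks like. The field follows the pattern of vLLM's Pydantic request models in vllm/entrypoints/openai/protocol.py, but the surrounding fields here are illustrative stand-ins, not the real model:

```python
# Sketch of the change, not the full vLLM model: the default for
# temperature in ChatCompletionRequest moves from 0.7 to 1.0,
# matching OpenAI's documented Chat Completions default.
from typing import Optional

from pydantic import BaseModel


class ChatCompletionRequest(BaseModel):
    model: str
    # Before this PR: temperature: Optional[float] = 0.7
    temperature: Optional[float] = 1.0  # now matches OpenAI's API default
```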


👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add the ready label to the PR
  • Enable auto-merge.

🚀

@mergify mergify bot added the frontend label Dec 16, 2024
@simon-mo simon-mo enabled auto-merge (squash) December 16, 2024 01:37
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 16, 2024
@yansh97
Contributor Author

yansh97 commented Dec 16, 2024

This modifies the default behavior of the vLLM OpenAI Compatible Server. Considering that temperature is very likely to be customized by users, I believe the impact is minimal.

@yansh97
Contributor Author

yansh97 commented Dec 16, 2024

Some tests failed, but they don't appear to be caused by my PR. @simon-mo

@simon-mo simon-mo merged commit 17138af into vllm-project:main Dec 16, 2024
62 of 66 checks passed
@yansh97 yansh97 deleted the patch-1 branch December 16, 2024 08:16
@K-Mistele
Contributor

Out of curiosity, why was this change made? 0.7 has been the default temperature for a long time, and changing from 0.7 to 1 is not a small behavioral difference. In particular, it may impact the quality of tool calls when using auto tool choice without a manually specified temperature.

@yansh97
Contributor Author

yansh97 commented Dec 17, 2024

Out of curiosity, why was this change made? 0.7 has been the default temperature for a long time, and changing from 0.7 to 1 is not a small behavioral difference. In particular, it may impact the quality of tool calls when using auto tool choice without a manually specified temperature.

The default temperature for offline inference (LLM class) is 1.0, same as OpenAI’s official implementation. It’s also the default in most frameworks. I didn’t check past PRs to see why the OpenAI Compatible Server uses 0.7 as the default. It’s a bit odd.

Also, like I mentioned before, in temperature-sensitive cases, it’s common for users to set their own temperature. This shouldn’t affect too many use cases. Maybe we should note this change in the OpenAI Compatible Server docs? @simon-mo
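For users who do depend on a specific temperature, passing it explicitly makes the request independent of whichever server-side default is in effect (0.7 before this PR, 1.0 after). A hypothetical snippet using the official openai Python client against a vLLM server; the base URL, API key, and model name are placeholders:

```python
# Pin temperature explicitly so the request does not depend on the
# server's default value. Base URL, key, and model are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="my-model",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,  # explicit value; the server default no longer matters
)
print(response.choices[0].message.content)
```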

@simon-mo
Collaborator

I actually don't recall why it is 0.7, given both Hugging Face and OpenAI have it at 1.0. Are we aware of any breakage?

@yansh97
Contributor Author

yansh97 commented Dec 17, 2024

I actually don't recall why it is 0.7, given both Hugging Face and OpenAI have it at 1.0. Are we aware of any breakage?

This value has been 0.7 since it was introduced, see #116. What’s strange is that ChatCompletionRequest uses 0.7, while CompletionRequest uses 1.0.

I searched Google for the source of 0.7, and it seems to be because OpenAI chatbots like ChatGPT use it (though there is no solid proof). The default for OpenAI's Chat Completions API is 1.0. I think aligning with that makes sense.
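To make the pre-PR inconsistency concrete, here is an illustrative sketch of the defaults as described in this thread; these are stripped-down stand-ins, not the actual vLLM class definitions:

```python
# Pre-PR state as described above: the two OpenAI-compatible request
# models disagreed on the default temperature, and only one of them
# matched OpenAI's documented default of 1.0.
from typing import Optional

from pydantic import BaseModel


class CompletionRequest(BaseModel):
    temperature: Optional[float] = 1.0  # already matched OpenAI


class ChatCompletionRequest(BaseModel):
    temperature: Optional[float] = 0.7  # the odd one out (pre-PR)


# Offline inference (the LLM class / SamplingParams) also defaults to
# 1.0, which is what this PR aligns the chat endpoint with.
```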

@K-Mistele
Contributor

K-Mistele commented Dec 17, 2024

Aligning with OpenAI's default behavior 100% makes sense, but it's also important to avoid breaking workflows that depend on existing defaults.

It seems like this should be at least a minor version bump in semver, if not a major one, due to the potentially breaking behavior; although it's definitely fuzzy: it's syntactically backwards-compatible, though arguably not semantically.

Labels
frontend ready ONLY add when PR is ready to merge/full CI is needed
Development

Successfully merging this pull request may close these issues.

[Bug]: Different default value for temperature in SamplingParams and ChatCompletionRequest