Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use fast softmax only on prefill #1159

Merged
merged 2 commits into from
Aug 5, 2024
Merged

Conversation

jaygala223
Copy link
Contributor

use fast softmax only on prefill

@yafshar
Copy link
Contributor

yafshar commented Jul 26, 2024

@jaygala223 can you point to the source for this change? or is there an issue? From literature, fast softmax can be used in prefill and can be beneficial in any context where the softmax operation is a computational bottleneck.

@libinta libinta added the synapse1.17 PR that should be available along with Synapse 1.17 but have no dependency on Synapse 1.17 content. label Jul 26, 2024
@yafshar
Copy link
Contributor

yafshar commented Jul 29, 2024

If there is an accuracy issue raised by using fast softmax, can you point out to the issue, or bring it up here for the reference. It could be beneficial for other models and prevent others to root causing in similar cases.

vidyasiv pushed a commit to emascarenhas/optimum-habana that referenced this pull request Aug 1, 2024
vidyasiv added a commit to emascarenhas/optimum-habana that referenced this pull request Aug 2, 2024
@jaygala223
Copy link
Contributor Author

@yafshar apologies for the delayed response. There was a performance regression which got introduced in a patch revert for context manager. This PR fixes it

@yafshar
Copy link
Contributor

yafshar commented Aug 2, 2024

@yafshar apologies for the delayed response. There was a performance regression which got introduced in a patch revert for context manager. This PR fixes it

@jaygala223 thanks, does this affect other models? Should we consider the same change for other models?

Copy link
Contributor

@yafshar yafshar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@regisss please look at this PR

@yafshar
Copy link
Contributor

yafshar commented Aug 2, 2024

@libinta please correct the PR label

@jaygala223
Copy link
Contributor Author

@jaygala223 thanks, does this affect other models? Should we consider the same change for other models?

No, it does not affect other models

@libinta libinta added the run-test Run CI for PRs from external contributors label Aug 4, 2024
@regisss
Copy link
Collaborator

regisss commented Aug 4, 2024

Please run make style

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@regisss regisss merged commit fb72aac into huggingface:main Aug 5, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-test Run CI for PRs from external contributors synapse1.17 PR that should be available along with Synapse 1.17 but have no dependency on Synapse 1.17 content.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants