
skip hpugraph usage for first token to save memory #397

Merged: 10 commits merged into huggingface:main from skip_hpugraph_for_first_token on Sep 16, 2023

Conversation

polisettyvarma (Contributor)

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

polisettyvarma (Contributor, Author) commented Sep 12, 2023

@regisss please add @dvarshney-habana to the reviewers.
Please review this, @dvarshney-habana and @regisss.

regisss added the run-test (Run CI for PRs from external contributors) label on Sep 13, 2023
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Review threads (outdated, resolved) on:
  • examples/text-generation/README.md
  • examples/text-generation/run_generation.py
  • optimum/habana/transformers/generation/utils.py (5 threads)
polisettyvarma (Contributor, Author) commented:

please review now @regisss, @dvarshney-habana and @puneeshkhanna

Signed-off-by: P V R K Jyothendra Varma <[email protected]>
regisss added and removed the run-test (Run CI for PRs from external contributors) label on Sep 14, 2023
regisss (Collaborator) left a comment

CI failed because _get_hpu_graphs_kwargs should be linked to Transformers' `generate` here:

# Generation is modified to run faster in lazy mode

Or, maybe easier, you can just declare this method outside of this class and use it that way without self.
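
For illustration, a minimal sketch of the standalone-function variant suggested above. The argument and flag names (`model_kwargs`, `limit_hpu_graphs`, `first_token`) are assumptions based on this PR's description, not necessarily the merged code:

```python
# Hypothetical sketch of the "declare it outside the class" option.
def get_hpu_graphs_kwargs(model_kwargs):
    """Decide whether the next forward pass should bypass HPU graphs."""
    hpu_graphs_kwargs = {}
    if model_kwargs.get("limit_hpu_graphs"):
        # Skip HPU graph capture/replay for the first (prefill) token to save
        # memory, then use graphs again for the remaining decoding steps.
        hpu_graphs_kwargs["bypass_hpu_graphs"] = bool(model_kwargs.get("first_token"))
    return hpu_graphs_kwargs
```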

Signed-off-by: P V R K Jyothendra Varma <[email protected]>
polisettyvarma (Contributor, Author) commented:

> CI failed because _get_hpu_graphs_kwargs should be linked to Transformers' `generate` here:
>
> # Generation is modified to run faster in lazy mode
>
> Or, maybe easier, you can just declare this method outside of this class and use it that way without self.

I took the first approach; please review, @regisss.

regisss added and removed the run-test (Run CI for PRs from external contributors) label on Sep 15, 2023
regisss (Collaborator) left a comment

@polisettyvarma CI failed; we need to return hpu_graphs_kwargs at the end of _get_hpu_graphs_kwargs.
There is also one merge conflict to solve, probably due to the trim logit PR that I just merged. There shouldn't be much to do, I think.
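
The missing return matters because the caller unpacks the returned dict. A hedged sketch of such a call site, reusing the hypothetical helper sketched earlier (the surrounding names are illustrative, not the merged code):

```python
def call_model_with_graph_kwargs(model, model_inputs, model_kwargs):
    # Hypothetical call site: if get_hpu_graphs_kwargs falls through without
    # `return hpu_graphs_kwargs`, it returns None and the ** expansion below
    # raises TypeError ("argument after ** must be a mapping").
    hpu_graphs_kwargs = get_hpu_graphs_kwargs(model_kwargs)
    return model(**model_inputs, return_dict=True, **hpu_graphs_kwargs)
```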

polisettyvarma (Contributor, Author) commented:

please review now @regisss

regisss added and removed the run-test (Run CI for PRs from external contributors) label on Sep 16, 2023
regisss merged commit fa1f2c2 into huggingface:main on Sep 16, 2023
11 of 12 checks passed
polisettyvarma deleted the skip_hpugraph_for_first_token branch on Sep 16, 2023 at 06:33
yafshar (Contributor) commented Apr 12, 2024

@regisss, @polisettyvarma which model is using this feature? I do not see `bypass_hpu_graphs` in any model's arguments. Would you please point me to a model that uses it?

regisss (Collaborator) commented Apr 24, 2024

> @regisss, @polisettyvarma which model is using this feature? I do not see `bypass_hpu_graphs` in any model's arguments. Would you please point me to a model that uses it?

@yafshar You won't see it in the code of any model. It's managed by the HPU graph API, which wraps the model's forward.
For example, in a container running the SynapseAI 1.15 Docker image, you can find it at the beginning of the `wrapped_hpugraph_forward` method in `/usr/local/lib/python3.10/dist-packages/habana_frameworks/torch/hpu/graphs.py`.
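
For intuition only, a rough sketch (not Habana's actual graphs.py) of how a forward wrapper can honor such a per-call flag; all names here are illustrative:

```python
import functools

def wrap_forward_with_bypass(module):
    """Hypothetical sketch: allow callers to pass bypass_hpu_graphs=True on a
    single call to run the plain eager forward instead of an HPU graph."""
    original_forward = module.forward

    @functools.wraps(original_forward)
    def wrapped_forward(*args, **kwargs):
        if kwargs.pop("bypass_hpu_graphs", False):
            # e.g. the memory-heavy first-token (prefill) step runs eagerly.
            return original_forward(*args, **kwargs)
        # A real implementation would capture an HPU graph on the first call
        # here and replay the cached graph on later calls.
        return original_forward(*args, **kwargs)

    module.forward = wrapped_forward
    return module
```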
