skip hpugraph usage for first token to save memory #397
@regisss please add @dvarshney-habana to reviewers
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Signed-off-by: P V R K Jyothendra Varma <[email protected]>
Co-authored-by: regisss <[email protected]>
Signed-off-by: P V R K Jyothendra Varma <[email protected]>
Please review now @regisss, @dvarshney-habana and @puneeshkhanna
Signed-off-by: P V R K Jyothendra Varma <[email protected]>
CI failed because `_get_hpu_graphs_kwargs` should be linked to Transformers' `generate` here:
# Generation is modified to run faster in lazy mode
Or, maybe easier, you can just declare this method outside of this class and use it that way without `self`.
Signed-off-by: P V R K Jyothendra Varma <[email protected]>
I took the first approach, please review @regisss
@polisettyvarma CI failed, we need to return `hpu_graphs_kwargs` at the end of `_get_hpu_graphs_kwargs`.
There is also one merge conflict to solve, probably due to the trim logit PR that I just merged. There shouldn't be much to do I think.
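To illustrate the two review points above (declaring the helper outside the class so it needs no `self`, and returning the dictionary at the end), here is a minimal sketch. It is not the code from this PR: the function name `get_graph_kwargs` and the keys `limit_graphs`, `first_token_done` and `bypass_graphs` are hypothetical names chosen for the example.

```python
def get_graph_kwargs(model_kwargs):
    """Hypothetical stand-in for _get_hpu_graphs_kwargs, as a free function (no self)."""
    graph_kwargs = {}
    if model_kwargs.get("limit_graphs", False):
        # The first call handles the prompt (first token): ask the wrapper to skip graphs.
        is_first_call = "first_token_done" not in model_kwargs
        graph_kwargs["bypass_graphs"] = is_first_call
        # Remember that the first token has been handled for later calls.
        model_kwargs["first_token_done"] = True
    # Without this return the caller receives None instead of a dict.
    return graph_kwargs
```

With the `return` missing, any caller that unpacks the result with `**` would fail, which is the kind of error a CI run surfaces quickly.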
Signed-off-by: P V R K Jyothendra Varma <[email protected]>
Please review now @regisss
@regisss, @polisettyvarma which model is using this feature? I do not see
@yafshar You won't see it in the code of any model. It's managed by the HPU graph API, which wraps the forward of the model.
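As a rough picture of why no model code changes, here is a toy wrapper in plain Python. It only mimics the idea of an API that wraps a model's forward and can be told to bypass graph usage for one call; the class `GraphWrapper`, its counters and the `bypass_graphs` argument are assumptions made for this sketch, not the actual HPU graph API.

```python
class GraphWrapper:
    """Toy stand-in for a graph API that wraps a model's forward function."""

    def __init__(self, forward):
        self._forward = forward
        self._graphs = {}      # one cached "graph" per input length
        self.eager_calls = 0
        self.graph_calls = 0

    def __call__(self, tokens, bypass_graphs=False):
        if bypass_graphs:
            # Run eagerly: nothing is captured, so nothing is kept in the cache.
            self.eager_calls += 1
            return self._forward(tokens)
        # Capture once per input length, then replay the cached entry.
        self.graph_calls += 1
        key = len(tokens)
        if key not in self._graphs:
            self._graphs[key] = self._forward
        return self._graphs[key](tokens)


def toy_forward(tokens):
    # Placeholder for a real model forward pass.
    return [2 * t for t in tokens]


wrapped = GraphWrapper(toy_forward)
prompt_out = wrapped([1, 2, 3], bypass_graphs=True)  # first token: no graph kept
next_out = wrapped([3])                              # later tokens: graph path
```

In this sketch the prompt-sized call leaves the cache empty, which is the memory-saving effect the PR title describes, while the model's own forward function is left unmodified.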
What does this PR do?
Fixes # (issue)
Before submitting