
Improve Llama2 and gpt_neox performance with Habana fused RoPE and RMSNorm #321

Merged: 7 commits into main from habana_llm on Aug 8, 2023

Conversation

@mandy-li (Collaborator) commented on Aug 7, 2023

What does this PR do?

  1. Improve Llama2 inference performance by using Habana's fused RoPE and RMSNorm kernels (see the sketch below)
  2. Improve gpt_neox inference performance by using Habana's fused RoPE kernel
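
In sketch form, the change swaps the eager RMSNorm and RoPE math for Habana's fused HPU kernels whenever they are available. The import paths (`habana_frameworks.torch.hpex.normalization.FusedRMSNorm`, `habana_frameworks.torch.hpex.kernels.RotaryPosEmbeddingHelperV1`) and the exact `apply(...)` signatures below are assumptions based on typical Habana HPU usage, not quotes from this PR's diff:

```python
import torch

# Assumed import paths for the Habana fused kernels; fall back to the eager
# PyTorch implementations when they are unavailable.
try:
    from habana_frameworks.torch.hpex.normalization import FusedRMSNorm
except ImportError:
    FusedRMSNorm = None

try:
    from habana_frameworks.torch.hpex.kernels import RotaryPosEmbeddingHelperV1 as FusedRoPE
except ImportError:
    FusedRoPE = None


def rms_norm_forward(self, hidden_states):
    """Replacement for LlamaRMSNorm.forward (monkey-patched onto the module)."""
    if hidden_states.device.type == "hpu" and FusedRMSNorm is not None:
        # One fused kernel; assumes hidden_states and weight share a dtype.
        return FusedRMSNorm.apply(hidden_states, self.weight, self.variance_epsilon)
    # Eager fallback, matching the stock Transformers implementation.
    variance = hidden_states.to(torch.float32).pow(2).mean(-1, keepdim=True)
    hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
    return self.weight * hidden_states.to(self.weight.dtype)


def rotate_half(x):
    # Standard RoPE helper: rotate the two halves of the last dimension.
    x1, x2 = x[..., : x.shape[-1] // 2], x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)


def apply_customized_rope(q, k, cos, sin, position_ids):
    """Apply rotary position embeddings, preferring the fused HPU kernel."""
    if q.device.type == "hpu" and FusedRoPE is not None:
        return FusedRoPE.apply(q, cos, sin, position_ids), FusedRoPE.apply(k, cos, sin, position_ids)
    # Eager fallback (2023-era Transformers rotate-half formulation).
    cos = cos.squeeze(1).squeeze(0)[position_ids].unsqueeze(1)
    sin = sin.squeeze(1).squeeze(0)[position_ids].unsqueeze(1)
    return (q * cos) + (rotate_half(q) * sin), (k * cos) + (rotate_half(k) * sin)
```

The speedup comes from replacing several small elementwise ops (pow/mean/rsqrt/mul for RMSNorm, the rotate-half arithmetic for RoPE) with a single fused HPU kernel launch per call.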

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev commented on Aug 7, 2023

The documentation is not available anymore as the PR was closed or merged.

@ZhaiFeiyue (Collaborator) commented:

@mandy-li nice PR 👍

@regisss (Collaborator) left a review comment:

LGTM!

@regisss merged commit f54e025 into main on Aug 8, 2023
9 checks passed
@regisss deleted the habana_llm branch on August 8, 2023 at 23:06