Optimized inference of XGLM model on HPU #1323
Conversation
Signed-off-by: Ye, Xinyu <[email protected]>
@XinyuYe-Intel can you provide Gaudi2 test results on the latest 1.17/1.18 docker with RUN_SLOW=true GAUDI2_CI, as well as Gaudi1 test results?
Perf on Gaudi2 on 1.17.1 with RUN_SLOW=true is as below: For Gaudi1, I don't have the machine, so I can't provide the result.
@XinyuYe-Intel could you please resolve the conflicts on this PR? Looks good otherwise.
Resolved conflicts.
@XinyuYe-Intel can you rebase?
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
What does this PR do?
Optimized inference of XGLM model on HPU.
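As a rough usage sketch, XGLM inference on HPU could be exercised through the optimum-habana text-generation example script. This requires Gaudi hardware and the SynapseAI software stack, so it is illustrative only; the checkpoint name and flag choices below are assumptions, not values taken from this PR:

```shell
# Hypothetical invocation, assuming the optimum-habana examples/text-generation
# script and a Gaudi machine with the Habana PyTorch bridge installed.
python run_generation.py \
  --model_name_or_path facebook/xglm-564M \
  --use_hpu_graphs \
  --use_kv_cache \
  --max_new_tokens 100 \
  --bf16
```

The `--use_hpu_graphs` and `--use_kv_cache` flags are the usual levers for HPU inference throughput in optimum-habana examples; actual perf numbers for this PR were measured on Gaudi2 with SynapseAI 1.17.1 as noted in the conversation above.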
Before submitting