Gaudi2 support opt-66b(DS mode) #279
2834e8e to 4520bf1
@regisss opt-66B cannot run on 1 Gaudi2, and it also cannot run with DS 8x, because here the model weights are transferred to the device before the weights are sharded in
so I changed the device from
Thanks for this PR @ZhaiFeiyue, I'll review it by the end of this week!
@ZhaiFeiyue So I investigated this a bit, and with a few changes to DeepSpeed I got it to work:
Do you think we can push these changes to Habana's DeepSpeed fork?
@regisss thanks for your investigation 😄. With the changes above, you can run text-generation with opt-66b DS 8x the same way as BLOOM, right? BLOOM is specially handled in
Yes, it follows the same path as BLOOM in the script. I'm going to push a new commit so that you can test it.
Cool @regisss. To answer your question, I'd prefer we do this on the optimum-habana side, since Habana DS is a fork and should not carry too many model-specific changes.
@regisss very clean changes 👍 Any ideas, or did I miss something?
@ZhaiFeiyue Yeah, I have the same error. I need to see where it comes from exactly. When doing the changes directly in DeepSpeed it was working.
@ZhaiFeiyue The model is now loaded correctly but results are weird:
It's like it is not doing anything.
@regisss yes, same on my side
Weird 🤔
@ZhaiFeiyue There must be something wrong in the injection policy. Maybe because of the way we write the JSON checkpoint here:
I also see that auto-injection will be merged at some point (see here), so if it's planned for v1.11.0, maybe it's better to just wait for it and keep your initial change?
@regisss agreed. I will open a new PR, and we can keep your changes in this PR.
The new PR is #285
@regisss I finally got time to debug OPT-66b 😄. The weights are not loaded correctly because of a name mismatch; your injection works well. I added new changes here. With opt-125m and 66b, if the weight names start from see here, the prefix with level=0 is stripped, which leads to the name mismatch here. I have tested 125m and 13b, and will test 66b later when resources are available.
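To make the name mismatch described above concrete, here is a minimal sketch in plain Python. It is not the DeepSpeed or optimum-habana loading code: the key names, `strip_first_component`, and `missing_keys` are all hypothetical and only illustrate what happens when the first dotted component of a weight name is stripped before the lookup.

```python
def strip_first_component(name: str) -> str:
    """Drop the first dotted component of a weight name (hypothetical helper)."""
    head, sep, tail = name.partition(".")
    return tail if sep else name


def missing_keys(expected, available):
    """Return the expected names that are absent from the checkpoint."""
    return sorted(set(expected) - set(available))


# Illustrative checkpoint: the key names are assumptions, not real OPT keys.
checkpoint = {
    "decoder.layers.0.fc1.weight": "tensor-0",
    "decoder.layers.0.fc2.weight": "tensor-1",
}

# Looking up the names unchanged finds every weight.
unchanged = list(checkpoint)

# Looking up the names after stripping the first component finds none of them.
stripped = [strip_first_component(name) for name in checkpoint]

print(missing_keys(unchanged, checkpoint))
print(missing_keys(stripped, checkpoint))
```

In this toy setup the stripped names no longer match any checkpoint key, so a loader that silently skips missing keys would leave those weights uninitialized, which would be consistent with the model loading but producing meaningless output.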
@ZhaiFeiyue Nice! Let me know if OPT-66b works 🙂
8930ebb to 0032f86
@regisss I have checked the weight names of all the BLOOM models (from 560m to 176b), and they are all the same.
transformers
Based on the above analysis, what I think is
Correct me if something is wrong.
@ZhaiFeiyue I just pushed a rebase.
@regisss it would be great if the Transformers team could fix this.
@regisss let's close this PR, because the changes for OPT should be in Habana DS, as in official DS.
Sounds good!
What does this PR do?
Currently, text-generation only supports Bloom-176b and not opt-66b, since opt-66b cannot fit into 1 Gaudi2 device (96GB).
With this PR, opt-66B can run with 2 Gaudi2 cards.
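A rough back-of-the-envelope check of the memory claim above, as a sketch only. It assumes a nominal 66e9 parameters (taken from the model name), 2 bytes per parameter (bf16/fp16), and an even split across cards. It counts weights only and ignores activations, the KV cache, and runtime overhead, so real requirements are higher.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


OPT_66B_PARAMS = 66e9  # assumption: nominal count implied by the model name
GAUDI2_MEMORY_GB = 96  # per-device memory quoted in the PR description


def fits(per_device_gb: float, capacity_gb: float = GAUDI2_MEMORY_GB) -> bool:
    """True if the per-device weight footprint is below the device capacity."""
    return per_device_gb < capacity_gb


total_gb = weight_memory_gb(OPT_66B_PARAMS)
per_card_gb = total_gb / 2  # assumption: weights sharded evenly over 2 cards

print(f"1 card:  {total_gb:.0f} GB, fits: {fits(total_gb)}")
print(f"2 cards: {per_card_gb:.0f} GB per card, fits: {fits(per_card_gb)}")
```

Under these assumptions the weights alone need about 132 GB, which exceeds a single 96GB device, while an even split over 2 cards needs about 66 GB per card.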