Apple's cross entropy computation #391
Comments
Hi @fzyzcjy, did you try the original repo? Does it work as expected?
Hi, no, I have not tried it yet.
I ran into problems installing and using Apple's kernel from that repo. I assume that their Triton implementation is sound and does what it says it does, but it's research code and isn't well tested across many versions/platforms. It would be amazing to have it be part of liger-kernel, because everything here is well tested and "just works" out of the box.
We are more than happy to host and maintain innovative kernels like https://github.com/apple/ml-cross-entropy. @erikwijmans, are you interested in collaborating? We are committed to long-term maintenance at the company level.
FYI @ByronHsu, this is a very simple reproduction of why the CCE kernel isn't working for me. I know you use Modal for CI, so you should be able to reproduce this pretty easily. I wish I could debug it myself, but I am not a Triton god like you. This is the simplest setup I can think of: do a fresh install of only the cut cross entropy package, attempt to import linear_cross_entropy, and the error happens. I did not make any modifications to the code. This uses Python 3.10, triton==3.1.0, torch==2.5.1.
Backtrace:
|
OK, it turns out the problem is related to Triton's regexp search over the source code combined with applying multiple decorators. The fix is to comment out the decorators on
...and instead apply them "manually" like this:
|
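For readers who have not seen the pattern, here is a minimal sketch of what "applying decorators manually" means in plain Python. The decorators `double_result` and `add_one_to_result` are hypothetical stand-ins, not the actual Triton decorators from the repo, and the function names are made up for illustration. The point is only that stacked `@` syntax and explicit nested calls produce the same wrapped function, while the manual form leaves the `def` itself free of decorator lines in the source text.

```python
import functools


def double_result(fn):
    """Hypothetical decorator: doubles the wrapped function's return value."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return 2 * fn(*args, **kwargs)
    return wrapper


def add_one_to_result(fn):
    """Hypothetical decorator: adds one to the wrapped function's return value."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs) + 1
    return wrapper


# Stacked decorator syntax: the decorator closest to the def is applied first.
@add_one_to_result
@double_result
def stacked(x):
    return x


# Manual application: the def has no decorator lines above it,
# and the same wrappers are applied afterwards as ordinary calls.
def _plain(x):
    return x


manual = add_one_to_result(double_result(_plain))

# Both forms behave identically: (3 * 2) + 1
assert stacked(3) == manual(3) == 7
```

The order matters: the innermost call in the manual form corresponds to the decorator written closest to the `def` in the stacked form.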
This has been merged to https://github.com/apple/ml-cross-entropy.
I got CCE working with transformers, but it was a hacked mess. The unsloth guys just announced that it's now supported in their new blog post here. Here is the patch of the forward. Here is where they bring in CCE. I am not an expert on the kernel side, but I can help with the integration.
Hi, thanks for the library! Today I saw a paper, https://openreview.net/forum?id=E4Fk3YuG56 (code: https://github.com/apple/ml-cross-entropy), which seems to discuss a way to compute cross entropy. I'm sharing it here in case it is useful for this repository.