match gemm_softmax_gemm when there is no scale #2748
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

@@ Coverage Diff @@
##           develop    #2748   +/-   ##
========================================
  Coverage    91.48%   91.48%
  Files          468      468
  Lines        17539    17539
========================================
  Hits         16045    16045
  Misses        1494     1494

☔ View full report in Codecov by Sentry.
src/targets/gpu/prefuse_ops.cpp
Outdated
        else
            return;
    });
    if(r.instructions.find("scale") != r.instructions.end())
Use contains(r.instructions, "scale") instead.
This build is OK for merge ✅
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
src/targets/gpu/prefuse_ops.cpp
Outdated
    auto scale_lit = r.instructions["scale"];
    scale_lit->eval().visit([&](const auto s) {
        // CK only supports single-valued scale
        if(std::all_of(
It might be better to flip the logic so that it returns if scale values are different.
added
I was thinking something like this
Suggested change:
-            if(std::all_of(
+            if(not std::all_of(....) return;
+            scale = s.front();
But that's inside a visit; that return will not exit the apply method for this matcher.
Oh, you don't mean to modify the current functionality, just rewrite it?
Oh yeah, I saw that too. But it looks like you rewrote it, and it looks fine now.
Can you add a test case?
This pass only really runs with special flags. How would the test cases work for this?
test/gpu/prefuse_ops.cpp
Outdated
    migraphx::program p1 = create_program();
    migraphx::program p2;
    if(migraphx::gpu::mlir_attention_enabled())
Don't rely on this global setting. Instead, add a flag to the prefuse_ops class, like enable_attention, that can be set to true during tests. Then pass that along to find_gemm_softmax_gemm so the matcher will match it.
test/gpu/prefuse_ops.cpp
Outdated
    auto dot2 = mm->add_instruction(migraphx::make_op("dot"), sm, z);
    mm->add_return({dot2});
    return p;
};
Don't create lambdas. Instead, just construct the program directly:
migraphx::program p1;
{
auto* mm = p1.get_main_module();
...
}
src/targets/gpu/prefuse_ops.cpp
Outdated
struct find_gemm_softmax_gemm
{
    bool enable_attention;
Set this to false, so it's not uninitialized when default-constructing.
done
Make the scale mul op optional when looking for attention. Required to get mlir attention for the SDXL workflow.