Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial splitk reduce #2908

Merged
merged 60 commits into from
Apr 12, 2024
Merged

Add initial splitk reduce #2908

merged 60 commits into from
Apr 12, 2024

Conversation

pfultz2
Copy link
Collaborator

@pfultz2 pfultz2 commented Mar 20, 2024

This adds a pass to do splitk reductions. It also updates the kernel to handle splitk reduciton as well. Right now, this only works for a single reduce, future work will enable this for multiple reductions. This is only enabled when using MIGRAPHX_ENABLE_SPLIT_REDUCE=1 until we get more testing for this.

I do see a 2x improvement when using this on large reductions like 327680.

@pfultz2
Copy link
Collaborator Author

pfultz2 commented Apr 2, 2024

I consider this as a Large PR. I think it can be broken down to following pieces.

  1. Changes for the Param Utils and liveness

Why would I create PR to move functions to separate headers when they are only used once?

  1. Adding hip::fill operator

Why would I create a PR for an operator that is not used anywhere?

  1. Adding Generic Split functionality for the module and adding tests for those (they are missing right now).
  2. Adding Split_reduce pass

I would like it better if we can break down this PR

This will slow down getting splitk reduce implemented as there are a lot more changes to come for this. I stopped here to avoid having a very large PR and avoiding dealing with merge conflicts.

{"reduced", "decltype(" + generate_make_shape(reduce_output_shape) + ")"},
{"lambda", v.at("lambda").to<std::string>()},
{"transformers", make_transformer_args(vec)},
{"preamble", v.get("preamble", std::string{})}});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Not sure why all of this is shifted instead of one change for adding assign.

}))
continue;
auto v = ins->get_operator().to_value();
auto axes = v["axes"].to_vector<std::int64_t>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add coverage for continues; non float and empty splits() above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to add tests for those cases, since I hope to support those other cases in the future, and will add test cases for that then.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then add a TODO there or make it an issue we can track for later improvement.

Copy link
Collaborator

@TedThemistokleous TedThemistokleous left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments. One question regarding the use of hipMemsetAsync

I understand we want this in sooner than later so if we're reducing scope for now, add in TODO's

Also add in test coverage for additional early returns/continue

Copy link
Collaborator

@TedThemistokleous TedThemistokleous left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix format and windows build.

Blue ocean is all green. Not sure why its not reporting correctly here though.

@causten causten merged commit 76c3ebb into develop Apr 12, 2024
18 of 19 checks passed
@causten causten deleted the split-reduce branch April 12, 2024 01:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants