
[TensorIR] Cross-Thread Reduction #9360

Merged: 6 commits merged into apache:main from the m2/cross-thread-reduction branch on Nov 14, 2021

Conversation

MasterJH5574 (Contributor)

Hi community! This PR adds cross-thread reduction support for TensorIR. After this PR, cross-thread reduction patterns in TIR can be successfully lowered.
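
To illustrate the kind of pattern this enables, here is a minimal sketch (not taken from this PR's tests; the sizes, buffer names, and schedule below are illustrative assumptions): a row-sum whose reduction loop is bound to `threadIdx.x`, which the new lowering rewrites into a thread-level allreduce.

```python
# Minimal sketch of a cross-thread reduction pattern (illustrative; not from
# this PR's test suite). Each thread block computes one output element B[i],
# and the threads of the block cooperatively reduce over k.
import tvm
from tvm import te

n = 128  # illustrative size
A = te.placeholder((n, n), name="A")
k = te.reduce_axis((0, n), name="k")
B = te.compute((n,), lambda i: te.sum(A[i, k], axis=k), name="B")

sch = tvm.tir.Schedule(te.create_prim_func([A, B]))
block = sch.get_block("B")
i, k_loop = sch.get_loops(block)
sch.bind(i, "blockIdx.x")        # spatial loop -> one thread block per row
sch.bind(k_loop, "threadIdx.x")  # reduction loop -> threads: cross-thread reduction

# With this PR, lowering such a schedule succeeds: the thread-bound reduction
# is rewritten (LowerCrossThreadReduction) into a tvm_thread_allreduce call.
print(tvm.lower(sch.mod))
```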

cc @Hzfengsy @vinx13 @comaniac @junrushao1994 @jcf94 @jinhongyii @spectrometerHBH @tqchen

Co-authored-by: Wuwei Lin [email protected]
Co-authored-by: Junru Shao [email protected]
Co-authored-by: Siyuan Feng [email protected]
Co-authored-by: Hongyi Jin [email protected]
Co-authored-by: Bohan Hou [email protected]

Resolved review threads on:
src/tir/schedule/analysis.h
src/tir/schedule/analysis/analysis.cc
src/tir/transforms/lower_cross_thread_reduction.cc
@junrushao (Member)

I will do another round of review next week!

@junrushao (Member)

Will do the review tomorrow.

@junrushao (Member)

Finally got some time for a detailed code review! Will take over this PR and try to get it merged!

@junrushao force-pushed the m2/cross-thread-reduction branch from 3ca0e56 to a560a9b on November 12, 2021 03:24
@junrushao (Member)

Did a pass over analysis and misc changes.

@junrushao (Member)

Love the PR and the very comprehensively tested implementation ❤️

@junrushao force-pushed the m2/cross-thread-reduction branch from 4ac6c65 to 6a162d7 on November 12, 2021 20:29
@junrushao (Member)

@Hzfengsy @MasterJH5574 Should be good to go. Please take another look :-)

@MasterJH5574 (Contributor, Author)

@Hzfengsy Could you take another look? Junru's polishing looks very good, but as the author I cannot approve this PR myself 😅.

@Hzfengsy (Member) left a review comment

LGTM. Thanks @MasterJH5574 for such a great effort on this PR.

@junrushao merged commit 08898e1 into apache:main on Nov 14, 2021
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request on Dec 1, 2021, with the following commit message:
* [TensorIR] Cross-Thread Reduction

* Code revision on analysis and misc

* Refactor TransformReductionBlock

* Refactor code organization

* Address comment

* Use `std::make_tuple`

Co-authored-by: Junru Shao <[email protected]>
Commits with the same message were also pushed to mehrdadh/tvm (a second push on Dec 1, 2021), ylc/tvm (Jan 7 and Jan 13, 2022), and yangulei/tvm (Jan 11, 2022), each referencing this pull request.
@LeiWang1999 (Contributor)

Hi @MasterJH5574, do we currently only support block reduction with warp-level `__shfl_down_sync`, or can the reduction also be lowered into a block reduction within shared memory?

@LeiWang1999 (Contributor)

I've found the related implementation, thanks.
