Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate llvm-project and mlir-hlo. #2454

Merged
merged 9 commits into from
Sep 12, 2023
Merged

Conversation

stellaraccident
Copy link
Collaborator

Corresponding commits:

  • mlir-hlo: 16886a108eff5197f816ca0f1950cc5ff1b078d9
  • stablehlo: 77a59815a82b34f7b08ed2d42a711d9920682d0e
  • llvm-project: 4acc3ff

@ramiro050
Copy link
Collaborator

Hi @stellaraccident, would it be possible to include the commits in #1178 (comment) as part of this bump?

There's a team that uses torch-mlir and onnx-mlir, and the commits from last week are the ones that align with onnx-mlir's bi-weekly bump (brought up during the developer hour this morning).

@stellaraccident
Copy link
Collaborator Author

Hi @stellaraccident, would it be possible to include the commits in #1178 (comment) as part of this bump?

There's a team that uses torch-mlir and onnx-mlir, and the commits from last week are the ones that align with onnx-mlir's bi-weekly bump (brought up during the developer hour this morning).

It's a week behind and I read that as being unmaintained now. Sorry I didn't make it to the developer hour: have we gotten a volunteer yet to help with this (I didn't see any comments to my request)?

@stellaraccident
Copy link
Collaborator Author

Will iterate on this some more now that compile breaks are cleared. I think something got messed up in the region branch thing.

@ramiro050
Copy link
Collaborator

ramiro050 commented Sep 11, 2023

It's a week behind and I read that as being unmaintained now.

Yeah, the person assigned had quite a few things on their plate, so the bump didn't get made. Normally someone would ping the person to check on the status, but I think we were all a bit distracted last week.

Is it possible to have two commits on this PR so that the people using onnx-mlir don't have to wait over a week to get a matching llvm?

cc: @philass

have we gotten a volunteer yet to help with this (I didn't see any comments to my request)?

I have pinged the main contributors of the stablehlo backend in torch-mlir to take a look at your proposal, since I imagine it affects them the most. I think they missed the notification from last week.

@stellaraccident
Copy link
Collaborator Author

It's a week behind and I read that as being unmaintained now.

Yeah, the person assigned had quite a few things on their plate, so the bump didn't get made. Normally someone would ping the person to check on the status, but I think we were all a bit distracted last week.

Is it possible to have two commits on this PR so that the people using onnx-mlir don't have to wait over a week to get a matching llvm?

cc: @philass

have we gotten a volunteer yet to help with this (I didn't see any comments to my request)?

I have pinged the main contributors of the stablehlo backend in torch-mlir to take a look at your proposal, since I imagine it affects them the most. I think they missed the notification from last week.

I'm going to keep working on getting this patch working. If there is a volunteer, they can rebase, but we have to keep the train moving.

@ramiro050
Copy link
Collaborator

I'm going to keep working on getting this patch working. If there is a volunteer, they can rebase, but we have to keep the train moving.

I'm happy to do the rebase 🙂

@philass
Copy link

philass commented Sep 11, 2023

@ramiro050, @stellaraccident,

Thanks for the consideration of our situation during the uplift. Feel free to skip the intermediate LLVM uplifts to match the onnx-mlir llvm commit.

As long as we get the weekly schedule back on track and the LLVM commit gets bumped next Monday after this lands, we shouldn't have any problems.

@stellaraccident
Copy link
Collaborator Author

The random PyTorch crashes I was chasing yesterday with this bump disappeared overnight/with restart of the CI runners. Was also experiencing this locally but suspect maybe ccache related. In either case, just down to two tosa failures. Can see about patching those and landing this morning. @eric-k256 in case if there is any context.

@eric-k256
Copy link
Collaborator

We updated the tosa printers in LLVM, it looks like that is included in the bump. #2345 has the updated lit tests to match the LLVM change. I'm not sure what the easiest way is to integrate that PR with the LLVM bump.

Synchronize to the change in MLIR.
@stellaraccident
Copy link
Collaborator Author

We updated the tosa printers in LLVM, it looks like that is included in the bump. #2345 has the updated lit tests to match the LLVM change. I'm not sure what the easiest way is to integrate that PR with the LLVM bump.

Thanks - I cherry-picked that commit into this branch. Will see what the CI says.

Would you mind having a look/approving? If the CI is clean, I'll land it.

@stellaraccident
Copy link
Collaborator Author

@eric-k256 There is a legit lowering failure: https://github.com/llvm/torch-mlir/actions/runs/6162809713/job/16725038984?pr=2454

If you have a pointer, I can try to patch but would be best if someone familiar with that part triaged. Feel free to push a fix to this branch (or another).

@eric-k256
Copy link
Collaborator

I'll take a look and see what's going on. (I was just about to approve assuming CI would pass)

@stellaraccident
Copy link
Collaborator Author

I'll take a look and see what's going on. (I was just about to approve assuming CI would pass)

If you've got a pointer, I can make a change quickly... just not in a "deep thinking" about it state for the next hour or so.

@ramiro050
Copy link
Collaborator

Relevant patch: https://reviews.llvm.org/D157424

We need to update the TorchToTosa lowerings to cast the index into the an i32

@ramiro050
Copy link
Collaborator

ramiro050 commented Sep 12, 2023

Might require a few more things, but I think replacing getI64IntegerAttr with the right attribute getter in the file:

https://github.com/llvm/torch-mlir/blob/main/lib/Conversion/TorchToTosa/TosaLegalizeCommon.cpp

should fix it.

@eric-k256
Copy link
Collaborator

Yes, that's the basic idea. I've got a commit I'm about to add that fixes it locally. Some of the checks in the lit tests needed updating also.

@stellaraccident
Copy link
Collaborator Author

Yes, that's the basic idea. I've got a commit I'm about to add that fixes it locally. Some of the checks in the lit tests needed updating also.

Cool, thanks. You just convinced me to go eat breakfast vs coming up with a fix :)

@stellaraccident
Copy link
Collaborator Author

With that patch, feel free to squash and merge if it turns green and I'm not back yet.

@stellaraccident
Copy link
Collaborator Author

Yes, that's the basic idea. I've got a commit I'm about to add that fixes it locally. Some of the checks in the lit tests needed updating also.

Looks like it might need some more work. Crash on that test now...

@eric-k256
Copy link
Collaborator

Yeah, not sure what's going on yet as it passed locally. Will see if I can reproduce the crash.

@ramiro050
Copy link
Collaborator

I think the data being passed to

DenseIntElementsAttr::get(reduceDimsType, llvm::ArrayRef(reduceDims));

needs to match the type. In other words, the reduceDims probably needs to be a vector of int32_t values.

@eric-k256
Copy link
Collaborator

Thanks @ramiro050 I think you're right, and I can reproduce locally now, so should be able to verify before pushing the fix.

@stellaraccident
Copy link
Collaborator Author

Looks in better shape now. I found an accidental dump that got left in. Pushing a patch to remove that.

Copy link
Collaborator

@eric-k256 eric-k256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things are looking good in CI now. Thanks for your patience.

@stellaraccident
Copy link
Collaborator Author

Days since the mlir-hlo dep screwed us: 0

One line fix on Windows. No way to actually make it because the Google repos are impenetrable.

@stellaraccident stellaraccident merged commit a00a0d4 into main Sep 12, 2023
3 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants