-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CHERI-RISCV] Inline expansion of capability-based cmpxchg in hybrid mode #490
base: dev
Are you sure you want to change the base?
Conversation
8053c29
to
1f423d3
Compare
Haven't had a chance to look at it properly yet, but could you please hoist the CheriStoreCond_r operand before all the other commits (and add the "-" in RISC-V while you're at it)? |
1f423d3
to
71f5af4
Compare
fb79cca
to
fe93bf8
Compare
The last commit now adds support for RMW ops, I can split that out into a separate PR if you'd prefer. |
fe93bf8
to
d58fe61
Compare
rebased and updated tests for each commit. |
795f797
to
c29c0c9
Compare
ping? I spent quite a bit of time on this a few months ago so it would be nice to get it merged. |
// instead of promoting to i32/i64 to ensure we don't trigger bounds errors. | ||
const DataLayout &DL = CI->getModule()->getDataLayout(); | ||
uint64_t Size = | ||
DL.getTypeSizeInBits(CI->getCompareOperand()->getType()).getFixedSize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between these two? Shouldn't we always have a primitive here whose size we can get? The code on the left comes from upstream, and there are a bunch of other places that use that form too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the former does not handle pointers since it's not got a DL.
; HYBRID-ATOMICS-NEXT: .LBB0_3: # %partword.cmpxchg.end | ||
; HYBRID-ATOMICS-NEXT: xor a0, a5, a0 | ||
; HYBRID-ATOMICS-NEXT: seqz a1, a0 | ||
; HYBRID-ATOMICS-NEXT: srl a0, a5, a6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Masked atomics are so sad... stupid RISC-V. Though this codegen looks rather different to (and worse than) plain RISC-V on integers.
const Value *Ptr) const { | ||
if (I->getModule()->getDataLayout().isFatPointer(Ptr->getType())) | ||
return 8; // All sizes are supported via capabilities | ||
return TargetLowering::getMinCmpXchgSizeInBits(I, Ptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come integer pointers work fine for RISC-V then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I understand the question, but i8/i16 ops are expanded in IR: https://godbolt.org/z/Y9bKxeb88
This was previously asserting since the backend tried to use the masked instrinsics (which cannot be used with capabilities and also should not be used since we need to remain in-bounds). The test for this commit will be included in the next one since this depends on cmpxchg being expanded.
This adds the required SelectionDAG nodes and RISCVExpandAtomicPseudoInsts changes to emit cmpxchg (lr.cap/sc.cap) in hybrid mode. There is one known issues which will be fixed in the next commits: All explicit mode atomics are relaxed, so we are missing fences.
We should not be using the expandPartwordCmpXchg() logic since capability- based atomics have i8 and i16 variants. To fix this, add a Type argument to TLI->getMinCmpXchgSizeInBits() so that the RISC-V backend can return 8 for capability-based cmpxchg and 32 for non-capability ones.
This is in preparation for the next commit since they are still missing the required fences for RISC-V.
…tomics The explicit addressing mode atomics always use relaxed ordering so we need to insert fences if strong orderings are requested. Fortunately there is already support for this in the AtomicExpandPass so all we need to do here is to fix emitLeadingFence/emitTrailingFence and handle RMW instructions in shouldInsertFencesForAtomic().
c29c0c9
to
9aa0d2f
Compare
This allows expanding all integer atomic RMW operations using capability pointers inline in hybrid mode.
9aa0d2f
to
ae3ca30
Compare
This allows expanding e.g. `atomicrmw xchg i32 addrspace(200)*` without a library call. Code generation could be improved by adding an explicit pseudo but that can be done as a follow-up change
1211aff
to
ef40de6
Compare
rebased and hopefully addressed all comments. |
Factored out from #473 and rebased on top of #491