Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JitArm64_Integer: Carry flag optimizations #13251

Merged
merged 12 commits into from
Jan 6, 2025

Conversation

Sintendo
Copy link
Member

This PR implements a bunch of missing optimization opportunities regarding the carry flag.

  • addex
Optimize InPPCState for 0 Before:
0x394bd3b8 ldrb w24, [x29, #0x2f4]
0x2a1803f9 mov w25, w24

After:

0x394bd3b9 ldrb w25, [x29, #0x2f4]
Optimize InHostCarry for 0 Before:
0x5280001b   mov    w27, #0x0                 ; =0
0x1a1f037b   adc    w27, w27, wzr

After:

0x1a9f37fb   cset   w27, hs
Optimize InHostCarry for -1 Before:
0x1280001a   mov    w26, #-0x1                ; =-1
0x1a1f035a   adc    w26, w26, wzr

After:

0x5a9f23fa   csetm  w26, lo
  • addzex
Optimize ConstantFalse for 0 Before:
0x52800119   mov    w25, #0x8                 ; =8
0x2a1903fa   mov    w26, w25

After:

Optimize ConstantTrue for 0 Before:
0x52800119   mov    w25, #0x8                 ; =8
0x1100073a   add    w26, w25, #0x1

After:

Optimize InPPCState for 0 Before:
0x52800019   mov    w25, #0x0                 ; =0
0x394bd3b8   ldrb   w24, [x29, #0x2f4]
0x2b180339   adds   w25, w25, w24

After:

0x394bd3b9   ldrb   w25, [x29, #0x2f4]
Optimize InHostCarry for 0 Before:
0x5280000d   mov    w13, #0x0                 ; =0
0x1a1f01ae   adc    w14, w13, wzr

After:

0x1a9f37ee   cset   w14, hs
  • subfex
Optimize InPPCState for 0 Before:
0x394bd3b8   ldrb   w24, [x29, #0x2f4]
0x2a1803f9   mov    w25, w24

After:

0x394bd3b9   ldrb   w25, [x29, #0x2f4]
Optimize InHostCarry for -1 Before:
0x1280001a   mov    w26, #-0x1                ; =-1
0x1a1f035a   adc    w26, w26, wzr

After:

0x5a9f23fa   csetm  w26, lo
  • subfzex
Constant folding Before:
0x52800016   mov    w22, #0x0                 ; =0
0x2a3603f6   mvn    w22, w22

After:

When the immediate is zero, we can load the carry flag from memory
directly to the destination register.

Before:
0x394bd3b8   ldrb   w24, [x29, #0x2f4]
0x2a1803f9   mov    w25, w24

After:
0x394bd3b9   ldrb   w25, [x29, #0x2f4]
The result is either -1 or 0 depending on the state of the carry flag.
This can be done with a csetm instruction.

Before:
0x1280001a   mov    w26, #-0x1                ; =-1
0x1a1f035a   adc    w26, w26, wzr

After:
0x5a9f23fa   csetm  w26, lo
When both the input register and the carry flag are constants, the
result can be precomputed.

Before:
0x52800016   mov    w22, #0x0                 ; =0
0x2a3603f6   mvn    w22, w22

After:
Same optimization we did for subfex. Skip loading the carry flag into a
temporary register first when we're dealing with zero.

Before:
0x394bd3b8   ldrb   w24, [x29, #0x2f4]
0x2a1803f9   mov    w25, w24

After:
0x394bd3b9   ldrb   w25, [x29, #0x2f4]
Similar to what we did for subfex, but for 0.

Before:
0x5280001b   mov    w27, #0x0                 ; =0
0x1a1f037b   adc    w27, w27, wzr

After:
0x1a9f37fb   cset   w27, hs
Same thing we did for subfex.

Before:
0x1280001a   mov    w26, #-0x1                ; =-1
0x1a1f035a   adc    w26, w26, wzr

After:
0x5a9f23fa   csetm  w26, lo
When the input register and carry flags are known, we can always
precompute the result.

We still materialize the immediate when the condition register
needs to be updated, but this seems to be a general problem. I might
look into that one day, but for now this'll do.

- ConstantFalse
Before:
0x52800119   mov    w25, #0x8                 ; =8
0x2a1903fa   mov    w26, w25

After:
N/A

- ConstantTrue
Before:
0x52800119   mov    w25, #0x8                 ; =8
0x1100073a   add    w26, w25, #0x1

After:
N/A
Before:
0x52800019   mov    w25, #0x0                 ; =0
0x394bd3b8   ldrb   w24, [x29, #0x2f4]
0x2b180339   adds   w25, w25, w24

After:
0x394bd3b9   ldrb   w25, [x29, #0x2f4]
@JosJuice
Copy link
Member

The last commit can remove the Common/Unreachable.h include. LGTM otherwise.

Before:
0x5280000d   mov    w13, #0x0                 ; =0
0x1a1f01ae   adc    w14, w13, wzr

After:
0x1a9f37ee   cset   w14, hs
@JosJuice JosJuice merged commit eec2e2f into dolphin-emu:master Jan 6, 2025
13 checks passed
@Sintendo Sintendo deleted the carry-opts branch January 6, 2025 10:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants