Fix branch-misses on Raspberry Pi 3B #95

Merged (1 commit) on Dec 16, 2023


vacantron (Collaborator) commented Dec 16, 2023

According to the "Arm Cortex-A53 MPCore Processor Technical Reference Manual", data-processing instructions that use the PC as a destination register are not predicted.

After replacing `MOV pc, ...` with a `BLX` instruction, the statistics become:

Performance counter stats for 'out/shecc-stage1.elf tests/fib.c' (5 runs):

         2,308,525      branches:u                                                    ( +-  0.00% )
           229,797      branch-misses:u           #    9.95% of all branches          ( +-  0.09% )

         0.0756673 +- 0.0000838 seconds time elapsed  ( +-  0.11% )

This halves the branch misses and gives a ~4% speedup. Further improvement should be done in the optimizer.
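
For readers unfamiliar with the encodings involved, here is a minimal sketch of the substitution at the instruction-word level. The helper names (`encode_mov_reg`, `encode_blx`, `encode_pop`) are hypothetical and are not taken from the shecc sources; the sketch assumes A32 encodings with the AL (always) condition and is meant only to show which instruction forms are being swapped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not shecc's actual emitter.
 * All words are A32 encodings with condition AL (0xE in the top nibble). */

enum { REG_SP = 13, REG_LR = 14, REG_PC = 15 };

/* MOV Rd, Rm (register form, no shift).
 * With Rd = pc this is a data-processing instruction that writes the PC,
 * the form the PR description says the Cortex-A53 does not predict. */
uint32_t encode_mov_reg(unsigned rd, unsigned rm)
{
    assert(rd < 16 && rm < 16);
    return 0xE1A00000u | ((uint32_t) rd << 12) | rm;
}

/* BLX Rm: branch to the address in Rm.
 * Note that, unlike MOV pc, Rm, it also writes the return address to lr. */
uint32_t encode_blx(unsigned rm)
{
    assert(rm < 16);
    return 0xE12FFF30u | rm;
}

/* POP {reglist}, encoded as LDMIA sp!, {reglist}.
 * Bit n of reg_mask selects register n; including bit 15 loads the PC. */
uint32_t encode_pop(uint32_t reg_mask)
{
    assert(reg_mask != 0 && reg_mask <= 0xFFFFu);
    return 0xE8BD0000u | reg_mask;
}
```

Under these assumptions, `encode_mov_reg(REG_PC, REG_LR)` gives `0xE1A0F00E` (`mov pc, lr`), while `encode_pop(1u << REG_PC)` gives `0xE8BD8000` (`pop {pc}`). Because `BLX` overwrites `lr`, it is not a drop-in replacement wherever `lr` must be preserved; which form fits a given call or return site depends on the code generator.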

Related Issues

jserv (Collaborator) commented Dec 16, 2023

Should we close #93 accordingly?

According to the "Arm Cortex-A53 MPCore Processor Technical Reference
Manual", data-processing instructions that use the PC as a
destination register are not predicted. Replace `MOV pc, ...`
with a `BLX` or `POP` instruction.

Close sysprog21#93

vacantron (Collaborator, Author) commented Dec 16, 2023

> Should we close #93 accordingly?

OK. I have updated the commit message.

@jserv jserv merged commit 650a900 into sysprog21:master Dec 16, 2023
3 checks passed
@vacantron vacantron deleted the dev branch December 16, 2023 19:47