#14826: Remove misoptimizations from init code #14861
base: main
Clang-Tidy
found issue(s) with the introduced code (1/4)
Clang-Tidy
found issue(s) with the introduced code (2/4)
Clang-Tidy
found issue(s) with the introduced code (3/4)
Clang-Tidy
found issue(s) with the introduced code (4/4)
6f79a1d to d5b4f64
I was dissatisfied with the volatile asms, so I changed to actual assembly instructions, which hide the form of the loop from the compiler. All the GitHub comments are about unavoidable changes: this is startup code, so of course it's going to be 'exciting'.
1) Stop wzerorange being recognized as memset
2) Reduce insns in data image copy
3) Do not use a loop for residue
Rename init code as do_crt1, to make it clearer what it is doing.
d5b4f64 to 10c7e96
minor comment tweak requested
Ticket
#14826
Problem description
Empty kernels are larger than expected.
What's changed
Stop wzerorange being recognized as memset. Memset is no longer pulled in.
Reduce insns in data image copy. The original loop was 21 insns (3.5 per word); the new loop is 10 insns (3.3 per word).
Do not use a loop for residue. We only have to handle the 0, 1 and 2 cases, and a loop adds more overhead.
Sprinkle a few more unroll-inhibiting pragmas around.
Rename init code as do_crt1, to make it clearer what it is doing.
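As an illustration of the first item, here is a minimal sketch of a word-zeroing loop that the compiler should not be able to rewrite as a call to memset. It is not the code in this PR: the name `wzerorange_sketch` and its `[start, end)` signature are assumptions, and it uses a portable empty-asm barrier (GCC/Clang extension) rather than the real assembly instructions this PR switched to.

```c
#include <stdint.h>

/* Hypothetical stand-in for wzerorange: zero the 32-bit words in [start, end).
 * The name and signature are assumptions made for this sketch. */
static void wzerorange_sketch(uint32_t *start, uint32_t *end) {
    for (uint32_t *p = start; p != end;) {
        /* The empty asm claims to read and modify p, so the optimizer can no
         * longer prove this is a plain "store zero over a range" loop and
         * should not replace it with a memset call. No instruction is emitted. */
        __asm__ volatile("" : "+r"(p));
        *p++ = 0;
    }
}
```

The barrier is a stand-in only; writing the loop body as real instructions, as this PR does, hides the loop shape more thoroughly.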
These changes remove 436 bytes from the kernel code.
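The copy and residue items can be sketched in plain C as follows. This is only an illustration under stated assumptions: the function name `copy_image_sketch` is hypothetical, and the three-words-per-iteration bulk loop is inferred from the "10 insns (3.3 per word)" figure rather than taken from the actual code, which is written with assembly instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical data-image copy: moves n_words 32-bit words from src to dst.
 * Copying three words per iteration is an assumption for this sketch. */
static void copy_image_sketch(uint32_t *dst, const uint32_t *src, size_t n_words) {
    size_t residue = n_words % 3;
    const uint32_t *bulk_end = src + (n_words - residue);

    /* Bulk: load three words, then store three words, per iteration. */
    while (src != bulk_end) {
        uint32_t a = src[0], b = src[1], c = src[2];
        dst[0] = a;
        dst[1] = b;
        dst[2] = c;
        src += 3;
        dst += 3;
    }

    /* Residue is 0, 1 or 2 words: two straight-line tests, no second loop. */
    if (residue >= 1) dst[0] = src[0];
    if (residue == 2) dst[1] = src[1];
}
```

With only three possible residues, two conditional stores replace the counter, compare and branch-back that a tail loop would need.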
Checklist