
[WIP] IL2CPU IL-level Optimization #169

Draft
wants to merge 6 commits into base: master

Conversation

@ascpixi commented Oct 14, 2022

This pull request implements the base building blocks for IL-level optimization, alongside a set of basic optimization passes. The IL would normally be optimized by a JIT at run-time; as IL2CPU is an AOT compiler, these optimizations have to be performed ahead of time instead.

Optimization passes to implement:

  • Property inlining
  • Method inlining
  • Loop unrolling
  • Control flow reordering
  • Redundant instruction elimination

This is a work-in-progress pull request.
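
Conceptually, the building blocks boil down to a pass-based optimizer. Below is a minimal sketch of what that could look like; the interface, the instruction model, and the signatures are illustrative assumptions (the pass names are taken from the discussion below), not the actual code in this PR.

```csharp
// Illustrative sketch only - the real Optimizer and pass types are likely
// shaped differently, and ILInstruction is a stand-in for whatever
// instruction model IL2CPU uses internally.
using System.Collections.Generic;

record ILInstruction(string OpCode, object Operand = null);

// A pass rewrites the IL of a single method in place.
interface IOptimizationPass
{
    void Run(List<ILInstruction> method);
}

// Cheap pass: rewrites calls to direct (auto-implemented) property accessors
// into plain field loads/stores, with no callee analysis required.
class InlineDirectPropertiesPass : IOptimizationPass
{
    public void Run(List<ILInstruction> method) { /* ... */ }
}

// More expensive pass: inlines small methods, which requires local analysis
// and instruction correction at every inlined call site.
class InlineMethodsPass : IOptimizationPass
{
    public void Run(List<ILInstruction> method) { /* ... */ }
}

class Optimizer
{
    // Cheap, targeted passes run first so the costlier ones have less to do.
    readonly List<IOptimizationPass> passes = new()
    {
        new InlineDirectPropertiesPass(),
        new InlineMethodsPass(),
    };

    public void Optimize(List<ILInstruction> method)
    {
        foreach (var pass in passes)
            pass.Run(method);
    }
}
```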

@terminal-cs (Contributor)

What kind of benefits will this have? Faster running speeds, smaller compile size, faster compile times, etc.?

@ascpixi (Author) commented Oct 14, 2022

> What kind of benefits will this have? Faster running speeds, smaller compile size, faster compile times, etc.?

Smaller compile size and faster running speeds. Compile times will increase, but I plan to address this in a future PR; for now, if you need fast compilation, you can simply disable optimization in your build profile.

@terminal-cs (Contributor)

Alrighty, what kind of performance gains can we expect? Anything significant?

@ascpixi (Author) commented Oct 14, 2022

> Alrighty, what kind of performance gains can we expect? Anything significant?

As new passes get added, you can expect quite sizable performance gains. For now, there is only a direct property inlining pass, which drastically improves performance when using properties: it removes the need for the CPU to jmp to a memory address, meaning the pipeline does not get cleared. The performance is the same as if you had used a field, because the IL call instruction is directly replaced with stfld/stsfld.
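
As a rough illustration (hypothetical class and member names, not taken from this PR):

```csharp
// Hypothetical example - illustrative names only.
class Player
{
    // A "direct" (auto-implemented) property: its accessors only touch the
    // compiler-generated backing field.
    public int Health { get; set; }
}

class Demo
{
    static void Main()
    {
        var player = new Player();

        // Unoptimized IL:  call/callvirt set_Health(int32) -> jump into the setter.
        // After the pass:  stfld on the backing field      -> no call, no jump.
        player.Health = 100;

        // A getter call can likewise be replaced with a plain ldfld.
        System.Console.WriteLine(player.Health);
    }
}
```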

Other planned features, such as method inlining and control flow reordering, will boost performance even more. Method inlining will avoid jmps altogether, which will drastically improve performance in loops, and control flow reordering will prioritize the branch that is most likely to be taken, reducing the number of jumps in that scenario as well.
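
To make the method inlining point concrete, here's a hedged sketch (hypothetical code, not part of this PR) of what inlining effectively does to a loop:

```csharp
// Hypothetical example - illustrative only.
static class InliningDemo
{
    static int Square(int x) => x * x;

    static int SumOfSquares(int[] values)
    {
        int sum = 0;
        foreach (int v in values)
            sum += Square(v);   // each iteration: call -> jmp into Square and back
        return sum;
    }

    // After method inlining, the loop behaves as if it had been written as:
    static int SumOfSquaresInlined(int[] values)
    {
        int sum = 0;
        foreach (int v in values)
            sum += v * v;       // callee body pasted in; no call, no jump
        return sum;
    }
}
```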

@ascpixi (Author) commented Oct 14, 2022

This article covers a good portion of the optimizations that a JIT would normally perform (and that, in our case, the Optimizer class has to perform, as we lack a JIT).

@zarlo (Member) commented Oct 15, 2022

Method inlining. yes, yes, yes, yes 1000 times yes

This would let us speed up the current canvas with little work.

@quajak (Member) commented Oct 15, 2022

This is a great PR! The approach looks very sensible for now. Regarding optimization, a very big improvement would be to figure out when we actually need to push and pop values to the stack, and when we can just keep them in registers.

@terminal-cs (Contributor)

What about compile times and sizes, how will those be affected?

@ascpixi (Author) commented Oct 25, 2022

> What about compile times and sizes, how will those be affected?

Compile times will be extended, as the compiler will need to perform extra passes over each method. Depending on the complexity of the pass, it can take the compiler anywhere from a millisecond to a full second to process a method. For example, if a method has a lot of calls that can be inlined, the InlineMethodsPass (not yet committed) will need to perform local analysis and instruction correction for each inlined method call.

This is why there are additional passes like InlineDirectPropertiesPass that will inline every direct property without the need for any method analysis, reducing the load on InlineMethodsPass, which will perform a (relatively) more complex method analysis routine.

As for binary size, this may vary depending on the set of optimization passes you'll be using. Method inlining will introduce a few more bytes to the final binary, but redundant instruction elimination should balance that out. IIRC, IL2CPU already only compiles in the methods it will need, as it uses a scanner. The reason the final kernel binary is so big is that Cosmos initializes a large majority of devices for you, even if you're not going to use them; so, for example, the network driver will be initialized outside of your kernel code, meaning you don't really have a choice in whether it gets included or not.

As an example, the CAI can be used, as it's extensible and allows kernel authors to choose whether to enable it or not. Compile a kernel that doesn't reference any CAI classes, and then search for AudioBuffer in the assembly file IL2CPU creates; you'll find that no references to that class exist. After adding an audio card initialization routine and re-compiling, you'll notice that these references get created.
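
For reference, the kind of "audio card initialization routine" meant here is roughly the one shown in the Cosmos audio documentation; treat the exact namespaces, class names, and signatures in this sketch as assumptions, since they may differ from the shipped CAI API.

```csharp
// Rough sketch of a CAI-based audio setup (namespaces and member names are
// assumptions; consult the Cosmos audio docs for the exact API).
using Cosmos.HAL.Drivers.Audio;
using Cosmos.System.Audio;
using Cosmos.System.Audio.IO;

class AudioExample
{
    static void EnableAudio(byte[] wavBytes)
    {
        var mixer = new AudioMixer();
        var stream = MemoryAudioStream.FromWave(wavBytes);
        var driver = AC97.Initialize(bufferSize: 4096);

        mixer.Streams.Add(stream);

        var audioManager = new AudioManager()
        {
            Stream = mixer,
            Output = driver,
        };
        audioManager.Enable();

        // Once a routine like this is referenced from kernel code, the scanner
        // pulls in the related CAI types (e.g. AudioBuffer); before that, no
        // references to them show up in the emitted assembly.
    }
}
```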

In the cases mentioned previously, the optimizer can't really help you, as it can't simply take out code that it knows a part of the kernel uses; not only would that be dangerous, but that burden shouldn't lie on the compiler at all. A solution would be to refactor all drivers whose initialization can be delegated so that it happens through public API methods (like the CAI).
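
A hedged sketch of this refactor idea (all type names hypothetical): move unconditional driver initialization out of the kernel's startup path and behind a public API method, so the scanner only pulls the driver in when user code actually calls it.

```csharp
// Hypothetical sketch - none of these types exist in Cosmos under these names.
static class HypotheticalNetworkDriver
{
    public static void Initialize() { /* talk to the hardware... */ }
}

// Today (conceptually): the driver is initialized unconditionally during
// kernel startup, so the scanner must always compile it in.
static class EagerStartup
{
    public static void BeforeRun() => HypotheticalNetworkDriver.Initialize();
}

// After the refactor: initialization is only reachable through a public API
// method (the CAI model), so it is compiled in only when a kernel calls it.
static class Network
{
    public static void Enable() => HypotheticalNetworkDriver.Initialize();
}
```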

TL;DR: this comment.

@MishaTy (Contributor) commented Oct 25, 2022

It is also big because each plug is included, even if it may not be used.

@quajak (Member) commented Oct 31, 2022

Are all plugs included? They get scanned, but I would expect only the required plugs to actually be emitted.

@ascpixi (Author) commented Jan 11, 2023

This PR is currently inactive, but as new IL optimizers have come out lately, there is a possibility of using such a project (like DistIL) to reduce the amount of work we would need to do inside IL2CPU. Optimization itself can introduce a lot of buggy behavior, so a lot of upkeep would be required to keep this stable (or, at least, stable by IL2CPU standards).

However, I won't close this PR, as it's not yet confirmed whether these projects would be suitable for IL2CPU; it might be the case that writing an external IL optimizer, suited to IL2CPU but not directly associated with or exclusive to it, would be the best option here.

If anyone wants to take over this PR, let me know, as I'm currently occupied with research into operating system development with NAOT.
