[WIP] IL2CPU IL-level Optimization #169
base: master
Conversation
What kind of benefits will this have? Faster running speeds, smaller compile size, faster compile, etc. |
Smaller compile size and faster running speeds. The compile times will increase, but I plan to address this in a future PR; for now, if you need fast compilation, you can simply disable optimization from your build profile |
alrighty, what kind of performance gains will there be to expect? anything significant? |
As new passes get added, you can expect quite sizable performance gains. For now, there is only a direct property inline pass that drastically improves performance when using properties; it removes the need for the CPU to […]. Other features that are planned, such as method inlining and control flow reordering, will boost performance even more. Method inlining will avoid […] |
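To illustrate the idea behind a direct property inline pass, here is a minimal sketch on a toy model of IL. It is not the code from this PR and does not use IL2CPU's API: instructions are plain `(opcode, operand)` tuples, and the names `inline_trivial_getters`, `trivial_getter_field`, `Point::get_X` and `Point::x` are all hypothetical. It assumes a "trivial" getter is one whose body is exactly `ldarg.0; ldfld F; ret`, so a call to it can be replaced by a single `ldfld F` (the instance is already on the evaluation stack at the call site).

```python
# Toy model only: an instruction is an (opcode, operand) tuple.
# None of these names come from IL2CPU; they are made up for illustration.
TRIVIAL_GETTER_SHAPE = ("ldarg.0", "ldfld", "ret")


def trivial_getter_field(body):
    """Return the backing field if body is exactly ldarg.0; ldfld F; ret."""
    if tuple(op for op, _ in body) != TRIVIAL_GETTER_SHAPE:
        return None
    return body[1][1]


def inline_trivial_getters(method_body, methods):
    """Replace calls to trivial getters with a direct field load."""
    result = []
    for op, operand in method_body:
        if op == "call" and operand in methods:
            field = trivial_getter_field(methods[operand])
            if field is not None:
                # The instance is already on the stack, so ldfld suffices.
                result.append(("ldfld", field))
                continue
        result.append((op, operand))
    return result


methods = {
    # Auto-property style getter: return this.x;
    "Point::get_X": [("ldarg.0", None), ("ldfld", "Point::x"), ("ret", None)],
    # A getter that does extra work is left alone by this sketch.
    "Point::get_Length": [
        ("ldarg.0", None),
        ("call", "Point::ComputeLength"),
        ("ret", None),
    ],
}

caller = [
    ("ldloc.0", None),
    ("call", "Point::get_X"),
    ("stloc.1", None),
    ("ldloc.0", None),
    ("call", "Point::get_Length"),
    ("stloc.2", None),
    ("ret", None),
]

optimized = inline_trivial_getters(caller, methods)
```

A real pass would have to deal with much more than this (virtual dispatch, setters, branch targets, and so on); the sketch only shows why the call overhead disappears for the simplest getters.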
This article covers a good portion of the optimizations that a JIT would normally perform (and, in our case, the […] |
Method inlining. Yes, yes, yes, yes, 1000 times yes. This would let us speed up the current canvas with little work. |
This is a great PR! The approach looks very sensible for now. Regarding optimization, a very big improvement would be to figure out when we actually need to push to and pop from the stack, and when we can simply keep the value in registers instead. |
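One very narrow version of the push/pop suggestion above is a peephole pass over the emitted assembly. The sketch below is an assumption about how such a pass could look, not something from this PR: it works on a toy list of `(mnemonic, operands...)` tuples, the function name `fold_push_pop` is hypothetical, and it assumes straight-line code with register or immediate operands only (no labels or jumps landing between the `push` and the `pop`).

```python
# Toy peephole pass; instructions are (mnemonic, operand, ...) tuples.
# Assumes straight-line code: nothing may jump in between a push and its pop.
def fold_push_pop(instructions):
    """Turn 'push A; pop B' into 'mov B, A', and drop 'push A; pop A'."""
    out = []
    for ins in instructions:
        if ins[0] == "pop" and out and out[-1][0] == "push":
            src = out.pop()[1]  # operand of the push directly before
            dst = ins[1]
            if src != dst:
                out.append(("mov", dst, src))
            # push A; pop A leaves A unchanged, so nothing is emitted
            continue
        out.append(ins)
    return out


code = [
    ("push", "eax"),
    ("pop", "ebx"),   # becomes mov ebx, eax
    ("push", "ecx"),
    ("pop", "ecx"),   # removed entirely
    ("add", "eax", "ebx"),
]

folded = fold_push_pop(code)
```

Deciding in general when a value can live in a register for its whole lifetime is a register allocation problem and goes well beyond this kind of local rewrite.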
what about compile times and sizes, how will those be affected? |
Compile times will be extended, as the compiler will need to perform extra passes for each method. Depending on the complexity of the pass, it can take the compiler anywhere from a millisecond to a full second to process a method. For example, if the method has a lot of calls that can be inlined, the […]. This is why there are additional passes like […].

As for binary size, this may vary depending on the set of optimization passes you'll be using. Method inlining will introduce a few more bytes to the final binary, but redundant instruction elimination should balance that out. IIRC, IL2CPU already compiles in only the methods it will need, as it uses a scanner. The reason the final kernel binary is so big is that Cosmos initializes a large majority of devices for you, even if you're not going to use them; so, for example, the network driver will be initialized outside of your kernel code, meaning you don't really have a choice whether it will be included or not.

By contrast, take the CAI as an example: it's extensible and lets kernel authors choose whether to enable it. Compile a kernel that doesn't reference any CAI classes, and then search for […].

In the cases mentioned previously, the optimizer can't really help you, as it can't simply take out code that it knows a part of the kernel uses; not only would that be dangerous, but that burden shouldn't lie on the compiler at all. A solution would be to refactor all drivers whose initialization can be delegated to public API methods (like the CAI).

TL;DR: this comment. |
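The point about the scanner can be pictured as a reachability walk over the call graph. The following is a rough sketch under that assumption, not IL2CPU's actual scanner: the call graph is a plain dictionary, and every method name in it (`Boot::Init`, `Network::Init`, `Canvas::Draw`, and so on) is made up for illustration.

```python
from collections import deque


def reachable_methods(call_graph, entry_points):
    """Breadth-first walk: collect every method reachable from the entries."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        method = queue.popleft()
        for callee in call_graph.get(method, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical call graph: boot code initializes the network driver itself,
# while the canvas is only reachable if kernel code calls into it.
call_graph = {
    "Boot::Init": ["Network::Init", "Kernel::Run"],
    "Kernel::Run": ["Console::WriteLine"],
    "Network::Init": ["Network::Probe"],
    "Canvas::Draw": ["Canvas::SetPixel"],
}

compiled = reachable_methods(call_graph, ["Boot::Init"])
```

In this toy graph the network driver ends up in `compiled` because the boot path calls it, whether or not the kernel author uses it, while the canvas methods are left out because nothing reachable refers to them. That mirrors why an optimizer cannot safely remove such driver code: from its point of view the code is in use.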
It is also big because each plug that may not be used is included. |
Are all plugs included? They get scanned, but I would expect only the required plugs to actually be emitted. |
This PR is currently inactive, but as new IL optimizers have come out as of late, there is a possibility to use such a project (like DistIL) and reduce the amount of work we would need to do under IL2CPU - optimization itself can introduce a lot of buggy behavior, so a lot of upkeep would be required to keep this stable (or, at least, stable for IL2CPU standards). However, I won't close this PR, as it's not confirmed if these projects would be suitable for IL2CPU - it might be the case that writing an external IL optimizer, suited for IL2CPU, but not directly associated/exclusive to it, would be the best option here. If anyone wants to take over this PR, let me know, as currently I'm occupied with operating system development with NAOT research. |
This pull request implements the base building blocks for IL-level optimization, alongside basic optimizations. The IL would normally be optimized with a JITter, but, as IL2CPU is an AOT compiler, IL-level optimizations can be made.
Optimization passes to implement:
This is a work-in-progress pull request.