Evaluate build paths in the context of the build's bindings #94
Wow, I'm impressed you got all this figured out! I had some ideas here and was about to start tinkering and then I realized I probably should put some more testing in place before I broke something I had done in the past... I checked in some testing support for benchmarking changes to parsing, see It looks like for the LLVM CMake build this patch regresses parse perf by ~60%:
Eyeballing the patch, I'm not sure if it's fundamental to just getting this correct, or if it's something simpler like the stop_at param or needing to avoid allocating some Vecs while parsing. (Definitely have the feeling that I wish I had thought variables through better when making Ninja, blergh. It was a very long time ago though...) |
/// evaluate turns the EvalString into a regular String, looking up the
/// values of variable references in the provided Envs. It will look up
/// its variables in the earliest Env that has them, and then those lookups
/// will be recursively expanded starting from the env after the one that
I believe this isn't right.
The Ninja docs aren't too great but the intent is expansion only ever happens once:
https://ninja-build.org/manual.html#_variable_expansion
I think that is mostly an implementation detail. I mostly made this change so that it was easy to turn an EvalString into the proper string under a variety of different circumstances. (rule bindings vs build bindings) If we want, we can expand out variables at any point in the process.
However I don't think doing early expansion will really help that much. Consider this build.ninja:
root_out = out
build ${local_out}/foo: ${local_out}/bar
local_out = ${root_out}/local
Here, we have to expand 2 paths. Even if we had preemptively flattened root_out into local_out, we still need to copy the out/local path fragment into two separate strings. So flattening it doesn't save us any string copying time, only some scope traversal time, but since there are a maximum of 4 scopes currently in n2 (implicit, rule, build, global, though this will increase when subninja support is added), I don't think that matters.
That is an interesting argument regarding scope traversal time!
I think the most nested case of lookup (excluding subninja) is a file
var = a
rule print
command = echo "var:'$var' out:'$out'"
build c$var: print
var = b$var
which prints
var:'ba' out:'cba'
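To make the lookup order above concrete, here is a minimal sketch of lazy evaluation through an ordered list of scopes. It is not n2's actual code: `Part`, `Env`, `parse`, `evaluate` and `nested_lookup_example` are hypothetical stand-ins, and the tiny parser only understands the `$name` form. The idea it illustrates is the one from the doc comment under review: a variable is taken from the earliest scope that defines it, and its value is then expanded using only the scopes after that one.

```rust
use std::collections::HashMap;

/// One piece of an unevaluated string: literal text or a `$var` reference.
#[derive(Debug)]
enum Part {
    Literal(String),
    VarRef(String),
}

/// Hypothetical stand-in for an unevaluated string (not n2's real EvalString).
type EvalString = Vec<Part>;

/// A scope mapping variable names to unevaluated values.
type Env = HashMap<String, EvalString>;

/// Toy parser: splits text into literals and `$name` references.
/// Assumption: no `${name}` form and no `$$` escapes, unlike real Ninja.
fn parse(text: &str) -> EvalString {
    let mut parts = Vec::new();
    let mut literal = String::new();
    let mut chars = text.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '$' {
            literal.push(c);
            continue;
        }
        if !literal.is_empty() {
            parts.push(Part::Literal(std::mem::take(&mut literal)));
        }
        let mut name = String::new();
        while let Some(&n) = chars.peek() {
            if n.is_ascii_alphanumeric() || n == '_' {
                name.push(n);
                chars.next();
            } else {
                break;
            }
        }
        parts.push(Part::VarRef(name));
    }
    if !literal.is_empty() {
        parts.push(Part::Literal(literal));
    }
    parts
}

/// Expands `s`, taking each variable from the earliest env that defines it
/// and expanding that value using only the envs that come after it.
fn evaluate(s: &EvalString, envs: &[&Env]) -> String {
    let mut out = String::new();
    for part in s {
        match part {
            Part::Literal(text) => out.push_str(text),
            Part::VarRef(name) => {
                let found = envs
                    .iter()
                    .enumerate()
                    .find_map(|(i, env)| env.get(name).map(|value| (i, value)));
                if let Some((i, value)) = found {
                    out.push_str(&evaluate(value, &envs[i + 1..]));
                }
                // Undefined variables expand to the empty string.
            }
        }
    }
    out
}

/// Mirrors the build file above: returns the output path and the command.
fn nested_lookup_example() -> (String, String) {
    let global: Env = HashMap::from([("var".to_string(), parse("a"))]);
    let build: Env = HashMap::from([("var".to_string(), parse("b$var"))]);
    // The build's output path sees the build bindings first, then globals.
    let out = evaluate(&parse("c$var"), &[&build, &global]);
    let implicit: Env = HashMap::from([("out".to_string(), vec![Part::Literal(out.clone())])]);
    let command = evaluate(
        &parse("echo \"var:'$var' out:'$out'\""),
        &[&implicit, &build, &global],
    );
    (out, command)
}
```

Because each recursive call only sees later scopes, the recursion always terminates, and `var = b$var` in the build block resolves its inner `$var` against the global scope rather than against itself.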
// TODO(#83): this is wrong in that it doesn't include envs.
// This can occur when you have e.g.
// rule foo
// bar = $baz
In Ninja it turns out you can't declare arbitrary vars in a rule block so this example isn't legal.
I tried a construction like
rule print
command = echo "varref is $varref"
depfile = abc
build out: print
varref = $depfile
And Ninja expanded varref to the empty string.
...which I think means this worry was misplaced (?)
I don't really understand this comment (where is x used?), but just from the signature you can tell it was wrong: you can't convert an EvalString to a String without any additional input, or else you shouldn't have used an EvalString in the first place. So I reworked it so Envs return EvalStrings.
Thanks for the comments, and for adding the benchmarking tests. I'll look into the performance regressions. I think the easiest way to make it faster would be to expand the build bindings as soon as they're parsed, but one reason I didn't do that is that the Android multithreaded parser parses chunks of the manifest in parallel, so we could parse a build definition before all the global variables above it are parsed. If we want to support that eventually, we'd need to keep the ability to leave the build's paths and bindings unevaluated.
9dffdcb
to
c13f859
Compare
For my benchmarking I've just been using the "synthetic" build file, I haven't set up an LLVM build. First I started with some straight optimizations:
With these changes it goes from about 90% slower than HEAD~ to about 20% slower, on the "parse synthetic build.ninja" test. I think this can probably mostly be explained due to the fact that I've moved the evaluation of the build's variable bindings from the loader code into I then realized, if we do multithreading, it's essentially always wrong to do any evaluating inside the parser, as the parser is what will be run in the separate threads. So I changed the parser to always return I added a benchmark that runs the loader instead of just the parser. With this benchmark, this cl is only 3% slower than before. (And even that might be noise, I've seen 3% differences when running the same benchmark twice on the same code) So this change ended up turning into a lot of preparation for multithreading, and not just the path evaluation change. Sorry about that. |
c13f859
to
8e61026
Compare
Actually, this still has some issues. Rule bindings do need to be recursive at least.
Regular ninja also has this validation.
8e61026
to
fd10e19
Compare
Well, currently n2 doesn't have recursive rule bindings either, so I guess this is ok for now, and we can address that issue in a followup. I rebased this pr on top of #96, and removed the |
I'm making some changes that affect the performance of the parser and the loader, and would like to benchmark them together. Unfortunately this required making a bunch of things public (#[cfg(test)] doesn't seem to apply for benchmarks), which isn't great.
fd10e19
to
9831750
Compare
I checked in some LLVM build files for your convenience, see |
I wrote down what I believe the rules are for variable scope here, based on a bit of fiddling with running files through Ninja https://github.com/evmar/n2/blob/main/doc/design_notes.md#variable-scope I'm not sure if that's helpful, but I think it at least helped me to try to think it through. |
This fixes an incompatibility with ninja. I've also moved a bunch of variable evaluations out of the parser and into the loader, in preparation for parsing the build file in multiple threads, and then only doing the evaluations after all the chunks of the file have been parsed. Fixes evmar#91 and evmar#39.
9831750
to
d510d1d
Compare
Thanks! I hacked together a benchmark test to use them (locally), and discovered this PR was ~15% slower when loading the llvm-cmake files. I found a fix for this by duplicating the inner loop of
I see you wrote "Lookup for rule variables can refer to $in/$out, build scope, then toplevel". This isn't entirely correct, it should be "$in/$out, other rule variables recursively (with cycle detection), build scope, then toplevel". However currently n2 both before and after this PR doesn't have that behavior. I plan to add it in a follow-up after this PR is merged. This recursive rule binding is useful for depfiles:
I found this pattern in the android codebase. Also it might be worth specifying that while build bindings can refer to toplevel bindings, they can only refer to toplevel bindings that were defined before the build. |
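For reference, the lookup order described above ("$in/$out, other rule variables recursively (with cycle detection), build scope, then toplevel") could be sketched roughly as follows. This is not code from n2 or Ninja: `Vars`, `vars`, `expand_with` and `lookup_rule_var` are hypothetical names, the build and toplevel scopes are collapsed into one already-evaluated `outer` map for brevity, and only the `$name` form is handled.

```rust
use std::collections::HashMap;

type Vars = HashMap<String, String>;

/// Convenience constructor for the maps used in this sketch.
fn vars(pairs: &[(&str, &str)]) -> Vars {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

/// Expands `$name` references in `text`, asking `lookup` for each value.
/// Assumption: no `${name}` form and no `$$` escapes.
fn expand_with(
    text: &str,
    lookup: &mut dyn FnMut(&str) -> Result<String, String>,
) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = text;
    while let Some(pos) = rest.find('$') {
        out.push_str(&rest[..pos]);
        rest = &rest[pos + 1..];
        let end = rest
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .unwrap_or(rest.len());
        out.push_str(&lookup(&rest[..end])?);
        rest = &rest[end..];
    }
    out.push_str(rest);
    Ok(out)
}

/// Resolves `name` in the order: implicit ($in/$out), rule variables
/// (recursively, with cycle detection), then the outer scopes.
fn lookup_rule_var(
    name: &str,
    implicit: &Vars,
    rule: &Vars,
    outer: &Vars,
    stack: &mut Vec<String>,
) -> Result<String, String> {
    if let Some(value) = implicit.get(name) {
        return Ok(value.clone());
    }
    if let Some(raw) = rule.get(name) {
        // A name that is already being expanded means the rule is cyclic.
        if stack.iter().any(|n| n == name) {
            return Err(format!("cycle in rule variables: {} -> {}", stack.join(" -> "), name));
        }
        stack.push(name.to_string());
        let result = expand_with(raw, &mut |n: &str| {
            lookup_rule_var(n, implicit, rule, outer, stack)
        });
        stack.pop();
        return result;
    }
    // Unknown variables expand to the empty string.
    Ok(outer.get(name).cloned().unwrap_or_default())
}
```

With a made-up rule such as `depfile = $out.d` and `command = gcc -MF $depfile -o $out $in`, looking up `command` expands `$depfile` through the rule scope, while a rule containing `a = $b` and `b = $a` is reported as a cycle instead of recursing forever.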
Oh yikes, this seems to have happened after my time working on Ninja. It's probably worth figuring out if there is a simpler specific thing we could do here that is less work.
I'm not opposed to not supporting recursive rule bindings in n2, and instead fixing android to not rely on it, if you'd prefer that. This is probably easier to fix with fewer occurrences than build paths taking into account build bindings. |
My hope is there's some balance between bug-for-bug Ninja compatibility and breaking totally from it. For example, maybe this could be the rule: in any scope, bindings written earlier can be used in lexically later statements. I think that's easy to understand, it's already how toplevel scope works, and it defines away circularity problems in the other scopes. (Selfishly, that also makes evaluation easier.) |
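A rough sketch of what that rule would mean for evaluation, assuming it were adopted (this is an illustration of the proposal, not existing n2 behavior): each binding is expanded the moment it is read, against whatever has been defined so far in the same scope. `expand` and `eval_in_order` are hypothetical names, and `$$` escapes are ignored.

```rust
use std::collections::HashMap;

/// Expands `$name` and `${name}` references using already-evaluated values.
/// Assumption: no `$$` escapes; unknown variables become the empty string.
fn expand(text: &str, scope: &HashMap<String, String>) -> String {
    let mut out = String::new();
    let mut rest = text;
    while let Some(pos) = rest.find('$') {
        out.push_str(&rest[..pos]);
        rest = &rest[pos + 1..];
        let (name, after) = if let Some(braced) = rest.strip_prefix('{') {
            match braced.find('}') {
                Some(end) => (&braced[..end], &braced[end + 1..]),
                None => (braced, ""),
            }
        } else {
            let end = rest
                .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
                .unwrap_or(rest.len());
            (&rest[..end], &rest[end..])
        };
        if let Some(value) = scope.get(name) {
            out.push_str(value);
        }
        rest = after;
    }
    out.push_str(rest);
    out
}

/// Evaluates bindings in lexical order: each value may only refer to
/// bindings written before it, so no cycle can ever form.
fn eval_in_order(bindings: &[(&str, &str)]) -> HashMap<String, String> {
    let mut scope = HashMap::new();
    for (name, value) in bindings {
        let expanded = expand(value, &scope);
        scope.insert(name.to_string(), expanded);
    }
    scope
}
```

Under this rule `var = b$var` after `var = a` yields `ba`, and a reference to a binding that only appears later expands to the empty string, which is what removes the need for cycle detection.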
I took another pass over my attempt at writing down what the current rules are, which mostly meant removing some of the details (which as you correctly pointed out, I had wrong) |
8cafbda
to
d43e8ca
Compare
@evmar Have you had a chance to look at this change?
(Sorry for the delay, ice storm here means I've been on childcare all week.) My overall take is I appreciate the intent and I think the tests are great. I don't love the approach and have an ill-formed idea about a different approach but I can revisit that at some point in the future if I care to. |
My vague ideas (sorry on five hours of sleep) are something like:
Sorry that is so vague, I do not expect you to do it, mostly writing it for my own memory! |
Thanks! I think most of your points about doing evaluation earlier are incompatible with multithreaded parsing. You can't evaluate the toplevel bindings as you read them, because there could be variable references to variables from an earlier chunk of the file that's being parsed in a different thread. Multithreading will likely blow any savings we would get from earlier evaluation out of the water as well. I'll work on a multithreading PR, so you can have a frame of reference for what can be evaluated earlier and what can't. |
Naively it feels like the generator knows a lot more about the file structure than n2 does, and it could do something around splitting files and using subninja to control scopes such that you don't need to do so much speculation about the file's contents... (?) E.g. even splitting a text file with multiple threads means trying to seek around for newlines, I would think. |
Yeah, the Android fork of ninja essentially looks for a non-escaped newline immediately followed by an identifier.

Having the generator split the file is an interesting idea. But wouldn't it still require deferred evaluation of global variables? If you used includes, you'd have to defer expansion of everything after the include, and of the contents of the included file. With subninja it's better, but it still requires deferring evaluation of the subninja's variables until the parent file has been evaluated up to the subninja statement. And it's harder to split a ninja file using subninja because you can't share global variables across them.

You wouldn't want a "linked list" of subninjas, with 1 subninja statement per file at the end of every file, because then you couldn't start parsing a file until you had fully parsed the file before it, essentially making it serial again. A better way would be to have 1 top-level file that just contains include/subninja statements for all the broken-up files. But that only really works with include, because the subninja files would be unable to share variables. You'd need a pretty sophisticated generator to ensure all your variables were available in the correct scopes if you were to use subninja.

It also puts more work on generator authors. Doubly so for Android, as we'd have to support the splitting in both soong and kati.
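As an illustration of the split heuristic mentioned at the start of this comment (a non-escaped newline immediately followed by an identifier), here is a minimal sketch. It is not taken from the Android fork: `split_points` and `split_chunks` are hypothetical names, an "identifier start" is assumed to be an ASCII letter or underscore, and a newline is treated as escaped when it is preceded by an odd number of `$` characters.

```rust
/// Returns byte offsets where a new top-level statement appears to begin.
/// Assumptions: `$` before a newline escapes it (while `$$` is a literal
/// dollar), and a statement starts with an ASCII letter or underscore.
fn split_points(text: &str) -> Vec<usize> {
    let bytes = text.as_bytes();
    let mut points = Vec::new();
    for i in 0..bytes.len() {
        if bytes[i] != b'\n' {
            continue;
        }
        // An odd run of '$' directly before the newline escapes it.
        let dollars = bytes[..i].iter().rev().take_while(|&&b| b == b'$').count();
        let escaped = dollars % 2 == 1;
        let next_is_ident = bytes
            .get(i + 1)
            .map_or(false, |b| b.is_ascii_alphabetic() || *b == b'_');
        if !escaped && next_is_ident {
            points.push(i + 1);
        }
    }
    points
}

/// Cuts the manifest into chunks that each start at a statement boundary.
fn split_chunks(text: &str) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut start = 0;
    for point in split_points(text) {
        chunks.push(&text[start..point]);
        start = point;
    }
    chunks.push(&text[start..]);
    chunks
}
```

This version splits at every boundary it finds; a parallel parser would presumably pick only a few boundaries near evenly spaced offsets. Indented lines (bindings inside a build or rule block) never start a chunk, so each block stays with its header.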
I think I still don't have a good mental picture of how these work. Are they like hundreds of MBs of interdependent globals? 😬
The two big samples I have either don't use them at all:
Or in GN's case, they have a subninja per module that each set some at the top:
where the largest files are still pretty small:
In android's case, we have a lot of variables that are spread throughout one ninja file, and they commonly contain references to earlier variables:
(the However, I haven't done a deep dive into seeing how difficult it would be to make soong/kati output separate ninja files. I'll take a look into that, maybe there's a pattern to the interdependence that can be exploited. I'll open a separate issue for multithreading as well.
If you had one of these .ninja files handy that doesn't contain anything confidential I'd love to add it to https://github.com/evmar/n2/tree/main/tests/snapshot |
This fixes an incompatibility with ninja.
I've also refactored Env to make it clearer and remove the TODO(#83), but I actually think we could do significant further cleanup. Only rules should require EvalStrings; global variables and build bindings can be evaluated as soon as they're read, although maybe that would change with subninjas. Also, rules currently parse their variables as EvalString, but I think that could be changed to EvalString<&'text str> if we hold onto the byte buffers of all the included files until the parsing is done.
Fixes #91 and #39.