Serialize identifiers to ESTree #2521
CodSpeed Performance Report: Merging #2521 will degrade performances by 7.18%.
Shall we try and keep these spec names until something blows up?
Yeah, we can. After some thought, this is really the "easy" part of compatibility. Given that this also requires fixing type codegen, let's tackle other issues before this one. I will work on the import attributes/withClause mismatch tonight.
Just to put in my 2 cents: I think if we're going for ESTree compatibility, we should go for 100% compatibility. It's simpler for users to have that unambiguity.
Once #2409 comes to fruition (granted, that may be a while) it'll be working off the binary representation, so this information won't be lost, only hidden. An AST visitor could expose
But in the AST presented as a JS object, in my opinion it's better to be 100% ESTree compatible. Any conversion/translation required to achieve that is going to be faster if done on the Rust side during serialization than on the JS side.
And one other thing: if we do achieve 100% compat, it'll be easier to import a ton of tests from another library without having to comment out a bunch as "we do it differently".
By the way, I'm not saying we have to do it now, or that you have to do it, Arnaud! I'm just saying that in my view, it should be our eventual end goal.
Yeah, I agree OXC should provide an ESTree-compatible output as a built-in, but this should not block better features, as you suggested.
Thanks for coming back @ArnaudBarre and outlining your plan. Sounds great. Glad you agree that OXC's JS interface should ideally provide a TSESTree-compatible AST. I believe that all transformations required for that should be achievable on the Rust side during serialization. Beyond using the Writing
There's one part of this I don't understand. What's required in the TSESTree -> Prettier tree conversion? Can you give an example please?
I want to identify all the issues with serializing AST nodes that carry more information into nodes that carry less. For example, it occurs to me that we can't deserialize an ESTree AST back to an OXC AST if there is information loss. A possible fix is to add extra information to the serialized node so we can deserialize it back. This is just one example; there may be more, but I'm not sure. I'm all in for 100% ESTree serialization, but we want to sit on the problem for a bit longer so we can identify all the issues beforehand.
I think we can reconstruct the information from context. e.g. for I doubt there'd be many ambiguities. e.g. serde may even deal with it automatically if there's no ambiguity.
This would work for an AST which was originally created by OXC, but if the user alters the AST in JS (e.g. inserts new nodes), this extra metadata wouldn't be present.
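The two ideas above (extra metadata on the serialized node, with reconstruction from context as a fallback) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `oxcKind` field, the `recoverKind` helper, and the small set of parent rules are all hypothetical and are not part of OXC or ESTree.

```typescript
// The three spec-style identifier kinds that ESTree collapses into "Identifier".
type IdentifierKind = "IdentifierName" | "IdentifierReference" | "BindingIdentifier";

// Hypothetical serialized identifier: `oxcKind` is an assumed extra field
// carrying the information that plain ESTree would lose.
interface SerializedIdentifier {
  type: "Identifier";
  name: string;
  oxcKind?: IdentifierKind;
}

// Minimal view of the parent node needed for the fallback rules below.
interface ParentInfo {
  type: string;
  computed?: boolean;
}

// Recover the OXC kind of an identifier found at `parent[key]`.
function recoverKind(id: SerializedIdentifier, parent: ParentInfo, key: string): IdentifierKind {
  // Metadata survived the round trip: trust it.
  if (id.oxcKind !== undefined) return id.oxcKind;
  // Metadata missing (for example, a node inserted by the user in JS):
  // infer the kind from the position in the tree. Only a few cases are covered.
  if (parent.type === "VariableDeclarator" && key === "id") return "BindingIdentifier";
  if (parent.type === "MemberExpression" && key === "property" && parent.computed !== true) {
    return "IdentifierName";
  }
  return "IdentifierReference";
}
```

A node created in JS without `oxcKind` still resolves through the positional rules, which is the case raised in the last comment. A real implementation would need a rule for every position where an identifier can appear.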
On this specific PR... @ArnaudBarre would you like to split out the parts of this PR which are uncontroversial into a separate PR which can get merged before a decision is made about whether to "lower"
#[cfg_attr(feature = "serde", serde(skip))]
pub reference_id: Cell<Option<ReferenceId>>,
Yeah, I don't really know what the optimal way to convert the complex cases is; I think rest element handling is probably the most complex one. If it requires tricks specific to serde that Copilot doesn't know, I may need help, or I'd be happy for someone else to do it. But I think having all the differences laid out as a TS implementation is a better start before diving in (at least for me).
For things specific to Prettier, see this file and a few others in the folder above. This shows that even if almost all parsers are ESTree-like, there are always subtleties, mostly because the TS-only part is less standardized than the pure ESTree part.
For the information "loss" part, I think it is simpler for OXC to have two modes: one output closer to OXC's internal representation, so communication can go back and forth, and one fully ESTree-compatible, from which you can't go back to OXC. I don't see a use for something in between, and I think it would be costly to maintain.
@overlookmotel which part is "uncontroversial" for you? For me both result in potential information loss. But I'm happy to discuss skipping nodes and merging type identifiers separately anyway. Another example I've seen is that the span info for the trailing comma is not part of ESTree.
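To make the two-mode idea concrete, here is a minimal sketch on the JS side. It assumes a hypothetical list of OXC-only field names (`trailingComma`, `referenceId`) and a hypothetical `serializeAst` helper; the real set of fields would come out of the audit discussed in this thread. The only library call used is the standard `JSON.stringify` replacer.

```typescript
type OutputMode = "oxc" | "estree";

// Hypothetical names for fields that exist in the OXC output but not in ESTree.
const OXC_ONLY_FIELDS: readonly string[] = ["trailingComma", "referenceId"];

// "oxc" mode keeps every field so the tree can be sent back to OXC.
// "estree" mode drops the OXC-only fields; that output cannot be round-tripped.
function serializeAst(ast: unknown, mode: OutputMode): string {
  if (mode === "oxc") return JSON.stringify(ast);
  return JSON.stringify(ast, (key: string, value: unknown) =>
    // Returning undefined from a replacer omits that property from the output.
    OXC_ONLY_FIELDS.indexOf(key) !== -1 ? undefined : value,
  );
}

// Usage: the same tree serialized in both modes.
const sampleArray = {
  type: "ArrayExpression",
  elements: [],
  trailingComma: { start: 3, end: 4 },
};
const lossless = serializeAst(sampleArray, "oxc");
const estreeOnly = serializeAst(sampleArray, "estree");
```

Keeping exactly two outputs, with nothing in between, follows the maintenance argument made in the comment above.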
Ah sorry, my mistake. I see what you mean. I can write a
Concerning 2 modes: I'm not sure how easy this would be to do with serde. I think it assumes there's one canonical way to serialize a type, and I don't think it allows multiple variations (but maybe I'm wrong and there is a workaround to do that). Presumably it's OK to add extra fields (e.g.
As far as conflating
Thanks for the info on Prettier. I'll try and get my head around that as soon as I get a chance.
I'll try and get the TypeScript types compiling and get this merged tomorrow.
What the hell? How on earth did this speed up the parser? As far as I can see, this PR purely changes the
Oh, I get it. CodSpeed is comparing this PR rebased on main against a baseline of 1-week-old main. So other changes made in the past week which sped up the parser are showing as due to this PR. It's annoying that CodSpeed is misleading in this way sometimes. Should we alter the benchmark CI job for PRs so it runs benchmarks on the actual HEAD commit of the PR, rather than the PR auto-rebased on main?
Re #2463
Do you want to go in this direction?
It can also be seen as a loss for people building other things on top later on (both for merging them and for skipping reference_id).
I'm also quite OK with keeping the AST as close as possible to ESTree without losing information (aligned names for 1-1 nodes, camelCase names, string enums, ...) and having a small wrapper to convert to ESTree.
This wrapper could be written first in TS and then in Rust. Given that Prettier needs some very custom AST transformations, a small wrapper will be needed in that case anyway, so I'm fine with saying the current AST for identifiers is fine as is.
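As a rough idea of what such a TS wrapper could look like for identifiers only: the sketch below walks a JSON-like tree and renames the spec-style identifier nodes to ESTree's `Identifier`. The node names `IdentifierName`, `IdentifierReference` and `BindingIdentifier` and the `start`/`end` fields are assumptions about the serialized shape, and `toEstree` is a hypothetical helper, not an existing OXC API.

```typescript
// Assumed serialized shape of an OXC identifier node (illustrative only).
interface OxcIdentifierNode {
  type: "IdentifierName" | "IdentifierReference" | "BindingIdentifier";
  name: string;
  start: number;
  end: number;
}

// Type guard: is this value one of the identifier kinds to be merged?
function isOxcIdentifier(value: object): value is OxcIdentifierNode {
  const type = (value as { type?: unknown }).type;
  return (
    type === "IdentifierName" ||
    type === "IdentifierReference" ||
    type === "BindingIdentifier"
  );
}

// Recursively copy a tree, emitting ESTree "Identifier" for the kinds above.
// The input is not mutated; the distinction between kinds is lost in the output.
function toEstree(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(toEstree);
  if (typeof value !== "object" || value === null) return value;
  if (isOxcIdentifier(value)) {
    return { type: "Identifier", name: value.name, start: value.start, end: value.end };
  }
  const source = value as Record<string, unknown>;
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(source)) {
    out[key] = toEstree(source[key]);
  }
  return out;
}
```

Because the conversion is one-way, it matches the "fully ESTree-compatible" output described earlier rather than a round-trippable one.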
Also, this
rename = "Identifier"
doesn't seem to be entirely handled by wasm-pack; the generated types look like this: