Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The source generator is completely non incremental #20

Closed
Sergio0694 opened this issue Nov 13, 2023 · 21 comments · Fixed by #22
Closed

The source generator is completely non incremental #20

Sergio0694 opened this issue Nov 13, 2023 · 21 comments · Fixed by #22

Comments

@Sergio0694
Copy link

Just FYI, the incremental pipeline of the source generator is completely broken (as in, it's not incremental at all). This will have awful performance, as source files will be recomputed for every single edit. Additionally, this will unnecessarily root compilation objects, increasing memory use. I think I know what tutorial you probably followed here, but I'm afraid it was just completely incorrect.

The problematic line is this (among others):

IncrementalValueProvider<(Compilation, ImmutableArray<MethodDeclarationSyntax>)> compilationAndMethods
= context.CompilationProvider.Combine(methodDeclarations.Collect());
context.RegisterSourceOutput(
compilationAndMethods,
static (spc, source) => Execute(source.Item1, source.Item2, spc));

You should never have any compilations or symbols in any steps of your incremental pipelines, and especially in output steps.
Those objects are not equatable and not meant to be used this way.


Basically exactly the same issue as k94ll13nn3/AutoConstructor#82.
You can look at k94ll13nn3/AutoConstructor#83 for an example of how to fix this (note: don't expect this to be easy).

@virzak
Copy link
Contributor

virzak commented Nov 25, 2023

@Sergio0694, do you think it is even possible or reasonable to make this SG an incremental one? It pretty much makes a copy of a method every time, so I'd have to cache the entire syntax tree including comments, and perhaps only excluding some whitespace.

I followed this tutorial.

@Sergio0694
Copy link
Author

Sergio0694 commented Nov 25, 2023

That tutorial is unfortunately complete garbage as far as the incremental pipeline is concerned. We've seen dozens and dozens of developers over time following that same tutorial and sadly doing the same exact mistake every time. The pipeline shown there is also completely non incremental, and it's doing several things which are really bad practice and should never ever be done by a source generator. For instance, flowing symbols across incremental pipeline steps, and capturing the compilation object.

"I'd have to cache the entire syntax tree including comment"

I'm not familiar with the exact architecture of your generator, but yeah it's possible you might have to actually cache the entire generated method if there's no way for you to cache things in a better way. It's also possible there's just no way to implement this generator efficiently. Generators are not meant to support every possible scenario, but only a specific subset.

@lsoft
Copy link

lsoft commented Jan 1, 2024

I would love to see this incremental-pipeline fixed too. This generator is very useful when one need to maintain sync and async version of the same code.

@andrewlock
Copy link

That tutorial is unfortunately complete garbage as far as the incremental pipeline is concerned

Wow, harsh words @Sergio0694 😅 For what it's worth, I totally agree, that implementation in the blog post is trash. It was a first iteration using a new API, and the subsequent iteration doesn't have these issues. It makes all these mistakes partly because noone in the community knew these were mistakes at the time. The guidance for using the generator APIs from Microsoft has been, to use your words, "trash", so I don't think it's surprising people make these mistakes 🤷

We've seen dozens and dozens of developers over time following that same tutorial

FWIW, I had completely forgotten that these posts I wrote years ago were wrong - I'm only finding out now because someone else linked me to your comments. It might have been nice for someone to leave a note on the tutorial if it's been causing so many issues, and I could fix them 😉 I'd been planning a post on things to watch out for (like using Compilation and symbols etc); realising that the original series is so out of date now, I can prioritise that.

The pipeline shown there is also completely non incremental, and it's doing several things which are really bad practice and should never ever be done by a source generator. For instance, flowing symbols across incremental pipeline steps, and capturing the compilation object.

If only Microsoft had some way to "analyse" incorrect usages of APIs and "warn" consumers on how to "Fix" their usages 😜

@CyrusNajmabadi
Copy link

FWIW, i did drop a note a couple of years ago: andrewlock/blog-comments#24 (comment) :)

@CyrusNajmabadi
Copy link

If only Microsoft had some way to "analyse" incorrect usages of APIs and "warn" consumers on how to "Fix" their usages 😜

We try. But it's a definite challenge to do this. A whitelist of save things could be too restrictive, and a blacklist of bad things is something we try, but it's often easy to punch through with bad behavior.

@CyrusNajmabadi
Copy link

CyrusNajmabadi commented Jan 14, 2024

One area i think we could help would be an actual test harness to help validate proper/expected incremental behavior. So, an easy way to write a test that says: "i started with this user code and generated this source code. then the user made this edit, and we can see that nothing was regenerated, because the incremental SG terminated at this step".

i actually think a visual presentation of this (like an ascii flow chart of the steps) could be potentially useful. However, it would likely be too long in the horizontal direction, and diffs/updates to the test baseline might be painful.

You can write an incremental test today, but i would certainly not call it trivial. We also are missing an automatic test that does the following:

Start with two compilations (exactly the same, jsut cloned from each other). Run the incremental pipeline against both. Ensure the incremental stage values are all .Equals across the two drivers. This will ensure the use creates their record-models properly. That's a huge source of problems, and ends up making the entire pipeline non-incremental.

--

Basically, it's easy to test that you get the expected outputs of the SG. However, it's much harder to validate correct incremental behavior.

--

Tagging @chsienski

@virzak
Copy link
Contributor

virzak commented Jan 14, 2024

Even though the article by @andrewlock's own admission was trash, I am still very grateful to have come across it at the time when I didn't know much about source generators at all. Without it I'd probably use the non-incremental generator. Looking forward to merging a related PR.

This source generator makes pretty much an exact copy of a method. Am I correct in assuming that this type of SG doesn't benefit much from being incremental?

@lsoft
Copy link

lsoft commented Jan 14, 2024

@virzak

This source generator makes pretty much an exact copy of a method. Am I correct in assuming that this type of SG doesn't benefit much from being incremental?

oh oh, this is very alarming statement! We are planning to enable this ISG in our big codebase and the codebase is going to be bigger. Could you provide additional details? what it means exactly? We are OK if ISG will regenerate sync sibling if programmer changes the body of async sibling, it is expected. will ISG runs for any unrelated press keys? even if key presses performs in the file where ISG is not used at all?

@CyrusNajmabadi
Copy link

Am I correct in assuming that this type of SG doesn't benefit much from being incremental?

You want to design the generator so that it won't end up generating new source for a user edit that has no impact on the final generated source.

presumably in your case you're only generating sibling methods for specific methods, not for all of htem. So this really should be deeply incremental.

@virzak
Copy link
Contributor

virzak commented Jan 14, 2024

@lsoft, I don't think it runs for unrelated methods (though I'll have to double check). As someone who has 69 instances of the attribute in your project, you are the best one to know whether the current implementation slows VS or compilation down. The related PR will be merged regardless of whether there would be a performance gain or not. The only reason it was on the backburner for now, was resolving higher priority issues which actually prevented the usage of SG alltogether.

@andrewlock
Copy link

One area i think we could help would be an actual test harness to help validate proper/expected incremental behavior. So, an easy way to write a test that says: "i started with this user code and generated this source code. then the user made this edit, and we can see that nothing was regenerated, because the incremental SG terminated at this step".

I realise this is all tangential to the problem, but I would love something like this @CyrusNajmabadi 😍

Start with two compilations (exactly the same, jsut cloned from each other). Run the incremental pipeline against both. Ensure the incremental stage values are all .Equals across the two drivers. This will ensure the use creates their record-models properly. That's a huge source of problems, and ends up making the entire pipeline non-incremental.

I had no idea that this was possible (found the original issue now) - just tried it out, and it's great, thanks!

FWIW, I've just done some minimal updates to that blog post—e.g. use a data model properly, don't combine with Compilation, don't return syntax/symbol, mention the improved ForAttributeWithMetadataName etc. I have another blog post I'll publish soon in that series that describes a bunch of these things to look out for...

More back on topic....

I don't think it runs for unrelated methods

AFAICT, this will almost certainly re-generate for every single edit, because you're doing a Combine() with Compilation.

@CyrusNajmabadi
Copy link

W00t. Glad to hear it. This will be super helpful to the community!

@virzak
Copy link
Contributor

virzak commented Jan 17, 2024

Version 1.3.1 implements the source generator properly and the new blog post by @andrewlock was used in process. Thanks everyone!

@andrewlock
Copy link

Sorry to bother you again @CyrusNajmabadi, but I've been playing with this approach:

Start with two compilations (exactly the same, jsut cloned from each other). Run the incremental pipeline against both. Ensure the incremental stage values are all .Equals across the two drivers. This will ensure the use creates their record-models properly. That's a huge source of problems, and ends up making the entire pipeline non-incremental.

It works really well AFAICT except for one scenario, when you add a tracking name to the Collect() call. In that scenario, the Collect() returns ImmutableArray<T>, and as discussed in this issue, the two instances won't be Equal() unless they wrap the same array (which they won't in this case) 🤔

So my question is - does the generator infrastructure compare the contents of ImmutableArray<T>s rather than calling Equals directly? Are they special cased everywhere? Or just in the Collect call?

Also, would really appreciate it if you could take a look at the post in case there's anything I've got wrong 😅

Thanks again!

@Sergio0694
Copy link
Author

"you may find your predicate can be as simple as (_, _) => true"

I would recommend not doing that, as it'd trigger on also actually invalid targets (input symbols are not necessary valid).

"Don't use *Syntax or ISymbol instances in your pipeline"

You're correctly also calling out not to capture Compilation-s below, but I'd recommend also calling out not capturing Diagnostic-s either. Those are sneaky because they might also actually wrap symbols internally without you noticing.

"Consider using RegisterImplementationSourceOutput instead of RegisterSourceOutput"

The first two bullet points are wrong: you should not use RegisterImplementationSourceOutput when implementing partial methods or interceptors. Doing so will potentially break IntelliSense and result in false positive errors, as the compiler might in the future really only generate that code on actual builds and not design time builds. In which case IntelliSense would eg. rightfully complain about missing partial implementations, or quick info might point to the wrong method, etc.

This method should only be used to generate code that should literally never be referenced directly at all.

@andrewlock
Copy link

Thanks @Sergio0694!

I would recommend not doing that, as it'd trigger on also actually invalid targets (input symbols are not necessary valid)

Interesting, so are you saying that if I have an AttributeUsage that can only be applied to enums (for example), but they've applied to a class, ForAttributeWithMetadataName will still call the predicate? I assumed it didn't from other things I had read (though I confess, I haven't actually tried that myself).

I'd recommend also calling out not capturing Diagnostics either

Good call 👍

The first two bullet points are wrong

Thanks - yeah, they were speculative, and with interceptors I wasn't sure the IDE integration was there yet, so good to know 🙂

Cheers!

@lsoft
Copy link

lsoft commented Jan 18, 2024

@Sergio0694 is it safe (from performance and GC-perspectives) to use SyntaxNode in ISG pipeline if I provide a custom equatable comparer?

@CyrusNajmabadi
Copy link

If you use IsIncrementallyEquivalentTo, then possibly yes. Note that you want to understand that function (and it's implementation) very well before depending on it

@lsoft
Copy link

lsoft commented Jan 18, 2024

@CyrusNajmabadi could you add some new info or hiNt - what is IsIncrementallyEquivalentTo? This is a new concept for me I can't found such word in Roslyn codebase.

@CyrusNajmabadi
Copy link

@CyrusNajmabadi could you add some new info or hiNt - what is IsIncrementallyEquivalentTo? This is a new concept for me I can't found such word in Roslyn codebase.

Here you go: https://github.com/dotnet/roslyn/blob/db94f712c1f6e4d64d2c61921f9e60629ecb41a1/src/Compilers/Core/Portable/Syntax/SyntaxNode.cs#L365

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants