Add draft of runtime async ECMA-335 changes #104063

Open
wants to merge 10 commits into
base: main

agocke
Member

@agocke agocke commented Jun 26, 2024

I expect this will change a lot, but wanted to write down the basics.

Contributor

Tagging subscribers to this area: @dotnet/area-meta
See info in area-owners.md if you want to be subscribed.

docs/design/specs/runtime-async.md

Each of the above methods will have semantics analogous to the current AsyncTaskMethodBuilder.AwaitOnCompleted/AwaitUnsafeOnCompleted methods. After calling this method, it can be presumed that the task has completed.

Async methods have the following restrictions:
Member

We should split these restrictions into fundamental (hard/impossible to ever remove them) and non-fundamental ones that just exist to make the initial implementation easier.

Member Author

Are there any truly fundamental restrictions? I think, with enough effort, we could make almost anything work.

Member

I think escaping ref and byref locals belong in the fundamental restriction category. I agree that we can make them work in principle, but it would come with tough performance trade-offs.


Async methods have the following restrictions:
* Usage of the `localloc` instruction is forbidden
* The `ldloca` and `ldarga` instructions are redefined to return managed pointers instead of pointers.
Member

Note that ldloca and ldarga already return managed pointers in the spec. This should rather be a modification around the note we have on transient pointers here: https://github.com/dotnet/runtime/blob/main/docs/design/specs/Ecma-335-Augments.md#transient-pointers

There is a question of what level of behavior we want to specify. We can either specify

> Arguments / local variables of async methods are stored in unmanaged memory with an address that is guaranteed to be static in between suspension points, but that may change across suspension points.

or

> Arguments / local variables of async methods are stored in managed memory.

The latter precludes code like unsafe { <unsafe code taking addresses of locals without any suspension points> } in C#, but Roslyn already warns on that today.

I think in either case we also need to document that values of managed pointers and structs containing managed pointers are not preserved across suspension points (at least for an initial version).

Member Author

I picked the last one because it's the most restrictive. We can relax it if necessary.


_[Note: async methods have the same return type conventions as sync methods. If the async method produces a System.Int32, the return type must be System.Int32.]_

The second async signature is implicit and runtime-generated and is hereafter referred to as the "Task-equivalent" signature. It is generated based on the primary signature. The transformation is as follows:
Member

> The second async signature is implicit and runtime-generated

Not sure if it's obvious, but this means that if you want to use the generated signature you have to emit a call, callvirt, ldftn or ldvirtftn with a MethodRef or MethodSpec token, not a MethodDef, right?

Member Author

That's a good question. I think that's correct, but I have to review the full ECMA-335 spec to make sure.

Member Author

OK, reviewed, you're correct. I don't have time right now to find every spot in the spec that would need to be adjusted here, but the intent is that basically the calling convention for implicit definitions of both sync and async methods would be to use MethodRef just like a VARARG call. I think every spot in the spec that mentions special casing for VARARG should be adjusted to also mention implicit definitions.

agocke and others added 4 commits June 27, 2024 10:02
Co-authored-by: Jan Kotas
Co-authored-by: Hamish Arblaster
Co-authored-by: Aleksey Kliger (λgeek)
Co-authored-by: Fred Silberberg
* Calling another async method. No special instructions need to be provided. If the callee suspends, the caller will suspend as well.
* Using new .NET runtime APIs to "await" an "INotifyCompletion" type. The signatures of these methods shall be:
```C#
// public static async2 Task AwaitAwaiterFromRuntimeAsync<TAwaiter>(TAwaiter awaiter) where TAwaiter : INotifyCompletion
```
Member

async2? Will this temporary keyword for the prototype be part of our ECMA spec?

Member Author

Temporary 😄 Long-term I think we can use “runtime async” and “compiler async”

Member

@jaredpar jaredpar Jul 9, 2024

await GetFinalSyntax("async2").ConfigureAwait(FileNotFound);


Each of the above methods will have semantics analogous to the current AsyncTaskMethodBuilder.AwaitOnCompleted/AwaitUnsafeOnCompleted methods. After calling this method, it can be presumed that the task has completed.
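
As a rough illustration of the "await an `INotifyCompletion`" contract described above, the following sketch uses only APIs that exist today. `AwaiterBridge` and `WhenCompleted` are hypothetical names chosen for this example; this is not the proposed runtime API, and it deliberately ignores exception propagation and execution-context flow.

```C#
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

static class AwaiterBridge
{
    // Hypothetical helper (not a runtime API): returns a Task that completes
    // once the awaiter has invoked the continuation passed to OnCompleted.
    public static Task WhenCompleted<TAwaiter>(TAwaiter awaiter)
        where TAwaiter : INotifyCompletion
    {
        var tcs = new TaskCompletionSource<bool>();
        // The awaiter calls this delegate when the awaited operation finishes.
        awaiter.OnCompleted(() => tcs.SetResult(true));
        return tcs.Task;
    }
}
```

Once the task returned by `WhenCompleted(someTask.GetAwaiter())` has completed, `someTask` itself has completed, which is the property the spec text relies on.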

Only local variables which are "hoisted" may be used across suspension points. That is, only "hoisted" local variables will have their state preserved after returning from a suspension. On methods with the `localsinit` flag set, non-"hoisted" local variables will be initialized to their default value when resuming from suspension. Otherwise, these variables will have an undefined value. To identify "hoisted" local variables, they must have an optional custom modifier to the `System.Runtime.CompilerServices.HoistedLocal` class, which will be a new .NET runtime API. This custom modifier must be the last custom modifier on the variable. It is invalid for by-ref variables, or variables with a by-ref-like type, to be marked hoisted. Hoisted local variables are stored in managed memory and cannot be converted to unmanaged pointers without explicit pinning.
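
To make "used across suspension points" concrete, here is a small sketch in today's compiler-async C#. `HoistingDemo` and `SumAsync` are hypothetical names for illustration. `total` is read after the `await`, so its value has to survive suspension; `scratch` is dead before the `await`, so nothing needs to preserve it.

```C#
using System;
using System.Threading.Tasks;

static class HoistingDemo
{
    public static async Task<int> SumAsync(int a, int b)
    {
        int total = a;        // live across the await: must be preserved
        int scratch = a * 2;  // last use is before the await: need not be preserved
        Console.WriteLine(scratch);

        await Task.Yield();   // suspension point

        return total + b;     // reads the preserved value of 'total'
    }
}
```

Under the draft text, a local playing the role of `total` would carry the `HoistedLocal` modifier, while one like `scratch` would not.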
Member

Suggested change
Only local variables which are "hoisted" may be used across suspension points. That is, only "hoisted" local variables will have their state preserved after returning from a suspension. On methods with the `localsinit` flag set, non-"hoisted" local variables will be initialized to their default value when resuming from suspension. Otherwise, these variables will have an undefined value. To identify "hoisted" local variables, they must have an optional custom modifier to the `System.Runtime.CompilerServices.HoistedLocal` class, which will be a new .NET runtime API. This custom modifier must be the last custom modifier on the variable. It is invalid for by-ref variables, or variables with a by-ref-like type, to be marked hoisted. Hoisted local variables are stored in managed memory and cannot be converted to unmanaged pointers without explicit pinning.
Only local variables which are "hoisted" may be used across suspension points. That is, only "hoisted" local variables will have their state preserved after returning from a suspension. Variables not marked as "hoisted" will have an undefined value after suspension points. To identify "hoisted" local variables, they must have an optional custom modifier to the `System.Runtime.CompilerServices.HoistedLocal` class, which will be a new .NET runtime API. This custom modifier must be the last custom modifier on the variable. It is invalid for by-ref variables, or variables with a by-ref-like type, to be marked hoisted. Hoisted local variables are stored in managed memory and cannot be converted to unmanaged pointers without explicit pinning.

The prototypes don't do anything like this today and I also don't see how an IL generator can benefit from this guarantee when it only applies after resumption, so I think there's no need to specify it.

Member

We should have IL verification rules for this. I think this guarantee is useful for that.

Member

What is the verification rule going to be? I do not see how this guarantee helps when the non-resuming case is specified as undefined.

Member

The starting point would be the existing verification rules from ECMA-335:

> Verification assumes that the CLI zeroes all memory other than the evaluation stack before it is made visible to programs. A conforming implementation of the CLI shall provide this observable behavior. Furthermore, verifiable methods shall have the localsinit bit set, see Partition II (Flags for Method Headers).

We can work on evolving these rules.

We have tried to ignore describing the rules for verifiable (i.e. type-safe and memory-safe) IL for a while as we have evolved .NET. We have learned the hard way recently that this might have been a mistake. We no longer have the clarity about the compiler/runtime contract that we once had, and it leads to confusion.

Member

Moving towards a verification rule like "locals not marked hoisted will either have their value from before the suspension point or zero" does not seem very satisfying to me. Requiring the code generators to zero these locals always is possible, but of course comes with extra complexity (and extra bloat after every suspension point in unoptimized codegen). And sadly validating that the values of unhoisted locals are unused across suspension points would be quite an involved verification rule since it requires computing liveness.

As I have expressed before my opinion here is that we should drop the HoistedLocal annotation entirely. Then there is no concept of "unhoisted local used across a suspension point", and the use just naturally retains its (potentially zero initialized) value from before the suspension point.

Member

OTOH, if we want to do verification that involves byrefs I'm not sure we can avoid the "the verifier needs to compute liveness" problem in any case...

Member

@VSadov VSadov Aug 30, 2024

I also think an explicit HoistedLocal is an unnecessary concept. The JIT will have to infer what is hoisted from usage anyway, and will have to reject invalid use anyway, regardless of decoration.

NOTE: I completely agree with Jan on making the set of rules machine-verifiable. I just do not think HoistedLocal would help.

I think it will work better if we specify what uses of locals are illegal.

Ideally the rules could be enforced by a reasonably straightforward analyzer (i.e. ILVerifier). In some cases we may want to be a bit more conservative than we could be. While the JIT could do complex analysis, it should not drive the rules here. The capability of JITs/interpreters will change with different levels of optimization or as new features are added to the runtime.

Being able to do conformance analysis in one pass would be ideal (if possible).
Conversely - if the rules would ask for complex analysis (i.e. iterating to a fixed point), we should ask if the scenario it enables is really worth it.

Also note that the rules will indirectly inform the rules of high-level languages, and we do not want those to be too complex either. C# users should not have to do liveness analysis in their heads. Current async rules do not require that from the user.
(Even though the C# compiler does liveness analysis for the purpose of optimal async state machine generation, that is an internal detail that has changed a lot over time; for example, it was completely redone and became more precise in Roslyn 1.0.)

Member Author

Let's be clear on what motivated my addition of HoistedLocal -- making it easier to get the same results as the current C# compiler lowering in the runtime implementation, particularly for interpreters.

I have not heard a lot of discussion on how interpreters should handle the case without HoistedLocal. Let's focus on that.


Async MethodDef entries implicitly create two member definitions: one explicit, primary definition, and a second implicit, runtime-generated definition.

The primary, mandatory definition must be present in metadata as a MethodDef. This signature is required to have a `modopt` (optional modifier) as the last custom modifier before the return type. The custom modifier must fit the following requirements:
Member

> modopt (optional modifier)

I got confused between modopt (mentioned here) and modreq (mentioned above, such as line 45). A modreq seems more appropriate as compilers that don't understand this metadata should not attempt to invoke those methods.


The second async signature is implicit and runtime-generated and is hereafter referred to as the "Task-equivalent" signature. It is generated based on the primary signature. The transformation is as follows:
* If the async return type is void, the return type of the Task-equivalent signature is the type of the async custom modifier.
* Otherwise, the Task-equivalent return type is the custom modifier type (either ``Task`1`` or ``ValueTask`1``), substituted with the async return type.
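
The two rules above can be sketched with reflection as follows. `TaskEquivalent.ReturnType` is a hypothetical helper written for illustration, not a runtime API. It assumes the modifier type is passed as the non-generic `Task`/`ValueTask` for the void case and as the open generic `Task<>`/`ValueTask<>` otherwise.

```C#
using System;
using System.Threading.Tasks;

static class TaskEquivalent
{
    // Hypothetical helper: computes the return type of the Task-equivalent
    // signature from the async return type and the async custom modifier type.
    public static Type ReturnType(Type asyncReturnType, Type modifierType)
    {
        // Rule 1: a void async return maps to the modifier type itself.
        if (asyncReturnType == typeof(void))
            return modifierType;

        // Rule 2: otherwise substitute the async return type into the
        // generic modifier type (Task`1 or ValueTask`1).
        return modifierType.MakeGenericType(asyncReturnType);
    }
}
```

For example, `ReturnType(typeof(int), typeof(Task<>))` yields `typeof(Task<int>)`, and `ReturnType(typeof(void), typeof(ValueTask))` yields `typeof(ValueTask)`.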
Member

@jcouv jcouv Sep 19, 2024

It sounds like custom task-like types are out of luck. Can we make that explicit? Is that in the maybe or likely-never category? #Closed

Member Author

Added, although I think this is also specified above as the list of valid custom-modifiers is spelled out precisely.


All async methods effectively have two entry points, or signatures. The first signature is the one present in the above code: a modreq before the return type. The second signature is a "Task-equivalent signature", described in further detail in [I.8.6.1.5 Method signatures].

Async methods have a special calling convention and may not be called directly outside of other async methods. To call an async method from a sync method, callers must use the second "Task-equivalent signature".
Member

This sounds like it would impact aspects of Ref.Emit and invoke. Has Ref.Emit been considered with respect to async2 methods?
