Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check ref safety of arg mixing in collection initializers #76237

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

jjonescz
Copy link
Member

@jjonescz jjonescz commented Dec 3, 2024

Fixes #75802.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged Issues and PRs which have not yet been triaged by a lead label Dec 3, 2024
// So we check arg mixing of `(receiverPlaceholder).Add(element)` calls,
// and make the result "scoped" if any call could cause the elements to escape.

using var _ = new PlaceholderRegion(this, [(colExpr.Placeholder, SafeContext.CallingMethod)]) { ForceRemoveOnDispose = true };
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added ForceRemoveOnDispose = true because otherwise it can happen that the placeholder is added twice causing an assert to fail.

Copy link
Member Author

@jjonescz jjonescz Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might need to be extended to also handle the assert in the Add part, like I'm doing in a similar PR (#76263): b64ee87

But the bigger question is whether this looks like a good approach in general - using PlaceholderRegion temporarily during the check, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like adding a PlaceholderRegion temporarily here works, and it is unfortunate that PlaceholderRegions are typically added during the Visit methods and not removed. I'm curious though: was there a reason it was necessary to add the PlaceholderRegion temporarily in this particular case rather than adding it once (and not removing), in the calling VisitObjectCreationExpressionBase()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was there a reason it was necessary to add the PlaceholderRegion temporarily in this particular case rather than adding it once (and not removing), in the calling VisitObjectCreationExpressionBase()?

Yes. When GetValEscape is called on a collection initializer, I need to decide whether it can be returnable or must be scoped to current block. So I temporarily mark the receiver with CallingMethod scope to see if that is ref safe - but it might not be ref safe (and depending on that, I determine the ValEscape of the collection initializer). Put another way, the receiver scope is just speculative and hence I don't think it should be set permanently in the visitor.

@jjonescz jjonescz marked this pull request as ready for review December 3, 2024 17:49
@jjonescz jjonescz requested a review from a team as a code owner December 3, 2024 17:49
@jaredpar jaredpar requested review from RikkiGibson and cston December 9, 2024 17:43
@jaredpar
Copy link
Member

jaredpar commented Dec 9, 2024

@RikkiGibson, @cston FYI

@jaredpar jaredpar added this to the 17.13 milestone Dec 9, 2024
@RikkiGibson RikkiGibson self-assigned this Dec 10, 2024

[Theory, WorkItem("https://github.com/dotnet/roslyn/issues/75802")]
[InlineData("[System.Diagnostics.CodeAnalysis.UnscopedRef]")]
[InlineData("")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[InlineData("")]

It looks like the error reported when there is no [UnscopedRef] attribute may be unrelated to this change. Instead, it looks like RefSafetyAnalysis.HasLocalScope is not expecting a ref struct to implement IEnumerable, other than in error cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to CollectionExpression_Builder, I was just verifying that references cannot be captured with collection expressions. Haven't realized the error is unexpected. Should we file an issue for that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't realized the error is unexpected. Should we file an issue for that?

Yes, we should file an issue for the error reported when the Add(in int x) parameter is not marked [UnscopedRef].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
if (!scope.IsConvertibleTo(escapeTo))
{
Error(diagnostics, inUnsafeRegion ? ErrorCode.WRN_EscapeVariable : ErrorCode.ERR_EscapeVariable, node, expr.Syntax);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inUnsafeRegion ? ErrorCode.WRN_EscapeVariable

Are we testing the unsafe { } region case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added unsafe test. I'm not sure the WRN_EscapeVariable here is reachable (at least in the tested scenarios, this is used with discarded diagnostics just for the return value), but at least it's consistent with other similar places (and I imagine it might be reachable with more complex nested scenarios).

{
var local = 1;
var r = new R() { local };
return r;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we report a warning for this case?

R M1()
{
var local = 1;
return new R() { local }; // 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens with a collection initializer like new R() { 1 }? I would expect the same error to be reported. Please add a test.

return GetValEscape(colElement.Arguments, scopeOfTheContainingExpression);
// If arg mixing fails when the receiver has calling-method scope (i.e., some arguments could escape into the receiver), make the value scoped.
return
!scopeOfTheContainingExpression.IsConvertibleTo(SafeContext.CallingMethod) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we predicate this only on convertibility to CallingMethod scope? Consider the following:

RS M(ref int param)
{
    int local = 1;
    return new RS(ref param) { local }; // error expected
}

// ref struct RS, Add([UnscopedRef] in int item), etc.

In this case, scopeOfTheContainingExpression is ReturnOnly (right?), and not convertible to CallingMethod. But don't we still need to do a mixing check here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could also imagine scenarios where new RS(...) has some local lifetime, but a collection initializer argument has an even narrower lifetime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check !scopeOfTheContainingExpression.IsConvertibleTo(SafeContext.CallingMethod) is just an optimization - if it's false, it actually means scopeOfTheContainingExpression == SafeContext., and we can return SafeContext.CallingMethod immediately.

I will refactor the code to make that clearer.

In this case, scopeOfTheContainingExpression is ReturnOnly (right?), and not convertible to CallingMethod. But don't we still need to do a mixing check here?

We do a mixing check there. The !scopeOfTheContainingExpression.IsConvertibleTo(SafeContext.CallingMethod) condition is true in that case.

However, I'm not sure how that example is different from currently tested scenarios - you added a ref param to the ref struct's constructor, but that already goes through different code paths, in this PR I've only changed handling of the collection initializer, i.e., the part after the constructor - { local }.

I could also imagine scenarios where new RS(...) has some local lifetime, but a collection initializer argument has an even narrower lifetime.

I don't follow. Here we are computing ValEscape for the expression new R() { ... }, so it does not yet have any lifetime. And we either determine the expression to have ValEscape of scopeOfTheContainingExpression or CallingMethod. I don't think the collection initializer argument can have narrower lifetime than scopeOfTheContainingExpression (that's the narrowest lifetime we are currently at).

var colExpr = (BoundCollectionInitializerExpression)expr;
return GetValEscape(colExpr.Initializers, scopeOfTheContainingExpression);
{
if (scopeOfTheContainingExpression == SafeContext.CallingMethod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scopeOfTheContainingExpression == SafeContext.CallingMethod

When might this be true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually never, will remove, thanks.

// The `new R() { param }` expression could be inferred to have ReturnOnly scope in M1 (then there would be no errors),
// however the ref safety analysis is currently more conservative and only infers either CallingMethod or local scope for the expression.
// It is equivalent to what would happen for `var r = new R(); r.Add(param);` where the scope of `r` is also either CallingMethod or local (if `scoped` keyword is used).
// Similarly `new R(param) { param }` in M2 is equivalent to `var r = new R(param); r.Add(param);` so the scope of `r` is ReturnOnly.
Copy link
Member Author

@jjonescz jjonescz Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a change to make it work as described in this comment. An offline discussion with @RikkiGibson led to me to add this test case and I think this is how it should behave?

We could also keep it simpler (i.e., revert the most recent change) and always infer either CallingMethod or local scope regardless of the new R(...) constructor call. Then M2 in this test would be an error as well.

Or maybe something else entirely?

@cston also for thoughts, thanks

Copy link
Member Author

@jjonescz jjonescz Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked with Rikki offline, I will think about a more robust approach. Basically now we are doing arg mixing check inside GetValEscape which is sort of a hack (we normally never call CheckValEscape inside GetValEscape) but it's simple to implement and probably can be considered good enough for this edge case scenario. The alternative approach would be to extract some sort of GetInvocationEscape utility that would analyze the invocation arguments and tell us what can escape into the receiver (now similar utilities tell us only what escapes into the return value). We could then hopefully use this also for similar bugs like #63306 and it wouldn't feel so hacky.

@jjonescz jjonescz marked this pull request as draft December 12, 2024 19:46
Copy link
Contributor

@RikkiGibson RikkiGibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per offline discussion I think it is worth a spike to see if we can allow the initializer arguments to contribute more precisely to the resulting safe-context of the object-creation. We think the solution to this may generalize to some interpolated string bugs, etc.

// return c;
//
// So we check arg mixing of `(receiverPlaceholder).Add(element)` calls
// where the placeholder has `escapeTo` scope and the call has `escapeFrom` scope.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// where the placeholder has `escapeTo` scope and the call has `escapeFrom` scope.
// where the receiverPlaceholder has `escapeTo` scope and the call has `escapeFrom` scope.

suggest using exactly the same name here to make it just a little bit easier to grasp what this comment is conveying.


ref struct R : IEnumerable
{
public void Add({{attr}} in int x) { }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also have tests with signatures like void Add(R r), void Add(scoped R r), void Add(in R r), void Add([UnscopedRef] in R r).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers Feature - Ref Fields untriaged Issues and PRs which have not yet been triaged by a lead
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Collection initializers can leak refs
4 participants