Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to #35024 - Query could not be translated when using a static ICollection/IList field #35029

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

maumar
Copy link
Contributor

@maumar maumar commented Nov 1, 2024

Problem was that when converting primitive collection to inline query root we were not matching the expression if it was constant wrapped in convert. Fix is to remove convert for the purpose of pattern match for the transformation.

Fixes #35024

@maumar maumar requested a review from a team as a code owner November 1, 2024 05:27
…llection/IList field

Problem was that when converting primitive collection to inline query root we were not matching the expression if it was constant wrapped in convert.
Fix is to remove convert for the purpose of pattern match for the transformation.

Fixes #35024
@ChrisJollyAU
Copy link
Contributor

@maumar That looks fine.

I did though ask myself where the Convert was actually coming in from.

Turns out it gets added as part of the funcletizer

return constantExpression.Type != returnType

return constantExpression.Type != returnType
            ? Convert(constantExpression, returnType)
            : constantExpression;

LH Type is List`1 and RH Type is IList`1

If you want to prevent the convert you could go something like this

return constantExpression.Type != returnType && !constantExpression.Type.GetInterfaces().Contains(returnType)`

@maumar
Copy link
Contributor Author

maumar commented Nov 3, 2024

@ChrisJollyAU I'm bit apprehensive about adding (more) reflection to the funcletizer as it is a perf-sensitive component. We'd also need to cover the case of DeriveType : BaseType (e.g. ObservableCollection), so that's some more reflection. Although it would fix the problem at the very source. @roji any thoughts on this?

@ChrisJollyAU
Copy link
Contributor

ChrisJollyAU commented Nov 3, 2024

As a side note just FYI.

The IEnumerable case does also get the Convert added at the same point. Its just that in the QueryableMethodNormalizingExpressionVisitor, in the VistMethodCall function it can get handled. With IEnumerable it calls TryConvertEnumerableToQueryable which strips the convert away. If you have a IList or ICollection (and it is a Contains) it calls TryConvertCollectionContainsToQueryableContains. This does not strip the same type of convert

Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a correct fix for the specific issue, but I think the general behavior of constantizing "init-only" (code) is problematic... The fact that a (static) member is init-only (readonly) doesn't mean that it's immutable; for example, in the example of the test you just wrote, there's nothing preventing the contents StaticIds from being mutated (even if StaticIds itself is readonly). But the way we currently translate, the contents of the list are captured and cached the first time the query is compiled (because they're constantized).

In other words, I thing we have an unrelated bug here; if that bug is fixed (by no longer constantizing init-only fields), then we'd no longer have the Convert node that is also causing the trouble here.

(note that if users really do want to constantize, they can still do that by introducing EF.Constant()).

What do you think?


[ConditionalTheory]
[MemberData(nameof(IsAsyncData))]
public virtual Task Contains_with_static_IList(bool async)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason this is in this test class (there doesn't seem to be anything aggregate about the query).

Maybe PrimitiveCollectionsQueryTestBase is a better place...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a lot of Contains tests in that class, so just keeping them all together. I don't mind moving them all to PCQTB as I do agree it's better place for them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very important obviously... Whatever you prefer

@roji
Copy link
Member

roji commented Nov 5, 2024

I'm bit apprehensive about adding (more) reflection to the funcletizer as it is a perf-sensitive component. We'd also need to cover the case of DeriveType : BaseType (e.g. ObservableCollection), so that's some more reflection. Although it would fix the problem at the very source. @roji any thoughts on this?

Aside from my above comment, @ChrisJollyAU's suggestion makes sense to me - I think simply checking constantExpression.Type.IsAssignableTo(returnType) should be sufficient, and take care of interfaces/subclasses/etc. I think perf here should be OK - the funcletizer is indeed perf-sensitive, but not to the point where we can't allow this kind of check where it's needed (remember we use the LINQ interpreter just below to evaluate parameters, which is much slower). So I'd do this regardless of the removal of init-only constantization (discussed above).

@ChrisJollyAU
Copy link
Contributor

This seems like a correct fix for the specific issue, but I think the general behavior of constantizing "init-only" (code) is problematic... The fact that a (static) member is init-only (readonly) doesn't mean that it's immutable; for example, in the example of the test you just wrote, there's nothing preventing the contents StaticIds from being mutated (even if StaticIds itself is readonly). But the way we currently translate, the contents of the list are captured and cached the first time the query is compiled (because they're constantized).

I did experiment with changing that to be containsCapturedVariable: true. It mostly works with the values as a parameter instead of a constant but there are a couple of interesting tests.

Indexof_with_emptystring

SELECT [c].[CustomerID], [c].[Address], [c].[City], [c].[CompanyName], [c].[ContactName], [c].[ContactTitle], [c].[Country], [c].[Fax], [c].[Phone], [c].[PostalCode], [c].[Region]
FROM [Customers] AS [c]
WHERE CASE
    WHEN [c].[Region] IS NOT NULL THEN 0
END = 0

new sql

@__Empty_0='' (Size = 15)

SELECT [c].[CustomerID], [c].[Address], [c].[City], [c].[CompanyName], [c].[ContactName], [c].[ContactTitle], [c].[Country], [c].[Fax], [c].[Phone], [c].[PostalCode], [c].[Region]
FROM [Customers] AS [c]
WHERE CHARINDEX(@__Empty_0, [c].[Region]) - CASE
    WHEN @__Empty_0 = N'' THEN 0
    ELSE 1
END = 0

Note that we seem to lose the optimization for that depends on a literal empty string constant. string.Empty is static readonly

Contains_on_readonly_enumerable

SELECT [w].[Id], [w].[AmmunitionType], [w].[IsAutomatic], [w].[Name], [w].[OwnerFullName], [w].[SynergyWithId]
FROM [Weapons] AS [w]
WHERE [w].[AmmunitionType] = 1

New sql

@___weaponTypes_0='[1]' (Size = 4000)

SELECT [w].[Id], [w].[AmmunitionType], [w].[IsAutomatic], [w].[Name], [w].[OwnerFullName], [w].[SynergyWithId]
FROM [Weapons] AS [w]
WHERE [w].[AmmunitionType] IN (
    SELECT [w0].[value]
    FROM OPENJSON(@___weaponTypes_0) WITH ([value] int '$') AS [w0]
)

Array_cast_to_IEnumerable_Contains_with_constant

SELECT [c].[CustomerID], [c].[Address], [c].[City], [c].[CompanyName], [c].[ContactName], [c].[ContactTitle], [c].[Country], [c].[Fax], [c].[Phone], [c].[PostalCode], [c].[Region]
FROM [Customers] AS [c]
WHERE [c].[CustomerID] IN (N'ALFKI', N'WRONG')

New sql

@___customers_0='["ALFKI","WRONG"]' (Size = 4000)

SELECT [c].[CustomerID], [c].[Address], [c].[City], [c].[CompanyName], [c].[ContactName], [c].[ContactTitle], [c].[Country], [c].[Fax], [c].[Phone], [c].[PostalCode], [c].[Region]
FROM [Customers] AS [c]
WHERE [c].[CustomerID] IN (
    SELECT [c0].[value]
    FROM OPENJSON(@___customers_0) WITH ([value] nchar(5) '$') AS [c0]
)

Specially in the above cases it probably makes more sense to constantize.

It also breaks SpatialSqlServerTest.Translators_handle_static_members with Microsoft.Data.SqlClient.SqlException : The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 1 ("@__Empty_0"): The supplied value is not a valid instance of data type geography. Check the source data for invalid values. An example of an invalid value is data of numeric type with scale greater than precision.

@roji
Copy link
Member

roji commented Nov 5, 2024

@ChrisJollyAU thanks for taking a look; it definitely makes sense that some optimizations would be lost due to moving from constantization to parameterization. The problem here is that we can't reliably know when the type in question is fully immutable, which is when it's safe to constantize. We could have a hard-coded list of the common immutable types (string, int, DateTime...) and constantize for those (when they're init-only of course), but constantizing for a type list List<T> (as in the example above) simply seems like a bug...

@ChrisJollyAU
Copy link
Contributor

Off the top of my head, I think the only ones with the problem are those like List, Collection, Queue, HashSet, Dictionary i.e. those that implement IList and ICollection.

But definitely for your basic types that you know are completely immutable (like a readonly string or int) my thoughts are towards keeping the constantizing

but constantizing for a type list List<T> (as in the example above) simply seems like a bug

Is that because of the caching or something like that (because its not in the actual sql thats generated as that works)

@roji
Copy link
Member

roji commented Nov 5, 2024

But definitely for your basic types that you know are completely immutable (like a readonly string or int) my thoughts are towards keeping the constantizing

Yeah, agree.

but constantizing for a type list List (as in the example above) simply seems like a bug

Is that because of the caching or something like that (because its not in the actual sql thats generated as that works)

Apologies - I thought a little bit about this, and I was wrong above.

The problem with constantization of a mutable thing (e.g. List<T>), is that if that list is then actually mutated, we get a cache miss - the funcletizer outputs a different tree. This means that the query gets fully recompiled every time the mutable thing has a different actual value: the recompilation is quite expensive, going through the entire query pipeline. We're basically missing the advantage of having parameters in the first place: that the same query compilation artifacts can be reused without recompilation even though the (parameterized) values change across executions.

So I don't think it's quite a bug, more like a problematic performance problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Query could not be translated when using a static ICollection/IList field
3 participants