Why DefaultIfEmpty inside double SelectMany is wrong working? #33343
Comments
@KvaKoZyaBBrr can you please submit a runnable sample, as a console program I can actually run and see the results? Your query code there doesn't seem to compile. |
Of course. |
Thanks for the repro. I've removed the irrelevant parts of the query (that's always a good idea); here's a distilled comparison with and without DefaultIfEmpty, to make it easier to understand what's going on:
var withDefaultIfEmpty = context.TestEntities
.Select(x => x.Id)
.SelectMany(pk => context.TestEntities
.Where(entity => pk == entity.Id)
.SelectMany(
entity => context.TestEntities
.Where(a => a.Name.Contains(entity.Name))
.DefaultIfEmpty()));
Console.WriteLine(withDefaultIfEmpty.ToQueryString());
var withoutDefaultIfEmpty = context.TestEntities
.Select(x => x.Id)
.SelectMany(pk => context.TestEntities
.Where(entity => pk == entity.Id)
.SelectMany(
entity => context.TestEntities
.Where(a => a.Name.Contains(entity.Name))));
Console.WriteLine(withoutDefaultIfEmpty.ToQueryString());
SQLs:
SELECT [s].[Id], [s].[Name], [s].[Parent]
FROM [TestEntities] AS [t]
LEFT JOIN (
SELECT [t2].[Id], [t2].[Name], [t2].[Parent], [t0].[Id] AS [Id0]
FROM [TestEntities] AS [t0]
CROSS APPLY (
SELECT [t1].[Id], [t1].[Name], [t1].[Parent]
FROM [TestEntities] AS [t1]
WHERE [t1].[Name] IS NOT NULL AND [t0].[Name] IS NOT NULL AND (CHARINDEX([t0].[Name], [t1].[Name]) > 0 OR [t0].[Name] LIKE N'')
) AS [t2]
) AS [s] ON [t].[Id] = [s].[Id0]
SELECT [s].[Id], [s].[Name], [s].[Parent]
FROM [TestEntities] AS [t]
INNER JOIN (
SELECT [t2].[Id], [t2].[Name], [t2].[Parent], [t0].[Id] AS [Id0]
FROM [TestEntities] AS [t0]
CROSS APPLY (
SELECT [t1].[Id], [t1].[Name], [t1].[Parent]
FROM [TestEntities] AS [t1]
WHERE [t1].[Name] IS NOT NULL AND [t0].[Name] IS NOT NULL AND (CHARINDEX([t0].[Name], [t1].[Name]) > 0 OR [t0].[Name] LIKE N'')
) AS [t2]
) AS [s] ON [t].[Id] = [s].[Id0]
First, regarding why the outer join is a LEFT JOIN: the fragment inside the outer SelectMany() ends with a DefaultIfEmpty(); this means that for every outer ID, if there's no matching TestEntity, the query should return a null. If the outer query used an INNER JOIN instead of a LEFT JOIN, it would filter out any IDs with no matching TestEntities, which would return the wrong results (remember, EF always tries to produce the same results that a regular LINQ to Objects query would produce).
But more generally, your usage of SelectMany() is slightly odd, and likely inefficient. Taking your original fragmentary query above:
DbSet<MyRecord<long>>().FromSql(select * from unnest({0}), object[] { NpgsqlParameter<MyRecord<long>[]> })
.SelectMany(pk => dataContext.TestItems
.Where(item => pk.Item == item.ParentId)
What you're looking for here seems to be a simple Contains query:
var ids = new[] { 1, 2, 3 };
_ = context.TestEntities.Where(e => ids.Contains(e.Id))...
This should make the EF provider produce better SQL, using the PG |
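The INNER versus LEFT JOIN distinction discussed above can be checked without EF at all. The following is a minimal LINQ to Objects sketch, not code from this issue: the arrays `entityNames` and `queryTexts` are hypothetical stand-ins for the two tables, and the only thing it illustrates is what `DefaultIfEmpty()` does to the collection selector of a single `SelectMany`.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data standing in for the two tables.
string[] entityNames = { "second", "third" };
string[] queryTexts = { "second", "second" };

// Without DefaultIfEmpty: an entity with no matching query contributes
// no rows at all (INNER JOIN semantics).
var inner = entityNames
    .SelectMany(
        name => queryTexts.Where(q => q.Contains(name)),
        (name, q) => (Left: name, Right: q))
    .ToList();

// With DefaultIfEmpty: an entity with no matching query contributes
// exactly one row whose right side is null (LEFT JOIN semantics).
var left = entityNames
    .SelectMany(
        name => queryTexts.Where(q => q.Contains(name)).DefaultIfEmpty(),
        (name, q) => (Left: name, Right: q))
    .ToList();

Console.WriteLine($"inner: {inner.Count}, left: {left.Count}");
```

Here "third" has no matching query text, so it is absent from `inner` and present once, with a null right side, in `left`.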
We have the same issue with nested SelectMany (MSSQL, EF Core 7.0.5).
Why is the inner select using cross apply in the first version? |
TestEF_selectMany.zip |
@roji |
@KvaKoZyaBBrr I haven't had time to reexamine this. If your repro shows the actual problem this time, I'll try to find some time in the next few weeks and will report back. |
@roji can I get feedback about this problem or about my PR for current issue please? |
@KvaKoZyaBBrr I'm sorry I haven't been able to look at this - it's a very busy time and it may take a while longer unfortunately. You'll have to be patient a bit more. |
@roji systems like ours have a lot of data tables on the frontend, powered by SQL views which we would like to rewrite as EF Core queries. Right now we have to change the original query syntax whenever there is a nested join (LEFT then INNER), because EF translates it wrongly. The problem is the same if there is a LEFT join first, then an INNER join. Please check @KvaKoZyaBBrr's PR if you can. |
@roji is there any update about this problem? We are waiting for this fix. :( |
I've looked at this again and I'm still not seeing a problem with the SQL that EF is producing - I suspect there are some incorrect expectations here. @KvaKoZyaBBrr your 2nd code sample is essentially the same as the 1st. I stripped down your query a bit (please do that yourself when submitting code samples, removing anything non-essential):
var selectMany = context.TestEntities
.Select(x => x.Id)
.SelectMany(pk => context.TestEntities
.Where(entity => pk == entity.Parent)
.SelectMany(
entity => context.TestQueries
.Where(a => a.Query.Contains(entity.Name))
.DefaultIfEmpty(),
(left, right) => new JoinResult<TestEntity, TestQuery>() { Left = left, Right = right }
));
As I wrote above, the DefaultIfEmpty() means that a default/empty result needs to be returned for each outer TestEntity; that is why there's a LEFT JOIN and not an INNER JOIN - an INNER JOIN would filter out rows without matches.
I unfortunately can't start explaining how LINQ operators work or why EF chooses a specific translation in each specific case. I really, really advise trying to run your LINQ queries in-memory - without EF; look at the results you get, and make sure you understand why those results are returned. If you then see EF returning different results, that would likely be a bug in EF, since EF strives to match the .NET in-memory LINQ evaluation. If you see that happening, please submit a code sample clearly showing the same LINQ query behaving differently - i.e. returning different results - in EF and with in-memory evaluation.
Also, I suggest trying to think about your queries in LINQ, rather than thinking about the SQL you want and then trying to figure out how to get that exact SQL generated by EF. If you're comfortable with SQL and want to execute a very specific SQL query, there's nothing wrong with just writing that SQL rather than using LINQ. EF really shines when you think in LINQ and let it translate to SQL.
I'll go ahead and close this issue for now, since I'm not seeing a problem - yet. If you can submit a LINQ query which behaves differently in EF than it does in-memory (as explained above), I'm happy to reopen and investigate that as a bug. |
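To make the "run it in-memory first" advice concrete, here is a small sketch of the nested shape discussed in this thread, evaluated purely with LINQ to Objects. It is an illustration under assumptions: the tuple array `entities` and the `queryTexts` array are hypothetical data, not taken from the reporter's project, and no EF is involved.

```csharp
using System;
using System.Linq;

// Hypothetical data: (Id, Name, Parent) entities and a list of query texts.
var entities = new (int Id, string Name, int? Parent)[]
{
    (1, "first", null),
    (2, "second", 1),
    (3, "third", 1),
};
string[] queryTexts = { "second" };

var rows = entities
    .Select(e => e.Id)
    // Outer SelectMany has no DefaultIfEmpty: ids without children yield nothing.
    .SelectMany(pk => entities
        .Where(child => child.Parent == pk)
        // Inner SelectMany has DefaultIfEmpty: a child without a matching
        // query text still yields one row, with a null Query.
        .SelectMany(
            child => queryTexts.Where(q => q.Contains(child.Name)).DefaultIfEmpty(),
            (child, q) => (Parent: pk, Child: child.Name, Query: q)))
    .ToList();

foreach (var row in rows)
{
    var query = row.Query ?? "(null)";
    Console.WriteLine($"{row.Parent} -> {row.Child}: {query}");
}
```

With this data, ids 2 and 3 have no children and contribute no rows, while child "third" has no matching query text and appears once with a null `Query`. Whatever an EF translation of the same query shape returns can then be compared against this in-memory result.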
@roji ok let's do it again :)
will be translated to
And I get the right results: for each testEntity, I get the testQueries that contain the entity name. Next, I want to apply this to every child of every testEntity (I will use your version of the code)
And this is translated to
And I get
There is an even funnier case if no testQuery is created.
But this is not what I expect, because I only placed the query inside another SelectMany, on top of the existing SelectMany with DefaultIfEmpty. I see that the inner LEFT JOIN disappears and turns into an INNER JOIN; that is wrong, because the inner SelectMany must be able to produce null. I also see that the outer SelectMany, which has no DefaultIfEmpty, is translated to a LEFT JOIN; that is wrong as well, because the outer query does not expect null. Test project attached. |
@KvaKoZyaBBrr have you tried running your LINQ queries against in-memory collections that contain the same data as in the database, as I suggested above? |
It can really be quite hard for a maintainer like me to understand what exactly you're saying with all of the text and screenshots. It's really easiest if you post a simple console program with a LINQ query that runs once in-memory (no EF at all), and once against a database (with EF). If the results differ between these two executions, then you have a clear bug report with a repro. I promise that if you post something like this, I'll reopen and we'll investigate. |
@roji Wow! It really is a very simple way to catch the problem! Nice!
These results are equal!
These are not equal |
The in-memory result is what I expect from EF |
Confirmed, full minimal cleaned-up repro below. @KvaKoZyaBBrr thanks for doing the work here - in the future a console program like the below is the quickest way of reporting an issue.
var actual = await context.TestEntities
.Select(x => x.Id)
.SelectMany(pk => context.TestEntities
.Where(entity => pk == entity.Parent)
.SelectMany(
entity => context.TestQueries
.Where(a => a.ObjectName.Contains(entity.Name))
.DefaultIfEmpty(),
(left, right) => new { Left = left, Right = right }
))
.ToListAsync();
Full minimal repro
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
await using var context = new BlogContext();
await context.Database.EnsureDeletedAsync();
await context.Database.EnsureCreatedAsync();
List<TestEntity> entitiesInMemory =
[
new() { Id = 1, Name = "first" },
new() { Id = 2, Name = "second", Parent = 1 },
new() { Id = 3, Name = "third", Parent = 1 },
new() { Id = 4, Name = "fourth", Parent = 2 }
];
context.TestEntities.AddRange(
new() { Name = "first" },
new() { Name = "second", Parent = 1 },
new() { Name = "third", Parent = 1 },
new() { Name = "fourth", Parent = 2 });
List<TestQuery> queriesInMemory =
[
new() { Id = 1, ObjectName = "first" },
new() { Id = 2, ObjectName = "first" },
new() { Id = 3, ObjectName = "second" },
new() { Id = 4, ObjectName = "second" },
new() { Id = 5, ObjectName = "fourth" },
new() { Id = 6, ObjectName = "fourth" }
];
context.TestQueries.AddRange(
new() { ObjectName = "first" },
new() { ObjectName = "first" },
new() { ObjectName = "second" },
new() { ObjectName = "second" },
new() { ObjectName = "fourth" },
new() { ObjectName = "fourth" });
await context.SaveChangesAsync();
var actual = await context.TestEntities
.Select(x => x.Id)
.SelectMany(pk => context.TestEntities
.Where(entity => pk == entity.Parent)
.SelectMany(
entity => context.TestQueries
.Where(a => a.ObjectName.Contains(entity.Name))
.DefaultIfEmpty(),
(left, right) => new { Left = left, Right = right }
))
.ToListAsync();
var expected = entitiesInMemory
.Select(x => x.Id)
.SelectMany(pk => entitiesInMemory
.Where(entity => pk == entity.Parent)
.SelectMany(
entity => queriesInMemory
.Where(a => a.ObjectName!.Contains(entity.Name!))
.DefaultIfEmpty(),
(left, right) => new { Left = left, Right = right }
))
.ToList();
Console.WriteLine($"Expected: {expected.Count}, actual: {actual.Count}");
public class BlogContext : DbContext
{
public DbSet<TestEntity> TestEntities { get; set; }
public DbSet<TestQuery> TestQueries { get; set; }
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
=> optionsBuilder
.UseSqlServer("Server=localhost;Database=test;User=SA;Password=Abcd5678;Connect Timeout=60;ConnectRetryCount=0;Encrypt=false")
.LogTo(Console.WriteLine, LogLevel.Information)
.EnableSensitiveDataLogging();
}
public class TestEntity
{
public int? Id { get; set; }
public string? Name { get; set; }
public int? Parent { get; set; }
}
public class TestQuery
{
public int? Id { get; set; }
public string? ObjectName { get; set; }
}
@maumar does this ring any bells? Interested in looking into it? |
Thanks again @KvaKoZyaBBrr, we'll take a look! The PR definitely looks interesting. |
I have entities:
When I try to get all children of the requested items that have a linked query, ordered descending, I use the following LINQ:
I expect DefaultIfEmpty to turn the inner SelectMany into a LEFT JOIN inside the join produced by the outer SelectMany, something like:
But I get
How can I change this behaviour?