An incorrect query result, where the distributed query plan seems wrong #7698

duerwuyi opened this issue Oct 5, 2024 · 3 comments
duerwuyi commented Oct 5, 2024

DROP TABLE IF EXISTS t1;
create table t1 ( 
vkey int4 ,
c10 int4 
);
DROP TABLE IF EXISTS t3;
create table t3 ( 
vkey int4 
);


insert into t3 (vkey) values (1);
insert into t1 (vkey,c10) values (4, -70);

I made t1 a distributed table:
SELECT create_distributed_table('t1', 'vkey');

query:

select  t3.vkey
from  (t1  right outer join t3 
    on (t1.c10 = t3.vkey ))
where exists (select  *   from   t3);

The result should be t3.vkey = 1, but the query actually returns nothing. The Citus version is 12.5.
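
For reference, a minimal sketch of how the expected result can be checked on plain PostgreSQL (same schema and data as above, but without calling create_distributed_table):

-- with t1 kept as a regular table, the right outer join preserves the t3 row
-- and the uncorrelated EXISTS is true because t3 is non-empty
select t3.vkey
from (t1 right outer join t3 on (t1.c10 = t3.vkey))
where exists (select * from t3);
-- expected: one row with vkey = 1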

duerwuyi commented Oct 5, 2024

The distributed query plan of this query:

Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=100000 width=4)
  ->  Distributed Subplan 1_1
        ->  Seq Scan on t3  (cost=0.00..35.50 rows=2550 width=4)
  ->  Distributed Subplan 1_2
        ->  Seq Scan on t3  (cost=0.00..35.50 rows=2550 width=4)
  Task Count: 0
  Tasks Shown: All

The task count should not be 0.

@duerwuyi duerwuyi changed the title a Distributed table right outer join a non-empty undistributed table, and return nothing. An incorrect query result which distributed query plan seems wrong Oct 5, 2024
@duerwuyi duerwuyi changed the title An incorrect query result which distributed query plan seems wrong An incorrect query result, where the distributed query plan seems wrong Oct 5, 2024
codeforall added a commit to codeforall/citus that referenced this issue Dec 25, 2024
This PR addresses two reported issues: citusdata#7698 and citusdata#7697.

The Problem:
Both issues involve incorrect query results when a query references both local
and distributed tables, and includes a WHERE EXISTS clause on the local table.

For example:
SELECT ...
WHERE EXISTS (SELECT * FROM local_table);

In such cases, the WHERE EXISTS clause typically generates an InitPlan
or "one-time filter," which determines whether the rest of the plan's output
qualifies for the result. If this InitPlan relies on the contents of a
local table, it must be executed locally on the coordinator. However, the
planner's decisions regarding whether to convert local or distributed tables
into intermediate results fail to account for the references within the InitPlan.
This results in an incorrect query execution plan and, subsequently, incorrect data.

The Fix:
This PR ensures that when the standard planner (standard_planner) generates an
InitPlan in the PlannedStmt, we check the executor parameters (PARAM nodes) in
the join qualifiers for relations referenced by the InitPlan. If such references
exist, distributed table references are converted to intermediate results rather
than local tables. This adjustment ensures that local tables used in the
InitPlan remain intact and behave as expected.

This fix prevents incorrect query results in cases involving mixed local and
distributed tables with WHERE EXISTS clauses and improves the accuracy of
distributed query planning.
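
As a rough illustration (not part of the PR itself), the InitPlan / one-time filter described above can be observed by running EXPLAIN on the reported query shape; the exact plan output depends on the Citus and Postgres versions in use:

EXPLAIN
select t3.vkey
from (t1 right outer join t3 on (t1.c10 = t3.vkey))
where exists (select * from t3);
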
onurctirtir commented Dec 26, 2024

The actual problem is that we directly push down this query, which is wrong:

set client_min_messages to debug3;

select  t3.vkey
from  (t1  right outer join t3 
    on (t1.c10 = t3.vkey ))
where exists (select  *   from   t3);

DEBUG:  no shard pruning constraints on t1 found
DEBUG:  shard count after pruning for t1: 32
DEBUG:  Router planner cannot handle multi-shard select queries
DEBUG:  no shard pruning constraints on t1 found
DEBUG:  shard count after pruning for t1: 32
DEBUG:  assigned task 1 to node localhost:9701
DEBUG:  assigned task 2 to node localhost:9702
DEBUG:  assigned task 3 to node localhost:9701
DEBUG:  assigned task 4 to node localhost:9702
..
DEBUG:  assigned task 32 to node localhost:9702

So, for such a query, where t3 is a reference table (which we call a recurring rel) and t1 is a distributed table, and where the recurring relation is on the outer side of the outer join and the distributed one is on the inner side of the join, we should normally go through recursive planning and should first recursively plan the distributed rel. See this:

	/*
	 * Similarly, logical planner cannot handle outer joins when the outer rel
	 * is recurring, such as "<recurring> LEFT JOIN <distributed>". In that case,
	 * we convert distributed table into a subquery and recursively plan inner
	 * side of the outer join. That way, inner rel gets converted into an intermediate
	 * result and logical planner can handle the new query since it's of the from
	 * "<recurring> LEFT JOIN <recurring>".
	 */
	if (ShouldRecursivelyPlanOuterJoins(context))
	{
		RecursivelyPlanRecurringTupleOuterJoinWalker((Node *) query->jointree,
													 query, context);
	}

Otherwise, i.e., when we push down this query, it returns an incorrect result set because the tuples that don't exist in the shards of the distributed table would appear in each outer join pushed down to the shards. In the past, because pushing down such a query would return such an incorrect result set, we used to throw an error. But as of #6512, we support such queries by recursively planning the distributed table first. In other words, we first convert it into another recurring relation form, which we call an "intermediate result". Although this is not ideal from a performance perspective, it at least allows us to support such queries in an appropriate way. Having said that, this issue doesn't seem specific to local tables; it can be generalized to set-returning functions, reference tables, etc. too.
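
To illustrate, a rough sketch of the query shape after such recursive planning, using a hypothetical subplan id ('1_1' here is only a placeholder); the distributed table t1 is first materialized and then read back via read_intermediate_result(), similar to the debug output further below:

select t3.vkey
from (select intermediate_result.vkey, intermediate_result.c10
      from read_intermediate_result('1_1'::text, 'binary'::citus_copy_format)
             intermediate_result(vkey integer, c10 integer)) t1
  right outer join t3 on (t1.c10 = t3.vkey)
where exists (select * from t3);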

One thing that looks interesting is that when I remove the where clause from this query, we actually do what we're supposed to do:

set client_min_messages to debug2;

select * from (t2 full outer join t1 on(t2.vkey = t1.vkey ));
DEBUG:  Router planner cannot handle multi-shard select queries
**DEBUG:  recursively planning right side of the full join since the other side is a recurring rel**
**DEBUG:  recursively planning distributed relation "t1" since it is part of a distributed join node that is outer joined with a recurring rel**
DEBUG:  Wrapping relation "t1" to a subquery
DEBUG:  Router planner cannot handle multi-shard select queries
DEBUG:  generating subplan 10_1 for subquery SELECT vkey FROM public.t1 WHERE true
DEBUG:  Plan 10 query after replacing subqueries and CTEs: SELECT t2.vkey, t1.vkey FROM (public.t2 FULL JOIN (SELECT t1_1.vkey FROM (SELECT intermediate_result.vkey FROM read_intermediate_result('10_1'::text, 'binary'::citus_copy_format) intermediate_result(vkey integer)) t1_1) t1 ON ((t2.vkey OPERATOR(pg_catalog.=) t1.vkey)))
DEBUG:  Creating router plan
┌──────┬──────┐
│ vkey │ vkey │
├──────┼──────┤
│    5 │      │
└──────┴──────┘
(1 row)

In other words, in that case, we first convert the distributed table into an intermediate result and everything goes well. However, when the where clause mentioned above is introduced, ShouldRecursivelyPlanOuterJoins() returns false, and so we don't call into the RecursivelyPlanRecurringTupleOuterJoinWalker() function, which we use to convert distributed tables into intermediate results.
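
For comparison, a hypothetical sketch (assuming the same t2, t1 and t3 tables used above) of the full join with the uncorrelated exists clause added back, which is the shape that makes ShouldRecursivelyPlanOuterJoins() return false:

select * from (t2 full outer join t1 on (t2.vkey = t1.vkey))
where exists (select * from t3);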

So, either standard_planner() or an earlier part of the Citus planner seems to convert the outer join mentioned above into some other form, and that causes us to plan the query incorrectly.

@codeforall and I will debug this a bit to see what's happening here.

(cc: @colm-mchugh)

colm-mchugh commented Jan 6, 2025

@onurctirtir this problem (and both #7697 and #7696) does not occur with Citus 13. On investigation, the root cause is in Postgres 16: pg commit 695f5deb79 prevents the set_join_pathlist_hook from being called if any of the join restrictions is a pseudo-constant. In #7698 on Citus 12, Postgres 16 does not call the join hook because the right outer join fails the consider_join_pushdown check in add_paths_to_joinrel(), so Citus has no info on the join, never sees that the query has an outer join, and ends up producing an incorrect plan. Pg commit 9e9931d re-enables unconditional calling of the set_join_pathlist_hook and is in Postgres 17, so this problem (and #7697 and #7696) does not occur with Citus 13, because Citus has visibility on all the joins and can make the decisions you describe in your post. So it seems to be a somewhat unfortunate case of Citus 12 being impacted by a restriction on the calling of set_join_pathlist_hook in Postgres 16. Getting the pg commit (sha 9e9931d) that removed the restriction into Postgres 16 would have prevented this, but that was not to be.

cc @codeforall: sharing this update in case it's useful to your PR.
