-
Notifications
You must be signed in to change notification settings - Fork 3.3k
Replies: 2 comments · 2 replies
-
0.13 is an old release version. I am not sure this bug has been fixed or not in the new version. |
Beta Was this translation helpful? Give feedback.
All reactions
-
还有一点是,可以把增加 shuffle 和不加 shuffle 声明的 plan 贴出来。先看一下规划层面的 diff |
Beta Was this translation helpful? Give feedback.
All reactions
-
我正尝试在每个涉及到的exec node上打写日志,查看一下。 以下是explain 不加shuffle-------------------------+
| Explain String |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:<slot 30> `t1`.`date_scale` | <slot 31> `t1`.`time_tag` | <slot 32> `t2`.`region_id` | <slot 33> `t2`.`region_name` | <slot 34> `t1`.`cat_id` | <slot 35> sum(`t1`.`arranged_amt_3p`) |
| PARTITION: RANDOM |
| |
| RESULT SINK |
| |
| 7:AGGREGATE (update finalize) |
| | output: sum(`t1`.`arranged_amt_3p`) |
| | group by: `t1`.`date_scale`, `t1`.`time_tag`, if(TupleIsNull(2), NULL, <slot 4> `region_id`), if(TupleIsNull(2), NULL, <slot 5> max(`region_name`)), `t1`.`cat_id` |
| | |
| 6:HASH JOIN |
| | join op: INNER JOIN (BROADCAST) |
| | hash predicates: |
| | colocate: false, reason: Node type not match |
| | equal join conjunct: if(TupleIsNull(2), NULL, <slot 4> `region_id`) = <slot 19> `region_id` |
| | |
| |----9:EXCHANGE |
| | |
| 3:HASH JOIN |
| | join op: LEFT OUTER JOIN (BROADCAST) |
| | hash predicates: |
| | colocate: false, reason: Node type not match |
| | equal join conjunct: `t1`.`customer_bu_id` = <slot 3> `bu_id` |
| | |
| |----8:EXCHANGE |
| | |
| 0:OlapScanNode |
| TABLE: x |
| PREAGGREGATION: OFF. Reason: Aggregate Operator not match: SUM <--> REPLACE |
| PREDICATES: `t1`.`date_scale` = '202107', `t1`.`time_tag` = 2, `t1`.`customer_bu_id` != -1, `t1`.`cat_id` = -1 |
| partitions=1/1 |
| rollup: x |
| tabletRatio=1/10 |
| tabletList=884032 |
| cardinality=5753 |
| avgRowSize=55.481663 |
| numNodes=3 |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 09 |
| UNPARTITIONED |
| |
| 5:AGGREGATE (update finalize) |
| | output: multi_distinct_count(DISTINCT `bu_id`), multi_distinct_count(DISTINCT if(1 = 1, `bu_id`, NULL)) |
| | group by: `region_id` |
| | having: <slot 20> multi_distinct_count(DISTINCT `bu_id`) = <slot 21> multi_distinct_count(DISTINCT if(`hp` = 1, `bu_id`, NULL)) |
| | |
| 4:OlapScanNode |
| TABLE: x |
| PREAGGREGATION: OFF. Reason: conjunct on `region_id` which is StorageEngine value column |
| PREDICATES: NOT `region_name` LIKE '%KA%' |
| partitions=1/1 |
| rollup: x |
| tabletRatio=1/1 |
| tabletList=884017 |
| cardinality=111 |
| avgRowSize=73.03603 |
| numNodes=3 |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 08 |
| UNPARTITIONED |
| |
| 2:AGGREGATE (update finalize) |
| | output: max(`region_name`) |
| | group by: `bu_id`, `region_id` |
| | |
| 1:OlapScanNode |
| TABLE: x |
| PREAGGREGATION: ON |
| partitions=1/1 |
| rollup: x |
| tabletRatio=1/1 |
| tabletList=884017 |
| cardinality=111 |
| avgRowSize=73.03603 |
| numNodes=3 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
shuffle-------------------------+
| Explain String |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:<slot 30> `t1`.`date_scale` | <slot 31> `t1`.`time_tag` | <slot 32> `t2`.`region_id` | <slot 33> `t2`.`region_name` | <slot 34> `t1`.`cat_id` | <slot 35> sum(`t1`.`arranged_amt_3p`) |
| PARTITION: HASH_PARTITIONED: if(TupleIsNull(2), NULL, <slot 4> `region_id`) |
| |
| RESULT SINK |
| |
| 7:AGGREGATE (update finalize) |
| | output: sum(`t1`.`arranged_amt_3p`) |
| | group by: `t1`.`date_scale`, `t1`.`time_tag`, if(TupleIsNull(2), NULL, <slot 4> `region_id`), if(TupleIsNull(2), NULL, <slot 5> max(`region_name`)), `t1`.`cat_id` |
| | |
| 6:HASH JOIN |
| | join op: INNER JOIN (PARTITIONED) |
| | hash predicates: |
| | colocate: false, reason: Has join hint |
| | equal join conjunct: if(TupleIsNull(2), NULL, <slot 4> `region_id`) = <slot 19> `region_id` |
| | |
| |----10:EXCHANGE |
| | |
| 9:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 10 |
| HASH_PARTITIONED: <slot 19> `region_id` |
| |
| 5:AGGREGATE (update finalize) |
| | output: multi_distinct_count(DISTINCT `bu_id`), multi_distinct_count(DISTINCT if(1 = 1, `bu_id`, NULL)) |
| | group by: `region_id` |
| | having: <slot 20> multi_distinct_count(DISTINCT `bu_id`) = <slot 21> multi_distinct_count(DISTINCT if(`hp` = 1, `bu_id`, NULL)) |
| | |
| 4:OlapScanNode |
| TABLE: x |
| PREAGGREGATION: OFF. Reason: conjunct on `region_id` which is StorageEngine value column |
| PREDICATES: NOT `region_name` LIKE '%KA%' |
| partitions=1/1 |
| rollup: x |
| tabletRatio=1/1 |
| tabletList=884017 |
| cardinality=111 |
| avgRowSize=73.03603 |
| numNodes=3 |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 09 |
| HASH_PARTITIONED: if(TupleIsNull(2), NULL, <slot 4> `region_id`) |
| |
| 3:HASH JOIN |
| | join op: LEFT OUTER JOIN (BROADCAST) |
| | hash predicates: |
| | colocate: false, reason: Node type not match |
| | equal join conjunct: `t1`.`customer_bu_id` = <slot 3> `bu_id` |
| | |
| |----8:EXCHANGE |
| | |
| 0:OlapScanNode |
| TABLE: x |
| PREAGGREGATION: OFF. Reason: Aggregate Operator not match: SUM <--> REPLACE |
| PREDICATES: `t1`.`date_scale` = '202107', `t1`.`time_tag` = 2, `t1`.`customer_bu_id` != -1, `t1`.`cat_id` = -1 |
| partitions=1/1 |
| rollup: x |
| tabletRatio=1/10 |
| tabletList=884032 |
| cardinality=5753 |
| avgRowSize=55.481663 |
| numNodes=3 |
| |
| PLAN FRAGMENT 3 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 08 |
| UNPARTITIONED |
| |
| 2:AGGREGATE (update finalize) |
| | output: max(`region_name`) |
| | group by: `bu_id`, `region_id` |
| | |
| 1:OlapScanNode |
| TABLE: x |
| PREAGGREGATION: ON |
| partitions=1/1 |
| rollup: x |
| tabletRatio=1/1 |
| tabletList=884017 |
| cardinality=111 |
| avgRowSize=73.03603 |
| numNodes=3 |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
Beta Was this translation helpful? Give feedback.
All reactions
-
根据这个描述,exchange id = 8 的下层节点,和上层单个节点。在shuffle 和非shuffle 之间并不存在diff。反而区别在于更上层的节点。
|
Beta Was this translation helpful? Give feedback.
-
一、查询SQL
二、查询结果
+------------+----------+-----------+--------------+--------+-----------------------------+
| date_scale | time_tag | region_id | region_name | cat_id | sum(
t1
.arranged_amt_3p
) |+------------+----------+-----------+--------------+--------+-----------------------------+
| 202107 | 2 | 22304 | A | -1 | 106839459.9 |
| 202107 | 2 | 22303 | B | -1 | 131133774.81 |
| 202107 | 2 | 22306 | C | -1 | 66322643.32 |
| 202107 | 2 | 22307 | D | -1 | 77013996.36 |
| 202107 | 2 | 22302 | E | -1 | 56652868.22 |
+------------+----------+-----------+--------------+--------+-----------------------------+
每次的查询结果中,sum列的值都会变。
profile
三、强制shuffle
我们 对 hpr 的表变量 强制shuffle join 结果就稳定了。
profile
四、疑问
doris版本:0.13.11
不加shuffle hint,都是走的broadcast,结果是异常的。
异常能稳定复现,即使我将数据导入到测试进去进行测试,异常依旧稳定复现。
我应该从什么角度去排查这个问题呢?
目前从profile看出,不加shuffle hint 的 profile 在往exchange node id=8 中发送数据的条数貌似被过滤掉了一部分。
加shuffle hint的profile,exchange node id=8中返回的数据条数和接收条数一致,没有被过滤。
不知道原因是否在这个地方。
Beta Was this translation helpful? Give feedback.
All reactions