[BugFix] fix concat expr with multiple args nullsFraction (backport #52683) #52696
Why I'm doing:
The current nullsFraction calculation for concat is incorrect and can produce a negative value. This causes the cardinality of the join node to be estimated as 1.
What I'm doing:
Follow the nullsFraction algorithm used by binaryExpressionCalculate, which is
1 - ((1 - left.getNullsFraction()) * (1 - right.getNullsFraction()))
and fix multiaryExpressionCalculate accordingly.
Fixes #issue
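As a rough illustration of how the binary formula extends to an expression with more than two arguments, the sketch below folds the same product over a list of per-argument null fractions. It is not the StarRocks implementation: the class name `NullsFractionSketch`, the method `combinedNullsFraction`, and the clamping of inputs to [0, 1] are assumptions made for this example, and it assumes the arguments are NULL independently and that the result is NULL when any argument is NULL.

```java
import java.util.List;

// Hypothetical sketch, not the actual StarRocks code.
final class NullsFractionSketch {

    // Generalizes 1 - ((1 - left) * (1 - right)) to N arguments:
    // the result is non-null only when every argument is non-null.
    static double combinedNullsFraction(List<Double> argNullsFractions) {
        double nonNullFraction = 1.0;
        for (double fraction : argNullsFractions) {
            // Assumption: clamp each input so bad statistics cannot push the result outside [0, 1].
            double clamped = Math.max(0.0, Math.min(1.0, fraction));
            nonNullFraction *= (1.0 - clamped);
        }
        return 1.0 - nonNullFraction;
    }

    public static void main(String[] args) {
        // Three arguments, e.g. concat(a, ' - ', b), where the literal is never NULL.
        System.out.println(combinedNullsFraction(List.of(0.1, 0.0, 0.2)));
    }
}
```

Because each factor `1 - fraction` stays within [0, 1], the product does too, so the combined value cannot go negative regardless of how many arguments are passed.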
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check:
This is an automatic backport of pull request #52683 done by [Mergify](https://mergify.com).

Plan fragment in which the join node's cardinality is estimated as 1:

```
8:HASH JOIN
|  join op: LEFT OUTER JOIN (BROADCAST)
|  equal join conjunct: [44: xxx, LARGEINT, true] = [127: ccc, LARGEINT, true]
|  other predicates: concat[(cast([9: CLAIM_STATUS_CD, BIGINT, true] as VARCHAR), ' - ', [126: LOOKUP_DETAIL, VARCHAR, true]); args: VARCHAR; result: VARCHAR; args nullable: true; result nullable: true] = '3 - CLEARED'
|  output columns: 3, 8, 9, 43, 54, 126
|  can local shuffle: false
|  cardinality: 1
|  column statistics:
|  * aaa-->[4.446678308E9, 1.8487815906E10, 0.0, 8.0, 1.0] ESTIMATE
|  * vvv-->[-Infinity, Infinity, 0.0, 1.0000000994337923, 1.0] ESTIMATE
|  * ccc-->[1.0, 42.0, 0.0, 8.0, 1.0] ESTIMATE
|  * www-->[-1.3145022026298428E38, 1.2734555494779917E38, 0.0, 16.0, 1.0] ESTIMATE
|  * qqqq-->[-1.5620653915393402E38, 1.2734555494779917E38, 0.0, 16.0, 1.0] ESTIMATE
|  * rrr-->[-1.7007341574472433E38, 1.7003487503261273E38, 0.0, 16.0, 1.0] ESTIMATE
|  * ccaa-->[-Infinity, Infinity, 0.0, 7.076923076923077, 1.0] ESTIMATE
|  * aafff-->[-1.5620653915393402E38, 1.2734555494779917E38, 0.0, 16.0, 1.0] ESTIMATE
|
|----7:EXCHANGE
|       distribution type: BROADCAST
|       cardinality: 13
|
5:OlapScanNode
     table: table, rollup: table
     preAggregation: on
     partitionsRatio=1/1, tabletsRatio=814/814
     tabletList=30250908,30250912,30250916,30250920,30250924,30250928,30250932,30250936,30250940,30250944 ...
     actualRows=643644364, avgRowSize=65.0
     cardinality: 643644364
     column statistics:
     * afffa-->[4.446678308E9, 1.8487815906E10, 0.0, 8.0, 6.4845024E8] ESTIMATE
     * dsdf-->[-Infinity, Infinity, 0.0, 1.0000000994337923, 7.0] ESTIMATE
     * ccc-->[1.0, 42.0, 0.0, 8.0, 10.0] ESTIMATE
     * fff-->[-1.3145022026298428E38, 1.2734555494779917E38, 0.0, 16.0, 7.0] ESTIMATE
     * aaaa-->[-1.5620653915393402E38, 1.2734555494779917E38, 0.0, 16.0, 10.0] ESTIMATE
     * wwww-->[-1.7007341574472433E38, 1.7003487503261273E38, 0.0, 16.0, 9105.0] ESTIMATE
```