-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recursive CTEs: Stage 3 - add execution support #6
base: matt/feat/recursive-ctes/parser
Are you sure you want to change the base?
Recursive CTEs: Stage 3 - add execution support #6
Conversation
time, | ||
1 as "val" | ||
FROM | ||
(SELECT DISTINCT "time" FROM "beg_account_balance") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test fails. But the following changes cause the test to pass
- removing
time
or1 as "val"
from thesub_cte
projection - removing the reference to
beg_account_balance
in the FROM clause (effectively removing the sub-query from thesub_cte
) - removing the
FULL JOIN "sub_cte" on 1=1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test fails with index out of bounds: the len is 0 but the index is 0
but other incarnations of the error after playing around with the repro were
Internal error: PhysicalExpr Column references column 'time' at index 0 (zero-based) but input schema only has 0 columns: [].
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
[SQL] WITH RECURSIVE "recursive_cte" AS
External error: query failed: DataFusion error: ProjectionPushdown
caused by
Internal error: PhysicalExpr Column references column 'time' at index 0 (zero-based) but input schema only has 0 columns: [].
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
[SQL] WITH RECURSIVE "recursive_cte" AS (
External error: query failed: DataFusion error: ProjectionPushdown
caused by
Internal error: PhysicalExpr Column references column 'LEAD(beg_account_balance.time,Int64(1)) ORDER BY [beg_account_balance.time ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW' at index 1 (zero-based) but input schema only has 1 columns: ["LEAD(beg_account_balance.time,Int64(1)) ORDER BY [beg_account_balance.time ASC NULLS LAST] RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW"].
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
[SQL] WITH RECURSIVE "recursive_cte" AS (
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the columns referenced in the error messages are different from the repro because the repro at the time that I collected those errors was different.
datafusion/common/src/config.rs
Outdated
@@ -282,7 +282,7 @@ config_namespace! { | |||
/// Should DataFusion support recursive CTEs | |||
/// Defaults to false since this feature is a work in progress and may not | |||
/// behave as expected | |||
pub enable_recursive_ctes: bool, default = false | |||
pub enable_recursive_ctes: bool, default = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to set back to false. Need to provide a method to set this at the root config level
98f77ff
to
531d541
Compare
e0ea336
to
5257fe3
Compare
9615dde
to
460561c
Compare
a993efb
to
f67763a
Compare
22661af
to
8d38c92
Compare
06fcbea
to
68c9fe7
Compare
* refactor * generated files * feat * feat * feat * feat * tests * clippy --------- Co-authored-by: Andrew Lamb <[email protected]>
…ion (apache#8839) * add config flag for recursive ctes update docs from script update slt test for doc change * restore testing pin * add sql -> logical plan support * impl cte as work table * move SharedState to continuance * impl WorkTableState wip: readying pr to implement only logical plan fix sql integration test wip: add sql test for logical plan wip: format test assertion * wip: remove uncessary with qualifier method some docs more docs * Add comments to `RecursiveQuery` * Update datfusion-cli Cargo.lock * Fix clippy * better errors and comments * add doc comment with rationale for create_cte_worktable method * wip: tweak --------- Co-authored-by: Andrew Lamb <[email protected]>
…t empty (apache#8914) * Fix aggregate_statistics * Add more test
* Support to_timestamp with chrono formatting apache#5398 * Updated user guide's to_timestamp to include chrono formatting information apache#5398 * Minor comment update. * Small documentation updates for to_timestamp functions. * Cargo fmt and clippy improvements. * Switched to assert and unwrap_err based on feedback * Fixed assert, code compiles and runs as expected now. * Fix fmt (again). * Add additional to_timestamp tests covering usage with tables with and without valid formats. * to_timestamp documentation fixes. * - Changed internal_err! -> exec_err! for unsupported data type errors. - Extracted out to_timestamp_impl method to reduce code duplication as per PR feedback. - Extracted out validate_to_timestamp_data_types to reduce code duplication as per PR feedback. - Added additional tests for argument validation and invalid arguments. - Removed unnecessary shim function 'string_to_timestamp_nanos_with_format_shim' * Resolved merge conflict, updated toStringXXX methods to reflect upstream change * prettier * Fix clippy --------- Co-authored-by: Andrew Lamb <[email protected]>
…value (apache#8912) * Minor: Document third argument of date_bin as optional # Rationale @mhilton noticed this as part of an internal discussion where we were confused about the output values of `date_bin` for timestamps with timezones (as the default argument is for the unix EPOCH in UTC) # Changes: Document the third parameter to `date_bin` better * prettier
Co-authored-by: Andrew Lamb <[email protected]>
68c9fe7
to
b11ca8b
Compare
apache#8829) * support array slice * fix argument * fix typo * support from and to is negative * fix conflict * modify user doc * refactor code * fix clippy * add 1-index test --------- Co-authored-by: Andrew Lamb <[email protected]>
Signed-off-by: Ruihang Xia <[email protected]>
* Fix expr partial ord test * Add comment * Resolve linter error
* Simplify windows builtin functions * add field comments
* Updated documentation concerning to_timestamp scalar function including adding missing link in the datafusion-examples/README file. Resolves apache#8980 * Fix example url. * Applied prettier.
* Add helper function for scalar function * Update datafusion/physical-expr/src/functions.rs Co-authored-by: Andrew Lamb <[email protected]> * Fix * Fix --------- Co-authored-by: Andrew Lamb <[email protected]>
* Fix optimize projections bug * Add new dataframe test
apache#8982) * optimize NOT Expr logic * fix Null type * fix test case * fmt
* Optimize MIN/MAX when relation is empty * Fix clippy
Signed-off-by: tangruilin <[email protected]>
* [task apache#8213]Part of Port tests in select.rs to sqllogictest Signed-off-by: tangruilin <[email protected]> * test large strings as well * Restore parameter tests * make test deterministic --------- Signed-off-by: tangruilin <[email protected]> Co-authored-by: Andrew Lamb <[email protected]>
* Migrate last repartition query * Add reference to issue Co-authored-by: Andrew Lamb <[email protected]> * Fix missing statement --------- Co-authored-by: Andrew Lamb <[email protected]>
* Update to sqlparser `0.42.0` * Update datafusion Cargo.lock
* Make tests deterministic * Add duplicate timestamps
* Add support for PG LIKE operators * Bump sqlparser dep from branch to merge commit
…he#8964) * fix: table alias will process uppercase alias into lowercase * test: uppercase table alias * test: add table alias sqllogictest * test: move tests to sqllogictests * test: update cte.slt * chore: add test descriptions
…pache#8986) * Document parallelism and thread scheduling in the architecture guid * Apply suggestions from code review Co-authored-by: Raphael Taylor-Davies <[email protected]> --------- Co-authored-by: Raphael Taylor-Davies <[email protected]>
* Fix none projections * Update select.slt
…type other than Int64 (apache#9001)
add config flag for recursive ctes update docs from script update slt test for doc change restore testing pin add sql -> logical plan support * impl cte as work table * move SharedState to continuance * impl WorkTableState wip: readying pr to implement only logical plan fix sql integration test wip: add sql test for logical plan wip: format test assertion wip: remove uncessary with qualifier method some docs more docs Add comments to `RecursiveQuery` Update datfusion-cli Cargo.lock Fix clippy better errors and comments add sql -> logical plan support * impl cte as work table * move SharedState to continuance * impl WorkTableState wip: readying pr to implement only logical plan fix sql integration test wip: add sql test for logical plan wip: format test assertion wip: remove uncessary with qualifier method some docs more docs impl execution support add sql -> logical plan support * impl cte as work table * move SharedState to continuance * impl WorkTableState wip: readying pr to implement only logical plan partway through porting over isidentical's work Continuing implementation with fixes and improvements Lint fixes ensure that repartitions are not added immediately after RecursiveExec in the physical-plan add trivial sqllogictest more recursive tests remove test that asserts recursive cte should fail additional cte test wip: remove tokio from physical plan dev deps format cargo tomls fix issue where CTE could not be referenced more than 1 time wip: fixes after rebase but tpcds_physical_q54 keeps overflowing its stack Impl NamedRelation as CteWorkTable * impl cte as work table * move SharedState to continuance * impl WorkTableState * upd * assign work table state * upd * upd fix min repro but still broken on larger test case set config in sql logic tests clean up cte slt tests fixes fix option add group by test case and more test case files wip add window function recursive cte example simplify stream impl for recrusive query stream add explain to trivial test case move setting of recursive ctes to slt file and add test to ensure multiple record batches are produced each iteration remove tokio dep and remove mut lint, comments and remove tokio stream update submodule pin to feat branch that contains csvs update submodule pin to feat branch that contains csvs
e65caee
to
80069f7
Compare
WIP. Adds execution support to recursive CTEs
Todos
Closes #.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?