matt/feat/recursive ctes/config flag #3

Closed
wants to merge 572 commits into from

Conversation

matthewgapp
Owner

Veeupup and others added 30 commits November 28, 2023 10:02
* move array function unit_tests to sqllogictest

Signed-off-by: veeupup <[email protected]>

* add comment for array_expression internal test

---------

Signed-off-by: veeupup <[email protected]>
…che#8351)

* Minor: Improve the document format of JoinHashMap

* sql csv_with_quote_escape

* fix
* Minor: restore DataFrame test

* Move test to a better location

* simplify test
…pache#8354)

These utils manipulate `LogicalPlan`s and `Expr`s and may be useful in
projects that only depend on `datafusion-expr`
* Extract parquet statistics to its own module, add tests

* Update datafusion/core/src/datasource/physical_plan/parquet/statistics.rs

Co-authored-by: Raphael Taylor-Davies <[email protected]>

* rename enum

* Improve API

* Add test for reading struct array statistics

* Add test for column after statistics

* improve tests

* simplify

* clippy

* Update datafusion/core/src/datasource/physical_plan/parquet/statistics.rs

* Update datafusion/core/src/datasource/physical_plan/parquet/statistics.rs

* Add test showing incorrect statistics

* Rework statistics

* Fix clippy

* Update documentation and make it clear the statistics are not publicly accessible

* Add link to upstream arrow ticket

---------

Co-authored-by: Raphael Taylor-Davies <[email protected]>
Co-authored-by: Raphael Taylor-Davies <[email protected]>
* feat:implement sql style 'find_in_set' string function

* format code

* modify test case
* Refactor aggregate function handling

* fix ci

* update comment

* fix ci

* simplify the code

* fix fmt

* fix ci

* fix clippy
* Implement Aliases for ScalarUDF

Signed-off-by: veeupup <[email protected]>

* fix comments

Signed-off-by: veeupup <[email protected]>

---------

Signed-off-by: veeupup <[email protected]>
* support LargeList in array_empty

* update err info
* feat: test queries for to_timestamp(float) WIP

* feat: Float64 input for to_timestamp

* cargo fmt

* clippy

* docs: double input type for to_timestamp

* feat: cast floats to timestamp

* style: cargo fmt

* fix: float64 cast for timestamp nanos only
* Support User Defined Table Function

Signed-off-by: veeupup <[email protected]>

* fix comments

Signed-off-by: veeupup <[email protected]>

* add udtf test

Signed-off-by: veeupup <[email protected]>

* add file header

* Simplify table function example, add some comments

* Simplify exprs

* make clippy happy

* Update datafusion/core/tests/user_defined/user_defined_table_functions.rs

---------

Signed-off-by: veeupup <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
* document timestamp input limits

* fix text

* prettier

* remove doc for nanoseconds

* Update datafusion/physical-expr/src/datetime_expressions.rs

Co-authored-by: Andrew Lamb <[email protected]>

---------

Co-authored-by: Andrew Lamb <[email protected]>
* fix: make ntile work in some corner cases

* fix comments

* minor

* Update datafusion/sqllogictest/test_files/window.slt

Co-authored-by: Mustafa Akur <[email protected]>

---------

Co-authored-by: Mustafa Akur <[email protected]>
Given that group keys inherently have few repeated values, especially
when grouping on a single column, the use of dictionary encoding is
unlikely to yield significant returns
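
The reasoning in the commit message above can be illustrated with a minimal sketch. It assumes a plain string column and uses a hypothetical `dictionary_encode` helper; it is not DataFusion's or arrow's implementation, only a way to see why a column of distinct group keys gains nothing from a dictionary.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not a DataFusion or arrow API): dictionary-encode a
/// column of string keys. Returns the distinct values in first-seen order and
/// one index into that list per input row.
fn dictionary_encode<'a>(keys: &[&'a str]) -> (Vec<&'a str>, Vec<usize>) {
    let mut values: Vec<&'a str> = Vec::new();
    let mut lookup: HashMap<&'a str, usize> = HashMap::new();
    let mut indices = Vec::with_capacity(keys.len());
    for &key in keys {
        let idx = match lookup.get(key) {
            // Value already seen: reuse its dictionary slot.
            Some(&existing) => existing,
            // New value: append it to the dictionary and remember its slot.
            None => {
                let new_idx = values.len();
                values.push(key);
                lookup.insert(key, new_idx);
                new_idx
            }
        };
        indices.push(idx);
    }
    (values, indices)
}

fn main() {
    // Raw input column: values repeat, so the dictionary is smaller than the column.
    let input = ["us", "eu", "us", "us", "eu", "ap"];
    let (values, indices) = dictionary_encode(&input);
    println!("input: {} rows -> {} dictionary values", indices.len(), values.len());

    // Group keys after aggregation: one row per group, so every value is distinct.
    // The dictionary is as large as the column, and the indices are pure overhead.
    let group_keys = ["us", "eu", "ap"];
    let (values, indices) = dictionary_encode(&group_keys);
    println!("group keys: {} rows -> {} dictionary values", indices.len(), values.len());
}
```

In the second call the dictionary holds as many values as there are rows, so the encoded form stores every key once plus an extra index per row.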
* done

Signed-off-by: jayzhan211 <[email protected]>

* add more test

Signed-off-by: jayzhan211 <[email protected]>

* cleanup

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
* Minor: Improve the documentation on `ScalarValue`

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <[email protected]>

* Update datafusion/common/src/scalar.rs

Co-authored-by: Liang-Chi Hsieh <[email protected]>

---------

Co-authored-by: Liang-Chi Hsieh <[email protected]>
* add benchmark

Signed-off-by: jayzhan211 <[email protected]>

* fmt

Signed-off-by: jayzhan211 <[email protected]>

* address clippy

Signed-off-by: jayzhan211 <[email protected]>

* cleanup

Signed-off-by: jayzhan211 <[email protected]>

* fix comment

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
* minor changes

* PipelineStatePropagator tree refactor

* Remove duplications by children_unbounded()

* Remove on-the-fly tree construction

* Minor changes

---------

Co-authored-by: Mustafa Akur <[email protected]>
…8121)

* feat: support LargeList in make_array and array_length

* chore: add tests

* fix: update tests for nested array

* use usize_as

* add new_large_list

* refactor array_length

* add comment

* update test in sqllogictest

* fix ci

* fix macro

* use usize_as

* update comment

* return based on data_type in make_array
Ted-Jiang and others added 29 commits January 3, 2024 17:02
…apache#8737)

* Add primary key support for row_number window function

* Add comments, minor changes

* Add new test

* Review

---------

Co-authored-by: Mehmet Ozan Kabak <[email protected]>
…pache#8721)

* DistinctCountGroupsAccumulator

* test coverage

* clippy warnings

* count distinct for primitive types

* revert hashset to std

* fixed accumulator size estimation
* support LargeList in cardinality
…nforceDistribution` rule (apache#8731)

* Cleanup

* More

* Restore add_roundrobin_on_top

* Restore test files

* More

* Restore

* More

* More

* Make test stable

* For review

* Add test
* Clean internal implementation of WindowUDF

* fix doc
* support largelist in array_to_string

* reduce code duplication
…, `array_append` and `array_prepend` (apache#8636)

* reuse function for string concat

Signed-off-by: jayzhan211 <[email protected]>

* remove casting in string concat

Signed-off-by: jayzhan211 <[email protected]>

* add test

Signed-off-by: jayzhan211 <[email protected]>

* operator to function rewrite

Signed-off-by: jayzhan211 <[email protected]>

* fix explain

Signed-off-by: jayzhan211 <[email protected]>

* add more test

Signed-off-by: jayzhan211 <[email protected]>

* add column cases

Signed-off-by: jayzhan211 <[email protected]>

* cleanup

Signed-off-by: jayzhan211 <[email protected]>

* preserve name

Signed-off-by: jayzhan211 <[email protected]>

* Update datafusion/optimizer/src/analyzer/rewrite_expr.rs

Co-authored-by: Andrew Lamb <[email protected]>

* rename

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
* fix bug

Signed-off-by: jayzhan211 <[email protected]>

* fmt

Signed-off-by: jayzhan211 <[email protected]>

* add rowsort

Signed-off-by: jayzhan211 <[email protected]>

---------

Signed-off-by: jayzhan211 <[email protected]>
…ingPredicate and cp_solver (apache#8749)

* Minor: Improve library docs to mention TreeNode, ExprSimplifier, PruningPredicate and cp_solver

* fix link
* Add logo source files

* add another file
* Add `schema_err!` error macros with optional backtrace
…pache#8740)

* revert eb8aff7 / Materialize dictionaries in group keys

* Update tests

* Update tests
* fix: struct don't push down to TableScan

* add similar to test and apply comment

* remove catch all in outer_columns_helper

* minor

* fix clippy

---------

Co-authored-by: Andrew Lamb <[email protected]>
* Fix error messages in array expressions

* fix fmt
* move tests from  to sqllogictests part1

* Update datafusion/sqllogictest/test_files/expr.slt

Co-authored-by: Andrew Lamb <[email protected]>

* Update datafusion/sqllogictest/test_files/expr.slt

Co-authored-by: Andrew Lamb <[email protected]>

---------

Co-authored-by: Andrew Lamb <[email protected]>
Development

Successfully merging this pull request may close these issues.

non-null sub-field on nullable struct-field has wrong nullity
Parallel NDJSON file reading