35.0.0 (2024-01-20)

Full Changelog

Breaking changes:

  • Minor: make SubqueryAlias::try_new take Arc #8542 (sadboy)
  • Remove ListingTable and FileScanConfig Unbounded (#8540) #8573 (tustvold)
  • Rename ParamValues::{LIST -> List,MAP -> Map} #8611 (kawadakk)
  • Rename expr::window_function::WindowFunction to WindowFunctionDefinition, make structure consistent with ScalarFunction #8382 (edmondop)
  • Implement ScalarUDF in terms of ScalarUDFImpl trait #8713 (alamb)
  • Change ScalarValue::{List, LargeList, FixedSizeList} to take specific types rather than ArrayRef #8562 (rspears74)
  • Remove unused array_expression.rs and SUPPORTED_ARRAY_TYPES #8807 (alamb)
  • Simplify physical expression creation API (not require schema) #8823 (comphead)
  • Determine causal window frames to produce early results. #8842 (mustafasrepo)

Implemented enhancements:

  • feat: implement Unary Expr in substrait #8534 (waynexia)
  • feat: implement Repartition plan in substrait #8526 (waynexia)
  • feat: support largelist in array_slice #8561 (Weijun-H)
  • feat: support LargeList in array_positions #8571 (Weijun-H)
  • feat: support LargeList in array_element #8570 (Weijun-H)
  • feat: support LargeList in array_dims #8592 (Weijun-H)
  • feat: support LargeList in array_remove #8595 (Weijun-H)
  • feat: support inlist in LiteralGuarantee for pruning #8654 (my-vegetable-has-exploded)
  • feat: support 'LargeList' in array_pop_front and array_pop_back #8569 (Weijun-H)
  • feat: support LargeList in array_position #8714 (Weijun-H)
  • feat: support LargeList in array_ndims #8716 (Weijun-H)
  • feat: remove filters with null constants #8700 (asimsedhain)
  • feat: support LargeList in array_repeat #8725 (Weijun-H)
  • feat: native types in DistinctCountAccumulator for primitive types #8721 (korowa)
  • feat: support LargeList in cardinality #8726 (Weijun-H)
  • feat: support largelist in array_to_string #8729 (Weijun-H)
  • feat: Add bloom filter metric to ParquetExec #8772 (my-vegetable-has-exploded)
  • feat: support array_resize #8744 (Weijun-H)
  • feat: add more components to the wasm-pack compatible list #8843 (waynexia)

Fixed bugs:

  • fix: make sure CASE WHEN pick first true branch when WHEN clause is true #8477 (haohuaijin)
  • fix: Antarctica/Vostok tz offset changed in chrono-tz 0.8.5 #8677 (korowa)
  • fix: struct field don't push down to TableScan #8774 (haohuaijin)
  • fix: failed to create ValuesExec with non-nullable schema #8776 (jonahgao)
  • fix: fix markdown table in docs #8812 (tshauck)
  • fix: don't extract common sub expr in CASE WHEN clause #8833 (haohuaijin)

Documentation updates:

  • docs: update udf docs for udtf #8546 (tshauck)
  • Doc: Clarify When Limit is Pushed Down to TableProvider::Scan #8686 (devinjdangelo)
  • Minor: Improve PruningPredicate docstrings #8748 (alamb)
  • Minor: Add documentation about stream cancellation #8747 (alamb)
  • docs: add sudo for install commands #8804 (caicancai)
  • docs: document SessionConfig #8771 (wjones127)
  • Upgrade to object_store 0.9.0 and arrow 50.0.0 #8758 (tustvold)
  • docs: fix wrong pushdown name & a typo #8875 (SteveLauC)
  • docs: Update contributor guide with installation instructions #8876 (caicancai)
  • docs: fix wrong name in sub-crates' README #8889 (SteveLauC)
  • docs: add an example for RecordBatchReceiverStreamBuilder #8888 (SteveLauC)

Merged pull requests:

  • Remove order_bys from AggregateExec state #8537 (mustafasrepo)
  • Fix count(null) and count(distinct null) #8511 (joroKr21)
  • Minor: reduce code duplication in date_bin_impl #8528 (Weijun-H)
  • Add metrics for UnnestExec #8482 (simonvandel)
  • Prepare 34.0.0-rc3 #8549 (andygrove)
  • fix: make sure CASE WHEN pick first true branch when WHEN clause is true #8477 (haohuaijin)
  • Minor: make SubqueryAlias::try_new take Arc #8542 (sadboy)
  • Fallback on null empty value in ExprBoundaries::try_from_column #8501 (razeghi71)
  • Add test for DataFrame::write_table #8531 (devinjdangelo)
  • [MINOR]: Generate empty column at placeholder exec #8553 (mustafasrepo)
  • Minor: Remove now dead SUPPORTED_STRUCT_TYPES #8480 (alamb)
  • [MINOR]: Add getter methods to first and last value #8555 (mustafasrepo)
  • [MINOR]: Some code changes and a new empty batch guard for SHJ #8557 (metesynnada)
  • docs: update udf docs for udtf #8546 (tshauck)
  • feat: implement Unary Expr in substrait #8534 (waynexia)
  • Fix compute_record_batch_statistics wrong with projection #8489 (Asura7969)
  • Minor: Cleanup warning in scalar.rs test #8563 (jayzhan211)
  • Minor: move some invariants out of the loop #8564 (haohuaijin)
  • feat: implement Repartition plan in substrait #8526 (waynexia)
  • Fix sort order aware file group parallelization #8517 (alamb)
  • feat: support largelist in array_slice #8561 (Weijun-H)
  • minor: fix to support scalars #8559 (comphead)
  • refactor: HashJoinStream state machine #8538 (korowa)
  • Remove ListingTable and FileScanConfig Unbounded (#8540) #8573 (tustvold)
  • Update substrait requirement from 0.20.0 to 0.21.0 #8574 (dependabot[bot])
  • [minor]: Fix rank calculation bug when empty order by is seen #8567 (mustafasrepo)
  • Add LiteralGuarantee on columns to extract conditions required for PhysicalExpr expressions to evaluate to true #8437 (alamb)
  • [MINOR]: Parametrize sort-preservation tests to exercise all situations (unbounded/bounded sources and flag behavior) #8575 (mustafasrepo)
  • Minor: Add some comments to scalar_udf example #8576 (alamb)
  • Move Coercion for MakeArray to coerce_arguments_for_signature and introduce another one for ArrayAppend #8317 (jayzhan211)
  • feat: support LargeList in array_positions #8571 (Weijun-H)
  • feat: support LargeList in array_element #8570 (Weijun-H)
  • Increase test coverage for unbounded and bounded cases #8581 (mustafasrepo)
  • Port tests in parquet.rs to sqllogictest #8560 (hiltontj)
  • Minor: avoid a copy in Expr::unalias #8588 (alamb)
  • Minor: support complex expr as the arg in the ApproxPercentileCont function #8580 (liukun4515)
  • Bugfix: Add functional dependency check and aggregate try_new schema #8584 (mustafasrepo)
  • Remove GroupByOrderMode #8593 (ozankabak)
  • Minor: replace not-impl-err in array_expression #8589 (Weijun-H)
  • Substrait insubquery #8363 (tgujar)
  • Minor: port last test from parquet.rs #8587 (alamb)
  • Minor: consolidate map sqllogictest tests #8550 (alamb)
  • feat: support LargeList in array_dims #8592 (Weijun-H)
  • Fix regression in regenerating protobuf source #8603 (andygrove)
  • Remove unbounded_input from FileSinkOptions #8605 (devinjdangelo)
  • Add arrow_err! macros, optional backtrace to ArrowError #8586 (comphead)
  • Add examples of DataFrame::write* methods without S3 dependency #8606 (devinjdangelo)
  • Implement logical plan serde for CopyTo #8618 (andygrove)
  • Fix InListExpr to return the correct number of rows #8601 (alamb)
  • Remove ListingTable single_file option #8604 (devinjdangelo)
  • feat: support LargeList in array_remove #8595 (Weijun-H)
  • Rename ParamValues::{LIST -> List,MAP -> Map} #8611 (kawadakk)
  • Support binary temporal coercion for Date64 and Timestamp types #8616 (Asura7969)
  • Add new configuration item listing_table_ignore_subdirectory #8565 (Asura7969)
  • Optimize the parameter types of ParamValues's methods #8613 (kawadakk)
  • Do not panic on zero placeholders in ParamValues::get_placeholders_with_values #8615 (kawadakk)
  • Fix #8507: Non-null sub-field on nullable struct-field has wrong nullity #8623 (marvinlanhenke)
  • Implement contained API in PruningPredicate #8440 (alamb)
  • Add partial serde support for ParquetWriterOptions #8627 (andygrove)
  • Minor: add arguments length check in array_expressions #8622 (Weijun-H)
  • Minor: improve dataframe functional dependency tests #8630 (alamb)
  • Improve regexp_match performance by avoiding cloning Regex #8631 (viirya)
  • Minor: improve listing_table_ignore_subdirectory config documentation #8634 (alamb)
  • Support Writing Arrow files #8608 (devinjdangelo)
  • Filter pushdown into cross join #8626 (mustafasrepo)
  • [MINOR] Remove duplicate test utility and move one utility function for better organization #8652 (metesynnada)
  • [MINOR]: Add new test for filter pushdown into cross join #8648 (mustafasrepo)
  • Rewrite bloom filters to use contains API #8442 (alamb)
  • Split equivalence code into smaller modules. #8649 (tushushu)
  • Move parquet_schema.rs from sql to parquet tests #8644 (alamb)
  • Fix group by aliased expression in LogicalPlanBuilder::aggregate #8629 (alamb)
  • Refactor array_union and array_intersect functions to one general function #8516 (Weijun-H)
  • Minor: avoid extra clone in datafusion-proto::physical_plan #8650 (ongchi)
  • Minor: name some constant values in arrow writer, parquet writer #8642 (alamb)
  • TreeNode Refactor Part 2 #8653 (berkaysynnada)
  • feat: support inlist in LiteralGuarantee for pruning #8654 (my-vegetable-has-exploded)
  • Streaming CLI support #8651 (berkaysynnada)
  • Add serde support for CSV FileTypeWriterOptions #8641 (andygrove)
  • Add trait based ScalarUDF API #8578 (alamb)
  • Handle ordering of first last aggregation inside aggregator #8662 (mustafasrepo)
  • feat: support 'LargeList' in array_pop_front and array_pop_back #8569 (Weijun-H)
  • chore: rename ceresdb to apache horaedb #8674 (tanruixiang)
  • Minor: clean up code #8671 (Weijun-H)
  • fix: Antarctica/Vostok tz offset changed in chrono-tz 0.8.5 #8677 (korowa)
  • Make the BatchSerializer behind Arc to avoid unnecessary struct creation #8666 (metesynnada)
  • Implement serde for CSV and Parquet FileSinkExec #8646 (andygrove)
  • [pruning] Add shortcut when all units have been pruned #8675 (Ted-Jiang)
  • Change first/last implementation to prevent redundant comparisons when data is already sorted #8678 (mustafasrepo)
  • minor: remove useless conversion #8684 (comphead)
  • refactor: modified JoinHashMap build order for HashJoinStream #8658 (korowa)
  • Start setting up tpch planning benchmarks #8665 (matthewmturner)
  • Doc: Clarify When Limit is Pushed Down to TableProvider::Scan #8686 (devinjdangelo)
  • Closes #8502: Parallel NDJSON file reading #8659 (marvinlanhenke)
  • Improve array_prepend signature for null and empty array #8625 (jayzhan211)
  • Cleanup TreeNode implementations #8672 (viirya)
  • Update sqlparser requirement from 0.40.0 to 0.41.0 #8647 (dependabot[bot])
  • Update scalar functions doc for extract/datepart #8682 (Jefffrey)
  • Remove DescribeTableStmt in parser in favour of existing functionality from sqlparser-rs #8703 (Jefffrey)
  • Simplify NULL [NOT] IN (..) expressions #8691 (asimsedhain)
  • Rename expr::window_function::WindowFunction to WindowFunctionDefinition, make structure consistent with ScalarFunction #8382 (edmondop)
  • Deprecate duplicate function LogicalPlan::with_new_inputs #8707 (viirya)
  • Minor: refactor bloom filter tests to reduce duplication #8435 (alamb)
  • Minor: clean up code based on Clippy #8715 (Weijun-H)
  • Minor: Unbounded Output of AnalyzeExec #8717 (berkaysynnada)
  • feat: support LargeList in array_position #8714 (Weijun-H)
  • feat: support LargeList in array_ndims #8716 (Weijun-H)
  • feat: remove filters with null constants #8700 (asimsedhain)
  • support LargeList in array_prepend and array_append #8679 (Weijun-H)
  • Support for extract(epoch from date) for Date32 and Date64 #8695 (Jefffrey)
  • Implement trait based API for defining WindowUDF #8719 (guojidan)
  • Minor: Introduce utils::hash for StructArray #8552 (jayzhan211)
  • [CI] Improve windows machine CI test time #8730 (comphead)
  • fix guarantees in always_true of PruningPredicate #8732 (my-vegetable-has-exploded)
  • Minor: Avoid memory copy in construct window exprs #8718 (Ted-Jiang)
  • feat: support LargeList in array_repeat #8725 (Weijun-H)
  • Minor: Ctrl+C Termination in CLI #8739 (berkaysynnada)
  • Add support for functional dependency for ROW_NUMBER window function. #8737 (mustafasrepo)
  • Minor: reduce code duplication in PruningPredicate test #8441 (alamb)
  • feat: native types in DistinctCountAccumulator for primitive types #8721 (korowa)
  • [MINOR]: Add a test case for when target partition is 1, no hash repartition is added to the plan. #8757 (mustafasrepo)
  • Minor: Improve PruningPredicate docstrings #8748 (alamb)
  • feat: support LargeList in cardinality #8726 (Weijun-H)
  • Add reproducer for #8738 #8750 (alamb)
  • Minor: Use faster check for column name in schema merge #8765 (matthewmturner)
  • Minor: Add documentation about stream cancellation #8747 (alamb)
  • Move repartition_file_scans out of enable_round_robin check in EnforceDistribution rule #8731 (viirya)
  • Clean internal implementation of WindowUDF #8746 (guojidan)
  • feat: support largelist in array_to_string #8729 (Weijun-H)
  • [MINOR] CLI error handling on streaming use cases #8761 (metesynnada)
  • Convert Binary Operator StringConcat to Function for array_concat, array_append and array_prepend #8636 (jayzhan211)
  • Minor: Fix incorrect indices for hashing struct #8775 (jayzhan211)
  • Minor: Improve library docs to mention TreeNode, ExprSimplifier, PruningPredicate and cp_solver #8749 (alamb)
  • [MINOR] Add logo source files #8762 (andygrove)
  • Add Apache attribution to site footer #8760 (alamb)
  • ci: speed up win64 test #8728 (Jefffrey)
  • Add schema_err! error macros with optional backtrace #8620 (comphead)
  • Fix regression by reverting Materialize dictionaries in group keys #8740 (alamb)
  • fix: struct field don't push down to TableScan #8774 (haohuaijin)
  • Implement ScalarUDF in terms of ScalarUDFImpl trait #8713 (alamb)
  • Minor: Fix error messages in array expressions #8781 (Weijun-H)
  • Move tests from expr.rs to sqllogictests. Part1 #8773 (comphead)
  • Permit running sqllogictest as a rust test in IDEs (+ use clap for sqllogictest parsing, accept (and ignore) rust test harness arguments) #8288 (alamb)
  • Minor: Use standard tree walk in Projection Pushdown #8787 (alamb)
  • Implement trait based API for define AggregateUDF #8733 (guojidan)
  • Minor: Improve DataFusionError documentation #8792 (alamb)
  • fix: failed to create ValuesExec with non-nullable schema #8776 (jonahgao)
  • Update substrait requirement from 0.21.0 to 0.22.1 #8796 (dependabot[bot])
  • Bump follow-redirects from 1.15.3 to 1.15.4 in /datafusion/wasmtest/datafusion-wasm-app #8798 (dependabot[bot])
  • Minor: array_pop_first should be array_pop_front in documentation #8797 (ongchi)
  • feat: Add bloom filter metric to ParquetExec #8772 (my-vegetable-has-exploded)
  • Add note on using larger row group size #8745 (twitu)
  • Change ScalarValue::{List, LargeList, FixedSizeList} to take specific types rather than ArrayRef #8562 (rspears74)
  • fix: fix markdown table in docs #8812 (tshauck)
  • docs: add sudo for install commands #8804 (caicancai)
  • Standardize CompressionTypeVariant encoding in protobuf #8785 (tushushu)
  • Make benefits_from_input_partitioning Default in SHJ #8801 (metesynnada)
  • refactor: standardize exec_from funcs arg order #8809 (tshauck)
  • [Minor] extract const and add doc and more tests for in_list pruning #8815 (Ted-Jiang)
  • [MINOR]: Add size check for aggregate #8813 (mustafasrepo)
  • Minor: chores: Update clippy in pre-commit.sh #8810 (my-vegetable-has-exploded)
  • Cleanup the usage of round-robin repartitioning #8794 (viirya)
  • Implement monotonicity for ScalarUDF #8799 (guojidan)
  • Remove unused array_expression.rs and SUPPORTED_ARRAY_TYPES #8807 (alamb)
  • feat: support array_resize #8744 (Weijun-H)
  • Minor: typo in arrays.slt #8831 (Weijun-H)
  • docs: document SessionConfig #8771 (wjones127)
  • Minor: Improve datafusion-proto documentation #8822 (alamb)
  • [CI] Refactor CI builders #8826 (comphead)
  • Serialize function signature simplifications #8802 (metesynnada)
  • Port tests in group_by.rs to sqllogictest #8834 (hiltontj)
  • Simplify physical expression creation API (not require schema) #8823 (comphead)
  • feat: add more components to the wasm-pack compatible list #8843 (waynexia)
  • Port tests in timestamp.rs to sqllogictest. Part 1 #8818 (caicancai)
  • Upgrade to object_store 0.9.0 and arrow 50.0.0 #8758 (tustvold)
  • Fix ApproxPercentileCont signature #8825 (joroKr21)
  • Minor: Update with_column_rename method doc #8858 (comphead)
  • Minor: Document parquet_metadata function #8852 (alamb)
  • Speedup new_with_metadata by removing sort #8855 (simonvandel)
  • Minor: fix wrong function call #8847 (Weijun-H)
  • Add options of parquet bloom filter and page index in Session config #8869 (Ted-Jiang)
  • Port tests in timestamp.rs to sqllogictest #8859 (caicancai)
  • test: Port order.rs tests to sqllogictest #8857 (simicd)
  • Determine causal window frames to produce early results. #8842 (mustafasrepo)
  • docs: fix wrong pushdown name & a typo #8875 (SteveLauC)
  • fix: don't extract common sub expr in CASE WHEN clause #8833 (haohuaijin)
  • Add "Extended" clickbench queries #8861 (alamb)
  • Change cli to propagate error to exit code #8856 (tshauck)
  • test: Port tests in predicates.rs to sqllogictest #8879 (simicd)
  • docs: Update contributor guide with installation instructions #8876 (caicancai)
  • Minor: add tests for casts between nested List and LargeList #8882 (Weijun-H)
  • Disable Parallel Parquet Writer by Default, Improve Writing Test Coverage #8854 (devinjdangelo)
  • Support for order sensitive NTH_VALUE aggregation, make reverse ARRAY_AGG more efficient #8841 (mustafasrepo)
  • test: Port tests in csv_files.rs to sqllogictest #8885 (simicd)
  • test: Port tests in references.rs to sqllogictest #8877 (simicd)
  • fix bug with to_timestamp and InitCap logical serialization, add roundtrip test between expression and proto #8868 (Weijun-H)
  • Support LargeListArray scalar values and align_array_dimensions #8881 (Weijun-H)
  • refactor: rename FileStream.file_reader to file_opener & update doc #8883 (SteveLauC)
  • docs: fix wrong name in sub-crates' README #8889 (SteveLauC)
  • Recursive CTEs: Stage 1 - add config flag #8828 (matthewgapp)
  • Support array literal with scalar function #8884 (jayzhan211)
  • Bump actions/cache from 3 to 4 #8903 (dependabot[bot])
  • Fix datafusion-cli print output #8895 (alamb)
  • docs: add an example for RecordBatchReceiverStreamBuilder #8888 (SteveLauC)
  • Fix "Projection references non-aggregate values" by updating rebase_expr to use transform_down #8890 (wizardxz)
  • Add serde support for Arrow FileTypeWriterOptions #8850 (tushushu)
  • Improve datafusion-cli print format tests #8896 (alamb)
  • Recursive CTEs: Stage 2 - add support for sql -> logical plan generation #8839 (matthewgapp)
  • Minor: remove null in array-append and array-prepend #8901 (Weijun-H)
  • Add support for FixedSizeList type in arrow_cast, hashing #8344 (Weijun-H)
  • aggregate_statistics should only optimize MIN/MAX when relation is not empty #8914 (viirya)
  • support to_timestamp with optional chrono formats #8886 (Omega359)
  • Minor: Document third argument of date_bin as optional and default value #8912 (alamb)
  • Minor: distinguish parquet row group pruning type in unit test #8921 (Ted-Jiang)