34.0.0 (2023-12-11)

Full Changelog

Breaking changes:

  • Implement DISTINCT ON from Postgres #7981 (gruuya)
  • Encapsulate EquivalenceClass into a struct #8034 (alamb)
  • Make fields of ScalarUDF, AggregateUDF and WindowUDF non pub #8079 (alamb)
  • Implement StreamTable and StreamTableProvider (#7994) #8021 (tustvold)
  • feat: make FixedSizeList scalar also an ArrayRef #8221 (wjones127)
  • Remove FileWriterMode and ListingTableInsertMode (#7994) #8017 (tustvold)
  • Refactor: Unify Expr::ScalarFunction and Expr::ScalarUDF, introduce unresolved functions by name #8258 (2010YOUY01)
  • Refactor aggregate function handling #8358 (Weijun-H)
  • Move PartitionSearchMode into datafusion_physical_plan, rename to InputOrderMode #8364 (alamb)
  • Split EmptyExec into PlaceholderRowExec #8446 (razeghi71)

Implemented enhancements:

  • feat: show statistics in explain verbose #8113 (NGA-TRAN)
  • feat: implement postgres style 'overlay' string function #8117 (Syleechan)
  • feat: fill missing values with NULLs while inserting #8146 (jonahgao)
  • feat: to_array_of_size for ScalarValue::FixedSizeList #8225 (wjones127)
  • feat: implement calcite style 'levenshtein' string function #8168 (Syleechan)
  • feat: roundtrip FixedSizeList Scalar to protobuf #8239 (wjones127)
  • feat: impl the basic string_agg function #8148 (haohuaijin)
  • feat: support simplifying BinaryExpr with arbitrary guarantees in GuaranteeRewriter #8256 (wjones127)
  • feat: support customizing column default values for inserting #8283 (jonahgao)
  • feat: implement sql style 'substr_index' string function #8272 (Syleechan)
  • feat: implement sql style 'find_in_set' string function #8328 (Syleechan)
  • feat: support LargeList in array_empty #8321 (Weijun-H)
  • feat: support LargeList in make_array and array_length #8121 (Weijun-H)
  • feat: ScalarValue from String #8411 (QuenKar)
  • feat: support LargeList for array_has, array_has_all and array_has_any #8322 (Weijun-H)
  • feat: customize column default values for external tables #8415 (jonahgao)
  • feat: Support array_sort(list_sort) #8279 (Asura7969)
  • feat: support InterleaveExecNode in the proto #8460 (liukun4515)
  • feat: improve string statistics display in datafusion-cli parquet_metadata function #8535 (asimsedhain)
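
The 'levenshtein' string function listed above computes the edit distance between two strings. As a rough illustration of that metric only, here is a minimal standalone Rust sketch. It is not DataFusion's implementation, the helper name `levenshtein` is hypothetical, and counting by Unicode scalar values (rather than bytes) is an assumption made for this sketch.

```rust
/// Levenshtein (edit) distance between two strings: the minimum number of
/// single-character insertions, deletions, and substitutions needed to turn
/// `a` into `b`. Illustrative helper only, not part of any DataFusion API.
fn levenshtein(a: &str, b: &str) -> usize {
    // Assumption for this sketch: compare Unicode scalar values, not bytes.
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] is the distance between the first i chars of `a`
    // and the first j chars of `b`; row 0 is simply 0..=b.len().
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, ca) in a.iter().enumerate() {
        let mut curr = Vec::with_capacity(b.len() + 1);
        // Turning the first i + 1 chars of `a` into "" takes i + 1 deletions.
        curr.push(i + 1);
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}
```

Under these assumptions, `levenshtein("kitten", "sitting")` evaluates to 3 (two substitutions and one insertion). Keeping only two rows of the dynamic-programming table keeps memory proportional to the length of the second string.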

Fixed bugs:

  • fix: Timestamp with timezone not considered join on #8150 (ACking-you)
  • fix: wrong result of range function #8313 (smallzhongfeng)
  • fix: make ntile work in some corner cases #8371 (haohuaijin)
  • fix: Changed labeler.yml to latest format #8431 (viirya)
  • fix: Literal in ORDER BY window definition should not be an ordinal referring to relation column #8419 (viirya)
  • fix: ORDER BY window definition should work on null literal #8444 (viirya)
  • fix: RANGE frame for corner cases with empty ORDER BY clause should be treated as constant sort #8445 (viirya)
  • fix: don't unify projections if expr is non-trivial #8454 (haohuaijin)
  • fix: support uppercase when parsing Interval #8478 (QuenKar)
  • fix: incorrectly set preserve_partitioning in SortExec #8485 (haohuaijin)
  • fix: Pull stats in IdentVisitor/GraphvizVisitor only when requested #8514 (vrongmeal)
  • fix: volatile expressions should not be target of common subexpression elimination #8520 (viirya)

Documentation updates:

  • Library Guide: Add Using the DataFrame API #8319 (Veeupup)
  • Minor: Add installation link to README.md #8389 (Weijun-H)
  • Prepare version 34.0.0 #8508 (andygrove)

Merged pull requests:

  • Fix typo in partitioning.rs #8134 (lewiszlw)
  • Implement DISTINCT ON from Postgres #7981 (gruuya)
  • Prepare 33.0.0-rc2 #8144 (andygrove)
  • Avoid concat in array_append #8137 (jayzhan211)
  • Replace macro with function for array_remove #8106 (jayzhan211)
  • Implement array_union #7897 (edmondop)
  • Minor: Document ExecutionPlan::equivalence_properties more thoroughly #8128 (alamb)
  • feat: show statistics in explain verbose #8113 (NGA-TRAN)
  • feat: implement postgres style 'overlay' string function #8117 (Syleechan)
  • Minor: Encapsulate LeftJoinData into a struct (rather than anonymous enum) and add comments #8153 (alamb)
  • Update sqllogictest requirement from 0.18.0 to 0.19.0 #8163 (dependabot[bot])
  • feat: fill missing values with NULLs while inserting #8146 (jonahgao)
  • Introduce return type for aggregate sum #8141 (jayzhan211)
  • implement range/generate_series func #8140 (Veeupup)
  • Encapsulate EquivalenceClass into a struct #8034 (alamb)
  • Revert "Minor: remove unnecessary projection in `single_distinct_to_g… #8176 (NGA-TRAN)
  • Preserve all of the valid orderings during merging. #8169 (mustafasrepo)
  • Make fields of ScalarUDF, AggregateUDF and WindowUDF non pub #8079 (alamb)
  • Fix logical conflicts #8187 (tustvold)
  • Minor: Update JoinHashMap comment example to make it clearer #8154 (alamb)
  • Implement StreamTable and StreamTableProvider (#7994) #8021 (tustvold)
  • [MINOR]: Remove unused Results #8189 (mustafasrepo)
  • Minor: clean up the code based on clippy #8179 (Weijun-H)
  • Minor: simplify filter statistics code #8174 (alamb)
  • Replace macro with function for array_position and array_positions #8170 (jayzhan211)
  • Add Library Guide for User Defined Functions: Window/Aggregate #8171 (Veeupup)
  • Add more stream docs #8192 (tustvold)
  • Implement func array_pop_front #8142 (Veeupup)
  • Moving arrow_files SQL tests to sqllogictest #8217 (edmondop)
  • fix regression in the use of name in ProjectionPushdown #8219 (alamb)
  • [MINOR]: Fix column indices in the planning tests #8191 (mustafasrepo)
  • Remove unnecessary reassignment #8232 (qrilka)
  • Update itertools requirement from 0.11 to 0.12 #8233 (crepererum)
  • Port tests in subqueries.rs to sqllogictest #8231 (PsiACE)
  • feat: make FixedSizeList scalar also an ArrayRef #8221 (wjones127)
  • Add versions to datafusion dependencies #8238 (andygrove)
  • feat: to_array_of_size for ScalarValue::FixedSizeList #8225 (wjones127)
  • feat: implement calcite style 'levenshtein' string function #8168 (Syleechan)
  • feat: roundtrip FixedSizeList Scalar to protobuf #8239 (wjones127)
  • Update prost-build requirement from =0.12.1 to =0.12.2 #8244 (dependabot[bot])
  • Minor: Port tests in displayable.rs to sqllogictest #8246 (Weijun-H)
  • Minor: add with_estimated_selectivity to Precision #8177 (alamb)
  • fix: Timestamp with timezone not considered join on #8150 (ACking-you)
  • Replace macro in array_array to remove duplicate codes #8252 (Veeupup)
  • Port tests in projection.rs to sqllogictest #8240 (PsiACE)
  • Introduce array_except function #8135 (jayzhan211)
  • Port tests in describe.rs to sqllogictest #8242 (Asura7969)
  • Remove FileWriterMode and ListingTableInsertMode (#7994) #8017 (tustvold)
  • Minor: clean up the code based on Clippy #8257 (Weijun-H)
  • Update arrow 49.0.0 and object_store 0.8.0 #8029 (tustvold)
  • feat: impl the basic string_agg function #8148 (haohuaijin)
  • Minor: Make schema of grouping set columns nullable #8248 (markusa380)
  • feat: support simplifying BinaryExpr with arbitrary guarantees in GuaranteeRewriter #8256 (wjones127)
  • Making stream joins extensible: A new Trait implementation for SHJ #8234 (metesynnada)
  • Don't Canonicalize Filesystem Paths in ListingTableUrl / support new external tables for files that do not (yet) exist #8014 (tustvold)
  • Minor: Add sql level test for inserting into non-existent directory #8278 (alamb)
  • Replace array_has/array_has_all/array_has_any macro to remove duplicate code #8263 (Veeupup)
  • Fix bug in field level metadata matching code #8286 (alamb)
  • Refactor Interval Arithmetic Updates #8276 (berkaysynnada)
  • [MINOR]: Remove unnecessary orderings from the final plan #8289 (mustafasrepo)
  • consistent logical & physical NTILE return types #8270 (korowa)
  • make array_union/array_except/array_intersect handle empty/null arrays correctly #8269 (Veeupup)
  • improve file path validation when reading parquet #8267 (Weijun-H)
  • [Benchmarks] Make partitions default to number of cores instead of 2 #8292 (andygrove)
  • Update prost-build requirement from =0.12.2 to =0.12.3 #8298 (dependabot[bot])
  • Fix Display for List #8261 (jayzhan211)
  • feat: support customizing column default values for inserting #8283 (jonahgao)
  • support LargeList for arrow_cast, support ScalarValue::LargeList #8290 (Weijun-H)
  • Minor: remove useless clone based on Clippy #8300 (Weijun-H)
  • Calculate ordering equivalence for expressions (rather than just columns) #8281 (mustafasrepo)
  • Fix sqllogictests link in contributor-guide/index.md #8314 (qrilka)
  • Refactor: Unify Expr::ScalarFunction and Expr::ScalarUDF, introduce unresolved functions by name #8258 (2010YOUY01)
  • Support no distinct aggregate sum/min/max in single_distinct_to_group_by rule #8266 (haohuaijin)
  • feat: implement sql style 'substr_index' string function #8272 (Syleechan)
  • Fixing issues with timestamp literals #8193 (comphead)
  • Projection Pushdown over StreamingTableExec #8299 (berkaysynnada)
  • minor: fix documentation #8323 (comphead)
  • fix: wrong result of range function #8313 (smallzhongfeng)
  • Minor: rename parquet.rs to parquet/mod.rs #8301 (alamb)
  • refactor: output ordering #8304 (QuenKar)
  • Update substrait requirement from 0.19.0 to 0.20.0 #8339 (dependabot[bot])
  • Port tests in aggregates.rs to sqllogictest #8316 (edmondop)
  • Library Guide: Add Using the DataFrame API #8319 (Veeupup)
  • Port tests in limit.rs to sqllogictest #8315 (zhangxffff)
  • move array function unit_tests to sqllogictest #8332 (Veeupup)
  • NTH_VALUE reverse support #8327 (mustafasrepo)
  • Optimize Projections during Logical Plan #8340 (mustafasrepo)
  • [MINOR]: Move merge projections tests to under optimize projections #8352 (mustafasrepo)
  • Add quote and escape attributes to create csv external table #8351 (Asura7969)
  • Minor: Add DataFrame test #8341 (alamb)
  • Minor: clean up the code based on Clippy #8359 (Weijun-H)
  • Minor: Make it easier to work with Expr::ScalarFunction #8350 (alamb)
  • Minor: Move some datafusion-optimizer::utils down to datafusion-expr::utils #8354 (Jesse-Bakker)
  • Minor: Make BuiltInScalarFunction::alias a method #8349 (alamb)
  • Extract parquet statistics to its own module, add tests #8294 (alamb)
  • feat: implement sql style 'find_in_set' string function #8328 (Syleechan)
  • Support LargeUtf8 to Temporal Coercion #8357 (jayzhan211)
  • Refactor aggregate function handling #8358 (Weijun-H)
  • Implement Aliases for ScalarUDF #8360 (Veeupup)
  • Minor: Remove unnecessary name field in ScalarFunctionDefinition #8365 (alamb)
  • feat: support LargeList in array_empty #8321 (Weijun-H)
  • Double type argument for to_timestamp function #8159 (spaydar)
  • Support User Defined Table Function #8306 (Veeupup)
  • Document timestamp input limits #8369 (comphead)
  • fix: make ntile work in some corner cases #8371 (haohuaijin)
  • Minor: Refactor array_union function to use a generic union_arrays function #8381 (Weijun-H)
  • Minor: Refactor function argument handling in ScalarFunctionDefinition #8387 (Weijun-H)
  • Materialize dictionaries in group keys #8291 (qrilka)
  • Rewrite array_ndims to fix List(Null) handling #8320 (jayzhan211)
  • Docs: Improve the documentation on ScalarValue #8378 (alamb)
  • Avoid concat for array_replace #8337 (jayzhan211)
  • add a summary table to benchmark compare output #8399 (razeghi71)
  • Refactors on TreeNode Implementations #8395 (berkaysynnada)
  • feat: support LargeList in make_array and array_length #8121 (Weijun-H)
  • remove unalias TableScan filters when create Physical Filter #8404 (jackwener)
  • Update custom-table-providers.md #8409 (nickpoorman)
  • fix failure when transforming LogicalPlan::Explain with TreeNode::transform #8400 (haohuaijin)
  • Docs: Fix array_except documentation example error #8407 (Asura7969)
  • Support named query parameters #8384 (Asura7969)
  • Minor: Add installation link to README.md #8389 (Weijun-H)
  • Update code comment for the cases of regularized RANGE frame and add tests for ORDER BY cases with RANGE frame #8410 (viirya)
  • Minor: Add example with parameters to LogicalPlan #8418 (alamb)
  • Minor: Improve PruningPredicate documentation #8394 (alamb)
  • feat: ScalarValue from String #8411 (QuenKar)
  • Bump actions/labeler from 4.3.0 to 5.0.0 #8422 (dependabot[bot])
  • Update sqlparser requirement from 0.39.0 to 0.40.0 #8338 (dependabot[bot])
  • feat: support LargeList for array_has, array_has_all and array_has_any #8322 (Weijun-H)
  • Union schema can't be a subset of the child schema #8408 (jackwener)
  • Move PartitionSearchMode into datafusion_physical_plan, rename to InputOrderMode #8364 (alamb)
  • Make filter selectivity for statistics configurable #8243 (edmondop)
  • fix: Changed labeler.yml to latest format #8431 (viirya)
  • Minor: Use ScalarValue::from impl for strings #8429 (alamb)
  • Support crossjoin in substrait. #8427 (my-vegetable-has-exploded)
  • Fix ambiguous reference when aliasing in combination with ORDER BY #8425 (Asura7969)
  • Minor: convert macro list-slice and slice to function #8424 (Weijun-H)
  • Remove macro in iter_to_array for List #8414 (jayzhan211)
  • fix: Literal in ORDER BY window definition should not be an ordinal referring to relation column #8419 (viirya)
  • feat: customize column default values for external tables #8415 (jonahgao)
  • feat: Support array_sort(list_sort) #8279 (Asura7969)
  • Bugfix: Remove df-cli specific SQL statement options before executing with DataFusion #8426 (devinjdangelo)
  • Detect when filters on unique constraints make subqueries scalar #8312 (Jesse-Bakker)
  • Add alias check to optimize projections merge #8438 (mustafasrepo)
  • Fix PartialOrd for ScalarValue::List/FixedSizeList/LargeList #8253 (jayzhan211)
  • Support parquet_metadata for datafusion-cli #8413 (Veeupup)
  • Fix bug in optimizing a nested count #8459 (Dandandan)
  • Bump actions/setup-python from 4 to 5 #8449 (dependabot[bot])
  • fix: ORDER BY window definition should work on null literal #8444 (viirya)
  • fix clippy warnings #8455 (waynexia)
  • fix: RANGE frame for corner cases with empty ORDER BY clause should be treated as constant sort #8445 (viirya)
  • Preserve dict_id on Field during serde roundtrip #8457 (avantgardnerio)
  • feat: support InterleaveExecNode in the proto #8460 (liukun4515)
  • [BUG FIX]: Proper Empty Batch handling in window execution #8466 (mustafasrepo)
  • Minor: update cast #8458 (Weijun-H)
  • fix: don't unify projections if expr is non-trivial #8454 (haohuaijin)
  • Minor: Add new bloom filter predicate tests #8433 (alamb)
  • Add PRIMARY KEY Aggregate support to dataframe API #8356 (mustafasrepo)
  • Minor: refactor date_trunc to reduce duplicated code #8430 (Weijun-H)
  • Support array_distinct function. #8268 (my-vegetable-has-exploded)
  • Add primary key support to stream table #8467 (mustafasrepo)
  • Add evaluate_demo and range_analysis_demo to Expr examples #8377 (alamb)
  • Minor: fix function name typo #8473 (Weijun-H)
  • Minor: Fix comment typo in table.rs: s/indentical/identical/ #8469 (KeunwooLee-at)
  • Remove define_array_slice and reuse array_slice for array_pop_front/back #8401 (jayzhan211)
  • Minor: refactor trim to clean up duplicated code #8434 (Weijun-H)
  • Split EmptyExec into PlaceholderRowExec #8446 (razeghi71)
  • Enable non-uniform field type for structs created in DataFusion #8463 (dlovell)
  • Minor: Add multi ordering test for array agg order #8439 (jayzhan211)
  • Sort filenames when reading parquet to ensure consistent schema #6629 (thomas-k-cameron)
  • Minor: Improve comments in EnforceDistribution tests #8474 (alamb)
  • fix: support uppercase when parsing Interval #8478 (QuenKar)
  • Better Equivalence (ordering and exact equivalence) Propagation through ProjectionExec #8484 (mustafasrepo)
  • Add today alias for current_date #8423 (smallzhongfeng)
  • Minor: remove useless clone in array_expression #8495 (Weijun-H)
  • fix: incorrectly set preserve_partitioning in SortExec #8485 (haohuaijin)
  • Explicitly mark parquet for tests in datafusion-common #8497 (Dennis40816)
  • Minor/Doc: Clarify DataFrame::write_table Documentation #8519 (devinjdangelo)
  • fix: Pull stats in IdentVisitor/GraphvizVisitor only when requested #8514 (vrongmeal)
  • Change display of RepartitionExec from SortPreservingRepartitionExec to RepartitionExec preserve_order=true #8521 (JacobOgle)
  • Fix DataFrame::cache errors with Plan("Mismatch between schema and batches") #8510 (Asura7969)
  • Minor: update pbjson_dependency #8470 (alamb)
  • Minor: Update prost-derive dependency #8471 (alamb)
  • Minor/Doc: Add DataFrame::write_table to DataFrame user guide #8527 (devinjdangelo)
  • Minor: Add repartition_file.slt end to end test for repartitioning files, and supporting tweaks #8505 (alamb)
  • Prepare version 34.0.0 #8508 (andygrove)
  • refactor: use ExprBuilder to consume substrait expr and use macro to generate error #8515 (waynexia)
  • [MINOR]: Make some slt tests deterministic #8525 (mustafasrepo)
  • fix: volatile expressions should not be target of common subexpression elimination #8520 (viirya)
  • Minor: Add LakeSoul to the list of Known Users #8536 (xuchen-plus)
  • Fix regression with Incorrect results when reading parquet files with different schemas and statistics #8533 (alamb)
  • feat: improve string statistics display in datafusion-cli parquet_metadata function #8535 (asimsedhain)
  • Defer file creation to write #8539 (tustvold)
  • Minor: Improve error handling in sqllogictest runner #8544 (alamb)