Skip to content

Latest commit

 

History

History
292 lines (276 loc) · 31.4 KB

33.0.0.md

File metadata and controls

292 lines (276 loc) · 31.4 KB

33.0.0 (2023-11-12)

Full Changelog

Breaking changes:

  • Refactor Statistics, introduce precision estimates (Exact, Inexact, Absent) #7793 (berkaysynnada)
  • Remove redundant unwrap in ScalarValue::new_primitive, return a Result #7830 (maruschin)
  • Add parquet feature flag, enabled by default, and make parquet conditional #7745 (ongchi)
  • Change input for to_timestamp function to be seconds rather than nanoseconds, add to_timestamp_nanos #7844 (comphead)
  • Percent Decode URL Paths (#8009) #8012 (tustvold)
  • chore: remove panics in datafusion-common::scalar by making more operations return Result #7901 (junjunjd)
  • Combine Expr::Wildcard and Wxpr::QualifiedWildcard, add wildcard() expr fn #8105 (alamb)

Performance related:

  • Add distinct union optimization #7788 (maruschin)
  • Fix join order for TPCH Q17 & Q18 by improving FilterExec statistics #8126 (andygrove)
  • feat: add column statistics into explain #8112 (NGA-TRAN)

Implemented enhancements:

  • Support InsertInto Sorted ListingTable #7743 (devinjdangelo)
  • External Table Primary key support #7755 (mustafasrepo)
  • add interval arithmetic for timestamp types #7758 (mhilton)
  • Interval Arithmetic NegativeExpr Support #7804 (berkaysynnada)
  • Exactness Indicator of Parameters: Precision #7809 (berkaysynnada)
  • Implement GetIndexedField for map-typed columns #7825 (swgillespie)
  • Fix precision loss when coercing date_part utf8 argument #7846 (Dandandan)
  • Support Binary/LargeBinary --> Utf8/LargeUtf8 in ilike and string functions #7840 (alamb)
  • Support Decimal256 on AVG aggregate expression #7853 (viirya)
  • Support Decimal256 column in create external table #7866 (viirya)
  • Support Decimal256 in Min/Max aggregate expressions #7881 (viirya)
  • Implement Hive-Style Partitioned Write Support #7801 (devinjdangelo)
  • feat: support Decimal256 for the abs function #7904 (jonahgao)
  • Parallelize Serialization of Columns within Parquet RowGroups #7655 (devinjdangelo)
  • feat: Use bloom filter when reading parquet to skip row groups #7821 (hengfeiyang)
  • Support Partitioning Data by Dictionary Encoded String Array Types #7896 (devinjdangelo)
  • Read only enough bytes to infer Arrow IPC file schema via stream #7962 (Jefffrey)
  • feat: Support determining extensions from names like foo.parquet.snappy as well as foo.parquet #7972 (Weijun-H)
  • feat: Protobuf serde for Json file sink #8062 (Jefffrey)
  • feat: support target table alias in update statement #8080 (jonahgao)
  • feat: support UDAF in substrait producer/consumer #8119 (waynexia)

Fixed bugs:

  • fix: preserve column qualifier for DataFrame::with_column #7792 (jonahgao)
  • fix: don't push down volatile predicates in projection #7909 (haohuaijin)
  • fix: generate logical plan for UPDATE SET FROM statement #7984 (jonahgao)
  • fix: single_distinct_aggretation_to_group_by fail #7997 (haohuaijin)
  • fix: clippy warnings from nightly rust 1.75 #8025 (waynexia)
  • fix: DataFusion suggests invalid functions #8083 (jonahgao)
  • fix: add encode/decode to protobuf encoding #8089 (Syleechan)

Documentation updates:

  • Minor: Improve TableProvider document, and add ascii art #7759 (alamb)
  • Expose arrow-schema serde crate feature flag #7829 (lewiszlw)
  • doc: fix ExecutionContext to SessionContext in custom-table-providers.md #7903 (ZENOTME)
  • Minor: Document parquet crate feature #7927 (alamb)
  • Add some initial content about creating logical plans #7952 (andygrove)
  • Minor: Add implementation examples to ExecutionPlan::execute #8013 (tustvold)
  • Minor: Improve documentation for Filter Pushdown #8023 (alamb)
  • Minor: Improve ExecutionPlan documentation #8019 (alamb)
  • Improve comments for PartitionSearchMode struct #8047 (ozankabak)
  • Prepare 33.0.0 Release #8057 (andygrove)
  • Improve documentation for calculate_prune_length method in SymmetricHashJoin #8125 (Asura7969)
  • docs: show creation of DFSchema #8132 (wjones127)
  • Improve documentation site to make it easier to find communication on Slack/Discord #8138 (alamb)

Merged pull requests:

  • Minor: Improve TableProvider document, and add ascii art #7759 (alamb)
  • Prepare 32.0.0 Release #7769 (andygrove)
  • Minor: Change all file links to GitHub in document #7768 (ongchi)
  • Minor: Improve PruningPredicate documentation #7738 (alamb)
  • Support InsertInto Sorted ListingTable #7743 (devinjdangelo)
  • Minor: improve documentation to stagger_batch #7754 (alamb)
  • External Table Primary key support #7755 (mustafasrepo)
  • Minor: Build array_array() with ListArray construction instead of ArrayData #7780 (jayzhan211)
  • Minor: Remove unnecessary #[cfg(feature = "avro")] #7773 (sarutak)
  • add interval arithmetic for timestamp types #7758 (mhilton)
  • Minor: make tests deterministic #7771 (Weijun-H)
  • Minor: Improve Interval Docs #7782 (alamb)
  • DataSink additions #7778 (Dandandan)
  • Update substrait requirement from 0.15.0 to 0.16.0 #7783 (dependabot[bot])
  • Move nested union optimization from plan builder to logical optimizer #7695 (maruschin)
  • Minor: comments that explain the schema used in simply_expressions #7747 (alamb)
  • Update regex-syntax requirement from 0.7.1 to 0.8.0 #7784 (dependabot[bot])
  • Minor: Add sql test for UNION / UNION ALL + plans #7787 (alamb)
  • fix: preserve column qualifier for DataFrame::with_column #7792 (jonahgao)
  • Interval Arithmetic NegativeExpr Support #7804 (berkaysynnada)
  • Exactness Indicator of Parameters: Precision #7809 (berkaysynnada)
  • add LogicalPlanBuilder::join_on #7805 (haohuaijin)
  • Fix SortPreservingRepartition with no existing ordering. #7811 (mustafasrepo)
  • Update zstd requirement from 0.12 to 0.13 #7806 (dependabot[bot])
  • [Minor]: Remove input_schema field from window executor #7810 (mustafasrepo)
  • refactor(7181): move streaming_merge() into separate mod from the merge node #7799 (wiedld)
  • Improve update error #7777 (lewiszlw)
  • Minor: Update LogicalPlan::join_on API, use it more #7814 (alamb)
  • Add distinct union optimization #7788 (maruschin)
  • Make CI fail on any occurrence of rust-tomlfmt failed #7774 (ongchi)
  • Encode all join conditions in a single expression field #7612 (nseekhao)
  • Update substrait requirement from 0.16.0 to 0.17.0 #7808 (dependabot[bot])
  • Minor: include sort expressions in SortPreservingRepartitionExec explain plan #7796 (alamb)
  • minor: add more document to Wildcard expr #7822 (waynexia)
  • Minor: Move Monotonicity to expr crate #7820 (2010YOUY01)
  • Use code block for better formatting of rustdoc for PhysicalGroupBy #7823 (qrilka)
  • Update explain plan to show TopK operator #7826 (haohuaijin)
  • Extract ReceiverStreamBuilder #7817 (tustvold)
  • Extend backtrace coverage for DatafusionError::Plan errors errors #7803 (comphead)
  • Add documentation and usability for prepared parameters #7785 (alamb)
  • Implement GetIndexedField for map-typed columns #7825 (swgillespie)
  • Minor: Assert streaming_merge has non empty sort exprs #7795 (alamb)
  • Minor: Upgrade docs for PhysicalExpr::{propagate_constraints, evaluate_bounds} #7812 (alamb)
  • Change ScalarValue::List to store ArrayRef #7629 (jayzhan211)
  • [MINOR]:Do not introduce unnecessary repartition when row count is 1. #7832 (mustafasrepo)
  • Minor: Add tests for binary / utf8 coercion #7839 (alamb)
  • Avoid panics on error while encoding/decoding ListValue::Array as protobuf #7837 (alamb)
  • Refactor Statistics, introduce precision estimates (Exact, Inexact, Absent) #7793 (berkaysynnada)
  • Remove redundant unwrap in ScalarValue::new_primitive, return a Result #7830 (maruschin)
  • Fix precision loss when coercing date_part utf8 argument #7846 (Dandandan)
  • Add operator section to user guide, Add std::ops operations to prelude, and add not() expr_fn #7732 (ongchi)
  • Expose arrow-schema serde crate feature flag #7829 (lewiszlw)
  • Improve ContextProvider naming: rename get_table_provider --> get_table_source, deprecate get_table_provider #7831 (lewiszlw)
  • DataSink Dynamic Execution Time Demux #7791 (devinjdangelo)
  • Add small column on empty projection #7833 (ch-sc)
  • feat(7849): coerce TIMESTAMP to TIMESTAMPTZ #7850 (mhilton)
  • Support Binary/LargeBinary --> Utf8/LargeUtf8 in ilike and string functions #7840 (alamb)
  • Minor: fix typo in comments #7856 (haohuaijin)
  • Minor: improve join / join_on docs #7813 (alamb)
  • Support Decimal256 on AVG aggregate expression #7853 (viirya)
  • Minor: fix typo in comments #7861 (alamb)
  • Minor: fix typo in GreedyMemoryPool documentation #7864 (avh4)
  • Minor: fix multiple typos #7863 (Smoothieewastaken)
  • Minor: Fix docstring typos #7873 (alamb)
  • Add CursorValues Decoupling Cursor Data from Cursor Position #7855 (tustvold)
  • Support Decimal256 column in create external table #7866 (viirya)
  • Support Decimal256 in Min/Max aggregate expressions #7881 (viirya)
  • Implement Hive-Style Partitioned Write Support #7801 (devinjdangelo)
  • Minor: fix config typo #7874 (alamb)
  • Add Decimal256 sqllogictests for SUM, MEDIAN and COUNT aggregate expressions #7889 (viirya)
  • [test] add fuzz test for topk #7772 (Tangruilin)
  • Allow Setting Minimum Parallelism with RowCount Based Demuxer #7841 (devinjdangelo)
  • Drop single quotes to make warnings for parquet options not confusing #7902 (qrilka)
  • Add multi-column topk fuzz tests #7898 (alamb)
  • Change FileScanConfig.table_partition_cols from (String, DataType) to Fields #7890 (NGA-TRAN)
  • Maintain time zone in ScalarValue::new_list #7899 (Dandandan)
  • [MINOR]: Move joinside struct to common #7908 (mustafasrepo)
  • doc: fix ExecutionContext to SessionContext in custom-table-providers.md #7903 (ZENOTME)
  • Update arrow 48.0.0 #7854 (tustvold)
  • feat: support Decimal256 for the abs function #7904 (jonahgao)
  • [MINOR] Simplify Aggregate, and Projection output_partitioning implementation #7907 (mustafasrepo)
  • Bump actions/setup-node from 3 to 4 #7915 (dependabot[bot])
  • [Bug Fix]: Fix bug, first last reverse #7914 (mustafasrepo)
  • Minor: provide default implementation for ExecutionPlan::statistics #7911 (alamb)
  • Update substrait requirement from 0.17.0 to 0.18.0 #7916 (dependabot[bot])
  • Minor: Remove unnecessary clone in datafusion_proto #7921 (ongchi)
  • [MINOR]: Simplify code, change requirement from PhysicalSortExpr to PhysicalSortRequirement #7913 (mustafasrepo)
  • [Minor] Move combine_join util to under equivalence.rs #7917 (mustafasrepo)
  • support scan empty projection #7920 (haohuaijin)
  • Cleanup logical optimizer rules. #7919 (mustafasrepo)
  • Parallelize Serialization of Columns within Parquet RowGroups #7655 (devinjdangelo)
  • feat: Use bloom filter when reading parquet to skip row groups #7821 (hengfeiyang)
  • fix: don't push down volatile predicates in projection #7909 (haohuaijin)
  • Add parquet feature flag, enabled by default, and make parquet conditional #7745 (ongchi)
  • [MINOR]: Simplify enforce_distribution, minor changes #7924 (mustafasrepo)
  • Add simple window query to sqllogictest #7928 (Jefffrey)
  • ci: upgrade node to version 20 #7918 (crepererum)
  • Change input for to_timestamp function to be seconds rather than nanoseconds, add to_timestamp_nanos #7844 (comphead)
  • Minor: Document parquet crate feature #7927 (alamb)
  • Minor: reduce some #cfg(feature = "parquet") #7929 (alamb)
  • Minor: reduce use of #cfg(feature = "parquet") in tests #7930 (alamb)
  • Fix CI failures on to_timestamp() calls #7941 (comphead)
  • minor: add a datatype casting for the updated value #7922 (jonahgao)
  • Minor:add avro feature in datafusion-examples to make avro_sql run #7946 (haohuaijin)
  • Add simple exclude all columns test to sqllogictest #7945 (Jefffrey)
  • Support Partitioning Data by Dictionary Encoded String Array Types #7896 (devinjdangelo)
  • Minor: Remove array() in array_expression #7961 (jayzhan211)
  • Minor: simplify update code #7943 (alamb)
  • Add some initial content about creating logical plans #7952 (andygrove)
  • Minor: Change from &mut SessionContext to &SessionContext in substrait #7965 (my-vegetable-has-exploded)
  • Fix crate READMEs #7964 (Jefffrey)
  • Minor: Improve HashJoinExec documentation #7953 (alamb)
  • chore: clean useless clone baesd on clippy #7973 (Weijun-H)
  • Add README.md to core, execution and physical-plan crates #7970 (alamb)
  • Move source repartitioning into ExecutionPlan::repartition #7936 (alamb)
  • minor: fix broken links in README.md #7986 (jonahgao)
  • Minor: Upate the sqllogictest crate README #7971 (alamb)
  • Improve MemoryCatalogProvider default impl block placement #7975 (lewiszlw)
  • Fix ScalarValue handling of NULL values for ListArray #7969 (viirya)
  • Refactor of Ordering and Prunability Traversals and States #7985 (berkaysynnada)
  • Keep output as scalar for scalar function if all inputs are scalar #7967 (viirya)
  • Fix crate READMEs for core, execution, physical-plan #7990 (Jefffrey)
  • Update sqlparser requirement from 0.38.0 to 0.39.0 #7983 (jackwener)
  • Fix panic in multiple distinct aggregates by fixing ScalarValue::new_list #7989 (alamb)
  • Minor: Add MemoryReservation::consumer getter #8000 (milenkovicm)
  • fix: generate logical plan for UPDATE SET FROM statement #7984 (jonahgao)
  • Create temporary files for reading or writing #8005 (smallzhongfeng)
  • Minor: fix comment on SortExec::with_fetch method #8011 (westonpace)
  • Fix: dataframe_subquery example Optimizer rule common_sub_expression_eliminate failed #8016 (smallzhongfeng)
  • Percent Decode URL Paths (#8009) #8012 (tustvold)
  • Minor: Extract common deps into workspace #7982 (lewiszlw)
  • minor: change some plan_err to exec_err #7996 (waynexia)
  • Minor: error on unsupported RESPECT NULLs syntax #7998 (alamb)
  • Break GroupedHashAggregateStream spill batch into smaller chunks #8004 (milenkovicm)
  • Minor: Add implementation examples to ExecutionPlan::execute #8013 (tustvold)
  • Minor: Extend wrap_into_list_array to accept multiple args #7993 (jayzhan211)
  • GroupedHashAggregateStream should register spillable consumer #8002 (milenkovicm)
  • fix: single_distinct_aggretation_to_group_by fail #7997 (haohuaijin)
  • Read only enough bytes to infer Arrow IPC file schema via stream #7962 (Jefffrey)
  • Minor: remove a strange char #8030 (haohuaijin)
  • Minor: Improve documentation for Filter Pushdown #8023 (alamb)
  • Minor: Improve ExecutionPlan documentation #8019 (alamb)
  • fix: clippy warnings from nightly rust 1.75 #8025 (waynexia)
  • Minor: Avoid recomputing compute_array_ndims in align_array_dimensions #7963 (jayzhan211)
  • Minor: fix doc and fmt CI check #8037 (alamb)
  • Minor: remove uncessary #cfg test #8036 (alamb)
  • Minor: Improve documentation for PartitionStream and StreamingTableExec #8035 (alamb)
  • Combine Equivalence and Ordering equivalence to simplify state #8006 (mustafasrepo)
  • Encapsulate ProjectionMapping as a struct #8033 (alamb)
  • Minor: Fix bugs in docs for to_timestamp, to_timestamp_seconds, ... #8040 (alamb)
  • Improve comments for PartitionSearchMode struct #8047 (ozankabak)
  • General approach for Array replace #8050 (jayzhan211)
  • Minor: Remove the irrelevant note from the Expression API doc #8053 (ongchi)
  • Minor: Add more documentation about Partitioning #8022 (alamb)
  • Minor: improve documentation for IsNotNull, DISTINCT, etc #8052 (alamb)
  • Prepare 33.0.0 Release #8057 (andygrove)
  • Minor: improve error message by adding types to message #8065 (alamb)
  • Minor: Remove redundant BuiltinScalarFunction::supports_zero_argument() #8059 (2010YOUY01)
  • Add example to ci #8060 (smallzhongfeng)
  • Update substrait requirement from 0.18.0 to 0.19.0 #8076 (dependabot[bot])
  • Fix incorrect results in COUNT(*) queries with LIMIT #8049 (msirek)
  • feat: Support determining extensions from names like foo.parquet.snappy as well as foo.parquet #7972 (Weijun-H)
  • Use FairSpillPool for TaskContext with spillable config #8072 (viirya)
  • Minor: Improve HashJoinStream docstrings #8070 (alamb)
  • Fixing broken link #8085 (edmondop)
  • fix: DataFusion suggests invalid functions #8083 (jonahgao)
  • Replace macro with function for array_repeat #8071 (jayzhan211)
  • Minor: remove unnecessary projection in single_distinct_to_group_by rule #8061 (haohuaijin)
  • minor: Remove duplicate version numbers for arrow, object_store, and parquet dependencies #8095 (andygrove)
  • fix: add encode/decode to protobuf encoding #8089 (Syleechan)
  • feat: Protobuf serde for Json file sink #8062 (Jefffrey)
  • Minor: use Expr::alias in a few places to make the code more concise #8097 (alamb)
  • Minor: Cleanup BuiltinScalarFunction::return_type() #8088 (2010YOUY01)
  • Update sqllogictest requirement from 0.17.0 to 0.18.0 #8102 (dependabot[bot])
  • Projection Pushdown in PhysicalPlan #8073 (berkaysynnada)
  • Push limit into aggregation for DISTINCT ... LIMIT queries #8038 (msirek)
  • Bug-fix in Filter and Limit statistics #8094 (berkaysynnada)
  • feat: support target table alias in update statement #8080 (jonahgao)
  • Minor: Simlify downcast functions in cast.rs. #8103 (Weijun-H)
  • Fix ArrayAgg schema mismatch issue #8055 (jayzhan211)
  • Minor: Support nulls in array_replace, avoid a copy #8054 (alamb)
  • Minor: Improve the document format of JoinHashMap #8090 (Asura7969)
  • Simplify ProjectionPushdown and make it more general #8109 (alamb)
  • Minor: clean up the code regarding clippy #8122 (Weijun-H)
  • Support remaining functions in protobuf serialization, add expr_fn for StructFunction #8100 (JacobOgle)
  • Minor: Cleanup BuiltinScalarFunction's phys-expr creation #8114 (2010YOUY01)
  • rewrite array_append/array_prepend to remove deplicate codes #8108 (Veeupup)
  • Implementation of array_intersect #8081 (Veeupup)
  • Minor: fix ci break #8136 (haohuaijin)
  • Improve documentation for calculate_prune_length method in SymmetricHashJoin #8125 (Asura7969)
  • Minor: remove duplicated array_replace tests #8066 (alamb)
  • Minor: Fix temporary files created but not deleted during testing #8115 (2010YOUY01)
  • chore: remove panics in datafusion-common::scalar by making more operations return Result #7901 (junjunjd)
  • Fix join order for TPCH Q17 & Q18 by improving FilterExec statistics #8126 (andygrove)
  • Fix: Do not try and preserve order when there is no order to preserve in RepartitionExec #8127 (alamb)
  • feat: add column statistics into explain #8112 (NGA-TRAN)
  • Add subtrait support for IS NULL and IS NOT NULL #8093 (tgujar)
  • Combine Expr::Wildcard and Wxpr::QualifiedWildcard, add wildcard() expr fn #8105 (alamb)
  • docs: show creation of DFSchema #8132 (wjones127)
  • feat: support UDAF in substrait producer/consumer #8119 (waynexia)
  • Improve documentation site to make it easier to find communication on Slack/Discord #8138 (alamb)