38.0.0 (2024-05-07)
Breaking changes:
- refactor: make dfschema wrap schemaref #9595 (haohuaijin)
- Make FirstValue an UDAF, Change `AggregateUDFImpl::accumulator` signature, support ORDER BY for UDAFs #9874 (jayzhan211)
- Remove `OwnedTableReference` and `OwnedSchemaReference` #9933 (comphead)
- Consistent LogicalPlan subquery handling in TreeNode::apply and TreeNode::visit #9913 (peter-toth)
- Refactor `Optimizer` to use owned plans and `TreeNode` API (10% faster planning) #9948 (alamb)
- Stop copying plans in `LogicalPlan::with_param_values` #10016 (alamb)
- Move coalesce to datafusion-functions and remove BuiltInScalarFunction #10098 (Omega359)
- Refactor sessionconfig set fns to avoid an unnecessary enum to string conversion #10141 (psvri)
- ScalarUDF: Remove `supports_zero_argument` and avoid creating null array for empty args #10193 (jayzhan211)
- Clean-up: Remove AggregateExec::group_by() #10297 (berkaysynnada)
- Remove `ScalarFunctionDefinition::Name` #10277 (lewiszlw)
- feat: Determine ordering of file groups #9593 (suremarc)
- Split parquet bloom filter config and enable bloom filter on read by default #10306 (lewiszlw)
- Improve coerce API so it does not need DFSchema #10331 (alamb)
- Minor: Do not force analyzer to copy logical plans #10367 (alamb)
- Move `Covariance` (Sample) `covar` / `covar_samp` to be a User Defined Aggregate Function #10372 (jayzhan211)
Performance related:
- perf: Use `Arc<str>` instead of `Cow<&'a>` in the analyzer #9824 (comphead)
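
The motivation for sharing names through `Arc<str>` can be illustrated with a minimal, standalone sketch. It does not use any DataFusion code; the helper `share_name` and the sample column name are hypothetical and exist only for illustration. The point is that cloning an `Arc<str>` bumps a reference count instead of copying the string bytes.

```rust
use std::sync::Arc;

/// Hypothetical helper (not a DataFusion API): hand out `copies` handles
/// to the same name while allocating the string data only once.
fn share_name(name: &str, copies: usize) -> Vec<Arc<str>> {
    // One heap allocation holding the string bytes.
    let shared: Arc<str> = Arc::from(name);
    // Each clone only increments the reference count.
    (0..copies).map(|_| Arc::clone(&shared)).collect()
}

fn main() {
    let names = share_name("orders.customer_id", 3);
    // All handles point at the same allocation.
    assert!(Arc::ptr_eq(&names[0], &names[2]));
    println!("handles sharing one allocation: {}", Arc::strong_count(&names[0]));
}
```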
Implemented enhancements:
- feat: Add display_pg_json for LogicalPlan #9789 (liurenjie1024)
- feat: eliminate redundant sorts on monotonic expressions #9813 (suremarc)
- feat: optimize `lower` and `upper` functions #9971 (JasonLi-cn)
- feat: support `unnest` multiple arrays #10044 (jonahgao)
- feat: `DataFrame` supports unnesting multiple columns #10118 (jonahgao)
- feat: support input reordering for `NestedLoopJoinExec` #9676 (korowa)
- feat: add static_name() to ExecutionPlan #10266 (waynexia)
- feat: add optimizer config param to avoid grouping partitions `prefer_existing_union` #10259 (NGA-TRAN)
- feat: unwrap casts of string and dictionary columns #10323 (erratic-pattern)
- feat: Add CrossJoin match case to unparser #10371 (sardination)
- feat: run expression simplifier in a loop until a fixedpoint or 3 cycles #10358 (erratic-pattern)
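
The "loop until a fixedpoint or 3 cycles" idea in the last entry can be sketched in isolation. The following is a toy illustration, not DataFusion's simplifier: the `Expr` enum, `simplify_once`, `simplify_to_fixed_point`, and `add_zero` are all hypothetical names, and the single rewrite rule (`x + 0` becomes `x`) is deliberately shallow so that more than one pass is needed.

```rust
/// Toy expression type (hypothetical; unrelated to DataFusion's `Expr`).
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Lit(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Wraps an expression as `e + 0`, used to build test inputs.
fn add_zero(e: Expr) -> Expr {
    Expr::Add(Box::new(e), Box::new(Expr::Lit(0)))
}

/// One shallow pass: rewrites `x + 0` to `x` without re-simplifying `x`,
/// so nested cases need further passes.
fn simplify_once(expr: &Expr) -> Expr {
    match expr {
        Expr::Add(lhs, rhs) if **rhs == Expr::Lit(0) => (**lhs).clone(),
        Expr::Add(lhs, rhs) => Expr::Add(
            Box::new(simplify_once(lhs)),
            Box::new(simplify_once(rhs)),
        ),
        other => other.clone(),
    }
}

/// Repeats the pass until nothing changes or `max_cycles` passes have run.
/// Returns the result and the number of passes executed.
fn simplify_to_fixed_point(mut expr: Expr, max_cycles: usize) -> (Expr, usize) {
    let mut cycles = 0;
    while cycles < max_cycles {
        let next = simplify_once(&expr);
        cycles += 1;
        if next == expr {
            break; // fixed point reached
        }
        expr = next;
    }
    (expr, cycles)
}

fn main() {
    let a = Expr::Var("a".to_string());
    // (a + 0) + 0 converges to `a`; the third pass confirms the fixed point.
    let (out, cycles) = simplify_to_fixed_point(add_zero(add_zero(a.clone())), 3);
    assert_eq!(out, a);
    println!("simplified to {:?} after {} passes", out, cycles);
}
```

The cap bounds planning time: an input that needs more passes than the limit is returned partially simplified rather than looping further.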
Fixed bugs:
- fix: detect non-recursive CTEs in the recursive `WITH` clause #9836 (jonahgao)
- fix: improve `unnest_generic_list` handling of null list #9975 (jonahgao)
- fix: reduce lock contention in `RepartitionExec::execute` #10009 (crepererum)
- fix: `RepartitionExec` metrics #10025 (crepererum)
- fix: Support Dict types in `in_list` physical plans #10031 (advancedxy)
- fix: Specify row count in sort_batch for batch with no columns #10094 (viirya)
- fix: another non-deterministic test in `joins.slt` #10122 (korowa)
- fix: duplicate output for HashJoinExec in CollectLeft mode #9757 (korowa)
- fix: cargo warnings of import item #10196 (waynexia)
- fix: reduce lock contention in distributor channels #10026 (crepererum)
- fix: no longer support the `substring` function #10242 (jonahgao)
- fix: Correct null_count in describe() #10260 (Weijun-H)
- fix: schema error when parsing order-by expressions #10234 (jonahgao)
- fix: LogFunc simplify swaps arguments #10360 (erratic-pattern)
Documentation updates:
- Update `COPY` documentation to reflect changes #9754 (alamb)
- doc: Add `datafusion-federation` to Integrations #9853 (phillipleblanc)
- Improve `AggregateUDFImpl::state_fields` documentation #9919 (alamb)
- Update datafusion-cli docs, split up #10078 (alamb)
- Fix large futures causing stack overflows #10033 (sergiimk)
- Update documentation to replace Apache Arrow DataFusion with Apache DataFusion #10130 (andygrove)
- Update github repo links #10167 (lewiszlw)
- minor: fix installation section link #10179 (comphead)
- Improve documentation on `TreeNode` #10035 (alamb)
- Update .asf.yaml to publish docs to datafusion.apache.org #10190 (phillipleblanc)
- Update links to point to datafusion.apache.org #10195 (phillipleblanc)
- doc: fix subscribe mail link to datafusion mailing lists #10225 (jackwener)
- Fix docs.rs build for datafusion-proto (hopefully) #10254 (alamb)
- docs: add download page #10271 (tisonkun)
- Clarify docs explaining the relationship between `SessionState` and `SessionContext` #10350 (alamb)
- docs: Add DataFusion subprojects to navigation menu, other minor updates #10362 (andygrove)
Merged pull requests:
- Prepare 37.0.0 Release #9697 (andygrove)
- move Left, Lpad, Reverse, Right, Rpad functions to datafusion_functions #9841 (Omega359)
- Add non-column expression equality tracking to filter exec #9819 (mustafasrepo)
- datafusion-cli support for multiple commands in a single line #9831 (berkaysynnada)
- Add tests for filtering, grouping, aggregation of ARRAYs #9695 (alamb)
- Remove vestigial conbench integration #9855 (alamb)
- feat: Add display_pg_json for LogicalPlan #9789 (liurenjie1024)
- Update `COPY` documentation to reflect changes #9754 (alamb)
- Minor: Remove the bench most likely to cause OOM in CI #9858 (gruuya)
- Minor: make uuid an optional dependency on datafusion-functions #9771 (alamb)
- doc: Add `Spice.ai` to Known Users #9852 (phillipleblanc)
- minor: add a hint how to adjust max rows displayed #9845 (comphead)
- Exclude .github directory from release tarball #9850 (andygrove)
- move strpos, substr functions to datafusion_functions #9849 (Omega359)
- doc: Add `datafusion-federation` to Integrations #9853 (phillipleblanc)
- chore(deps): update cargo requirement from 0.77.0 to 0.78.1 #9844 (dependabot[bot])
- chore(deps-dev): bump webpack-dev-middleware from 5.3.3 to 5.3.4 in /datafusion/wasmtest/datafusion-wasm-app #9741 (dependabot[bot])
- Implement semi/anti join output statistics estimation #9800 (korowa)
- move Log2, Log10, Ln to datafusion-functions #9869 (tinfoil-knight)
- Add CI compile checks for feature flags in datafusion-functions #9772 (alamb)
- move the Translate, SubstrIndex, FindInSet functions to datafusion-functions #9864 (Omega359)
- Support custom struct field names with new scalar function named_struct #9743 (gstvg)
- Allow declaring partition columns in `PARTITION BY` clause, backwards compatible #9599 (MohamedAbdeen21)
- Minor: Move depcheck out of datafusion crate (200 less crates to compile) #9865 (alamb)
- Minor: delete duplicate bench test #9866 (Lordworms)
- parquet: Add tests for pruning on Int8/Int16/Int64 columns #9778 (progval)
- move `Atan2`, `Atan`, `Acosh`, `Asinh`, `Atanh` to `datafusion-function` #9872 (Weijun-H)
- minor(doc): fix dead link for catalogs example #9883 (yjshen)
- parquet: Add tests for page pruning on unsigned integers #9888 (progval)
- fix(9870): common expression elimination optimization, should always re-find the correct expression during re-write. #9871 (wiedld)
- [CI] Use alias for table.struct #9894 (jayzhan211)
- fix: detect non-recursive CTEs in the recursive `WITH` clause #9836 (jonahgao)
- Minor: Add SIGMOD paper reference to architecture guide #9886 (alamb)
- refactor: add macro for the binary math function in `datafusion-function` #9889 (Weijun-H)
- Add benchmark for substr_index #9878 (Omega359)
- Add test for reading back file created with `COPY ... OPTIONS (FORMAT..)` options #9753 (alamb)
- Add Expr->String for SimilarTo, IsNotTrue, IsNotUnknown, Negative #9902 (yyy1000)
- refactor: make dfschema wrap schemaref #9595 (haohuaijin)
- Add `spilled_rows` metric to `ExternalSorter` by `IPCWriter` #9885 (erenavsarogullari)
- Minor: Add ParquetExec::table_parquet_options accessor #9909 (alamb)
- Add support for Bloom filters on unsigned integer columns in Parquet tables #9770 (progval)
- Move `radians`, `signum`, `sin`, `sinh` and `sqrt` functions to `datafusion-functions` crate #9882 (erenavsarogullari)
- refactor: make all udf function impls public #9903 (universalmind303)
- Minor: Improve math expr description #9911 (caicancai)
- perf: Use `Arc<str>` instead of `Cow<&'a>` in the analyzer #9824 (comphead)
- Use `struct` instead of `named_struct` when there are no aliases #9897 (alamb)
- Improve planning speed using `impl Into<Arc<str>>` to create Arc rather than `&str` #9916 (alamb)
- Make FirstValue an UDAF, Change `AggregateUDFImpl::accumulator` signature, support ORDER BY for UDAFs #9874 (jayzhan211)
- Add TPCH-DS planning benchmark #9907 (alamb)
- Simplify Expr::map_children #9876 (peter-toth)
- CrossJoin Refactor #9830 (berkaysynnada)
- Optimization: concat function #9732 (JasonLi-cn)
- Improve `AggregateUDFImpl::state_fields` documentation #9919 (alamb)
- chore(deps): update substrait requirement from 0.28.0 to 0.29.0 #9942 (dependabot[bot])
- test: fix intermittent failure in cte.slt #9934 (jonahgao)
- Move `cbrt`, `cos`, `cosh`, `degrees` to `datafusion-functions` #9938 (erenavsarogullari)
- Add Expr->String for Exists, Sort #9936 (kevinmingtarja)
- Remove `OwnedTableReference` and `OwnedSchemaReference` #9933 (comphead)
- Prune out constant expressions from output ordering. #9947 (mustafasrepo)
- Move `AggregateExpr`, `PhysicalExpr` and `PhysicalSortExpr` to physical-expr-core #9926 (jayzhan211)
- Minor: Update release README #9956 (alamb)
- Optimize `COUNT(1)`: Change the sentinel value's type for COUNT(*) to Int64 #9944 (gruuya)
- Improve docs for `TableProvider::supports_filters_pushdown` and remove deprecated function #9923 (alamb)
- Minor: Improve documentation for AggregateUDFImpl::accumulator and `AccumulatorArgs` #9920 (alamb)
- Minor: improve TableReference docs #9952 (alamb)
- Fix datafusion-cli publishing #9955 (alamb)
- Simplify TreeNode recursions #9965 (peter-toth)
- Validate partitions columns in `CREATE EXTERNAL TABLE` if table already exists. #9912 (MohamedAbdeen21)
- Minor: Add additional documentation to `CommonSubexprEliminate` #9959 (alamb)
- Fix tpcds planning stack overflows - Join planning refactoring #9962 (Jefffrey)
- coercion vec[Dictionary, Utf8] to Dictionary for coalesce function #9958 (Lordworms)
- Minor: Update library documentation with new crates #9966 (alamb)
- Minor: Return InternalError rather than panic for `NamedStructField should be rewritten in OperatorToFunction` #9968 (alamb)
- minor: update MSRV 1.73 #9977 (comphead)
- Move First Value UDAF and builtin first / last function to `aggregate-functions` #9960 (jayzhan211)
- Minor: Avoid copying all expressions in `Analyzer` / `check_plan` #9974 (alamb)
- Minor: Improve documentation about optimizer #9967 (alamb)
- Minor: Use `Expr::apply()` instead of `inspect_expr_pre()` #9984 (peter-toth)
- Update documentation for COPY command #9931 (alamb)
- Minor: fix bug in pruning predicate doc #9986 (alamb)
- fix: improve `unnest_generic_list` handling of null list #9975 (jonahgao)
- Consistent LogicalPlan subquery handling in TreeNode::apply and TreeNode::visit #9913 (peter-toth)
- Remove unnecessary result in `DFSchema::index_of_column_by_name` #9990 (lewiszlw)
- Removes Bloom filter for Int8/Int16/Uint8/Uint16 #9969 (edmondop)
- Move LogicalPlan `tree_node` module #9995 (alamb)
- Optimize performance of substr_index and add tests #9973 (kevinmingtarja)
- move Floor, Gcd, Lcm, Pi to datafusion-functions #9976 (Omega359)
- Minor: Improve documentation on `LogicalPlan::apply*` and `LogicalPlan::map*` #9996 (alamb)
- move the Log, Power functions to datafusion-functions #9983 (tinfoil-knight)
- Remove FORMAT <..> backwards compatibility options from COPY #9985 (tinfoil-knight)
- move Trunc, Cot, Round, iszero functions to datafusion-functions #10000 (Omega359)
- Minor: Clarify documentation on `PruningStatistics::row_counts` and `PruningStatistics::null_counts` and make test match #10004 (alamb)
- Avoid `LogicalPlan::clone()` in `LogicalPlan::map_children` when possible #9999 (alamb)
- Introduce `TreeNode::exists()` API, avoid copying expressions #10008 (peter-toth)
- Minor: Make `LogicalPlan::apply_subqueries` and `LogicalPlan::map_subqueries` pub #9998 (alamb)
- Move Nanvl and random functions to datafusion-functions #10017 (Omega359)
- fix: reduce lock contention in `RepartitionExec::execute` #10009 (crepererum)
- chore(deps): update rstest requirement from 0.18.0 to 0.19.0 #10021 (dependabot[bot])
- Minor: Document LogicalPlan tree node transformations #10010 (alamb)
- Refactor `Optimizer` to use owned plans and `TreeNode` API (10% faster planning) #9948 (alamb)
- Further clarification of the supports_filters_pushdown documentation #9988 (cisaacson)
- Prune columns are all null in ParquetExec by row_counts, handle IS NOT NULL #9989 (Ted-Jiang)
- Improve the performance of ltrim/rtrim/btrim #10006 (JasonLi-cn)
- fix: `RepartitionExec` metrics #10025 (crepererum)
- modify emit() of TopK to emit on `batch_size` rather than `batch_size-1` #10030 (JasonLi-cn)
- Consolidate LogicalPlan tree node walking/rewriting code into one module #10034 (alamb)
- Introduce `OptimizerRule::rewrite` to rewrite in place, rewrite `ExprSimplifier` (20% faster planning) #9954 (alamb)
- Fix DistinctCount for timestamps with time zone #10043 (joroKr21)
- Improve documentation on `LogicalPlan` TreeNode methods #10037 (alamb)
- chore(deps): update prost-build requirement from =0.12.3 to =0.12.4 #10045 (crepererum)
- Fix datafusion-cli cursor isn't on the right position in windows 7 cmd #10028 (colommar)
- Always pass DataType to PrimitiveDistinctCountAccumulator #10047 (joroKr21)
- Stop copying plans in `LogicalPlan::with_param_values` #10016 (alamb)
- fix `NamedStructField should be rewritten in OperatorToFunction` in subquery regression (change `ApplyFunctionRewrites` to use TreeNode API) #10032 (alamb)
- Avoid copies in `InlineTableScan` via TreeNode API #10038 (alamb)
- Bump sccache-action to v0.0.4 #10060 (phillipleblanc)
- chore: add GitHub workflow to close stale PRs #10046 (andygrove)
- feat: eliminate redundant sorts on monotonic expressions #9813 (suremarc)
- Disable `crypto_expressions` feature properly for --no-default-features #10059 (phillipleblanc)
- Return self in EmptyExec and PlaceholderRowExec with_new_children #10052 (joroKr21)
- chore(deps): update sqllogictest requirement from 0.19.0 to 0.20.0 #10057 (dependabot[bot])
- Rename `FileSinkExec` to `DataSinkExec` #10065 (phillipleblanc)
- fix: Support Dict types in `in_list` physical plans #10031 (advancedxy)
- Prune pages are all null in ParquetExec by row_counts and fix NOT NULL prune #10051 (Ted-Jiang)
- Refactor `EliminateOuterJoin` to implement `OptimizerRule::rewrite()` #10081 (peter-toth)
- chore(deps): update substrait requirement from 0.29.0 to 0.30.0 #10084 (dependabot[bot])
- feat: optimize `lower` and `upper` functions #9971 (JasonLi-cn)
- Prepend sqllogictest explain result with line number #10019 (duongcongtoai)
- Use PhysicalExtensionCodec consistently #10075 (joroKr21)
- Minor: Do not truncate `SHOW ALL` in datafusion-cli #10079 (alamb)
- Minor: get mutable ref to `SessionConfig` in `SessionState` #10050 (MichaelScofield)
- Move `ceil`, `exp`, `factorial` to `datafusion-functions` crate #10083 (erenavsarogullari)
- feat: support `unnest` multiple arrays #10044 (jonahgao)
- cleanup(tests): Move tests from `push_down_projections.rs` to `optimize_projections.rs` #10071 (kavirajk)
- Move conversion of FIRST/LAST Aggregate function to independent physical optimizer rule #10061 (jayzhan211)
- Avoid copies in `CountWildcardRule` via TreeNode API #10066 (alamb)
- Coerce Dictionary types for scalar functions #10077 (viirya)
- Refactor `UnwrapCastInComparison` to implement `OptimizerRule::rewrite()` #10087 (peter-toth)
- Improve ApproxPercentileAccumulator merge api and fix bug #10056 (Ted-Jiang)
- Support http s3 endpoints in datafusion-cli via `CREATE EXTERNAL TABLE` #10080 (alamb)
- [Bug Fix]: Deem hash repartition unnecessary when input and output has 1 partition #10095 (mustafasrepo)
- fix: Specify row count in sort_batch for batch with no columns #10094 (viirya)
- Move concat, concat_ws, ends_with, initcap to datafusion-functions #10089 (Omega359)
- Update datafusion-cli docs, split up #10078 (alamb)
- Refactor physical create_initial_plan to iteratively & concurrently construct plan from the bottom up #10023 (Jefffrey)
- Adding TPCH benchmarks for Sort Merge Join #10092 (comphead)
- [minor] make parquet prune tests more readable #10112 (Ted-Jiang)
- Fix intermittent CI test failure in `joins.slt` #10120 (alamb)
- Update dependabot to consider datafusion-cli #10108 (Jefffrey)
- fix: another non-deterministic test in `joins.slt` #10122 (korowa)
- Minor: only trigger dependency check on changes to Cargo.toml #10099 (alamb)
- Refactor `UnwrapCastInComparison` to remove `Expr` clones #10115 (peter-toth)
- Fix large futures causing stack overflows #10033 (sergiimk)
- Avoid cloning in `log::simplify` and `power::simplify` #10086 (alamb)
- feat: `DataFrame` supports unnesting multiple columns #10118 (jonahgao)
- Minor: Refine dev/release/README.md #10129 (alamb)
- Minor: Add default for `Expr` #10127 (peter-toth)
- Update documentation to replace Apache Arrow DataFusion with Apache DataFusion #10130 (andygrove)
- Fix AVG groups accumulator ignoring return type #10114 (gruuya)
- Port `37.1.0` changes to main #10136 (alamb)
- chore(deps): update substrait requirement from 0.30.0 to 0.31.0 #10140 (dependabot[bot])
- Minor: Support more args for udaf #10146 (jayzhan211)
- Minor: Signature check for UDAF #10147 (jayzhan211)
- minor: avoid cloning the `SetExpr` during planning of `SelectInto` #10152 (jonahgao)
- Add distinct aggregate tests to sqllogictest #10158 (Jefffrey)
- Add test for LIKE newline handling #10160 (Jefffrey)
- minor: unparser cleanup and new roundtrip test #10150 (devinjdangelo)
- Support Duration and Union types in ScalarValue::iter_to_array #10139 (joroKr21)
- chore(deps): update sqlparser requirement from 0.44.0 to 0.45.0 #10137 (Jefffrey)
- fix: duplicate output for HashJoinExec in CollectLeft mode #9757 (korowa)
- Move coalesce to datafusion-functions and remove BuiltInScalarFunction #10098 (Omega359)
- [DOC] Add test example for backtraces #10143 (comphead)
- Update github repo links #10167 (lewiszlw)
- feat: support input reordering for `NestedLoopJoinExec` #9676 (korowa)
- minor: fix installation section link #10179 (comphead)
- Improve `TreeNode` and `LogicalPlan` APIs to accept owned closures, deprecate `transform_down_mut()` and `transform_up_mut()` #10126 (peter-toth)
- Projection Expression - Input Field Inconsistencies during Projection #10088 (berkaysynnada)
- implement short_circuits function for ScalarUDFImpl trait #10168 (Lordworms)
- Improve documentation on `TreeNode` #10035 (alamb)
- implement rewrite for ExtractEquijoinPredicate and avoid clone in filter #10165 (Lordworms)
- Update .asf.yaml to point to new mailing list #10189 (phillipleblanc)
- Update NOTICE.txt to be relevant to DataFusion #10185 (alamb)
- Update .asf.yaml to publish docs to datafusion.apache.org #10190 (phillipleblanc)
- Minor: Add `Column::from(Tableref, &FieldRef)`, `Expr::from(Column)` and `Expr::from(Tableref, &FieldRef)` #10178 (alamb)
- implement rewrite for FilterNullJoinKeys #10166 (Lordworms)
- Implement rewrite for EliminateOneUnion and EliminateJoin #10184 (Lordworms)
- Update links to point to datafusion.apache.org #10195 (phillipleblanc)
- Minor: Introduce `Expr::is_volatile()`, adjust `TreeNode::exists()` #10191 (peter-toth)
- Doc: Modify docs to fix old naming #10199 (comphead)
- [MINOR] Remove ScalarFunction from datafusion.proto #10173 #10202 (dmitrybugakov)
- Allow expr_to_sql unparsing with no quotes #10198 (phillipleblanc)
- Minor: Avoid a clone in ArrayFunctionRewriter #10204 (alamb)
- Move coalesce function from math to core #10201 (xxxuuu)
- fix: cargo warnings of import item #10196 (waynexia)
- Minor: Remove some clone in `TypeCoercion` #10203 (alamb)
- doc: fix subscribe mail link to datafusion mailing lists #10225 (jackwener)
- Minor: Prevent empty datafusion-cli commands #10219 (comphead)
- Optimize date_bin (2x faster) #10215 (simonvandel)
- Refactor sessionconfig set fns to avoid an unnecessary enum to string conversion #10141 (psvri)
- fix: reduce lock contention in distributor channels #10026 (crepererum)
- Avoid `Expr` copies in `OptimizeProjection`, 12% faster planning, encapsulate indices #10216 (alamb)
- chore: Create a doap file #10233 (tisonkun)
- Allow adding user defined metadata to `ParquetSink` #10224 (wiedld)
- refactor `EliminateDuplicatedExpr` optimizer pass to avoid clone #10218 (Lordworms)
- Support for median(distinct) aggregation function #10226 (Jefffrey)
- Add tests that `random()` and `uuid()` produce unique values for each row #10248 (alamb)
- ScalarUDF: Remove `supports_zero_argument` and avoid creating null array for empty args #10193 (jayzhan211)
- Add Expr->String for WindowFunction #10243 (yyy1000)
- Make function modules public, add Default impl's. #10239 (Omega359)
- chore: Update release scripts to reflect move to TLP #10235 (andygrove)
- Stop copying plans in `EliminateLimit` #10253 (kevinmingtarja)
- Minor Clean-up in JoinSelection Tests #10249 (berkaysynnada)
- fix: no longer support the `substring` function #10242 (jonahgao)
- Fix docs.rs build for datafusion-proto (hopefully) #10254 (alamb)
- Minor: Possibility to strip datafusion error name #10186 (comphead)
- Docs: Add governance page to contributor guide #10238 (alamb)
- Improve documentation on `ColumnarValue` #10265 (alamb)
- Minor: Add comments for removed protobuf nodes #10252 (alamb)
- feat: add static_name() to ExecutionPlan #10266 (waynexia)
- Zero-copy conversion from SchemaRef to DfSchema #10298 (tustvold)
- chore: Update Error for Unnest Rewriter #10263 (Weijun-H)
- feat(CLI): print column headers for empty query results #10300 (jonahgao)
- Clean-up: Remove AggregateExec::group_by() #10297 (berkaysynnada)
- Add mailing list descriptions to documentation #10284 (alamb)
- chore(deps): update substrait requirement from 0.31.0 to 0.32.0 #10279 (dependabot[bot])
- refactor: Convert `IPCWriter` metrics from `u64` to `usize` #10278 (erenavsarogullari)
- Validate ScalarUDF output rows and fix nulls for `array_has` and `get_field` for `Map` #10148 (duongcongtoai)
- Minor: return NULL for range and generate_series #10275 (Lordworms)
- docs: add download page #10271 (tisonkun)
- Minor: Add some more tests to map.slt #10301 (alamb)
- fix: Correct null_count in describe() #10260 (Weijun-H)
- chore: Add datatype info to error message #10307 (viirya)
- feat: add optimizer config param to avoid grouping partitions `prefer_existing_union` #10259 (NGA-TRAN)
- Remove `ScalarFunctionDefinition::Name` #10277 (lewiszlw)
- Display: Support `preserve_partitioning` on SortExec physical plan. #10153 (kavirajk)
- Fix build with missing `use` (" return internal_err!("UDF returned a different ..." ) #10317 (alamb)
- [Minor] Update link to list of committers in contributor guide #10312 (alamb)
- Optimize EliminateFilter to avoid unnecessary copies #10288 #10302 (dmitrybugakov)
- chore: add function to set prefer_existing_union #10322 (NGA-TRAN)
- `ExecutionPlan` visitor example documentation #10286 (matthewmturner)
- fix: schema error when parsing order-by expressions #10234 (jonahgao)
- Stop copying LogicalPlan and Exprs in `RewriteDisjunctivePredicate` #10305 (rohitrastogi)
- feat: unwrap casts of string and dictionary columns #10323 (erratic-pattern)
- feat: Determine ordering of file groups #9593 (suremarc)
- Stop copying LogicalPlan and Exprs in `DecorrelatePredicateSubquery` #10318 (alamb)
- Minor: Add additional coalesce tests #10334 (alamb)
- Minor: add a few more dictionary unwrap tests #10335 (alamb)
- Check list size before concat in ScalarValue #10329 (timsaucer)
- Split parquet bloom filter config and enable bloom filter on read by default #10306 (lewiszlw)
- Improve coerce API so it does not need DFSchema #10331 (alamb)
- Stop copying LogicalPlan and Exprs in `PropagateEmptyRelation` #10332 (dmitrybugakov)
- Stop copying LogicalPlan and Exprs in EliminateNestedUnion #10319 (emgeee)
- Fix clippy lints found by Clippy in Rust `1.78` #10353 (alamb)
- Minor: Add sql level test for lead/lag on arrays #10345 (alamb)
- fix: LogFunc simplify swaps arguments #10360 (erratic-pattern)
- Refine documentation for `Transformed::{update,map,transform}_data` #10355 (alamb)
- Clarify docs explaining the relationship between `SessionState` and `SessionContext` #10350 (alamb)
- Optimized push down filter #10291 #10366 (dmitrybugakov)
- Unparser: Support `ORDER BY` in window function definition #10370 (yyy1000)
- docs: Add DataFusion subprojects to navigation menu, other minor updates #10362 (andygrove)
- feat: Add CrossJoin match case to unparser #10371 (sardination)
- Minor: Do not force analyzer to copy logical plans #10367 (alamb)
- Minor: Move Sum aggregate function test to slt #10382 (jayzhan211)
- chore: remove DataPtr trait since Arc::ptr_eq ignores pointer metadata #10378 (intoraw)
- Move `Covariance` (Sample) `covar` / `covar_samp` to be a User Defined Aggregate Function #10372 (jayzhan211)
- Support limit in StreamingTableExec #10309 (lewiszlw)
- Minor: Move count test to slt #10383 (jayzhan211)
- [MINOR]: Reduce test run time #10390 (mustafasrepo)
- Fix `coalesce`, `struct` and `named_struct` expr_fn function to take multiple arguments #10321 (alamb)
- Minor: remove old `create_physical_expr` to `scalar_function` #10387 (jayzhan211)
- Move average unit tests to slt #10401 (lewiszlw)
- Move array_agg unit tests to slt #10402 (lewiszlw)
- feat: run expression simplifier in a loop until a fixedpoint or 3 cycles #10358 (erratic-pattern)
- Add `SessionContext`/`SessionState::create_physical_expr()` to create `PhysicalExpressions` from `Expr`s #10330 (alamb)