Modin 0.28.0
This release introduces the `modin.pandas.api.extensions` module, faster implementations for the `merge` and `groupby.rolling` (by default) functions, and new functions to work with Ray Dataset: `to/from_ray_dataset`.
It also includes other new features, performance optimizations, and bug fixes.
Key Features and Updates Since 0.27.0
- Stability and Bugfixes
  - FIX-#6935: Fix `merge` when right operand is an empty dataframe (#6941)
  - FIX-#6936: Fix `read_parquet` when dataset is created with `to_parquet` and `index=False` (#6937)
  - FIX-#6944: Apply `isort` formatting for scripts from tutorials (#6945)
  - FIX-#6946: Remove `needs: [lint-black-isort, ...]` (#6947)
  - FIX-#6948: Fix `groupby` when Modin dataframe has several column partitions (#6951)
  - FIX-#6952: Use `render_as_string` to get sqlalchemy engine url (#6953)
  - FIX-#6968: Align API with pandas (#6969)
  - FIX-#6974: Always use actual pandas version in `test_all_urls_exist` (#6975)
  - FIX-#6982: Updating data in notebooks from yellow taxi to green taxi dataset (#6993)
  - FIX-#6984: Ensure the results of inplace operations materialize (for tests) (#6985)
- Performance enhancements
- Refactor Codebase
  - REFACTOR-#6856: Rename `read_pickle_distributed/to_pickle_distributed` to `read_pickle_glob/to_pickle_glob` (#6957)
  - REFACTOR-#6939: Make `modin.pandas.DataFrame._to_pandas` a public method (#6940)
  - REFACTOR-#6958: Remove `DataFrame.to_pickle_distributed` in favour of `DataFrame.modin.to_pickle_distributed` (#6959)
  - REFACTOR-#7002: Get more information about exceptions from `eval_general` utility (#7003)
  - REFACTOR-#7008: Remove `check_exception_type` argument of `eval_general` function (#7009)
  - REFACTOR-#7013: Move `to_pandas` and `to_ray_dataset` into modin namespace (#7014)
  - REFACTOR-#7017: Align `to_hdf` and `hist` signatures to pandas (#7018)
- Update testing suite
- Documentation improvements
- New Features
  - FEAT-#3044: Create Extensions Module in Modin (#6961)
  - FEAT-#4622: Unify data type of `log_level` in logging module (#6992)
  - FEAT-#6913: Support sqlalchemy connectables in `read_sql` by getting connection url (#6956)
  - FEAT-#6934: Support `include_groups=False` parameter in `groupby.apply()` (#6938)
  - FEAT-#6942: Enable range-partitioning impl for `groupby().rolling()` by default (#6943)
  - FEAT-#6965: Implement `.merge()` using range-partitioning implementation (#6966)
  - FEAT-#6970: Implement `to/from_ray_dataset` functions (#6971)
  - FEAT-#6983: Add Pluggable Documentation Module Support (#6986)
  - FEAT-#7001: Do not force materialization in `MetaList.__getitem__()` (#7006)
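
The notes above name a range-partitioning implementation for `merge` and `groupby().rolling()` without describing it. As a rough illustration of the general idea only, and not Modin's actual code, the pure-Python sketch below buckets rows from both sides by key range so that equal keys always land in the same bucket, then joins each bucket pair independently. Every name here (`range_partition`, `inner_join`, `range_partitioned_merge`, the `pivots` argument) is hypothetical and chosen for this example.

```python
from bisect import bisect_right
from collections import defaultdict


def range_partition(rows, key, pivots):
    """Split rows into len(pivots) + 1 buckets by comparing row[key] to sorted pivots."""
    buckets = [[] for _ in range(len(pivots) + 1)]
    for row in rows:
        buckets[bisect_right(pivots, row[key])].append(row)
    return buckets


def inner_join(left_rows, right_rows, key):
    """Plain hash join of two lists of dicts on a shared key column."""
    index = defaultdict(list)
    for r in right_rows:
        index[r[key]].append(r)
    return [{**l, **r} for l in left_rows for r in index.get(l[key], [])]


def range_partitioned_merge(left, right, key, pivots):
    """Join bucket i of `left` only with bucket i of `right`, then concatenate."""
    left_parts = range_partition(left, key, pivots)
    right_parts = range_partition(right, key, pivots)
    result = []
    for lp, rp in zip(left_parts, right_parts):
        # Each bucket pair is independent, so a distributed engine
        # could process these joins in parallel.
        result.extend(inner_join(lp, rp, key))
    return result


# Illustrative data (hypothetical).
left = [{"id": 1, "x": "a"}, {"id": 5, "x": "b"}, {"id": 9, "x": "c"}]
right = [{"id": 5, "y": 50}, {"id": 9, "y": 90}, {"id": 12, "y": 120}]
merged = range_partitioned_merge(left, right, key="id", pivots=[4, 8])
```

Because both sides are split with the same pivots, no bucket needs to see rows from any other bucket, which is what makes the per-bucket joins safe to run separately. How Modin chooses pivots and schedules the work is not covered by these notes.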
Contributors
@AndreyPavlenko
@Retribution98
@YarShev
@anmyachev
@arunjose696
@dchigarev
@sfc-gh-dpetersohn
@tochigiv