Modin 0.28.0

@anmyachev released this 07 Mar 18:35
· 67 commits to master since this release
0.28.0
14452a8

This release introduces the modin.pandas.api.extensions module, faster implementations of merge and
groupby.rolling (the latter enabled by default), and new functions for working with Ray Dataset:
to_ray_dataset and from_ray_dataset. It also includes other new features, performance optimizations, and bug fixes.
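
As a rough illustration of the two operations named above, the sketch below runs a merge followed by a groupby rolling mean. The data and column names (`zone_id`, `fare`, `zone_name`) are invented for the example, and the fallback to plain pandas is only there so the snippet runs where Modin or an execution engine is not installed. The calls are the ordinary pandas-style ones; whether Modin takes a range-partitioning path for them depends on its configuration, which this sketch does not touch.

```python
try:
    import modin.pandas as pd  # needs an execution engine such as Ray or Dask
except ImportError:
    import pandas as pd  # assumption: fall back to pandas so the sketch still runs

# Hypothetical data, made up for this example.
trips = pd.DataFrame(
    {
        "zone_id": [1, 1, 1, 2, 2],
        "fare": [10.0, 12.0, 14.0, 20.0, 30.0],
    }
)
zones = pd.DataFrame({"zone_id": [1, 2], "zone_name": ["north", "south"]})

# Left join on the shared key; the call looks the same as in pandas.
merged = trips.merge(zones, on="zone_id", how="left")

# Rolling mean of the fare over a window of two rows within each zone.
rolling_mean = merged.groupby("zone_name")["fare"].rolling(2).mean()

print(merged)
print(rolling_mean)
```

The first row of each group has no predecessor inside the window, so its rolling mean is missing (NaN).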

Key Features and Updates Since 0.27.0

  • Stability and Bugfixes
    • FIX-#6935: Fix merge when right operand is an empty dataframe (#6941)
    • FIX-#6936: Fix read_parquet when dataset is created with to_parquet and index=False (#6937)
    • FIX-#6944: Apply isort formatting for scripts from tutorials (#6945)
    • FIX-#6946: Remove needs: [lint-black-isort, ...] (#6947)
    • FIX-#6948: Fix groupby when Modin dataframe has several column partitions (#6951)
    • FIX-#6952: Use render_as_string to get sqlalchemy engine url (#6953)
    • FIX-#6968: Align API with pandas (#6969)
    • FIX-#6974: Always use actual pandas version in test_all_urls_exist (#6975)
    • FIX-#6982: Updating data in notebooks from yellow taxi to green taxi dataset (#6993)
    • FIX-#6984: Ensure the results of inplace operations materialize (for tests) (#6985)
  • Performance enhancements
    • PERF-#6976: Do not trigger unnecessary computations on ._propagate_index_objs() (#6977)
    • PERF-#6979: Do not trigger ._copartition() for identical indices on binary operations (#6980)
  • Refactor Codebase
    • REFACTOR-#6856: Rename read_pickle_distributed/to_pickle_distributed to read_pickle_glob/to_pickle_glob (#6957)
    • REFACTOR-#6939: Make modin.pandas.DataFrame._to_pandas a public method (#6940)
    • REFACTOR-#6958: Remove DataFrame.to_pickle_distributed in favour of DataFrame.modin.to_pickle_distributed (#6959)
    • REFACTOR-#7002: Get more information about exceptions from eval_general utility (#7003)
    • REFACTOR-#7008: Remove check_exception_type argument of eval_general function (#7009)
    • REFACTOR-#7013: Move to_pandas and to_ray_dataset into modin namespace (#7014)
    • REFACTOR-#7017: Align to_hdf and hist signatures to pandas (#7018)
  • Update testing suite
    • TEST-#6932: Don't use deprecated pandas._testing.makeStringIndex (#6933)
    • TEST-#6994: Update tests in test_series.py (#6995)
    • TEST-#6996: Update tests in test_io.py (#6997)
  • Documentation improvements
  • New Features
    • FEAT-#3044: Create Extensions Module in Modin (#6961)
    • FEAT-#4622: Unify data type of log_level in logging module (#6992)
    • FEAT-#6913: Support sqlalchemy connectables in read_sql by getting connection url (#6956)
    • FEAT-#6934: Support include_groups=False parameter in groupby.apply() (#6938)
    • FEAT-#6942: Enable range-partitioning impl for groupby().rolling() by default (#6943)
    • FEAT-#6965: Implement .merge() using range-partitioning implementation (#6966)
    • FEAT-#6970: Implement to/from_ray_dataset functions (#6971)
    • FEAT-#6983: Add Pluggable Documentation Module Support (#6986)
    • FEAT-#7001: Do not force materialization in MetaList.__getitem__() (#7006)

Contributors

@AndreyPavlenko
@Retribution98
@YarShev
@anmyachev
@arunjose696
@dchigarev
@sfc-gh-dpetersohn
@tochigiv