Commit

Backport PR pandas-dev#57323: REGR: Fix regression when grouping over a Series
phofl authored and meeseeksmachine committed Feb 10, 2024
1 parent 10b26fe commit 2f59813
Showing 3 changed files with 14 additions and 3 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.2.1.rst
@@ -21,6 +21,7 @@ Fixed regressions
 - Fixed regression in :meth:`.DataFrameGroupBy.idxmin`, :meth:`.DataFrameGroupBy.idxmax`, :meth:`.SeriesGroupBy.idxmin`, :meth:`.SeriesGroupBy.idxmax` ignoring the ``skipna`` argument (:issue:`57040`)
 - Fixed regression in :meth:`.DataFrameGroupBy.idxmin`, :meth:`.DataFrameGroupBy.idxmax`, :meth:`.SeriesGroupBy.idxmin`, :meth:`.SeriesGroupBy.idxmax` where values containing the minimum or maximum value for the dtype could produce incorrect results (:issue:`57040`)
 - Fixed regression in :meth:`CategoricalIndex.difference` raising ``KeyError`` when other contains null values other than NaN (:issue:`57318`)
+- Fixed regression in :meth:`DataFrame.groupby` raising ``ValueError`` when grouping by a :class:`Series` in some cases (:issue:`57276`)
 - Fixed regression in :meth:`DataFrame.loc` raising ``IndexError`` for non-unique, masked dtype indexes where result has more than 10,000 rows (:issue:`57027`)
 - Fixed regression in :meth:`DataFrame.merge` raising ``ValueError`` for certain types of 3rd-party extension arrays (:issue:`57316`)
 - Fixed regression in :meth:`DataFrame.sort_index` not producing a stable sort for a index with duplicates (:issue:`57151`)
5 changes: 2 additions & 3 deletions pandas/core/internals/managers.py
@@ -12,7 +12,6 @@
     cast,
 )
 import warnings
-import weakref

 import numpy as np

@@ -282,8 +281,8 @@ def references_same_values(self, mgr: BaseBlockManager, blkno: int) -> bool:
         Checks if two blocks from two different block managers reference the
         same underlying values.
         """
-        ref = weakref.ref(self.blocks[blkno])
-        return ref in mgr.blocks[blkno].refs.referenced_blocks
+        blk = self.blocks[blkno]
+        return any(blk is ref() for ref in mgr.blocks[blkno].refs.referenced_blocks)

     def get_dtypes(self) -> npt.NDArray[np.object_]:
         dtypes = np.array([blk.dtype for blk in self.blocks], dtype=object)
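
The change above swaps a membership test on weak references for an explicit identity check. As a rough illustration of why the two can behave differently in plain Python (this is not pandas code): `weakref.ref` objects whose referents are both alive compare equal according to the referents' own `__eq__`, so `ref in some_list` can invoke that `__eq__`, whereas `obj is ref()` never does. The class `ArrayLike` and both helper functions below are hypothetical names made up for this sketch, and the raising `__eq__` is only an assumed stand-in for an element-wise comparison.

```python
import weakref


class ArrayLike:
    """Hypothetical stand-in for an object with element-wise ``==``."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        # Mimic array-style comparison: mismatched lengths raise instead of
        # returning a plain bool.
        if not isinstance(other, ArrayLike):
            return NotImplemented
        if len(self.values) != len(other.values):
            raise ValueError("Lengths must match to compare")
        return self.values == other.values

    # Defining __eq__ disables the default hash; restore identity hashing.
    __hash__ = object.__hash__


def contains_by_equality(obj, refs):
    # Membership on a list falls back to ``==`` between weakrefs, which
    # compares the live referents using their own ``__eq__``.
    return weakref.ref(obj) in refs


def contains_by_identity(obj, refs):
    # Dereference each weakref and compare with ``is``; ``__eq__`` is never called.
    return any(obj is ref() for ref in refs)


target = ArrayLike([1, 2, 3])
other = ArrayLike([1, 2])
refs = [weakref.ref(other), weakref.ref(target)]

try:
    equality_result = contains_by_equality(target, refs)
except ValueError as exc:
    equality_result = f"raised {type(exc).__name__}"

identity_result = contains_by_identity(target, refs)
print(equality_result)
print(identity_result)
```

In this sketch the equality-based check fails as soon as it meets a live referent that cannot be compared with `target`, while the identity-based check simply walks past it.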
11 changes: 11 additions & 0 deletions pandas/tests/copy_view/test_methods.py
@@ -280,6 +280,17 @@ def test_reset_index_series_drop(using_copy_on_write, index):
     tm.assert_series_equal(ser, ser_orig)


+def test_groupby_column_index_in_references():
+    df = DataFrame(
+        {"A": ["a", "b", "c", "d"], "B": [1, 2, 3, 4], "C": ["a", "a", "b", "b"]}
+    )
+    df = df.set_index("A")
+    key = df["C"]
+    result = df.groupby(key, observed=True).sum()
+    expected = df.groupby("C", observed=True).sum()
+    tm.assert_frame_equal(result, expected)
+
+
 def test_rename_columns(using_copy_on_write):
     # Case: renaming columns returns a new dataframe
     # + afterwards modifying the result