Make AffineScalarFunc hashable for Pandas, Pint and Pint-Pandas #170

MichaelTiemannOSC · 2023-07-05T23:58:31Z

Add hash to AffineScalarFunc so it works with Pandas, Pint, and Pint-Pandas.

Part of pandas-dev/pandas#53970
and hgrecco/pint#1615
and hgrecco/pint-pandas#140

Add __hash__ so it works with Pandas, Pint, and Pint-Pandas Signed-off-by: Michael Tiemann <[email protected]>

jbrockmendel · 2023-07-06T15:56:57Z

uncertainties/core.py

@@ -1849,6 +1849,10 @@ def std_dev(self):
    # Abbreviation (for formulas, etc.):
    s = std_dev

+    def __hash__(self):
+        # Placeholder until we figure out how to really make these hashable
+        return id(self)


will this satisfy the invariance x == y \Rightarrow hash(x) == hash(y)?

I ran some tests:

u = ufloat(1, 2) v = ufloat(1, 2) w = u+u z = w / 2 # u != v, expected # u == z, expected # id(u) != id(v) != id(z) a = float(1+2) b = float(2+1) c = a+a d = c/2 # a == b, expected # a == d, expected # b == d, expected # id(a) != id(b) != id(d)

I'll see what I can come up with to satisfy the invariant. Thanks!

Tweak hash function so that when x==y, hash(x)==hash(y) Test case: ``` import uncertainties from uncertainties import ufloat u = ufloat(1.23, 2.34) v = ufloat(1.23, 2.34) print(f"u{u} == v{v}: {u==v}") print(f"hash(u){hash(u)} == hash(v){hash(v)}: {hash(u)==hash(v)}") print(f"u{u} == (u+u)/2{(u+u)/2}: {u==(u+u)/2}") print(f"hash(u){hash(u)} == hash((u+u)/2){hash((u+u)/2)}: {hash(u)==hash((u+u)/2)}") ``` Signed-off-by: Michael Tiemann <[email protected]>

MichaelTiemannOSC · 2023-07-08T02:00:41Z

While I solved the invariant problem, when I ran pytest the pickle test failed. All others passed. I'll see what I can do about that...

Ensure that we have a linear_combo attribute before trying to look up its keys. Also re-indent so our conditional test is not too long, and add test case to test_uncertainties. Now test_uncertainties passes all 31 tests original tests plus the new one, and test_umath passes all 9 tests. Signed-off-by: Michael Tiemann <[email protected]>

jbrockmendel · 2023-07-08T04:39:42Z

uncertainties/test_uncertainties.py

@@ -2404,3 +2404,21 @@ def test_correlated_values_correlation_mat():
        assert arrays_close(
            numpy.array(cov_mat),
            numpy.array(uncert_core.covariance_matrix([x2, y2, z2])))
+
+    def test_hash():


are there any NaN-likes that will have x!=x? if you want those to be considered "matching" for e.g. dict lookups (and pandas Index lookups) those will need matching hashes

My understanding is that there is no canonical NaN in the uncertainties package:

import numpy as np import uncertainties from uncertainties import ufloat xx = ufloat(np.nan, 0) yy = ufloat(np.nan, 1) # nan+/-0 print(xx + xx) # nan+/-2 print(yy + yy) # False, because nan != nan print(xx == xx)

The uncertainties package has unumpy.isnan which works for both scalar and arrays. There's no dictionary compare against a nan. The PintArray type uses unp.isnan to check whether something is a nan or not. The pandas changes all use PintArray.isna to test NA-ness (including NaN-ness). But if I'm missing something, please let me know!

I've now changed the Pint-Pandas changes to use np.nan instead of pd.NA. While I was able to get everything to work with pd.NA, things were cleaner using np.nan as the NA value for uncertain values.

Call `expand` directly if self._linear_part is not yet expanded. Signed-off-by: Michael Tiemann <[email protected]>

Attempt to work around old version of nose that doesn't work with later versions of Python. Will it work? Signed-off-by: Michael Tiemann <[email protected]>

This reverts commit 7cca18c.

Attempt replacing `nose` (which has not been regularly supported for years) with `pytest` for better CI/CD experience. Signed-off-by: Michael Tiemann <[email protected]>

This reverts commit f3cb615.

Try running nosetests directly, rather than via setuptools, which trips up nose-1.3.7 for python >= 3.6. See nose-devs/nose#873 Signed-off-by: Michael Tiemann <[email protected]>

Install `future` to satisfy `builtins` for Python 2.7. Signed-off-by: Michael Tiemann <[email protected]>

MichaelTiemannOSC · 2023-07-30T12:13:42Z

All changes needed for Pandas 2.1 are now present and Pint/Pint-Pandas changes are ready for review. Could we get a fresh release with hashable AffineScalarFunc?

NelDav · 2024-01-23T12:10:04Z

I made similar changes in #189, just because I haven't seeen this MR before. I do not have any clue about uncertainties, but wouldn't it be necessary, to check the correlations as well? You may check my MR, I tried to do it there.

MichaelTiemannOSC · 2024-01-23T17:26:56Z

I made similar changes in #189, just because I haven't seeen this MR before. I do not have any clue about uncertainties, but wouldn't it be necessary, to check the correlations as well? You may check my MR, I tried to do it there.

@NelDav nice to meet you! The correlations are checked by expanding out the _linear_part. I created some test cases that demonstrate what I understand to be proper equality/inequality relationships. If your test case passes those, then great, because your hash function is simpler than mine. But you need to test all the corner cases you can think of as part of your PR. A nifty hash function alone is not enough.

NelDav · 2024-01-25T12:06:44Z

Hi @MichaelTiemannOSC, thank you for your answer. I merged your branch into mine an checked my hash function with your tests. Indeed, a few problems occoured. I then modified my hash calculation and now your tests are pasing. I think the new version is still simpler than your solution. However, I am far from an expert on how to calculate with uncertainties and how identity verification should be carried out. It would be very nice if you could look at my new proposal and share your impressions.

jagerber48 · 2024-07-21T03:58:26Z

See my comment on the other PR: #219 (comment)

Closing this PR in favor of #219. Though that one will likely be closed as well since I think we discovered implementing a hash requires a bit of a larger refactor of the AffineScalarFunc and LinearCombination code than simply slapping a __hash__ function somewhere.

Update core.py

d90c541

Add __hash__ so it works with Pandas, Pint, and Pint-Pandas Signed-off-by: Michael Tiemann <[email protected]>

jbrockmendel reviewed Jul 6, 2023

View reviewed changes

jbrockmendel reviewed Jul 8, 2023

View reviewed changes

MichaelTiemannOSC added 7 commits July 8, 2023 15:52

Improve efficiency of AffineScalarFunc hash

634db47

Call `expand` directly if self._linear_part is not yet expanded. Signed-off-by: Michael Tiemann <[email protected]>

Update appveyor.yml

7cca18c

Attempt to work around old version of nose that doesn't work with later versions of Python. Will it work? Signed-off-by: Michael Tiemann <[email protected]>

Revert "Update appveyor.yml"

af3447c

This reverts commit 7cca18c.

Replace nose with pytest

f3cb615

Attempt replacing `nose` (which has not been regularly supported for years) with `pytest` for better CI/CD experience. Signed-off-by: Michael Tiemann <[email protected]>

Revert "Replace nose with pytest"

5e40c49

This reverts commit f3cb615.

Update appveyor.yml

cd3b7e0

Try running nosetests directly, rather than via setuptools, which trips up nose-1.3.7 for python >= 3.6. See nose-devs/nose#873 Signed-off-by: Michael Tiemann <[email protected]>

Update appveyor.yml

a2d4bb1

Install `future` to satisfy `builtins` for Python 2.7. Signed-off-by: Michael Tiemann <[email protected]>

NelDav mentioned this pull request Jan 23, 2024

Add hash function for AffineScalarFunc class #189

Closed

wshanks force-pushed the master branch from 7950418 to 2738c32 Compare March 7, 2024 13:31

jagerber48 closed this Jul 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make AffineScalarFunc hashable for Pandas, Pint and Pint-Pandas #170

Make AffineScalarFunc hashable for Pandas, Pint and Pint-Pandas #170

MichaelTiemannOSC commented Jul 5, 2023

jbrockmendel Jul 6, 2023

MichaelTiemannOSC Jul 6, 2023

MichaelTiemannOSC commented Jul 8, 2023

jbrockmendel Jul 8, 2023

MichaelTiemannOSC Jul 8, 2023 •

edited

Loading

MichaelTiemannOSC Jul 30, 2023

MichaelTiemannOSC commented Jul 30, 2023

NelDav commented Jan 23, 2024

MichaelTiemannOSC commented Jan 23, 2024

NelDav commented Jan 25, 2024 •

edited

Loading

jagerber48 commented Jul 21, 2024

Make AffineScalarFunc hashable for Pandas, Pint and Pint-Pandas #170

Make AffineScalarFunc hashable for Pandas, Pint and Pint-Pandas #170

Conversation

MichaelTiemannOSC commented Jul 5, 2023

jbrockmendel Jul 6, 2023

Choose a reason for hiding this comment

MichaelTiemannOSC Jul 6, 2023

Choose a reason for hiding this comment

MichaelTiemannOSC commented Jul 8, 2023

jbrockmendel Jul 8, 2023

Choose a reason for hiding this comment

MichaelTiemannOSC Jul 8, 2023 • edited Loading

Choose a reason for hiding this comment

MichaelTiemannOSC Jul 30, 2023

Choose a reason for hiding this comment

MichaelTiemannOSC commented Jul 30, 2023

NelDav commented Jan 23, 2024

MichaelTiemannOSC commented Jan 23, 2024

NelDav commented Jan 25, 2024 • edited Loading

jagerber48 commented Jul 21, 2024

MichaelTiemannOSC Jul 8, 2023 •

edited

Loading

NelDav commented Jan 25, 2024 •

edited

Loading