Add hash function for AffineScalarFunc class #189

Closed
wants to merge 20 commits into from
20 commits
d90c541
Update core.py
MichaelTiemannOSC Jul 5, 2023
5d429fe
Implement hash invariant
MichaelTiemannOSC Jul 7, 2023
40154ce
Fix pickling (broken by last commit)
MichaelTiemannOSC Jul 8, 2023
634db47
Improve efficiency of AffineScalarFunc hash
MichaelTiemannOSC Jul 8, 2023
7cca18c
Update appveyor.yml
MichaelTiemannOSC Jul 11, 2023
af3447c
Revert "Update appveyor.yml"
MichaelTiemannOSC Jul 11, 2023
f3cb615
Replace nose with pytest
MichaelTiemannOSC Jul 11, 2023
5e40c49
Revert "Replace nose with pytest"
MichaelTiemannOSC Jul 11, 2023
cd3b7e0
Update appveyor.yml
MichaelTiemannOSC Jul 11, 2023
a2d4bb1
Update appveyor.yml
MichaelTiemannOSC Jul 11, 2023
d74a9d1
feat: Added hash function for AffineScalarFunc class
NelDav Jan 22, 2024
9a9d6d5
Merge remote-tracking branch 'hash/hash_for_pandas' into implement-ha…
NelDav Jan 25, 2024
3e0b064
fix: Merged hash method from "MichaelTiemannOSC". And created a corre…
NelDav Jan 25, 2024
1e57a2f
refactor: Nose does not like return types. Removed them.
NelDav Jan 25, 2024
e4ef9e1
Merge branch 'master' into implement-hash-for-AffineScalarFunc
NelDav Apr 2, 2024
4d8d268
fix: hash calculation works now. However, hash equality between Varia…
NelDav Apr 4, 2024
605b5cd
fix: Variabel hash is equal for objects which __eq__ returns True.
NelDav Apr 4, 2024
82285df
fix: Pickle is able to serialize/deserialize Variables again
NelDav Apr 5, 2024
3c64ae3
fix: pickling works now
NelDav Apr 5, 2024
185aa0d
fix: derivatives with value 0 are filtered out for hash calculation.
NelDav Apr 6, 2024
79 changes: 70 additions & 9 deletions uncertainties/core.py
@@ -1502,6 +1502,16 @@ def __bool__(self):
         """
         return bool(self.linear_combo)

+    def copy(self):
+        """Shallow copy of the LinearCombination object.
+
+        Returns:
+            LinearCombination: Copy of the object.
+        """
+        cpy = LinearCombination.__new__(LinearCombination)
+        cpy.linear_combo = self.linear_combo.copy()
+        return cpy
+
     def expanded(self):
         """
         Return True if and only if the linear combination is expanded.
@@ -1756,6 +1766,20 @@ def __bool__(self):

     ########################################

+    def __hash__(self):
+        """
+        Calculates the hash for any AffineScalarFunc object.
+        The hash is calculated from the nominal value and the derivatives.
+
+        Returns:
+            int: The hash of this object
+        """
+
+        # Derivatives that are zero must be filtered out, because the result is insensitive to the corresponding variables.
+        # The derivatives must be sorted, because the hash depends on their order but the equality of variables does not.
+        derivatives = sorted([(id(key), value) for key, value in self.derivatives.items() if value != 0])
+        return hash((self._nominal_value, tuple(derivatives)))
+
     # Uncertainties handling:

     def error_components(self):
@@ -2422,6 +2446,7 @@ def __setstate__(self, data_dict):
         """
         Hook for the pickle module.
         """
+
         for (name, value) in data_dict.items():
             # Contrary to the default __setstate__(), this does not
             # necessarily save to the instance dictionary (because the
@@ -2729,7 +2754,7 @@ def __init__(self, value, std_dev, tag=None):
         # differentiable functions: for instance, Variable(3, 0.1)/2
         # has a nominal value of 3/2 = 1, but a "shifted" value
         # of 3.1/2 = 1.55.
-        value = float(value)
+        self._nominal_value = float(value)

         # If the variable changes by dx, then the value of the affine
         # function that gives its value changes by 1*dx:
@@ -2739,7 +2764,7 @@ def __init__(self, value, std_dev, tag=None):
         # takes much more memory. Thus, this implementation chooses
         # more cycles and a smaller memory footprint instead of no
         # cycles and a larger memory footprint.
-        super(Variable, self).__init__(value, LinearCombination({self: 1.}))
+        super(Variable, self).__init__(self._nominal_value, LinearCombination({self: 1.}))

         self.std_dev = std_dev # Assignment through a Python property

@@ -2766,6 +2791,21 @@ def std_dev(self, std_dev):

         self._std_dev = float(std_dev)

+    def __hash__(self):
+        """
+        Calculates the hash for any `Variable` object.
+        The implementation is the same as for `AffineScalarFunc`.
+        But this method sets the `_linear_part` manually.
+        It is set to a single entry with a self reference as key and 1.0 as value.
+
+        Returns:
+            int: The hash of this object
+        """
+
+        # The manual implementation of the _linear_part is necessary, because pickle would not work otherwise.
+        # That is because of the self reference inside the _linear_part.
+        return hash((self._nominal_value, ((id(self), 1.),)))
+
     # The following method is overridden so that we can represent the tag:
     def __repr__(self):

@@ -2776,13 +2816,6 @@ def __repr__(self):
         else:
             return "< %s = %s >" % (self.tag, num_repr)

-    def __hash__(self):
-        # All Variable objects are by definition independent
-        # variables, so they never compare equal; therefore, their
-        # id() are allowed to differ
-        # (http://docs.python.org/reference/datamodel.html#object.__hash__):
-        return id(self)
-
     def __copy__(self):
         """
         Hook for the standard copy module.
@@ -2817,6 +2850,34 @@ def __deepcopy__(self, memo):

         return self.__copy__()

+    def __getstate__(self):
+        """
+        Hook for the pickle module.
+
+        Same as for AffineScalarFunc, but removes the linear part,
+        since it only contains a self reference.
+        This would lead to problems when unpickling the linear part.
+        """
+
+        LINEAR_PART_NAME = "_linear_part"
+        state = super().__getstate__()
+
+        if LINEAR_PART_NAME in state:
+            del state[LINEAR_PART_NAME]
+
+        return state
+
+    def __setstate__(self, state):
+        """
+        Hook for the pickle module.
+
+        Same as for AffineScalarFunc, but manually sets the linear part,
+        which is removed when pickling Variable objects.
+        """
+
+        super().__setstate__(state)
+        self._linear_part = LinearCombination({self: 1.})


###############################################################################
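The pickling hooks above can be illustrated outside the library with a minimal sketch. `Node`, `value`, and `linear_part` are hypothetical stand-ins, not names from `uncertainties`; the sketch assumes the underlying problem is that a mapping keyed by the object itself must hash the object before its attributes have been restored.

```python
import pickle


class Node:
    """Toy stand-in for a variable whose state holds a dict keyed by itself."""

    def __init__(self, value):
        self.value = float(value)
        # Self-referential mapping, analogous to LinearCombination({self: 1.}).
        self.linear_part = {self: 1.0}

    def __hash__(self):
        # The hash reads an attribute, so it only works on a fully built object.
        return hash((self.value, id(self)))

    def __eq__(self, other):
        return self is other

    def __getstate__(self):
        # Drop the self-referential entry; it can be rebuilt from scratch.
        state = self.__dict__.copy()
        state.pop("linear_part", None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Rebuild the mapping once `value` exists, so hashing `self` is safe.
        self.linear_part = {self: 1.0}


restored = pickle.loads(pickle.dumps(Node(3)))
print(restored.value, restored.linear_part[restored])
```

Rebuilding the mapping in `__setstate__` keeps the pickled payload free of the cycle, at the cost of treating the linear part as derived state rather than stored state.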

49 changes: 47 additions & 2 deletions uncertainties/test_uncertainties.py
@@ -18,9 +18,9 @@
from builtins import map
from builtins import range
import copy
import weakref
import math
from math import isnan, isinf
from uncertainties.umath import cos
import random
import sys

@@ -30,7 +30,7 @@
 # Local modules

 import uncertainties.core as uncert_core
-from uncertainties.core import ufloat, AffineScalarFunc, ufloat_fromstr
+from uncertainties.core import ufloat, AffineScalarFunc, ufloat_fromstr, LinearCombination
 from uncertainties import umath

 # The following information is useful for making sure that the right
@@ -2338,3 +2338,48 @@ def test_correlated_values_correlation_mat():
         assert arrays_close(
             numpy.array(cov_mat),
             numpy.array(uncert_core.covariance_matrix([x2, y2, z2])))
+
+def test_hash():
+    '''
+    Tests the invariant that if x==y, then hash(x)==hash(y)
+    '''
+
+    a = ufloat(1.23, 2.34)
+    b = ufloat(1.23, 2.34)
+
+    # nominal values and std_dev terms are equal, but...
+    assert a.n==b.n and a.s==b.s
+    # ...a and b are independent variables, therefore not equal as uncertain numbers
+    assert a != b
+    assert hash(a) != hash(b)
+
+    # order of calculation should be irrelevant
+    assert a + b == b + a
+    assert hash(a + b) == hash(b + a)
+
+    # the equation (2x+x)/3 is equal to the variable x, so...
+    assert ((2*a+a)/3)==a
+    # ...hash of the equation and the variable should be equal
+    assert hash((2*a+a)/3)==hash(a)
Comment on lines +2361 to +2363
Collaborator

This one is simple enough that it works out okay, but if the calculation got more complicated than (2. + 1.) / 3 == 1. the equality check could fail due to floating point rounding error. Luckily though the equality check would fail along with the hash, so it would not be a violation of the model.
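To illustrate the rounding concern, here is a minimal sketch using plain floats in place of uncertain numbers. The helper `invariant_holds` is hypothetical and only encodes the rule "if x == y, then hash(x) == hash(y)"; it is not part of the library or of this PR.

```python
def invariant_holds(x, y):
    """Return True unless x == y while hash(x) != hash(y)."""
    return (x != y) or (hash(x) == hash(y))


exact = (2.0 + 1.0) / 3      # small integers: the result is exactly 1.0
inexact = 0.1 + 0.2          # not exactly 0.3 in binary floating point

print(exact == 1.0, invariant_holds(exact, 1.0))       # True True
print(inexact == 0.3, invariant_holds(inexact, 0.3))   # False True
```

In the second case the equality check fails together with the hash comparison, so the invariant is still respected even though the two values are "mathematically" equal.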

Author

Yes, I think you are right.
I had already wondered why the equality check is implemented in such a strange way,
but I did not look into the details. There is probably a good reason for it.


+    c = ufloat(1.23, 2.34)
+
+    # the values of the linear combination entries matter
+    x = AffineScalarFunc(1, LinearCombination({a:1, b:2, c:1}))
+    y = AffineScalarFunc(1, LinearCombination({a:1, b:2, c:2}))
+    assert x != y
+    assert hash(x) != hash(y)
+
+    # which value belongs to which variable matters: swapping them should not lead to the same hash
+    x = AffineScalarFunc(1, LinearCombination({a:1, b:2}))
+    y = AffineScalarFunc(1, LinearCombination({a:2, b:1}))
+    assert x != y
+    assert hash(x) != hash(y)
+
+    # test for a derivative with value 0
+    a = ufloat(0, 0.1)
+    b = ufloat(1, 0.1)
+    x = 2 * b
+    y = 2 * b * cos(a)
+    assert x == y
+    assert hash(x) == hash(y)
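The properties exercised by `test_hash` can be reproduced without the library. The sketch below is a simplified model under stated assumptions: `combo_hash` is a hypothetical helper that mirrors the sort-and-filter approach of the `__hash__` in this PR, and plain `object()` instances stand in for independent `Variable` objects.

```python
def combo_hash(nominal_value, derivatives):
    """Hash a nominal value plus a {variable: derivative} mapping.

    Hypothetical helper: zero derivatives are dropped and the
    remaining (id, derivative) pairs are sorted before hashing.
    """
    pairs = sorted((id(var), d) for var, d in derivatives.items() if d != 0)
    return hash((nominal_value, tuple(pairs)))


# Plain objects stand in for independent Variable instances.
a, b, c = object(), object(), object()

# Insertion order of the mapping does not affect the hash.
same_order = combo_hash(1.0, {a: 1.0, b: 2.0}) == combo_hash(1.0, {b: 2.0, a: 1.0})
# A variable with a zero derivative does not affect the hash.
zero_ignored = combo_hash(1.0, {a: 1.0, c: 0.0}) == combo_hash(1.0, {a: 1.0})
print(same_order, zero_ignored)
```

Because the pairs are keyed by `id()`, the hash is only stable within one process, which is consistent with the old `Variable.__hash__` returning `id(self)`.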