```python
import numpy as np
import numcodecs

filter = numcodecs.fixedscaleoffset.FixedScaleOffset(273.15, 1.0, 'f4', astype='f4')

# temp in K
temperature = np.array([263.05, 273.05, 273.35, 283.25, 293.55, 304.05, 313.94998], dtype=np.float32)

# scale temperature in degrees K to degrees C
filter.encode(temperature)
```
Problem description
I'm looking at implementing lossy compression with the BitRound filter for some large weather datasets stored in zarr. Some parameters are stored in units that make every value fairly large in magnitude (e.g. well outside the range [2^0, 2^1]). One example is temperature in Kelvin. Because BitRound keeps a fixed number of mantissa bits, its quantization error grows with the magnitude of the value, so the errors in such cases are larger than they need to be.
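To make the magnitude dependence concrete, here is a minimal sketch of the worst-case rounding error when only `keepbits` mantissa bits are kept. The helper `max_bitround_error` is a hypothetical name used for illustration, not part of numcodecs, and it assumes round-to-nearest on a nonzero, normal floating-point value.

```python
import math


def max_bitround_error(value: float, keepbits: int) -> float:
    """Worst-case error of rounding `value` to `keepbits` mantissa bits.

    Hypothetical helper for illustration; assumes value != 0.
    """
    # value lies in [2**exponent, 2**(exponent + 1))
    exponent = math.floor(math.log2(abs(value)))
    # Gap between neighbouring representable values in that interval
    spacing = 2.0 ** (exponent - keepbits)
    # Round-to-nearest is off by at most half a gap
    return spacing / 2


kelvin_error = max_bitround_error(313.95, keepbits=8)  # warmest sample value in K
celsius_error = max_bitround_error(40.8, keepbits=8)   # the same value in degrees C
print(kelvin_error, celsius_error)
```

With `keepbits=8`, any value in [256, 512) has a bound of 0.5, while a value in [32, 64) has a bound of 0.0625, which is consistent with the ±0.5 and ±0.0625 figures reported in this issue.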
If I could offset the data to a more reasonable range, I could achieve smaller quantization errors. It looked like FixedScaleOffset would be just the ticket after I saw that it accepts an astype argument. Unfortunately, FixedScaleOffset always rounds the data to integers before casting to that type. I tested a local implementation of FixedScaleOffset and found that removing this rounding achieved the desired behavior.
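The effect of the rounding step can be illustrated in plain NumPy. This is a sketch that assumes the encode step amounts to `(x - offset) * scale`, followed by rounding and a cast to `astype`, as described above; it does not call numcodecs itself.

```python
import numpy as np

offset, scale = 273.15, 1.0
temperature = np.array(
    [263.05, 273.05, 273.35, 283.25, 293.55, 304.05, 313.94998],
    dtype=np.float32,
)

# Assumed encode steps: shift by the offset, then apply the scale factor
shifted = (temperature - offset) * scale

# Current behaviour as described in this issue: round to integers, then cast
with_rounding = np.around(shifted).astype("f4")

# Desired behaviour for a float astype: cast without rounding
without_rounding = shifted.astype("f4")

print(with_rounding)
print(without_rounding)
```

Even though `astype='f4'` could represent the fractional part, the rounded result holds only whole degrees, so any later BitRound stage has nothing finer to work with.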
I would like to chain the FixedScaleOffset and BitRound filters in a way that minimizes quantization errors. In one local test, I used bit rounding with keepbits=8 on a temperature array. The maximum quantization errors were +/-0.5 degrees. After first applying FixedScaleOffset without integer rounding, these errors were reduced to +/-0.0625 degrees.
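A comparison along these lines can be sketched with a small stand-alone bit-rounding routine. `bitround_f32` below is a hypothetical re-implementation written for this example (round-to-nearest, ties to even, on float32, for 1 <= keepbits <= 22), not the numcodecs BitRound codec, and the array is the sample from the code above rather than the data used in the local test.

```python
import numpy as np


def bitround_f32(values, keepbits):
    """Round float32 values to `keepbits` mantissa bits (illustrative sketch)."""
    drop = 23 - keepbits  # float32 has 23 explicit mantissa bits
    bits = np.asarray(values, dtype=np.float32).copy().view(np.uint32)
    # Add just under half of the dropped range, plus the lowest kept bit,
    # so that exact ties round to the even neighbour
    half_minus_one = np.uint32((1 << (drop - 1)) - 1)
    tie_bit = (bits >> np.uint32(drop)) & np.uint32(1)
    bits = bits + tie_bit + half_minus_one
    # Zero out the dropped mantissa bits
    bits = bits & np.uint32((0xFFFFFFFF >> drop) << drop)
    return bits.view(np.float32)


temperature = np.array(
    [263.05, 273.05, 273.35, 283.25, 293.55, 304.05, 313.94998],
    dtype=np.float32,
)
keepbits = 8

# Bit rounding applied directly to Kelvin values
kelvin_error = np.abs(bitround_f32(temperature, keepbits) - temperature).max()

# Offset to degrees C without integer rounding, then bit round
celsius = temperature - np.float32(273.15)
celsius_error = np.abs(bitround_f32(celsius, keepbits) - celsius).max()

print(kelvin_error, celsius_error)
```

For this small sample the measured errors stay within the 0.5 and 0.0625 bounds; the exact maxima depend on the data.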
Potential enhancement
We could check whether the astype argument is an integer dtype. If it is, we apply rounding. Otherwise, we leave the scaled values unrounded.
Or, we could add an optional argument to FixedScaleOffset that controls whether or not rounding to integers is applied and default that to True for backwards compatibility.
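Both options can be sketched in plain NumPy. The function names `encode_auto` and `encode_flag` and the parameter `round_to_int` are hypothetical, chosen only for this illustration; the actual codec has more to it (for example decoding and configuration), and the encode arithmetic is assumed to be `(x - offset) * scale`.

```python
import numpy as np


def _scale_offset(arr, offset, scale, astype, round_to_int):
    """Shared helper: shift, scale, optionally round, then cast."""
    enc = (np.asarray(arr) - offset) * scale
    if round_to_int:
        enc = np.around(enc)
    return enc.astype(astype)


def encode_auto(arr, offset, scale, astype):
    """Option 1 (hypothetical): round only when astype is an integer dtype."""
    round_to_int = np.issubdtype(np.dtype(astype), np.integer)
    return _scale_offset(arr, offset, scale, astype, round_to_int)


def encode_flag(arr, offset, scale, astype, round_to_int=True):
    """Option 2 (hypothetical): explicit flag, defaulting to the current behaviour."""
    return _scale_offset(arr, offset, scale, astype, round_to_int)


temperature = np.array([263.05, 293.55], dtype=np.float32)
print(encode_auto(temperature, 273.15, 1.0, "i2"))  # integer target: rounded
print(encode_auto(temperature, 273.15, 1.0, "f4"))  # float target: not rounded
print(encode_flag(temperature, 273.15, 1.0, "f4", round_to_int=False))
```

Option 1 changes the output for existing users who already pass a float astype, while option 2 leaves existing behaviour untouched unless the flag is set, which is the backwards-compatibility trade-off mentioned above.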