Converting a `pint.Quantity` to `PintArray` when adding to a `pd.DataFrame` #248

Musaefendic · 2024-08-12T09:05:52Z

Description

I tried adding a pint.Quantity to an existing pd.DataFrame, thinking that pint-pandas might transform the Quantity into a PintArray that:

- matches the number of rows
- preserves the unit.

Reproducible Example

import pandas as pd
from pint_pandas import PintArray, Quantity

# Existing pd.DataFrame
data = {"bar": [0.07, 0.30, 0.85, 1.00]}
df = pd.DataFrame(data)

# Trying to add a `pint.Quantity`
df['content'] = Quantity(42.0, units='percent')

df.dtypes 

# Output:
# bar      float64
# content  object    # <---- Expected: pint[percent]

output:

	bar	content
0	0.07	42.0 percent
1	0.30	42.0 percent
2	0.85	42.0 percent
3	1.00	42.0 percent

Question

The documentation does suggest using a pd.Series or a PintArray to achieve this, but that approach feels a bit verbose. I’d like to add a new column directly from a Quantity, mirroring how pandas lets you create a new column in an existing pd.DataFrame by assigning a float.

Would it make sense to convert a pint.Quantity into a PintArray when adding it to a pd.DataFrame?


andrewgsavage · 2024-08-12T10:03:28Z

An ExtensionArray must have a defined length, making your suggestion not possible.

Yes, it would be very nice to have. To get this to work, pandas needs to identify the quantity as a scalar of PintType, then try to construct a PintArray.
pandas-dev/pandas#27995

mutricyl · 2024-10-04T15:21:23Z

I started working on the pandas side of this issue: https://github.com/mutricyl/pandas/tree/27995_infer_EA_from_obj

I came across a constructor issue with pint_pandas:

>>> import pandas as pd
>>> import numpy as np
>>> import pint_pandas
>>> km = pd.Series([1.0, 2.0, 3.0], dtype="pint[km]")
>>> ndarray_object = km.to_numpy()  # creates a numpy array of Quantity with dtype == object
>>> ndarray_object
array([<Quantity(1.0, 'kilometer')>, <Quantity(2.0, 'kilometer')>,
       <Quantity(3.0, 'kilometer')>], dtype=object)
>>> pint_pandas.PintArray(ndarray_object)
NotImplementedError
>>> pint_pandas.PintArray(ndarray_object, dtype=type(ndarray_object))
ValueError: could not construct PintType

Am I using the PintArray constructor improperly, or should we be able to construct a PintArray from an ndarray of Quantities?

andrewgsavage · 2024-10-04T18:29:18Z

fyi I had a go at fixing it here
pandas-dev/pandas#59767

yes, ideally that should also work
