polars plugin to support quantities: value and a physical unit
This package is still in early phases of development. Supported features:
- use expression transparently on the numeric columns
- functions that take an argument (e.g.
clip
) that is not an expression - propagate physical units when combining different units
- unit conversions
- pretty print units
The value and physical unit is stored as a struct.
Unit operations supported:
- addition/subtraction
- multiplication/division
- power
add qt.with_unit([("m", 1)])
to a Series
to specify the unit of
measure.
import polars as pl
df = pl.DataFrame(
{
"distance": pl.Series([1.0, 2.0, 3.0]).qt.with_unit([("m", 1)]),
"time": pl.Series([1.0, 2.0, 3.0]).qt.with_unit([("s", 1)]),
}
)
df
distance | time |
---|---|
struct[2] | struct[2] |
{1.0,[{"m",1}]} | {1.0,[{"s",1}]} |
{2.0,[{"m",1}]} | {2.0,[{"s",1}]} |
{3.0,[{"m",1}]} | {3.0,[{"s",1}]} |
Can apply functions on the underlying numeric column using .qt.<func>
on an expression
df.select(pl.col("distance").qt.mean())
distance |
---|
struct[2] |
{2.0,[{"m",1}]} |
<func>
can be any expression function that is supported on a numeric
column. It also works on functions that take 2 columns
df.with_columns(
dist_neg=pl.col("distance").qt.neg(),
dist_dist=pl.col("distance").qt.add(pl.col("distance")),
)
distance | time | dist_neg | dist_dist |
---|---|---|---|
struct[2] | struct[2] | struct[2] | struct[2] |
{1.0,[{"m",1}]} | {1.0,[{"s",1}]} | {-1.0,[{"m",1}]} | {2.0,[{"m",1}]} |
{2.0,[{"m",1}]} | {2.0,[{"s",1}]} | {-2.0,[{"m",1}]} | {4.0,[{"m",1}]} |
{3.0,[{"m",1}]} | {3.0,[{"s",1}]} | {-3.0,[{"m",1}]} | {6.0,[{"m",1}]} |
you need to use the .qt
on at least one operand (cannot subclass
pl.Series
) when doing basic arithmetic
df.with_columns(dist_squared=pl.col("distance").qt * pl.col("distance"))
distance | time | dist_squared |
---|---|---|
struct[2] | struct[2] | struct[2] |
{1.0,[{"m",1}]} | {1.0,[{"s",1}]} | {1.0,[{"m",2}]} |
{2.0,[{"m",1}]} | {2.0,[{"s",1}]} | {4.0,[{"m",2}]} |
{3.0,[{"m",1}]} | {3.0,[{"s",1}]} | {9.0,[{"m",2}]} |
df.with_columns(speed=pl.col("distance").qt / pl.col("time"))
distance | time | speed |
---|---|---|
struct[2] | struct[2] | struct[2] |
{1.0,[{"m",1}]} | {1.0,[{"s",1}]} | {1.0,[{"m",1}, {"s",-1}]} |
{2.0,[{"m",1}]} | {2.0,[{"s",1}]} | {1.0,[{"m",1}, {"s",-1}]} |
{3.0,[{"m",1}]} | {3.0,[{"s",1}]} | {1.0,[{"m",1}, {"s",-1}]} |
the plugin is implemented as a rust polars plugin.
A quantity Series
is stored as a Struct with two fields:
value
a numeric columnunit
a unit. This is implemented a a List column of Struct with the fieldsname
andpower
Polars doesn’t support yet Extentions Dtype so this implementation detail is shown to the user.
The core of the plugin unpacks the value
from the given series,
applies the original expression, and then repacks it a quantity Series
we need a runtime unit system so we can’t use the uom
crate, which is
a compile time unit check.
The idea of the implementation is that on the Rust side we only
implement the basic checks of units and operations, all the rest
(parsing from strings, simplify, formatting) will be implemented in
python, maybe using a library like pint
as a backend