subdtype #247

Open
wants to merge 23 commits into
base: master
50 changes: 50 additions & 0 deletions .github/workflows/docs.yml
@@ -0,0 +1,50 @@
name: Documentation Build

on: [push, pull_request]

jobs:
  docbuild:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 100

      - name: Get tags
        run: git fetch --depth=1 origin +refs/tags/*:refs/tags/*

      - name: Set up minimal Python version
        uses: actions/setup-python@v2
        with:
          python-version: "3.10"

      - name: Get pip cache dir
        id: pip-cache
        run: echo "::set-output name=dir::$(pip cache dir)"

      - name: Setup pip cache
        uses: actions/cache@v2
        with:
          path: ${{ steps.pip-cache.outputs.dir }}
          key: pip-docs
          restore-keys: pip-docs

      - name: Install locales
        run: |
          sudo apt-get install language-pack-fr
          sudo localedef -i fr_FR -f UTF-8 fr_FR

      - name: Install dependencies
        run: |
          sudo apt install -y pandoc
          pip install --upgrade pip setuptools wheel
          pip install -r "requirements_docs.txt"
          pip install docutils==0.14 commonmark==0.8.1 recommonmark==0.5.0 babel==2.8
          pip install .

      - name: Build documentation
        run: sphinx-build -n -j auto -b html -d build/doctrees docs build/html

      - name: Doc Tests
        run: sphinx-build -a -j auto -b doctest -d build/doctrees docs build/doctest
1 change: 1 addition & 0 deletions .gitignore
@@ -11,6 +11,7 @@ build/
dist/
MANIFEST
*pytest_cache*
*mypy_cache*
.eggs

# WebDAV file system cache files
2 changes: 0 additions & 2 deletions docs/getting/tutorial.rst
@@ -1,5 +1,3 @@
.. _tutorial:

**************************
Tutorial
**************************
4 changes: 1 addition & 3 deletions docs/index.rst
@@ -1,7 +1,7 @@
:orphan:

Pint-pandas: Unit support for pandas
======================
=====================================

**Useful links**:
`Code Repository <https://github.com/hgrecco/pint-pandas>`__ |
@@ -66,9 +66,7 @@ Pint-pandas: Unit support for pandas

Getting started <getting/index>
User Guide <user/index>
Advanced topics <advanced/index>
ecosystem
API Reference <api/index>

.. toctree::
:maxdepth: 1
3 changes: 2 additions & 1 deletion docs/user/common.rst
@@ -58,8 +58,9 @@ Creating DataFrames from Series
The default operation of Pandas `pd.concat` function is to perform row-wise concatenation. When given a list of Series, each of which is backed by a PintArray, this will inefficiently convert all the PintArrays to arrays of `object` type, concatenate the several series into a DataFrame with that many rows, and then leave it up to you to convert that DataFrame back into column-wise PintArrays. A much more efficient approach is to concatenate Series in a column-wise fashion:

.. ipython:: python
:suppress:
:okwarning:

list_of_series = [pd.Series([1.0, 2.0], dtype="pint[m]") for i in range(0, 10)]
df = pd.concat(list_of_series, axis=1)


35 changes: 25 additions & 10 deletions docs/user/initializing.rst
@@ -4,35 +4,50 @@
Initializing data
**************************

There are several ways to initialize PintArrays in a DataFrame. Here's the most common methods. We'll use `PA_` and `Q_` as shorthand for PintArray and Quantity.
There are several ways to initialize a `PintArray` in a `DataFrame`. Here are the most common methods. We'll use `PA_` and `Q_` as shorthand for `PintArray` and `Quantity`.



.. ipython:: python
:okwarning:

import pandas as pd
import pint
import pint_pandas
import io

PA_ = pint_pandas.PintArray
ureg = pint_pandas.PintType.ureg
Q_ = ureg.Quantity

df = pd.DataFrame(
{
"A": pd.Series([1.0, 2.0], dtype="pint[m]"),
"B": pd.Series([1.0, 2.0]).astype("pint[m]"),
"C": PA_([2.0, 3.0], dtype="pint[m]"),
"D": PA_([2.0, 3.0], dtype="m"),
"E": PA_([2.0, 3.0], dtype=ureg.m),
"F": PA_.from_1darray_quantity(Q_([2, 3], ureg.m)),
"G": PA_(Q_([2.0, 3.0], ureg.m)),
"Ser1": pd.Series([1, 2], dtype="pint[m]"),
"Ser2": pd.Series([1, 2]).astype("pint[m]"),
"Ser3": pd.Series([1, 2], dtype="pint[m][Int64]"),
"Ser4": pd.Series([1, 2]).astype("pint[m][Int64]"),
"PArr1": PA_([1, 2], dtype="pint[m]"),
"PArr2": PA_([1, 2], dtype="pint[m][Int64]"),
"PArr3": PA_([1, 2], dtype="m"),
"PArr4": PA_([1, 2], dtype=ureg.m),
"PArr5": PA_(Q_([1, 2], ureg.m)),
"PArr6": PA_([1, 2],"m"),
}
)
df


In the first two Series examples above, the data was converted to Float64.

.. ipython:: python

df.dtypes


To avoid this conversion, specify the subdtype (the dtype of the magnitudes) in the dtype string, for example `"pint[m][Int64]"`, when constructing a `Series`. The default subdtype that pint-pandas converts to can be changed by modifying `pint_pandas.DEFAULT_SUBDTYPE`.

`PintArray` infers the subdtype from the data passed into it when no subdtype is specified in the dtype. It also accepts a pint `Unit` or a unit string as the dtype.


.. note::

"pint[unit]" must be used for the Series or DataFrame constuctor.
`"pint[unit]"` or `"pint[unit][subdtype]"` must be used for the Series or DataFrame constructor.
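
The two dtype-string shapes named in the note above can be illustrated with a small standalone sketch. This is not pint-pandas' own parsing code: `ParsedDtype`, `parse_dtype_string`, and the regular expression are hypothetical names invented here, and the sketch only assumes that the unit and the optional subdtype each sit in their own pair of square brackets.

```python
import re
from typing import NamedTuple, Optional


class ParsedDtype(NamedTuple):
    """Hypothetical container for the two parts of a pint dtype string."""
    unit: str
    subdtype: Optional[str]


# Assumed shape: "pint[<unit>]" optionally followed by "[<subdtype>]".
_PATTERN = re.compile(
    r"^pint\[(?P<unit>[^\[\]]+)\](?:\[(?P<subdtype>[^\[\]]+)\])?$"
)


def parse_dtype_string(text: str) -> ParsedDtype:
    """Split a dtype string into its unit and (possibly missing) subdtype."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a pint dtype string: {text!r}")
    # group("subdtype") is None when the second bracket pair is absent.
    return ParsedDtype(match.group("unit"), match.group("subdtype"))


without_subdtype = parse_dtype_string("pint[m]")
with_subdtype = parse_dtype_string("pint[m][Int64]")
```

A bare unit string such as `"m"` is rejected by this sketch, which mirrors the note: the bare form is only described as accepted by `PintArray`, not by the `Series` or `DataFrame` constructor.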
5 changes: 4 additions & 1 deletion docs/user/reading.rst
@@ -40,7 +40,10 @@ Let's read that into a DataFrame. Here io.StringIO is used in place of reading a
df = pd.read_csv(io.StringIO(test_data), header=[0, 1], index_col=[0, 1]).T
# df = pd.read_csv("/path/to/test_data.csv", header=[0, 1])
for col in df.columns:
df[col] = pd.to_numeric(df[col], errors="ignore")
try:
df[col] = pd.to_numeric(df[col])
except (ValueError, TypeError):
pass
df.dtypes
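
The per-column conversion in this hunk can also be written as a reusable helper. The following is a minimal sketch, assuming only that `pd.to_numeric` raises `ValueError` or `TypeError` for columns it cannot parse; the helper name `to_numeric_where_possible` and the sample frame are hypothetical and not part of this pull request.

```python
import pandas as pd


def to_numeric_where_possible(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with each fully numeric column converted."""
    out = df.copy()
    for col in out.columns:
        try:
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            # The column holds non-numeric values; leave it unchanged.
            pass
    return out


# Hypothetical sample data: one parseable column, one text column.
frame = pd.DataFrame({"speed": ["1.5", "2.5"], "label": ["a", "b"]})
converted = to_numeric_where_possible(frame)
```

Catching specific exception types rather than using a bare `except` keeps unrelated failures, such as `KeyboardInterrupt`, visible.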

