WIP: Test lowest versions of all required and optional dependencies #3639
Using uv's --resolution lowest-direct to install the lowest compatible versions for all direct Python dependencies (required and optional) listed in the pyproject.toml file.
Need to get more recent version with a binary wheel, otherwise will need to compile from source.
Pin major versions for optional dependencies that are not on version 0.x. Using a reasonable major version released in the past year or two.
Pin minor versions of optional dependencies that are still on v0.x.
run: uv run --with pytest==8 --resolution lowest-direct --all-extras --dev make test_no_images PYTEST_EXTRA="-r P"
env:
  GMT_LIBRARY_PATH: $CONDA_PREFIX/lib
Getting this error at https://github.com/GenericMappingTools/pygmt/actions/runs/11945150103/job/33297317606?pr=3639#step:7:35:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/runner/work/pygmt/pygmt/pygmt/__init__.py", line 24, in <module>
from pygmt import datasets
File "/home/runner/work/pygmt/pygmt/pygmt/datasets/__init__.py", line 7, in <module>
from pygmt.datasets.earth_age import load_earth_age
File "/home/runner/work/pygmt/pygmt/pygmt/datasets/earth_age.py", line 12, in <module>
from pygmt.datasets.load_remote_dataset import _load_remote_dataset
File "/home/runner/work/pygmt/pygmt/pygmt/datasets/load_remote_dataset.py", line 9, in <module>
from pygmt.clib import Session
File "/home/runner/work/pygmt/pygmt/pygmt/clib/__init__.py", line 9, in <module>
from pygmt.clib.session import Session, __gmt_version__
File "/home/runner/work/pygmt/pygmt/pygmt/clib/session.py", line 111, in <module>
_libgmt = load_libgmt()
File "/home/runner/work/pygmt/pygmt/pygmt/clib/loading.py", line 62, in load_libgmt
raise GMTCLibNotFoundError("\n".join(error_msg))
pygmt.exceptions.GMTCLibNotFoundError: Error loading GMT shared library at '/home/runner/micromamba/envs/pygmt/lib/libgmt.so'.
/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /home/runner/micromamba/envs/pygmt/lib/./libgdal.so.34)
Error loading GMT shared library at 'libgmt.so'.
/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /home/runner/micromamba/envs/pygmt/lib/python3.10/lib-dynload/../.././libgdal.so.34)
make: *** [Makefile:35: _runtest] Error 1
Similar to issues mentioned at GenericMappingTools/try-gmt#50 (comment) and GenericMappingTools/try-gmt#37 (comment). Seems to only be on the Ubuntu/Linux build.
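The traceback surfaces as a `GMTCLibNotFoundError` because the underlying `ctypes` dlopen of `libgmt.so` fails with a symbol-version error. A rough, hypothetical illustration of how such a load failure turns into an `OSError` message (the helper and library names below are made up for the demo; PyGMT's actual loader lives in `pygmt/clib/loading.py`):

```python
import ctypes


def try_load(libname):
    """Attempt to dlopen a shared library; return the error message on failure."""
    try:
        ctypes.CDLL(libname)
        return None  # library loaded fine
    except OSError as exc:
        # On the CI runner this would read e.g.
        # "version `GLIBCXX_3.4.30' not found (required by .../libgdal.so.34)"
        return str(exc)


# A deliberately nonexistent library name, to exercise the error path:
print(try_load("libnonexistent_demo_xyz.so"))
```

PyGMT collects such messages for each candidate library path and re-raises them as the `GMTCLibNotFoundError` shown above.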
@@ -33,18 +33,18 @@ dependencies = [
     "numpy>=1.24",
     "pandas>=2.0",
     "xarray>=2023.04",
-    "netCDF4",
+    "netCDF4>=1.7",
That's a good point. NetCDF4 was needed by xarray when reading and writing netCDF files. As mentioned in #239 (comment), previously our `load_xxx` functions (e.g., `load_earth_relief`) called `xarray.load_dataarray` to load local netCDF grids into an `xarray.DataArray` object, making netCDF effectively a required dependency. After we refactored the `load_xxx` functions in #3120, we no longer call `xarray.load_dataarray`, so PyGMT should no longer strictly need netCDF.

I've opened PR #3643 to see how PyGMT relies on the netCDF package.

Edit: There are only 14 failures in the Python 3.11 + Ubuntu CI job after removing netCDF entirely (see https://github.com/GenericMappingTools/pygmt/actions/runs/11968416461/job/33367210857?pr=3643), and most of them are caused by `load_static_earth_relief`.
  # Install the package that we want to test
  - name: Install the package
-   run: make install
+   run: |
+     uv run --with pip==23 --resolution lowest-direct --all-extras --dev make install
Using `--resolution lowest-direct` for now, because `--resolution lowest` pulls in many other transitive dependencies at versions that may be hard to install (e.g. missing wheels for newer Python versions, requiring compilation from source). We could switch to `--resolution lowest` once the Python ecosystem does lower-bound pins a bit more thoroughly (which may take years).
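To illustrate the difference with a hypothetical `pyproject.toml` fragment (not PyGMT's actual one): `--resolution lowest-direct` pins only the packages listed under `dependencies` to their declared lower bounds, while `--resolution lowest` also forces every transitive dependency down to its oldest declared bound.

```toml
[project]
name = "example"
dependencies = [
    # lowest-direct installs pandas==2.0 here, but pandas' own dependencies
    # (numpy, pytz, ...) still resolve to recent versions; under lowest they
    # would be forced down to their oldest declared bounds too.
    "pandas>=2.0",
]
```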
pyproject.toml
"contextily>=1",
"geopandas>=0.14",
"IPython>=8", # 'ipython' is not the correct module name.
"pyarrow>=14",
"rioxarray>=0.14",
Setting lower bounds here, otherwise `uv` will try to install super old versions that may not work anymore, some of which actually go back to Python 2! E.g., it tried to install rioxarray=0.0.1 at https://github.com/GenericMappingTools/pygmt/actions/runs/11945100244/job/33297179010#step:7:45. Happy to reconsider some of these version pins if needed.

Edit: see comments at #3639 (comment) and #3639 (comment) below.
I'm unsure if `uv` is a good solution here, since it installs packages from PyPI only, while GMT is only available on conda-forge. It means we would mix packages from conda-forge and PyPI, which is usually not recommended.
Fix `TypeError: canonicalize_version() got an unexpected keyword argument 'strip_trailing_zero'` when running `pip install --no-deps -e .`, xref pypa/setuptools#4501
BTW, the "Install package" step has errors but the step still passes.
We have a similar situation in
Yes, I noticed that. Struggling to see if there's an upstream issue on this at https://github.com/astral-sh/uv/issues. Need to find a way to make it return a non-zero exit code.
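One possible cause (an assumption, not confirmed for this workflow) is that the failing command sits inside a shell pipeline, where the step's exit status is that of the last command in the pipe. bash's `pipefail` option makes the failure propagate:

```shell
# Without pipefail, a pipeline's exit status is that of its last command,
# so a failing install piped through e.g. `tee` still reports success.
bash -c 'false | cat'
echo "without pipefail: $?"

bash -c 'set -o pipefail; false | cat'
status=$?
echo "with pipefail: $status"
```

With `set -o pipefail`, the `false` in the pipeline makes the whole pipeline (and thus the CI step) exit non-zero.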
Create a proper virtualenv, and activate that virtualenv before installing pygmt and running pytest.
So that xyzservices is also installed by default and `from xyzservices import TileProvider` can work. See geopandas/contextily#183
So that `pyarrow.array(..., dtype="string_view")` will work. See apache/arrow#39652 when this was added.
Updated a few more minimum pins on optional dependencies, `contextily>=1.2` and `pyarrow>=16`, based on the failing tests.
"IPython", # 'ipython' is not the correct module name.
"pyarrow",
"rioxarray",
"contextily>=1.2",
Need to pin to `contextily>=1.2`, otherwise this code block won't work as expected (there is a chance that an ImportError will be raised if contextily=1.1 is installed but xyzservices is not installed, breaking the `_HAS_CONTEXTILY` logic):
pygmt/pygmt/datasets/tile_map.py
Lines 12 to 19 in 97185e8
try:
    import contextily
    from xyzservices import TileProvider

    _HAS_CONTEXTILY = True
except ImportError:
    TileProvider = None
    _HAS_CONTEXTILY = False
Xref geopandas/contextily#183, where `xyzservices` was included as a requirement of `contextily` (in v1.2.0).
Good to know that we require `contextily>=1.2`.
pyproject.toml
"contextily>=1.2",
"geopandas>=0.14",
"IPython>=8", # 'ipython' is not the correct module name.
"pyarrow>=16",
Pinning to `pyarrow>=16` here to fix this error:
________ test_to_numpy_pyarrow_array_pyarrow_dtypes_string[string_view] ________
> ???
E KeyError: 'string_view'
pyarrow/types.pxi:5025: KeyError
During handling of the above exception, another exception occurred:
dtype = 'string_view'
@pytest.mark.skipif(not _HAS_PYARROW, reason="pyarrow is not installed")
@pytest.mark.parametrize(
"dtype",
[
None,
"string",
"utf8", # alias for string
"large_string",
"large_utf8", # alias for large_string
"string_view",
],
)
def test_to_numpy_pyarrow_array_pyarrow_dtypes_string(dtype):
"""
Test the _to_numpy function with PyArrow arrays of PyArrow string types.
"""
> array = pa.array(["abc", "defg", "12345"], type=dtype)
../pygmt/tests/test_clib_to_numpy.py:333:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyarrow/array.pxi:230: in pyarrow.lib.array
???
pyarrow/types.pxi:5040: in pyarrow.lib.ensure_type
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E ValueError: No type alias for string_view
pyarrow/types.pxi:5027: ValueError
Code was added in #3608:
pygmt/pygmt/tests/test_clib_to_numpy.py
Lines 326 to 336 in 97185e8
        "string_view",
    ],
)
def test_to_numpy_pyarrow_array_pyarrow_dtypes_string(dtype):
    """
    Test the _to_numpy function with PyArrow arrays of PyArrow string types.
    """
    array = pa.array(["abc", "defg", "12345"], type=dtype)
    result = _to_numpy(array)
    _check_result(result, np.str_)
    npt.assert_array_equal(result, array)
The `pyarrow.StringViewArray` class was added in apache/arrow#39652, which was released in pyarrow v16.0.
PyArrow v16.0 was released in April 2024. Instead of pinning pyarrow>=16.0, we can just skip `string_view` if pyarrow<16.0.
Ok, I just tested older versions from pyarrow v12 to v15, and I think we can pin to `pyarrow>=13` (v13 was released on 24 Aug 2023) if ignoring `string_view`. With v12, there is another error on datetime:
_________________________________________________________________ test_to_numpy_pyarrow_array_pyarrow_dtypes_date[date64[ms]] __________________________________________________________________
dtype = 'date64[ms]', expected_dtype = 'datetime64[ms]'
@pytest.mark.skipif(not _HAS_PYARROW, reason="pyarrow is not installed")
@pytest.mark.parametrize(
("dtype", "expected_dtype"),
[
pytest.param("date32[day]", "datetime64[D]", id="date32[day]"),
pytest.param("date64[ms]", "datetime64[ms]", id="date64[ms]"),
],
)
def test_to_numpy_pyarrow_array_pyarrow_dtypes_date(dtype, expected_dtype):
"""
Test the _to_numpy function with PyArrow arrays of PyArrow date types.
date32[day] and date64[ms] are stored as 32-bit and 64-bit integers, respectively,
representing the number of days and milliseconds since the UNIX epoch (1970-01-01).
Here we explicitly check the dtype and date unit of the result.
"""
data = [
date(2024, 1, 1),
datetime(2024, 1, 2),
datetime(2024, 1, 3),
]
array = pa.array(data, type=dtype)
result = _to_numpy(array)
_check_result(result, np.datetime64)
> assert result.dtype == expected_dtype # Explicitly check the date unit.
E AssertionError: assert dtype('<M8[D]') == 'datetime64[ms]'
E + where dtype('<M8[D]') = array(['2024-01-01', '2024-01-02', '2024-01-03'], dtype='datetime64[D]').dtype
../pygmt/tests/test_clib_to_numpy.py:364: AssertionError
The main change appears to be in apache/arrow#33321, when pyarrow supported preserving the datetime64 temporal resolution.
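The unit mismatch in the failing assertion can be reproduced with NumPy alone: `datetime64[D]` and `datetime64[ms]` are distinct dtypes, so an array that comes back with day resolution (as with older pyarrow, per the traceback above) fails a comparison against `"datetime64[ms]"`. A small illustration:

```python
import numpy as np

# Day-resolution datetimes, matching the dtype('<M8[D]') seen in the traceback:
days = np.array(["2024-01-01", "2024-01-02"], dtype="datetime64[D]")
print(days.dtype == "datetime64[ms]")  # the failing comparison -> False
print(days.dtype == "datetime64[D]")   # -> True

# Millisecond resolution is a different dtype, reachable via astype:
ms = days.astype("datetime64[ms]")
print(ms.dtype == "datetime64[ms]")    # -> True
```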
Maybe we can keep the `string_view` test, but add a `skip_if(pyarrow.__version__ < 16)` marker or something like that.
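One way to express such a marker (a sketch, assuming the `packaging` library is available; the installed-version string below is hypothetical) is `pytest.param` with a `skipif` mark on just the `string_view` case:

```python
import pytest
from packaging.version import Version

PYARROW_VERSION = Version("15.0.2")  # hypothetical installed pyarrow version

@pytest.mark.parametrize(
    "dtype",
    [
        "string",
        "large_string",
        pytest.param(
            "string_view",
            marks=pytest.mark.skipif(
                PYARROW_VERSION < Version("16"),
                reason="string_view type requires pyarrow>=16",
            ),
        ),
    ],
)
def test_string_dtypes(dtype):
    ...  # test body elided; the point is the conditional skip on one parameter
```

Only the `string_view` parameter is skipped on older pyarrow; the other dtypes keep running.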
Adjusted pin to `pyarrow>=13` in commit 4733bda.
Lowered the pyarrow pin from 16 to 13, and added a skipif pytest marker to the `test_to_numpy_pyarrow_array_pyarrow_dtypes_string[string_view]` unit test so it does not run when pyarrow<16 is installed. Set the pin at pyarrow>=13 so that datetime64 unit preservation is handled, xref apache/arrow#33321.
Description of proposed changes

Test lowest versions of all direct dependencies of PyGMT listed in the pyproject.toml file (both required and optional dependencies).

This implements the idea at #3615 (comment) on `.github/workflows/ci_tests_legacy.yaml` (instead of `ci_tests.yaml`) by using the `uv` package manager's `--resolution lowest-direct` strategy (https://docs.astral.sh/uv/concepts/resolution/#resolution-strategy). Related to the optional dependency policy at #3456, and complements #3615 by testing lowest versions of transitive dependencies more thoroughly. The following version pins have been set:

- `xyzservices` as requirement, #3639 (comment)

Testing locally
If it helps, this is how I'm testing locally using `nektos/act`:

1. Modify `ci_tests_legacy.yaml` first: change `os: [ubuntu-20.04, macos-13, windows-2019]` to just `os: [ubuntu-20.04]` (or whatever OS you want to test on).
2. Run `act schedule --workflows '.github/workflows/ci_tests_legacy.yaml'`.
Reminders

- Run `make format` and `make check` to make sure the code follows the style guide.
- Add new public functions/methods/classes to `doc/api/index.rst`.
.Slash Commands
You can write slash commands (
/command
) in the first line of a comment to performspecific operations. Supported slash command is:
/format
: automatically format and lint the code