Preps towards Xarray-DGGS usage #6

Open
allixender opened this issue Nov 10, 2023 · 2 comments
Comments

@allixender
Owner

Original discussions at BiDS

  • Do not separate climate versus EO
  • It is more about grids: GIS grids versus global model grids
  • One assumption is co-gridding, i.e. samples are aligned (co-sampled)

Examples with some basic DGGS lib data manipulation, coordinate conversions, selecting etc: https://github.com/allixender/dggs_t1

Xarray

  • Data cubes: base structure of xarray
  • Data cubes can interoperate if they share the same discretisation.
  • Otherwise we need resampling or interpolation.
  • How do we put such a grid into xarray?

Time dimension in xarray is well handled.

Can we have a data cube which is not xyz? e.g. keep this time dimension.

  • DGGS part 1 in ISO 19170 abstract specification

quantization of time

How do we discretize. MTSIC (time)
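One way to read "quantization of time" is flooring each timestamp to a fixed-width bin, in the same way a cell id discretizes space. The sketch below is only an illustration under that assumption; the epoch, the 6-hour step, and the helper names `time_bin` and `bin_start` are hypothetical and were not part of the discussion.

```python
from datetime import datetime, timedelta, timezone

# Assumed reference instant for bin 0 (hypothetical choice).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_bin(t: datetime, step: timedelta) -> int:
    """Return the index of the fixed-width time bin that contains t."""
    return (t - EPOCH) // step


def bin_start(index: int, step: timedelta) -> datetime:
    """Return the first instant of the bin with the given index."""
    return EPOCH + index * step


step = timedelta(hours=6)  # assumed temporal resolution
t = datetime(2023, 10, 1, 14, 30, tzinfo=timezone.utc)
idx = time_bin(t, step)
print(idx, bin_start(idx, step).isoformat())
```

The bin index plays the role of a "time cell id": two samples are co-sampled in time when their indices match.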

Get sample dataset in DGGS

Let's forget about multi-resolution and we focus on mono-resolution.

Let's try to solve it for one resolution.

What are our requirements

  • Do we expect the grid system to already exist, i.e. do we re-use an existing one?
  • We do not invent a new grid system; we re-use an existing grid system.
  • This is not specific to DGGS: we want to explore it with DGGS, but it could be implemented with any other grid system.

Scope for the sprint

Original Pangeo DGGS code sprint repo:

https://github.com/pangeo-data/bids2023_codesprint

Benoit nicely explains stuff:

pangeo-data/bids2023_codesprint#3

We want to work on the cell id.

We can have orthogonal dimensions: time and z.

Repositories of examples:

Base operations:

  • Selection (.sel, isel), where (.where)
  • interpolation (.interp)
  • regridding (going between different kinds of grids)
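To make the selection operations concrete, here is a minimal pure-Python sketch of what nearest-neighbour selection and a `where`-style filter mean when the only spatial dimension is a cell id. The cell table, the values, and the helper names are made up for illustration; a real implementation would obtain cell centres from a DGGS library rather than a hard-coded dictionary.

```python
import math


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance between two lon/lat points in kilometres."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical cell table: cell id -> (lon, lat) of the cell centre.
cell_centres = {101: (44.0, 29.0), 102: (46.0, 31.5), 103: (10.0, 50.0)}
# Hypothetical data values, one per cell.
values = {101: 290.1, 102: 288.4, 103: 281.7}


def nearest_cell(lon, lat, centres):
    """Return the id of the cell whose centre is closest to (lon, lat)."""
    return min(centres, key=lambda cid: haversine_km(lon, lat, *centres[cid]))


# Selection by coordinates, analogous to sel(..., method="nearest").
cid = nearest_cell(45, 30, cell_centres)

# Conditional selection, analogous to where(): keep cells above a threshold.
warm = {c: v for c, v in values.items() if v > 285.0}
print(cid, values[cid], sorted(warm))
```

The linear scan over all cell centres is only for readability; it would not scale to a full-resolution grid.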

Regridding

  • projection:
    • map source (latlon) to target (cell id / zone id), then aggregate

Xarray DGGS extension

repository (code + examples): https://github.com/benbovy/xdggs

Example Code

import shapely
import xarray as xr
ds = xr.open_dataset("something_healpix.nc")
# ds.temperature.dims == ('time', 'cell_id')

# how to decode the grid?
# if for each lat
ds = ds.dggs.decode()

# select by coordinates
# Q: do we use a custom index?
ds.sel(lon=45, lat=30, time="2023-10-01", method='nearest')
# or an accessor?
ds.dggs.sel(lon=45, lat=30, time="2023-10-01", method='nearest')

# coarsen (to absolute zoom level)
ds.dggs.coarsen(level=3).mean()

# select by bounding box
ds.dggs.bbox((ll_lon, ll_lat, ur_lon, ur_lat))
ds.dggs.query(shapely.box(ll_lon, ll_lat, ur_lon, ur_lat))
ds.dggs.query(shapely.Polygon([[lon, lat], ...]))

# visualization?
ds.isel(time=0).temperature.dggs.plot()
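As an illustration of what coarsening to an absolute level could do under the hood, the sketch below assumes a nested, quadtree-style cell numbering such as the HEALPix nested scheme, where the four children of cell `p` are numbered `4p` to `4p + 3`. The helper names and the sample values are hypothetical, and this is not the xdggs implementation.

```python
from collections import defaultdict


def parent_cell(cell_id: int, level: int, target_level: int) -> int:
    """Return the ancestor of cell_id at target_level (nested numbering)."""
    if target_level > level:
        raise ValueError("target_level must not be finer than level")
    # Each level adds two bits to the index, so dropping them yields the parent.
    return cell_id >> (2 * (level - target_level))


def coarsen_mean(values, level, target_level):
    """Average the values of all cells that share the same ancestor."""
    groups = defaultdict(list)
    for cid, value in values.items():
        groups[parent_cell(cid, level, target_level)].append(value)
    return {parent: sum(vs) / len(vs) for parent, vs in groups.items()}


# Hypothetical values at level 4, keyed by nested cell id.
fine = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 10.0, 5: 20.0}
coarse = coarsen_mean(fine, level=4, target_level=3)
print(coarse)
```

Parents with missing children are averaged over the children that are present; whether that is the desired behaviour is a design question this sketch does not settle.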

Additional discussion on Thursday

(please extend / correct, my memory of our lively discussion is already somewhat hazy)

  • to progress, we need to collect the features we need to be able to work with DGGS using xarray (→ roadmap / design document)
    • conversion / reconstruction of coordinates from / to cell ids
    • selection of cells
    • encoding / decoding for storage
    • interpolation
    • plotting
    • multi-resolution datasets
      • very useful for merging datasets with the same dggs but different resolutions
      • grid itself still has to be unchanging over time
    • STAC / catalogs?
      • use bounding box / envelope?
  • main area of work so far: selection of data
  • not all of this has to live in xdggs, for example conversion of existing dataset with lat / lon gridding to DGGS:
    • interpolation / resampling to a DGGS of roughly the same resolution
    • implementation can live in pyresample / xesmf / any other resampling / regridding library
  • not all of the code should be written in python, some of this should be implemented in a lower-level, high-performance language
  • grant proposal together with openeo: yes, but we will start working on this before the grant starts
  • meeting with the OGC working group on DGGS for more feedback / exchange (Peter will help set that up)
@allixender
Owner Author

https://github.com/allixender/dggrid4py/releases/tag/v0.2.9

  • addresses conversions of centroids and polygons between geographic coordinates and cell ids, in both directions
  • prepared Z3, Z7, ZORDER address type usage (beyond "just" seqnum)
  • better treatment of temp files

TODO:

  • implement Z3, Z7, ZORDER address type usage
  • implement option to use even faster formats (Feather, GeoArrow, GeoParquet)?

@allixender
Owner Author

allixender commented Nov 19, 2024

Z7_STRING and Z3_STRING and transform work now

bbdd391
