# TODO: other/better choices for save_result format (e.g. based on backend support)?
"process_id": "save_result",
"arguments": {
    "data": {"from_node": node_id},
    # TODO: particular format options?
    # "format": "NetCDF",
    "format": "GTiff",
},
"result": True,
If I remember correctly, we picked GTiff at the time of implementation because it's a safe choice (widely supported) and there were issues with NetCDF support in load_stac in openeo-geopyspark-driver at the time (March 2023).
We might want to revisit the situation,
e.g. automatically detect a better option, or let the user choose in some way?
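To make the two ideas above concrete (detect a better option, or let the user choose), here is a minimal sketch. It is not what openeo-aggregator does today: the helper names `choose_bridge_format` and `build_save_result_node`, the preference order, and the assumption that the list of supported output formats is already available (for example from the backend's file format listing) are all hypothetical.

```python
from typing import Iterable, Optional

# Hypothetical preference order, not taken from openeo-aggregator.
DEFAULT_PREFERENCE = ("GTiff", "NetCDF")


def choose_bridge_format(
    supported: Iterable[str],
    requested: Optional[str] = None,
    preference: Iterable[str] = DEFAULT_PREFERENCE,
    fallback: str = "GTiff",
) -> str:
    """Pick a save_result format: user request first, then preference order, then fallback."""
    # Compare case-insensitively, but return the backend's own spelling.
    by_lower = {fmt.lower(): fmt for fmt in supported}
    if requested is not None:
        if requested.lower() not in by_lower:
            raise ValueError(f"Format {requested!r} is not supported by the backend")
        return by_lower[requested.lower()]
    for candidate in preference:
        if candidate.lower() in by_lower:
            return by_lower[candidate.lower()]
    # Nothing from the preference list is advertised: keep the current safe default.
    return fallback


def build_save_result_node(node_id: str, fmt: str) -> dict:
    """Build a save_result node with the same shape as the snippet above."""
    return {
        "process_id": "save_result",
        "arguments": {
            "data": {"from_node": node_id},
            "format": fmt,
        },
        "result": True,
    }


# Usage: the user asks for NetCDF, the backend spells it "netCDF".
fmt = choose_bridge_format(supported=["GTiff", "netCDF", "JSON"], requested="NetCDF")
node = build_save_result_node("loadcollection1", fmt)
```

Falling back to GTiff when nothing else matches keeps the current behaviour as the default, so an explicit user choice or a detected alternative would only ever be an opt-in change.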
I'm not really sure if netcdf will be better, especially because writing a single large netcdf is not so easy, whereas geotiff can write multiple files in parallel.
The only other format with some potential for this use case is Zarr, again because of the parallel write possibility.
A reason to prefer NetCDF is that it is more standardized for handling multidimensional cases (e.g. encoding the time dimension). With GTiff we encode the time dimension in a more ad-hoc way, so that will not scale well if more backend implementations come into play.
But indeed, this is not an urgent matter at this time.
STAC + geotiff can fully define a datacube with time dimension in a standardized manner.
In fact, the stac metadata becomes more complicated for netcdf with time dimension. I've also seen other backends write netcdf output in rather unexpected ways that we would probably not support on our side.
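As an illustration of the STAC + GeoTIFF point, here is a rough sketch of how one GeoTIFF per time step can be described with one STAC Item each, with the timestamp in the standard `datetime` property. The helper name `stac_item_for_timestep`, the asset key `data`, and the example URLs are made up for illustration; this is not the metadata that openeo-geopyspark-driver actually writes.

```python
from datetime import datetime, timezone


def stac_item_for_timestep(item_id, href, when, bbox):
    """Build a minimal STAC Item for a single-timestep GeoTIFF.

    `when` is assumed to be a timezone-aware datetime,
    `bbox` is (west, south, east, north) in lon/lat.
    """
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": list(bbox),
        "geometry": {
            "type": "Polygon",
            # Closed ring: the first and last coordinate are identical.
            "coordinates": [[
                [west, south], [east, south], [east, north], [west, north], [west, south],
            ]],
        },
        "properties": {
            # The time dimension lives here, as a UTC timestamp.
            "datetime": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        },
        "links": [],
        "assets": {
            "data": {
                "href": href,
                "type": "image/tiff; application=geotiff",
                "roles": ["data"],
            },
        },
    }


# Usage: three time steps, one GeoTIFF and one Item per step.
dates = [datetime(2023, 3, day, tzinfo=timezone.utc) for day in (1, 2, 3)]
items = [
    stac_item_for_timestep(
        item_id=f"result-{d:%Y%m%d}",
        href=f"https://example.test/result_{d:%Y%m%d}.tif",
        when=d,
        bbox=(3.0, 51.0, 4.0, 52.0),
    )
    for d in dates
]
```

A consumer such as load_stac can then, in principle, rebuild the time dimension by sorting the items on `properties.datetime`, without relying on filename conventions.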
I noticed this while looking into the related issue Open-EO/openeo-geopyspark-driver#786: the crossbackend feature in the aggregator currently uses GTiff for the load_stac bridge (openeo-aggregator/src/openeo_aggregator/partitionedjobs/crossbackend.py, lines 133 to 141 in 129d4f2).