
Minimal example for downstream inference UDF #27

Closed
kvantricht opened this issue Jan 23, 2024 · 12 comments
Assignees
Labels
enhancement New feature or request

Comments

@kvantricht
Collaborator

We need a minimal example showing how external projects can make use of OpenEO-GFMAP functionality for inference purposes:

  • Chosen backend
  • Custom bbox
  • Custom temporal range
  • Requested sensors and preprocessing
  • Custom UDF code that returns a result cube
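The points above could be sketched roughly as follows with the plain openeo Python client. Everything concrete here (backend URL, collection id, extents, band list, UDF file name) is a placeholder, and in a GFMAP-based example the fetcher wrappers would replace the raw `load_collection` call:

```python
def run_minimal_inference(udf_path: str = "inference.py") -> None:
    """Sketch of a minimal downstream inference job (placeholders throughout)."""
    # Imported lazily so this sketch can be read without openeo installed.
    import openeo

    # Chosen backend (placeholder URL).
    connection = openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc()

    # Custom bbox, custom temporal range, requested sensor bands.
    cube = connection.load_collection(
        "SENTINEL2_L2A",
        spatial_extent={"west": 5.00, "south": 51.00, "east": 5.10, "north": 51.10},
        temporal_extent=["2023-01-01", "2023-12-31"],
        bands=["B02", "B03", "B04", "B08"],
    )

    # Custom UDF code that returns a result cube.
    udf = openeo.UDF.from_file(udf_path)
    result = cube.apply_neighborhood(
        udf,
        size=[{"dimension": "x", "value": 128, "unit": "px"},
              {"dimension": "y", "value": 128, "unit": "px"}],
        overlap=[],
    )
    result.execute_batch("predictions.nc")
```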
@kvantricht
Collaborator Author

@VictorVerhaert, according to Hans you would already have an inference UDF notebook for grassland watch. Would you be able to share it in a PR so @GriffinBabe can have a look at it?

@VictorVerhaert
Collaborator

Yes, I'll add it to the examples on GitHub.
If you want (and it fits in our next sprint), I could also take a look at creating an as-minimal-as-possible example notebook.

@VictorVerhaert
Collaborator

My inference notebook does not use GFMap, however.
I use a shared .py file containing the preprocessing steps. My extraction pipeline (GFMap) uses this .py file after the fetchers, but my inference pipeline just uses load_collection.

For now, I would suggest putting this example in https://github.com/Open-EO/openeo-community-examples and referencing it here.

@VictorVerhaert
Collaborator

FYI you can inspect my pipelines here: https://github.com/gisat/grasslandwatch/tree/main/lc_offline

@kvantricht
Collaborator Author

My inference notebook does not use GFMap however.

Ah, OK, interesting. Definitely useful, but we should also work on a GFMAP-based inference workflow here.

@VictorVerhaert
Collaborator

I assume the functionality of GFMap for inference would mainly be to split up the spatial extent we want to run inference on, as well as job management, right?

@kvantricht
Collaborator Author

GFMAP standardizes band names across backends, lays out typical data flow paths, takes care of loading collections and rescaling them into the most efficient datatype, applies collection-specific standardized processes, etc. That goes much broader than just the job splitting concept.
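To illustrate the band-name harmonization idea, a lookup along these lines translates one harmonized set of band names into what each backend expects. The mappings below are made-up placeholders, not GFMAP's actual tables:

```python
# Hypothetical per-backend band names, keyed by a harmonized name.
# These mappings are illustrative, not GFMAP's actual lookup tables.
BAND_ALIASES = {
    "sentinelhub": {"B04": "S2-L2A-B04", "B08": "S2-L2A-B08"},
    "terrascope": {"B04": "TOC-B04_10M", "B08": "TOC-B08_10M"},
}

def harmonized_to_backend(backend: str, bands: list) -> list:
    """Translate harmonized band names into the names a given backend expects."""
    return [BAND_ALIASES[backend][band] for band in bands]
```

User code can then always request `"B04"` regardless of which backend ends up running the job.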

@VictorVerhaert
Collaborator

Yes, of course; I meant what would be visible in the example notebook and what to focus on in the explanation.
It might indeed be good to emphasize that using the same pipeline for extraction and inference is crucial for accurate results, given the optimizations you mention happening in the background.

@GriffinBabe
Collaborator

@VictorVerhaert one thing about the extraction pipeline:

The S1 bands are scaled to uint16 in the following code block (in the fetching preprocessing): https://github.com/Open-EO/openeo-gfmap/blob/main/src/openeo_gfmap/fetching/s1.py#L132
This is a memory optimization for OpenEO, as the collections come in float32 power values. Those values are automatically converted back to decibels in the feature extractor, unless the user disables it with a flag: https://github.com/Open-EO/openeo-gfmap/blob/main/src/openeo_gfmap/features/feature_extractor.py#L110. Now, I see here that you perform some compositing operations, so we should probably do that rescaling after preprocessing and before entering the FeatureExtractor.
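The round trip described here (float32 power values compressed into uint16 via decibels, then converted back before feature extraction) can be sketched as follows. The scale and offset constants are invented for the sketch, not GFMAP's actual values:

```python
import math

# Illustrative constants for packing decibel values into uint16;
# NOT the actual constants used in openeo-gfmap.
SCALE, OFFSET = 10.0, 830.0

def power_to_uint16(power: float) -> int:
    """Compress a float32 backscatter power value into uint16 via decibels."""
    db = 10.0 * math.log10(power)  # linear power -> decibels
    return max(0, min(65535, round(db * SCALE + OFFSET)))

def uint16_to_db(value: int) -> float:
    """Invert the compression back to decibels, as the feature extractor would."""
    return (value - OFFSET) / SCALE
```

The point of the comment above is that any compositing must happen on a consistent scale, so the uint16 packing should only occur after preprocessing, just before the data enters the FeatureExtractor.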

@GriffinBabe
Collaborator

@kvantricht @VictorVerhaert

I like the idea of using the common ONNX library. I see online that it is possible to convert any scikit-learn, PyTorch, or TensorFlow model to that format. Even CatBoost is directly compatible.

Based on the inference UDF of @VictorVerhaert and the Feature Extractor functionality already implemented in GFMAP, I came up with this first idea for a Model Inference base class that a user can override to implement their own model inference pipeline. Please take a look and tell me what you think:
https://github.com/Open-EO/openeo-gfmap/blob/a7b0cd7ff05e0de73460776fb148a31d8a0167f4/src/openeo_gfmap/inference/model_inference.py

We could also provide a default Model Inference implementation that takes only a path from which to download the ONNX model and the name of the input tensor as parameters, and that returns either the probability values or directly the max-probability argument.
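The shape of such a base class might look like the following pure-Python sketch. The class and method names are hypothetical (the actual class lives in the linked model_inference.py), and the onnxruntime session handling is elided to keep the sketch dependency-free:

```python
from abc import ABC, abstractmethod

class ModelInference(ABC):
    """Hypothetical sketch of an overridable model-inference base class."""

    def __init__(self, model_url: str, input_name: str):
        self.model_url = model_url    # where to download the ONNX model from
        self.input_name = input_name  # name of the model's input tensor

    def load_session(self):
        # A real implementation would download the model and build an
        # onnxruntime.InferenceSession; elided here.
        raise NotImplementedError

    @abstractmethod
    def execute(self, probabilities):
        """Run the model on a chunk of the data cube and return the result."""

class ArgmaxInference(ModelInference):
    """Default-style implementation returning the max-probability class index."""

    def execute(self, probabilities):
        # `probabilities` stands in for per-pixel class probabilities.
        return max(range(len(probabilities)), key=lambda i: probabilities[i])
```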

One thing that needs to be taken care of by the user is the ONNX dependency within the OpenEO job. In the long term this could be included directly in the default OpenEO UDF environment, but for now we need to specify the .zip file in the udf-dependency-archives parameter at job creation, which is done manually at the moment. Maybe that's something to discuss in the redesign discussion @VincentVerelst
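For reference, attaching such a dependency archive might look roughly like this when creating the batch job; the archive URL and extraction folder name are placeholders:

```python
# Illustrative job options attaching an onnxruntime dependency archive to an
# openeo batch job; the archive URL and "#onnx_deps" folder are placeholders.
job_options = {
    "udf-dependency-archives": [
        "https://example.org/onnx_dependencies.zip#onnx_deps",
    ],
}

# Passed when creating the batch job, e.g.:
# cube.execute_batch("result.nc", job_options=job_options)
```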

@VictorVerhaert
Collaborator

On this last point: @HansVRP and I had a similar discussion this morning.
I think that in the long run onnxruntime should be included in the standard UDF environment, as we are advising different projects to use ONNX models.

@GriffinBabe
Collaborator

Closed by #88
