
Make GFMAP Job Manager more robust and persistent #96

Closed
3 tasks done
GriffinBabe opened this issue Apr 17, 2024 · 2 comments · Fixed by #104

Comments

@GriffinBabe
Collaborator

GriffinBabe commented Apr 17, 2024

It can happen that the GFMAPJobManager crashes, not necessarily due to errors on the gfmap side, but also because of bad user code in the post-job actions.

  • Implement the possibility of re-running jobs that previously failed. This could be a parameter of the Job Manager when running.
  • Re-run failed post-job actions. This could be done by setting the job status to an intermediate value "post-processing" before setting it to "finished" at the end of the post-job action. This could, however, conflict with the MultiBackendJobManager behavior.
  • There is also the issue that when running an extraction on the same destination folder, the STAC catalogue is overwritten instead of extended: Don't overwrite existing STAC collection when doing a new extraction #94

At the moment, persistence is handled through the job_tracking.csv file and the base logic in the MultiBackendJobManager:
https://github.com/Open-EO/openeo-python-client/blob/master/openeo/extra/job_management.py#L32
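
For illustration, a rough sketch of how re-running previously failed jobs could look, by reloading the tracking CSV and resetting the status of failed rows before handing the dataframe back to run_jobs. The helper itself is hypothetical, and the exact status column and values are assumptions based on how the MultiBackendJobManager tracking file works:

import pandas as pd

def restart_failed_jobs(tracking_df_path: str) -> pd.DataFrame:
    """Reload the tracking CSV and re-queue jobs that previously errored."""
    job_df = pd.read_csv(tracking_df_path)
    # Assumption: the tracking file has a "status" column where failed jobs
    # end up as "error"; resetting it to "not_started" lets the base
    # MultiBackendJobManager pick them up again on the next run.
    failed = job_df["status"] == "error"
    job_df.loc[failed, "status"] = "not_started"
    return job_df

job_df = restart_failed_jobs(tracking_df_path)
manager.run_jobs(job_df, create_datacube_optical, tracking_df_path)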

@GriffinBabe
Collaborator Author

GriffinBabe commented Apr 23, 2024

Whenever a crash happens in the user code, the GFMAP manager loses its STAC collection progress, as the collection is only written once the manager finishes its jobs.

One temporary way of tackling this would be to simply wrap the run in a try/except/finally clause, like so:

try:
    manager.run_jobs(job_df, create_datacube_optical, tracking_df_path)
except Exception as e:
    # Log the crash (whether it comes from gfmap or from user callbacks) instead of propagating it.
    _pipeline_log.error("Error during the job execution: %s", e)
finally:
    # Always write out the STAC collection with whatever items were completed before the crash.
    manager.create_stac(constellation='sentinel2', item_assets={"auxiliary": AUXILIARY})

This should, in theory, save only the fully initialized STAC items (the likely crashing points are the output_path_gen, post_job_action and create_job user functions, all of which are called before any item is added to the collection):

self._root_collection.add_items(job_items)
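
To make that per-job handling itself more robust, the logic around those user callbacks could be hardened along these lines. This is only a sketch; the helper names and status values are illustrative, not actual gfmap internals:

def _handle_finished_job(self, row, job):
    # Flag the job as "postprocessing" before running user code, so a crash
    # here is distinguishable from a cleanly finished job and can be retried.
    row["status"] = "postprocessing"
    try:
        # Illustrative helpers standing in for the output_path_gen /
        # post_job_action user callbacks mentioned above.
        job_items = self._collect_job_items(row, job)
        self._apply_post_job_action(row, job_items)
    except Exception:
        _pipeline_log.exception("Post-job action failed for job %s", row["id"])
        row["status"] = "postprocessing-error"  # can be re-run later
        return
    self._root_collection.add_items(job_items)
    row["status"] = "finished"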

@VincentVerelst However, I was thinking it might be better to call the create_stac function automatically within the manager, so that the STAC collection is still handled when a crash occurs. The usage of the job manager could then look like this:

manager = GFMAPJobManager(...)
manager.setup_stac(constellation='sentinel2', item_assets={'auxiliary': AUXILIARY})

manager.run_jobs(...)  # Will call _create_stac internally
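
For clarity, a rough sketch of what that could look like inside the manager, assuming the existing create_stac logic is moved into a private _create_stac (names and signatures here are illustrative only):

class GFMAPJobManager(MultiBackendJobManager):
    def setup_stac(self, constellation: str, item_assets: dict):
        # Store the STAC settings up front so run_jobs can write the
        # collection itself, even when user code crashes mid-run.
        self._stac_settings = {"constellation": constellation, "item_assets": item_assets}

    def run_jobs(self, df, start_job, output_file):
        try:
            super().run_jobs(df, start_job, output_file)
        finally:
            # Always write out whatever STAC items were collected so far.
            if getattr(self, "_stac_settings", None):
                self._create_stac(**self._stac_settings)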

Tell me what you think 😄

@VincentVerelst
Collaborator

@GriffinBabe, sounds like a good idea! I don't see any benefit in the user having to call create_stac themselves. I also like the idea of having a setup_stac. Maybe we can make it optional as well? i.e. the user only needs to call it if they are interested in changing the STAC metadata; otherwise GFMap will generate a default STAC collection based on which constellation is selected.
