Change from Provider CLI to Provider API #376

grahamia · 2024-10-16T11:05:42Z

Is your feature request related to a problem? Please describe.
Currently, workflow failures are not monitored. If a call to VAI fails, for example, the workflow finishes and no one is the wiser.

Currently, each workflow that KFP Operator triggers starts a pod for the provider image, which provides a CLI for the various bits of functionality.

e.g. /provider --provider <config location> pipeline create --pipeline-definition /resource-definition.yaml

To allow better monitoring of provider events, this step should become a call to an API on a running deployment that can expose metrics using OpenTelemetry, rather than relying on publishing metrics to the Argo workflow controller, which we have already found is not the best solution.

Describe the solution you'd like
We already run the event source server/event processor, which handles the run completion events, as a deployment. The idea is to add an HTTP REST interface to this deployment so that the different CLI commands currently in use become CRUD HTTP requests against the running deployment. The workflows will then all need to be changed so that they make HTTP requests rather than starting a pod. A metrics endpoint should also be added to expose metrics for the various events that are processed and whether each one was successful. Analysis should be carried out to determine which metrics would be useful for the different endpoints.
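As a rough illustration of the success/failure metrics described above, here is a minimal sketch that uses only the Python standard library. It is a stand-in for a real OpenTelemetry exporter, not a proposal for the implementation; the class name `ProviderMetrics`, the metric name `provider_requests_total`, and the label names are all hypothetical.

```python
from collections import Counter


class ProviderMetrics:
    """In-memory request counts keyed by resource, operation, and outcome.

    Hypothetical stand-in for whatever metrics library the service ends up using.
    """

    def __init__(self):
        self._counts = Counter()

    def record(self, resource, operation, success):
        # One counter per (resource, operation, outcome) combination.
        outcome = "success" if success else "failure"
        self._counts[(resource, operation, outcome)] += 1

    def render(self):
        # Render the counters as plain text, one line per label combination,
        # suitable for serving from a metrics endpoint.
        lines = []
        for (resource, operation, outcome), count in sorted(self._counts.items()):
            lines.append(
                f'provider_requests_total{{resource="{resource}",'
                f'operation="{operation}",outcome="{outcome}"}} {count}'
            )
        return "\n".join(lines)
```

Each request handler would call `record(...)` once per request, so a failed provider call shows up as a `failure` count instead of disappearing when the workflow finishes.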

--provider: this is the custom resource, so the service will still need to load the resource to get the config, as it currently does. The provider name will be a template variable.

The different resource definitions will be passed in the body of the request.
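To make the request shape concrete, here is a minimal sketch of how a workflow step might build such a request. The issue does not say how the provider name reaches the service, so passing it as a `provider` query parameter is an assumption, as are the function name `build_provider_request`, the base URL, and the `application/yaml` content type.

```python
import urllib.parse
import urllib.request


def build_provider_request(base_url, resource, provider, definition, method="PUT"):
    """Build (but do not send) a request carrying a resource definition.

    Assumption: the provider name travels as a query parameter; the issue
    only says it will be a template variable.
    """
    query = urllib.parse.urlencode({"provider": provider})
    url = f"{base_url.rstrip('/')}/{resource}?{query}"
    return urllib.request.Request(
        url,
        data=definition.encode("utf-8"),  # resource definition in the body
        method=method,
        headers={"Content-Type": "application/yaml"},
    )
```

Passing the result to `urllib.request.urlopen` would send it; an error status from the service raises `urllib.error.HTTPError`, so the workflow step fails visibly rather than finishing silently.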

/pipeline
/run
/schedule
/experiment

PUT = Create
POST = Update
DELETE = Delete
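The endpoint and method lists above can be sketched as a small lookup, shown here only to illustrate the mapping. The names `OPERATIONS`, `RESOURCES`, and `resolve` are hypothetical, and returning `None` for anything unrecognised is an assumption about error handling that the issue does not cover.

```python
# Method-to-operation mapping exactly as listed in the issue.
OPERATIONS = {"PUT": "create", "POST": "update", "DELETE": "delete"}

# Resource paths as listed in the issue, without the leading slash.
RESOURCES = {"pipeline", "run", "schedule", "experiment"}


def resolve(method, path):
    """Return (resource, operation) for a request, or None if unsupported."""
    resource = path.strip("/")
    operation = OPERATIONS.get(method.upper())
    if resource not in RESOURCES or operation is None:
        return None
    return resource, operation
```

For example, `resolve("PUT", "/pipeline")` gives `("pipeline", "create")`, the HTTP counterpart of the `pipeline create` CLI command. The mapping follows the issue text (PUT creates, POST updates), which is the reverse of the more common REST convention; keeping it in a single table makes it easy to flip if that is decided later.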


grahamia added the unrefined label Oct 16, 2024

grahamia mentioned this issue Oct 22, 2024

Expose model training metrics #109

Closed
