
Update the list of supported entrypoint parameter types and add workaround to export full list of plugin parameter types #640

Merged 6 commits into dev on Sep 18, 2024

Conversation

jkglasbrenner (Collaborator) commented Sep 11, 2024

Request: Please do not squash; rebase and merge to dev only. The commits are different enough in scope and purpose that squashing would obscure the reasons for the various updates.

Summary of changes

The first commit adds support for the "boolean", "integer", "list", and "mapping" entrypoint parameter types that are already supported by the task execution engine. In addition, the /workflows/jobFilesDownload endpoint service has been updated to handle all supported types when creating the parameters.json and task engine YAML files. The "path" and "uri" types have been removed since they were treated the same as strings.
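To illustrate why these types can be handled uniformly, here is a sketch of an entrypoint parameter set covering the supported types and how it serializes to parameters.json. The parameter names and values are illustrative, not taken from the Dioptra codebase.

```python
import json

# Hypothetical entrypoint parameters covering the supported types.
parameters = {
    "use_gpu": True,                        # "boolean"
    "batch_size": 32,                       # "integer"
    "hidden_layers": [64, 128, 64],         # "list"
    "optimizer_kwargs": {"momentum": 0.9},  # "mapping"
    "data_dir": "s3://bucket/mnist",        # plain "string" (the former
                                            # "path"/"uri" types)
}

# All of these types serialize directly to JSON, which is what makes a
# single code path for writing parameters.json possible, and why "path"
# and "uri" could be collapsed into plain strings.
serialized = json.dumps(parameters, indent=2)
```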

The second commit adds the env_vars context manager to the Dioptra SDK, which can be used to temporarily update environment variables in a safe and reversible way.
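A minimal sketch of what such a context manager might look like; the actual Dioptra SDK implementation may differ in naming and details.

```python
import os
from contextlib import contextmanager


@contextmanager
def env_vars(updates):
    """Temporarily set environment variables, restoring originals on exit.

    Illustrative sketch only; `updates` maps variable names to new values.
    """
    # Remember each variable's prior value (None means "was unset").
    saved = {key: os.environ.get(key) for key in updates}
    try:
        os.environ.update(updates)
        yield
    finally:
        # Restore on exit even if the body raised, making the change reversible.
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value
```

Usage: `with env_vars({"MY_VAR": "1"}): ...` sets MY_VAR only for the duration of the block.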

The third commit updates the Worker execution logic to set a __JOB_ID environment variable when executing task engine YAML, so that it's possible to access the job ID from plugins.
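From a plugin's side, reading the job ID then reduces to an environment lookup. The helper below is a hypothetical example, not part of the Dioptra API.

```python
import os


def get_job_id():
    # __JOB_ID is set by the worker before executing the task engine YAML.
    # Returning None when it is unset keeps the plugin usable outside a
    # Dioptra job (an illustrative choice, not mandated by the source).
    return os.environ.get("__JOB_ID")
```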

The fourth commit adds a workaround to export the full list of registered plugin parameter types to the task engine YAML. THIS IS A WORKAROUND THAT VIOLATES IDEMPOTENCE/REPRODUCIBILITY! This workaround allows users to create and use parameter types that are only used indirectly, such as when defining a structured parameter type. This is a "hack" because the indirect parameter types are the latest available snapshots, not the snapshots that were associated with the plugins when the entrypoint was saved/updated. This is in contrast with the parameter types explicitly registered to the plugin task input and output parameters, which are linked to the entrypoint and job by their snapshot id instead of the resource id. This difference means that the task engine YAML files are not 100% reproducible, as any changes to an "indirect" plugin parameter type will be immediately reflected in subsequent download requests made to the job files workflow.
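The reproducibility trade-off comes down to which key is used for the lookup. The in-memory sketch below illustrates the difference; the record shape and field names are hypothetical, not Dioptra's actual schema.

```python
# Two snapshots of the same parameter type resource: the second is a later
# edit of the first.
snapshots = [
    {"snapshot_id": 1, "resource_id": 10, "schema": "v1"},
    {"snapshot_id": 2, "resource_id": 10, "schema": "v2"},
]


def resolve_pinned(snapshot_id):
    # Explicitly registered parameter types: looked up by snapshot id, so the
    # result is frozen at the moment the entrypoint was saved (reproducible).
    return next(s for s in snapshots if s["snapshot_id"] == snapshot_id)


def resolve_latest(resource_id):
    # Indirect parameter types (the workaround): looked up by resource id,
    # always returning whatever snapshot is newest *now* (not reproducible).
    return max(
        (s for s in snapshots if s["resource_id"] == resource_id),
        key=lambda s: s["snapshot_id"],
    )
```

An entrypoint saved when only snapshot 1 existed would still export "v1" through the pinned path, but the workaround path exports "v2" as soon as the resource is edited.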

The fifth commit improves the error handling in the Dioptra workers in two ways. First, every function that might encounter an error has been wrapped in a try/except block, which better ensures that any errors are logged and the job status is properly updated in the Dioptra API. Second, the YAML validation step has been modified to emit warning and error log messages instead of swallowing them silently, and to raise a ValueError exception, with a message detailing the errors, whenever any are encountered. This update implements the request in #604.
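The two error-handling patterns can be sketched as follows. This is an illustrative outline, not the actual worker code: `run_step` and `update_job_status` are stand-ins, and the `issues` format is assumed.

```python
import logging

logger = logging.getLogger(__name__)


def validate_yaml(issues):
    # `issues` is assumed to be a list of (level, message) pairs from the
    # validator. Warnings are logged; errors are logged and then surfaced
    # together in a single ValueError instead of a bare "Job YAML is invalid!".
    errors = []
    for level, message in issues:
        if level == "error":
            logger.error(message)
            errors.append(message)
        else:
            logger.warning(message)
    if errors:
        raise ValueError("Job YAML validation failed:\n" + "\n".join(errors))


def run_step(step, update_job_status):
    # Each step runs inside try/except so that a failure anywhere still
    # logs the traceback and reports the job status to the API before
    # propagating (update_job_status stands in for the Dioptra client call).
    try:
        return step()
    except Exception:
        logger.exception("Job step failed")
        update_job_status("failed")
        raise
```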

The sixth commit is a fix to ensure that the exported job YAML sorts the input and output parameters for the plugin tasks according to their position index.
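The sort itself is straightforward once each exported parameter carries a position index. The record shape below is hypothetical, chosen only to show the idea.

```python
# Hypothetical exported output parameters for a plugin task that returns a
# tuple; each entry records its position in the Python function signature.
outputs = [
    {"name": "accuracy", "position": 1},
    {"name": "model", "position": 0},
    {"name": "history", "position": 2},
]

# Sorting by position restores the order of the function's return tuple,
# which is what the exported job YAML must preserve.
ordered = sorted(outputs, key=lambda p: p["position"])
names = [p["name"] for p in ordered]
# names == ["model", "accuracy", "history"]
```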

jkglasbrenner added the feature label on Sep 11, 2024
jkglasbrenner (Collaborator, Author) commented:
I pushed a couple of updates. The first adds the missing Resource.is_deleted == False filter when fetching the plugin parameter types to build the job YAML. The second adds more detailed error handling to the worker execution scripts, and ensures that the worker will attempt to update the job status in the API no matter where the failure happens. In addition, all YAML validation errors should now be emitted as log messages in the worker, as opposed to just saying "Job YAML is invalid!", which should make it easier to debug things.
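The effect of the missing soft-delete filter can be shown with a minimal stand-in model. In the actual codebase this is a SQLAlchemy filter (Resource.is_deleted == False) on the query that collects plugin parameter types; the dataclass below is only illustrative.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # Minimal stand-in for the Dioptra Resource model (illustrative only).
    resource_id: int
    is_deleted: bool = False


def active_parameter_types(resources):
    # The fix described above: soft-deleted resources must be excluded when
    # collecting plugin parameter types to build the job YAML.
    return [r for r in resources if not r.is_deleted]
```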

This fixes an issue with the YAML generated when requesting a download from the /workflows/jobFilesDownload endpoint. The input and output parameters for the plugin tasks did not match the parameter order of the Python function signatures, which was particularly a problem for plugin tasks that returned tuples. The input and output parameters are now sorted by their position index, which resolves the issue.
jkglasbrenner (Collaborator, Author) commented:
I've cleaned up the commit messages, and this is ready to merge into dev.
