How to deal with long running processes? #52

sckott · 2024-12-20T23:02:35Z

(pulling info from a slack thread convo with @dtenenba on 2024-12-20 so it doesn't get lost)

We have some "long" running processes that we'll have to deal with for tests. Long here may be 30 seconds or multiple hours, both of which need specific handling - a request to get metadata about job X after 2 seconds may error or return a different set of data than the same request after job X has completed.

Some options:

use test fixtures - record a real http response and then use that cached http response instead of making a real request to the cromwell server - this will be very fast
use mocks - similar to fixtures, but instead of the whole http response saved to a file that we read (the fixture) we just specify the returned json in a variable
forget about tests that depend on longer running operations? seems not ideal
on a cron schedule submit all the wdls/jsons that have longer running operations - say once a day - when we run tests we just use the results already in the test user DB
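A minimal sketch of the mock option, to make the fixture/mock distinction concrete. Everything named here is hypothetical: `get_workflow_status`, the metadata URL layout, and the payload are placeholders, not the project's actual client code. The fixture variant would differ only in loading the payload from a recorded file instead of a variable.

```python
from unittest.mock import Mock

# Hypothetical client function (an assumption, not the real project code):
# fetches metadata for one workflow through a session-like object with .get().
def get_workflow_status(session, base_url, workflow_id):
    response = session.get(f"{base_url}/api/workflows/v1/{workflow_id}/metadata")
    response.raise_for_status()
    return response.json()["status"]

# The "mock" part: the JSON a finished job would return, kept in a variable
# rather than in a recorded fixture file. Values are made up.
FINISHED_METADATA = {"id": "abc-123", "status": "Succeeded"}

def make_mock_session(payload):
    """Build a stand-in session whose .get() always returns `payload`."""
    response = Mock()
    response.json.return_value = payload
    session = Mock()
    session.get.return_value = response
    return session

def test_status_of_finished_job():
    session = make_mock_session(FINISHED_METADATA)
    status = get_workflow_status(session, "http://cromwell.test", "abc-123")
    assert status == "Succeeded"
    # No network traffic happened; we can still check which URL was requested.
    session.get.assert_called_once_with(
        "http://cromwell.test/api/workflows/v1/abc-123/metadata"
    )
```

Because nothing touches the network, a test like this returns immediately no matter how long the real job takes.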

If things can take a long time and we're okay with that (we don't use the above approaches of fixtures, mocks, etc.), then we need a different testing setup than is typical (where you do X and Y and then run the unit test right away). How would that work?

Some background info:

Self-hosted runners can have jobs that take up to 5 days
You can trigger actions by calling the GitHub API with a token. If a workflow has the on: workflow_dispatch: trigger, you can start it with, e.g.,

curl -X POST \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer YOUR_GITHUB_TOKEN" \
  https://api.github.com/repos/OWNER/REPO/actions/workflows/WORKFLOW_ID/dispatches \
  -d '{"ref":"main"}'
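If the dispatch ends up being triggered from a script rather than a shell, the same request can be built in Python. This is a rough stdlib-only sketch: `build_dispatch_request` and `dispatch` are hypothetical helper names, and OWNER, REPO, the workflow file name, and the token are placeholders as in the curl command.

```python
import json
import urllib.request

def build_dispatch_request(owner, repo, workflow_id, token, ref="main"):
    """Build (but do not send) the workflow_dispatch request shown in the curl call.

    `workflow_id` can be the numeric workflow ID or the workflow file name.
    """
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_id}/dispatches"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({"ref": ref}).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

def dispatch(request):
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses,
    e.g. for a bad token or a missing workflow_dispatch trigger.
    """
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping the request construction separate from the sending makes it possible to unit test the former without a token.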

It's not clear yet what the solution is, but it likely involves some system where one GitHub workflow (.yml file) submits the wdls/jsons to cromwell, then another runs tests that depend on those submitted workflows (presumably all in a final state on cromwell). How the second one knows when to run is up for debate; possibly some script polling until all workflows are done.
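One way the polling script could look, as a sketch only. `wait_until_all_done` is a hypothetical name, `get_status` stands for whatever call ends up returning a workflow's status, and the set of terminal status strings is an assumption that should be checked against what the server actually reports.

```python
import time

# Assumed terminal statuses; adjust to what the server really returns.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Aborted"}

def wait_until_all_done(workflow_ids, get_status, interval_s=60.0,
                        timeout_s=6 * 3600, sleep=time.sleep,
                        clock=time.monotonic):
    """Poll until every workflow is in a terminal state.

    Returns a dict of workflow id -> final status, or raises TimeoutError
    if some workflows are still pending once `timeout_s` has elapsed.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    pending = set(workflow_ids)
    statuses = {}
    while True:
        # Only re-query workflows that have not finished yet.
        for workflow_id in list(pending):
            status = get_status(workflow_id)
            statuses[workflow_id] = status
            if status in TERMINAL_STATUSES:
                pending.discard(workflow_id)
        if not pending:
            return statuses
        if clock() >= deadline:
            raise TimeoutError(f"still pending: {sorted(pending)}")
        sleep(interval_s)
```

The second workflow could run this as its first step and start the final-state tests only once it returns; the timeout keeps a stuck job from holding a runner indefinitely.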


sckott · 2025-01-06T18:16:22Z

Once we have a solution here, we'll want to add tests that deal with final states. Right now the cromwell api tests only deal with the initial state right after workflow submission, so they only cover initial-state conditions.

sckott · 2025-01-06T18:17:36Z

@dtenenba @tefirman @seankross Thoughts on this?

seankross · 2025-01-08T01:22:46Z

@sckott, based on our discussion, let me know if you agree with this (others are encouraged to weigh in too of course):

For testing the PROOF API, we'll have the following kinds of tests:

Short jobs that are submitted on PR, push, or any other circumstance that is appropriate.
Fixtures that are updated frequently via cron.

Also, tests should only evaluate jobs when they have just been submitted or when they have completely finished running (i.e., at either the start or the end of their lifecycle).
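A small sketch of how a test helper might enforce that rule. `lifecycle_phase` is a hypothetical name and the status strings are assumptions about what the API reports; a test would skip (or wait) when the helper returns None.

```python
# Assumed status names; adjust to what the API actually returns.
INITIAL_STATUSES = {"Submitted"}
FINAL_STATUSES = {"Succeeded", "Failed", "Aborted"}

def lifecycle_phase(status):
    """Return "start" or "end" for statuses that tests may evaluate.

    Anything in between (e.g. a job that is still running) returns None,
    signalling that the job should not be evaluated yet.
    """
    if status in INITIAL_STATUSES:
        return "start"
    if status in FINAL_STATUSES:
        return "end"
    return None
```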

sckott · 2025-01-09T18:59:12Z

Yep, agree @seankross

sckott · 2025-01-09T19:43:07Z

Work started locally on my machine on branch vcr incorporating vcrpy and some other pkgs to cache http requests/responses. This will stay local only until we've worked out how to minimize any exposed sensitive strings in test fixtures (aka: vcr cassettes).

vcr notes and todos:

vcr was branched off of api-tests-adding-more
the change of dir cromwell-api to cromwellapi, which originally occurred on the vcr branch, has been cherry-picked onto api-tests-adding-more
make sure that anyone working on vcr cassettes locally is using the token for the test user in 1password
maybe change vcr matchers to ignore the hostname that can change (e.g., gizmok99.fhcrc.org:44444) even for the same user
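To illustrate the last point: as far as I understand vcrpy, its `match_on` setting lists the request parts that must agree for a cassette entry to be replayed, and the default list includes `host` and `port`, so leaving those two out may be all that is needed. The comparison that results can be sketched in plain Python, independent of vcrpy. `request_key` and `same_request_ignoring_host` are hypothetical names, and the second hostname in the usage note is made up.

```python
from urllib.parse import parse_qsl, urlsplit

def request_key(method, uri):
    """Reduce a request to the parts that should identify it in a cassette.

    Host and port are deliberately left out, since the server hostname can
    change between sessions even for the same user.
    """
    parts = urlsplit(uri)
    query = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return (method.upper(), parts.path, query)

def same_request_ignoring_host(method1, uri1, method2, uri2):
    """True if two requests differ at most in scheme, host, and port."""
    return request_key(method1, uri1) == request_key(method2, uri2)
```

For example, a GET to `http://gizmok99.fhcrc.org:44444/api/workflows/v1/query?status=Running` and the same GET to `http://gizmok12.fhcrc.org:38000/api/workflows/v1/query?status=Running` would be treated as the same recorded request.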

sckott added the question Further information is requested label Dec 20, 2024

sckott mentioned this issue Jan 9, 2025

Figure out how to avoid tenacity retry wait times when we're using vcr cassettes #58

Open

tefirman added the infrastructure Infrastructure fix to execute WDL GitHub Actions label Jan 14, 2025
