
Async on the cheap (for MVP)

Cris Simpson edited this page Mar 25, 2021 · 2 revisions

Introduction

It's recognized [1, 2] that the best way to handle long-running tasks is to use a task queue, allowing separation of the middle layer (API server) from the execution server. But as we're trying to get an MVP out for feedback, it's not unreasonable to use a less-than-perfect solution in the interim. Here are a few ideas for discussion:

Continue to treat execute() as synchronous but stream back status information

We've been operating (at the API server) with a model of receive request, do work, return data. But both Flask and JS support streaming data in chunks from server to client:
Flask: Streaming Contents
JS: Using readable streams

From the Flask side, the data it streams back would be status updates (e.g., every 100 rows processed) which the React client would use to update the display. When the server sends back "complete", React displays a nice completion message and the user proceeds to the 360 view.
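On the Flask side, "feeding the generator" could look something like the sketch below: the work loop yields an NDJSON status line every 100 rows, and the generator is handed to a streaming Response. This is only a shape for discussion; `process_row` and the exact status format are assumptions, not our actual code.

```python
import json


def process_row(row):
    """Placeholder for the real per-row work (assumption)."""
    pass


def run_with_status(rows, chunk_size=100):
    """Process rows, yielding an NDJSON status line every chunk_size rows,
    then a final "complete" line.

    In Flask this generator would be streamed back directly, e.g.:
        return Response(run_with_status(rows),
                        mimetype="application/x-ndjson")
    """
    rows = list(rows)
    for i, row in enumerate(rows, start=1):
        process_row(row)
        if i % chunk_size == 0:
            yield json.dumps({"status": "running", "processed": i}) + "\n"
    yield json.dumps({"status": "complete", "processed": len(rows)}) + "\n"
```

The React client would read these lines off the response body with a ReadableStream reader (see the MDN link above) and update the progress display on each one.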

Evaluation

Doesn't appear to require much heavy lifting at the server or client (we would still need to figure out how to feed the generator on the server), but it may be a bit brittle: if there's any kind of network hiccup (or the user reloads the page?), the stream would be broken and we wouldn't be able to tell the user anything useful.

Client aborts Fetch, polls status API until completion

In this idea, instead of waiting for the execute() Fetch to complete, the React client uses an AbortController to cancel the pending Fetch. It then starts polling the API execution status endpoint, displaying updates until that endpoint reports that the operation is complete.
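The polling loop itself is simple; here's a minimal Python sketch of the logic for discussion (the React client would do the equivalent with fetch plus setTimeout, after cancelling the original request via AbortController). `get_status` stands in for a call to the status endpoint and is an assumption:

```python
import time


def poll_until_complete(get_status, on_update, interval=1.0, max_polls=600):
    """Repeatedly call get_status() (a stand-in for
    GET /api/get_execution_status/<job_id>) until it reports completion,
    passing each status dict to on_update for display.

    Returns the final status, or raises TimeoutError if max_polls is exceeded.
    """
    for _ in range(max_polls):
        status = get_status()
        on_update(status)
        if status.get("status") == "complete":
            return status
        time.sleep(interval)
    raise TimeoutError("job did not complete within the polling window")
```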

Evaluation

Using SQLAlchemy's engine.dispose() and two uWSGI processes, I've got /api/get_execution_status/<job_id> working correctly. I'd probably want it to find the latest job instead of having to specify it (although we could use the streaming model above to send back the job_id). We need to figure out what side-effects there might be to cancelling the fetch: I presume the browser would drop the connection, but will Flask assume it can kill the request?
The client could also check status when the page loads to see if there's a running job, which would make this approach more robust in the face of network issues or reloads.
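For discussion, the "latest job" fallback could look something like this on the server. The in-memory JOBS dict and its field names are assumptions; in reality the endpoint would read whatever execution state the worker records (e.g. via SQLAlchemy):

```python
# Hypothetical in-memory job store; the real implementation would read
# execution status recorded by the worker (e.g. through SQLAlchemy).
JOBS = {}  # job_id -> {"job_id", "status", "processed", "started_at"}


def record_status(job_id, status, processed, started_at):
    """Called by the worker to record progress (assumption)."""
    JOBS[job_id] = {"job_id": job_id, "status": status,
                    "processed": processed, "started_at": started_at}


def get_execution_status(job_id=None):
    """What /api/get_execution_status/<job_id> could return. With no
    job_id, fall back to the most recently started job so the client
    doesn't need to know the id (e.g. after a page reload)."""
    if job_id is None:
        if not JOBS:
            return {"status": "no_jobs"}
        job_id = max(JOBS, key=lambda j: JOBS[j]["started_at"])
    if job_id not in JOBS:
        return {"status": "unknown_job", "job_id": job_id}
    return JOBS[job_id]
```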


[1] https://flask.palletsprojects.com/en/1.1.x/patterns/celery/
[2] https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-xxii-background-jobs