This repository has been archived by the owner on Nov 14, 2023. It is now read-only.

[Feature] Task Queue System & Rate Limit on API #335

Open · wants to merge 118 commits into develop from feature/celery

Conversation

@kshitijrajsharma (Member) commented Sep 22, 2022

This will also solve the failing GitHub Action issue of #326.

Features:

  • You can launch any number of workers from the command line.
  • Workers are trackable, meaning each worker's queue can be visualized and controlled through Flower. If a worker fails, we can see why it failed and the exact traceback it produced at that time, which is saved to the Redis DB.
  • The API will handle any exception raised within a task and return a null result to the user in the meantime; the error reason will be recorded in the Redis DB.
  • No user information will be saved for now, but each task is differentiated by a unique ID, which is the same ID that will be available in S3 after the export is done, so it will be easy to debug if anything happens in the future! The API and worker share the same ID throughout the work.
  • The worker is set up in a separate .py file, api_worker.py, which lets us distinguish the API endpoints that need the queue from those that do not, so the rest of the endpoints perform as-is.
  • Changes to the rawdata endpoint are not backward compatible, hence rawdata comes with v2 of the API; v1 will work as-is without the queue system.
  • Introduces a rate limit keyed on the request IP through f349755 for rawdata; it can be used on any endpoint with the decorator, and the default is 5/minute (see the sketch just below).
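
For orientation, here is a minimal sketch of how that decorator can be wired up, assuming slowapi as the limiter library (the decorator line matches the diff reviewed below); the app wiring and handler names here are illustrative, not the PR's exact code.

```python
# Sketch only: assumes slowapi provides the `limiter` used in the decorator.
from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

export_rate_limit = 5  # the 5 requests/minute default mentioned above

limiter = Limiter(key_func=get_remote_address)  # key each request by client IP
app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/current-snapshot/")
@limiter.limit(f"{export_rate_limit}/minute")  # HTTP 429 once the limit is hit
def request_snapshot(request: Request):
    # enqueue the export task here and hand back its unique task ID
    return {"task_id": "example"}
```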

Implementation details:

  • Redis needs to be hosted. The same goes for the internal Flower dashboard that monitors the Celery workers; it can be separated from the API and is not hard-coded to live within it (a sketch of the worker module follows this list).
  • All the local installation steps for this implementation have been set up in the GitHub build, and also through Docker!
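
A minimal sketch of what the separate api_worker.py module described above could look like; the Redis URLs and task body are assumptions for illustration, not the PR's actual code.

```python
# api_worker.py — illustrative sketch; broker/backend URLs are assumptions.
from celery import Celery

celery_app = Celery(
    "api_worker",
    broker="redis://localhost:6379/0",   # queue the API pushes jobs onto
    backend="redis://localhost:6379/0",  # where results and tracebacks land
)

@celery_app.task(bind=True)
def process_raw_data(self, params: dict) -> dict:
    # self.request.id is the unique task ID shared with the API; if this task
    # raises, Celery records the exception and traceback in the Redis backend.
    return {"task_id": self.request.id, "params": params}

# Launch any number of workers from the command line, e.g.:
#   celery --app api_worker worker --loglevel INFO --concurrency 4
# and monitor them with Flower:
#   celery --app api_worker flower
```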

Test This PR:

Simply check out this feature/celery branch and start from docs/GETTING_STARTED_WITH_DOCKER.

This PR involves some major changes to the documentation, so I recommend going through the docs; it will be very useful for testing.

  • It introduces a new doc, config_doc.md, where all the config-related docs have been moved and enhanced.
  • The README has been refactored as well, and the Docker setup documentation is also up now.
  • It also enables the build.yml workflow to showcase a successful installation of the API with a sample request call.

Once the API is up and running, post the sample request provided on the rawdata endpoint. It should give you your unique task ID.

Follow these steps to track the status:

  • Post your request here and it will be placed on the queue; the endpoint will return the following: { "task_id": "your task_id", "track_link": "/current-snapshot/tasks/task_id/" }
  • Now navigate to /current-snapshot/tasks/ with your task ID to track the progress and result (a walk-through follows these steps).
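
The same steps as a hedged walk-through with Python's requests library; the host, port, and payload are assumptions for illustration — use the sample request from the docs for a real call.

```python
import requests

BASE = "http://127.0.0.1:8000/v2"  # assumed local root for the v2 endpoints

payload = {  # illustrative payload; take the real sample from the docs
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[83.96, 28.19], [83.97, 28.19], [83.97, 28.20],
                         [83.96, 28.20], [83.96, 28.19]]],
    }
}

task = requests.post(f"{BASE}/current-snapshot/", json=payload).json()
print(task)  # {"task_id": "...", "track_link": "/current-snapshot/tasks/<id>/"}

# Poll the track link until the worker reports a final state and the result
status = requests.get(f"{BASE}{task['track_link']}").json()
print(status)
```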

Meanwhile, navigate to Flower to see what the worker is doing and the job status/info.

You can hit /current-snapshot/ multiple times; once you exceed 5 requests per minute, the API will block the request and you should see a "too many requests" error.
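
A quick way to see the limiter kick in, reusing the same assumed host and an illustrative payload as above:

```python
import requests

payload = {"geometry": {"type": "Polygon", "coordinates": [[[83.96, 28.19],
           [83.97, 28.19], [83.97, 28.20], [83.96, 28.19]]]}}  # sample area

for i in range(6):
    r = requests.post("http://127.0.0.1:8000/v2/current-snapshot/", json=payload)
    print(i + 1, r.status_code)  # expect 429 once the 5/minute limit is hit
```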

@kshitijrajsharma kshitijrajsharma changed the title Feature : Task Queue System [Feature] Task Queue System Sep 22, 2022
Dockerfile — outdated review thread (resolved)
@kshitijrajsharma (Member, Author) commented Sep 23, 2022

> I fail to run the project with docker-compose; it seems it doesn't find my database, whereas I can find it locally.
>
> [screenshot]
>
> In fact, if I connect to the app container and install postgresql-client, I get the error.
>
> [screenshot]
>
> How should I configure the app to have access to my database?
>
> For information, I run postgis with docker-compose, which works well in the develop branch. I added the postgis service to the depends_on section of app, but it changed nothing.
>
> ```yaml
>   postgis:
>     image: postgis/postgis:14-3.3
>     container_name: "postgis"
>     ports:
>       - "5439:5432"
>     environment:
>       - POSTGRES_DB=test
>       - POSTGRES_USER=postgres
>       - POSTGRES_PASSWORD=admin
>     restart: on-failure
>     logging:
>       driver: "json-file"
>       options:
>         max-size: "200k"
>         max-file: "10"
> ```

Oh! I think I know the problem. For me it works because I use an SSH tunnel, but that may not work for a Postgres running on the local machine. Since the Docker container requests the connection from outside the host's network, when we specify localhost the container refers to its own localhost instead. Try replacing it with your device's network IP address; it should be able to connect! (I use ifconfig to find the IP.) Let me know if it works! I will include this in the docs as well.
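
As an alternative to hard-coding a LAN IP, recent Docker versions (20.10+) can map the host's gateway inside the container; this compose sketch is an assumption on my side, not part of the PR — the DB host setting would then point at host.docker.internal instead of localhost.

```yaml
services:
  app:
    extra_hosts:
      - "host.docker.internal:host-gateway"  # makes the host reachable by name
```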

@kshitijrajsharma (Member, Author) commented Sep 23, 2022

Update: Replacing localhost with my network IP worked for me with my local Postgres! I found this article on the topic as well.

@kshitijrajsharma kshitijrajsharma changed the title [Feature] Task Queue System [Feature] Task Queue System & Rate Limit on API Sep 24, 2022
@kshitijrajsharma kshitijrajsharma marked this pull request as ready for review September 24, 2022 07:55
README.md — outdated review thread (resolved)


```python
@router.post("/current-snapshot/")
@limiter.limit(f"{export_rate_limit}/minute")
```
@NicolasGrosjean (Contributor) commented Sep 24, 2022:
We should add this limit in the doc.

Maybe this Stack Overflow answer could help to add the response:

[screenshot]

@kshitijrajsharma (Member, Author) replied:

I will add the limit info to the docs and raise a formatted error response.
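
A minimal sketch of what that formatted error response could look like, assuming slowapi raises RateLimitExceeded; the handler name and message here are illustrative.

```python
from fastapi import Request
from fastapi.responses import JSONResponse
from slowapi.errors import RateLimitExceeded

def rate_limit_error_handler(request: Request, exc: RateLimitExceeded) -> JSONResponse:
    # exc.detail carries the limit string, e.g. "5 per 1 minute"
    return JSONResponse(
        status_code=429,
        content={"status": "error", "detail": f"Rate limit exceeded: {exc.detail}"},
    )

# registered on the app in place of slowapi's default plain-text handler:
# app.add_exception_handler(RateLimitExceeded, rate_limit_error_handler)
```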

@kshitijrajsharma kshitijrajsharma marked this pull request as draft September 27, 2022 05:50
@NicolasGrosjean (Contributor) left a comment:

My only remaining feedback is on the documentation of an endpoint that is migrating (#348); on my side, I think this PR can be merged.
