Improve the fetch_package task performance. #145

Open
canassa opened this issue Apr 30, 2015 · 0 comments

Comments

@canassa
Collaborator

canassa commented Apr 30, 2015

The fetch_package task is responsible for fetching a package's metadata from the PyPI API and creating the package models in Localshop. It is called every time Localshop receives a pip install <package> command for a <package> that is not already mirrored on Localshop. The scheduler also calls it once a day to update the packages.

The problem is that this task can take a long time to complete, especially if the package has many releases (e.g., Celery). The bottleneck is not the API access but our own database: the task issues too many INSERTs and too many queries.

Previously, this code was executed in the view itself, which caused pip to time out and retry the request (!). I "fixed" this by moving the code to a task, but the code still needs to be improved.

Some ideas:

  • Migrate some columns from the Release model to the Package model.
  • Try to use bulk inserts.
  • Split the code into two functions: one for inserting a brand-new package into the system and one for updating an existing package in Localshop.
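
To make the last two ideas more concrete, here is a rough sketch of what a split into an "insert" path and an "update" path with batched inserts could look like. It is not Localshop code: it uses the standard-library sqlite3 module instead of the Django ORM so it can run on its own, and the table names, column names, and function names (create_schema, insert_new_package, update_existing_package) are all made up for illustration.

```python
import sqlite3


def create_schema(conn):
    """Create a toy schema; table and column names are hypothetical."""
    conn.executescript(
        """
        CREATE TABLE package (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE release (
            id INTEGER PRIMARY KEY,
            package_id INTEGER NOT NULL REFERENCES package(id),
            version TEXT NOT NULL,
            UNIQUE (package_id, version)
        );
        """
    )


def insert_new_package(conn, name, versions):
    """Path 1: brand-new package, so no per-release existence checks."""
    unique_versions = list(dict.fromkeys(versions))  # drop duplicates, keep order
    with conn:  # one transaction for the whole package
        cur = conn.execute("INSERT INTO package (name) VALUES (?)", (name,))
        package_id = cur.lastrowid
        # One batched statement instead of one INSERT call per release.
        conn.executemany(
            "INSERT INTO release (package_id, version) VALUES (?, ?)",
            [(package_id, v) for v in unique_versions],
        )
    return package_id


def update_existing_package(conn, package_id, versions):
    """Path 2: known package; insert only the versions that are missing."""
    # A single query loads every known version, instead of one lookup per release.
    rows = conn.execute(
        "SELECT version FROM release WHERE package_id = ?", (package_id,)
    )
    known = {row[0] for row in rows}
    missing = [v for v in dict.fromkeys(versions) if v not in known]
    with conn:
        conn.executemany(
            "INSERT INTO release (package_id, version) VALUES (?, ?)",
            [(package_id, v) for v in missing],
        )
    return missing
```

In the Django code the batched insert would presumably map to QuerySet.bulk_create, and the "load known versions once" step to a single values_list query. How well that fits the current Release and Package models is an assumption that would need checking against the real schema.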
@canassa changed the title from "Improve the fetch_package performance." to "Improve the fetch_package task performance." on Apr 30, 2015