[Improve] Add adaptive, recurring, non-blocking timeout handling for Scrapper.py get_page method #29

Open
rajatkb opened this issue Mar 6, 2020 · 5 comments · May be fixed by #70
Labels: bug, enhancement, gssoc20, hard

Comments

@rajatkb
Owner

rajatkb commented Mar 6, 2020

Requirement

  • The get_page method currently uses a static, manually set timeout duration. It would be great to have an adaptive get_page that does not time out because of brief network instability but also does not wait indefinitely.

Where to look

  • Scrapper.py inside commons

Update: An implementation of exponential wait-time increase is provided in the utility package as the AdaptiveRequest class. The issue is not resolved: on a faulty network the system may keep raising its wait time indefinitely, which leads back to the old problem of effectively having no timeout for a request.
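
To illustrate the remaining gap, here is a minimal sketch of one way to bound the growth: cap each per-request timeout and also enforce an overall time budget. It is not the AdaptiveRequest code from the repository. The names `fetch_with_backoff`, `GiveUp`, `max_timeout` and `max_total` are hypothetical, and it assumes the underlying request call accepts a timeout in seconds and raises `TimeoutError` when that timeout expires.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class GiveUp(Exception):
    """Hypothetical error raised once the overall time budget is spent."""


def fetch_with_backoff(
    fetch: Callable[[float], T],
    base_timeout: float = 1.0,
    factor: float = 2.0,
    max_timeout: float = 30.0,
    max_total: float = 120.0,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fetch(timeout) with a growing timeout until it succeeds.

    Assumption: fetch raises TimeoutError when its timeout expires.
    Any other exception is not retried and propagates to the caller.
    """
    deadline = clock() + max_total
    timeout = base_timeout
    last_error: Optional[Exception] = None
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            # The budget is exhausted, so stop instead of waiting longer.
            raise GiveUp(f"gave up after {max_total} seconds") from last_error
        try:
            # Never wait longer than the cap or the remaining budget.
            return fetch(min(timeout, max_timeout, remaining))
        except TimeoutError as exc:
            last_error = exc
            # Grow the timeout exponentially, but only up to the cap.
            timeout = min(timeout * factor, max_timeout)
```

With these two limits the timeout still adapts to a slow network, but a persistently faulty network ends in a `GiveUp` error after roughly `max_total` seconds rather than an ever-growing wait.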

@rajatkb rajatkb added bug Something isn't working enhancement New feature or request medium GSSOC label for beginner tag gssoc20 GSSOC label for gscco20 tag labels Mar 6, 2020
@vipuldcoder

I am interested in working on this issue.

@rajatkb
Owner Author

rajatkb commented Mar 7, 2020

@vipuldcoder please address the issue you already have first. Without setting up the node and knowing the application, you won't be able to tackle this one.

@rajatkb rajatkb added hard GSSOC label for beginner tag and removed medium GSSOC label for beginner tag labels Mar 20, 2020
@secretshardul

Interested in working on this for #gssoc. I thought of using the Python backoff library. It accepts a give-up time parameter, so the waiting time does not grow to infinity.

@rajatkb
Owner Author

rajatkb commented Mar 25, 2020

Great, you can start working on it and then open a PR for it.

secretshardul pushed a commit to secretshardul/Conference-Notify that referenced this issue Mar 26, 2020
…exponential backoff

- Exponential backoff algorithm used to handle network issues using 'backoff' library
- 'max_time' parameter used to add timeout

References
1. https://github.com/litl/backoff
2. https://en.wikipedia.org/wiki/Exponential_backoff

Fixes rajatkb#29
@secretshardul

Please review #70
