Update the keyphrase service to fall back to requesting content from websites if the crawl service is unavailable #186

Open
ashley-evans opened this issue Aug 7, 2022 · 0 comments

Value Added

Ensures the keyphrase service can continue to function even if the crawl service is unavailable

Description

Currently the keyphrase service is temporally coupled to the crawl service: no keyphrase analysis can be performed until the crawled HTML content has been received from the crawl service.

The keyphrase service should be updated to request the content directly from the website if the content cannot be received from the crawl service for whatever reason.

The content should then be parsed and stored in the same fashion as the cached content from the crawl service.
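
A minimal sketch of the fallback described above, written in Python purely for illustration (the issue does not say which language the service uses). Every name here (`analyse_page`, `from_crawl_service`, `from_website`, `parse`, `store`) is hypothetical, and the real service may decide that the crawl service is "unavailable" differently from a raised exception.

```python
from typing import Callable

# Hypothetical aliases; none of these names come from the project.
ContentSource = Callable[[str], str]   # URL -> raw HTML
Parser = Callable[[str], str]          # raw HTML -> parsed content
Store = Callable[[str, str], None]     # (URL, parsed content) -> None


def analyse_page(
    url: str,
    from_crawl_service: ContentSource,
    from_website: ContentSource,
    parse: Parser,
    store: Store,
) -> str:
    """Get page content, preferring the crawl service, then parse and store it."""
    try:
        html = from_crawl_service(url)
    except Exception:
        # Assumption: any failure means the crawl service content cannot be
        # received, so request the page from the website directly.
        html = from_website(url)

    # Both paths share the same parse and store steps, so the stored
    # result does not depend on where the HTML came from.
    parsed = parse(html)
    store(url, parsed)
    return parsed
```

Passing the two content sources in as callables keeps the fallback decision separate from how the content is fetched, so the behaviour can be exercised with stubs instead of a live crawl service or website.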

Acceptance Criteria

AC01

  • Update the keyphrase service to request page content from websites directly if the content cannot be received from the crawl service for whatever reason.

AC02

  • The keyphrase service's functionality should otherwise be unaffected: the content should still be stored in the parsed content S3 bucket.