Currently, the crawl-urls Lambda function relies on Apify local storage to know which pages have been visited and what is next in the queue. However, the Lambda execution environment is not permanent, so we cannot rely on this storage persisting between messages.
If a message fails to process fully in time and the Lambda execution environment is re-initialised, we cannot continue from where we left off.
Therefore, the crawl-urls Lambda function should be able to restart from a non-initialised environment.
Acceptance Criteria
AC01
Update crawl-urls to be able to restart from the middle of a crawl operation
And the base URL for each new entry added to DynamoDB should be the same as before the restart
AC02
The restart should respect the maximum depth environment variable, with depth measured from the start of the original crawl operation
e.g. If the last page crawled was at depth 12, then on restart that depth should be retained and used for further crawling
AC03
The restart should respect the maximum number of crawl operations
e.g. If the crawl operation had accessed 10 pages before the restart, then the counter for the maximum page limit should start at 10 rather than 0
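
One way to satisfy the three criteria above is to rebuild the crawl state from what has already been persisted, rather than from Apify local storage. The following is a minimal sketch only, not the actual crawl-urls implementation. It assumes each discovered page is stored as a record holding its base URL, its depth, and a visited flag. The `CrawlRecord` shape, the `restoreCrawlState` function, and all field names are hypothetical, and reading the records from DynamoDB is left out.

```typescript
// Hypothetical shape of one persisted record per discovered page.
// Field names are assumptions, not the real DynamoDB schema.
interface CrawlRecord {
  baseUrl: string;  // base URL of the crawl this page belongs to (AC01)
  url: string;
  depth: number;    // depth relative to the original crawl's start (AC02)
  visited: boolean; // true once the page has been crawled
}

interface PendingRequest {
  url: string;
  depth: number;
}

interface RestoredState {
  baseUrl: string;
  visited: Set<string>;
  pending: PendingRequest[];
  pagesCrawled: number;   // counter resumes here rather than at 0 (AC03)
  remainingPages: number; // how many more pages may be crawled
}

// Rebuild the crawl state for one base URL from persisted records.
// Assumes maxDepth is inclusive: pages at depth <= maxDepth may be crawled.
function restoreCrawlState(
  baseUrl: string,
  records: CrawlRecord[],
  maxDepth: number,
  maxPages: number
): RestoredState {
  // Only consider records from the same crawl, so new entries keep its base URL.
  const own = records.filter((r) => r.baseUrl === baseUrl);

  // Pages already crawled before the restart.
  const visited = new Set<string>(
    own.filter((r) => r.visited).map((r) => r.url)
  );

  // Unvisited pages keep their original depth and are dropped if too deep.
  const pending = own
    .filter((r) => !r.visited && r.depth <= maxDepth)
    .sort((a, b) => a.depth - b.depth)
    .map((r) => ({ url: r.url, depth: r.depth }));

  const pagesCrawled = visited.size;
  const remainingPages = Math.max(0, maxPages - pagesCrawled);

  return { baseUrl, visited, pending, pagesCrawled, remainingPages };
}

// Example data: two pages crawled, one still queued, one beyond the depth
// limit, and one record belonging to a different crawl.
const exampleRecords: CrawlRecord[] = [
  { baseUrl: "https://example.com", url: "https://example.com/", depth: 0, visited: true },
  { baseUrl: "https://example.com", url: "https://example.com/a", depth: 1, visited: true },
  { baseUrl: "https://example.com", url: "https://example.com/b", depth: 1, visited: false },
  { baseUrl: "https://example.com", url: "https://example.com/a/deep", depth: 3, visited: false },
  { baseUrl: "https://other.example", url: "https://other.example/", depth: 0, visited: true },
];

const restored = restoreCrawlState("https://example.com", exampleRecords, 2, 10);
console.log(restored.pagesCrawled, restored.remainingPages, restored.pending);
```

In this sketch, a re-initialised environment would call `restoreCrawlState` before processing a message, then seed the crawler's queue from `pending` and stop once `remainingPages` reaches zero. How the crawler is seeded is not shown here and would depend on the Apify version in use.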