Crawling binary files #21
I want to keep the ability to download binary files, but I know that
downloading large binary data could be problematic. What behaviour do you
expect here? Maybe a max file size, or an event handler that inspects the
headers and can cancel a request?
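One way to picture the two options mentioned above (a size cap, and a header check that can cancel the request) is a small predicate over the response headers. This is only a sketch and not supercrawler's API: `shouldDownload`, `maxBytes`, and `allowedTypes` are hypothetical names, the defaults are arbitrary, and it assumes the headers arrive as a plain object with lowercase keys, as Node's `http` module provides them.

```javascript
// Hypothetical helper: decide from response headers whether to keep downloading.
// Assumes `headers` is a plain object with lowercase keys (as in Node's http module).
function shouldDownload(headers, options = {}) {
  const maxBytes = options.maxBytes ?? 5 * 1024 * 1024; // assumed default: 5 MiB
  const allowedTypes = options.allowedTypes ?? ["text/html", "application/xhtml+xml"];

  // Strip parameters such as "; charset=utf-8" before comparing.
  const contentType = String(headers["content-type"] || "")
    .split(";")[0]
    .trim()
    .toLowerCase();
  if (!allowedTypes.includes(contentType)) {
    return false;
  }

  // Content-Length may be absent (e.g. chunked responses); only enforce it when present.
  const rawLength = headers["content-length"];
  if (rawLength !== undefined) {
    const length = Number.parseInt(rawLength, 10);
    if (Number.isFinite(length) && length > maxBytes) {
      return false;
    }
  }
  return true;
}
```

A caller could run this check as soon as the response headers arrive and abort the request when it returns `false`. Because `Content-Length` can be missing, a hard limit would still need a byte counter on the response body.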
On Tue, 4 Dec 2018, 08:29 joshua-mbg wrote:
supercrawler is picking up ALL links on a page. If there are links to
movie files, images, or any other large files, it will add these URLs to
the queue. The URLs get passed to request, which tries to download them.
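The behaviour described here could also be limited before a URL ever reaches the queue, by skipping links whose path ends in a known binary extension. The following is a minimal sketch under that assumption: `looksBinary`, `filterCrawlable`, and the extension list are illustrative names and values, not part of supercrawler. An extension check is only a heuristic, since a server can return any content type from any path.

```javascript
// Illustrative list; a real deployment would tune this.
const BINARY_EXTENSIONS = new Set([
  ".jpg", ".jpeg", ".png", ".gif", ".mp4", ".mov", ".avi", ".mp3", ".zip", ".pdf", ".exe",
]);

// Hypothetical helper: true when the URL path ends in a listed extension.
function looksBinary(url) {
  let pathname;
  try {
    pathname = new URL(url).pathname.toLowerCase();
  } catch (err) {
    return false; // not an absolute URL; leave the decision to the caller
  }
  const lastSegment = pathname.slice(pathname.lastIndexOf("/") + 1);
  const dot = lastSegment.lastIndexOf(".");
  return dot !== -1 && BINARY_EXTENSIONS.has(lastSegment.slice(dot));
}

// Keep only the links that do not look like binary downloads.
function filterCrawlable(urls) {
  return urls.filter((url) => !looksBinary(url));
}
```

Filtering by extension avoids the request entirely, at the cost of missing binary content served from extensionless paths, so it pairs naturally with a check on the response headers.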
I have run into the same problem. I'm working on a fix for this issue.
I finally addressed this issue. I believe it is resolved with #45.