Does this website have some kind of protection? It could be blocking the IP ranges of your VPC. Did you try xvfb on the server?
-
This is a very strange issue. To test it, I deployed the same script on a new VPS running Ubuntu 20.04 (the previous one ran CentOS 8), and the issue reproduces on both.
Before I start, let me upload a video that shows the website being scraped.
2.mp4
If you can't see it, please download the zipped MP4 video from the attachment.
2.zip
As you can see, the video starts right after a successful login, when the browser enters the user admin management page. Then I send a request to https://www.jin10.com and wait for the following element to appear.
You can see it works on my local machine (Arch Linux) using ferrum + chrome; in fact, it also works with ferrum + chrome headless.
But the same code does not work on my VPS. I tested on two VPSes, one running CentOS 8 and one running Ubuntu 20.04, and on both the script gets stuck waiting for the tabs in the red box in the screenshot above to appear.
The following is my scraping code:
I have to admit that headless mode on my local machine occasionally fails as well, and headless mode on the VPS did work several days ago; in most cases it was just very, very slow while waiting for the tabs when using headless Chrome.
Anyway, please guide me on how to find out where the issue comes from.
I added a logger to the instance, like this:
It produces many logs initially, but once the code keeps looping to find the element, nothing more is written to log/chrome_headless.log.
Thank you.
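One way to narrow this down, as a rough sketch only: put a deadline on the wait loop so that it fails loudly instead of spinning silently once the log goes quiet. The helper below is plain Ruby and does not rely on any Ferrum API; the name wait_until and its timeout and interval parameters are made up for illustration and are not taken from the original script.

```ruby
require "timeout" # provides Timeout::Error

# Hypothetical helper: call the block repeatedly until it returns a truthy
# value, or raise Timeout::Error once `timeout` seconds have elapsed.
def wait_until(timeout: 30, interval: 0.5)
  # A monotonic clock is not affected by system clock adjustments.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  attempts = 0

  loop do
    attempts += 1
    result = yield
    return result if result

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise Timeout::Error,
            "condition not met after #{attempts} attempts (#{timeout}s)"
    end

    sleep interval
  end
end

# Assumed usage with Ferrum (the selector is a placeholder, not from the post):
#   tabs = wait_until(timeout: 60) { browser.at_css(".placeholder-selector") }
```

With Ferrum, the block would typically wrap the element lookup, and the rescue branch for Timeout::Error is a reasonable place to save a screenshot (for example with browser.screenshot(path: ...)), so you can compare what the VPS actually rendered against the local run.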