selenium / chromedriver should clean up on exit #155
Comments
I think this is the reason why the script starts failing after two weeks; see #145 and also https://stackoverflow.com/questions/71351792/after-one-selenium-timeoutexception-i-get-always-sessionnotcreatedexception I noticed that when I try to start chromedriver manually, it fails with "port already in use". Checking my process list, there are indeed leftovers. A workaround is to kill all chromedriver instances in your start script.
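A minimal sketch of that workaround in Python, assuming a Linux-like system where `ps -eo pid=,comm=` is available. The helper names `find_pids` and `kill_leftovers` are hypothetical and not part of flathunter. Note that this terminates every chromedriver on the machine, including ones started by other tools.

```python
from __future__ import annotations

import os
import signal
import subprocess


def find_pids(ps_output: str, name: str = "chromedriver") -> list[int]:
    """Return PIDs from `ps -eo pid=,comm=` output whose command contains `name`."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)  # "<pid> <command>"
        if len(parts) == 2 and parts[0].isdigit() and name in parts[1]:
            pids.append(int(parts[0]))
    return pids


def kill_leftovers(name: str = "chromedriver") -> list[int]:
    """Send SIGTERM to every process whose command name contains `name`."""
    ps_output = subprocess.run(
        ["ps", "-eo", "pid=,comm="], capture_output=True, text=True, check=True
    ).stdout
    pids = find_pids(ps_output, name)
    for pid in pids:
        try:
            os.kill(pid, signal.SIGTERM)
        except (ProcessLookupError, PermissionError):
            pass  # already gone, or owned by another user
    return pids
```

Calling `kill_leftovers()` once at the top of a start script, before the crawler launches, would clear stale instances and free the port they hold.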
|
I also noticed this: when, for example, the script crashes or is shut down, it never cleans up chromedriver properly. However, I am not using it right now and do not have the time to find out where this would need to be fixed. So my easy solution was to just restart daily, and then regularly either restart the machine or clean up chromedriver. |
The problem is that the script starts chromedriver here: https://github.com/flathunters/flathunter/blob/main/flathunter/abstract_crawler.py#L48-L56 One could put a try/except around the infinite for loop and quit the driver there. But this would still cause issues if you kill the process manually. So maybe closing all instances of webdriver before starting a new one is the best solution here? How come you are not using it anymore? Did you find a flat yet? Thanks for all your help with supervisor etc. |
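The idea of wrapping the loop could be sketched roughly as follows. This is not flathunter's actual code: `run_crawler`, `crawl_once`, and `install_sigterm_handler` are hypothetical names, and `driver` stands for any object with a `quit()` method, such as a Selenium WebDriver. Using `finally` rather than `except` means the driver is closed on crashes and on a clean exit. Converting SIGTERM into `SystemExit` also covers a plain `kill`, though `kill -9` (SIGKILL) cannot be caught by any handler.

```python
import signal


def _raise_system_exit(signum, frame):
    # Turn the signal into a normal exception so `finally` blocks still run.
    raise SystemExit(128 + signum)


def install_sigterm_handler():
    """Make a plain `kill <pid>` unwind the stack instead of dying silently.

    Must be called from the main thread.
    """
    signal.signal(signal.SIGTERM, _raise_system_exit)


def run_crawler(driver, crawl_once, max_iterations=None):
    """Call crawl_once(driver) in a loop and always quit the driver afterwards.

    max_iterations=None means loop forever; a number is handy for testing.
    """
    iterations = 0
    try:
        while max_iterations is None or iterations < max_iterations:
            crawl_once(driver)
            iterations += 1
    finally:
        # Runs on normal exit, on exceptions, and on SystemExit from SIGTERM.
        driver.quit()
    return iterations
```

A caller would run `install_sigterm_handler()` once at startup and then `run_crawler(driver, crawl_once)`.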
Yeah, I would have to dig into what would be the best way to do it properly. But for now I would probably also do it like you. |
Maybe a quick explanation for why we do not close the driver sessions while crawling. Probably the accumulation happens when the process is not properly stopped. Killing the processes with a script before starting flathunter is a good approach, but users should do this themselves because it could vary from use case to use case. Another solution would be to run it in a Docker container. We could implement a way for flathunter to close all of its own chromedriver instances before starting the crawler. However, this would be a bit more work. I do not see any other way. |
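One way to close only flathunter's own instances, rather than every chromedriver on the machine, would be to record the PID of each driver that gets started and terminate the recorded PIDs on the next start. The sketch below is an assumption about how that could look, not an existing flathunter feature: `record_pid`, `kill_recorded`, and the PID file are hypothetical. With Selenium's Python bindings the PID is, to my knowledge, reachable as `driver.service.process.pid`, but treat that as an assumption to verify. A recorded PID may also have been reused by an unrelated process after a reboot, so a real implementation should check the process name before signalling.

```python
from __future__ import annotations

import os
import signal
from pathlib import Path


def record_pid(pid_file: Path, pid: int) -> None:
    """Append the PID of a freshly started driver to the PID file."""
    with pid_file.open("a") as fh:
        fh.write(f"{pid}\n")


def kill_recorded(pid_file: Path) -> list[int]:
    """Send SIGTERM to every PID listed in pid_file, then delete the file."""
    if not pid_file.exists():
        return []
    signalled = []
    for entry in pid_file.read_text().split():
        if not entry.isdigit():
            continue  # ignore corrupt lines
        try:
            os.kill(int(entry), signal.SIGTERM)
            signalled.append(int(entry))
        except (ProcessLookupError, PermissionError):
            pass  # process already exited, or is not ours to signal
    pid_file.unlink()
    return signalled
```

At startup the crawler would call `kill_recorded(pid_file)` first, then `record_pid(pid_file, ...)` after each new driver is created.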
What is the way to properly stop it? In my experience, especially when it sometimes crashed due to a malformed 2captcha response, it never cleaned up on killing or restarting (as far as I could tell). So
|
In my observations, upon quitting or restarting flathunter, the selenium/google-chrome processes are not closed properly. It would likely need something like chromedriver.quit() somewhere, but I am not sure where.