Iota is a web scraper that finds all of the images and links (sub-URLs) on a webpage. To implement these features, the code uses the Python libraries Selenium, Requests, webdriver_manager, and BeautifulSoup.
Warning: this project is for educational purposes only. The author is not responsible for any damage to public websites caused by users of this tool.
- Supports scraping images and links from the raw HTML of webpages
- Uses the Requests library and BeautifulSoup
- Cannot handle content rendered by JavaScript
- Requires a WebDriver
- Uses Requests, Selenium with ChromeDriver, webdriver_manager, and BeautifulSoup
- Able to handle content rendered by JavaScript
- Able to scrape most websites that use anti-scraping measures
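The raw-HTML mode listed above comes down to reading the `src` of every `<img>` tag and the `href` of every `<a>` tag, then resolving them against the page URL. The sketch below illustrates that idea only and is not Iota's actual code: it swaps in Python's built-in `html.parser` for BeautifulSoup so that it runs without extra dependencies, and it works on an HTML string rather than fetching a page with Requests. The names `ImageLinkParser` and `extract_images_and_links` and the `example.com` URL are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkParser(HTMLParser):
    """Collect absolute URLs from <img src> and <a href> attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # page URL used to resolve relative paths
        self.images = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))


def extract_images_and_links(html, base_url):
    """Return (images, links) found in an HTML string (hypothetical helper)."""
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.images, parser.links


# Static HTML only: anything a page adds later via JavaScript is not seen here.
sample_html = '<a href="/about">About</a><img src="img/logo.png" alt="">'
images, links = extract_images_and_links(sample_html, "https://example.com/docs/")
print(images)
print(links)
```

Because the parser only sees the markup it is given, this approach shares the limitation noted above: images or links injected by JavaScript after page load are missed, which is the case the WebDriver-based mode is meant to cover.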
pip install -r requirements.txt
Run python iota1.py -h to see the available options:
usage: iota.py [-h] [-img] [-all_img] [-link] [url]

positional arguments:
  url         The URL of the target website/webpage

optional arguments:
  -h, --help  show this help message and exit
  -img        Find all of the image on the webpage
  -all_img    Find all of the image on the webpage and subwebpages
  -link       Find all of the suburls/links on the webpage
Example:
python iota2.py -img https://www.w3schools.com/html/html_classes.asp
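For readers curious how a command line like this can be defined, here is a minimal sketch using Python's standard `argparse` module. It reproduces the flags and help strings from the usage text above, but the function name `build_parser` is hypothetical, and whether Iota builds its parser this way is an assumption.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented usage (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="iota.py")
    # nargs="?" makes the positional URL optional, matching "[url]" in the usage line.
    parser.add_argument("url", nargs="?",
                        help="The URL of the target website/webpage")
    # Each flag is a simple on/off switch that defaults to False.
    parser.add_argument("-img", action="store_true",
                        help="Find all of the image on the webpage")
    parser.add_argument("-all_img", action="store_true",
                        help="Find all of the image on the webpage and subwebpages")
    parser.add_argument("-link", action="store_true",
                        help="Find all of the suburls/links on the webpage")
    return parser


# Parse the same arguments as the example command above.
args = build_parser().parse_args(
    ["-img", "https://www.w3schools.com/html/html_classes.asp"]
)
print(args.img, args.all_img, args.link, args.url)
```

With this layout, a script would check `args.img`, `args.all_img`, and `args.link` in turn and run the matching scraping routine against `args.url`.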