gabriellydeandrade/webCrawler

Python Web Crawler: Using Selenium

About

The main objective of this project is to create a crawler that extracts the title, name and URL of every product on this website: http://www.epocacosmeticos.com.br/.
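As a rough illustration of the extraction step, the sketch below pulls the three fields from one product page with BeautifulSoup. It assumes the title comes from the `<title>` tag and the product name from the first `<h1>`. Those selectors and the `extract_product` helper are hypothetical, not taken from crawler.py, and the real site markup may differ.

```python
from bs4 import BeautifulSoup


def extract_product(html, url):
    """Return the title, name and url of one product page as a dict."""
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.find("title")
    # Assumption: the product name is the first <h1> on the page.
    name_tag = soup.find("h1")
    return {
        "title": title_tag.get_text(strip=True) if title_tag else None,
        "name": name_tag.get_text(strip=True) if name_tag else None,
        "url": url,
    }


if __name__ == "__main__":
    # Made-up HTML standing in for a downloaded product page.
    sample = (
        "<html><head><title>Example Perfume - Shop</title></head>"
        "<body><h1> Example Perfume </h1></body></html>"
    )
    print(extract_product(sample, "http://example.com/product"))
```

Missing tags yield `None` instead of raising, so a page that does not match the assumed layout can be skipped or logged by the caller.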

Requirements

Mozilla Firefox
geckodriver (the WebDriver for Firefox)
Python 3
Selenium
BeautifulSoup4
Requests

Installation

1. Clone or download this repository

You can clone it with git

git clone https://github.com/Gabrielly-Andrade/webCrawler.git

or you can download the zip package

2. Install the Firefox browser and geckodriver

3. Install Python 3

4. Install the packages

You can install the packages in this step using pip

  • Pip

    4.1 Selenium
    pip install selenium
    
    4.2 BeautifulSoup4
    pip install beautifulsoup4
    
    4.3 Requests
    pip install requests
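Once the steps above are done, a short script can confirm that nothing is missing. This is a minimal sketch using only the standard library. The `check_requirements` helper is hypothetical, and the executable names `firefox` and `geckodriver` are assumptions that may differ on your platform.

```python
import importlib.util
import shutil

# Import names for the pip packages listed above (beautifulsoup4 imports as bs4).
PACKAGES = ("selenium", "bs4", "requests")
# Assumption: these are the executable names on PATH; they can vary by platform.
EXECUTABLES = ("firefox", "geckodriver")


def check_requirements():
    """Map each requirement to True if it was found on this machine."""
    found = {}
    for name in PACKAGES:
        found[name] = importlib.util.find_spec(name) is not None
    for name in EXECUTABLES:
        found[name] = shutil.which(name) is not None
    return found


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Anything reported as missing points back to the matching installation step.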
    

Running

After installing everything, open a terminal, navigate to the src folder with cd, and run

python crawler.py
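If the crawler fails to start, it can help to check separately that Selenium is able to drive Firefox. The sketch below assumes Selenium 4 and a geckodriver that Selenium can locate. The `make_firefox` and `page_title` helpers are hypothetical and not part of crawler.py, and running headless is an assumption made for convenience.

```python
def make_firefox():
    """Start Firefox through geckodriver (both must be installed)."""
    # Imported here so page_title() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")  # assumption: no visible window is needed
    return webdriver.Firefox(options=options)


def page_title(driver, url):
    """Load url in the given driver and return the page title."""
    driver.get(url)
    return driver.title


if __name__ == "__main__":
    driver = make_firefox()
    try:
        print(page_title(driver, "http://www.epocacosmeticos.com.br/"))
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

If this prints a page title, the browser, driver and Selenium installation are working together, and any remaining problem is more likely in the crawler itself or in the site.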
