Web page scraping for my other projects

taras-z/web-grabber

Web page grabber

Web page scraping and parsing to extract data for the following projects.

The project is based on Apache Airflow and can be deployed with Docker.

NB: Before deploying, set the user, password and key in docker-compose.yml by replacing the placeholders <REPLACE_BY_AIRFLOW_USER>, <REPLACE_BY_AIRFLOW_PASSWORD> and <REPLACE_BY_RANDOM_STRING>.
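
A minimal sketch of one way to produce a value for <REPLACE_BY_RANDOM_STRING>, using only the Python standard library. This README does not say what the key is used for, so the format is an assumption: 32 random bytes, URL-safe base64 encoded, which is also the shape of a Fernet key. The function name `generate_key` is hypothetical and not part of this repository.

```python
import base64
import os


def generate_key(num_bytes: int = 32) -> str:
    """Return a URL-safe base64 string built from cryptographically random bytes.

    Assumption: any sufficiently long random string is acceptable for the
    <REPLACE_BY_RANDOM_STRING> placeholder in docker-compose.yml.
    """
    return base64.urlsafe_b64encode(os.urandom(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Print a fresh key to paste into docker-compose.yml.
    print(generate_key())
```

Generate the key once and keep it stable across restarts. If it protects stored data, changing it later may make that data unreadable.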

News grabbing

The initialization script is ./db/init/utils/init_db_sources.py.

The DAG is defined in ./dags/grab_rss.py.

The result can be accessed in Redis DB #1.
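
A hedged sketch of how the stored results could be inspected from Python with the third-party `redis` package. The key names and value types written by the DAG are not documented here, so the code lists whatever keys exist and reads each one according to its Redis type. The host, port and helper names (`connect`, `dump_redis_db`) are assumptions for illustration.

```python
from typing import Any, Dict


def dump_redis_db(client: Any, pattern: str = "*") -> Dict[str, Any]:
    """Return {key: value} for every key matching `pattern`.

    The reader is chosen by Redis type because the layout of the stored
    data is not documented; unsupported types map to None.
    """
    readers = {
        "string": client.get,
        "hash": client.hgetall,
        "list": lambda k: client.lrange(k, 0, -1),
        "set": client.smembers,
        "zset": lambda k: client.zrange(k, 0, -1, withscores=True),
    }
    result: Dict[str, Any] = {}
    # SCAN iterates incrementally, so it does not block the server like KEYS.
    for key in client.scan_iter(match=pattern):
        reader = readers.get(client.type(key))
        result[key] = reader(key) if reader else None
    return result


def connect(db: int, host: str = "localhost", port: int = 6379) -> Any:
    """Open a connection to one logical Redis database.

    Assumption: Redis is reachable on localhost:6379; adjust to match the
    service definition in docker-compose.yml.
    """
    import redis  # third-party: pip install redis

    # decode_responses=True returns str instead of bytes for keys and values.
    return redis.Redis(host=host, port=port, db=db, decode_responses=True)
```

For example, `dump_redis_db(connect(db=1))` returns the contents of DB #1 as a dictionary.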

Currency rate grabbing

No initialization is required.

The DAG is defined in ./dags/grab_currency_rate.py.

The result can be accessed in Redis DB #2.
