Web page grabber

Web page scraping and parsing for data extraction, covering the following projects.

The project is based on Apache Airflow and can be deployed with Docker.

NB: The user, password, and key must be specified in docker-compose.yml by replacing the placeholders <REPLACE_BY_AIRFLOW_USER>, <REPLACE_BY_AIRFLOW_PASSWORD>, and <REPLACE_BY_RANDOM_STRING>.
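One way to produce a value for <REPLACE_BY_RANDOM_STRING> is Python's standard `secrets` module. This is a minimal sketch: the helper name `generate_secret` and the 32-byte length are assumptions, not part of this project, and the README does not say which format the key must have. If docker-compose.yml uses the value as an Airflow Fernet key, a Fernet-formatted key is needed instead.

```python
import secrets


def generate_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    # token_urlsafe base64-encodes the random bytes, so the result is
    # longer than num_bytes and contains only letters, digits, '-' and '_'.
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into docker-compose.yml by hand.
    print(generate_secret())
```

Generating the value outside the compose file keeps it out of shell history and makes it easy to rotate later.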

News grabbing

The initialization script is ./db/init/utils/init_db_sources.py.

The DAG is defined in ./dags/grab_rss.py.

The result can be accessed in Redis DB #1.

Currency rate grabbing

No initialization is required.

The DAG is defined in ./dags/grab_currency_rate.py.

The result can be accessed in Redis DB #2.
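The results in Redis DB #1 and DB #2 can be inspected with the `redis` Python client. The sketch below makes several assumptions: Redis is reachable at `localhost:6379` (adjust to the port mapping in docker-compose.yml), and the helper names `list_keys` and `print_results` are hypothetical. Because the key names and value formats written by the DAGs are not documented here, the code only lists each key with its Redis type.

```python
from typing import Dict


def list_keys(client, pattern: str = "*") -> Dict[str, str]:
    """Map every key matching pattern to its Redis type (string, hash, ...)."""
    # scan_iter walks the keyspace incrementally instead of blocking like KEYS.
    return {key: client.type(key) for key in client.scan_iter(match=pattern)}


def print_results(host: str = "localhost", port: int = 6379) -> None:
    """Print the keys stored in DB #1 (news) and DB #2 (currency rates)."""
    import redis  # third-party client: pip install redis

    for db in (1, 2):
        # decode_responses=True returns str instead of bytes.
        client = redis.Redis(host=host, port=port, db=db, decode_responses=True)
        print(f"Redis DB #{db}:")
        for key, kind in sorted(list_keys(client).items()):
            print(f"  {key} ({kind})")
```

Calling `print_results()` after a DAG run shows which keys exist; from there, the matching read command (`get`, `hgetall`, `lrange`, and so on) depends on the type reported for each key.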