- Getting crypto token data
- Loading data into the database
- Analysing coin data with SQL
- Finance meets Data Science
Library | Version |
---|---|
requests | 2.32.3 |
typer | 0.12.4 |
psycopg2 | 2.9.9 |
jupyter | 1.0.0 |
pandas | 2.2.2 |
matplotlib | 3.9.2 |
SQLAlchemy | 2.0.32 |
In the project's files, you will find a `Pipfile` that includes all these libraries in their specific versions. If you have installed `pipenv`, you can run `pipenv install` to install the dependencies in the environment, and activate it with `pipenv shell`.
If installing the dependencies fails because of `psycopg2`, you may need to install the `libpq-dev` system package, which this library depends on.
# If you hit the error above, install `libpq-dev` (example for Linux/Ubuntu).
sudo apt-get install libpq-dev
You'll also find a `docker-compose.yml` file that sets up the database, user, password, and tables (including the data provided in the repository).
- Database name: `data_exam`
- Local port: `5435`
- Test user credentials:
  - Name: `data_ex`
  - Password: `data_am`
To start the container with the database, run `docker compose up -d` in the project's main directory. Additionally, if you want to connect to the database via CLI, you can run:
psql -h localhost -p 5435 -U data_ex -d data_exam
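The same connection can be opened from Python. The following is a minimal sketch, assuming the container is running with the settings listed above; `build_db_url` and `check_connection` are hypothetical helper names for illustration and are not part of the project's code.

```python
from urllib.parse import quote_plus

# Values taken from the docker-compose setup described above.
DB_SETTINGS = {
    "host": "localhost",
    "port": 5435,
    "user": "data_ex",
    "password": "data_am",
    "dbname": "data_exam",
}


def build_db_url(settings: dict) -> str:
    """Build a SQLAlchemy-style URL for the psycopg2 driver (hypothetical helper)."""
    return (
        "postgresql+psycopg2://"
        f"{quote_plus(settings['user'])}:{quote_plus(settings['password'])}"
        f"@{settings['host']}:{settings['port']}/{settings['dbname']}"
    )


def check_connection(settings: dict) -> bool:
    """Return True if a trivial query succeeds. Requires the container to be up."""
    import psycopg2  # imported lazily so build_db_url works without the driver

    conn = psycopg2.connect(**settings)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
            return cur.fetchone() == (1,)
    finally:
        conn.close()


if __name__ == "__main__":
    print(build_db_url(DB_SETTINGS))
```

The URL string can be passed to SQLAlchemy's `create_engine`, while `check_connection` talks to `psycopg2` directly.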
Once everything is set up, you can run two different commands using the `main.py` file:
- `day-info` (accepts two parameters)
- `today-info` (accepts no parameters and is also used by the CRON job)
The `day-info` command extracts cryptocurrency data from CoinGecko, transforms it, and on every execution loads it into a local CSV (and, optionally, the database).
This command has two parameters:
- `coin`: The name of a cryptocurrency accepted by the CoinGecko API.
- `date`: The date in ISO 8601 format.
Additionally, it includes an optional flag:
- `--db-store`: If set, this flag updates both the local CSV and the database.
The CSV file will be stored in the 'data' folder with the name 'coins_data.csv'.
Here are a few examples:
python main.py day-info ethereum 2024-08-19
python main.py day-info cardano 2024-07-31 --db-store
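As a rough illustration of the extract step, the sketch below assumes the command queries CoinGecko's public `/coins/{id}/history` endpoint, which expects the date as `DD-MM-YYYY` rather than ISO 8601. The README does not state which endpoint `main.py` actually uses, and the function names here are hypothetical.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "https://api.coingecko.com/api/v3"


def to_coingecko_date(iso_date: str) -> str:
    """Convert 'YYYY-MM-DD' to the 'DD-MM-YYYY' form used by the history endpoint."""
    return date.fromisoformat(iso_date).strftime("%d-%m-%Y")


def history_url(coin: str, iso_date: str) -> str:
    """Build the request URL for one coin on one day (assumed endpoint)."""
    query = urlencode({"date": to_coingecko_date(iso_date), "localization": "false"})
    return f"{BASE_URL}/coins/{coin}/history?{query}"


def fetch_day_info(coin: str, iso_date: str) -> dict:
    """Download the raw JSON payload. Needs network access and `requests`."""
    import requests  # listed in the Pipfile; imported lazily for offline use

    response = requests.get(history_url(coin, iso_date), timeout=30)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    print(history_url("ethereum", "2024-08-19"))
```

Passing an invalid date such as `2024-13-01` raises `ValueError` in `date.fromisoformat` before any request is made.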
The `today-info` command does the same as `day-info`, but without requiring any parameters. It automatically extracts, transforms, and loads data for three cryptocurrencies (Bitcoin, Ethereum, and Cardano) for the current date. This command is also scheduled to run daily at 3 AM via CRON.
Here's an example:
python main.py today-info --db-store
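Conceptually, `today-info` amounts to running `day-info` once per coin with today's date. A minimal sketch of that expansion follows; the `COINS` tuple and `today_targets` function are assumptions for illustration, not the project's actual implementation.

```python
from datetime import date

# CoinGecko identifiers for the three coins named above.
COINS = ("bitcoin", "ethereum", "cardano")


def today_targets(today=None):
    """Return the (coin, ISO date) pairs that a today-info run would process."""
    day = (today or date.today()).isoformat()
    return [(coin, day) for coin in COINS]


if __name__ == "__main__":
    for coin, day in today_targets():
        # Equivalent manual invocation for each pair.
        print(f"python main.py day-info {coin} {day} --db-store")
```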
To enable the CRON job, run `crontab -e` and add the following line to schedule the `today-info` command to run every day at 3 AM.
# minute | hour | day of the month | month | day of the week
0 3 * * * /usr/bin/pipenv run python /your_data_path/main.py today-info --db-store
If the job doesn't run, check from the shell whether the cron service is active.
service cron status # Cron check.
sudo systemctl enable cron # Cron enable if it was disabled.
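To double-check a crontab entry before saving it, the five schedule fields can be separated from the command. This is a small sketch with a hypothetical `split_cron_entry` helper; it only splits the line and does not validate field values.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_entry(entry: str):
    """Split a crontab line into a dict of schedule fields and the command string."""
    parts = entry.split(maxsplit=5)
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]


if __name__ == "__main__":
    line = "0 3 * * * /usr/bin/pipenv run python /your_data_path/main.py today-info --db-store"
    schedule, command = split_cron_entry(line)
    print(schedule)
    print(command)
```

For the entry above, `minute` is `0` and `hour` is `3`, with the remaining fields left as `*`, which corresponds to every day at 3 AM.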
- For the third exercise, the queries to run in PostgreSQL are located in the `queries_task_3.sql` file inside the 'data' folder.
- For the fourth exercise, you'll find a Jupyter Notebook file in the main folder that contains the first part of the exercise.
- Additionally, plots will be generated in the 'plots' folder; three example plots are already included by default.
- To update the database with recent data without manually running the `main.py` file multiple times, I've included an `exercise_4_help.sh` script. This script contains useful snippets and additional information to save you time.