Cryptocurrency markets are highly volatile, making price prediction a challenging task. This project leverages machine learning models to predict future prices of cryptocurrencies.
- Introduction
- Purpose
- Project-Structure
- Pre-Requisites
- Installation
- Usage
- Contribution
- Airflow (introductory video): https://www.youtube.com/watch?v=In7zwp0FDX4
This project is made available to everyone for educational purposes; it must not be used in production. The architecture is based on 3 VMs (EC2 instances) spread across 3 different unmanaged AWS accounts.
Despite this lightweight architecture, the project aims to implement an (almost) full CI/CD pipeline based on GitHub Actions, Docker, MLflow, FastAPI, and a graphical UI. A trunk-based branching strategy is used.
This project is composed of 4 directories (apps):
- airflow : Airflow DAGs for data ingestion
- db : OHLCVT data (Open, High, Low, Close, Volume, Trades) are stored in PostgreSQL
- mlflow : MLflow components
- fastapi : API used by external users for inference
For local testing, each app ships bash scripts that help the developer install the required components. A requirements.txt file is provided for the Python dependencies. A sketch of the expected layout is shown below.
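A sketch of the expected repository layout (directory roles are taken from the list above; the placement of the DAG file is inferred from the rest of this README and may differ in the actual repo):

```
MainCrypto/
├── airflow/    # Airflow DAGs (e.g., crypto_ohcl_dag.py)
├── db/         # PostgreSQL storage for OHLCVT data
├── mlflow/     # MLflow components
└── fastapi/    # inference API for external users
```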
TO COMPLETE
This project is adapted to a Linux environment and has been tested on:
- Arch Linux - Linux 6.10.3
- Ubuntu / Debian
- Clone the repo:

```bash
git clone https://github.com/DstMlOpsCrypto/MainCrypto.git
cd MainCrypto
```
To set up a local environment with Docker, PostgreSQL on Docker, and Python, follow these steps:
Script: local_docker_install.sh
The script follows these steps (a minimal sketch of such a script is shown after the list):
- update the system
- install Docker
- install Docker Compose
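For reference, here is a minimal sketch of what such an install script might look like on Ubuntu/Debian (the actual local_docker_install.sh in the repo may differ; the get.docker.com convenience script is one common way to install Docker):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Update the system (Debian/Ubuntu shown; Arch Linux would use pacman -Syu)
sudo apt-get update && sudo apt-get upgrade -y

# Install Docker via the official convenience script
curl -fsSL https://get.docker.com | sudo sh

# Install docker-compose
sudo apt-get install -y docker-compose

# Let the current user run docker without sudo (requires re-login)
sudo usermod -aG docker "$USER"
```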
```bash
# Initialize the Airflow metadata database (one-time)
docker-compose up airflow-init
# Start all services in detached mode
docker-compose up -d
```
The DAG refers to a connection ID set within Airflow. Creating this connection to the data server is mandatory for the DAG to run (a CLI alternative is sketched after the list).
- Point your browser to '0.0.0.0:8080'
- login: airflow
- password: airflow
- Go to Admin
- Go to Connections and add a new connection:
- Connection Id: postgres_crypto
- Connection Type: Postgres
- Host (service): db
- Schema (database): cryptoDb
- Login (user): crypto
- Password: crypto
- Port: 5432
- Save
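The same connection can be created from the command line. The sketch below assumes the webserver runs in a Compose service named `airflow-webserver` (the standard name in Airflow's reference docker-compose file; adjust if the repo uses another name):

```bash
# Create the postgres_crypto connection via the Airflow CLI
docker compose exec airflow-webserver \
  airflow connections add postgres_crypto \
    --conn-type postgres \
    --conn-host db \
    --conn-schema cryptoDb \
    --conn-login crypto \
    --conn-password crypto \
    --conn-port 5432
```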
The current DAG, "crypto_ohcl_dag.py", performs the following actions:
- get all assets (as a list) from the "assets" table within the cryptoDb database
- for each asset, get the current crypto price
The DAG is scheduled to run every 1 minute (for development purposes). By default the DAG is paused; just toggle it in the UI (or use the CLI sketch below) and wait.
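Alternatively, the DAG can be unpaused and triggered from the CLI. This assumes the DAG id matches the file name and that the webserver service is named `airflow-webserver` (both are assumptions; check the repo):

```bash
# Unpause the DAG so the 1-minute schedule starts running
docker compose exec airflow-webserver airflow dags unpause crypto_ohcl_dag
# Optionally trigger a run immediately
docker compose exec airflow-webserver airflow dags trigger crypto_ohcl_dag
```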
- Point your browser to '0.0.0.0:8888'
- login: [email protected]
- password: admin

Register the following servers:
- Airflow server
  - host (service): postgres
  - user: airflow
  - password: airflow
- Data server
  - host (service): db
  - user: crypto
  - password: crypto
- MLflow server
  - host (service): mlflow_db
  - user: mlflow
  - password: mlflow

Go to Schemas > public to browse the tables.
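As a quick check without pgAdmin, the data server can also be queried directly through the Compose service name `db` (credentials from the connection settings above):

```bash
# List the ingested assets straight from the cryptoDb database
docker compose exec db psql -U crypto -d cryptoDb -c "SELECT * FROM assets;"
```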
Script: local_docker_clean.sh
The script follows these steps (a minimal sketch of such a script is shown after the list):
- Stop all running containers
- Remove all stopped containers
- Remove all images
- Remove any volumes
- Remove any networks
- Remove all unused data
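A minimal sketch of what such a clean-up script might look like (the actual local_docker_clean.sh in the repo may differ; note that this is destructive and removes ALL local Docker state):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Stop all running containers (ignore the error when none are running)
docker stop $(docker ps -q) 2>/dev/null || true
# Remove all stopped containers
docker container prune -f
# Remove all images
docker rmi -f $(docker images -q) 2>/dev/null || true
# Remove unused volumes and networks
docker volume prune -f
docker network prune -f
# Remove all remaining unused data
docker system prune -af --volumes
```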
The script "local_dockercompose_remove.sh" uninstall docker and docker-compose.
```bash
# Check the Compose version
docker compose version

# Check whether Docker is running
sudo systemctl status docker

# Check containers
docker-compose ps
docker ps

# Open a psql session in a PostgreSQL container
docker exec -it {container-id or container-name} psql -U postgres

# List all databases
docker exec -it "$CONTAINER_NAME" psql -U postgres -c "\l"
docker exec -it 2c356056a66b psql -U postgres -c "\l"

# List all tables (SQL command \dt+)
docker exec -it {container-id or container-name} psql -U "{user}" -d "{db_name}" -c "\dt+"
docker exec -it cpostgres psql -U postgres -d ohlcvt -c "\dt+"

# List column properties for a specific table
docker exec -it {container-id or container-name} psql -U "{user}" -d "{db_name}" -c "\d+ {table_name}"

# View data from a PostgreSQL table
docker exec -it {container-id or container-name} psql -U "{user}" -d "{db_name}" -c "SELECT * FROM {table_name};"
```
- Point your browser to '0.0.0.0:9090' (Prometheus)
- Point your browser to '0.0.0.0:3002' (Grafana)
- login: grafana
- password: grafana
Add Data Source : Go to Configuration > Data Sources > Add new data source.
Prometheus
- Select Prometheus.
- Choose Prometheus type: Prometheus
- In URL, enter http://prometheus:9090
- Click Save & Test
AlertManager
- Select a name : Alertmanager
- Choose Implementation: Prometheus
- Choose Alerting / Manage alerts via Alerting UI : yes
- In URL, enter http://alert-manager:9093
- Click Save & Test
- Go to 'Alert rules' and add a new alert rule
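Alertmanager exposes the same kind of built-in health endpoint, which can be used to verify it is reachable before wiring it into Grafana:

```bash
curl http://0.0.0.0:9093/-/healthy
```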
Add Dashboards for Node Exporter Metrics
- Go to + > Import.
- Enter dashboard ID 1860 (from Grafana's dashboard repository); click Load
- Enter a name and select a Prometheus data source: Prometheus
- Click Import
- Select the job and host (node-exporter or statsd-exporter)
Alerting (StatsD)
- Point your browser to '0.0.0.0:9093'
Alerting (Grafana)
- In Grafana, go to Alerting > Alert rules and click "+ New alert rule"
- Give the rule a name: Alertmanager
- Choose the source: Prometheus
- Select a metric and a function (an example query is shown below)
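As an illustration only (this query is not taken from the repo), a typical node-exporter expression for a CPU-usage alert can first be tested against the Prometheus HTTP API before pasting it into the rule editor:

```bash
# Percentage of CPU busy per instance over the last 5 minutes
curl -s 'http://0.0.0.0:9090/api/v1/query' \
  --data-urlencode 'query=100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'
```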
Launch
```bash
bash setup.sh
```
Write the deployment instruction here.
TBD
Your contributions are always welcome and appreciated. The following are ways you can contribute to this project.
Add important resources here
Pictures of your project.
Credit the authors here.
Add a license here, or a link to it.
Data orchestration: the coordination and automation of data flow across various tools and systems to deliver quality data products and analytics.

Historically, data pipelines suffered from:
- Manual and error-prone processes
- Lack of standards around data formats, processes, and processing techniques

Early solutions:
- Time-based scheduling tools (e.g., Cron)
- Rise of proprietary scheduling and workload management tools (e.g., AutoSys)

What drove modern orchestrators:
- An increase in data size and complexity of scheduling and workloads
- Tools that were designed for specific ecosystems (e.g., Hadoop)