A full prototype ELT data pipeline that extracts, loads, and then transforms stock data to build a fully automated BI product.

pevolution-ahmed/stock-analytics-elt-data-pipeline

Stock Market Analytics ELT Workflow

Description

A data pipeline that automates the ELT workflow for stock market data and then builds a BI product on top of that data, whether a dashboard or a predictive forecasting model.

Data Stack: (four tool logos, images not preserved)

General Pipeline Structure

  • The pipeline consists of four layers that data should go through:
    • Extraction and Load
    • Validation and quality gates
    • Transformation
    • BI
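As a rough illustration of how the four layers hand data to one another, here is a minimal in-memory sketch. It is not the repository's implementation: `PriceRow`, `extract_and_load`, `validate`, `transform`, `publish`, and `run_pipeline` are hypothetical names, and plain Python functions stand in for Yahoo Finance, BigQuery, Great Expectations, dbt, and the dashboard.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class PriceRow:
    """One daily closing price for one ticker (hypothetical schema)."""
    ticker: str
    date: str  # ISO date string, e.g. "2024-01-02"
    close: float


def extract_and_load(fetch: Callable[[], Iterable[dict]]) -> List[PriceRow]:
    """Layer 1: pull raw records and land them as typed rows."""
    return [PriceRow(r["ticker"], r["date"], float(r["close"])) for r in fetch()]


def validate(rows: List[PriceRow]) -> List[PriceRow]:
    """Layer 2: fail fast when loaded data breaks basic expectations."""
    if not rows:
        raise ValueError("no rows were loaded")
    for row in rows:
        if not row.ticker or row.close <= 0:
            raise ValueError(f"invalid row: {row}")
    return rows


def transform(rows: List[PriceRow]) -> Dict[str, float]:
    """Layer 3: derive a model, here the mean closing price per ticker."""
    closes: Dict[str, List[float]] = {}
    for row in rows:
        closes.setdefault(row.ticker, []).append(row.close)
    return {ticker: sum(vals) / len(vals) for ticker, vals in closes.items()}


def publish(metrics: Dict[str, float]) -> List[str]:
    """Layer 4: shape the model for a BI consumer (plain text lines here)."""
    return [f"{ticker}: {avg:.2f}" for ticker, avg in sorted(metrics.items())]


def run_pipeline(fetch: Callable[[], Iterable[dict]]) -> List[str]:
    """Run the four layers in order; a validation failure stops the run."""
    return publish(transform(validate(extract_and_load(fetch))))
```

Placing validation between load and transformation means bad source data stops the run before any downstream model or dashboard is rebuilt.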

TO-DO

  • Save stock ticker data from Yahoo Finance to Google BigQuery (Extraction and Load)
  • Create a Great Expectations suite and checkpoints with the Great Expectations package to validate and test the loaded data (Validation)
  • Set up a dbt-core project as the transformation layer on top of the source data
  • Automate code-quality checks by adding the following tasks (quality gates):
    • a task to format Python code with black
    • a task to lint with pylint, yamllint, and sqlfluff
    • a task to run unit tests with pytest and pytest-cov
  • Build the stock transformations with dbt (Transformation)
  • Add dbt tests (plus source freshness checks) to all transformations
  • Add Python unit tests that cover the core scripts' functionality
  • Create a dashboard to share those transformations (BI)
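
To make the Python unit-testing item more concrete, the sketch below shows the kind of small, pure helper that is easy to cover with pytest. Everything in it is an assumption for illustration: `to_load_rows` and its row schema (`ticker`, `date`, `close`) are hypothetical and not taken from this repository, and the tests use plain `assert` so the block runs without pytest installed.

```python
from datetime import date
from typing import Dict, List, Union

Row = Dict[str, Union[str, float]]


def to_load_rows(ticker: str, closes: Dict[date, float]) -> List[Row]:
    """Turn {date: close} into JSON-serializable rows ordered by date.

    Hypothetical helper: the schema is an assumption, not the project's.
    """
    symbol = ticker.strip().upper()
    if not symbol:
        raise ValueError("ticker must be a non-empty string")
    return [
        {"ticker": symbol, "date": day.isoformat(), "close": round(float(close), 4)}
        for day, close in sorted(closes.items())
    ]


def test_rows_are_sorted_and_normalized():
    rows = to_load_rows(" aapl ", {date(2024, 1, 3): 2.0, date(2024, 1, 2): 1.0})
    assert rows == [
        {"ticker": "AAPL", "date": "2024-01-02", "close": 1.0},
        {"ticker": "AAPL", "date": "2024-01-03", "close": 2.0},
    ]


def test_blank_ticker_is_rejected():
    try:
        to_load_rows("   ", {})
    except ValueError:
        return
    raise AssertionError("expected ValueError for a blank ticker")
```

pytest collects functions whose names start with `test_`, so saving this in a file such as `test_rows.py` (a hypothetical name) and running `pytest --cov` would exercise both cases and report coverage through pytest-cov. In a project that already depends on pytest, `pytest.raises(ValueError)` is the more idiomatic way to write the second test.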
