Spotify ELT (extract, load, and transform) Pipeline

A data pipeline that extracts Spotify data from a playlist created by students.

The output is a Google Looker Studio (formerly Google Data Studio) report that provides insight into track features and preferences.

Motivation

This project provided a good opportunity to develop skills and experience with a range of tools. As such, it is more complex than required, utilising dbt, Airflow, Docker, and cloud-based storage, with LocalStack used for testing.

Architecture

  1. Extract data using the Spotify API
  2. Simulate AWS S3 locally with LocalStack for testing
  3. Load into AWS S3
  4. Copy into Snowflake
  5. Transform using dbt
  6. Create Google Looker Studio Dashboard
  7. Orchestrate with Airflow in Docker
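
Steps 1 to 3 above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: it assumes the extracted payload has the shape of Spotify's playlist-items response (an `items` list whose entries hold a `track` object), and the function names, chosen columns, bucket, and key are all hypothetical. The upload helper assumes `boto3` is installed and imports it lazily, so the flattening step runs without it.

```python
import csv
import io

# Hypothetical column set; the real pipeline may select different fields.
FIELDS = ["track_id", "track_name", "artists", "popularity", "duration_ms", "added_at"]


def flatten_playlist_items(payload):
    """Flatten a playlist-items style response into a list of row dicts."""
    rows = []
    for item in payload.get("items", []):
        track = item.get("track") or {}  # "track" may be null for unavailable items
        if not track.get("id"):
            continue  # skip entries with no usable track
        rows.append({
            "track_id": track["id"],
            "track_name": track.get("name", ""),
            "artists": "; ".join(a["name"] for a in track.get("artists", [])),
            "popularity": track.get("popularity"),
            "duration_ms": track.get("duration_ms"),
            "added_at": item.get("added_at"),
        })
    return rows


def rows_to_csv(rows):
    """Serialise row dicts to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def upload_csv(csv_text, bucket, key, endpoint_url=None):
    """Upload CSV text to S3; pass endpoint_url to target LocalStack instead of AWS."""
    import boto3  # assumption: boto3 is available in the pipeline environment

    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    s3.put_object(Bucket=bucket, Key=key, Body=csv_text.encode("utf-8"))
```

For local testing, the same upload call could point at LocalStack, for example `upload_csv(text, "my-test-bucket", "playlist.csv", endpoint_url="http://localhost:4566")`, where the bucket name is a placeholder and the URL assumes LocalStack's default port. Leaving `endpoint_url` unset targets real AWS S3.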

Output

  • Final output from Google Looker Studio. Link here. Note that the dashboard reads from a static CSV exported from Snowflake.

Clone using the web URL

NOTE: This was developed on Windows 10. On Mac or Linux, you may need to amend certain components if you encounter issues.

git clone https://github.com/salimt/Spotify-API-Pipeline.git
cd Spotify-API-Pipeline