A data workflow that pulls data from an API and enriches it with DynamoDB table columns, with the outputs sent to S3 for further analytics in Amazon QuickSight (a future project phase).
An ETL tool that allows us to build pipelines and applications by stitching together AWS services and SaaS applications in a visual interface.
- DynamoDB
- Amazon Athena
- AWS Glue Data Catalog
- AWS Glue ETL pipelines
- S3
- AWS Lambda
- Amazon EventBridge
- Amazon SQS (Simple Queue Service)
- IAM
- The first phase of the project involves pulling JSON feeds from an external API endpoint (see the sketch after this list).
- The other data source is four existing DynamoDB tables.
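The ingestion itself runs in Devflows, but the pull-and-stage logic is roughly equivalent to the following Python sketch (requests plus boto3). The endpoint URL, bucket name, and key layout are placeholders, not the project's actual values.

```python
import json

import boto3
import requests

# Placeholder values for illustration only; the real endpoint, credentials,
# and staging bucket come from the Devflows / project configuration.
FEED_URL = "https://api.example.com/commissions"   # hypothetical endpoint
STAGING_BUCKET = "example-json-staging-bucket"     # hypothetical bucket

s3 = boto3.client("s3")


def pull_feed_to_s3(feed_date: str) -> str:
    """Pull one day's JSON feed and stage it in S3 for the Glue pipeline."""
    response = requests.get(FEED_URL, params={"date": feed_date}, timeout=30)
    response.raise_for_status()

    key = f"raw/json-feeds/{feed_date}.json"
    s3.put_object(
        Bucket=STAGING_BUCKET,
        Key=key,
        Body=json.dumps(response.json()).encode("utf-8"),
    )
    return f"s3://{STAGING_BUCKET}/{key}"


if __name__ == "__main__":
    print(pull_feed_to_s3("2024-01-01"))
```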
The final data structure is the output of joining these tables with the JSON data feeds from the Apple Partnerize API. The result is table data that can be queried from Amazon Athena (with both SELECT and INSERT statements); a SELECT query against this table produces a CSV file that can be viewed from an S3 location.
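As an illustration of how the table is queried, the sketch below starts an Athena query with boto3. The database name, table name, and results location are assumptions and would be replaced by the real Glue Data Catalog entries.

```python
import boto3

athena = boto3.client("athena")

# Hypothetical names; substitute the real Glue database, table, and results bucket.
DATABASE = "commissions_db"
RESULTS_LOCATION = "s3://example-athena-results/common-commission/"


def run_common_commission_query() -> str:
    """Run a SELECT against the common commission table via Athena."""
    response = athena.start_query_execution(
        QueryString="SELECT * FROM common_commission LIMIT 100",
        QueryExecutionContext={"Database": DATABASE},
        ResultConfiguration={"OutputLocation": RESULTS_LOCATION},
    )
    return response["QueryExecutionId"]
```

Athena writes the result set as `<QueryExecutionId>.csv` under the configured output location, which is the CSV file viewable in S3 mentioned above.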
This is a query against the Common Commission table that retrieves data from the JSON feeds that has been enriched with DynamoDB columns.
This is another query against the Common Commission table, with selected columns specifically for notifying various Telcos on a daily basis.
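One way to wire the daily notification together with the services listed above is an EventBridge-scheduled Lambda that runs the Telco query in Athena and drops the CSV result location onto SQS. This is only a sketch under that assumption; the queue URL, database, column names, and schedule are all hypothetical.

```python
import time

import boto3

athena = boto3.client("athena")
sqs = boto3.client("sqs")

# Hypothetical names; the real queue URL, database, and column list depend on
# the Telco notification requirements.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/telco-notifications"
DATABASE = "commissions_db"
RESULTS_LOCATION = "s3://example-athena-results/telco-daily/"

TELCO_QUERY = """
SELECT telco, commission_id, commission_amount, event_date
FROM common_commission
WHERE event_date = current_date
"""


def handler(event, context):
    """Triggered daily by an EventBridge rule: run the Telco query in Athena
    and notify downstream consumers via SQS with the CSV result location."""
    execution = athena.start_query_execution(
        QueryString=TELCO_QUERY,
        QueryExecutionContext={"Database": DATABASE},
        ResultConfiguration={"OutputLocation": RESULTS_LOCATION},
    )
    query_id = execution["QueryExecutionId"]

    # Poll until the query finishes (simplified; production code should bound retries).
    while True:
        state = athena.get_query_execution(QueryExecutionId=query_id)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)

    if state == "SUCCEEDED":
        sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=f"{RESULTS_LOCATION}{query_id}.csv",
        )
    return {"queryExecutionId": query_id, "state": state}
```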
At various stages of the pipeline, data is staged in different S3 locations, based on the workflow within the AWS Glue pipeline:
- From the Devflows step, JSON files are staged in an S3 location.
- From the AWS Glue step, DynamoDB tables are staged in another S3 location (see the sketch after this list).
- The S3 location for the common commission table. This location stores the table data in Parquet format, which is one of the formats that supports INSERT statements from Athena.
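A minimal sketch of the Glue staging step is shown below, assuming a hypothetical DynamoDB table name and S3 path: it reads one of the DynamoDB tables through the Glue DynamoDB connector and writes it to S3 as Parquet so that Athena can query it.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read one of the existing DynamoDB tables via the Glue DynamoDB connector.
# "example-partner-table" is a placeholder name.
dynamo_frame = glue_context.create_dynamic_frame.from_options(
    connection_type="dynamodb",
    connection_options={"dynamodb.input.tableName": "example-partner-table"},
)

# Stage the table contents in S3 as Parquet so Athena can query (and insert into) it.
# The path is a placeholder for the project's actual staging location.
glue_context.write_dynamic_frame.from_options(
    frame=dynamo_frame,
    connection_type="s3",
    connection_options={"path": "s3://example-glue-staging/common-commission/"},
    format="parquet",
)

job.commit()
```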