Skip to content

🌯 streaming & aggregating synthetic chipotle orders

License

Notifications You must be signed in to change notification settings

kxzk/fake-chipotle-streaming

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Fake Chipotle Streaming

Demo project to mess around.

header

Scala Go Apache Kafka ApacheCassandra

Project Overview

Initially, fake chipotle order data is generated via Go using the script under /src. Within the script we serialize the order into JSON and push to a predefined Kafka topic orders. From there, we grab the topic within Spark, parse the data back to JSON (since must be []byte to send via Kafka) and explode nested columns. Finally, we aggregate order items on a 5 minute basis and push the micro-batch result to a table in Cassandra.

This workflow lighlty ressembles a real-world real-time (ish) reporting scenario. Ideally there would be some sort of BI tool that sits on top of Cassandra to query the data. Additionally, the configuration and replication of the services is overly simplified since I did this project on a singular DigitalOcean droplet.

Architecture Overview

+---------+     +---------+     +---------+     +-------------+
|         | --> |         |     |         |     |             |
|  ORDER  | --> |  KAFKA  | --> |  SPARK  | --> |  CASSANDRA  |
|         | --> |         |     |         |     |             |
+---------+     +---------+     +---------+     +-------------+

Installation

Up & Running

  • Create Kafka Topic
kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partition 1 --topic orders
  • Set Up Cassandra
create keyspace chipotle with replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

create table top_orders_every_5_mins (
    start timestamp,
    end timestamp,
    name text,
    count int,
    primary key (start, name)
);
  • Submit Spark Job + Streaming
make spark
  • Create Fake Order Data
cd src && go run *.go

Data

Chipotle Order Schema

{
  "time": "2022-09-14 22:03:15",
  "customer": {
    "name": "Yoshiko McGlynn",
    "age": 43,
    "city": "Chula Vista",
    "state": "Texas"
  },
  "order": [
    {
      "name": "Patron Margarita",
      "price": 7.15
    },
    {
      "name": "Salad (Chicken)",
      "price": 6.5
    }
  ],
  "creditCard": "Mastercard"
}

Versions

  • Spark -> v 3.2.0
  • Scala -> v 2.12.15
  • SBT -> v 1.7.1
  • Cassandra -> v 4.0.5
  • Kafka -> v 3.2.1
  • Go -> v 1.18.1

Reference

About

🌯 streaming & aggregating synthetic chipotle orders

Topics

Resources

License

Stars

Watchers

Forks