# Batch processing with Spark

Python · Kotlin · Scala · Spark · Docker

Example `spark-shell` session against the cluster:

```text
Spark context Web UI available at http://192.168.15.91:4040
Spark context available as 'sc' (master = local[*], app id = local-1735415642729).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.5.3
      /_/

Using Scala version 2.12.18 (OpenJDK 64-Bit Server VM, Java 17.0.13)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
```

## Tech Stack

## Up and Running

1. Spin up the Spark cluster:

```shell
docker compose up -d
```
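The compose file itself is not shown here. As a rough sketch of what a cluster definition for this setup might look like, the following assumes the `bitnami/spark` image and hypothetical service names, neither of which is confirmed by this README:

```yaml
# Hypothetical compose file: the image, service names, and ports are
# assumptions for illustration, not taken from this repository.
services:
  spark-master:
    image: bitnami/spark:3.5.3
    environment:
      - SPARK_MODE=master
    ports:
      - "8080:8080"   # master Web UI
      - "7077:7077"   # master RPC endpoint workers connect to
  spark-worker:
    image: bitnami/spark:3.5.3
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark-master:7077
    depends_on:
      - spark-master
```

With something like this in place, `docker compose up -d` starts one master and one worker, and the worker registers itself at the master's RPC port.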

2. See the language-specific implementations for instructions on running each pipeline:

TODO:

- [ ] Batch Processing with PySpark
- [ ] Batch Processing with Spark and Kotlin-Spark-API
- [ ] Batch Processing with Scala + Spark