Batch processing with Spark

Python, Kotlin, Scala, Spark, Docker

Spark context Web UI available at http://192.168.15.91:4040
Spark context available as 'sc' (master = local[*], app id = local-1735415642729).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.5.3
      /_/
         
Using Scala version 2.12.18 (OpenJDK 64-Bit Server VM, Java 17.0.13)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

Tech Stack

Up and Running

1. Spin up the Spark cluster:

docker compose up -d

2. Refer to the documentation of each implementation for instructions on running the pipeline:

TODO:

  • Batch Processing with PySpark
  • Batch Processing with Spark and Kotlin-Spark-API
  • Batch Processing with Scala+Spark
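
Until the implementation docs listed above are written, the following is a minimal sketch of what a PySpark batch job could look like. It is not taken from this repository: the sample rows, the column names (`zone`, `fare`), the app name, and the helper names `reference_totals` and `spark_totals` are all hypothetical. The default master `local[*]` matches the shell banner shown earlier; pointing it at the Docker cluster would need the master URL from the compose setup, which is not shown here.

```python
from collections import defaultdict

# Made-up sample rows (zone, fare), used only for illustration.
SAMPLE_TRIPS = [("A", 10.0), ("B", 5.5), ("A", 2.5), ("B", 4.5)]


def reference_totals(rows):
    """Pure-Python aggregation, useful for checking the Spark result on small inputs."""
    totals = defaultdict(float)
    for zone, fare in rows:
        totals[zone] += fare
    return dict(totals)


def spark_totals(rows, master="local[*]"):
    """The same aggregation expressed as a Spark batch job (requires pyspark)."""
    # Imported lazily so this module still loads where pyspark is not installed.
    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder
        .master(master)               # assumption: local mode, as in the shell banner
        .appName("batch-sketch")      # hypothetical app name
        .getOrCreate()
    )
    try:
        df = spark.createDataFrame(rows, schema=["zone", "fare"])
        result = df.groupBy("zone").agg(F.sum("fare").alias("total_fare"))
        # Collect the (small) aggregated result back to the driver as a dict.
        return {row["zone"]: row["total_fare"] for row in result.collect()}
    finally:
        spark.stop()
```

With pyspark installed, `spark_totals(SAMPLE_TRIPS)` should agree with `reference_totals(SAMPLE_TRIPS)`. Keeping a plain-Python reference next to the Spark version is one cheap way to sanity-check a job on small inputs before running it against real data.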