kunal0829/docker-spark-iceberg

 
 

Spark + Iceberg Quickstart Image

This is a Docker Compose environment for quickly getting up and running with Spark and a local Iceberg catalog. The Iceberg JDBC catalog is backed by a PostgreSQL database.

Note: If you don't have Docker installed, head over to the Get Docker page for installation instructions.

Usage

First, start the spark-iceberg and postgres containers by running:

docker-compose up

Next, run any of the following commands, depending on which shell you prefer to use:

docker exec -it spark-iceberg spark-shell
docker exec -it spark-iceberg spark-sql
docker exec -it spark-iceberg pyspark
docker exec -it spark-iceberg pyspark-notebook

To stop the services, run docker-compose down.

To reset the catalog and data, remove the postgres and warehouse directories.
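
The reset step can be sketched as a small shell helper. This is only an illustration: the function name `reset_catalog` is hypothetical, and it assumes the postgres and warehouse directories sit directly inside the directory you pass in (the current directory by default).

```shell
# Hypothetical helper, not part of this repository.
# Assumption: postgres/ and warehouse/ live directly under the given
# directory (first argument, defaulting to the current directory).
reset_catalog() {
  # Remove only the two directories named above; nothing else is touched.
  rm -rf -- "${1:-.}/postgres" "${1:-.}/warehouse"
}
```

Stopping the containers with docker-compose down before calling `reset_catalog .` is probably the safer order, so nothing is writing to those directories while they are removed. If the containers created files as root, the removal may need elevated permissions.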

For more information on getting started with Iceberg, check out the Getting Started guide in the official docs.
