A system for scaling out POSIX shell scripts on distributed file systems. DiSh is part of the PaSh project, which is hosted by the Linux Foundation.
DiSh builds heavily on and extends PaSh (command annotations, compiler infrastructure, and JIT orchestration).
Quick Jump: Installation | Running DiSh | Repo Structure | Evaluation | Community & More | Citing
The easiest way to play with DiSh is with Docker.
The following commands create a virtual cluster on one machine, allowing you to experiment with DiSh. If you have multiple machines, you can set up Docker Swarm and follow the swarm instructions in docker-hadoop.
## Clone the repo
git clone --recurse-submodules https://github.com/binpash/dish.git
## Install docker using our script (tested on Ubuntu)
## Alternatively see https://docs.docker.com/engine/install/ to install docker.
(cd dish; ./scripts/setup-docker.sh)
## Create the virtual cluster on the host machine
(cd docker-hadoop; ./setup-compose.sh) # currently takes several minutes due to rebuilding the images
## The cluster can be torn down using `docker compose down`
## Create a shell on the client
docker exec -it nodemanager1 bash
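Because building the images can take several minutes, it may help to wait until the client container is running before opening a shell on it. The following is a minimal sketch, not part of the DiSh scripts: `wait_for_container` is a hypothetical helper, and the 300-second default timeout and 5-second polling interval are arbitrary assumptions. It relies only on `docker inspect` and the container name `nodemanager1` used above.

```sh
# wait_for_container NAME [TIMEOUT_SECONDS]
# Hypothetical helper: polls `docker inspect` until the named container
# reports that it is running, or gives up after the timeout (default: 300s).
wait_for_container() {
    name=$1
    timeout=${2:-300}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Prints "true" once the container is running; errors are silenced
        # because the container may not exist yet while images are building.
        state=$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null) || state=""
        if [ "$state" = "true" ]; then
            echo "$name is running"
            return 0
        fi
        sleep 5
        elapsed=$((elapsed + 5))
    done
    echo "timed out waiting for $name" >&2
    return 1
}

# Example (run on the host machine):
#   wait_for_container nodemanager1 && docker exec -it nodemanager1 bash
```

Note that a running container does not guarantee that the Hadoop services inside it have finished starting, so the first HDFS commands may still need a short wait.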
Let's run a very simple example using DiSh:
cd $DISH_TOP
hdfs dfs -put README.md /README.md # Copy the README to HDFS
Now, you can run this sample script (or create a script of your own). Run both DiSh and Bash and compare the results!
./di.sh ./scripts/sample.sh
bash ./scripts/sample.sh
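To compare the two runs mechanically rather than by eye, the outputs can be captured and diffed. The sketch below assumes that the script writes its results to standard output; `compare_outputs` is a hypothetical helper and not part of DiSh.

```sh
# compare_outputs CMD_A CMD_B
# Hypothetical helper: runs both commands, captures their standard output,
# and reports whether the two outputs are identical. Exit codes of the
# commands themselves are ignored; only stdout is compared.
compare_outputs() {
    out_a=$(mktemp)
    out_b=$(mktemp)
    sh -c "$1" > "$out_a" || true
    sh -c "$2" > "$out_b" || true
    if diff "$out_a" "$out_b"; then
        echo "outputs match"
        status=0
    else
        # diff has already printed the differing lines
        echo "outputs differ" >&2
        status=1
    fi
    rm -f "$out_a" "$out_b"
    return "$status"
}

# Example (run from $DISH_TOP inside the client container):
#   compare_outputs "./di.sh ./scripts/sample.sh" "bash ./scripts/sample.sh"
```

The same helper works for a script of your own: pass the DiSh invocation as the first argument and the Bash invocation as the second.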
This repo hosts most of the components of DiSh development. Some of them are incorporated in PaSh. The structure is as follows:
- pash: Contains the complete PaSh repo as a submodule. DiSh uses and extends its annotations, compiler, and JIT orchestration infrastructure.
- evaluation: Shell scripts used for evaluation.
- runtime: Runtime components, e.g., remote FIFO channels.
- scripts: Scripts related to installation, deployment, and continuous integration.
Chat:
- Discord Server (Invite)
If you use DiSh, please consider citing the following paper:
@inproceedings{dish2023nsdi,
author = {Mustafa, Tammam and Kallas, Konstantinos and Das, Pratyush and Vasilakis, Nikos},
title = {{DiSh}: Dynamic {Shell-Script} Distribution},
booktitle = {20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)},
year = {2023},
isbn = {978-1-939133-33-5},
address = {Boston, MA},
pages = {341--356},
url = {https://www.usenix.org/conference/nsdi23/presentation/mustafa},
publisher = {USENIX Association},
month = apr,
}