Let’s create a batch job for generating thousands of random numbers.
Our typical REST example at random-generator also contains a plain Java class, RandomRunner, suitable for batch processing.
This example assumes a Kubernetes installation is available. Minikube works best here because the example relies on PersistentVolumes (PVs). See the INSTALL documentation for installing Minikube.
To make the files written by this demo accessible from your machine, mount a local directory into the Minikube VM. That directory later backs the PersistentVolume that is mounted into the Pod:
minikube start --mount --mount-string="$(pwd)/logs:/example"
Alternatively, if Minikube is already running, you can mount the directory on the fly in the background:
---
minikube mount "$(pwd)/logs:/example" &
---
Either command makes the local logs directory available within the Minikube VM at the path /example.
Let’s now create the PersistentVolume along with a PVC, which can be mounted into our Job later:
kubectl apply -f https://k8spatterns.io/BatchJob/pv-and-pvc.yml
The Job can be started with
kubectl create -f https://k8spatterns.io/BatchJob/job.yml
This Job creates multiple Pods to generate the numbers.
Each Pod writes 10000 entries to logs/random.log.
The Job is configured to run the Pod five times, with at most three Pods running in parallel.
That takes a few minutes, but in the end, you should have 50000 entries in the log file.
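The total can be checked with a line count. The following is a minimal sketch, assuming each entry occupies exactly one line in the log file (the file format is not specified here) and that the command runs from the directory containing logs/. The helper name count_entries is hypothetical and introduced only for illustration.

```shell
# Hypothetical helper: print the number of lines in a file,
# or 0 if the file does not exist yet.
count_entries() {
  if [ -f "$1" ]; then
    # tr strips the padding some wc implementations add
    wc -l < "$1" | tr -d ' '
  else
    echo 0
  fi
}

# Assumption: one entry per line in logs/random.log.
echo "entries so far: $(count_entries logs/random.log)"
```

If the one-line-per-entry assumption holds, this should report 50000 after the first complete run.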
You can check the Job’s status with
kubectl get jobs
This should give something like:
NAME                     COMPLETIONS   DURATION   AGE
random-generator-lnx4v   3/5           100s       100s
You can also verify with kubectl get pods that at most three Pods run simultaneously.
If you want to run the Job a second time, call kubectl create -f job.yml again.
The Job’s name is auto-generated, so there won’t be any name clash.
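The settings described above (five completions, at most three Pods in parallel, an auto-generated name) map onto Job fields roughly as in the following sketch. It writes an illustrative manifest to a scratch file; it is not the contents of job.yml. The image name is a placeholder, and the volume mount for the log directory is left out.

```shell
# Write an illustrative Job manifest to a scratch file.
# NOT the real job.yml: the image is a placeholder and the
# PVC volume mount is omitted for brevity.
sketch="$(mktemp -d)/job-sketch.yml"
cat > "$sketch" <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  generateName: random-generator-   # API server appends a random suffix
  labels:
    app: random-generator
spec:
  completions: 5      # run the Pod to successful completion five times
  parallelism: 3      # at most three Pods at the same time
  template:
    metadata:
      labels:
        app: random-generator
    spec:
      restartPolicy: OnFailure   # Job Pods must use OnFailure or Never
      containers:
      - name: random-generator
        image: example/random-generator:latest   # placeholder image
EOF
echo "sketch written to $sketch"
```

A name generated through generateName only works with kubectl create. kubectl apply needs a fixed metadata.name, which is one reason to start the Job with create.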
Now let’s try out how an Indexed Job works.
With indexed-job.yml, we split the file logs/random.log into five files with five concurrent Pods.
Each Pod will create one file by selecting a different range from the original file.
Please see the comments within indexed-job.yml for more details.
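The idea of selecting a range by index can be tried locally without a cluster. The sketch below is an illustration under stated assumptions, not the script used in indexed-job.yml: it assumes one entry per line and equal-sized parts, and it uses a 50-line stand-in file in a temporary directory. The variable name JOB_COMPLETION_INDEX is the one Kubernetes sets in each Pod of an Indexed Job; here a loop plays the role of the five Pods.

```shell
# Local simulation of an index-based split; no cluster required.
# Assumptions: one entry per line, equal-sized parts.
workdir=$(mktemp -d)
seq 1 50 > "$workdir/random.log"      # stand-in for logs/random.log
parts=5
total=$(wc -l < "$workdir/random.log" | tr -d ' ')
size=$((total / parts))               # lines per part (10 for this stand-in)

# In a real Indexed Job, each Pod runs one iteration with its own index.
for JOB_COMPLETION_INDEX in 0 1 2 3 4; do
  first=$((JOB_COMPLETION_INDEX * size + 1))
  last=$((first + size - 1))
  # Print only lines first..last of the input file.
  sed -n "${first},${last}p" "$workdir/random.log" \
    > "$workdir/random-${JOB_COMPLETION_INDEX}.txt"
done

# Show the line count of every part.
wc -l "$workdir"/random-*.txt
```

Because each index maps to a disjoint range, the Pods do not need to coordinate with each other, and concatenating the parts in index order reproduces the original file.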
To run this split Job, run the following:
kubectl create -f https://k8spatterns.io/BatchJob/indexed-job.yml
When the Job is finished, you will find the files logs/random-0.txt, …, and logs/random-4.txt, each containing one part of logs/random.log.
Finally, you can delete all jobs and all created Pods with
kubectl delete jobs -l app=random-generator