Building Docker image to run Spark on Ray with RAPIDS #8062
-
We're not familiar with the Spark on Ray project and have never tested in that environment. As long as Spark on Ray still supports Spark plugins and Spark's GPU scheduling, it should work. Most of the Dockerfile commands you're seeing are just there to get Spark set up on Kubernetes in the container and aren't specific to the RAPIDS Accelerator. The key parts of the Dockerfile for the RAPIDS Accelerator, rather than generic Spark setup, are starting from an image that provides CUDA and the commands that copy the RAPIDS Accelerator jar and the GPU discovery script into the container, so they can be referenced by the spark-submit commands described at https://github.com/NVIDIA/spark-rapids/blob/branch-23.02/docs/get-started/getting-started-kubernetes.md#running-spark-applications-in-the-kubernetes-cluster. In summary, once you have Spark set up in a container that provides CUDA and can schedule GPUs, you should only need to ensure the RAPIDS Accelerator jar is in the container and the relevant startup configs are specified with the Spark job to get the RAPIDS Accelerator to work (again, assuming Spark on Ray supports Spark's GPU scheduling and Spark plugins).
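For reference, a minimal sketch of what those key lines typically look like; the base image tag, the jar version, and the `/opt/sparkRapidsPlugin` path are assumptions taken from the getting-started guide rather than the exact commands from this thread:

```dockerfile
# Sketch only: start from a CUDA base image and add the RAPIDS Accelerator jar
# plus the GPU discovery script; adjust the tag, version, and paths as needed.
FROM nvidia/cuda:11.8.0-runtime-ubuntu20.04

# RAPIDS Accelerator jar (the version is a placeholder)
COPY rapids-4-spark_2.12-<version>.jar /opt/sparkRapidsPlugin/

# GPU discovery script referenced by spark.executor.resource.gpu.discoveryScript
COPY getGpusResources.sh /opt/sparkRapidsPlugin/
```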
-
Thanks @jlowe for the helpful suggestions. Will let you know.
-
Updates:
Unfortunately, we got an error there: [...]. If there is anything else we can do to verify the GPU resource, that would be super helpful.
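A hedged sketch of some quick checks for GPU visibility inside the container; the `/opt/sparkRapidsPlugin` path is an assumption based on the getting-started guide, not something confirmed in this thread:

```bash
# Quick sanity checks inside the container (paths are assumptions).
nvidia-smi                                   # the container should list the GPU(s) it can see
/opt/sparkRapidsPlugin/getGpusResources.sh   # should print JSON like {"name":"gpu","addresses":["0"]}
```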
-
We were able to reproduce the [...].
Note that the cluster is reporting [...].
These changes caused Spark to start asking for [...].
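For context, a sketch of the standard Spark 3.x GPU-scheduling and plugin settings that cause Spark to request GPUs from the cluster manager; the jar name, paths, and amounts are illustrative assumptions, not the exact values discussed above:

```bash
# Sketch: typical GPU scheduling + RAPIDS plugin configs (values are illustrative).
$SPARK_HOME/bin/spark-submit \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=1 \
  --conf spark.executor.resource.gpu.discoveryScript=/opt/sparkRapidsPlugin/getGpusResources.sh \
  --jars /opt/sparkRapidsPlugin/rapids-4-spark_2.12-<version>.jar \
  ...
```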
-
Hey there!
We are trying to experiment with Spark on Ray with RAPIDS, but we are not sure if RAPIDS can support this case.

In 1), we find commands that copy items from under `spark/`. In 2), if running `pip install raydp-nightly`, we can find the `pyspark/` directory it installs and the content under it.

In this case, will there be concerns if 1) we instead `COPY pyspark/jars /opt/pyspark/jars` or set `SPARK_HOME` to the existing `.../pyspark` installed by RayDP, and 2) there is no `/kubernetes/dockerfiles/spark/entrypoint.sh` or `/kubernetes/tests` under `pyspark/`? I think those may not be required if we are able to launch Spark with RayDP on k8s.

Any suggestions or pointers would be very helpful, thanks!
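To make the alternative in the question concrete, here is a minimal, untested sketch of a Dockerfile that reuses the pyspark installed by RayDP instead of a full Spark distribution; the base image, the Python package path, and the jar version are all assumptions:

```dockerfile
# Untested sketch: point SPARK_HOME at RayDP's pip-installed pyspark.
FROM nvidia/cuda:11.8.0-runtime-ubuntu20.04

RUN apt-get update && apt-get install -y python3 python3-pip && \
    pip3 install raydp-nightly

# Actual location depends on the Python install; this path is an assumption.
ENV SPARK_HOME=/usr/local/lib/python3.8/dist-packages/pyspark

# RAPIDS Accelerator jar and GPU discovery script (version is a placeholder)
COPY rapids-4-spark_2.12-<version>.jar /opt/sparkRapidsPlugin/
COPY getGpusResources.sh /opt/sparkRapidsPlugin/
```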