Hi,
Sorry for the late reply. As mentioned before, we may not be able to give accurate suggestions here, because we haven't tried a setup like this. I don't understand why it would get stuck on GCS: GCS is a component of Ray and should have nothing to do with RayDP. Ray should be ready before you start RayDP, and installing RayDP alone should not cause a problem.
Maybe you can try installing raydp-nightly and see if that makes any difference.
Hi folks!
After multiple tries, we found that RayDP does not seem to be compatible with the NVIDIA base image on GPU machines.
For example, below is our simple Docker image:
When deploying a raycluster, the worker k8s pod gets stuck at init, with the last event being: Started container wait-gcs-ready.
Checking the pod log of the wait-gcs-ready container shows:
If we remove the raydp installation, there is no issue with ray and the k8s pod runs well.
Are there any components in RayDP that might cause this incompatibility?
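In case it helps narrow this down, here is a minimal sketch for checking from inside the worker pod whether the head node's GCS port accepts TCP connections at all. It only tests reachability and says nothing about whether RayDP is involved. The function name gcs_reachable is hypothetical, and the default port 6379 is an assumption based on Ray's usual head port; substitute whatever address your raycluster head service actually exposes.

```python
import socket


def gcs_reachable(host: str, port: int = 6379, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: the GCS server listens on `port` of the head service.
    This is a plain TCP check, not a Ray-level health check.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, calling gcs_reachable("<head-service-name>") from a Python shell in the worker image would show whether the stall is a network or DNS problem rather than something inside the container image itself.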