Is your feature request related to a problem? Please describe.
We do little validation of the GPU inputs. If users enter a maximum GPU count that exceeds their quota, it can be difficult to understand why the GPUs will not scale. If they select a GPU type for which they have no quota, the GPU nodes may not come up at all. Additionally, users may not have a large enough node quota, which could cause unexpected failures if the cluster cannot scale up.
Describe the solution you'd like
The GPU quota information can be found via:
gcloud compute regions describe $CLOUDSDK_COMPUTE_REGION
We should use its output to validate the user's menu inputs.
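As a rough sketch of what that validation could look like, the snippet below reads the region description as JSON and compares a requested GPU count against the remaining quota. It assumes the `quotas` list in the JSON output carries `metric`, `limit`, and `usage` fields, and that GPU metrics follow the `NVIDIA_<MODEL>_GPUS` / `PREEMPTIBLE_NVIDIA_<MODEL>_GPUS` naming pattern. The function names and the `gpu_model` parameter are hypothetical, not part of the existing scripts.

```python
import json
import subprocess


def remaining_quota(region_info):
    """Map each quota metric to (limit - usage) from a parsed region description."""
    return {
        q["metric"]: q["limit"] - q["usage"]
        for q in region_info.get("quotas", [])
    }


def load_region_quotas(region):
    """Fetch remaining quotas for a region by calling gcloud with JSON output."""
    out = subprocess.run(
        ["gcloud", "compute", "regions", "describe", region, "--format=json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return remaining_quota(json.loads(out))


def check_gpu_request(remaining, gpu_model, max_gpus, preemptible=False):
    """Return an error message if max_gpus exceeds the remaining quota, else None.

    gpu_model is a short model name such as "K80" (assumed metric naming).
    """
    metric = f"NVIDIA_{gpu_model.upper()}_GPUS"
    if preemptible:
        metric = "PREEMPTIBLE_" + metric
    # A metric missing from the output is treated as zero quota.
    available = remaining.get(metric, 0)
    if max_gpus > available:
        return (
            f"Requested {max_gpus} GPUs, but only {available:g} "
            f"remain under quota {metric}."
        )
    return None
```

A menu script could call `load_region_quotas(region)` once and then run `check_gpu_request` on each GPU input, prompting again whenever a message is returned. The same `remaining` mapping could also be used to check a node or CPU quota, if the relevant metric name is known.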
Additional context
There are separate quotas for PREEMPTIBLE and regular GPUs, which made me realize we probably just use preemptible for all clusters.
Additionally, as documented in #367, multi-zone clusters with GPU_NODE_MIN_SIZE of 1 must have a GPU quota of at least 2. Validating the GPU quota would be the best way to prevent this issue.
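One way to express that constraint as a check, assuming the minimum node count applies to each zone separately (an interpretation of #367, not something confirmed here) and that each GPU node carries a fixed number of GPUs. `zone_count` and `gpus_per_node` are hypothetical parameters introduced for illustration.

```python
def min_gpu_quota_needed(gpu_node_min_size, zone_count, gpus_per_node=1):
    """Smallest GPU quota that lets every zone reach its minimum node count.

    Assumes GPU_NODE_MIN_SIZE is applied per zone (an assumption).
    """
    return gpu_node_min_size * zone_count * gpus_per_node


def quota_covers_minimum(gpu_quota, gpu_node_min_size, zone_count, gpus_per_node=1):
    """True if the quota is enough for the cluster's minimum GPU footprint."""
    needed = min_gpu_quota_needed(gpu_node_min_size, zone_count, gpus_per_node)
    return gpu_quota >= needed
```

Under these assumptions, a two-zone cluster with `GPU_NODE_MIN_SIZE` of 1 needs a quota of 2, which matches the case described above.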