Skip to content

Commit

Permalink
Allow field path env variables (#935)
Browse files Browse the repository at this point in the history
Summary:
Pull Request resolved: #935

# Context
Currently the kubernetes scheduler translate all the env of the role into something like this.
```
- name: FOO
  value: bar
```
In some cases we may also need env variables of type:
```
- name: FOO
  valueFrom:
    fieldRef:
      fieldPath: bar.path
```
This diff adds the ability to specify env in the role that will be translated to the second option.

# Changes
- Create constant placeholder to insert in the value of env in roles if we want them to be translated to the second option
- If placeholder is present in the value, convert to code below stripping the placeholder
```
V1EnvVar(
  name=key,
    value_from=V1EnvVarSource(
      field_ref=V1ObjectFieldSelector(field_path=value)
    ),
)
```
if not, translate to
```
V1EnvVar(name=key, value=value)
```
- update tests

Reviewed By: Sanjay-Ganeshan

Differential Revision: D59865500
  • Loading branch information
ndacosta authored and facebook-github-bot committed Jul 20, 2024
1 parent 6112e03 commit dc8eefd
Show file tree
Hide file tree
Showing 2 changed files with 44 additions and 5 deletions.
30 changes: 27 additions & 3 deletions torchx/schedulers/kubernetes_scheduler.py
Original file line number Diff line number Diff line change
Expand Up @@ -169,6 +169,17 @@

LABEL_INSTANCE_TYPE = "node.kubernetes.io/instance-type"

# role.env translates to static env variables in the yaml
# {"FOO" : "bar"} =====> - name: FOO
# value: bar
# unless this placeholder is present at the start of the role.env value then the env variable
# in the yaml will be dynamically populated at runtime (placeholder is stripped out of the value)
# {"FOO" : "[FIELD_PATH]bar"} =====> - name: FOO
# valueFrom:
# fieldRef:
# fieldPath: bar
PLACEHOLDER_FIELD_PATH = "[FIELD_PATH]"


def sanitize_for_serialization(obj: object) -> object:
from kubernetes import client
Expand All @@ -183,7 +194,9 @@ def role_to_pod(name: str, role: Role, service_account: Optional[str]) -> "V1Pod
V1ContainerPort,
V1EmptyDirVolumeSource,
V1EnvVar,
V1EnvVarSource,
V1HostPathVolumeSource,
V1ObjectFieldSelector,
V1ObjectMeta,
V1PersistentVolumeClaimVolumeSource,
V1Pod,
Expand Down Expand Up @@ -303,9 +316,20 @@ def role_to_pod(name: str, role: Role, service_account: Optional[str]) -> "V1Pod
image=role.image,
name=name,
env=[
V1EnvVar(
name=name,
value=value,
(
V1EnvVar(
name=name,
value_from=V1EnvVarSource(
field_ref=V1ObjectFieldSelector(
field_path=value.strip(PLACEHOLDER_FIELD_PATH)
)
),
)
if value.startswith(PLACEHOLDER_FIELD_PATH)
else V1EnvVar(
name=name,
value=value,
)
)
for name, value in role.env.items()
],
Expand Down
19 changes: 17 additions & 2 deletions torchx/schedulers/test/kubernetes_scheduler_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
KubernetesOpts,
KubernetesScheduler,
LABEL_INSTANCE_TYPE,
PLACEHOLDER_FIELD_PATH,
role_to_pod,
)
from torchx.specs import AppState
Expand Down Expand Up @@ -75,7 +76,7 @@ def _test_app(num_replicas: int = 1) -> specs.AppDef:
"--rank0-env",
specs.macros.rank0_env,
],
env={"FOO": "bar"},
env={"FOO": "bar", "FOO_FIELD_PATH": f"{PLACEHOLDER_FIELD_PATH}bar"},
resource=specs.Resource(
cpu=2,
memMB=3000,
Expand Down Expand Up @@ -149,7 +150,9 @@ def test_role_to_pod(self) -> None:
V1ContainerPort,
V1EmptyDirVolumeSource,
V1EnvVar,
V1EnvVarSource,
V1HostPathVolumeSource,
V1ObjectFieldSelector,
V1ObjectMeta,
V1Pod,
V1PodSpec,
Expand Down Expand Up @@ -188,7 +191,15 @@ def test_role_to_pod(self) -> None:
],
image="pytorch/torchx:latest",
name="name",
env=[V1EnvVar(name="FOO", value="bar")],
env=[
V1EnvVar(name="FOO", value="bar"),
V1EnvVar(
name="FOO_FIELD_PATH",
value_from=V1EnvVarSource(
field_ref=V1ObjectFieldSelector(field_path="bar")
),
),
],
resources=resources,
ports=[V1ContainerPort(name="foo", container_port=1234)],
security_context=V1SecurityContext(),
Expand Down Expand Up @@ -303,6 +314,10 @@ def test_submit_dryrun(self) -> None:
env:
- name: FOO
value: bar
- name: FOO_FIELD_PATH
valueFrom:
fieldRef:
fieldPath: bar
- name: TORCHX_RANK0_HOST
value: localhost
image: pytorch/torchx:latest
Expand Down

0 comments on commit dc8eefd

Please sign in to comment.