From 95f8caec568100dd383eb10a5ecb1035b7ecfeca Mon Sep 17 00:00:00 2001
From: Dmitry Shmulevich
Date: Tue, 29 Oct 2024 16:41:22 -0700
Subject: [PATCH] update k8s doc

Signed-off-by: Dmitry Shmulevich
---
 README.md   | 32 +++++++++++--------------
 docs/k8s.md | 69 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 77 insertions(+), 24 deletions(-)

diff --git a/README.md b/README.md
index 176d036..f7c5d13 100644
--- a/README.md
+++ b/README.md
@@ -1,38 +1,34 @@
 # Topograph
 
-Topograph is a component designed to expose the underlying physical network topology of a cluster to enable a workload manager make network-topology aware scheduling decisions. It consists of four major components:
+Topograph is a component designed to expose the underlying physical network topology of a cluster to enable a workload manager to make network-topology-aware scheduling decisions.
 
-1. **CSP Connector**
-2. **API Server**
-3. **Topology Generator**
-4. **Node Observer**
+Topograph consists of four major components:
+
+1. **API Server**
+2. **Node Observer**
+3. **CSP Connector**
+4. **Topology Generator**

 *(Design diagram)*

 ## Components
 
-### 1. CSP Connector
-The CSP Connector is responsible for interfacing with various CSPs to retrieve cluster-related information. Currently, it supports AWS, OCI, GCP, CoreWeave, bare metal, with plans to add support for Azure. The primary goal of the CSP Connector is to obtain the network topology configuration of a cluster, which may require several subsequent API calls. Once the information is obtained, the CSP Connector translates the network topology from CSP-specific formats to an internal format that can be utilized by the Topology Generator.
-
-### 2. API Server
+### 1. API Server
 The API Server listens for network topology configuration requests on a specific port. When a request is received, the server triggers the Topology Generator to populate the configuration.
 
-The API Server exposes two endpoints: one for synchronous requests and one for asynchronous requests.
+### 2. Node Observer
+The Node Observer is used when the Topology Generator is deployed in a Kubernetes cluster. It monitors changes in the cluster nodes.
+If a node's status changes (e.g., a node goes down or comes up), the Node Observer sends a request to the API Server to generate a new topology configuration.
 
-- The synchronous endpoint responds to the HTTP request with the topology configuration, though this process may take some time.
-- In the asynchronous mode, the API Server promptly returns a "202 Accepted" response to the HTTP request. It then begins generating and serializing the topology configuration.
+### 3. CSP Connector
+The CSP Connector is responsible for interfacing with various CSPs to retrieve cluster-related information. Currently, it supports AWS, OCI, GCP, CoreWeave, and bare metal, with plans to add support for Azure. The primary goal of the CSP Connector is to obtain the network topology configuration of a cluster, which may require several subsequent API calls. Once the information is obtained, the CSP Connector translates the network topology from CSP-specific formats to an internal format that can be utilized by the Topology Generator.
 
-### 3. Topology Generator
+### 4. Topology Generator
 The Topology Generator is the central component that manages the overall network topology of the cluster. It performs the following functions:
-
 - **Notification Handling:** Receives notifications from the API Server.
 - **Topology Gathering:** Instructs the CSP Connector to fetch the current network topology from the CSP.
 - **User Cluster Update:** Translates network topology from the internal format into a format expected by the user cluster, such as SLURM or Kubernetes.
 
-### 4. Node Observer
-The Node Observer is used when the Topology Generator is deployed in a Kubernetes cluster. It monitors changes in the cluster nodes.
-If a node's status changes (e.g., a node goes down or comes up), the Node Observer sends a request to the API Server to generate a new topology configuration.
-
 ## Workflow
 
 - The API Server listens on the port and notifies the Topology Generator about incoming requests. In Kubernetes, the incoming requests are sent by the Node Observer, which watches for changes in node status.
diff --git a/docs/k8s.md b/docs/k8s.md
index 0196968..be59686 100644
--- a/docs/k8s.md
+++ b/docs/k8s.md
@@ -1,12 +1,69 @@
 # Topograph with Kubernetes
 
-In Kubernetes, Topograph performs two main actions:
+Topograph is a tool designed to enhance scheduling decisions in Kubernetes clusters by leveraging network topology information.
 
-- Creates a ConfigMap containing the topology information.
-- Applies node labels that define the node’s position within the cloud topology. For instance, if a node connects to switch S1, which connects to switch S2, and then to switch S3, Topograph will label the node with the following:
-  - `topology.kubernetes.io/network-level-1: S1`
-  - `topology.kubernetes.io/network-level-2: S2`
-  - `topology.kubernetes.io/network-level-3: S3`
+### Overview
+
+Topograph's primary objective is to assist the Kubernetes scheduler in making intelligent pod placement decisions based on the cluster's network topology. It achieves this by:
+
+1. Interacting with Cloud Service Providers (CSPs)
+2. Extracting cluster topology information
+3. Updating the Kubernetes environment with this topology data
+
+### Current Functionality
+
+Topograph performs the following key actions:
+
+1. **ConfigMap Creation**: Generates a ConfigMap containing topology information. This ConfigMap is not currently utilized but serves as an example for potential future integration with the scheduler or other systems.
+
+2. **Node Labeling**: Applies labels to nodes that define their position within the cloud topology. For example, if a node connects to switch S1, which connects to switch S2, and then to switch S3, Topograph will apply the following labels to the node:
+
+   ```
+   topology.kubernetes.io/network-level-1: S1
+   topology.kubernetes.io/network-level-2: S2
+   topology.kubernetes.io/network-level-3: S3
+   ```
+
+### Use of Topograph
+
+While there is currently no fully network-aware scheduler capable of optimally placing groups of pods based on network considerations, Topograph is intended to be a stepping stone towards developing such a scheduler.
+
+In the meantime, Topograph can be used in conjunction with the existing PodAffinity and Topology Spread Constraints features in Kubernetes. This combination allows for improved pod distribution based on network topology information.
+
+The following is an excerpt from the spec of a Kubernetes object deployed in a cluster with three tiers of network switches:
+
+```yaml
+  affinity:
+    podAffinity:
+      preferredDuringSchedulingIgnoredDuringExecution:
+      - weight: 20
+        podAffinityTerm:
+          labelSelector:
+            matchExpressions:
+            - key: app
+              operator: In
+              values:
+              - myapp
+          topologyKey: topology.kubernetes.io/network-level-3
+      - weight: 70
+        podAffinityTerm:
+          labelSelector:
+            matchExpressions:
+            - key: app
+              operator: In
+              values:
+              - myapp
+          topologyKey: topology.kubernetes.io/network-level-2
+      - weight: 90
+        podAffinityTerm:
+          labelSelector:
+            matchExpressions:
+            - key: app
+              operator: In
+              values:
+              - myapp
+          topologyKey: topology.kubernetes.io/network-level-1
+```
 
 ## Configuration and Deployment
 TBD
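
The podAffinity excerpt above expresses a soft preference for packing `myapp` pods under the same switches. The same `topology.kubernetes.io/network-level-*` labels can also drive Kubernetes' built-in Topology Spread Constraints. The snippet below is a minimal sketch; it assumes the three-tier labels shown above and pods labeled `app: myapp`, and is illustrative rather than configuration shipped with Topograph.

```yaml
  # Sketch: spread "myapp" pods evenly across network-level-2 domains
  # (middle-tier switches), but still schedule them when the constraint
  # cannot be met. Assumes nodes carry the labels applied by Topograph.
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/network-level-2
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app: myapp
```

`ScheduleAnyway` keeps the constraint a soft preference, in the same spirit as the weighted podAffinity terms above; `DoNotSchedule` would make it a hard requirement.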