A very simple xDS server.
It watches Kubernetes Services and the respective EndpointSlices and serves them via an xDS server to clients.
The code is at a very early stage of development and should be considered alpha quality.
There are two Kustomize bases, one for the cluster-scoped and one for the namespace-scoped resources:
- Cluster-scoped resources contain the RBAC and CRD definitions needed by the server to watch Service, EndpointSlice and XdsService resources in the cluster.
- Namespace-scoped resources contain the manifests to deploy the xDS server.
In order to configure which services should be streamed as xds targets to clients by the server, users need to specify XdsService resources:
apiVersion: semaphore-xds.uw.systems/v1alpha1
kind: XdsService
metadata:
  name: foo
spec:
  service:
    name: <service-name> # clients will access the service at: xds:///<service-name>.<namespace>:<port>
  loadBalancing:
    policy: <policy-name> # Optional. Defaults to round_robin
  enableRemoteEndpoints: false # Whether to look at remote clusters for service endpoints. Defaults to false
  priorityStrategy: local-first # The strategy to use when assigning priorities to endpoints. Possible values are "none|local-first". Defaults to "none"
  retry:
    retryOn: ["internal", "cancelled"]
    numRetries: 1
    backoff:
      baseInterval: "25ms"
      maxInterval: "250ms"
Note that an XdsService can only point to a Service under the same namespace.
Setting a priority strategy will only be meaningful when enableRemoteEndpoints
is also set to true. One of the following strategies will be used to assign
priority to endpoints:
- none: All endpoints equally get the highest priority (0).
- local-first: If there are both local and remote endpoints available, local ones will be assigned the highest priority (0) while remotes will get the next value (1).
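The two strategies above can be sketched in a few lines of Go. This is only an illustration of the described behaviour, not the server's actual code: the Endpoint type and the assignPriorities function are hypothetical names introduced here.

```go
package main

import "fmt"

// Endpoint is a hypothetical, simplified view of a service endpoint.
type Endpoint struct {
	Address string
	Remote  bool // true when the endpoint lives in a remote cluster
}

// assignPriorities returns a priority per endpoint address for the given
// strategy. Lower numbers mean higher priority; 0 is the highest.
func assignPriorities(strategy string, endpoints []Endpoint) map[string]uint32 {
	hasLocal, hasRemote := false, false
	for _, e := range endpoints {
		if e.Remote {
			hasRemote = true
		} else {
			hasLocal = true
		}
	}
	// Remote endpoints are only demoted when the strategy is local-first and
	// both local and remote endpoints are available.
	demoteRemote := strategy == "local-first" && hasLocal && hasRemote

	priorities := make(map[string]uint32, len(endpoints))
	for _, e := range endpoints {
		if demoteRemote && e.Remote {
			priorities[e.Address] = 1
		} else {
			priorities[e.Address] = 0
		}
	}
	return priorities
}

func main() {
	eps := []Endpoint{
		{Address: "10.0.0.1:50051", Remote: false},
		{Address: "10.1.0.1:50051", Remote: true},
	}
	fmt.Println(assignPriorities("local-first", eps))
	fmt.Println(assignPriorities("none", eps))
}
```

Note that with local-first and only remote endpoints available, every endpoint keeps priority 0, so traffic is not left without a highest-priority target.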
Users can use the following labels on Service resources to instruct the xds server to stream the target:
- xds.semaphore.uw.systems/enabled: "true": to enable xds load balancing
- xds.semaphore.uw.systems/lb-policy: "<policy>": to specify the load balancing policy
If both are specified, configuration that comes from XdsService resources takes precedence over Service labels.
The supported values are derived from the envoy proxy library for cluster lb policies. If none is set, or an invalid value is passed, the server configuration will default to round robin.
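A minimal Go sketch of the fallback and precedence rules described above. The function name resolvePolicy is hypothetical, and the list of known policies is an illustrative subset limited to the two names that appear in this document. The sketch also assumes that a policy set on the XdsService always wins over the label, even when it turns out to be invalid.

```go
package main

import "fmt"

const defaultPolicy = "round_robin"

// knownPolicies is an illustrative subset limited to the policies named in
// this document; the real set is derived from Envoy's cluster lb policies.
var knownPolicies = map[string]bool{
	"round_robin": true,
	"ring_hash":   true,
}

// resolvePolicy picks the effective policy. A policy set on the XdsService
// wins over the Service label; unset or unknown values fall back to the
// default.
func resolvePolicy(xdsServicePolicy, labelPolicy string) string {
	policy := labelPolicy
	if xdsServicePolicy != "" {
		policy = xdsServicePolicy
	}
	if !knownPolicies[policy] {
		return defaultPolicy
	}
	return policy
}

func main() {
	fmt.Println(resolvePolicy("ring_hash", "round_robin"))
	fmt.Println(resolvePolicy("", "ring_hash"))
	fmt.Println(resolvePolicy("", "not-a-policy"))
}
```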
ring_hash:
Routes requests using a consistent hash ring built from the ringHash configuration below.
Each configured header is looked up on the request; if found, it is hashed with xxhash and mapped to a slot on the hash ring.
apiVersion: semaphore-xds.uw.systems/v1alpha1
kind: XdsService
metadata:
  name: foo
spec:
  service:
    name: <service-name>
  loadBalancing:
    policy: ring_hash
    ringHash:
      minimumRingSize: 1024 # Optional
      maximumRingSize: 8000000 # Optional
      headers:
        - some-header
        - another-header
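To make the idea of mapping a header value to a slot on a hash ring more concrete, here is a rough, self-contained Go sketch of consistent hashing. It is not the implementation used by Envoy or gRPC: it uses FNV-1a from the standard library as a stand-in for xxhash, and the names buildRing, pick and the replicas parameter are hypothetical. It also hashes a single header value and does not model minimumRingSize or maximumRingSize.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ringEntry is one slot on the ring, owned by an endpoint.
type ringEntry struct {
	hash     uint64
	endpoint string
}

// hashKey hashes a string with FNV-1a (a stand-in for xxhash in this sketch).
func hashKey(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// buildRing places `replicas` virtual slots per endpoint on the ring and
// sorts the slots by hash.
func buildRing(endpoints []string, replicas int) []ringEntry {
	ring := make([]ringEntry, 0, len(endpoints)*replicas)
	for _, ep := range endpoints {
		for i := 0; i < replicas; i++ {
			ring = append(ring, ringEntry{hashKey(fmt.Sprintf("%s#%d", ep, i)), ep})
		}
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].hash < ring[j].hash })
	return ring
}

// pick returns the endpoint owning the first slot at or after the hash of
// the header value, wrapping around to the first slot at the end of the ring.
func pick(ring []ringEntry, headerValue string) string {
	if len(ring) == 0 {
		return ""
	}
	h := hashKey(headerValue)
	i := sort.Search(len(ring), func(i int) bool { return ring[i].hash >= h })
	if i == len(ring) {
		i = 0
	}
	return ring[i].endpoint
}

func main() {
	ring := buildRing([]string{"10.0.0.1:50051", "10.0.0.2:50051", "10.0.0.3:50051"}, 100)
	// The same header value is always routed to the same endpoint.
	fmt.Println(pick(ring, "customer-42"))
	fmt.Println(pick(ring, "customer-42"))
}
```

The useful property is stickiness: requests that carry the same header value land on the same endpoint for as long as the set of endpoints does not change.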
retryOn must be set with at least one value for the rest of the retry policy to be served; otherwise the whole policy is ignored.
We typically use xDS with Go gRPC services, which at the time of writing support only certain retryOn values.
backoff configures the exponential backoff parameters. It is optional; when omitted, the base interval defaults to 25ms and the max interval to 10 times the base.
Note: this retry policy applies to all routes of the service. We may later expand this to offer per-route configuration.
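The backoff defaults described above can be illustrated with a short Go sketch. The function names are hypothetical, and the doubling schedule is only an assumption about what "exponential backoff" looks like; the actual delays are decided by the client and may include jitter.

```go
package main

import (
	"fmt"
	"time"
)

// defaultBaseInterval is the default base interval mentioned above.
const defaultBaseInterval = 25 * time.Millisecond

// resolveBackoff fills in the defaults: a 25ms base interval, and a max
// interval of 10 times the base. Zero values are treated as "not set".
func resolveBackoff(base, max time.Duration) (time.Duration, time.Duration) {
	if base <= 0 {
		base = defaultBaseInterval
	}
	if max <= 0 {
		max = 10 * base
	}
	return base, max
}

// backoffCeiling returns an upper bound for the delay before the given retry
// attempt (1-based), doubling on every attempt and capping at max.
// The doubling factor is an assumption made for this illustration.
func backoffCeiling(base, max time.Duration, attempt int) time.Duration {
	d := base
	for i := 1; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max
		}
	}
	if d > max {
		return max
	}
	return d
}

func main() {
	base, max := resolveBackoff(0, 0)
	for attempt := 1; attempt <= 5; attempt++ {
		fmt.Printf("attempt %d: up to %v\n", attempt, backoffCeiling(base, max, attempt))
	}
}
```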
The expected server address follows the pattern xds:///<service-name>.<namespace>:<port>.
For example: xds:///grpc-echo-server.labs:50051
Note that this is not a DNS name, so a domain cannot be appended to it!
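As a small illustration of the address pattern, the Go sketch below builds a target from its parts and rejects targets that have a domain appended. The helper names xdsTarget and parseXdsTarget are hypothetical and not part of this project or of gRPC.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

const xdsPrefix = "xds:///"

// xdsTarget builds an address of the form xds:///<service-name>.<namespace>:<port>.
func xdsTarget(service, namespace string, port int) string {
	return fmt.Sprintf("%s%s.%s:%d", xdsPrefix, service, namespace, port)
}

// parseXdsTarget splits a target into service name, namespace and port.
// It returns an error when anything other than <service-name>.<namespace>
// precedes the port, for example when a DNS domain has been appended.
func parseXdsTarget(target string) (string, string, string, error) {
	if !strings.HasPrefix(target, xdsPrefix) {
		return "", "", "", fmt.Errorf("missing %s prefix: %q", xdsPrefix, target)
	}
	host, port, err := net.SplitHostPort(strings.TrimPrefix(target, xdsPrefix))
	if err != nil {
		return "", "", "", err
	}
	service, namespace, found := strings.Cut(host, ".")
	if !found || service == "" || namespace == "" || strings.Contains(namespace, ".") {
		return "", "", "", fmt.Errorf("expected <service-name>.<namespace>, got %q", host)
	}
	return service, namespace, port, nil
}

func main() {
	target := xdsTarget("grpc-echo-server", "labs", 50051)
	fmt.Println(target)

	// Appending a cluster domain does not produce a valid target.
	_, _, _, err := parseXdsTarget("xds:///grpc-echo-server.labs.svc.cluster.local:50051")
	fmt.Println(err)
}
```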
The user's gRPC client needs a JSON bootstrap configuration that points to the xDS server, like:
{
  "xds_servers": [
    {
      "server_uri": "semaphore-xds.sys-semaphore.svc.cluster.local:18000",
      "channel_creds": [
        {
          "type": "insecure"
        }
      ],
      "server_features": ["xds_v3"]
    }
  ]
}
The location of this configuration must be exposed to the client in the GRPC_XDS_BOOTSTRAP environment variable. Alternatively, the JSON content can be passed as a string in the GRPC_XDS_BOOTSTRAP_CONFIG environment variable.
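If the bootstrap content has to be generated rather than written by hand, something like the following Go sketch could produce it. The struct and function names are hypothetical and only mirror the JSON fields shown above; the resulting string is what would go into a bootstrap file or into GRPC_XDS_BOOTSTRAP_CONFIG.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The types below mirror only the fields used in the example above.
type channelCred struct {
	Type string `json:"type"`
}

type xdsServer struct {
	ServerURI      string        `json:"server_uri"`
	ChannelCreds   []channelCred `json:"channel_creds"`
	ServerFeatures []string      `json:"server_features"`
}

type bootstrap struct {
	XDSServers []xdsServer `json:"xds_servers"`
}

// bootstrapJSON renders a bootstrap config for a single xDS server reached
// over an insecure channel, matching the example above.
func bootstrapJSON(serverURI string) (string, error) {
	cfg := bootstrap{
		XDSServers: []xdsServer{{
			ServerURI:      serverURI,
			ChannelCreds:   []channelCred{{Type: "insecure"}},
			ServerFeatures: []string{"xds_v3"},
		}},
	}
	out, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	cfg, err := bootstrapJSON("semaphore-xds.sys-semaphore.svc.cluster.local:18000")
	if err != nil {
		panic(err)
	}
	// Print the compact JSON, e.g. to use as an environment variable value.
	fmt.Println(cfg)
}
```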
Then import the following package in the client code:
_ "google.golang.org/grpc/xds"
and call the xds server addresses.
The client configuration above should be identical for all clients living in the same cluster, assuming only one xDS server is deployed. In that case, it is handy to use a mutation hook to inject the needed environment variable into your client pods. A Kustomize base to achieve that using Kyverno is provided here. This assumes semaphore-xds is deployed under a namespace called sys-semaphore, so patch if needed. Your pods will need to be labeled with xds.semaphore.uw.systems/client: "true" in order to be selected by the mutating rule.
Several metrics are available to determine the status of the controller. They give visibility into errors from the Kubernetes clients, the watchers and the controller's queues. In addition, there are metrics that provide visibility into the resources stored in the snapshot.
- semaphore_xds_kube_http_request_total: Total number of HTTP requests to the Kubernetes API by host, code and method.
- semaphore_xds_kube_http_request_duration_seconds: Histogram of latencies for HTTP requests to the Kubernetes API by host and method.
- semaphore_xds_kube_watcher_objects: Number of objects watched by kind.
- semaphore_xds_kube_watcher_events_total: Number of events handled by kind and event_type.
- semaphore_xds_queue_depth: Workqueue depth, by queue name.
- semaphore_xds_queue_adds_total: Workqueue adds, by queue name.
- semaphore_xds_queue_latency_duration_seconds: Workqueue latency, by queue name.
- semaphore_xds_queue_work_duration_seconds: Workqueue work duration, by queue name.
- semaphore_xds_queue_unfinished_work_seconds: Unfinished work in seconds, by queue name.
- semaphore_xds_queue_longest_running_processor_seconds: Longest running processor, by queue name.
- semaphore_xds_queue_retries_total: Workqueue retries, by queue name.
- semaphore_xds_queue_requeued_items: Items that have been requeued but not reconciled yet, by queue name.
- semaphore_xds_snapshot_cluster: xDS cluster info by name, lb policy and discovery type.
- semaphore_xds_snapshot_listener: xDS listener info by name and target route.
- semaphore_xds_snapshot_endpoint: xDS cluster load assignment endpoints info by cluster, locality zone and subzone, address and health.
- semaphore_xds_snapshot_route: xDS route configuration info by name, path, target domains, virtual host and target cluster.