Merge pull request #3145 from KyriosGN0/tempo-distributed-feat-add-zone-aware

[tempo-distributed] feat: add zone aware replication
zalegrala authored Jun 25, 2024
2 parents 07ea99d + 5c853bf commit be2d0a3
Showing 7 changed files with 296 additions and 18 deletions.
7 changes: 5 additions & 2 deletions charts/tempo-distributed/Chart.lock
@@ -5,5 +5,8 @@ dependencies:
- name: grafana-agent-operator
repository: https://grafana.github.io/helm-charts
version: 0.2.2
digest: sha256:761a500ff2fd8b8c5a52b70683abcdec8b6ffcae6b81ad26ea4ddeddbaf609f1
generated: "2023-09-28T13:42:34.486521-07:00"
- name: rollout-operator
repository: https://grafana.github.io/helm-charts
version: 0.15.0
digest: sha256:7be5c7a4c0d1a71dc6de69b8e99ac5a61c1771d6241e7f9105393cc7117d4f0a
generated: "2024-05-27T19:15:20.601670632+03:00"
7 changes: 6 additions & 1 deletion charts/tempo-distributed/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
name: tempo-distributed
description: Grafana Tempo in MicroService mode
type: application
version: 1.12.0
version: 1.13.0
appVersion: 2.5.0
engine: gotpl
home: https://grafana.com/docs/tempo/latest/
@@ -31,3 +31,8 @@ dependencies:
version: 0.2.2
repository: https://grafana.github.io/helm-charts
condition: metaMonitoring.grafanaAgent.installOperator
- name: rollout-operator
alias: rollout_operator
repository: https://grafana.github.io/helm-charts
version: 0.15.0
condition: rollout_operator.enabled
25 changes: 24 additions & 1 deletion charts/tempo-distributed/README.md
@@ -1,6 +1,6 @@
# tempo-distributed

![Version: 1.12.0](https://img.shields.io/badge/Version-1.12.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 2.5.0](https://img.shields.io/badge/AppVersion-2.5.0-informational?style=flat-square)
![Version: 1.13.0](https://img.shields.io/badge/Version-1.13.0-informational?style=flat-square) ![Type: application](https://img.shields.io/badge/Type-application-informational?style=flat-square) ![AppVersion: 2.5.0](https://img.shields.io/badge/AppVersion-2.5.0-informational?style=flat-square)

Grafana Tempo in MicroService mode

@@ -14,6 +14,7 @@ Grafana Tempo in MicroService mode
|------------|------|---------|
| https://charts.min.io/ | minio(minio) | 4.0.12 |
| https://grafana.github.io/helm-charts | grafana-agent-operator(grafana-agent-operator) | 0.2.2 |
| https://grafana.github.io/helm-charts | rollout_operator(rollout-operator) | 0.15.0 |

## Chart Repo

@@ -45,6 +46,11 @@ The command removes all the Kubernetes components associated with the chart and

A major chart version change indicates that there is an incompatible breaking change needing manual actions.

### From Chart versions < 1.11.0

EXPERIMENTAL: Zone-aware replication has been added to the ingester StatefulSet.
Note that the number of ingester pods per zone is calculated as `(.Values.ingester.replicas + numberOfZones - 1) / numberOfZones` using integer division, i.e. `ingester.replicas / numberOfZones` rounded up.
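
As a rough illustration of the formula above, the following sketch assumes the default three zones and an example `ingester.replicas` of 7. The value 7 is chosen only to show the rounding and is not taken from the chart.

```yaml
# values.yaml (illustrative values only)
ingester:
  replicas: 7            # requested total
  zoneAwareReplication:
    enabled: true        # default zones: zone-a, zone-b, zone-c
# replicas per zone   = (7 + 3 - 1) / 3 = 3   (integer division)
# total ingester pods = 3 zones * 3     = 9   (more than the 7 requested)
```

With `replicas: 6` the same calculation gives 2 pods per zone and 6 pods in total, so the total only exceeds the requested value when `replicas` is not a multiple of the number of zones.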

### From Chart versions < 1.8.0

Switch to new overrides format, see https://grafana.com/docs/tempo/latest/configuration/#overrides.
@@ -519,6 +525,23 @@ The memcached default args are removed and should be provided manually. The sett
| ingester.terminationGracePeriodSeconds | int | `300` | Grace period to allow the ingester to shut down before it is killed. For the ingester in particular, this must be increased: it must be long enough for ingesters to shut down gracefully, flush or transfer all data, and leave the member ring. |
| ingester.tolerations | list | `[]` | Tolerations for ingester pods |
| ingester.topologySpreadConstraints | string | Defaults to allow skew of no more than 1 node per AZ | topologySpread for ingester pods. Passed through `tpl` and therefore configured as a string |
| ingester.zoneAwareReplication | object | `{"enabled":false,"maxUnavailable":50,"topologyKey":null,"zones":[{"extraAffinity":{},"name":"zone-a","nodeSelector":null,"storageClass":null},{"extraAffinity":{},"name":"zone-b","nodeSelector":null,"storageClass":null},{"extraAffinity":{},"name":"zone-c","nodeSelector":null,"storageClass":null}]}` | EXPERIMENTAL Feature, disabled by default |
| ingester.zoneAwareReplication.enabled | bool | `false` | Enable zone-aware replication for ingester |
| ingester.zoneAwareReplication.maxUnavailable | int | `50` | Maximum number of ingesters that can be unavailable per zone during rollout |
| ingester.zoneAwareReplication.topologyKey | string | `nil` | topologyKey to use in pod anti-affinity. If unset, no anti-affinity rules are generated. If set, the generated anti-affinity rule makes sure that pods from different zones do not mix. E.g.: topologyKey: 'kubernetes.io/hostname' |
| ingester.zoneAwareReplication.zones | list | `[{"extraAffinity":{},"name":"zone-a","nodeSelector":null,"storageClass":null},{"extraAffinity":{},"name":"zone-b","nodeSelector":null,"storageClass":null},{"extraAffinity":{},"name":"zone-c","nodeSelector":null,"storageClass":null}]` | Zone definitions for ingester zones. Note: to change any part of the list you have to redefine the whole list, because lists in Helm values cannot be partially overridden. |
| ingester.zoneAwareReplication.zones[0] | object | `{"extraAffinity":{},"name":"zone-a","nodeSelector":null,"storageClass":null}` | Name of the zone, used in labels and selectors. Must follow Kubernetes naming restrictions: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ |
| ingester.zoneAwareReplication.zones[0].extraAffinity | object | `{}` | extraAffinity adds user defined custom affinity rules (merged with generated rules) |
| ingester.zoneAwareReplication.zones[0].nodeSelector | string | `nil` | nodeSelector to restrict where pods of this zone can be placed, e.g. `nodeSelector: { topology.kubernetes.io/zone: zone-a }` |
| ingester.zoneAwareReplication.zones[0].storageClass | string | `nil` | Storage class for the ingester data Persistent Volume. If defined, `storageClassName: <storageClass>` is used. If set to "-", `storageClassName: ""` is used, which disables dynamic provisioning. If undefined or set to null (the default), falls back to the value of `ingester.persistentVolume.storageClass`. |
| ingester.zoneAwareReplication.zones[1] | object | `{"extraAffinity":{},"name":"zone-b","nodeSelector":null,"storageClass":null}` | Name of the zone, used in labels and selectors. Must follow Kubernetes naming restrictions: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ |
| ingester.zoneAwareReplication.zones[1].extraAffinity | object | `{}` | extraAffinity adds user defined custom affinity rules (merged with generated rules) |
| ingester.zoneAwareReplication.zones[1].nodeSelector | string | `nil` | nodeSelector to restrict where pods of this zone can be placed, e.g. `nodeSelector: { topology.kubernetes.io/zone: zone-b }` |
| ingester.zoneAwareReplication.zones[1].storageClass | string | `nil` | Storage class for the ingester data Persistent Volume. If defined, `storageClassName: <storageClass>` is used. If set to "-", `storageClassName: ""` is used, which disables dynamic provisioning. If undefined or set to null (the default), falls back to the value of `ingester.persistentVolume.storageClass`. |
| ingester.zoneAwareReplication.zones[2] | object | `{"extraAffinity":{},"name":"zone-c","nodeSelector":null,"storageClass":null}` | Name of the zone, used in labels and selectors. Must follow Kubernetes naming restrictions: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ |
| ingester.zoneAwareReplication.zones[2].extraAffinity | object | `{}` | extraAffinity adds user defined custom affinity rules (merged with generated rules) |
| ingester.zoneAwareReplication.zones[2].nodeSelector | string | `nil` | nodeSelector to restrict where pods of this zone can be placed, e.g. `nodeSelector: { topology.kubernetes.io/zone: zone-c }` |
| ingester.zoneAwareReplication.zones[2].storageClass | string | `nil` | Storage class for the ingester data Persistent Volume. If defined, `storageClassName: <storageClass>` is used. If set to "-", `storageClassName: ""` is used, which disables dynamic provisioning. If undefined or set to null (the default), falls back to the value of `ingester.persistentVolume.storageClass`. |
| license.contents | string | `"NOTAVALIDLICENSE"` | |
| license.external | bool | `false` | |
| license.secretName | string | `"{{ include \"tempo.resourceName\" (dict \"ctx\" . \"component\" \"license\") }}"` | |
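
The `ingester.zoneAwareReplication` values documented in the table above can be combined into a single override file. The following is a minimal sketch, not a tested configuration: the zone names, the `topology.kubernetes.io/zone` node labels, the `kubernetes.io/hostname` topology key, and the decision to enable the bundled rollout-operator through `rollout_operator.enabled` are assumptions made for illustration.

```yaml
# zone-aware-values.yaml (illustrative sketch)
ingester:
  replicas: 3
  zoneAwareReplication:
    enabled: true
    maxUnavailable: 50
    # Optional: generates a pod anti-affinity rule between zones.
    topologyKey: kubernetes.io/hostname
    # The whole list must be redefined; the chart fails with fewer than 3 zones.
    zones:
      - name: zone-a
        nodeSelector:
          topology.kubernetes.io/zone: zone-a
        extraAffinity: {}
        storageClass: null
      - name: zone-b
        nodeSelector:
          topology.kubernetes.io/zone: zone-b
        extraAffinity: {}
        storageClass: null
      - name: zone-c
        nodeSelector:
          topology.kubernetes.io/zone: zone-c
        extraAffinity: {}
        storageClass: null

# Dependency added in this chart version; installed only when enabled.
rollout_operator:
  enabled: true
```

Because the zone-aware StatefulSets use the `OnDelete` update strategy, pods are presumably restarted by the rollout-operator rather than by the StatefulSet controller, which is why the operator is enabled in this sketch.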
5 changes: 5 additions & 0 deletions charts/tempo-distributed/README.md.gotmpl
@@ -39,6 +39,11 @@ The command removes all the Kubernetes components associated with the chart and

A major chart version change indicates that there is an incompatible breaking change needing manual actions.

### From Chart versions < 1.11.0

EXPERIMENTAL: Zone-aware replication has been added to the ingester StatefulSet.
Note that the number of ingester pods per zone is calculated as `(.Values.ingester.replicas + numberOfZones - 1) / numberOfZones` using integer division, i.e. `ingester.replicas / numberOfZones` rounded up.

### From Chart versions < 1.8.0

Switch to new overrides format, see https://grafana.com/docs/tempo/latest/configuration/#overrides.
179 changes: 176 additions & 3 deletions charts/tempo-distributed/templates/ingester/_helpers-ingester.tpl
@@ -1,7 +1,180 @@
{{/*
ingester imagePullSecrets
*/}}
{{- define "tempo.ingesterImagePullSecrets" -}}
{{- $dict := dict "tempo" .Values.tempo.image "component" .Values.ingester.image "global" .Values.global.image -}}
{{- include "tempo.imagePullSecrets" $dict -}}
{{- end }}
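{{/*
Build a map of zone name -> per-zone settings (affinity, nodeSelector, replicas, storageClass).
The map is only populated when ingester.zoneAwareReplication.enabled is true.
Params:
ctx = . context
*/}}
{{- define "ingester.zoneAwareReplicationMap" -}}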
{{- $zonesMap := (dict) -}}
{{- $defaultZone := (dict "affinity" .ctx.Values.ingester.affinity "nodeSelector" .ctx.Values.ingester.nodeSelector "replicas" .ctx.Values.ingester.replicas "storageClass" .ctx.Values.ingester.storageClass) -}}
{{- if .ctx.Values.ingester.zoneAwareReplication.enabled -}}
{{- $numberOfZones := len .ctx.Values.ingester.zoneAwareReplication.zones -}}
{{- if lt $numberOfZones 3 -}}
{{- fail "When zone-awareness is enabled, you must have at least 3 zones defined." -}}
{{- end -}}
{{- $requestedReplicas := .ctx.Values.ingester.replicas -}}
{{- $replicaPerZone := div (add $requestedReplicas $numberOfZones -1) $numberOfZones -}}
{{- range $idx, $rolloutZone := .ctx.Values.ingester.zoneAwareReplication.zones -}}
{{- $_ := set $zonesMap $rolloutZone.name (dict
"affinity" (($rolloutZone.extraAffinity | default (dict)) | mergeOverwrite (include "ingester.zoneAntiAffinity" (dict "rolloutZoneName" $rolloutZone.name "topologyKey" $.ctx.Values.ingester.zoneAwareReplication.topologyKey) | fromYaml))
"nodeSelector" ($rolloutZone.nodeSelector | default (dict) )
"replicas" $replicaPerZone
"storageClass" $rolloutZone.storageClass
) -}}
{{- end -}}
{{- $zonesMap | toYaml }}
{{- end -}}
{{- end -}}
{{/*
Calculate anti-affinity for a zone
Params:
rolloutZoneName = name of the rollout zone
topologyKey = topology key
*/}}
{{- define "ingester.zoneAntiAffinity" -}}
{{- if .topologyKey -}}
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: rollout-group
operator: In
values:
- ingester
- key: zone
operator: NotIn
values:
- {{ .rolloutZoneName }}
topologyKey: {{ .topologyKey | quote }}
{{- else -}}
{}
{{- end -}}
{{- end -}}
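
For illustration, with `rolloutZoneName` set to `zone-a` and `topologyKey` set to `kubernetes.io/hostname` (both example inputs, not chart defaults), the `ingester.zoneAntiAffinity` helper above would render roughly the following. The indentation shown is an assumption based on conventional YAML layout.

```yaml
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
          # Match ingester pods ...
          - key: rollout-group
            operator: In
            values:
              - ingester
          # ... that belong to any zone other than this one.
          - key: zone
            operator: NotIn
            values:
              - zone-a
      topologyKey: "kubernetes.io/hostname"
```

The rule keeps `zone-a` ingesters out of any topology domain (here, a node) that already runs an ingester from a different zone. When `topologyKey` is unset, the helper returns `{}` and no anti-affinity rule is generated.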

{{/*
Calculate annotations with zone-awareness
Params:
ctx = . context
component = component name
rolloutZoneName = rollout zone name (optional)
*/}}
{{- define "ingester.Annotations" -}}
{{- if and .ctx.Values.ingester.zoneAwareReplication.maxUnavailable .rolloutZoneName }}
{{- $map := dict "rollout-max-unavailable" (.ctx.Values.ingester.zoneAwareReplication.maxUnavailable | toString) -}}
{{- toYaml (deepCopy $map | mergeOverwrite .ctx.Values.ingester.annotations) }}
{{- else -}}
{{ toYaml .ctx.Values.ingester.annotations }}
{{- end -}}
{{- end -}}

{{/*
ingester labels
*/}}
{{- define "ingester.labels" -}}
{{- if and .ctx.Values.ingester.zoneAwareReplication.enabled .rolloutZoneName }}
name: {{ printf "%s-%s" .component .rolloutZoneName }}
rollout-group: {{ .component }}
zone: {{ .rolloutZoneName }}
{{- end }}
helm.sh/chart: {{ include "tempo.chart" .ctx }}
app.kubernetes.io/name: {{ include "tempo.name" .ctx }}
app.kubernetes.io/instance: {{ .ctx.Release.Name }}
{{- if .component }}
app.kubernetes.io/component: {{ .component }}
{{- end }}
{{- if .memberlist }}
app.kubernetes.io/part-of: memberlist
{{- end }}
{{- if .ctx.Chart.AppVersion }}
app.kubernetes.io/version: {{ .ctx.Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .ctx.Release.Service }}
{{- end -}}
{{/*
Resource name template
*/}}
{{- define "ingester.resourceName" -}}
{{- $resourceName := include "tempo.fullname" .ctx -}}
{{- if .component -}}{{- $resourceName = printf "%s-%s" $resourceName .component -}}{{- end -}}
{{- if .rolloutZoneName -}}{{- $resourceName = printf "%s-%s" $resourceName .rolloutZoneName -}}{{- end -}}
{{- $resourceName -}}
{{- end -}}


{{/*
ingester selector labels
Params:
ctx = . context
component = name of the component
rolloutZoneName = rollout zone name (optional)
*/}}
{{- define "ingester.selectorLabels" -}}
{{- if .ctx.Values.enterprise.legacyLabels }}
{{- if .component -}}
app: {{ include "tempo.name" .ctx }}-{{ .component }}
{{- end }}
release: {{ .ctx.Release.Name }}
{{- else -}}
app.kubernetes.io/name: {{ include "tempo.name" .ctx }}
app.kubernetes.io/instance: {{ .ctx.Release.Name }}
{{- if .component }}
app.kubernetes.io/component: {{ .component }}
{{- end }}
{{- end -}}
{{- if .rolloutZoneName }}
{{- if not .component }}
{{- printf "Component name cannot be empty if rolloutZoneName (%s) is set" .rolloutZoneName | fail }}
{{- end }}
rollout-group: {{ .component }}
zone: {{ .rolloutZoneName }}
{{- end }}
{{- end -}}

{{/*
ingester POD labels
Params:
ctx = . context
component = name of the component
memberlist = true if part of memberlist gossip ring
rolloutZoneName = rollout zone name (optional)
*/}}
{{- define "ingester.podLabels" -}}
{{ with .ctx.Values.global.podLabels -}}
{{ toYaml . }}
{{ end }}
{{- if .ctx.Values.enterprise.legacyLabels }}
{{- if .component -}}
app: {{ include "tempo.name" .ctx }}-{{ .component }}
{{- if not .rolloutZoneName }}
name: {{ .component }}
{{- end }}
{{- end }}
{{- if .memberlist }}
gossip_ring_member: "true"
{{- end -}}
{{- if .component }}
target: {{ .component }}
release: {{ .ctx.Release.Name }}
{{- end }}
{{- else -}}
helm.sh/chart: {{ include "tempo.chart" .ctx }}
app.kubernetes.io/name: {{ include "tempo.name" .ctx }}
app.kubernetes.io/instance: {{ .ctx.Release.Name }}
app.kubernetes.io/version: {{ .ctx.Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .ctx.Release.Service }}
{{- if .component }}
app.kubernetes.io/component: {{ .component }}
{{- end }}
{{- if .memberlist }}
app.kubernetes.io/part-of: memberlist
{{- end }}
{{- end }}
{{- with .ctx.Values.ingester.podLabels }}
{{ toYaml . }}
{{- end }}
{{- if .rolloutZoneName }}
{{- if not .component }}
{{- printf "Component name cannot be empty if rolloutZoneName (%s) is set" .rolloutZoneName | fail }}
{{- end }}
rollout-group: ingester
zone: {{ .rolloutZoneName }}
{{- end }}
{{- end -}}
@@ -1,31 +1,44 @@
{{ $dict := dict "ctx" . "component" "ingester" "memberlist" true }}
{{- $dict := dict "ctx" . "component" "ingester" "memberlist" true -}}
{{- $zonesMap := include "ingester.zoneAwareReplicationMap" $dict | fromYaml -}}
{{- range $zoneName, $rolloutZone := $zonesMap -}}
{{- with $ -}}
{{- $_ := set $dict "rolloutZoneName" $zoneName -}}
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: {{ template "tempo.resourceName" $dict }}
name: {{ template "ingester.resourceName" $dict }}
namespace: {{ .Release.Namespace }}
labels:
{{- include "tempo.labels" $dict | nindent 4 }}
{{- include "ingester.labels" $dict | indent 4 }}
{{- if .Values.ingester.zoneAwareReplication.enabled }}
annotations:
{{- include "ingester.Annotations" $dict | nindent 4}}
{{- else }}
{{- with .Values.ingester.annotations }}
annotations:
{{- toYaml . | nindent 4 }}
{{- end }}
{{- end }}
spec:
{{- if not .Values.ingester.autoscaling.enabled }}
replicas: {{ .Values.ingester.replicas }}
{{- end }}
{{- if not .Values.ingester.autoscaling.enabled }}
replicas: {{ $rolloutZone.replicas }}
{{- end }}
selector:
matchLabels:
{{- include "tempo.selectorLabels" $dict | nindent 6}}
{{- include "ingester.selectorLabels" $dict | nindent 6}}
serviceName: ingester
podManagementPolicy: Parallel
updateStrategy:
{{- if .Values.ingester.zoneAwareReplication.enabled }}
type: OnDelete
{{- else }}
rollingUpdate:
partition: 0
{{- end }}
template:
metadata:
labels:
{{- include "tempo.podLabels" $dict | nindent 8 }}
{{- include "ingester.podLabels" $dict | nindent 8 }}
{{- with .Values.tempo.podLabels }}
{{- toYaml . | nindent 8 }}
{{- end }}
@@ -112,11 +125,11 @@ spec:
{{- tpl . $ | nindent 8 }}
{{- end }}
{{- end }}
{{- with .Values.ingester.affinity }}
{{- with $rolloutZone.affinity }}
affinity:
{{- tpl . $ | nindent 8 }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.ingester.nodeSelector }}
{{- with $rolloutZone.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
@@ -169,3 +182,6 @@ spec:
requests:
storage: {{ .Values.ingester.persistence.size | quote }}
{{- end }}
---
{{ end }}
{{ end }}