fix(k8s): increase Sentry Clickhouse storage to 50Gi #3170
Merged
Description
A Grafana alert has been firing consistently of late because one of Sentry's storage volumes is hovering around 80% utilization. On investigation, the cause was not Zookeeper, which routinely needs old log and snapshot data purged, but Clickhouse's data volume, which stores Sentry's events and is legitimately growing over time. Reducing disk usage there would require deleting some of Sentry's event history. Given how long the initially provisioned 30Gi of data storage has lasted (30Gi is the upstream default for the Helm chart we use to deploy Sentry) and the relative cost of storage, I suggest simply expanding the storage volume to 50Gi.
Type of change
How has this been tested?
I executed the following command within the kubernetes/apps/charts/sentry directory:
helm template sentry . -n sentry -f ../../values/sentry_sensitive.yaml -f ./values.yaml --debug
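A minimal sketch of how that render could be saved on each side of the change and then compared. The helper names `render_to` and `storage_requests` and the `/tmp` paths are hypothetical and not part of this PR; it assumes `helm` is installed and the working directory is the chart directory named above.

```shell
# Print every "storage:" request found in a rendered manifest stream on stdin,
# one per line, with leading indentation removed.
storage_requests() {
  grep -E '^[[:space:]]*storage:' | sed -E 's/^[[:space:]]+//'
}

# Render the chart with the same command as above and save the manifests to "$1".
# --debug output goes to stderr, so the file contains only the manifests.
render_to() {
  helm template sentry . -n sentry \
    -f ../../values/sentry_sensitive.yaml -f ./values.yaml --debug > "$1"
}

# Hypothetical usage:
#   render_to /tmp/before.yaml      # with the base branch checked out
#   render_to /tmp/after.yaml       # with this branch checked out
#   diff -u /tmp/before.yaml /tmp/after.yaml
#   storage_requests < /tmp/after.yaml
```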
This renders the complete manifests for our Sentry deployment offline. By diffing the before and after output, I verified that only the storage request for the sentry-clickhouse StatefulSet changed.
Post-merge follow-ups
Ensure that the following 3 PersistentVolumeClaims change from 30Gi to 50Gi:
sentry-clickhouse-data-sentry-clickhouse-0
sentry-clickhouse-data-sentry-clickhouse-1
sentry-clickhouse-data-sentry-clickhouse-2
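The check on these three claims could be sketched as follows. This assumes `kubectl` is pointed at the right cluster and that the claims live in the `sentry` namespace (inferred from the `-n sentry` flag above, not confirmed); the helper names are hypothetical.

```shell
# Emit the three claim names listed above.
pvc_names() {
  for i in 0 1 2; do
    echo "sentry-clickhouse-data-sentry-clickhouse-$i"
  done
}

# Print each claim's name, requested size, and actual provisioned capacity.
# Assumption: the claims are in the "sentry" namespace.
check_pvcs() {
  for pvc in $(pvc_names); do
    kubectl get pvc "$pvc" -n sentry \
      -o jsonpath='{.metadata.name}{"\t"}{.spec.resources.requests.storage}{"\t"}{.status.capacity.storage}{"\n"}'
  done
}

# Hypothetical usage:
#   check_pvcs
```

Printing the requested size next to the actual capacity makes it easy to spot a claim whose request was updated but whose volume has not been resized yet.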
It may additionally be necessary to log into the associated pods or nodes and use resize2fs to expand the filesystem so that it makes use of the newly available disk space. Then use df -h within each of the three pods to confirm that the filesystems are 50Gi in size and are now well below 80% utilization.
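A sketch of running that utilization check across all three pods at once. The pod names are inferred from the claim names above, and the `sentry` namespace and the `/var/lib/clickhouse` mount path are assumptions to adjust for the actual deployment; the helper names are hypothetical.

```shell
# Read `df -P <path>` output on stdin and print the usage percentage as a bare integer.
usage_pct() {
  awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Report data-volume usage for each Clickhouse pod.
# Assumptions: namespace "sentry", pods sentry-clickhouse-0..2,
# data volume mounted at /var/lib/clickhouse.
check_usage() {
  for i in 0 1 2; do
    pct=$(kubectl exec -n sentry "sentry-clickhouse-$i" -- df -P /var/lib/clickhouse | usage_pct)
    echo "sentry-clickhouse-$i: ${pct}% used"
  done
}

# Hypothetical usage:
#   check_usage
```

`df -P` is used instead of `df -h` here only because its fixed single-line format is easier to parse; `df -h` remains the more readable choice for a manual check.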