From 1a12f8c30cec7746950e4748bd885898749129e5 Mon Sep 17 00:00:00 2001
From: Alex Howle
Date: Mon, 19 Feb 2024 19:24:41 +0000
Subject: [PATCH] [exporter/loadbalancingexporter] amend README for data loss scenarios [chore] (#31271)

**Description:** amend the loadbalancingexporter README to explain how topology changes can influence decisions around retry configuration, and how they can result in data loss.

**Link to tracking Issue:** 31209

**Testing:** N/A

**Documentation:** README.md
---
 exporter/loadbalancingexporter/README.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/exporter/loadbalancingexporter/README.md b/exporter/loadbalancingexporter/README.md
index b27ba27248eb..8630669ed55d 100644
--- a/exporter/loadbalancingexporter/README.md
+++ b/exporter/loadbalancingexporter/README.md
@@ -49,6 +49,12 @@ This should be stable enough for most cases, and the larger the number of backen

 This also supports service name based exporting for traces. If you have two or more collectors that collect traces and then use spanmetrics processor to generate metrics and push to prometheus, there is a high chance of facing label collisions on prometheus if the routing is based on `traceID` because every collector sees the `service+operation` label. With service name based routing, each collector can only see one service name and can push metrics without any label collisions.

+## Resilience and scaling considerations
+The `loadbalancingexporter` will, irrespective of the chosen resolver (`static`, `dns`, `k8s`), create one exporter per endpoint. The exporter conforms to its published configuration regarding sending queue and retry mechanisms. Importantly, the `loadbalancingexporter` will not attempt to re-route data to a healthy endpoint on delivery failure, and data loss is therefore possible if the exporter's target remains unavailable once redelivery is exhausted. Due consideration needs to be given to the exporter queue and retry configuration when running in a highly elastic environment.
+
+- When using the `static` resolver and a target is unavailable, all the target's load-balanced telemetry will fail to be delivered until either the target is restored or removed from the static list. The same principle applies to the `dns` resolver.
+- When using `k8s`, `dns`, and likely future resolvers, topology changes are eventually reflected in the `loadbalancingexporter`. The `k8s` resolver will update more quickly than `dns`, but a window of time in which the true topology doesn't match the view of the `loadbalancingexporter` remains.
+
 ## Configuration

 Refer to [config.yaml](./testdata/config.yaml) for detailed examples on using the processor.
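
The queue and retry considerations described in the added section could be expressed roughly as follows. This is a minimal sketch, not part of the patch: it assumes the per-endpoint `otlp` block accepts the collector's standard `sending_queue` and `retry_on_failure` exporter options, and the hostname and all numeric values are hypothetical placeholders rather than recommendations.

```yaml
exporters:
  loadbalancing:
    routing_key: "service"
    protocol:
      otlp:
        timeout: 1s
        # Assumption: standard exporter helper options apply per endpoint.
        # Bounded retries: once max_elapsed_time passes, the data is dropped,
        # because the exporter does not re-route to a healthy endpoint.
        retry_on_failure:
          enabled: true
          initial_interval: 5s
          max_interval: 30s
          max_elapsed_time: 300s
        # Buffers data while a target is briefly unavailable (illustrative size).
        sending_queue:
          enabled: true
          num_consumers: 2
          queue_size: 1000
    resolver:
      dns:
        # Hypothetical headless service name.
        hostname: otelcol-backends.example.svc.cluster.local
        port: 4317
        # A shorter interval narrows the window in which the resolved
        # topology lags behind the true one, at the cost of more lookups.
        interval: 5s
        timeout: 1s
```

The trade-off in such a setup is between memory held in the queue and tolerance for topology lag: a longer `max_elapsed_time` rides out slower resolver updates, but keeps retrying against a target that may never return.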