---
title: Kubernetes annotation-based discovery for the OpenTelemetry Collector
linkTitle: Kubernetes annotation discovery
date: 2025-01-20
author: >
  [Dmitrii Anoshin](https://github.com/dmitryax) (Cisco/Splunk), [Christos
  Markou](https://github.com/ChrsMark) (Elastic)
cSpell:ignore: Anoshin Dmitrii Markou
---

In the world of containers and Kubernetes, observability is crucial. Users need
to know the status of their workloads at any given time. In other words, they
need observability into moving targets.

This is where the [OpenTelemetry Collector](/docs/collector) and its
[receiver creator](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.117.0/receiver/receivercreator)
component come in handy. Users can set up fairly complex monitoring scenarios
with a self-service approach, following the principle of least privilege at the
cluster level.

The self-service approach is great, but how much self-service can it actually
be?
In this blog post, we will explore a newly added feature of the Collector that
makes dynamic workload discovery even easier, providing a seamless experience
for both administrators and users.

## What is autodiscovery in observability?

Applications running on containers and pods become moving targets for the
monitoring system. With autodiscovery, monitoring agents like the Collector can
track changes at the container and pod levels and dynamically adjust the
monitoring configuration.

Today, the Collector, and specifically the receiver creator, can provide such
an experience. Using the receiver creator, observability users can define
configuration "templates" that rely on environment conditions. For example, as
an observability engineer, I can configure my Collector to enable the
[Redis receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.117.0/receiver/redisreceiver)
when a Redis pod is deployed on the cluster. The following configuration can
achieve this:

```yaml
receivers:
  receiver_creator:
    watch_observers: [k8s_observer]
    receivers:
      redis:
        rule: type == "port" && port == 6379
        config:
          collection_interval: '15s'
```

The receiver defined above is enabled whenever a pod that exposes port 6379
(the well-known Redis port) is discovered through the Kubernetes API.
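
For context, here is one way this snippet might fit into a complete Collector
configuration. This is a minimal sketch: the `k8s_observer` extension settings,
the `debug` exporter, and the pipeline wiring are illustrative assumptions, not
part of the original example.

```yaml
extensions:
  # The Kubernetes observer watches the API server for pods (assumed settings).
  k8s_observer:
    auth_type: serviceAccount
    observe_pods: true

receivers:
  receiver_creator:
    watch_observers: [k8s_observer]
    receivers:
      redis:
        rule: type == "port" && port == 6379
        config:
          collection_interval: '15s'

exporters:
  # Prints collected telemetry to stdout; replace with your real exporter.
  debug: {}

service:
  extensions: [k8s_observer]
  pipelines:
    metrics:
      receivers: [receiver_creator]
      exporters: [debug]
```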

This is great, and as an SRE or platform engineer managing an observability
solution, you can rely on this to meet your users' needs for monitoring Redis
workloads. However, what happens if another team wants to monitor a different
type of workload, such as NGINX servers? They would need to inform your team,
and you would need to update the configuration with a new conditional
configuration block, take it through a pull request and review process, and
finally deploy it. This deployment would require the Collector instances to
restart for the new configuration to take effect. While this process might not
be a big deal for some teams, there is definitely room for improvement.

So, what if, as a Collector user, you could simply enable automatic discovery
and then let your cluster users tell the Collector how their workloads should
be monitored by annotating their pods properly? That sounds awesome, and it's
not actually something new. OpenTelemetry already supports
[auto-instrumentation through the Operator](https://opentelemetry.io/docs/kubernetes/operator/automatic/),
allowing users to instrument their applications automatically just by
annotating their pods. In addition, this is a feature that other monitoring
agents in the observability industry already support, and users are familiar
with it.

All this motivation led the OpenTelemetry community
([GitHub issue](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/17418))
to create a similar feature for the Collector. We are happy to share that
autodiscovery based on Kubernetes annotations is now supported in the Collector
([GitHub issue](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/34427))!

## The solution

The solution is built on top of the existing functionality provided by the
[Kubernetes observer](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.117.0/extension/observer/k8sobserver)
and the
[receiver creator](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.117.0/receiver/receivercreator).
The Kubernetes observer notifies the receiver creator about objects appearing
in the cluster and provides all the information about them. In addition to the
Kubernetes object metadata, the observer supplies information about the
discovered endpoints that the Collector can connect to. This means that each
discovered endpoint can potentially be used by a particular Collector scraping
receiver to fetch metrics data.

Each scraping receiver has a default configuration with only one required
field: `endpoint`. Given that the endpoint information is provided by the
Kubernetes observer, the only information the user needs to provide explicitly
is which receiver/scraper should be used to scrape data from a discovered
endpoint. That information can be configured on the Collector, but as mentioned
before, this is inconvenient. A much more convenient place to define which
receiver should be used to scrape telemetry from a particular pod is the pod
itself: a pod's annotations are the natural place to put that kind of detail.
Given that the receiver creator has access to the annotations, it can
instantiate the proper receiver with the receiver's default configuration and
the discovered endpoint.

The following annotation instructs the receiver creator that this particular
pod runs NGINX, and that the
[NGINX receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.117.0/receiver/nginxreceiver)
can be used to scrape metrics from it:

```yaml
io.opentelemetry.discovery.metrics/scraper: nginx
```

Apart from that, discovery needs to be explicitly enabled on the pod with the
following annotation:

```yaml
io.opentelemetry.discovery.metrics/enabled: 'true'
```

In some scenarios, the default receiver’s configuration is not suitable for
connecting to a particular pod. In that case, it’s possible to define custom
configuration as part of another annotation:

```yaml
io.opentelemetry.discovery.metrics/config: |
  endpoint: "http://`endpoint`/nginx_status"
  collection_interval: '20s'
  initial_delay: '20s'
  read_buffer_size: '10'
```

It's important to mention that the configuration defined in the annotations
cannot point the receiver creator to another pod; the Collector will reject
such configurations.
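
Putting the metrics annotations together, a user might deploy an NGINX pod like
the following. This is a sketch: the pod name, image, and port are illustrative
assumptions, while the annotation keys and values come from the examples above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-demo # illustrative name
  annotations:
    # Opt this pod into metrics discovery.
    io.opentelemetry.discovery.metrics/enabled: 'true'
    # Tell the receiver creator which scraper to instantiate.
    io.opentelemetry.discovery.metrics/scraper: nginx
    # Optional: override the scraper's default configuration.
    io.opentelemetry.discovery.metrics/config: |
      endpoint: "http://`endpoint`/nginx_status"
      collection_interval: '20s'
spec:
  containers:
    - name: nginx
      image: nginx:1.27 # illustrative image tag
      ports:
        - containerPort: 80
```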

In addition to metrics scraping, annotation-based discovery also supports log
collection with the
[filelog receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.117.0/receiver/filelogreceiver).
The following annotation can be used to enable log collection on a particular
pod:

```yaml
io.opentelemetry.discovery.logs/enabled: 'true'
```

Similar to metrics, an optional configuration can be provided in the following
form:

```yaml
io.opentelemetry.discovery.logs/config: |
  max_log_size: "2MiB"
  operators:
    - type: container
      id: container-parser
    - type: regex_parser
      regex: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<sev>[A-Z]*) (?P<msg>.*)$'
```

Note that if the set of filelog receiver operators needs to be changed, the
full list, including the default container parser, has to be redefined: list
configuration fields are entirely replaced when merged into the default
configuration struct.
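
For example, a pod that opts into log collection with a custom operator list
might be annotated like this (a minimal sketch; the pod name and image are
illustrative assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app # illustrative name
  annotations:
    io.opentelemetry.discovery.logs/enabled: 'true'
    # Redefining operators replaces the whole list, so the default
    # container parser must be included explicitly.
    io.opentelemetry.discovery.logs/config: |
      operators:
        - type: container
          id: container-parser
spec:
  containers:
    - name: my-app
      image: my-app:latest # illustrative image
```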

Finally, the discovery functionality has to be explicitly enabled in the
receiver creator by adding the following configuration field:

```yaml
receivers:
  receiver_creator:
    watch_observers: [k8s_observer]
    discovery:
      enabled: true
```

## Wrapping up

If you are an OpenTelemetry Collector user on Kubernetes and you find this new
feature interesting, go ahead and visit the
[receiver creator documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/v0.117.0/receiver/receivercreator)
to learn more! And if you give it a try, let us know what you think. Don't
hesitate to reach out to us in the official CNCF
[Slack workspace](https://slack.cncf.io/), specifically the `#otel-collector`
channel.
