
Add the docker-k8s-otel and solving-problems-with-o11y-cloud workshops #331

Merged · 10 commits · Jan 8, 2025
@@ -0,0 +1,21 @@
---
title: Connect to EC2 Instance
linkTitle: 1. Connect to EC2 Instance
weight: 1
time: 5 minutes
---

## Connect to your EC2 Instance

We’ve prepared an Ubuntu Linux instance in AWS/EC2 for each attendee.

Using the IP address and password provided by your instructor, connect to your EC2 instance
using one of the methods below:

* Mac OS / Linux
  * `ssh splunk@<IP address>` (see the example below)
* Windows 10+
  * Use the OpenSSH client
* Earlier versions of Windows
  * Use PuTTY
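
For example, on Mac OS or Linux the command might look like this (the IP address below is only a placeholder; use the one provided by your instructor):

``` bash
# Placeholder IP address; substitute the address provided by your instructor
ssh splunk@203.0.113.10
```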

@@ -0,0 +1,287 @@
---
title: Troubleshoot OpenTelemetry Collector Issues
linkTitle: 10. Troubleshoot OpenTelemetry Collector Issues
weight: 10
time: 20 minutes
---

In the previous section, we added the debug exporter to the collector configuration
and made it part of the pipeline for traces and logs. The debug output is now
written to the agent collector logs, as expected.

However, traces are no longer sent to o11y cloud. Let's figure out why and fix it.

## Review the Collector Config

Whenever a change to the collector config is made via a `values.yaml` file, it's helpful
to review the actual configuration applied to the collector by looking at the config map:

``` bash
kubectl describe cm splunk-otel-collector-otel-agent
```
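
If you'd prefer to see the raw YAML rather than the `describe` output, one optional alternative (not part of the workshop steps) is to print the config map directly:

``` bash
# Print the config map as raw YAML; an optional alternative to `kubectl describe`
kubectl get cm splunk-otel-collector-otel-agent -o yaml
```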

Let's review the traces pipeline in the agent collector config. It should look
like this:

``` yaml
pipelines:
  ...
  traces:
    exporters:
      - debug
    processors:
      - memory_limiter
      - k8sattributes
      - batch
      - resourcedetection
      - resource
      - resource/add_environment
    receivers:
      - otlp
      - jaeger
      - smartagent/signalfx-forwarder
      - zipkin
```

Do you see the problem? Only the debug exporter is included in the traces pipeline.
The `otlphttp` and `signalfx` exporters that were present in the configuration previously are gone.
This is why we no longer see traces in o11y cloud.

> How did we know which exporters were included before? To find out,
> we could have reverted our earlier customizations and then checked the config
> map to see what was in the traces pipeline originally. Alternatively, we can refer
> to the examples in the [GitHub repo for splunk-otel-collector-chart](https://github.com/signalfx/splunk-otel-collector-chart/blob/main/examples/default/rendered_manifests/configmap-agent.yaml),
> which show the default agent config used by the Helm chart.
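
Another way to see the chart's defaults, without reverting anything in our cluster, is to render the manifests locally with `helm template` and inspect the generated agent config map. This is only a sketch: it assumes the chart repo was added earlier in the workshop, and the `--set` values are placeholders needed just to render the chart:

``` bash
# Render the chart locally (nothing is applied to the cluster) and page through
# the generated manifests, including the agent config map.
helm template splunk-otel-collector \
  splunk-otel-collector-chart/splunk-otel-collector \
  --set clusterName=placeholder \
  --set splunkObservability.realm=us1 \
  --set splunkObservability.accessToken=placeholder | less
```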

## How Did the `otlphttp` and `signalfx` Exporters Get Removed?

Let's review the customizations we added to the `values.yaml` file:

``` yaml
...
agent:
  config:
    exporters:
      debug:
        verbosity: detailed
    service:
      pipelines:
        traces:
          exporters:
            - debug
        logs:
          exporters:
            - debug
          processors:
            - memory_limiter
            - batch
            - resourcedetection
            - resource
          receivers:
            - otlp
```

When we applied the `values.yaml` file to the collector using `helm upgrade`, the
custom configuration was merged with the previous collector configuration.
During this merge, mapping sections of the `yaml` configuration are combined key by key,
but sections that contain lists, such as the list of exporters in the pipeline section,
are replaced entirely by what we included in the `values.yaml` file (which was only the debug exporter).
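
As a simplified illustration (this is a sketch of the merge behavior, not the actual chart configuration), a list defined in the override replaces the original list wholesale:

``` yaml
# Simplified sketch of Helm value merging; not the actual chart values.
#
# Existing list:        Override list:
# exporters:            exporters:
#   - otlphttp            - debug
#   - signalfx
#
# Merged result: the override's list replaces the original; entries are not appended.
exporters:
  - debug
```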

## Let's Fix the Issue

When customizing an existing pipeline, we therefore need to fully redefine that part of the configuration.
Our `values.yaml` file should be updated as follows:

``` yaml
splunkObservability:
  realm: us1
  accessToken: ***
  infrastructureMonitoringEventsEnabled: true
clusterName: $INSTANCE-cluster
environment: otel-$INSTANCE
agent:
  config:
    exporters:
      debug:
        verbosity: detailed
    service:
      pipelines:
        traces:
          exporters:
            - otlphttp
            - signalfx
            - debug
        logs:
          exporters:
            - debug
          processors:
            - memory_limiter
            - batch
            - resourcedetection
            - resource
          receivers:
            - otlp
```

Let's apply the changes:

``` bash
helm upgrade splunk-otel-collector -f values.yaml \
  splunk-otel-collector-chart/splunk-otel-collector
```

And then check the agent config map:

``` bash
kubectl describe cm splunk-otel-collector-otel-agent
```

This time, we should see the full list of exporters defined in the traces pipeline:

``` yaml
pipelines:
  ...
  traces:
    exporters:
      - otlphttp
      - signalfx
      - debug
    processors:
      ...
```

## Reviewing the Log Output

The **Splunk Distribution of OpenTelemetry .NET** automatically exports logs enriched with tracing context
from applications that use `Microsoft.Extensions.Logging` for logging (which our sample app does).

Application logs are enriched with tracing metadata and then exported to a local instance of
the OpenTelemetry Collector in OTLP format.

Let's take a closer look at the logs that were captured by the debug exporter to see if that's happening.
To tail the collector logs, we can use the following command:

``` bash
kubectl logs -l component=otel-collector-agent -f
```

Once we're tailing the logs, we can use curl to generate some more traffic, as sketched below.
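The exact URL depends on how the sample app is exposed in your environment; the host and port below are placeholders, and the endpoint path is inferred from the `/hello` endpoint seen in the log output:

``` bash
# Placeholder URL; adjust the host and port to match how the
# helloworld service is exposed in your cluster.
curl http://localhost:8080/hello/Kubernetes
```

After generating some traffic, we should see something like the following in the collector logs: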

````
2024-12-20T21:56:30.858Z info Logs {"kind": "exporter", "data_type": "logs", "name": "debug", "resource logs": 1, "log records": 1}
2024-12-20T21:56:30.858Z info ResourceLog #0
Resource SchemaURL: https://opentelemetry.io/schemas/1.6.1
Resource attributes:
     -> splunk.distro.version: Str(1.8.0)
     -> telemetry.distro.name: Str(splunk-otel-dotnet)
     -> telemetry.distro.version: Str(1.8.0)
     -> os.type: Str(linux)
     -> os.description: Str(Debian GNU/Linux 12 (bookworm))
     -> os.build_id: Str(6.8.0-1021-aws)
     -> os.name: Str(Debian GNU/Linux)
     -> os.version: Str(12)
     -> host.name: Str(derek-1)
     -> process.owner: Str(app)
     -> process.pid: Int(1)
     -> process.runtime.description: Str(.NET 8.0.11)
     -> process.runtime.name: Str(.NET)
     -> process.runtime.version: Str(8.0.11)
     -> container.id: Str(5bee5b8f56f4b29f230ffdd183d0367c050872fefd9049822c1ab2aa662ba242)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.sdk.language: Str(dotnet)
     -> telemetry.sdk.version: Str(1.9.0)
     -> service.name: Str(helloworld)
     -> deployment.environment: Str(otel-derek-1)
     -> k8s.node.name: Str(derek-1)
     -> k8s.cluster.name: Str(derek-1-cluster)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope HelloWorldController
LogRecord #0
ObservedTimestamp: 2024-12-20 21:56:28.486804 +0000 UTC
Timestamp: 2024-12-20 21:56:28.486804 +0000 UTC
SeverityText: Information
SeverityNumber: Info(9)
Body: Str(/hello endpoint invoked by {name})
Attributes:
     -> name: Str(Kubernetes)
Trace ID: 78db97a12b942c0252d7438d6b045447
Span ID: 5e9158aa42f96db3
Flags: 1
{"kind": "exporter", "data_type": "logs", "name": "debug"}
````

In this example, we can see that the Trace ID and Span ID were automatically written to the log output
by the OpenTelemetry .NET instrumentation. This allows us to correlate logs with traces in
Splunk Observability Cloud.

You might remember, though, that when we deploy the OpenTelemetry Collector in a K8s cluster using Helm
and include the log collection option, the collector uses the Filelog receiver
to automatically capture any container logs.

This would result in duplicate logs being captured for our application. How do we avoid this?

## Avoiding Duplicate Logs in K8s

To avoid capturing duplicate logs, we have two options:

1. We can set the `OTEL_LOGS_EXPORTER` environment variable to `none`, to tell the Splunk Distribution of OpenTelemetry .NET to avoid exporting logs to the collector using OTLP.
2. We can manage log ingestion using annotations.

### Option 1

Setting the `OTEL_LOGS_EXPORTER` environment variable to `none` is straightforward; a sketch of how that might look in the deployment manifest is shown below.
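In this sketch the container spec is abbreviated, and the container name and image are placeholders based on our sample application:

``` yaml
# Sketch only: the container spec is abbreviated and the image is a placeholder.
spec:
  containers:
    - name: helloworld
      image: helloworld:latest
      env:
        - name: OTEL_LOGS_EXPORTER
          value: "none"
```

However, with this approach the Trace ID and Span ID are not written to the stdout logs generated by the application, which would prevent us from correlating logs with traces.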

To resolve this, we could define a custom logger, such as the example defined in
`/home/splunk/workshop/docker-k8s-otel/helloworld/SplunkTelemetryConfigurator.cs`.

We could include this in our application by updating the `Program.cs` file as follows:

``` cs
using SplunkTelemetry;
using Microsoft.Extensions.Logging.Console;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddControllers();

SplunkTelemetryConfigurator.ConfigureLogger(builder.Logging);

var app = builder.Build();

app.MapControllers();

app.Run();
```

### Option 2

Option 2 requires updating the deployment manifest for the application
to include an annotation. In our case, we would edit the `deployment.yaml` file to add the
`splunk.com/exclude` annotation as follows:

``` yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  selector:
    matchLabels:
      app: helloworld
  replicas: 1
  template:
    metadata:
      labels:
        app: helloworld
      annotations:
        splunk.com/exclude: "true"
    spec:
      containers:
      ...
```
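
After updating the manifest, we'd re-apply it so the annotation takes effect (assuming the file is named `deployment.yaml`, as referenced above):

``` bash
# Re-apply the updated deployment so the annotation takes effect
kubectl apply -f deployment.yaml
```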

Please refer to [Managing Log Ingestion by Using Annotations](https://docs.splunk.com/observability/en/gdi/opentelemetry/collector-kubernetes/kubernetes-config-logs.html#manage-log-ingestion-using-annotations)
for further details on this option.
18 changes: 18 additions & 0 deletions content/en/ninja-workshops/8-docker-k8s-otel/11-summary.md
@@ -0,0 +1,18 @@
---
title: Summary
linkTitle: 11. Summary
weight: 11
time: 2 minutes
---

This workshop provided hands-on experience with the following concepts:

* How to deploy the **Splunk Distribution of the OpenTelemetry Collector** on a Linux host.
* How to instrument a .NET application with the **Splunk Distribution of OpenTelemetry .NET**.
* How to "dockerize" a .NET application and instrument it with the **Splunk Distribution of OpenTelemetry .NET**.
* How to deploy the **Splunk Distribution of the OpenTelemetry Collector** in a Kubernetes cluster using Helm.
* How to customize the collector configuration and troubleshoot an issue.

To see how other languages and environments are instrumented with OpenTelemetry,
explore the [Splunk OpenTelemetry Examples GitHub repository](https://github.com/signalfx/splunk-opentelemetry-examples).
