
Commit

Merge branch 'main' into opentelemetrybot/auto-update-semantic-conventions-v1.26.0
svrnm authored May 23, 2024
2 parents 55e53dc + 1ca30b4 commit 4f07d74
Showing 204 changed files with 561 additions and 201 deletions.
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -42,3 +42,4 @@ content/en/docs/specs/ @open-telemetry/docs-approvers @open-te
content/en/docs/security/ @open-telemetry/docs-approvers @open-telemetry/sig-security-maintainers
content/en/ecosystem/demo/ @open-telemetry/demo-approvers @open-telemetry/demo-approvers
content/en/docs/contributing/ @open-telemetry/docs-approvers @open-telemetry/docs-maintainers
content/zh/ @open-telemetry/docs-zh-approvers
199 changes: 199 additions & 0 deletions content/en/blog/2024/otel-collector-container-log-parser/index.md
@@ -0,0 +1,199 @@
---
title: Introducing the new container log parser for OpenTelemetry Collector
linkTitle: Collector container log parser
date: 2024-05-22
author: '[Christos Markou](https://github.com/ChrsMark) (Elastic)'
cSpell:ignore: Christos containerd Filelog filelog Jaglowski kube Markou
---

[Filelog receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/filelogreceiver)
is one of the most commonly used components of the
[OpenTelemetry Collector](/docs/collector), as indicated by the most recent
[survey](/blog/2024/otel-collector-survey/#otel-components-usage). The same
survey also shows that, unsurprisingly,
[Kubernetes is the leading platform for Collector deployment (80.6%)](/blog/2024/otel-collector-survey/#deployment-scale-and-environment).
Taken together, these two facts highlight how important seamless log collection
is in Kubernetes environments.

Currently, the
[filelog receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.100.0/receiver/filelogreceiver/README.md)
is capable of parsing container logs from Kubernetes Pods, but it requires
[extensive configuration](https://github.com/open-telemetry/opentelemetry-helm-charts/blob/aaa70bde1bf8bf15fc411282468ac6d2d07f772d/charts/opentelemetry-collector/templates/_config.tpl#L206-L282)
to properly parse logs written in the various container runtime formats. The
reason is that container logs come in several known formats, depending on the
container runtime, so you need to perform a specific set of operations to parse
them properly:

1. Detect the format of the incoming logs at runtime.
2. Parse each format according to its specific characteristics. For example,
   determine whether it's JSON or plain text, and take the timestamp format
   into account.
3. Extract known metadata, relying on predefined patterns.

Such an advanced sequence of operations can be handled by chaining the proper
[stanza](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/stanza)
operators together, but the end result is rather complex, as the sketch below
illustrates. This configuration complexity can be mitigated by using the
corresponding
[helm chart preset](https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-collector#configuration-for-kubernetes-container-logs).
However, even with the preset, it can still be challenging for users to
maintain and troubleshoot such advanced configurations.
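
To get a sense of that complexity, here is a condensed, illustrative sketch of
the operator-chaining approach, loosely adapted from the helm chart preset. It
is not a drop-in configuration: the expressions, operator `id`s, and the
trailing metadata-extraction and cleanup steps are simplified, and the real
preset contains quite a few more operators.

```yaml
receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    include_file_path: true
    operators:
      # Route each line to a format-specific parser based on how it starts.
      - type: router
        id: get-format
        routes:
          - output: parser-docker
            expr: 'body matches "^\\{"'
          - output: parser-crio
            expr: 'body matches "^[^ Z]+ "'
          - output: parser-containerd
            expr: 'body matches "^[^ Z]+Z"'
      # cri-o: RFC 3339 timestamp with a numeric UTC offset.
      - type: regex_parser
        id: parser-crio
        regex: '^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$'
        output: extract-metadata-from-filepath
        timestamp:
          parse_from: attributes.time
          layout_type: gotime
          layout: '2006-01-02T15:04:05.999999999Z07:00'
      # containerd: RFC 3339 timestamp ending in "Z".
      - type: regex_parser
        id: parser-containerd
        regex: '^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$'
        output: extract-metadata-from-filepath
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
      # Docker JSON log driver: the whole line is a JSON object.
      - type: json_parser
        id: parser-docker
        output: extract-metadata-from-filepath
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
      # Pull Kubernetes metadata out of the log file path, then restore the
      # log body. The real preset adds recombine and attribute-renaming
      # operators on top of this.
      - type: regex_parser
        id: extract-metadata-from-filepath
        parse_from: attributes["log.file.path"]
        regex: '^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$'
      - type: move
        from: attributes.log
        to: body
```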

The community has raised the issue of
[improving the Kubernetes Logs Collection Experience](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/25251)
in the past. One step towards achieving this would be to provide a simplified
and robust option for parsing container logs without the need for manual
specification or maintenance of the implementation details. With the proposal
and implementation of the new
[container parser](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31959),
all these implementation details are encapsulated and handled within the
parser itself. Add to this the ability to cover the implementation with unit
tests and various fallback paths, and the result is a significant improvement
in container log parsing.

## What container logs look like

First of all, let's quickly recall the different container log formats you
might encounter out there:

- Docker container logs:

`{"log":"INFO: This is a docker log line","stream":"stdout","time":"2024-03-30T08:31:20.545192187Z"}`

- cri-o logs:

`2024-04-13T07:59:37.505201169-05:00 stdout F This is a cri-o log line!`

- Containerd logs:

`2024-04-22T10:27:25.813799277Z stdout F This is an awesome containerd log line!`

Notice that the cri-o and containerd log formats are quite similar (both follow
the CRI logging format), with only a small difference in the timestamp format.

To properly handle these three different formats, you need three different
routes of
[stanza](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/stanza)
operators, as shown in the
[container parser operator issue](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31959).

In addition, the CRI format can produce partial logs, which you first need to
recombine into a single entry:

```text
2024-04-06T00:17:10.113242941Z stdout P This is a very very long line th
2024-04-06T00:17:10.113242941Z stdout P at is really really long and spa
2024-04-06T00:17:10.113242941Z stdout F ns across multiple log entries
```

Ideally, you would like the parser to automatically detect the format at
runtime and parse the log lines accordingly. As we will see later, the
container parser does exactly that.

## Attribute handling

Container log files follow a specific naming pattern from which you can extract
useful metadata during parsing. For example, from
`/var/log/pods/kube-system_kube-scheduler-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d3/kube-scheduler/1.log`,
you can extract the namespace, the name and UID of the pod, the name of the
container, and the container's restart count.

After extracting this metadata, you need to store it in the appropriate
attributes, following the
[Semantic Conventions](/docs/specs/semconv/resource/k8s/). This handling can
also be encapsulated within the parser's implementation, eliminating the need
for users to define it manually.
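
Mapping the path layout to the semantic convention attribute names (the same
names that show up in the parsed output later in this post), the pattern looks
roughly like this:

```text
/var/log/pods/<k8s.namespace.name>_<k8s.pod.name>_<k8s.pod.uid>/<k8s.container.name>/<k8s.container.restart_count>.log
```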

## Using the new container parser

With all this in mind, the container parser can be configured like this:

```yaml
receivers:
  filelog:
    include_file_path: true
    include:
      - /var/log/pods/*/*/*.log
    operators:
      - id: container-parser
        type: container
```

That configuration is more than enough to properly parse the log lines and
extract all the useful Kubernetes metadata. It's quite obvious how much less
configuration is required now: achieving the same with a combination of
operators takes about 69 lines of configuration, as pointed out in the
[original proposal](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31959).
A log line
`{"log":"INFO: This is a docker log line","stream":"stdout","time":"2024-03-30T08:31:20.545192187Z"}`
that is written at
`/var/log/pods/kube-system_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log`
will produce a log entry like the following:

```json
{
"timestamp": "2024-03-30 08:31:20.545192187 +0000 UTC",
"body": "INFO: This is a docker log line",
"attributes": {
"time": "2024-03-30T08:31:20.545192187Z",
"log.iostream": "stdout",
"k8s.pod.name": "kube-controller-kind-control-plane",
"k8s.pod.uid": "49cc7c1fd3702c40b2686ea7486091d6",
"k8s.container.name": "kube-controller",
"k8s.container.restart_count": "1",
"k8s.namespace.name": "kube-system",
"log.file.path": "/var/log/pods/kube-system_kube-controller-kind-control-plane_49cc7c1fd3702c40b2686ea7486091d6/kube-controller/1.log"
}
}
```

Notice that you don't have to define the format: the parser automatically
detects it and parses the logs accordingly. Even partial logs that the cri-o or
containerd runtimes can produce are recombined properly, without the need for
any special configuration.
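
For example, the three partial CRI lines shown earlier end up as a single log
record whose body contains the complete message:

```text
This is a very very long line that is really really long and spans across multiple log entries
```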

This is really handy because, as a user, you don't need to worry about
specifying the format or maintaining different configurations for different
environments.
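
If you want to try this end to end, a minimal Collector configuration could
look like the following sketch. The `debug` exporter is used here only to print
the parsed records to the console; swap in whatever exporter your backend
requires.

```yaml
receivers:
  filelog:
    include_file_path: true
    include:
      - /var/log/pods/*/*/*.log
    operators:
      - id: container-parser
        type: container

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
```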

## Implementation details

To implement this parser operator, most of the code was written from scratch,
but we were able to reuse the recombine operator internally for partial log
parsing. This required some small refactoring, but it gave us the opportunity
to reuse an already existing and well-tested component.

During the discussions around the implementation of this feature, a question
popped up: _Why implement this as an operator and not as a processor?_

One basic reason is that the order in which log records arrive at processors is
not guaranteed, and that ordering is exactly what we need in order to handle
partial log parsing properly. That's why implementing it as an operator was,
for now, the way to go. Moreover, at the moment
[it is suggested](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/32080#issuecomment-2035301178)
to do as much work as possible during collection, and robust parsing
capabilities make that possible.

You can find more information about the implementation discussions in the
respective
[GitHub issue](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31959)
and its linked pull requests.

Last but not least, the container parser is a good example of the room for
improvement that still exists, and of how we can further optimize for popular
technologies with known log formats in the future.

## Conclusion: container log parsing is now easier with the filelog receiver

Eager to learn more about the container parser? Visit the official
[documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/container.md),
and if you give it a try, let us know what you think. Don't hesitate to reach
out to us in the official CNCF [Slack workspace](https://slack.cncf.io/),
specifically in the `#otel-collector` channel.

## Acknowledgements

Kudos to [Daniel Jaglowski](https://github.com/djaglowski) for reviewing the
parser's implementation and providing valuable feedback!
2 changes: 1 addition & 1 deletion content/en/docs/collector/_index.md
@@ -28,7 +28,7 @@ export their telemetry data.
- _Observability_: An exemplar of an observable service.
- _Extensibility_: Customizable without touching the core code.
- _Unification_: Single codebase, deployable as an agent or collector with
support for traces, metrics, and logs (future).
support for traces, metrics, and logs.

## When to use a collector

10 changes: 7 additions & 3 deletions content/en/docs/collector/building/receiver.md
@@ -744,7 +744,9 @@ func (tailtracerRcvr *tailtracerReceiver) Start(ctx context.Context, host compon
}

func (tailtracerRcvr *tailtracerReceiver) Shutdown(ctx context.Context) error {
tailtracerRcvr.cancel()
if tailtracerRcvr.cancel != nil {
tailtracerRcvr.cancel()
}
return nil
}
```
@@ -756,7 +758,7 @@ func (tailtracerRcvr *tailtracerReceiver) Shutdown(ctx context.Context) error {
function field with the cancellation based on a new context created with
`context.Background()` (according to the Collector's API documentation
suggestions).
- Updated the `Stop()` method by adding a call to the `cancel()` context
- Updated the `Shutdown()` method by adding a call to the `cancel()` context
cancellation function.

{{% /alert %}}
@@ -829,7 +831,9 @@ func (tailtracerRcvr *tailtracerReceiver) Start(ctx context.Context, host compon
}

func (tailtracerRcvr *tailtracerReceiver) Shutdown(ctx context.Context) error {
tailtracerRcvr.cancel()
if tailtracerRcvr.cancel != nil {
tailtracerRcvr.cancel()
}
return nil
}
```
10 changes: 10 additions & 0 deletions content/zh/docs/concepts/_index.md
@@ -0,0 +1,10 @@
---
title: OpenTelemetry Concepts
linkTitle: Concepts
description: Core concepts of OpenTelemetry
aliases: [concepts/overview]
weight: 2
---

In this section, you will learn about the data sources and key components of
the OpenTelemetry project. This will help you understand how OpenTelemetry
works.
110 changes: 110 additions & 0 deletions content/zh/docs/concepts/components.md
@@ -0,0 +1,110 @@
---
title: Components
description: The main components that make up OpenTelemetry
aliases: [data-collection]
weight: 20
---

The OpenTelemetry project currently consists of the following major components:

- [Specification](#specification)
- [Collector](#collector)
- [Language-specific API & SDK implementations](#language-specific-api--sdk-implementations)
  - [Instrumentation libraries](#instrumentation-libraries)
  - [Exporters](#exporters)
  - [Zero-code instrumentation](#zero-code-instrumentation)
  - [Resource detectors](#resource-detectors)
  - [Cross-service propagators](#cross-service-propagators)
  - [Samplers](#sampler)
- [K8s Operator](#k8s-operator)
- [Function as a Service (FaaS) assets](#function-as-a-service-assets)

OpenTelemetry lets you generate and export telemetry data without having to use
vendor-specific SDKs and tools.

## Specification {#specification}

This section describes the cross-language requirements and expectations for all
implementations. Beyond the definition of terms, the specification defines the
following:

- **API:** Defines data types and operations for generating and correlating
  tracing, metrics, and logging data.
- **SDK:** Defines requirements for a language-specific implementation of the
  API. Configuration, data processing, and exporting concepts are also defined
  here.
- **Data:** Defines the OpenTelemetry Protocol (OTLP) and vendor-agnostic
  semantic conventions that a telemetry backend can provide support for.

For more information, see the [specification](/docs/specs/).

## Collector

The OpenTelemetry Collector is a vendor-agnostic proxy that can receive,
process, and export telemetry data. It supports receiving telemetry data in
multiple formats (for example, OTLP, Jaeger, Prometheus, as well as many
commercial/proprietary tools) and sending data to one or more backends. It also
supports processing and filtering telemetry data before it gets exported.

For more information, see the [Collector](/docs/collector/).

## Language-specific API & SDK implementations {#language-specific-api--sdk-implementations}

OpenTelemetry also has language SDKs that let you use the OpenTelemetry API to
generate telemetry data in your language of choice and export that data to a
preferred backend. These SDKs also let you incorporate instrumentation
libraries for common libraries and frameworks, which you can use for manual
instrumentation in your application.

For more information, see [Instrumentation](/docs/concepts/instrumentation/).

### Instrumentation libraries {#instrumentation-libraries}

OpenTelemetry supports a large number of components that generate relevant
telemetry data from popular libraries and frameworks for the supported
languages. For example, inbound and outbound HTTP requests from an HTTP library
generate data about those requests.

It is a long-term goal that popular libraries are observable out of the box,
without requiring a separate component to be pulled in.

For more information, see
[Instrumentation libraries](/docs/concepts/instrumentation/libraries/).

### Exporters {#exporters}

{{% docs/languages/exporters/intro %}}

### Zero-code instrumentation {#zero-code-instrumentation}

Where applicable, the language-specific implementations of OpenTelemetry
provide a way to instrument your application without touching the source code.
While the underlying mechanism depends on the language, at a minimum this adds
the OpenTelemetry API and SDK capabilities to your application. Additionally,
it may add a set of instrumentation libraries and exporter dependencies.

For more information, see
[Zero-code instrumentation](/docs/concepts/instrumentation/zero-code/).

### Resource detectors {#resource-detectors}

A [resource](/docs/concepts/resources/) represents the entity producing
telemetry as resource attributes. For example, a process producing telemetry
that runs in a container on Kubernetes has a Pod name, a namespace, and
possibly a Deployment name. All three of these attributes can be included in
the resource.

The language-specific implementations of OpenTelemetry provide resource
detection from the `OTEL_RESOURCE_ATTRIBUTES` environment variable and from
many common entities, such as the process runtime, service, host, or operating
system.

For more information, see [Resources](/docs/concepts/resources/).

### Cross-service propagators {#cross-service-propagators}

Propagation is the mechanism used to pass information across service and
process boundaries. Although not limited to tracing, it is what allows traces
to build causal information about a system across services that cross process
and network boundaries.

For the vast majority of use cases, context propagation is done for you by
instrumentation libraries. If needed, however, you can use `Propagators`
yourself to serialize and deserialize cross-cutting concerns, such as the
context of a span and [baggage](/docs/concepts/signals/baggage/).

### Samplers {#sampler}

Sampling is the process of restricting the number of traces generated by a
system. The language-specific implementations offer several
[head samplers](/docs/concepts/sampling/#head-sampling).

For more information, see [Sampling](/docs/concepts/sampling).

## K8s Operator

The OpenTelemetry Operator is an implementation of a Kubernetes Operator. The
Operator manages the OpenTelemetry Collector as well as auto-instrumentation of
workloads using OpenTelemetry.

For more information, see the [K8s Operator](/docs/kubernetes/operator/).

## Function as a Service assets {#function-as-a-service-assets}

OpenTelemetry supports various methods of monitoring Function-as-a-Service
offerings from different cloud vendors. The OpenTelemetry community currently
provides pre-built Lambda layers that can automatically instrument your
application, as well as the option of a standalone Collector Lambda layer that
can be used when instrumenting applications manually or automatically.

For more information, see [Functions as a Service](/docs/faas/).
2 changes: 1 addition & 1 deletion data/registry/collector-builder.yml
@@ -17,5 +17,5 @@ createdAt: 2023-12-18
package:
registry: go
name: go.opentelemetry.io/collector/cmd/builder
version: v0.100.0
version: v0.101.0
quickInstall: false
2 changes: 1 addition & 1 deletion data/registry/collector-exporter-alertmanager.yml
@@ -17,7 +17,7 @@ authors:
package:
registry: go-collector
name: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/alertmanagerexporter
version: v0.100.0
version: v0.101.0
urls:
repo: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/exporter/alertmanagerexporter
createdAt: 2023-12-05
@@ -17,4 +17,4 @@ createdAt: 2021-02-24
package:
registry: go-collector
name: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/alibabacloudlogserviceexporter
version: v0.100.0
version: v0.101.0
2 changes: 1 addition & 1 deletion data/registry/collector-exporter-aws-xray.yml
@@ -16,4 +16,4 @@ createdAt: 2020-06-06
package:
registry: go-collector
name: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsxrayexporter
version: v0.100.0
version: v0.101.0
