-
Notifications
You must be signed in to change notification settings - Fork 521
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[release-v2.6] [DOC] Release notes for 2.6 (#4050)
Co-authored-by: Kim Nylander <[email protected]>
- Loading branch information
1 parent
921cefd
commit 40325de
Showing
2 changed files
with
373 additions
and
3 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,247 @@ | ||
--- | ||
title: Version 2.6 release notes | ||
menuTitle: V2.6 | ||
description: Release notes for Grafana Tempo 2.6 | ||
weight: 30 | ||
--- | ||
|
||
# Version 2.6 release notes | ||
|
||
The Tempo team is pleased to announce the release of Tempo 2.5. | ||
|
||
This release gives you: | ||
|
||
* Additions to the TraceQL language, including the ability to search by span events, links, and arrays | ||
* Additions to TraceQL metric query-types including a compare function and the ability to do instant queries (which will return faster than range queries). | ||
* Performance and stability enhancements | ||
|
||
<!-- add link when blog is live | ||
Read the [Tempo 2.6 blog post](https://grafana.com/blog/2024/06/03/grafana-tempo-2.5-release-vparquet4-streaming-endpoints-and-more-metrics/) for more examples and details about these improvements. | ||
--> | ||
|
||
These release notes highlight the most important features and bugfixes. For a complete list, refer to the [Tempo changelog](https://github.com/grafana/tempo/releases). | ||
|
||
<!-- add video from blog {{< youtube id=" " >}} --> | ||
|
||
## Features and enhancements | ||
|
||
The most important features and enhancements in Tempo 2.6 are highlighted below. | ||
|
||
### Additional TraceQL metrics (experimental) | ||
|
||
In this release, we’ve added several [TraceQL metrics](https://grafana.com/docs/tempo/latest/operations/traceql-metrics/). In Tempo 2.6, TraceQL metrics adds: | ||
|
||
* Exemplars [[PR 3824](https://github.com/grafana/tempo/pull/3824), [documentation](https://grafana.com/docs/tempo/next/traceql/metrics-queries/#exemplars)] | ||
* Instant metrics queries using `/api/metrics/query` [[PR 3859](https://github.com/grafana/tempo/pull/3859), [documentation](https://grafana.com/docs/tempo/next/api_docs/#traceql-metrics)] | ||
* A `q` parameter to tag-name filtering the search v2 API [[PR 3822](https://github.com/grafana/tempo/pull/3822), [documentation](https://grafana.com/docs/tempo/next/api_docs/#traceql-metrics)] | ||
* A new `compare()` metrics function [[PR 3695](https://github.com/grafana/tempo/pull/3695), [documentation](https://grafana.com/docs/tempo/latest/traceql/metrics-queries/#functions)] | ||
|
||
Additionally, we're working on refactoring the replication factor. Refer to the [Operational change for TraceQL metrics](#operational-change-for-traceql-metrics) section for details. | ||
|
||
Note that using TraceQL metrics may require additional system resources. | ||
|
||
For more information, refer to the [TraceQL metrics queries](https://grafana.com/docs/tempo/latest/traceql/metrics-queries/) and [Configure TraceQL metrics](https://grafana.com/docs/tempo/latest/operations/traceql-metrics/). | ||
|
||
### TraceQL improvements | ||
|
||
Unique to Tempo, TraceQL is a query language that lets you perform custom queries into your tracing data. To learn more about the TraceQL syntax, refer to the [TraceQL documentation](https://grafana.com/docs/tempo/latest/traceql/). | ||
|
||
We’ve added event attributes and link scopes. Like spans, they both have instrinsics and attributes. | ||
|
||
The `event` scope lets you query events that happen within a span. A span event is a unique point in time during the span’s duration. While spans help build the structural hierarchy of your services, span events can provide a deeper level of granularity to help debug your application faster and maintain optimal performance. To learn more about how you can use span events, read the [What are span events?](https://grafana.com/blog/2024/08/15/all-about-span-events-what-they-are-and-how-to-query-them/) blog post. [PRs [3708](https://github.com/grafana/tempo/pull/3708), [3708](https://github.com/grafana/tempo/pull/3748), [3908](https://github.com/grafana/tempo/pull/3908)] | ||
|
||
If you've instrumented your traces for span links, you can use the `link` scope to search for an attribute within a span link. A span link associates one span with one or more other spans. [PRs [3814](https://github.com/grafana/tempo/pull/3814), [3741](https://github.com/grafana/tempo/pull/3741)] | ||
|
||
For more information on span links, refer to the [Span Links](https://opentelemetry.io/docs/concepts/signals/traces/#span-links) documentation in the Open Telemetry project. | ||
|
||
You can search for an attribute in your link: | ||
|
||
``` | ||
{ link.opentracing.ref_type = "child_of" } | ||
``` | ||
|
||
![A TraceQL example showing `link` scope](/media/docs/grafana/data-sources/tempo/query-editor/traceql-link-example.png) | ||
|
||
We’ve also added autocomplete support for `events` and `links`. [[PR 3846](https://github.com/grafana/tempo/pull/3846)] | ||
|
||
Tempo 2.6 improves TraceQL performance with these updates: | ||
|
||
* Performance improvement for `rate() by ()` queries [[PR 3719](https://github.com/grafana/tempo/pull/3719)] | ||
* Add caching to query range queries [[PR 3796](https://github.com/grafana/tempo/pull/3796)] | ||
* Only stream diffs on metrics queries [[PR 3808](https://github.com/grafana/tempo/pull/3808)] | ||
* Tag value lookup use protobuf internally for improved latency [[PR 3731](https://github.com/grafana/tempo/pull/3731)] | ||
* TraceQL metrics queries use protobuf internally for improved latency [[PR 3745](https://github.com/grafana/tempo/pull/3745)] | ||
* TraceQL search and other endpoints use protobuf internally for improved latency and resource usage [[PR 3944](https://github.com/grafana/tempo/pull/3944)] | ||
* Add local disk caching of metrics queries in local-blocks processor [[PR 3799](https://github.com/grafana/tempo/pull/3799)] | ||
* Performance improvement for queries using trace-level intrinsics [[PR 3920](https://github.com/grafana/tempo/pull/3920)] | ||
* Use multiple goroutines to unmarshal responses in parallel in the query frontend. [[PR 3713](https://github.com/grafana/tempo/pull/3713)] | ||
|
||
### Native histogram support | ||
|
||
The metrics-generator can produce native histograms for high-resolution data. [PR 3789](https://github.com/grafana/tempo/pull/3789) | ||
|
||
Native histograms are a data type in Prometheus that can produce, store, and query high-resolution histograms of observations. It usually offers higher resolution and more straightforward instrumentation than classic histograms. | ||
|
||
To learn more, refer to the [Native histogram](https://grafana.com/docs/tempo/<TEMPO_VERSION>/metrics-generator/#native-histograms) documentation. | ||
|
||
### Performance improvements | ||
|
||
One of our major improvements in Tempo 2.6 is the reduction of memory usage due to polling improvements. [PRs [3950](https://github.com/grafana/tempo/pull/3950), [3951](https://github.com/grafana/tempo/pull/3951), [3952](https://github.com/grafana/tempo/pull/3952]) | ||
|
||
![Comparison graph showing the reduction of memory usage due to the polling improvements](/media/docs/tempo/tempo-2-6-poll-improvement.png) | ||
|
||
This improvement is a result of some of these changes: | ||
|
||
* Add data quality metric to measure traces without a root [[PR 3812](https://github.com/grafana/tempo/pull/3812)] | ||
* Reduce memory consumption of query-frontend [[PR 3888](https://github.com/grafana/tempo/pull/3888)] | ||
* Reduce allocs of caching middleware [[PR 3976](https://github.com/grafana/tempo/pull/3976)] | ||
* Reduce allocs building queriers sharded requests [[PR 3932](https://github.com/grafana/tempo/pull/3932)] | ||
* Improve trace id lookup from Tempo Vulture by selecting a date range [[PR 3874](https://github.com/grafana/tempo/pull/3874)] | ||
|
||
### Other enhancements and improvements | ||
|
||
This release also has these notable updates: | ||
|
||
* Bring back OTel receiver metrics. [[PR 3917](https://github.com/grafana/tempo/pull/3917)] | ||
* Add a `q` parameter to `/api/v2/search/tags` for tag name filtering. [[PR 3822](https://github.com/grafana/tempo/pull/3822)] | ||
* Add middleware to block matching URLs. [[PR 3963](https://github.com/grafana/tempo/pull/3963)] | ||
* Add data quality metric to measure traces without a root. [[PR 3812](https://github.com/grafana/tempo/pull/3812)] | ||
* Implement polling tenants concurrently. [[PR 3647](https://github.com/grafana/tempo/pull/3647)] | ||
* Add [native histograms](https://grafana.com/docs/tempo/next/metrics-generator/#native-histograms) for internal metrics [[PR 3870](https://github.com/grafana/tempo/pull/3870)] | ||
* Add a Tempo CLI command to drop traces by id by rewriting blocks. [[PR 3856](https://github.com/grafana/tempo/pull/3856), [documentation](https://grafana.com/docs/tempo/next/operations/tempo_cli/#drop-trace-by-id)] | ||
* Add new OTel compatible Traces API V2. [[PR 3912](https://github.com/grafana/tempo/pull/3912), [documentation](https://grafana.com/docs/tempo/next/api_docs/#query-v2)] | ||
* Rename `Batches` to `ResourceSpans`. [[PR 3895](https://github.com/grafana/tempo/pull/3895)] | ||
|
||
## Upgrade considerations | ||
|
||
When [upgrading](https://grafana.com/docs/tempo/latest/setup/upgrade/) to Tempo 2.6, be aware of these considerations and breaking changes. | ||
|
||
### Operational change for TraceQL metrics | ||
|
||
We've changed to an RF1 (Replication Factor 1) pattern for TraceQL metrics as we were unable to hit performance goals for RF3 de-duplication. This requires some operational changes to query TraceQL metrics. | ||
|
||
TraceQL metrics are still considered experimental. We hope to mark them GA soon when we productionize a complete RF1 write-read path. [PRs [3628](https://github.com/grafana/tempo/pull/3628), [3691]([https://github.com/grafana/tempo/pull/3691](https://github.com/grafana/tempo/pull/3691)), [3723]([https://github.com/grafana/tempo/pull/3723](https://github.com/grafana/tempo/pull/3723)), [3995]([https://github.com/grafana/tempo/pull/3995](https://github.com/grafana/tempo/pull/3995))] | ||
|
||
**For recent data** | ||
|
||
The local-blocks processor must be enabled to start using metrics queries like `{ } | rate()`. If not enabled metrics queries fail with the error `localblocks processor not found`. Enabling the local-blocks processor can be done either per tenant or in all tenants. | ||
|
||
* Per-tenant in the per-tenant overrides: | ||
|
||
```yaml | ||
overrides: | ||
'tenantID': | ||
metrics_generator_processors: | ||
- local-blocks | ||
``` | ||
* By default, for all tenants in the main config: | ||
```yaml | ||
overrides: | ||
defaults: | ||
metrics_generator: | ||
processors: [local-blocks] | ||
``` | ||
Add this configuration to run TraceQL metrics queries against all spans (and not just server spans): | ||
```yaml | ||
metrics_generator: | ||
processor: | ||
local_blocks: | ||
filter_server_spans: false | ||
``` | ||
**For historical data** | ||
To run metrics queries on historical data, you must configure the local-blocks processor to flush rf1 blocks to object storage: | ||
```yaml | ||
metrics_generator: | ||
processor: | ||
local_blocks: | ||
flush_to_storage: true | ||
``` | ||
### Transition to vParquet4 | ||
vParquet4 format is now the default block format. | ||
It's production ready and we highly recommend switching to it for improved query performance. [PR [3810](https://github.com/grafana/tempo/pull/3810)] | ||
Upgrading to Tempo 2.6 modifies the Parquet block format. | ||
Although you can use Tempo 2.6 with vParquet2 or vParquet3, you can only use Tempo 2.6 with vParquet3. | ||
You can also use the `tempo-cli analyse blocks` command to query vParquet4 blocks. [PR 3868](https://github.com/grafana/tempo/pull/3868)]. | ||
Refer to the [Tempo CLI ](https://grafana.com/docs/tempo/next/operations/tempo_cli/#analyse-blocks)documentation for more information. | ||
|
||
For information on upgrading, refer to [Upgrade to Tempo 2.6](https://grafana.com/docs/tempo/next/setup/upgrade/) and [Choose a different block format](https://grafana.com/docs/tempo/next/configuration/parquet/#choose-a-different-block-format). | ||
|
||
### Updated, removed, or renamed configuration parameters | ||
|
||
<table> | ||
<tr> | ||
<td><strong>Parameter</strong> | ||
</td> | ||
<td><strong>Comments</strong> | ||
</td> | ||
</tr> | ||
<tr> | ||
<td> | ||
<code>storage:<br /> | ||
azure:<br /> | ||
use_v2_sdk: </code> | ||
</td> | ||
<td>Removed. Azure v2 is the only and primary Azure backend [PR <a href="https://github.com/grafana/tempo/pull/3875">3875</a>] | ||
</td> | ||
</tr> | ||
<tr> | ||
<td><code>autocomplete_filtering_enabled</code> | ||
</td> | ||
<td>The feature flag option has been removed. The feature is always enabled. [PR <a href="https://github.com/grafana/tempo/pull/3729">3729</a>] | ||
</td> | ||
</tr> | ||
<tr> | ||
<td><code>completedfilepath</code> and <code>blocksfilepath</code> | ||
</td> | ||
<td>Removed unused WAL configuration options. [PR <a href="https://github.com/grafana/tempo/pull/3911">3911</a>] | ||
</td> | ||
</tr> | ||
<tr> | ||
<td><code>compaction_disabled</code> | ||
</td> | ||
<td>New. Allow compaction disablement per-tenant. [PR <a href="https://github.com/grafana/tempo/pull/3965">3965</a>, <a href="https://grafana.com/docs/tempo/next/configuration/#overrides">documentation</a>] | ||
</td> | ||
</tr> | ||
<tr> | ||
<td> | ||
<code> | ||
Storage:<br /> | ||
s3:<br /> | ||
[enable_dual_stack: <bool>]</code> | ||
</td> | ||
<td>Boolean flag to activate or deactivate <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/dual-stack-endpoints.html">dualstack mode</a> on the Storage block configuration for S3. [PR <a href="https://github.com/grafana/tempo/pull/3721">3721</a>, <a href="https://grafana.com/docs/tempo/next/configuration/#standard-overrides">documentation</a>] | ||
</td> | ||
</tr> | ||
</table> | ||
|
||
## Bugfixes | ||
|
||
For a complete list, refer to the [Tempo changelog](https://github.com/grafana/tempo/releases). | ||
|
||
* Fix panic in certain metrics queries using `rate()` with `by`. [[PR 3847](https://github.com/grafana/tempo/pull/3847)] | ||
* Fix metrics queries when grouping by attributes that may not exist. [[PR 3734](https://github.com/grafana/tempo/pull/3734)] | ||
* Fix metrics query histograms and quantiles on `traceDuration`. [[PR 3879](https://github.com/grafana/tempo/pull/3879)] | ||
* Fix divide by 0 bug in query frontend exemplar calculations. [[PR 3936](https://github.com/grafana/tempo/pull/3936)] | ||
* Fix autocomplete of a query using scoped instrinsics. [[PR 3865](https://github.com/grafana/tempo/pull/3865)] | ||
* Improved handling of complete blocks in localblocks processor after enabling flushing. [[PR 3805](https://github.com/grafana/tempo/pull/3805)] | ||
* Fix double appending the primary iterator on second pass with event iterator. [[PR 3903](https://github.com/grafana/tempo/pull/3903)] | ||
* Fix frontend parsing error on cached responses [[PR 3759](https://github.com/grafana/tempo/pull/3759)] | ||
* `max_global_traces_per_user`: take into account `ingestion.tenant_shard_size` when converting to local limit. [[PR 3618](https://github.com/grafana/tempo/pull/3618)] | ||
* Fix HTTP connection reuse on GCP and AWS by reading `io.EOF` through the `http` body. [[PR 3760](https://github.com/grafana/tempo/pull/3760)] | ||
* Handle out of boundaries spans kinds. [[PR 3861](https://github.com/grafana/tempo/pull/3861)] | ||
* Maintain previous tenant blocklist on tenant errors. [PR 3860](https://github.com/grafana/tempo/pull/3860) | ||
* Fix prefix handling in Azure backend `Find()` call. [[PR 3875](https://github.com/grafana/tempo/pull/3875)] | ||
* Correct block end time when the ingested traces are outside the ingestion slack. [[PR 3954](https://github.com/grafana/tempo/pull/3954)] | ||
* Fix race condition where a streaming response could be marshaled while being modified in the combiner resulting in a panic. [[PR 3961](https://github.com/grafana/tempo/pull/3961)] | ||
* Pass search options to the backend for `SearchTagValuesBlocksV2` requests. [[PR 3971](https://github.com/grafana/tempo/pull/3971)] |
Oops, something went wrong.