Skip to content

Latest commit

 

History

History
287 lines (218 loc) · 11.8 KB

tracing.md

File metadata and controls

287 lines (218 loc) · 11.8 KB

Introduction

tracing is a set of crates used for logging and tracing of rust binaries. It is primarily composed of the tracing crate, which is the core way to interact with the framework, and the tracing-subscriber crate, which provides an API to compose systems that consume tracing data.

The tracing docs say it best:

In asynchronous systems like Tokio, interpreting traditional log messages can often be quite challenging. Since individual tasks are multiplexed on the same thread, associated events and log lines are intermixed making it difficult to trace the logic flow. tracing expands upon logging-style diagnostics by allowing libraries and applications to record structured events with additional information about temporality and causality — unlike a log message, a span in tracing has a beginning and end time, may be entered and exited by the flow of execution, and may exist within a nested tree of similar spans. In addition, tracing spans are structured, with the ability to record typed data as well as textual messages.

Logging

The most basic way to use tracing is to use it like other logging crates like log, which additional semi-structured capabilities.

Materialize binaries are setup such that events are printed on stderr.

For example:

tracing::info!("processed {} items", items.len()")

produces:

2022-02-15T18:40:14.289898Z INFO my::module: processed 100 items

(See this page for how the colors will look)

You also also add semi-structured data to the event, for example:

tracing::debug!("processed {} items", items.len(), ?enum_value")

produces:

2022-02-15T18:40:14.289898Z DEBUG my::module: processed 100 items _enum_value_=EnumVariant

because ?enum_value means record enum as its Debug impl. This is referred to as "field recording", see recording fields for more options and examples. This just touches the surface of what you can do with the event macros, see tracing and tracing-subscriber for more info!

Levels and Filtering

You may have noticed that log events have levels. In materialize binaries, we only print events that are level INFO or more urgent. You can control this level by using the log_filter system variable to control the level at a per-crate, per-module, and even per-span granularity.

For example:

ALTER SYSTEM SET log_filter='my_interesting_module::sub_module=trace,error';

logs ALL events in the specific module, but only error events for everything else. See [EnvFilter] for more examples.

Like all system variables, it is also possible to set a log filter default when initially invoking environmentd:

./bin/environmentd -- --system-parameter-default=log_filter=my_interesting_module::sub_module=trace,error

Note that this value is passed through to all compute and storage instances when running locally. In the future, it is possible to add per-service filtering, if deemed useful.

Spans

Please see the tracing docs for more information about spans

Spans are what makes tracing a "tracing" crate, and not just a logging crate. Spans are like events, but they have a start and and end, and are organized into a tree of parents and children. There are many tools that can analyze and visualize spans to help you determine where Materialize is spending its time.

There are macros like the event macros for creating them:

let span tracing::info_span!("my_span", ?field);

Critically, these spans do very nothing till they are entered, which turns on the span. When a span is entered, spans that are created on the same thread inherit the entered span as a parent. For example, in:

let s1 = tracing::debug_span!("my_span").entered();
let s2 = tracing::debug_span!("my_span2");

the s2 span as s1 as a parent.

NOTE: entered() and enter() do not work with async code! See the next 2 sections for a better way to do this!

[tracing::instrument(...)]

Previous examples may feel heavyweight to operate, but tracing provides the tracing::instrument attribute macro to make it easier. This macro is applied to functions (or the implementations of trait methods) (it works with async functions as well, and #[async_trait]), and automatically creates and enters a span each time a function is called. This makes it easy to make a span tree based on your call graph.

An example:

#[tracing::instrument]
fn my_sub_function(field_that_will_be_recorded: String) { ... }


#[tracing::instrument(skip(big_argument))]
fn my_function(big_argument: MyState) {
  ...
  my_sub_function(big_argument.thing);
  ...
  my_sub_function(big_argument.two);
}

Both the spans for the my_sub_function calls will be childen of the my_function span!

See tracing::instrument for more info.

Spans and async code

Please see here for more advice on how to use spans with async code

While #[tracing::instrument] works fine with async code. Span::enter can be problematic, explained in the link above. If #[tracing::instrument] is not an option, then import tracing::Instrument and call async functions like this:

let s = info_span!("important_span");
let ret = thing.call_async_method().instrument(s).await;

Logging of spans

By default in materialize binaries, spans are NOT logged to stderr. However, their fields ARE inherited

let e = tracing::debug_span!("my_span", ?enum_value").entered();
tracing::debug!("processed {} items", items.len());

produces:

2022-02-15T18:40:14.289898Z DEBUG my::module: processed 100 items _enum_value_=EnumVariant

OpenTelemetry

While events are clearly useful for logging, so far in this document, its unclear what spans offer. While there are many crates that do interesting things with tracing information, the primary way we use spans in materialize binaries is to collect OpenTelemetry data. In Materialize binaries, tracing spans can be automatically collected and exported to analysis tools with no additional work required at callsites

Setup

Span visualization

OpenTelemetry UI's like the Grafana/Tempo one in the local setup are extraordinarly powerful tools for debugging. They allow you to visualize and inspect the exact control flow graph of your service, including the latency of each operation, and contextual recorded fields. See Best Practices for more information.

Distributed tracing

OpenTelemetry allows us to associate tracing spans from distributed services with each other, allowing us to not only visualize control-flow within a single service, but across services that communicate with each other.

See (TODO(guswynn): fill this in) for a demo on how this works.

Best Practices

Setup tracing for all communication between services.

This is important to ensure that complex interactions between services can be debugged.

Choose the appropriate abstraction for the queue you are piloting.

Materialize has lots of queues, both in-process (typically tokio::sync::mpsc channels), and cross-process (grpc). If you want to trace work being enqueued and then processed when dequeued:

In-process

Prefer moving a tracing::Span through the queue, like so:

#[tracing::instrument(skip(self))]
fn create_sources(&mut self, cs: CreateSources) {
    self.commands_tx.send(
        InstrumentedCreateSources {
            cs,
            span: tracing::Span::current(),
        }
    )
}

async fn process_work(&mut self) {
    loop {
        match self.commands_rx.recv().await {
            InstrumentedCreateSources { cs, span } => {
                // Note that this will typically be done in the most general way possible for
                // Command/Response patterns
                let create_sources_span = tracing::debug_span!(parent: &span, "process_create_sources");
                self.process_create_sources(cs).instrument(create_sources_span).await;
            }
            _ => ...
        }
    }
}

This will improve how the trace visualization will look (the sender span will be the parent of the child and wont close till the child is done), and will reduce load on collectors.

Across services

Use mz_ore::tracing::OpenTelemetryContext::obtain to pass a context from the sending service to the receiving service, and mz_ore::tracing::OpenTelemetryContext::attach_as_parent() on the receiving side. See the docs for more information.

Use the appropriate Level:

  • ERROR:
    • events that needs to be triaged, or something that is fatal to the binary starting correctly. Note that error events usually show up red on trace visualizations.
  • WARN:
    • events that could represent future failures, or non-standard situations. Note that error events usually show up yellow on trace visualizations.
  • INFO:
    • events with information used commonly enough that it should regularly show up in the stderr log.
  • DEBUG:
    • most spans that track the flow of commands and requests across services.
    • events with information used to debug sub-components.
  • TRACE:
    • exceedingly verbose information intended only for local debugging/tracing

Accessing Tracing Data Locally

To setup trace collection locally, start the ../../misc/monitoring composition. It will spin up Tempo to automatically start storing traces, and Grafana to visualize them.

Then start ./bin/environmentd --monitoring.

Setting the Trace Filter

By default, ./bin/environmentd will only emit INFO-level spans. The filter is controlled through the opentelemetry_filter system variable and can be toggled dynamically:

> psql -U mz_system -h localhost -p 6877

mz_system=> ALTER SYSTEM SET opentelemetry_filter="debug";

Or on startup:

./bin/environmentd --reset --monitoring -- --system-parameter-default='opentelemetry_filter=debug'

More details on the filter syntax can be found in Notion.

Trace ID Notices

It's often valuable to see the traces associated with specific queries. This can be done by setting the session variable:

SET emit_trace_id_notice = true;

Then each subsequent query will emit a NOTICE containing the trace ID for its execution:

materialize=> SELECT 1;
NOTICE:  trace id: 65e87c160063307a9d1221f78ae55cf8

This trace ID can then be plugged into Grafana's TraceQL lookup for an exact match:

grafana tempo trace id lookup