-
Notifications
You must be signed in to change notification settings - Fork 198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inactive Apm Tracer starts an invalid span with zeroes in the trace id #1500
Comments
Do you have the content of |
Also (because I expect |
No, it's just a very primitive code in grpc and http middlewares that just creates a new span on each request. So there is no trace from the client (I'm testing it with Postman and BloomRPC and I do not know any way to create span from them or simulate this creation (some random 16-chars string that mimic W3C Trace header?)).
If I understand you right, it's not a behavior of APM Otel bridge but Opentelemetry library inself? If so, it's a little strange from my point of view. On one side I understand that tracing is a deep infrastructure layer of a system and it should try to create as little problems as it could. And Opentracing interface does not allow to return error to signal that something is broken or unset: type Tracer interface {
Start(ctx context.Context, spanName string, opts ...SpanStartOption) (context.Context, Span)
} But as a developer in this case I have no explicit way to understand that span was created with problem (except try to examine that SpanContext is valid) and need to debug to try to figure out the reason. I think that two things should be in the docs (I suppose that docs of Opentelemetry Go library):
Also I've figured out yesterday that GRPC integration https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/instrumentation/google.golang.org/grpc/otelgrpc/ will create invalid spans if global tracer is unavailable (same situation with zeroes filled trace and span ids as in my first message). So I had to put a TracerProvider to the interceptors with |
And also more "5 cents" against invalid spans. For example we can use trace id for logs enrichment to be able to investigate with grep'ing or ELK. With inactive tracer (or some misconfiguration in case of OtelGRPC integration) span will be invalid. All this will be in a total silent (no errors). I suppose that span MUST have valid trace and span ids in any case. We could have no errors, no span gathering but zero trace id could be destructive. |
No, it's a behavior of the APM agent. The tracing bridge implements its own OpenTelemetry SDK, it doesn't use the otel one (which is why it should only be used to migrate, not long-term).
That makes sense, every otel instrumentation will behave that way, and has no relation to this repository. This is how OpenTelemetry works. What invalid behavior are you seeing with those zeroed spans? Yes, they have zero values, but since that's how the APM agent works, and apmotel is its own implementation of the OpenTelemetry SDK (and therefore cannot be used with an OTLP exporter for example), we should be able to rely on the APM agent behavior. |
He rely on trace id in these scenarios:
All these subsystem will be broken with zero-values trace id. |
I don't understand. Since the APM Tracer is invalid, there won't be any data to see in Kibana. Also, the zero-value trace ID is already the behavior of the APM agent, even without apmotel. |
To be more clear: system works with Elastic APM as a tracing system but now I migrate all the parts from APM itself to Opentelemetry and use Apm Otel Bridge for translating otel data to current ELK Apm (mainly to hide an infrastructure part of Elastic APM behind the Opentelemetry as an interface and common library for tracing). I've figured out this behavior with zero trace ids on my local machine during the tests of the migration. And I haven't pay an attention on such a behavior for clean APM case. For me it was a moment of surprise that created span could be invalid. I've worked with Jaeger and Opentracing as the tracing tools and thought that span MUST have a valid trace id in any case because somehow it was generated. |
OpenTelemetry doesn't generate valid trace IDs if no tracer were configured. And in the API. So again, this may not be how Opentracing and Jaeger work, but the behavior is consistent with how our agent works, as well as with how OpenTelemetry works. I don't think we should change it here. |
Describe the bug
Inactive APM Tracer produces invalid spans for OpenTelemetry Tracing bridge. The bug lives inside
go.elastic.co/apm/module/apmotel/[email protected]/tracer.go:105 in this code:
s.tx = t.provider.apmTracer.StartTransactionOptions(spanName, "", tranOpts)
It returns an APM transaction with invalid trace and span ids (all the zero bytes).To Reproduce
Steps to reproduce the behavior:
Expected behavior
I suppose that in any case span should have a correct state (or there should be something like Noop span with message details that tracer is not in active state right now).
The text was updated successfully, but these errors were encountered: