Difficulties in Exception reporting migrating from Spandex to OTel #306

Open
apreifsteck opened this issue Mar 26, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@apreifsteck

Describe the bug
I recently experimented with adding OpenTelemetry to one of the apps my team owns. It's been over a week since I made the swap, and we've had a few exceptions. However, they all show up in Datadog with no stacktrace info and this message: exit:{{#{'__exception__' => true,...},[...]},{'Elixir.MyAppWeb.Endpoint',...}}.

It looks like the exceptions are coming in as events, which as far as I know is compliant with the OTel spec.
image

It seems like Datadog doesn't handle errors this way and instead expects them to be included as span attributes (sorry about the redaction; the point is that there is a message and a full stacktrace there).
image

It looks like I might need to add a span processor to add that span attribute. Perhaps this is more of a compatibility issue than anything else. Regardless, if this issue results in a bridge library or a migration guide to follow for Spandex, that would be much appreciated. Of course, I'd be happy to collaborate in any way I can!
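As a rough illustration of the span-attribute idea above, here is a minimal sketch in plain Elixir that derives Datadog-style error attributes from an exception event. It is not an OpenTelemetry span processor and uses no OTel API: the module name `ExceptionEventSketch`, the event shape (a map with `:name` and `:attributes`), and the target keys `error.type`, `error.message`, and `error.stack` are all assumptions made for illustration. Only the `exception.*` source keys follow the OTel exception event convention mentioned in this issue.

```elixir
defmodule ExceptionEventSketch do
  @moduledoc """
  Hypothetical helper, not part of any OpenTelemetry or Datadog package.
  Copies attributes of an "exception" event into span-level error attributes.
  """

  # Assumed mapping from OTel exception event attributes to the
  # span attributes Datadog appears to read. Adjust to your backend.
  @mapping %{
    "exception.type" => "error.type",
    "exception.message" => "error.message",
    "exception.stacktrace" => "error.stack"
  }

  @doc """
  Returns span attributes derived from the first event named "exception",
  or an empty map when no such event exists.
  """
  def error_attributes(events) when is_list(events) do
    case Enum.find(events, &match?(%{name: "exception"}, &1)) do
      nil ->
        %{}

      event ->
        attrs = Map.get(event, :attributes, %{})

        # Only copy keys that are actually present on the event,
        # so a missing exception.message is skipped rather than set to nil.
        for {from, to} <- @mapping, Map.has_key?(attrs, from), into: %{} do
          {to, Map.fetch!(attrs, from)}
        end
    end
  end
end
```

Calling `ExceptionEventSketch.error_attributes(events)` with the events of a finished span would give a map that could be merged into the span's attributes before export. Where that merge happens (an SDK span processor, an exporter wrapper, or the Collector) is left open here.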

Expected behavior
I had expected something like what Spandex presents.
image

Also, it seems like the exception.message attribute is missing on the event.

Additional context

  • Using Collector
  • Elixir 1.14.4
  • Datadog for APM
  • Migrating from Spandex
  • Issue out of discussion
    Packages:
      {:opentelemetry, "~> 1.3"},
      {:opentelemetry_api, "~> 1.2"},
      {:open_telemetry_decorator, "~> 1.4"},
      {:opentelemetry_exporter, "~> 1.6"},
      {:opentelemetry_phoenix, "~> 1.1"},
      {:opentelemetry_ecto, "~> 1.2"},
      {:opentelemetry_cowboy, "~> 0.3"},
apreifsteck added the bug label on Mar 26, 2024
@apreifsteck
Author

Upon further experimentation, it looks like error reporting works pretty well if the exception occurs inside a manually instrumented trace, but not so well if it bubbles up to Cowboy.

On a whim, I was going to try switching to Bandit to see what that did. It looks like the Phoenix Telemetry package on Hex (1.2) doesn't support that yet, although PR #249 added support not too long ago. Is a new release of Phoenix Telemetry coming soon, by any chance?

@grzuy

grzuy commented Nov 14, 2024

I suspect the following recent changes

https://github.com/open-telemetry/opentelemetry-erlang-contrib/pull/359/files#diff-4c2fd05f88775967cc821a019b047da19e9ab6db7a8a88a82fa6b3db5350b7bcR391-R422

image

should have fixed this issue.

I think it's released in opentelemetry_cowboy v1.0.0-rc.1.

@apreifsteck can you confirm?
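For anyone wanting to try the suggested release, the dependency entry from the package list above might be changed along these lines. This is a sketch of a `mix.exs` fragment, assuming the fix really is in v1.0.0-rc.1 as suspected above. The other OpenTelemetry packages may need matching version bumps, which are not shown here.

```elixir
# mix.exs (fragment): only the opentelemetry_cowboy entry is changed.
# "~> 1.0.0-rc.1" accepts 1.0.0-rc.1 and later versions below 1.1.0.
defp deps do
  [
    {:opentelemetry_cowboy, "~> 1.0.0-rc.1"}
    # ...remaining dependencies unchanged
  ]
end
```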

@apreifsteck
Author

apreifsteck commented Dec 2, 2024

Sorry for my belated reply. Unfortunately I no longer work at the company that was experimenting with otel for their app. I'll let my old coworkers know that this is potentially fixed, though!
