Any way to make the Faraday instrumentation split by domain but allow an override to not split by domain in some cases? #3719

Open
alexevanczuk opened this issue Jun 14, 2024 · 5 comments

@alexevanczuk
Contributor

We use the split_by_domain option with the Faraday instrumentation, which is really helpful for seeing how we interact with 3rd parties.

However, we have a webhooks system which calls out to an arbitrary number of 3rd party systems. I'd like all of these webhook calls to have a single service (e.g. outbound_webhooks) and to have the URL just be a tag on the spans.

Is this possible today?

@marcotc
Member

marcotc commented Jul 16, 2024

Hey @alexevanczuk, do the webhooks you mention have a single call site?

@alexevanczuk
Contributor Author

Yes @marcotc, it has a single call site. Let me know what you're thinking. It sounds like there may be a simple way to make this change by updating how tracing is done at that call site?

@marcotc
Member

marcotc commented Jul 17, 2024

This class of issue (service naming for third-party services) is being addressed with the introduction of Inferred Service dependencies. Inferred services let you decide after the fact what's important and what's not. They also represent these external entities better in the Service Map and Services page.
One of the primary goals of Inferred services is to "Reduce the number of 'artificial' services you see in Datadog", and split_by_domain creates exactly that kind of service.
The issue we have today, similar to the one you have, is that we can't provide enough aggregation and filtering tools in the tracing library to give a good experience for edge cases. Instead, we will send the raw data (for Faraday, for example, we would send the full URL information, which can then be grouped in the UI depending on how you want to slice it).

We are moving towards enabling Inferred services by default in the future, but if you can opt into the beta now, I think you would get a much improved experience for the services created from the Faraday requests.

If you need a solution in the meantime, I think adding a Faraday middleware would work best here:

class WebhookServiceNameMiddleware < Faraday::Middleware
  # Runs before the request is sent.
  def on_request(env)
    return unless env['from_webhook'] # We have to set this value at the call site.

    span = Datadog::Tracing.active_span
    return unless span # Nothing to rename if no span is active.

    span.service = 'all-my-webhooks'
    # or `Datadog.configuration.service`, if you don't want a new service created for these.
    # span.service = Datadog.configuration.service
  end
end
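
For reference, the naming rule the middleware above is meant to enforce can be sketched in plain Ruby, without Faraday or Datadog loaded. This is only an illustration under assumptions: `FakeSpan`, `service_name_for`, and `tag_span` are hypothetical names, not part of either library, and the non-webhook branch assumes that split_by_domain names the service after the request host.

```ruby
require 'uri'

# Hypothetical stand-in for a tracer span; the real span comes from the tracer.
FakeSpan = Struct.new(:service, :tags)

# Service name taken from the example in the original question.
WEBHOOK_SERVICE = 'outbound_webhooks'

# Webhook calls collapse into one service; every other call is named
# after its host (assumed to mirror what split_by_domain does).
def service_name_for(url, from_webhook:)
  return WEBHOOK_SERVICE if from_webhook

  URI.parse(url).host
end

# Sets the service and keeps the full URL as a tag on the span.
def tag_span(span, url, from_webhook: false)
  span.service = service_name_for(url, from_webhook: from_webhook)
  span.tags['http.url'] = url
  span
end
```

With this rule, every webhook endpoint shares one service while the URL stays available as a tag, and important vendors still get a service per domain.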

@alexevanczuk
Contributor Author

Thanks @marcotc, that sounds great. I appreciate you sharing that context.

I opted into the beta and will discuss turning this on with my team. Appreciate it!

@alexevanczuk
Contributor Author

@marcotc, one thing I don't fully understand is how things are inferred and grouped. Say we have ImportantVendorA, ImportantVendorB, and WebhookEndpointA-Z. It'd be great to see each important vendor in its own service so we can dig into just that service dependency (e.g. integrating with the Google API). Can you share more information about the mechanics behind how the inferred dependencies are grouped?
