Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix boundaries for Symfony HttpKernel auto instrumentation #317

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dciprian-petrisor
Copy link

@dciprian-petrisor dciprian-petrisor requested a review from a team as a code owner November 28, 2024 18:04
Copy link

welcome bot commented Nov 28, 2024

Thanks for opening your first pull request! If you haven't yet signed our Contributor License Agreement (CLA), then please do so that we can accept your contribution. A link should appear shortly in this PR if you have not already signed one.

Copy link

linux-foundation-easycla bot commented Nov 28, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: dciprian-petrisor (19f787e)

@dciprian-petrisor dciprian-petrisor force-pushed the fix-symfony-auto-instrumentation-boundaries branch from e78c015 to 19f787e Compare November 28, 2024 18:15
@@ -80,6 +80,30 @@ public static function register(): void
array $params,
?Response $response,
?\Throwable $exception
): void {
$scope = Context::storage()->scope();
if (null === $scope || null === $exception) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if an exception occurs, this post callback will run, otherwise the terminate one will? If so, we should document this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct.
I'm assuming this should be documented in the README.md, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The examples on this page seem to suggest that the handle and terminate can both run.

So it looks like you'd expect 2 spans to be created and then with these changes only close one of them? I'm not so sure that behaviour is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be a parameter to configure whether handle should throw here.

So when $catch = false the handle post-hook might have an exception and terminate will not have been reached yet. I think this would be fine as it handles the recording of the exception and detaches the scope.

However, if $catch = false and there is no exception, then terminate() would be called after handle and the terminate post-hook could try to close a scope which does not belong to it.

Copy link
Author

@dciprian-petrisor dciprian-petrisor Nov 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ChrisLightfootWild

  1. If handle throws an exception which is never caught, then terminate will never be reached either, according to the HttpKernelRunner implementation.
    In this situation, it is correct to close the scope on the post hook of handle.

  2. If handle does not throw an exception (or has $catch = true), then terminate would run afterwards, closing the scope.

The only problem I see is there is not a 100% guarantee that terminate will always be executed after handle.

For example, something else might go wrong in HttpKernelRunner after handle was called, e.g. fastcgi_finish_request()/litespeed_finish_request() might crash, flush() might crash, etc, leaving the scope still open.

Unsure how much of a real issue this is, in most cases.

Normally this would be solved by hooking into HttpKernelRunner::run with both pre and post hooks, as it covers both handle and terminate, however $request and $response are not available as parameters there, so I am at a loss right now.

The main issue is that the parent span is lost in between handle and terminate, due to handle detaching it, thus producing two different traces for the same request:

  • one containing everything from the start of the HTTP request up to the end of handle
  • one containing just the dispatch of TerminateEvent done by $this->kernel->terminate afterwards

the second should be part of the first, as it is still logically part of the same HTTP request lifecycle.

We need either a compromise or another solution entirely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that @dciprian-petrisor - I had another look with less bleary eyes and I finally caught up to you guys 💪

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am having the same problem for Drupal, but opted to do an auto root span for now, which obviously duplicates the logic to do that.

That way I have a duplicated span for the Kernel::handle() + Kernel::terminate(), but that does not matter as it's still within one root trace.

The Auto Root Span uses the SDK ShutdownHandler::register so that might be also an option here.

The other thing is:

You could hook into KernelRunner::run(), but not open the scope here, but instead close it just during the post() [if ::run() was called in pre].

While that is more logic to handle, because handle() needs to check that ::run was actually called, so that it can close it, cross hook frontiers are a good way to handle this.

::run pre -> $runCalled = 1
::handle pre -> Open root scope like normal
::handle post -> if ($runCalled) {} -> Do not bother to close scope, else fallback to current logic
::terminate pre -> Open span like normal
::terminate post -> Close span like normal
::run post -> Close scope

In general I personally will still call the Auto Root Scope hook as early as possible (e.g. DrupalKernel::__construct in my case), because else there is always an ::IO extension call, which escapes.

Until that's converted to SPI obviously there are still race conditions.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my implementation:

<?php

namespace OpenTelemetry\Contrib\Instrumentation\Drupal;

use Drupal\Core\DrupalKernel;
use OpenTelemetry\SDK\Trace\AutoRootSpan;
use function OpenTelemetry\Instrumentation\hook;

class DrupalAutoRootSpan {

  public static function register(): void {

    hook(
      DrupalKernel::class,
      '__construct',
      static::registerAutoRootSpan(),
      NULL,
    );
  }

  /**
   * @param \OpenTelemetry\API\Instrumentation\CachedInstrumentation $instrumentation
   *
   * @return \Closure
   */
  public static function registerAutoRootSpan(): \Closure {
    return static function (DrupalKernel $kernel, array $params, string $class, string $function, ?string $filename, ?int $lineno) {
      $request = AutoRootSpan::createRequest();

      if ($request) {
        AutoRootSpan::create($request);
        AutoRootSpan::registerShutdownHandler();
      }
    };
  }

}

It works really well.

Copy link

codecov bot commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 95.23810% with 1 line in your changes missing coverage. Please review.

Project coverage is 78.66%. Comparing base (f6a567b) to head (19f787e).

Files with missing lines Patch % Lines
...rumentation/Symfony/src/SymfonyInstrumentation.php 95.23% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##               main     #317      +/-   ##
============================================
- Coverage     79.60%   78.66%   -0.95%     
+ Complexity      729      610     -119     
============================================
  Files            69       60       -9     
  Lines          2952     2606     -346     
============================================
- Hits           2350     2050     -300     
+ Misses          602      556      -46     
Flag Coverage Δ
Aws ?
Instrumentation/CakePHP 20.00% <ø> (ø)
Instrumentation/CodeIgniter 73.77% <ø> (ø)
Instrumentation/Curl 90.66% <ø> (ø)
Instrumentation/Guzzle 69.51% <ø> (ø)
Instrumentation/HttpAsyncClient 81.25% <ø> (ø)
Instrumentation/IO 70.68% <ø> (ø)
Instrumentation/MongoDB 76.31% <ø> (ø)
Instrumentation/OpenAIPHP 87.31% <ø> (ø)
Instrumentation/PDO 89.95% <ø> (ø)
Instrumentation/Psr14 77.14% <ø> (ø)
Instrumentation/Psr15 93.82% <ø> (ø)
Instrumentation/Psr16 97.56% <ø> (ø)
Instrumentation/Psr18 81.15% <ø> (ø)
Instrumentation/Psr6 97.67% <ø> (ø)
Instrumentation/Slim 86.89% <ø> (ø)
Instrumentation/Symfony 87.31% <95.23%> (-1.40%) ⬇️
Instrumentation/Yii 77.68% <ø> (ø)
Propagation/ServerTiming 100.00% <ø> (ø)
Propagation/TraceResponse 100.00% <ø> (ø)
ResourceDetectors/Container 93.02% <ø> (ø)
Sampler/RuleBased 33.51% <ø> (ø)
Shims/OpenTracing 92.45% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...rumentation/Symfony/src/SymfonyInstrumentation.php 80.76% <95.23%> (-2.72%) ⬇️

... and 9 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f6a567b...19f787e. Read the comment docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[opentelemetry-php-contrib] Incorrect trace boundaries for Symfony HttpKernel auto instrumentation
4 participants