Deadlock when used with ddtrace with profiling enabled #3753
Comments
We're going to disable the datadog lock, but I think the learning here is that anything that logs before self._thread_for_pid is set can cause a deadlock if you have the sentry logging integration enabled. Here's a repro that prints out the bad stack without the datadog dependency:
Hey @samertm, thanks for the great bug report and for providing a way to repro easily. We will look into this.
How do you use Sentry?
Sentry Saas (sentry.io)
Version
2.5.1
Steps to Reproduce
This doesn't happen that often, and we only caught the stack trace once.
We use gunicorn to serve a flask app. We have sentry and ddtrace enabled, and ddtrace has profiling enabled. These are our datadog env vars:
ddtrace is on version v2.14.2.
Expected Result
No deadlocks.
Actual Result
We had a pod that didn't shut down within our 10 minute graceful termination timeout, and when we printed out the stack, one of the thread stacks looks like sentry_sdk and ddtrace's monkeypatching caused a deadlock:
The issue there is that it tries to enter the worker.py _lock twice.
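To illustrate the double-entry pattern in isolation, here is a minimal sketch. It is not the repro from this issue and does not use sentry_sdk or ddtrace; `ToyWorker`, `ForwardingHandler`, and `demo` are hypothetical names, and the only thing borrowed from the report is the idea of a plain lock held while `_thread_for_pid` is still unset. A log call made while the lock is held is routed by a logging handler back into the same worker, which then tries to take the lock again on the same thread. The second acquire uses a timeout so the sketch reports the problem instead of hanging.

```python
import logging
import threading


class ToyWorker:
    """Hypothetical stand-in for a background worker guarded by a plain Lock."""

    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant
        self._thread_for_pid = None
        self.reentry_blocked = False
        self.log = logging.getLogger("toy.worker")

    def _ensure_thread(self):
        if self._thread_for_pid is not None:
            return
        # A bare acquire() would block forever on re-entry; the timeout
        # lets this sketch detect the situation and carry on.
        if not self._lock.acquire(timeout=0.2):
            self.reentry_blocked = True
            return
        try:
            # Logging here, before _thread_for_pid is set, re-enters submit().
            self.log.warning("starting worker thread")
            self._thread_for_pid = "started"
        finally:
            self._lock.release()

    def submit(self, item):
        self._ensure_thread()


class ForwardingHandler(logging.Handler):
    """Hypothetical handler that forwards every log record to the worker."""

    def __init__(self, worker):
        super().__init__()
        self.worker = worker

    def emit(self, record):
        self.worker.submit(record.getMessage())


def demo():
    worker = ToyWorker()
    handler = ForwardingHandler(worker)
    worker.log.addHandler(handler)
    worker.log.propagate = False
    try:
        worker.submit("first event")
    finally:
        worker.log.removeHandler(handler)
    return worker.reentry_blocked
```

Calling `demo()` returns `True` when the nested acquire was blocked. Without the timeout, the same call path would hang on the second acquire, which is the shape of the stack described above.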
We saw a small increase in frontend requests hitting the timeout at the same time for requests that should ~never time out. When you look at all the thread stacks, most of them look like they're waiting on that lock:
Though the weird thing is that the pod was still able to handle requests according to its logs.
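For anyone trying to capture comparable thread stacks from a running Python process, one possible approach is sketched below. This is an assumption about tooling, not necessarily how the stacks in this report were collected; `format_all_thread_stacks` is a hypothetical helper built on the standard library's `sys._current_frames()` and `traceback.format_stack()`.

```python
import sys
import threading
import traceback


def format_all_thread_stacks():
    """Return a text dump of the current stack of every live thread."""
    # Map thread identifiers to names so the dump is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for ident, frame in sys._current_frames().items():
        chunks.append(f"Thread {names.get(ident, '?')} ({ident}):")
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)
```

Threads that are all blocked on the same lock will show the same acquire frame at the bottom of their stacks in the output, which makes a shared wait point easy to spot.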