Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix potential temporary deadlock in A2DP codec selection procedure. #730

Merged
merged 1 commit into from
Sep 30, 2024

Conversation

borine
Copy link
Collaborator

@borine borine commented Sep 28, 2024

If a dbus client thread locks the codec_id_mtx mutex while Bluez is processing a SetConfiguration request, then that can lead to the main thread being blocked if a client calls any method that needs to read the current codec. However, the SetConfiguration request needs the main thread to be active to respond to the ClearConfiguration call that Bluez makes during its own handling of SetConfiguration. So there is a risk of deadlock if the client attempts to read the codec id while SetConfiguration is in progress. This deadlock is temporary because the dbus calls have a timeout - but Bluez handles the timeout by closing the A2DP profile connection.

There does not appear to be any clear reason why SetConfiguration needs to lock the client_id_mtx mutex, since it does not actually modify the transport codec_id (a new transport instance is created with the new codec_id instead). So this commit avoids this deadlock simpling by removing the codec_id_mtx mutex lock from ba_transport_select_codec_a2dp().

Fixes #725

Please have a good description of the problem and the fix. Help the reviewer
understand what to expect.

For new feature, extensive change or non-trivial bug fix add a simple unit
test or at least describe how to test the code manually.

If you have an issue number, please reference it with a syntax Fixes #123.

Please delete instructions prefixed with '>' to prove you have read them.

Copy link

codecov bot commented Sep 28, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 70.23%. Comparing base (5a859d6) to head (d19da25).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
src/ba-transport.c 72.72% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #730      +/-   ##
==========================================
- Coverage   70.27%   70.23%   -0.04%     
==========================================
  Files          96       96              
  Lines       15837    15833       -4     
  Branches     2501     2499       -2     
==========================================
- Hits        11129    11121       -8     
- Misses       4591     4595       +4     
  Partials      117      117              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@arkq
Copy link
Owner

arkq commented Sep 28, 2024

Hmm... it seems that this is another place where codec mutex causes more harm than good. Recently I've removed it from the ba_transport_debug_name(). To be honest I do not remember why this mutex is here... it might be a relict from some previous designs. Also, there is a dedicated t->codec_select_client_mtx to prevent race during codec selection (which does not work properly to be honest), so I think that you are right about removing it from here.

Some time ago I was thinking about removing the codec_id_mtx all together and changing codec_id to atomic (to suppress tsan warnings). So, after your change, the only place where the codec_id_mtx usage seems to be legit is ba_transport_select_codec_sco. I'll verify that, and if it is the case, I'll add some more cleanups to your PR and merge it.

Many thanks for spending some time on it!

If a D-Bus client thread locks the codec_id_mtx mutex while BlueZ is
processing a SetConfiguration request, then that can lead to the main
thread being blocked if a client calls any method that needs to read
the current codec. However, the SetConfiguration request needs the main
thread to be active to respond to the ClearConfiguration call that BlueZ
makes during its own handling of SetConfiguration. So, there is a risk
of deadlock if the client attempts to read the codec ID while
SetConfiguration is in progress. This deadlock is temporary because the
D-Bus calls have a timeout - but BlueZ handles the timeout by closing
the A2DP profile connection.

There does not appear to be any clear reason why SetConfiguration
needs to lock the client_id_mtx mutex, since it does not actually modify
the transport codec_id (a new transport instance is created instead). So
this commit avoids this deadlock simply by removing the codec_id_mtx
mutex lock from ba_transport_select_codec_a2dp().

Fixes arkq#725
@arkq arkq force-pushed the selectcodec-deadlock branch from 764aa81 to d19da25 Compare September 30, 2024 09:24
@arkq arkq merged commit d19da25 into arkq:master Sep 30, 2024
21 checks passed
@borine borine deleted the selectcodec-deadlock branch October 20, 2024 07:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4.3.0: org.freedesktop.DBus.Error.NoReply
2 participants