[21672] Fix data race in TypeObjectFactory::get_instance #5238

Merged
MiguelCompany merged 12 commits into 2.10.x from bugfix/21664
Sep 17, 2024

Member

@MiguelCompany MiguelCompany commented Sep 13, 2024

Description

While debugging an error reported by LeakSanitizer in #4916, an important data race was discovered in TypeObjectFactory::get_instance.

The test added in that PR creates several writers from different threads simultaneously.

This leads to calling TypeObjectFactory::get_instance when registering the local writer (during the creation of its WriterProxyData).

The data race caused several factories to be constructed, but only one of them was assigned to the global g_instance pointer.

Only one of the factories was destroyed when the process finished, making LeakSanitizer complain.

This PR adds a mechanism based on an atomic enumeration holding the state of the singleton instance.
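As a rough illustration of the idea (not the code in this PR), a singleton guarded by an atomic state enumeration could look like the sketch below. `Factory`, `InstanceState`, `state_`, `instance_` and `count_distinct_instances` are hypothetical names chosen for the example. Only the thread that wins the compare-exchange constructs the object; the others wait until it has been published. The helper mimics the regression test by calling `get_instance` from several threads and counting the distinct pointers returned.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <set>
#include <thread>
#include <vector>

// Hypothetical states for the singleton; the real enumeration may differ.
enum class InstanceState { NOT_CREATED, CREATING, CREATED, DESTROYING };

// Hypothetical stand-in for the factory, reduced to the singleton logic.
class Factory
{
public:

    static Factory* get_instance()
    {
        InstanceState expected = InstanceState::NOT_CREATED;
        // Exactly one thread can move NOT_CREATED -> CREATING.
        if (state_.compare_exchange_strong(expected, InstanceState::CREATING))
        {
            instance_ = new Factory();
            // Publish the fully constructed instance.
            state_.store(InstanceState::CREATED, std::memory_order_release);
        }
        else
        {
            // Another thread is constructing: wait until it has finished.
            while (state_.load(std::memory_order_acquire) == InstanceState::CREATING)
            {
                std::this_thread::yield();
            }
        }
        return state_.load(std::memory_order_acquire) == InstanceState::CREATED ?
               instance_ : nullptr;
    }

    // Assumption: called only when no other thread is still using the factory.
    static void delete_instance()
    {
        InstanceState expected = InstanceState::CREATED;
        if (state_.compare_exchange_strong(expected, InstanceState::DESTROYING))
        {
            delete instance_;
            instance_ = nullptr;
            state_.store(InstanceState::NOT_CREATED);
        }
    }

private:

    Factory() = default;

    static std::atomic<InstanceState> state_;
    static Factory* instance_;
};

std::atomic<InstanceState> Factory::state_{InstanceState::NOT_CREATED};
Factory* Factory::instance_ = nullptr;

// Calls get_instance() from n_threads threads released at (roughly) the same
// time and returns how many different instances were observed.
std::size_t count_distinct_instances(std::size_t n_threads)
{
    std::vector<Factory*> seen(n_threads, nullptr);
    std::atomic<bool> go{false};
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < n_threads; ++i)
    {
        threads.emplace_back([&seen, &go, i]()
                {
                    while (!go.load(std::memory_order_acquire))
                    {
                        std::this_thread::yield();
                    }
                    seen[i] = Factory::get_instance();  // each thread writes its own slot
                });
    }
    go.store(true, std::memory_order_release);
    for (std::thread& t : threads)
    {
        t.join();
    }
    for (Factory* f : seen)
    {
        assert(f != nullptr);
    }
    return std::set<Factory*>(seen.begin(), seen.end()).size();
}
```

With this scheme, `count_distinct_instances(8)` is expected to return 1. With an unsynchronized check-then-create on a plain pointer, several threads could each construct a factory, which is the leak described above.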

@Mergifyio backport 2.14.x

Contributor Checklist

  • Commit messages follow the project guidelines.
  • The code follows the style guidelines of this project.
  • Tests that thoroughly check the new feature have been added/Regression tests checking the bug and its fix have been added; the added tests pass locally
  • N/A: Any new/modified methods have been properly documented using Doxygen.
  • N/A: Any new configuration API has an equivalent XML API (with the corresponding XSD extension)
  • Changes are backport compatible: they do NOT break ABI nor change library core behavior.
  • Changes are API compatible.
  • N/A: New feature has been added to the versions.md file (if applicable).
  • N/A: New feature has been documented/Current behavior is correctly described in the documentation.
  • [x]: Applicable backports have been included in the description.
    • This was discovered in 2.10.x, so this PR targets that branch.
    • It is also present in 2.14.x, so this PR can be directly forward ported to that branch. Marked so above.
    • For 3.x, the interfaces have changed and the implementation is based on a static shared_ptr, so it is enough to add a test similar to the one in this PR.
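
For comparison, a singleton held in a static shared_ptr can be sketched as follows. This is only an assumption about the general shape of such an implementation, not the 3.x code; `Registry` and `count_distinct_registries` are hypothetical names. Since C++11, initialization of a function-local static is thread safe, so concurrent first calls cannot construct more than one object.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <set>
#include <thread>
#include <vector>

// Hypothetical singleton kept alive by a function-local static shared_ptr.
class Registry
{
public:

    static std::shared_ptr<Registry> get_instance()
    {
        // Initialized exactly once, even if several threads arrive together.
        static std::shared_ptr<Registry> instance(new Registry());
        return instance;
    }

private:

    Registry() = default;
};

// Same kind of check as the regression test: many threads, one expected instance.
std::size_t count_distinct_registries(std::size_t n_threads)
{
    std::vector<const Registry*> seen(n_threads, nullptr);
    std::atomic<bool> go{false};
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < n_threads; ++i)
    {
        threads.emplace_back([&seen, &go, i]()
                {
                    while (!go.load(std::memory_order_acquire))
                    {
                        std::this_thread::yield();
                    }
                    seen[i] = Registry::get_instance().get();
                });
    }
    go.store(true, std::memory_order_release);
    for (std::thread& t : threads)
    {
        t.join();
    }
    return std::set<const Registry*>(seen.begin(), seen.end()).size();
}
```

The shared_ptr also destroys the object automatically at process exit, so no explicit state tracking is needed in this variant.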

Reviewer Checklist

  • The PR has a milestone assigned.
  • The title and description correctly express the PR's purpose.
  • Check contributor checklist is correct.
  • If this is a critical bug fix, backports to the critical-only supported branches have been requested.
  • Check CI results: changes do not issue any warning.
  • Check CI results: failing tests are unrelated to the changes.

Signed-off-by: Miguel Company <[email protected]>
Member

@Mario-DL Mario-DL left a comment

Good catch, leaving a NIT and a suggestion

include/fastrtps/types/BuiltinAnnotationsTypeObject.h (outdated, resolved)
src/cpp/dynamic-types/TypeObjectFactory.cpp (resolved)
test/unittest/xtypes/XTypesTests.cpp (outdated, resolved)
@MiguelCompany
Member Author

@Mario-DL Apart from addressing your review, I added commit 20d3974, where I leave the responsibility of calling create_builtin_annotations() to the factory constructor.

Signed-off-by: Miguel Company <[email protected]>
Member

@Mario-DL Mario-DL left a comment

LGTM
@MiguelCompany could you confirm that the failing tests are flaky in 2.10.x?

@MiguelCompany
Member Author

> Could you confirm that the failing tests are flaky in 2.10.x?

The only one that seems new is UDPv4Tests.double_binding_fails, which is clearly unrelated.

@MiguelCompany
Member Author

@Mergifyio backport 2.14.x

@MiguelCompany MiguelCompany merged commit 8f4b4a5 into 2.10.x Sep 17, 2024
15 of 18 checks passed
@MiguelCompany MiguelCompany deleted the bugfix/21664 branch September 17, 2024 13:20
Contributor

mergify bot commented Sep 17, 2024

backport 2.14.x

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Sep 17, 2024
* Refs #21664. Regression test.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21664. Improve synchronization in regression test.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21664. Return created instance to visualize data-race and make test fail.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21664. Just count the number of different instances.
This way we have a single final expectation.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21664. Avoid using g_instance inside the instance.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21664. Inject factory in all methods called inside `register_builtin_annotations_types`.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21664. Use atomic enumeration to control the instance state.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21664. Uncrustify.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21672. Notify condition from main thread.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21672. FIx EOL at end of file.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21672. Refactor to create builtin objects inside factory constructor.

Signed-off-by: Miguel Company <[email protected]>

* Refs #21672. Uncrustify.

Signed-off-by: Miguel Company <[email protected]>

---------

Signed-off-by: Miguel Company <[email protected]>
(cherry picked from commit 8f4b4a5)

# Conflicts:
#	src/cpp/dynamic-types/TypeObjectFactory.cpp
MiguelCompany added a commit that referenced this pull request Sep 17, 2024
* Fix data race in `TypeObjectFactory::get_instance` (#5238)

* Fix conflicts

Signed-off-by: Miguel Company <[email protected]>

---------

Signed-off-by: Miguel Company <[email protected]>
Co-authored-by: Miguel Company <[email protected]>
Labels
ci-pending PR which CI is running
2 participants