Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subclass of extension type with reference cycle sometimes never gets GC'd on free-threaded build #125853

Open
ngoldbaum opened this issue Oct 22, 2024 · 4 comments
Labels
topic-free-threading type-bug An unexpected behavior, bug, or error

Comments

@ngoldbaum
Copy link
Contributor

ngoldbaum commented Oct 22, 2024

Bug report

Bug description:

ping @colesbury

Apply the following diff on top of the current PyO3 main branch:

diff --git a/tests/test_gc.rs b/tests/test_gc.rs
index 9483819c..13a617a5 100644
--- a/tests/test_gc.rs
+++ b/tests/test_gc.rs
@@ -627,12 +627,9 @@ fn test_traverse_subclass() {
             check.assert_not_dropped();
         }
 
-        #[cfg(not(Py_GIL_DISABLED))]
-        {
-            // FIXME: seems like a bug that this is flaky on the free-threaded build
-            // https://github.com/PyO3/pyo3/issues/4627
-            check.assert_drops_with_gc(ptr);
-        }
+        // FIXME: seems like a bug that this is flaky on the free-threaded build
+        // https://github.com/PyO3/pyo3/issues/4627
+        check.assert_drops_with_gc(ptr);
 
         #[cfg(Py_GIL_DISABLED)]
         {
@@ -685,12 +682,9 @@ fn test_traverse_subclass_override_clear() {
             check.assert_not_dropped();
         }
 
-        #[cfg(not(Py_GIL_DISABLED))]
-        {
-            // FIXME: seems like a bug that this is flaky on the free-threaded build
-            // https://github.com/PyO3/pyo3/issues/4627
-            check.assert_drops_with_gc(ptr);
-        }
+        // FIXME: seems like a bug that this is flaky on the free-threaded build
+        // https://github.com/PyO3/pyo3/issues/4627
+        check.assert_drops_with_gc(ptr);
 
         #[cfg(Py_GIL_DISABLED)]
         {

Then do:

$ while UNSAFE_PYO3_BUILD_FREE_THREADED=1 cargo test --test test_gc; do :; done

Eventually, one or both tests modified by the diff above will fail:

test test_traverse_subclass ... FAILED
test test_traverse_subclass_override_clear ... FAILED

failures:

---- test_traverse_subclass stdout ----
thread 'test_traverse_subclass' panicked at tests/test_gc.rs:632:15:
Object was not dropped after 100 GC cycles, refcount is 2

---- test_traverse_subclass_override_clear stdout ----
thread 'test_traverse_subclass_override_clear' panicked at tests/test_gc.rs:687:15:
Object was not dropped after 100 GC cycles, refcount is 2


failures:
    test_traverse_subclass
    test_traverse_subclass_override_clear

assert_drops_with_gc does 100 loops of calling gc.collect() and then sleeping for five milliseconds. There are other tests that make use of the same infrastructure and they always complete quickly on my machine without getting anywhere close to 100 iterations. You can also increase the hard-coded 100 in the for loop to an arbitrarily large number and the object never gets GC'd in my testing, even after waiting for longer than a minute.

See PyO3/pyo3#4627 for the PyO3 issue, which has some additional details.

CPython versions tested on:

3.13

Operating systems tested on:

macOS

@colesbury
Copy link
Contributor

Does this occur in main as well or just 3.13?

Type objects are immortalized in 3.13. This happens after the first non-main thread is created, which might make it non-deterministic with your tests.

https://docs.python.org/3.13/howto/free-threading-python.html#immortalization

@ngoldbaum
Copy link
Contributor Author

I don't think any Python threads are launched by the rust tests. Maybe that's something we should try though!

I just tried with a fresh build of 3.14t-dev from pyenv and I can reproduce the same issue it if I make the following additional modification to bypass the check for the maximum supported python version:

diff --git a/pyo3-ffi/build.rs b/pyo3-ffi/build.rs
index 622c2707..cc7c8a30 100644
--- a/pyo3-ffi/build.rs
+++ b/pyo3-ffi/build.rs
@@ -18,7 +18,7 @@ const SUPPORTED_VERSIONS_CPYTHON: SupportedVersions = SupportedVersions {
     min: PythonVersion { major: 3, minor: 7 },
     max: PythonVersion {
         major: 3,
-        minor: 13,
+        minor: 14,
     },
 };
± while UNSAFE_PYO3_BUILD_FREE_THREADED=1 cargo test --test test_gc; do :; done
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.17s
     Running tests/test_gc.rs (target/debug/deps/test_gc-c659bc81a4afe4be)

 *- deleted test output -*

failures:

---- test_traverse_subclass stdout ----
thread 'test_traverse_subclass' panicked at tests/test_gc.rs:632:15:
Object was not dropped after 100 GC cycles, refcount is 2

---- test_traverse_subclass_override_clear stdout ----
thread 'test_traverse_subclass_override_clear' panicked at tests/test_gc.rs:687:15:
Object was not dropped after 100 GC cycles, refcount is 2


failures:
    test_traverse_subclass
    test_traverse_subclass_override_clear

test result: FAILED. 13 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.46s

error: test failed, to rerun pass `--test test_gc`

goldbaum at Nathans-MBP in ~/Documents/pyo3 on main!
± python -VV
Python 3.14.0a1+ experimental free-threading build (heads/main:aaed91c, Oct 22 2024, 12:55:05) [Clang 16.0.0 (clang-1600.0.26.3)]

@colesbury
Copy link
Contributor

I'm able to reproduce the issue on 3.14 along with what looks like another assertion failure.

I don't think any Python threads are launched by the rust tests

The tests suite runs in multiple threads concurrently, right? It doesn't matter that the threads are not created by Python's threading module. If the thread calls PyGILState_Ensure(), it has the same effect.

@ngoldbaum
Copy link
Contributor Author

Ah, in that case it definitely is happening, any call to with_gil in PyO3 calls PyGILState_Ensure and it looks like most of the tests in that file do that at least once. Cargo spawns an OS thread for each test and runs them concurrently in a round-robin order.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-free-threading type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

2 participants