Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FFI] Propagate Python errors across FFI boundaries #15596

Merged
merged 20 commits into from
Sep 8, 2023

Conversation

Lunderberg
Copy link
Contributor

Prior to this commit, if a Python script passes a callback to a C++ function, and that callback raises an exception, the original exception cannot be caught in the outer python script. As a result, interactive debugging, such as done with pdb or ipdb, could only inspect stack frames outside the first Python to C++ FFI call.

This commit updates the FFI API to propagate the Python exception through an FFI boundary. As a result, all Python frames in the stack trace can be inspected.

@tvm-bot
Copy link
Collaborator

tvm-bot commented Aug 20, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

  • No users to tag found in teams: ffi See #10317 for details

Generated by tvm-bot

@Lunderberg
Copy link
Contributor Author

Currently marked as a draft to test for breakages. If possible, it would be good to suppress the packed_func.py stack frames, and to represent any C++ frames within the Python stack trace.

@Lunderberg Lunderberg force-pushed the ffi_propagate_python_exceptions branch 2 times, most recently from e389adf to 153d3c6 Compare August 21, 2023 17:26
@Lunderberg Lunderberg marked this pull request as ready for review August 21, 2023 19:17
@Lunderberg Lunderberg changed the title [Draft][FFI] Propagate Python errors across FFI boundaries [FFI] Propagate Python errors across FFI boundaries Aug 21, 2023
@Lunderberg
Copy link
Contributor Author

The implementation is now updated to insert empty dummy frames representing the C++ portions of the stack trace. Variables are only interactive within Python stack frames, but ipdb does provide a snippet of the surrounding C++ source code when navigating up/down through the C++ frame.

Prior to this commit, the `BacktraceFullCallback` function returned
zero for frames that should be excluded from the stack trace, even if
they were the `"TVMFuncCall"` that should terminate the stack trace.

This commit re-organized `BacktraceFullCallback`, moving the
terminating checks to occur prior to the suppression checks, and
adding comments to indicate why each suppression is present.
Prior to this commit, if a Python script passes a callback to a C++
function, and that callback raises an exception, the original
exception cannot be caught in the outer python script.  As a result,
interactive debugging, such as done with `pdb` or `ipdb`, could only
inspect stack frames outside the first Python to C++ FFI call.

This commit updates the FFI API to propagate the Python exception
through an FFI boundary.  As a result, all Python frames in the stack
trace can be inspected.
Previously, Python exceptions were coerced to `tvm.error.TVMError` if
no corresponding error type had been registered with
`tvm._ffi.register_error`.  With the pass-through of Python
exceptions, this coercion no longer applies.  Unit tests that relied
on this coercion needed to be updated as a result.
@Lunderberg Lunderberg force-pushed the ffi_propagate_python_exceptions branch from 3a4167f to 04d17fc Compare August 26, 2023 01:46
Copy link
Contributor

@csullivan csullivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Lunderberg ! This greatly improves the debugability of the runtime given significant use of the FFI. Approving with only one doc nitpick.

include/tvm/runtime/registry.h Outdated Show resolved Hide resolved
@csullivan csullivan merged commit d5a4f66 into apache:main Sep 8, 2023
@tqchen
Copy link
Member

tqchen commented Oct 5, 2023

This PR seems to cause an issue here #15880

@tqchen
Copy link
Member

tqchen commented Oct 23, 2023

NOTE: there has been multiple reports related to TVMGetLastPythonError not found in windows. Looking at this PR, we should have TVM_DLL marked for all the C API functions that get referenced in the ctypes.

Atm we only marks TVMSetLastPythonError, we will need to mark the rest for the feature to function properly in windows

@Lunderberg
Copy link
Contributor Author

I have a quick hotfix made here, which can be used for validating the TVM_DLL insertions required.

Lunderberg added a commit to Lunderberg/tvm that referenced this pull request Oct 31, 2023
Propagation of Python exceptions across C++ stack frames was
introduced in apache#15596.  This commit
primarily fixes a typo in the initial implementation, where
`Py_IncRef` was used instead of `Py_DecRef`.

In addition, this PR resolves errors that were exposed by this typo
fix, which caused test failures in
`tests/python/unittest/test_crt.py::test_compile_runtime`.  These were
due to use of the `Py_IncRef` and `Py_DecRef` functions on threads
that hadn't acquired the GIL.  This usage error has been corrected for
both the ctypes and cython FFI handling.
junrushao pushed a commit that referenced this pull request Nov 6, 2023
Propagation of Python exceptions across C++ stack frames was
introduced in #15596.  This commit
primarily fixes a typo in the initial implementation, where
`Py_IncRef` was used instead of `Py_DecRef`.

In addition, this PR resolves errors that were exposed by this typo
fix, which caused test failures in
`tests/python/unittest/test_crt.py::test_compile_runtime`.  These were
due to use of the `Py_IncRef` and `Py_DecRef` functions on threads
that hadn't acquired the GIL.  This usage error has been corrected for
both the ctypes and cython FFI handling.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants