[Bugfix] Fix Segfault in ISMC #36
The change has reduced the frequency of the segfaults, but didn't completely eliminate them. Here is a gdb backtrace (after ~30 minutes of running in sim) after recompiling
Given that RealtimeBuffer stores pointers internally, I wonder if using … Another option is that there's memory corruption elsewhere.
Sorry it has taken me so long to get to this. I'm working to confirm that I can replicate the issue.
I did some testing using just
How often do you encounter this error @amarburg? I'm trying to replicate the problem, but haven't had success yet. EDIT: Do you observe any memory leaks?
I was able to replicate it consistently (consistently enough to spend some time debugging it). In general, it would happen fairly promptly (within the first 5-10 minutes of operation), but during some testing it could take hours before occurring.
When I have a quiet moment I'll see if I can still replicate it with images from
I was able to replicate this morning with:
In window 1
In window 2
In window 3
In window 4
Then drive the ROV around with the gamepad. Walk away for a while to do some other work. Come back and the ROV is no longer controllable. Relatively silent error in window 3 (
I repeated the steps that you took above and let the simulator run for ~2.5 hours without any errors. It may be worth looking into writing some unit tests with the realtime_buffer to see if there are any test cases that can replicate/isolate the issue. What are the specs of the machine that you are running on?
Over the course of testing, I was able to replicate this behavior on three different machines: an Intel laptop and an Intel desktop (i9-9980HK, 32GB, no GPU) and an AMD desktop (3950X, 64GB, 1650Super).
Given I'm away from keyboard for at least a week, let's park this. Go ahead with #41, and I'll retest after that's gone through...
Changes Made
While running the "joystick teleop" demo, I had frequent segfaults from controller_manager, unpredictably after 10-60 seconds. After building with RelWithDebInfo and running in GDB, the segfault was at line 262:
where current_system_state is a shared_ptr<Twist>. Holding a raw pointer to a shared_ptr does not increment the refcount, so I think this resulted in the shared_ptr being reaped while this function was still running.
This MR replaces the two instances of this pattern with:
which instantiates a shared_ptr for the duration of this scope.
It's possible this approach is not the most C++20-ish way to do it; happy to address it other ways.
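To illustrate the reasoning above, here is a minimal, single-threaded sketch of why copying the shared_ptr into a local keeps the pointee alive for the scope. It is not the controller's code: `Twist`, `StateSlot`, `read_with_local_copy`, and `use_count_while_held` are hypothetical stand-ins invented for this example, and the "writer" is simulated inline rather than running on another thread. Note that in real multithreaded code, copying a shared_ptr while another thread overwrites that same shared_ptr object is still a data race unless the access is synchronized, which may be relevant to the segfaults that remained after this change.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the message type (not the real Twist message).
struct Twist {
  double linear_x = 0.0;
};

// Hypothetical stand-in for a buffer slot that owns the current state.
struct StateSlot {
  std::shared_ptr<Twist> value;

  // Hands out a raw pointer to the stored shared_ptr.
  // This does NOT touch the reference count.
  std::shared_ptr<Twist>* read() { return &value; }

  // Replaces the stored shared_ptr; the old Twist is destroyed
  // if nobody else holds a reference to it.
  void write(std::shared_ptr<Twist> next) { value = std::move(next); }
};

// Reports the reference count while a local copy is held.
long use_count_while_held(StateSlot& slot) {
  std::shared_ptr<Twist> local = *slot.read();  // copy: refcount + 1
  return local.use_count();
}

// Copies the shared_ptr into a local, then simulates the slot being
// overwritten. The local copy keeps the original Twist alive.
double read_with_local_copy(StateSlot& slot) {
  std::shared_ptr<Twist> local = *slot.read();  // owns a reference for this scope
  slot.write(std::make_shared<Twist>());        // simulated writer replaces the value
  return local ? local->linear_x : 0.0;         // still valid: local owns the old Twist
}
```

Had the function kept only the raw `std::shared_ptr<Twist>*` from `read()` and dereferenced it after the simulated write, it would have observed the replacement value here, and in a concurrent setting could observe a half-updated or released object.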
Associated Issues
n/a (did not open issue re segfault)
Testing
Completed joystick teleoperation for 10+ minutes without a segfault on the current rolling image.