Important frame loss (99% dropped) after long recordings #13497

Open
RDVision-mchedefaux opened this issue Nov 7, 2024 · 8 comments

@RDVision-mchedefaux

Required Info
Camera Model 1 D435i and 1 D435
Firmware Version D435i (5.12.7.150) and D435(5.15.0.2)
Operating System & Version Ubuntu 20.04
Kernel Version (Linux Only) 5.10.120-tegra
Platform NVIDIA Jetson Orin AGX DevKit
SDK Version 2.55.1.0
Language C++
Segment Embedded programming

Issue Description

I am writing a program that records synchronized frames from four cameras: 2 RGB GenICam cameras and 2 RealSense cameras.
I have two problems to solve, one major and one minor.
As general background, I need to record at 10 Hz from all four cameras. Every camera has its own USB controller, and no hub is used.
Each RGB camera triggers one RealSense camera, and one RGB camera acts as the master because it triggers the other RGB camera. Frames are saved to an NVMe drive (no drive bottleneck). There is no apparent CPU bottleneck either, as CPU usage stays at 20-30%.
I am using hardware synchronization mode 4 (external synchronization).

1. Important frame loss after long recording (Major):

After 2 to 3 hours of recording (never more, but with no apparent periodicity), the 2 RealSense cameras start to drop frames heavily (99%), and the drops are synchronized (when one RealSense loses 15 frames in a row, the other does too). Once this happens, it never recovers and requires a full power-off of the Jetson: if I only reboot it, the frame loss continues. In addition, once the frame loss begins, it is also visible in the RealSense Viewer, which shows 0.5 to 1 fps.
It appears to be an SDK issue, since the drops are synchronized between the two cameras, but I have no further ideas.
I also tried to separate the 2 RealSense cameras by adding a 2 ms delay between the two RGB cameras (and so between the two RealSense cameras), but it didn't change anything.

2. Continuous synchronized frame loss during normal recording (Minor):

During a recording, I see continuous frame loss right from the start (1% to 2% of frames lost), and these losses are also synchronized between the two RealSense cameras (when one loses a frame, the other does too). I also tried adding a delay between the two cameras and got slightly better results (the loss is still there), but I don't know whether that is a real improvement or just randomness.
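One rough way to judge whether the difference between two recordings is more than randomness is a two-proportion z-test on the loss rates. The sketch below is only an illustration: the function name and the counts used with it are hypothetical, and it assumes frame drops are independent events, which bursty or synchronized drops may violate (in that case the statistic overstates the evidence).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: z statistic of a pooled two-proportion test.
// lost_a/total_a: lost and expected frames for recording A (e.g. without delay).
// lost_b/total_b: lost and expected frames for recording B (e.g. with delay).
inline double two_proportion_z(long lost_a, long total_a, long lost_b, long total_b) {
    assert(total_a > 0 && total_b > 0);
    const double rate_a = static_cast<double>(lost_a) / total_a;
    const double rate_b = static_cast<double>(lost_b) / total_b;
    // Pooled loss rate under the hypothesis that both recordings behave the same.
    const double pooled = static_cast<double>(lost_a + lost_b) / (total_a + total_b);
    const double std_err =
        std::sqrt(pooled * (1.0 - pooled) * (1.0 / total_a + 1.0 / total_b));
    // With no losses at all (or only losses) the standard error is zero.
    return std_err > 0.0 ? (rate_a - rate_b) / std_err : 0.0;
}
```

An absolute value above roughly 1.96 corresponds to the usual 5% two-sided significance level. One hour at 10 Hz gives 36000 expected frames per camera, so comparing two one-hour recordings would call `two_proportion_z(lost_a, 36000, lost_b, 36000)`.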

Code sneak peek

For information, here is a sneak peek at my recording code for the RealSense cameras:

  index = 0;
  while (state == SensorState::running)
  {
      try
      {
          ImageRealsense newImg;
          rs2::frameset frames;
          // Non-blocking: returns true only when a new frameset is available
          if (pipe->poll_for_frames(&frames))
          {
              newImg.img = frames;
              newImg.timestamp = frames.get_timestamp() / (1e3); // milliseconds to seconds

              newImg.index = index;
              for (const shared_ptr<INotifier> &noti : notifs)
                  noti->notify(config->SensorId, newImg.index); // Used for other components, fast methods, less than 0.1ms

              ImageQueue.push(newImg); // thread safe queue
              index++;
          }
          else
          {
              usleep(10000); // 10ms
          }
      }
      catch (const rs2::error &e) // catch by const reference to avoid a copy
      {
          LOG_ERROR("ERROR : " + string(e.get_failed_function()) + " : " + string(e.what()));
          // break;
      }
      catch (const exception &e) // catch by const reference to avoid slicing derived exceptions
      {
          LOG_ERROR("ERROR : " + string(e.what()));
      }
  }

This runs in a thread and fetches frames from a single camera (depth, and IR if needed). Another thread empties the thread-safe ImageQueue to save the frames. Each camera has its own recording thread and saving thread, for a total of 4 threads for the two RealSense cameras.
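The hand-off between a recording thread and a saving thread can be sketched as a bounded queue. This is not the ImageQueue type from the code above, whose implementation is not shown; `BoundedQueue`, `try_push`, `pop` and `close` are hypothetical names. The sketch assumes the producer should never block: when the queue is full it reports the overflow so that the capture loop can count it as a software-side drop.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical bounded, thread-safe queue for one producer and one consumer.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Producer side: never blocks. Returns false when the queue is full or closed,
    // in which case the item is discarded.
    bool try_push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (closed_ || items_.size() >= capacity_) return false;
            items_.push_back(std::move(item));
        }
        not_empty_.notify_one();
        return true;
    }

    // Consumer side: blocks until an item is available.
    // Returns false only once the queue is closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    // Wakes the consumer so it can drain the remaining items and exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        not_empty_.notify_all();
    }

private:
    const std::size_t capacity_;
    std::deque<T> items_;
    std::mutex mutex_;
    std::condition_variable not_empty_;
    bool closed_ = false;
};
```

With a bound in place, a saving thread that falls behind shows up as failed `try_push` calls instead of an ever-growing backlog of queued framesets, which makes it easier to tell software-side drops from camera-side drops.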

Thank you very much in advance

@MartyG-RealSense
Collaborator

Hi @RDVision-mchedefaux Thanks very much for your questions.

  1. Each RealSense camera that is attached to a computer consumes a portion of the computer's resources. The more cameras that are attached and active simultaneously, the more resources that are consumed. Though only two of your four cameras are RealSense, I would imagine that the same resource consumption applies too to the non-RealSense cameras if they are all active at the same time.

For four or more cameras capturing simultaneously, a computer with an Intel Core i7 or i9, or a CPU with equivalent processing power, is recommended.


If the problem is not caused by your computing hardware then the described problem that occurs after 2 to 3 hours could also be caused by your program if it is experiencing a memory leak (where the computer's available memory capacity is progressively consumed over a period of time until the program becomes unstable or freezes). You can check for the possibility of a memory leak over time using an Ubuntu system monitoring tool such as htop.

The problem only being cleared by a full shutdown instead of a reboot could be further evidence of a memory leak, as a shutdown should give the computer's memory an opportunity to fully clear and reset to its normal state.

  2. Please try inserting a 'for' loop before your poll_for_frames() line to make the SDK skip some initial frames before beginning capture. This helps to avoid 'bad' frames that may occur during camera start-up if auto-exposure is enabled. Does doing so make a difference to your lost frame problem, please?

  rs2::frameset frames;
  for (int i = 0; i < 10; i++)

  if (pipe->poll_for_frames(&frames))
  {

@RDVision-mchedefaux
Author

Hi @MartyG-RealSense,
As explained in my first message, I don't think this comes from a CPU bottleneck, as the Jetson is only at 20-30% CPU usage while running the recording program. In addition, no individual core reaches 100%.

Regarding a memory leak, I already checked with jtop and htop whether RAM usage grows, and I can assure you it stays steady at 3-5% (Jetson with 64 GB of RAM). Moreover, when I quit the program, the RAM it used is released but the problem remains.
The problem is also visible in realsense-viewer.

A full shutdown may be needed to clear the problem because it powers off the RealSense cameras, whereas after a plain reboot they stay powered on.

Finally, the cameras are not using auto-exposure; they use a fixed exposure.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Nov 7, 2024

Do you still need to shut down to recover the cameras if you unplug the cameras and then plug them back in? Unplugging should close any active camera connections.

I recommend avoiding threads if possible, as they add complexity to programs and increase the chance of problems occurring compared to a non-threaded script.

It may also be worth trying a C++ multicam script at #2219 (comment) which captures a single RGB image frame from all attached RealSense cameras simultaneously without using hardware sync.

You would need to remove or comment out a line that checks if the camera model is D415 in order for it to work with your D435 and D435i cameras.

if (strcmp(dev.get_info(RS2_CAMERA_INFO_NAME), "Intel RealSense D415") == 0) // Check for compatibility
{

@RDVision-mchedefaux
Author

I don't know whether unplugging the cameras solves the problem, because they are embedded in a system and I currently have no access to the Jetson or the cameras to unplug and replug them. I will have access to the Jetson tomorrow and can try it then if that helps.

I understand that using threads can add complexity and problems, but I have no choice: I am building a program that records frames from multiple cameras, and I want the cameras to be as autonomous as possible. That is why the two RGB cameras and the two RealSense cameras run in different threads, to ensure that a dropped frame on one camera does not imply a dropped frame on another. This works perfectly for the RGB cameras but not for the RealSense cameras.

In addition, your sample code (the one you suggest I use) also uses threads. I added a buffer and a separate consumer thread to save the images so that librealsense has every chance to deliver frames as soon as possible: as little time as possible passes between two "poll for frames" calls.

I can use your sample code, but I only need to save the depth and IR frames. I do not need the RGB stream, which can't be hardware synchronized, so I don't think the test would help.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Nov 8, 2024

Intel no longer provides technical support for genlock hardware sync (mode 4 and above). This is because genlock sync was an experimental 'proof of concept' and 'immature' feature that was not officially validated by Intel, whilst modes 1 and 2 are the officially supported and validated 'mature' sync modes. So it is unfortunately not possible to offer new support for diagnosing and debugging a genlock-based sync system, and existing references in past cases must be relied upon.

Do your RealSense sync cables have ESD (electrostatic discharge) protection components built into them? If ESD builds up over time and then discharges, it can reset the frame counter of the depth stream. ESD is especially likely to happen if the sync cables are long (more than 3 meters).

Building ESD protection into sync cables is described at the link below.

https://dev.intelrealsense.com/docs/multiple-depth-cameras-configuration#1-connecting-the-cameras

@RDVision-mchedefaux
Author

Yes, my cables do have ESD protection and are very short (10-20 cm). In addition, I don't use the frame counter at all. I detect frame loss by comparing each frame's timestamp to the last timestamp I received.
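The timestamp comparison can be sketched as follows. This is an illustration, not the actual detection code, which is not shown in this thread: `missed_frames` and `count_missed` are hypothetical names, timestamps are assumed to be in milliseconds, and the nominal period is assumed to be 100 ms (10 Hz). The gap is rounded to the nearest whole number of periods so that ordinary trigger jitter is not counted as a loss.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: estimates how many frames were lost between two
// consecutive timestamps, given the nominal frame period (100 ms at 10 Hz).
inline long missed_frames(double prev_ms, double curr_ms, double period_ms) {
    assert(period_ms > 0.0);
    // Number of nominal periods covered by the gap, rounded to absorb jitter.
    const long intervals = std::lround((curr_ms - prev_ms) / period_ms);
    // One interval is the normal case; each extra interval is one lost frame.
    return intervals > 1 ? intervals - 1 : 0;
}

// Hypothetical helper: total estimated losses over a recorded timestamp sequence.
inline long count_missed(const std::vector<double>& timestamps_ms, double period_ms) {
    long total = 0;
    for (std::size_t i = 1; i < timestamps_ms.size(); ++i)
        total += missed_frames(timestamps_ms[i - 1], timestamps_ms[i], period_ms);
    return total;
}
```

Running the same check on both RealSense timestamp logs and comparing where the gaps fall is one way to confirm that the losses are synchronized between the two cameras.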

@MartyG-RealSense
Collaborator

Have you tried syncing only one RGB camera and one RealSense camera to see whether the issue still occurs around the 2 hour point?

@MartyG-RealSense
Collaborator

Hi @RDVision-mchedefaux Do you require further assistance with this case, please? Thanks!
