[Feature-Request] HF-Net #1215

borongyuan · 2024-02-06T11:00:26Z

@matlabbe @cdb0y511
Hi, while looking further into SuperPoint-related work, I noticed that HF-Net could also be added. It provides global descriptors for coarse localization in addition to SuperPoint-style local features.

RTAB-Map reserves an interface for global descriptors, but does not implement the related functions (#1105). So I am wondering whether this part can be implemented generically, and how to combine it with RTAB-Map's existing memory management mechanism. My initial feeling is that it can help retrieval first, and may later be used to improve loop closure hypotheses.

The difficulty in integrating HF-Net is that it is implemented in TensorFlow 1, and I only have a few old devices with the required environment. So I converted it to ONNX first. But I'm more curious about whether it will work on OAK. That would be really fun if it could. The model file looks a bit large; I guess this is mainly because of the fully connected layer at the end of the global head. Even if OAK can't run it, I will try to prune it into a MobileNet-based SuperPoint implementation. Some people have tried to train SuperPoint based on MobileNet directly but did not achieve good results. If OAK can run it, I will not develop a host-side inference solution for the time being, but you are welcome to develop one if you are interested. This is the ONNX file that I converted but haven't tested yet.
https://drive.google.com/file/d/1xKJi1RfKZTLQ0JACD4H76HRpxr6kFgtn/view?usp=sharing

hellovuong · 2024-02-08T10:14:52Z

Hi, I am also looking at implementing a global descriptor for rtabmap. Some repos already implement HF-Net with TensorRT, and I have tested one. Should we consider that? It should not be too hard to add to the rtabmap codebase. I just don't know how to integrate it with RTAB-Map's memory management for image retrieval.

matlabbe · 2024-02-08T22:31:01Z

As explained in #1105, we added an interface to feed a global descriptor (it can be in any format; see also rtabmap_python to easily compress/uncompress numpy matrices) to the rtabmap node at the same time as image or lidar data. For images, the easiest way would be to make a node that combines the image topics into an RGBDImage topic (including the global descriptor field), connected as input to the rtabmap node (with subscribe_rgbd:=true). The global descriptor will be saved in the database for each node added to the map's graph.

Currently, there is no internal loop closure detector based on global descriptors. An external loop closure detector can get the global descriptors of all nodes in WM (and LTM) by calling the service /rtabmap/get_map_data (with graphOnly=true, you get only the feature data and avoid downloading all images). To handle memory management (when used), that external loop closure node would also need to subscribe to the rtabmap/mapData topic to get the GlobalDescriptor linked to the latest added node (ID), or to any nodes retrieved from LTM that were not downloaded on start. When a loop closure is detected, you can call the /rtabmap/add_link service to add the constraint to the internal graph. This is roughly how the cmr_lidarloop approach did it with a lidar global descriptor.
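The external-detector flow described above could be sketched roughly as follows. This is a ROS-free illustration only: `ExternalLoopDetector`, `load_map`, `on_map_data`, the `add_link` callable, and both thresholds are hypothetical names and values, standing in for the /rtabmap/get_map_data call, the rtabmap/mapData subscription, and the /rtabmap/add_link call. The real service request carries more than the two IDs shown here.

```python
import numpy as np


class ExternalLoopDetector:
    """Bookkeeping sketch for an external global-descriptor loop detector.

    `add_link` is a plain callable standing in for a /rtabmap/add_link call.
    """

    def __init__(self, add_link, min_similarity=0.8, min_id_gap=10):
        self.add_link = add_link
        self.min_similarity = min_similarity  # assumed threshold, tune per descriptor
        self.min_id_gap = min_id_gap          # skip nodes that are temporally adjacent
        self.descriptors = {}                 # node ID -> unit-length descriptor

    @staticmethod
    def _unit(desc):
        # Assumes a non-zero descriptor.
        v = np.asarray(desc, dtype=np.float32)
        return v / np.linalg.norm(v)

    def load_map(self, nodes):
        """Seed the index once at start (stand-in for /rtabmap/get_map_data)."""
        for node_id, desc in nodes.items():
            self.descriptors[node_id] = self._unit(desc)

    def on_map_data(self, node_id, desc):
        """Handle one new or retrieved node (stand-in for a rtabmap/mapData callback).

        Returns the matched node ID, or None when nothing is similar enough.
        """
        query = self._unit(desc)
        best_id, best_sim = None, self.min_similarity
        for other_id, other in self.descriptors.items():
            if abs(node_id - other_id) < self.min_id_gap:
                continue
            sim = float(query @ other)  # cosine similarity of unit vectors
            if sim > best_sim:
                best_id, best_sim = other_id, sim
        self.descriptors[node_id] = query
        if best_id is not None:
            self.add_link(node_id, best_id, best_sim)
        return best_id
```

In a real node, `load_map` would be fed from the service response and `on_map_data` from the topic callback, so nodes retrieved from LTM enter the index the same way as newly added ones.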

Back in the day, when we added the GlobalDescriptor table to the database, the goal was indeed to add NETVLAD global descriptor support to improve/combine with the actual loop closure detection done inside RTAB-Map (currently based only on local visual descriptors, a.k.a. the bag-of-words approach). Currently, external loop closure detection seems to be the most flexible option for any loop closure detection approach, though it requires ROS/ROS2. I guess to do it internally in the standalone version we would need to add a python global descriptor approach (like we did for external python ML local keypoints/descriptors or for ML feature matching) to avoid re-implementing in C++ every new ML global descriptor that comes up. It has been a while since I read on the subject, but is there a common way to do global descriptor matching in the current state of the art that could be used with many global descriptors? Is just a naive nearest neighbor approach between global descriptors enough to find the closest ones? If so, it could be worth putting in the time to implement it inside RTAB-Map so that we can better integrate loop closure hypotheses between global descriptors and BOW (combine them or switch between them).

I think we would not have to modify the memory management approach, as it is based on the current actual loop closure hypotheses to know which ones to retrieve from LTM first (loop closure hypotheses would already include the score of the global descriptor matching).

borongyuan · 2024-02-18T06:38:44Z

> It has been a while since I read on the subject, but is there a common way to do global descriptor matching in current state-of-the-art that could be used with many global descriptors? Is just a naive nearest neighbor approach between global descriptors enough to find the closest ones?

In fact, I was fairly vague about this part of the concept before. I just tend to introduce deep learning methods in loop closure detection, because they should adapt better to changing environments. For odometry I tend to use traditional methods, because that part is well modeled. I'm also curious about what form the global descriptor can take. cmr_lidarloop and VLAD both use feature vectors, and KNN is used for matching. I did some rough searches and there didn't seem to be any better options for matching. The summary here can be referred to. Therefore, I think we can first implement global retrieval based on KNN.
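As a minimal sketch of what KNN retrieval over vector-form global descriptors could look like, the brute-force version below ranks stored descriptors by cosine similarity. The function name `knn_global_descriptors` and the choice of cosine similarity are assumptions for illustration, not RTAB-Map's implementation; a k-d tree library would replace the exhaustive search for large maps.

```python
import numpy as np


def knn_global_descriptors(query, database, k=3):
    """Return (indices, similarities) of the k stored descriptors closest to `query`.

    `database` is an (N, D) array with one global descriptor per row.
    Rows and query are assumed to be non-zero.
    """
    db = np.asarray(database, dtype=np.float32)
    q = np.asarray(query, dtype=np.float32)
    # L2-normalize so that the dot product equals cosine similarity.
    db = db / np.linalg.norm(db, axis=1, keepdims=True)
    q = q / np.linalg.norm(q)
    sims = db @ q
    k = min(k, sims.shape[0])
    order = np.argsort(-sims)[:k]  # highest similarity first
    return order, sims[order]
```

The returned similarities could then be thresholded or merged with other loop closure scores; how to combine them is left open here.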

My plan is to add nanoflann support first (#906). This should help with both performance and cross-platform usage. Then I can add ANMS along the way (#1127).

> I guess to do it internally in standalone version we would need to add python global descriptor approach (like we did for external python ML local keypoints/descriptors or for ML feature matching) to avoid re-implementing in c++ every new ML global descriptors coming up.

Using onnxruntime seems to be much more convenient. But I'm new to dotnet.

borongyuan · 2024-02-19T03:56:56Z

> My plan is to add nanoflann support first #906.

The biggest obstacle to adding nanoflann is maintaining compatibility of parameters. Adding it to the end of GMS can lead to confusing logic. Any suggestions?

matlabbe · 2024-02-20T00:01:37Z

I think matching global descriptor with KNN could be a good first step. For nanoflann implementation, I left a new comment on #906 on how that could be integrated.

borongyuan · 2024-03-01T13:04:03Z

Good news, HF-Net now works on OAK cameras. Performance in preliminary tests looks pretty good, slightly faster than SuperPoint with 320×200 input. If the global head is removed, it has acceptable real-time performance even with 640×400 input. I won't provide this cropped model because we definitely want to use the full HF-Net. You can download the converted blob file from my new repository.
https://github.com/Factor-Robotics/depthai-hfnet/raw/main/blobs/hfnet_200x320_5shave.blob

It should be noted that I did not use the previous ONNX file for conversion, but took a different route. I converted the TensorFlow model directly to OpenVINO IR and then compiled it into a blob file. I guess that this may avoid some model conversion problems, such as the FP16 overflow issue that the SuperPoint model encountered in L2 Norm. Below is the OpenVINO IR visualized in Netron. You can see that in the previous ONNX file L2 Norm was broken down into a series of basic operations, while in OpenVINO IR it is a single NormalizeL2 layer.

I've added local head part in #1193 and it looks good. It is also easy to add the global head part, but I am a little unclear about your previous definition of GlobalDescriptor. I know how to fill data_, but what exactly do type_ and info_ refer to? Should type=0 for cmr_lidarloop, type=1 for HF-Net/NetVLAD? Or type=0 for all descriptors in vector form?

matlabbe · 2024-03-01T18:40:45Z

That looks great! Note that #1193 has been in Draft mode for a while, just checking if it is on purpose or it could be ready to be reviewed/merged.

For GlobalDescriptor:

rtabmap/corelib/include/rtabmap/core/GlobalDescriptor.h, lines 54 to 56 in 33b875e:

    int type_;
    cv::Mat info_;
    cv::Mat data_;

These fields don't have a fixed purpose yet. The CMR loop closure detector is using type=0 on their side. I would keep type=0 for types that rtabmap cannot handle internally (e.g., a user-defined external descriptor like CMR). We could officially set the NETVLAD descriptor as type=1. The info field is optional; it was used for CMR to describe the vectors inside the data field. For NETVLAD, we could just set the type=1 and data fields.
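The proposed convention can be written down as a small sketch. This is a Python stand-in for illustration only: `GlobalDescriptorSketch`, `TYPE_EXTERNAL`, `TYPE_NETVLAD`, and `handled_internally` are hypothetical names, and the type values reflect the proposal in this comment rather than anything fixed in rtabmap.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np

# Hypothetical constant names for the proposed convention.
TYPE_EXTERNAL = 0  # user-defined descriptor that rtabmap cannot interpret (e.g. CMR)
TYPE_NETVLAD = 1   # proposed value for NetVLAD-style vector descriptors


@dataclass
class GlobalDescriptorSketch:
    """Mirrors the three fields type_, info_ and data_ of GlobalDescriptor."""
    type: int
    data: np.ndarray                   # the descriptor itself
    info: Optional[np.ndarray] = None  # optional description of what `data` contains


def handled_internally(desc):
    """True when an internal matcher could interpret the descriptor under this convention."""
    return desc.type == TYPE_NETVLAD
```

Under this reading, an HF-Net global descriptor would be stored with only the type and data fields set, while an external detector keeps using type 0 with its own info layout.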

borongyuan · 2024-03-02T12:56:31Z

I filled the global descriptor. Unfortunately this part of the data still looks wrong. All values are close to 0.015625. This means that a numerical overflow occurred somewhere. I need to debug the model again. My main suspicion is the ReduceSum layer. It's often dangerous in FP16 inference. ReduceSum was also generated when L2 Norm was decomposed in the previous SuperPoint model.

I submitted the code first because after debugging the model I only needed to update the blob file. The model may require scaling in some parts.
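To illustrate why a ReduceSum inside a decomposed L2 Norm is risky in FP16, and what scaling might look like, here is a small NumPy sketch. It only emulates half precision on the host; the function names are hypothetical and the input values are chosen purely to trigger the overflow (float16 tops out at 65504).

```python
import numpy as np


def l2_normalize_fp16(x):
    """Decomposed L2 norm (Square -> ReduceSum -> Sqrt -> Div), all in float16."""
    x = np.asarray(x).astype(np.float16)
    with np.errstate(over="ignore"):
        sq_sum = np.sum(x * x, dtype=np.float16)  # becomes inf above 65504
    return x / np.sqrt(sq_sum)


def l2_normalize_fp16_scaled(x):
    """Same computation, but the input is divided by its largest magnitude first."""
    x = np.asarray(x).astype(np.float16)
    scale = np.max(np.abs(x))
    if scale == 0:
        return x  # nothing to normalize
    y = x / scale  # every element is now within [-1, 1]
    sq_sum = np.sum(y * y, dtype=np.float16)
    return y / np.sqrt(sq_sum)


x = np.full(1024, 10.0)          # sum of squares is 102400, beyond the float16 range
naive = l2_normalize_fp16(x)     # the sum overflows to inf, so every element becomes 0
safe = l2_normalize_fp16_scaled(x)  # every element is 1/32
```

Because the L2 norm is invariant to a positive rescaling of its input, the scaled variant gives the intended result whenever the intermediate sum stays in range.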

borongyuan · 2024-03-16T14:38:30Z

I have updated the blob files. The real problem with the global head is caused by NormalizeL2 in OpenVINO. After intra-normalization, all elements are either 1 or -1. This is not data overflow or underflow, but rather a failure to sum along the specified dim. I managed to work around the problem, and now all outputs look reasonable.
However, there seems to be another issue with this part. HF-Net's intra-normalization seems to use the wrong dim (ethz-asl/hfnet#71). Even so, the model may still give correct results, because it is distilled from NetVLAD rather than trained directly. It is just running in an unexpected way. We will verify the results later.
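The symptom of all elements being 1 or -1 is what one gets when the norm is taken per element instead of being reduced along an axis. A minimal NumPy sketch, assuming a hypothetical cluster-major layout of shape (K, D) and hypothetical function names:

```python
import numpy as np


def intra_normalize(vlad, axis):
    """L2-normalize `vlad` along `axis`, as NetVLAD-style intra-normalization does."""
    norm = np.linalg.norm(vlad, axis=axis, keepdims=True)
    return vlad / norm


def normalize_without_reduction(vlad):
    """What happens if the sum over the axis is skipped: each element is divided by |itself|."""
    return vlad / np.abs(vlad)


rng = np.random.default_rng(0)
vlad = rng.standard_normal((4, 8))          # hypothetical K=4 clusters, D=8 dims

per_cluster = intra_normalize(vlad, axis=1)  # each row has unit length
per_dim = intra_normalize(vlad, axis=0)      # each column has unit length instead
broken = normalize_without_reduction(vlad)   # every entry is exactly +1 or -1
```

The `axis` argument also shows how a wrong dim changes the result without producing an obviously invalid output; which axis HF-Net should use is the question raised in ethz-asl/hfnet#71 and is not settled by this sketch.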

borongyuan · 2024-04-14T02:20:41Z

Closing since HF-Net inference has been implemented. Future work will explore the use of the global descriptor.

borongyuan mentioned this issue Mar 2, 2024

Wants depthai-core v2.24.0 in ROS repository luxonis/depthai-ros#506

Closed

matlabbe mentioned this issue Mar 3, 2024

GlobalDescriptor (PyDescriptor / netvlad) #1163

Merged

borongyuan mentioned this issue Apr 13, 2024

copy global descriptor #1263

Merged

borongyuan closed this as completed Apr 14, 2024
