-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggestion regarding further steps for speed estimation #18
Comments
So, If I understand you, you would like object ReID to improve the speed estimation, right? |
Although this scenario might be a future work. My current application involves the pedestrian coming from one direction and the vehicle is from other (perpendicular paths of the two), so there might be no scenario for now where the pedestrian goes out of frame. So, currently what I want is the pedestrian's speed. I am looking for Deep sort for some re-id tasks and to extract out the trajectories of the objects but I am getting bit confused how can I compute the speed or velocity. |
If you have an RGB-D camera, you can use the 3D version from this repo ( |
Hey @Rak-r, have you tried the speed estimation from Ultralitycs https://docs.ultralytics.com/guides/speed-estimation/? |
I think that example draws a line region and if object corsses it then it outputs the speed. However, in my case the robot is in motion meaning camera too in motion, I don't know how its going to handle in that scenario, so I am just using the 3D centre coordinates (which are real-world coordinates) and calculating the euclidean distance between the previous message and current message and for speed, divide it by the time stamp (delta_t in this case difference between old timestamp and current timestamp). I tested it with simulating an actor in gazebo it throws out some values but the actor speed was too low it seems it gave nearly (0.0085) kinda values. I will test it with some videos and also have to do some averaging for smoother values. Could you tell that this approach is ok? The motion of the camera is already getting handled as there is get_tranform() function and transform_3d() function which is handling this, so i don't have to find the relative velocity here. Also, I am attaching the script which I tried to wrote for this, I think I have to make the ` import rclpy class SpeedEstimate(Node):
def main(args=None): if name == 'main': ` |
Starting from your code, my idea was something similar to the following node. In this case, adding a speed field to the detection message to republish the 3D detected objects with the computed speed. But I have to test it. import math
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import Pose
from yolov8_msgs.msg import Detection
from yolov8_msgs.msg import DetectionArray
class SpeedEstimate(Node):
def __init__(self) -> None:
super().__init__("speed_estimate_node")
self.previous_detections = {}
self._pub = self.create_publisher(
DetectionArray, "detections_speed", 10)
self._sub = self.create_subscription(
DetectionArray, "detections_3d", self.detections_cb, 10)
def calculate_euclidian(self, old_pos: Pose, cur_pos: Pose) -> float:
return math.sqrt(
math.pow(old_pos.position.x - cur_pos.position.x, 2) +
math.pow(old_pos.position.y - cur_pos.position.y, 2) +
math.pow(old_pos.position.z - cur_pos.position.z, 2)
)
def detections_cb(self, detection_msg: DetectionArray) -> None:
timestamp = detection_msg.header.stamp.sec + \
detection_msg.header.stamp.nanosec / 1e9
detection: Detection
for detection in detection_msg.detections:
if detection.id in self.previous_detections:
old_pos = self.previous_detections[detection.id]["position"]
cur_pos = detection.bbox3d.center
dist = self.calculate_euclidian(old_pos, cur_pos)
t = timestamp - \
self.previous_detections[detection.id]["timestamp"]
detection.speed = dist / t
self.previous_detections[detection.id] = {
"timestamp": timestamp,
"position": detection.bbox3d.center
}
self._pub.publish(detection_msg)
def main():
rclpy.init()
node = SpeedEstimate()
rclpy.spin(node)
node.destroy_node()
rclpy.shutdown() Besides, in the case of objects that disappear and reappear, we can clean the previously detected dict with a time like this. def clean_detections(self) -> None:
cur_time = self.get_clock().now().nanoseconds / 1e9
for detection_id in self.previous_detections:
if cur_time - self.previous_detections[detection_id]["timestamp"] > 1:
del self.previous_detections[detection_id] |
Seems reasonable approach. If you could test it and find if something more could be considered, suggest some changes, then it will be great. I will try to test the scripts with videos on my end as well and adding the averager for smoother values. |
I have created a new branch with the changes for speed estimation. Running the 3D launch includes speed estimation. You can visualize it using the /yolo/dbg_image topic. |
Nice, but have you done some testing with it. Is it giving reasonable outputs?@mgonzs13 |
Yes, I have launched it. I am getting errors for the speed in m/s of immobile objects, maybe due to the bounding boxes but I have to review the code. Btw, it would be interesting to have a ground truth to compare the results. |
For the immobile/stactic objects, it should show nearly zero m/s. Thats the way we can verify it atleast for static one. For moving objects we can try on on some video stream might be not on ros side but genereic yolo style. We are clear on one thing, that we don't have perform transform as we are having 3d cordinates in world right? or we need to take into account the fps instead of ros2 message timestamp. Also might store some messages say(20) and get the averager for smoothing. What you suggest for this? @mgonzs13 |
I tried to test it with just simulation and I have attached he gif for the same. Also providing the topic echo for Also, tested with simulating actor, showing nearly zero, I dont know fully that the actor's simualtion speed is slow and it is looping from start everytime or how can we test it with more precise moving objects. One thing, why in the topic frame_id is getting to base_link while the capturing frame or input frame to yolo is camera_link? All these should be in camera_link , right? Just needed to reclarify. |
The I think that simulated actor could be a good way to get the ground truth to make tests. |
Yeah, that what I am trying but some how the gazebo guys don't provide to set the actor's speed. so that we can have some groundtruth |
I think we can obtain it by using the gazebo_ros_state plugin. You can enable it by adding the following lines to your world and then using the <plugin name="gazebo_ros_state" filename="libgazebo_ros_state.so">
<update_rate>1.0</update_rate>
</plugin> |
I have my stack build up with the new gazebo (https://gazebosim.org/docs/garden/actors) not with gazebo 11. I am finding something similar in the new one but for actor I couldn't find one till now. There is odometry publisher plugin which publishes odometry for any entity but again for the actors you can't apply that plugin. it should be a model. @mgonzs13 |
Are you using Ignition? Do you have a world with objects, actors and a robot to test this? |
Yes, I am using Gazebo but have the same setup for ignition as well with actors, robot mounted with simulated RGB-D camera and objects that's the same one which I sent over in the 2 gifs |
I am working with speed stuff. However I found that the callback in speed estimation node sometimes running twice which is making the timestamp difference double and ultimately wrong values. The desired value should be 0.037 something like this but it is getting to 0.066, 0.1000 I have attached the screenshot which debug flag, the data showing timestamp when entering the callback which is the time stamp of Also, in calculate euclidean, isn't it |
Where are you printing that information? Regarding the Euclidean distance, the order of the positions does not matter. I mean old-cur distance should be the same as cur-old distance. |
I have added logger to the speed estimation node. Yes, it will be same when talking about magnitude but if vectors then I believe it should be Anyways the timing seems an issue here? @mgonzs13
|
Take into account that you are printing it inside the loop. How many objects are detected? You may be getting the information printed twice because you are detecting two objects. |
I am just detecting person class. So it should be only one object. But how that will going to effect time stamp difference to change? @mgonzs13 Could you try on your end? Also could you suggest something here, i am finding velocities in x,y and z. For that subtracting previous point from current recieved point
because when I print the velocities i observe some sudden jumps. How could I handle this amd correct this. One more thing, I think I have already know this but still we dont have to do any transform or relative velocities right because we have 3d points of the objects relative to the camera or whatever frame id we specify, right? |
I'll try it. Yes, all 3D data is already transformed to the frame of the launch file. |
Sorry but i am not getting it clearly using two objects meaning? When i point the camera towards me then it starts publishing. I have set like
|
I am detecting myself as I wrote in that if condition but according to what you just explained, there might be some other object in the background getting detected so the time stamp of that object is getting printed along with my timestamp? |
Yes, that's what I think could be happening when you run it. You can check it with rviz2 and the debug topic or echo the detection topics. Btw, do you want the velocity for each axis? |
Yes. What I need is to predict the trajectory of the person fir that using velocities instead of magnitude. I am also thinking to implement the Kalman filter for this. Would you mind suggesting with this. Then I can file the PR and we can more to this repo. |
I am not using the UUID. Instead, I continue using my messages which include the velocities in a Twist msg. The initial position that I am using is the position of the object at its first detection. |
Ok. If we want to use the tracker from that repo,I am just wondering what to add in UUID field as I think the tracker subscribes to Obstacle message. Have you test the data after adding the cv2 based one? Does it seems reliable velocities or sudden jumps when there are more than one object? |
I am not sure you will get better results with the tracker of the repo since bytetracker also uses Kalman filter. |
Ok. I will do some testing tomorrow and will update here. You haven't experienced any jumps? from the KF before using the cv2 Kalman filter. |
Hi @mgonzs13 , have you got some time to test the velocity stuff? I have been looking for some groundtruth from that gazebo actor but didn't found much. You were mentioning about the actor state in ignition. Could you provide some brief for it. |
I haven't time to test it further. On the groundtruth, I suggested using the gazebo_ros_state plugin, which can be enabled by adding some lines to the gazebo world. Then, using the /get_entity_state service returns the position and twist data for a given object in the simulation. |
But I think its not the same for ignition gazebo? |
I don't know if it is the same or has something equivalent. I have only used the plunging in Gazebo. |
Yeah. I got it. Let me know whenever you can try to test it a bit more. Till the time I am doing it on my end. Could you suggest how to add the orientation. I was reading about getting cartesian to polar cordinates? or there is something else needs to be considered. |
I'm not sure how to obtain the orientation. Ultralytics uses OBB, but it's 2D. |
If I want to use 2D information (x and y) in (meters)real -world cordinates of detected/tracked object and then compute the velocity. So to achieve just simply ignore the z from the 3D centre point? @mgonzs13 |
Yes, if you're only interested in calculating the velocity of an object based on its position in 2D space, you can ignore the depth coordinate. |
Hi @mgonzs13 was just testing the speed stuff with me walking infront of the camera. I think I getting confused here but could you tell that if my camera is mounted on the robot and in the node i have provided target frame as camera link and not base link. Will this report different values? Because I am not publishing static tf between base link and camera link. Basically I was recording a bag file so wanted to verify that will the data going be different if no transform is there and in actual camera is mounted on robot? I think it shouldn't. Also saw speed values, I see when camera in motion there are noisy values below 0.1 m/s looks ok but sometimes it touches 0.3 for stationary objects. I also see sometimes detection is lost. I think this also affects the data. I also visualised speed for person only both with robot stationary and moving. It shows 0.5 -0.8 m/s with some noise reaching till 1.2-1.3 m/s. Could you suggest some more tricks to make it stable when camera in motion? Can we do the average now of (say last 10 speed messages received)? |
The target frame should not affect the velocity unless the robot is moving. In that case, you should use a global frame such as odom or map, as the target frame. |
Yes, The robot was moving. I think that's why it affected. I just read about it. |
Hi @mgonzs13 , I was testing the velocity estimation, but it gives some jumps for static objects. Did you observed similar pattern? Could you advice some potential reasons or how to minimize this. I was looking for implementing EKF but could not find some approach to begin and how to implement it properly. If you could help with this towards getting more reasonable values, it would be highly appreciated. Also, to remove that callback running twice because of detecting more than one object, the number of classes to person, so that the detector will only report messages for person class this can be changed in |
Hey @Rak-r, I haven't time to perform more tests on this. For the EKF, I suppose the KalmanFilter from cv2 can be used with some edits. Are you using my approach or which code are you running? |
Hi @mgonzs13 , I am using the same what we have discussed this far and from the speed estimation branch with cv2.kalman filter which is set to true by default. If I do not use kalman result is more noisy. If you find time to test it for noise, kindly let me know. |
Hi @mgonzs13 , I am trying to implement the EKF for handling non-linealrities if person do not move in a straight line path. I found this https://github.com/hvzzzz/Vehicle_Trajectory_Dataset/blob/interaction/scripts/tools/kalman_filters.py but getting a bit confused how to integrate in the code. Also for orientation, we have pose.x and pose.y from Bbox3D , we can find the polar cordinates from this and feed in the EKF update step to estimate it? Another thing, I wanted to ask, if I need the predicted future trajectories of the Person (from existing code which assumes constant velcotiy model), right now I am extracting it by iterating over the kf object say for 10 future predicted trajectory which nearly be linear (straight line) because the predict step is does same as this
|
Also, I often get this warning:
|
Hi @Rak-r, obtaining the orientation would be a great feature. I have to check that code more deeply. About the warning, let me check what is happening. |
What about the future trajectories? Is that the right approach.@mgonzs13 |
It seems reasonable to me though I have never done it. |
Well assuming constant velocity model and lets say person just follows linear straight path. For this particular, can just take current positions and represent as a line in slope intercept form. Let me know whenever you test the code for ekf if you have time. |
Hello. I am also implementing an Extended Kalman Filter (EKF) from a Kalman filter, but I am facing some issues. |
Hey, we are also discussing the similar approach. I have shared a link above which implements both simple KF and EKF. I was thinking to integrate EKF to the speed estimation but getting confused.@Booooms |
I'm confused because there are many links, what is it? |
Yes, it is the only in the discussion for EKF. |
This issue is stale because it has been open for 30 days with no activity. |
I have tried to setup the stack with Intel realsense D435 and I have checked that topics are susbcribed by the yolo nodes and I can view the camera rgb feed and depth feed in rviz2.
What I want now, is after having the 3d data about the objects, I want to do speed estimation of the detected and tracked objects (let's say person's speed). For this, using yolo default trackers byte track/botosrt, they don't assign re-d to the object which leaves the frame. I want to take some suggestion.
track_id = 1
and it leaves the frame and comes back, tracker will agin assign a new id but will this affect the speed estimation because as soon as the object reappears the speed estimation will start from fresh but will report the correct speed data of that object.The text was updated successfully, but these errors were encountered: