Skip to content

Semantic Segmentation of Images and Point Clouds for Traversability Estimation

License

Notifications You must be signed in to change notification settings

ctu-vras/traversability_estimation

Repository files navigation

Traversability Estimation

Segmentation of Images and Point Clouds for Traversability Estimation.

The module assigns to individual measured LiDAR points either a binary flag or non-negative cost of passing the point by a given robot. Exit from this module could be used as one of the inputs of the planning module. Its use allows planning safe paths around obstacles in a mostly static environment. Example of the module being used in navigation pipeline: video, in simulator.

Installation

Please, follow the instructions in ./docs/install.md.

The node takes as input RGB images and intrinsics (from several cameras as well is supported). Output is a set of semantic labels for each input RGB image.

Topics:

  • input_0/rgb, ... input_{num_cameras - 1}/rgb
  • input_0/camera_info, ... input_{num_cameras - 1}/camera_info
  • output_0/semseg,... output_{num_cameras - 1}/semseg
  • output_0/camera_info,... output_{num_cameras - 1}/camera_info

Parameters:

  • num_cameras [int] - number of image topics for segmentation
  • device [str] - cpu/cuda
  • legend [bool] - if legend for segmentation is required
  • image_transport [str] - 'compressed' or 'raw' if input image topic is compressed

Look at segmentation_inferece for more details.

Point Cloud Semantic Segmentation Node

The node takes as input a point cloud from which a depth image is calculated. The depth image is an input for semantic segmentation network (2D-convolutions based) from torchvision.models.segmentation. The network predicts semantic label for each pixel in the depth image. The labels are futher used to output segmented point cloud.

Topics:

  • cloud_in: input point cloud to subscribe to
  • cloud_out: returned segmented point cloud

Parameters:

  • device: device to run tensor operations on: cpu or cuda
  • max_age: maximum allowed time delay for point clouds time stamps to be processed
  • range_projection [bool]: whether to perform point cloud projection to range image inside a node
  • lidar_channels: number of lidar channels of input point cloud (for instance 32 or 64)
  • lidar_beams: number of lidar beams of input point cloud (for instance 1024 or 2048)
  • lidar_fov_up: LiDAR sensor vertical field of view (from X-axis to Z-axis direction)
  • lidar_fov_down: LiDAR sensor vertical field of view (from X-axis against Z-axis direction)
  • weights: name of torch weights file (*.pth), located in ./config/weights/depth_cloud/ folder
  • cloud_in: topic name to subscribe to (point cloud being segmented)
  • clou_out: topic name to publish segmented cloud to
  • debug: whether to publish debug information (for example range image): may slow down the node performance.

Look at cloud_segmentation for more details.

Data

Bag-file sequences are available at: http://subtdata.felk.cvut.cz/robingas/data/. Please, refer to the monoforce package for more details.

Models Training

The following scripts should be run from the ./scripts/tools/ folder:

roscd traversability_estimation/scripts/tools/

Train point cloud segmentation model to predict traversability labels on SemanticKITTI and SemanticUSL data:

python train_depth --datasets SemanticKITTI SemanticUSL --batch_size 4 --output traversability 

Train image segmentation model on Rellis3D data:

python train_img --dataset Rellis3DImages --batch_size 4 --architecture fcn_resnet50 

Models Evaluation

Evaluate (get IoU score) a point cloud segmentation model trained on TraversabilityClouds data:

python eval_depth --dataset TraversabilityClouds --weights /path/to/deeplabv3_resnet101_lr_0.0001_bs_8_epoch_90_TraversabilityClouds_depth_labels_traversability_iou_0.972.pth --output traversability

3D-Point Cloud Semantic Segmentation Node

The node takes as input a point cloud and utilizes a 3D-convolutions-based network to predict semantic labels for each point. The model is trained using the semi-supervised technique for domain adaptation proposed in T-Concord3D project. Rellis3DClouds data was utilized as the source dataset, while TraversabilityDataset was utilized as the target dataset (using pseudo-labels). Please, have a look at the T-Concord3D project repository for more details about the training procedure.

Topics:

  • cloud_in: input point cloud to subscribe to
  • cloud_out: returned segmented point cloud

Parameters:

  • device: device to run tensor operations on: cpu or cuda
  • max_age: maximum allowed time delay for point clouds time stamps to be processed
  • weights: name of torch weights file (*.pt), located in ./config/weights/t-concord3d/ folder
  • cloud_in: topic name to subscribe to (point cloud being segmented)
  • clou_out: topic name to publish segmented cloud to

Look at cloud_segmentation_tconcord3d for more details.

Geometric Traversability Node

Manually designed geometric features describing the local neighborhood of points based on:

  • estimation of slope (inclination angles) of supporting terrain,
  • estimation of step of supporting terrain.

For more information, please, refer to traversability node implemented in the naex package.

Fused Traversability Node

Method which combines geometric and semantic traversability results. Definitely passable and definitely impassable costs assigned to points values were defined on the basis of geometrical traversability. In the rest of the area (part of point cloud), especially in vegetation where geometrical approach cannot be applied, a model learned from the data was used (semantic traversability estimation).

Topics:

  • geometric_traversability: input point cloud to subscribe to containing geometric traversability information
  • semantic_traversability: input point cloud to subscribe to containing semantic traversability information
  • fused_traversability: output point cloud topic to be published containing resultant traversability information

Parameters:

  • fixed_frame: name of the coordinate frame to consider constant in time to find transformation between semantic and geometric clouds frames
  • trigger: one of "both", "geometric", "semantic", or "timer": defines when to perform traversability cost fusion based on availability of actual geometric or semantic data
  • sync: whether to use approximate time synchronizer or fuse latest available geometric and semantic messages
  • max_time_diff: maximum allowed time difference between semantic and geometric messages
  • dist_th: maximum allowed distance between closest points from geometric and semantic clouds
  • flat_cost_th: lower value of geometrical traversability cost starting from which seamntic traversability is used
  • obstacle_cost_th: higher value of geometrical traversability cost starting from which seamntic traversability is not used
  • semantic_cost_offset: value to add to semantic traversability cost (in the range it's being utilized)
  • timeout: time to wait for the target frame to become available (when looking for transformation between geometric and semantic clouds frames)

Look at traversability_fusion for more details.

Demos

  • Semantic segmentation of images from RELLIS-3D dataset with HRNet:

    roslaunch traversability_estimation image_segmentation_dataset_demo.launch model_name:=hrnet
  • Semantic segmentation of point clouds from RELLIS-3D dataset:

    roslaunch traversability_estimation traversability_dataset_demo.launch traversability:=semantic
  • Coloring lidar cloud using calibrated cameras and semantic classes:

    • Clone and build the point_cloud_color package.
    • Run demo:
      roslaunch traversability_estimation color_pc_bagfile_demo.launch

Citation

Please, cite the papers if you find the package relevant to your research:

@ARTICLE{9699042,
  author={Agishev, Ruslan and Petříček, Tomáš and Zimmermann, Karel},
  journal={IEEE Robotics and Automation Letters},
  title={Trajectory Optimization Using Learned Robot-Terrain Interaction Model in Exploration of Large Subterranean Environments},
  year={2022},
  volume={7},
  number={2},
  pages={3365-3371},
  doi={10.1109/LRA.2022.3147332}
}
@inproceedings{agishev2024monoforce,
    title={MonoForce: Self-supervised Learning of Physics-informed Model for Predicting Robot-terrain Interaction},
    author={Ruslan Agishev and Karel Zimmermann and Vladimír Kubelka and Martin Pecka and Tomáš Svoboda},
    booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems - IROS},
    year={2024},
    eprint={2309.09007},
    archivePrefix={arXiv},
    primaryClass={cs.RO},
    url={https://arxiv.org/abs/2309.09007},
}