This repository contains the code for the paper "A Multimodal Handover Failure Detection Dataset and Baselines", published at the 2024 IEEE International Conference on Robotics and Automation (ICRA). A link to the paper and dataset can be found here.
See requirements.txt for the required Python packages.
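As a quick sanity check before training, the sketch below reports which packages listed in a requirements file are not installed in the current environment. It is a minimal illustration, not part of this repository: the helper name `missing_requirements` is hypothetical, and it assumes the file uses simple `name==version` style entries (URL or VCS requirements are not handled).

```python
import re
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def missing_requirements(path="requirements.txt"):
    """Return the package names in a requirements file that are not installed.

    Hypothetical helper: only simple specifiers such as `torch==2.0.0`
    or `numpy>=1.20` are understood.
    """
    missing = []
    for raw in Path(path).read_text().splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as `-r other.txt`.
        if not line or line.startswith("-"):
            continue
        # Keep only the distribution name, cutting at the first
        # version operator, extras bracket, marker, or space.
        name = re.split(r"[<>=!~;\[@ ]", line, maxsplit=1)[0]
        if not name:
            continue
        try:
            version(name)  # raises if the distribution is not installed
        except PackageNotFoundError:
            missing.append(name)
    return missing
```

Calling `missing_requirements()` from the repository root should return an empty list once the dependencies are installed, for example with `pip install -r requirements.txt`. The check only confirms that each package is present, not that its installed version satisfies the pinned specifier.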
Code for both the video classification and human action segmentation methods is included.
- Video Classification
We use the implementation of the I3D model from here. See pytorch-i3d-trainer/README.md for instructions on how to train and evaluate the different variants mentioned in the paper.
- Human Action Segmentation
We use the implementation of the MS-TCN model from here. See ms-tcn/README.md for instructions on how to train and evaluate the different variants mentioned in the paper.
Please cite the paper as follows:
@inproceedings{thoduka2024_icra,
  author = {Thoduka, Santosh and Hochgeschwender, Nico and Gall, Juergen and Pl\"{o}ger, Paul G.},
  title = {{A Multimodal Handover Failure Detection Dataset and Baselines}},
  booktitle = {2024 IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2024},
  pages = {17013--17019},
  doi = {10.1109/ICRA57147.2024.10610143}
}