In this repository, we share our implementation of several camera pose regression loss functions in a simple end-to-end network similar to PoseNet. We implemented the homography-based loss functions introduced in our paper alongside PoseNet, Homoscedastic, Geometric and DSAC loss functions. We provide the code to train and test the network on Cambridge, 7-Scenes and custom COLMAP datasets.
Our paper Homography-Based Loss Function for Camera Pose Regression is published in IEEE Robotics and Automation Letters 2022.
Convergence of our proposed Homography loss
We show the convergence of the other losses on our YouTube channel.
This code relies on COLMAP for loading COLMAP models. To satisfy this dependency, simply run:
git submodule update --init
We share an Anaconda environment that can be easily installed by running:
conda env create -f environment.yml
Anaconda is easy to install, and a lighter distribution named Miniconda is also available.
Once the environment is installed you can activate it by running:
conda activate homographyloss
Have a look at the datasets folder to set up the datasets.
The script main.py trains the network on a given scene and logs the performance of the model on the train set. It requires one positional argument: the path to the scene on which to train the model. For example, for training the model on the ShopFacade scene, simply run:
python main.py datasets/ShopFacade
Let's say you have a custom dataset in datasets/mydataset with the structure defined in datasets:
- mydataset
  - images
    - frame001.jpg
    - frame002.jpg
    - frame003.jpg
    - ...
  - cameras.bin
  - images.bin
  - points3D.bin
  - list_db.txt
  - list_query.txt
Then you can run the script on your custom dataset:
python main.py datasets/mydataset
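Before launching a training run on a custom dataset, it can help to check that the scene folder contains the entries listed above. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and it assumes that the COLMAP binaries and the two list files sit at the root of the scene folder, next to the images directory.

```python
from pathlib import Path

# Files expected in the scene folder (assumption: they sit at its root).
REQUIRED_FILES = (
    "cameras.bin",
    "images.bin",
    "points3D.bin",
    "list_db.txt",
    "list_query.txt",
)


def missing_entries(scene_dir):
    """Return the names of expected files or folders absent from scene_dir."""
    scene = Path(scene_dir)
    missing = [name for name in REQUIRED_FILES if not (scene / name).is_file()]
    if not (scene / "images").is_dir():
        missing.append("images/")
    return missing


if __name__ == "__main__":
    import sys

    # Usage: python check_scene.py datasets/mydataset
    scene_path = sys.argv[1] if len(sys.argv) > 1 else "datasets/mydataset"
    problems = missing_entries(scene_path)
    if problems:
        print("Missing entries:", ", ".join(problems))
    else:
        print("Scene folder looks complete.")
```

An empty list means every expected entry was found; the check does not validate the content of the files.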
Other available training options can be listed by running python main.py -h.
Training and test metrics are saved in a logs directory. You can monitor them using TensorBoard by running the following in a new terminal:
tensorboard --logdir logs
All estimated poses are also saved in a CSV file in logs/[scene]/[loss]/epochs_poses_log.csv.
For each epoch, each image and each set, we save the estimated pose in the following format:
- w_t_chat is the camera-to-world translation of the image.
- chat_q_w is the world-to-camera quaternion representing the rotation of the image.
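Because the translation is expressed camera-to-world while the quaternion is world-to-camera, the rotation has to be inverted before both can be combined into a single camera-to-world transform. The sketch below illustrates this in pure Python. It is not taken from the repository: the function names are hypothetical, and it assumes a unit quaternion stored in (w, x, y, z) order, which should be checked against the actual column layout of the CSV file.

```python
import math


def quaternion_to_rotation(q):
    """Return the 3x3 rotation matrix of a quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero-norm quaternion")
    # Normalize so that slightly non-unit quaternions still give a rotation.
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def camera_to_world_matrix(w_t_chat, chat_q_w):
    """Build a 4x4 camera-to-world matrix from the logged pose.

    w_t_chat: camera-to-world translation (x, y, z).
    chat_q_w: world-to-camera quaternion, assumed to be in (w, x, y, z) order.
    """
    r_cw = quaternion_to_rotation(chat_q_w)
    # The inverse of a rotation matrix is its transpose.
    r_wc = [[r_cw[j][i] for j in range(3)] for i in range(3)]
    pose = [r_wc[i] + [float(w_t_chat[i])] for i in range(3)]
    pose.append([0.0, 0.0, 0.0, 1.0])
    return pose
```

If the quaternion turns out to be stored in (x, y, z, w) order, reorder its components before calling `camera_to_world_matrix`.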
This work was supported by Ifremer, the DYNI team of the LIS laboratory and the COSMER laboratory.
This code is released under the LGPLv3 licence. Please have a look at the licence file at the repository root.
If you use this work for your research, please cite:
@article{boittiaux2022homographyloss,
author={Boittiaux, Cl\'ementin and
Marxer, Ricard and
Dune, Claire and
Arnaubec, Aur\'elien and
Hugel, Vincent},
journal={IEEE Robotics and Automation Letters},
title={Homography-Based Loss Function for Camera Pose Regression},
year={2022},
volume={7},
number={3},
pages={6242-6249},
}