Qing Wang∗, Jiaming Zhang∗, Kailun Yang†, Kunyu Peng, Rainer Stiefelhagen
∗ denotes equal contribution and † denotes corresponding author
- [09/2022] MatchFormer [PDF] has been accepted to ACCV 2022.
In this work, we propose a novel hierarchical extract-and-match transformer, termed MatchFormer. Inside each stage of the hierarchical encoder, we interleave self-attention for feature extraction and cross-attention for feature matching, enabling a human-intuitive extract-and-match scheme.
More details can be found in our arXiv paper.
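The extract-and-match idea described above can be illustrated with a toy sketch. This is not the MatchFormer implementation: it uses plain softmax attention on Python lists, has no learned projections or positional encoding, and does not model the SEA or LA attention variants. The function names and the `pattern` argument are hypothetical, and the order and number of layers per stage are assumptions made for illustration only.

```python
import math


def attention(queries, keys, values):
    """Scaled dot-product attention on lists of equal-length float vectors."""
    scale = math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(len(values[0]))
        ])
    return outputs


def residual(features, update):
    """Element-wise features + update for two lists of vectors."""
    return [[a + b for a, b in zip(row, upd)] for row, upd in zip(features, update)]


def interleaved_stage(feat_a, feat_b, pattern=("self", "cross")):
    """Apply one stage to the features of two images.

    "self"  : each image attends to itself (feature extraction).
    "cross" : each image attends to the other image (feature matching).
    The default pattern is an assumption, not the published layer layout.
    """
    for kind in pattern:
        if kind == "self":
            upd_a = attention(feat_a, feat_a, feat_a)
            upd_b = attention(feat_b, feat_b, feat_b)
        elif kind == "cross":
            upd_a = attention(feat_a, feat_b, feat_b)
            upd_b = attention(feat_b, feat_a, feat_a)
        else:
            raise ValueError(f"unknown attention kind: {kind}")
        feat_a, feat_b = residual(feat_a, upd_a), residual(feat_b, upd_b)
    return feat_a, feat_b
```

With `pattern=("self",)` the output for one image does not depend on the other image, whereas any `"cross"` step mixes information between the two, which is the point of interleaving them inside the encoder.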
The requirements are listed in the requirement.txt file. For example, to create your own environment:
conda create -n matchformer python=3.7
conda activate matchformer
cd /path/to/matchformer
pip install -r requirement.txt
You can prepare the test dataset in the same way as LoFTR. Place the dataset and index in the data directory.
The dataset structure should be:
data
├── scannet
│   ├── index
│   │   ├── intrinsics.npz
│   │   ├── scannet_test.txt
│   │   └── test.npz
│   └── test
│       ├── scene0707_00
│       ├── ...
│       └── scene0806_00
└── megadepth
    ├── index
    │   ├── 0015_0.1_0.3.npz
    │   ├── ...
    │   ├── 0022_0.5_0.7.npz
    │   └── megadepth_test_1500.txt
    └── test
        ├── Undistorted_SfM
        └── phoenix
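Before running the evaluation, it may help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: `EXPECTED_PATHS` and `missing_paths` are hypothetical names, and only the entries spelled out in the listing are checked (the elided `...` scenes and index files are not).

```python
from pathlib import Path

# Paths copied from the layout listed above, relative to the data directory.
EXPECTED_PATHS = [
    "scannet/index/intrinsics.npz",
    "scannet/index/scannet_test.txt",
    "scannet/index/test.npz",
    "scannet/test",
    "megadepth/index/megadepth_test_1500.txt",
    "megadepth/test/Undistorted_SfM",
    "megadepth/test/phoenix",
]


def missing_paths(data_root):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]
```

Calling `missing_paths("data")` from the repository root returns an empty list when every listed file and directory is present.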
The evaluation configurations can be adjusted in /config/defaultmf.py.
The weights can be downloaded from Google Drive.
Put the weights in model/weights.
# adjust large SEA model config:
MATCHFORMER.BACKBONE_TYPE = 'largesea'
MATCHFORMER.SCENS = 'indoor'
MATCHFORMER.RESOLUTION = (8,2)
MATCHFORMER.COARSE.D_MODEL = 256
MATCHFORMER.COARSE.D_FFN = 256
python test.py /config/data/scannet_test_1500.py --ckpt_path /model/weights/indoor-large-SEA.ckpt --gpus=1 --accelerator="ddp"
# adjust lite LA model config:
MATCHFORMER.BACKBONE_TYPE = 'litela'
MATCHFORMER.SCENS = 'indoor'
MATCHFORMER.RESOLUTION = (8,4)
MATCHFORMER.COARSE.D_MODEL = 192
MATCHFORMER.COARSE.D_FFN = 192
python test.py /config/data/scannet_test_1500.py --ckpt_path /model/weights/indoor-lite-LA.ckpt --gpus=1 --accelerator="ddp"
# adjust large LA model config:
MATCHFORMER.BACKBONE_TYPE = 'largela'
MATCHFORMER.SCENS = 'outdoor'
MATCHFORMER.RESOLUTION = (8,2)
MATCHFORMER.COARSE.D_MODEL = 256
MATCHFORMER.COARSE.D_FFN = 256
python test.py /config/data/megadepth_test_1500.py --ckpt_path /model/weights/outdoor-large-LA.ckpt --gpus=1 --accelerator="ddp"
# adjust lite SEA model config:
MATCHFORMER.BACKBONE_TYPE = 'litesea'
MATCHFORMER.SCENS = 'outdoor'
MATCHFORMER.RESOLUTION = (8,4)
MATCHFORMER.COARSE.D_MODEL = 192
MATCHFORMER.COARSE.D_FFN = 192
python test.py /config/data/megadepth_test_1500.py --ckpt_path /model/weights/outdoor-lite-SEA.ckpt --gpus=1 --accelerator="ddp"
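The four evaluation setups above differ only in five config values, so they can be kept in one table instead of being edited by hand each time. The sketch below is an illustration under assumptions: `PRESETS`, `empty_config`, and `apply_preset` are hypothetical names, and `SimpleNamespace` is only a stand-in for the repository's real config object, whose type is not shown here. The values themselves are copied from the settings listed above.

```python
from types import SimpleNamespace

# One entry per model variant; values copied from the settings listed above.
PRESETS = {
    "large-SEA": {"backbone": "largesea", "scens": "indoor", "resolution": (8, 2), "dim": 256},
    "lite-LA": {"backbone": "litela", "scens": "indoor", "resolution": (8, 4), "dim": 192},
    "large-LA": {"backbone": "largela", "scens": "outdoor", "resolution": (8, 2), "dim": 256},
    "lite-SEA": {"backbone": "litesea", "scens": "outdoor", "resolution": (8, 4), "dim": 192},
}


def empty_config():
    """Stand-in config tree with the attribute nesting used in this README."""
    return SimpleNamespace(MATCHFORMER=SimpleNamespace(COARSE=SimpleNamespace()))


def apply_preset(cfg, name):
    """Write the five values of the named preset into cfg and return it."""
    preset = PRESETS[name]
    cfg.MATCHFORMER.BACKBONE_TYPE = preset["backbone"]
    cfg.MATCHFORMER.SCENS = preset["scens"]
    cfg.MATCHFORMER.RESOLUTION = preset["resolution"]
    # D_MODEL and D_FFN are equal in all four setups listed above.
    cfg.MATCHFORMER.COARSE.D_MODEL = preset["dim"]
    cfg.MATCHFORMER.COARSE.D_FFN = preset["dim"]
    return cfg
```

For example, `apply_preset(empty_config(), "lite-LA")` yields a config with `RESOLUTION = (8, 4)` and both coarse dimensions set to 192. The matching checkpoint still has to be passed to test.py through --ckpt_path.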
To train MatchFormer, use the LoFTR code and replace LoFTR/src/loftr/backbone/ with model/backbone/match_**.py.
If you are interested in this work, please cite the following work:
@inproceedings{wang2022matchformer,
title={MatchFormer: Interleaving Attention in Transformers for Feature Matching},
author={Wang, Qing and Zhang, Jiaming and Yang, Kailun and Peng, Kunyu and Stiefelhagen, Rainer},
booktitle={Asian Conference on Computer Vision},
year={2022}
}
Our work is based on LoFTR, and we use their code. We thank the authors for open-sourcing the LoFTR repository.