Using the Matterport3D dataset, we present several benchmarking tasks. For each task, we provide all train/test code, pretrained models, and auxiliary data for the experiments.
The image keypoint matching task aims to establish correspondences between keypoints detected in pairs of RGB images. It leverages the wide variety of camera baselines in the Matterport3D dataset as training and test data. Please see keypoint_match for details.
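To make the task concrete, here is a minimal classical baseline for the matching step: mutual nearest-neighbor descriptor matching with a ratio test. This is an illustrative sketch, not part of the provided benchmark code; the function name `match_keypoints` and the use of raw Euclidean distances over generic descriptor arrays are our own assumptions.

```python
import numpy as np

def match_keypoints(desc_a, desc_b, ratio=0.8):
    """Illustrative baseline: mutual nearest-neighbor matching with a ratio test.

    desc_a: (N, D) descriptors from image A; desc_b: (M, D) from image B.
    Returns a list of (index_in_a, index_in_b) correspondences.
    """
    # Pairwise Euclidean distances between all descriptor pairs.
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)
    matches = []
    for i in range(d.shape[0]):
        order = np.argsort(d[i])
        best, second = order[0], order[1]
        # Ratio test: the best match must be clearly better than the runner-up.
        if d[i, best] < ratio * d[i, second]:
            # Mutual check: i must also be the best match for column `best`.
            if np.argmin(d[:, best]) == i:
                matches.append((i, best))
    return matches
```

A learned matcher trained on Matterport3D's varied baselines would replace the hand-crafted descriptor distance here, but the output format (index pairs) is the same.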
The view overlap prediction task aims to predict how much the views of two images overlap (what fraction of the visible surfaces are shared between the views). It leverages the wide variety of camera baselines in the Matterport3D dataset as training and test data. Please see view_overlap for details.
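As a sketch of what "view overlap" measures, the ground-truth quantity can be approximated geometrically from two registered RGB-D frames: back-project view A's depth pixels to 3-D, reproject them into view B, and count the fraction that land in B's frustum at a consistent depth. This is our own illustrative computation under a standard pinhole model, not the benchmark's definition verbatim; `view_overlap`, the intrinsics matrix `K`, and the A-to-B transform `T_ab` are assumed names.

```python
import numpy as np

def view_overlap(depth_a, depth_b, K, T_ab, tol=0.05):
    """Fraction of valid pixels in view A whose 3-D points reproject into
    view B at a consistent depth (pinhole model; T_ab maps A coords to B)."""
    h, w = depth_a.shape
    v, u = np.mgrid[0:h, 0:w]
    z = depth_a.ravel()
    valid = z > 0
    # Back-project A's pixels to 3-D points in A's camera frame.
    pts = np.linalg.inv(K) @ np.vstack([u.ravel(), v.ravel(), np.ones(h * w)]) * z
    pts = T_ab[:3, :3] @ pts + T_ab[:3, 3:4]          # move into B's camera frame
    zb = pts[2]
    proj = K @ pts
    uu = proj[0] / np.maximum(zb, 1e-9)
    vv = proj[1] / np.maximum(zb, 1e-9)
    inside = valid & (zb > 0) & (uu >= 0) & (uu < w) & (vv >= 0) & (vv < h)
    ui = np.clip(np.round(uu).astype(int), 0, w - 1)
    vi = np.clip(np.round(vv).astype(int), 0, h - 1)
    # Depth consistency rejects points occluded in view B.
    consistent = inside & (np.abs(depth_b[vi, ui] - zb) < tol * zb)
    return consistent.sum() / max(valid.sum(), 1)
```

The prediction task then asks a network to regress this fraction from the two RGB images alone, without depth or pose at test time.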
The surface normal estimation task aims to predict pixelwise surface normals from RGB images. It leverages normals estimated from the vast number of RGB-D image pairs in the Matterport3D dataset as training and test data. Please see surface_normal for details.
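The supervision signal here, normals estimated from depth, can be sketched in a few lines: back-project each pixel to 3-D and take the cross product of tangent vectors between neighboring points. This is one simple way to derive normals from a depth map, not necessarily the exact procedure used for the benchmark's ground truth; `normals_from_depth` is an assumed name.

```python
import numpy as np

def normals_from_depth(depth, K):
    """Per-pixel surface normals from a depth map via cross products of
    forward-difference tangent vectors between back-projected 3-D points."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    # Back-project every pixel to a 3-D point in camera coordinates.
    pts = (np.linalg.inv(K) @ np.stack([u, v, np.ones_like(depth)]).reshape(3, -1)
           * depth.ravel()).reshape(3, h, w)
    # Tangent vectors along image columns (du) and rows (dv).
    du = np.diff(pts, axis=2)[:, :-1, :]   # (3, h-1, w-1)
    dv = np.diff(pts, axis=1)[:, :, :-1]   # (3, h-1, w-1)
    n = np.cross(du, dv, axis=0)
    n /= np.maximum(np.linalg.norm(n, axis=0), 1e-9)
    return n  # (3, h-1, w-1); sign depends on the chosen winding
```

In practice the ground-truth normals would also be smoothed and masked at depth discontinuities, which this sketch omits.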
The region classification task aims to predict the semantic category of the region (e.g., bedroom, kitchen, patio, etc.) containing the camera viewpoint of an RGB image or panorama. It leverages semantic boundaries and labels for manually-specified regions in the Matterport3D dataset. Please see region_classification for details.
The semantic voxel labeling task aims to predict a semantic object label for each voxel in a 3D scan. Please see semantic_voxel_label for details.
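To illustrate the output format, ground-truth voxel labels can be derived from a labeled point cloud by majority vote within each voxel. This is a generic sketch of that rasterization step under our own assumptions (the function name `voxel_labels` and dictionary-of-voxels output are illustrative), not the benchmark's actual data pipeline.

```python
import numpy as np
from collections import Counter

def voxel_labels(points, labels, voxel_size):
    """Majority-vote semantic label per occupied voxel.

    points: (N, 3) float array of 3-D positions; labels: length-N sequence.
    Returns {(i, j, k) voxel index: winning label}.
    """
    keys = np.floor(points / voxel_size).astype(int)
    votes = {}
    for key, lab in zip(map(tuple, keys), labels):
        votes.setdefault(key, Counter())[lab] += 1
    return {k: c.most_common(1)[0][0] for k, c in votes.items()}
```

A semantic voxel labeling model predicts exactly this per-voxel label map, but from the scan geometry (and optionally color) rather than from annotated points.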