Skip to content

Latest commit

 

History

History
127 lines (86 loc) · 5.29 KB

README.md

File metadata and controls

127 lines (86 loc) · 5.29 KB

S3IM

This repository contains the official pytorch implementation of our paper: [S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields].

The implementation of S3IM is quite simple. In this repo, we provide usage examples of S3IM and present some video demos.

SDFStudio has supported our S3IM method.

Abstract

Recently, Neural Radiance Field (NeRF) has shown great success in rendering novel-view images of a given scene by learning an implicit representation with only posed RGB images. NeRF and relevant neural field methods (e.g., neural surface representation) typically optimize a point-wise loss and make point-wise predictions, where one data point corresponds to one pixel. Unfortunately, this line of research failed to use the collective supervision of distant pixels, although it is known that pixels in an image or scene can provide rich structural information. To the best of our knowledge, we are the first to design a nonlocal multiplex training paradigm for NeRF and relevant neural field methods via a novel Stochastic Structural SIMilarity (S3IM) loss that processes multiple data points as a whole set instead of process multiple inputs independently. Our extensive experiments demonstrate the unreasonable effectiveness of S3IM in improving NeRF and neural surface representation for nearly free. The improvements of quality metrics can be particularly significant for those relatively difficult tasks: e.g., the test MSE loss unexpectedly drops by more than 90% for TensoRF and DVGO over eight novel view synthesis tasks; a 198% F-score gain and a 64% Chamfer L1 distance reduction for NeuS over eight surface reconstruction tasks. Moreover, S3IM is consistently robust even with sparse inputs, corrupted images, and dynamic scenes.

Video Demo

TensoRF RGB results w/o and with S3IM (refer to paper table 2)

tensorf_replica_scan1_rgb.mp4
Left: Standard Training (baseline);Right: Multiplex Training via S3IM (ours).

TensoRF Depth results w/o and with S3IM (refer to paper table 2)

tensorf_replica_scan1_depth.mp4
Left: Standard Training (baseline);Right: Multiplex Training via S3IM (ours).

DVGO RGB results w/o and with S3IM (refer to paper figure 2)

dvgo_sparse_truck_rgb.mp4
Left: Standard Training (baseline);Right: Multiplex Training via S3IM (ours).

DVGO Depth results w/o and with S3IM (refer to paper figure 2)

dvgo_sparse_truck_depth.mp4
Left: Standard Training (baseline);Right: Multiplex Training via S3IM (ours).

Installation

Tested on Ubuntu 20.04 + Pytorch 1.10.0 + cu113

Install environment:

pip install -r requirements.txt

Dataset

You can try other dataset as well. S3IM is powerful and robust.

Hyperparameters

The recommended setting for S3IM is

s3im_kernel=4
s3im_stride=4
s3im_repeat_time=10 # repeat time of s3im
s3im_patch_height=64 # height of random mini-patch in s3im 
s3im_patch_width=64 # width of random mini-patch in s3im 

Quick Start

You can prepare the dataset using the following script:

sh scripts/preprocess_data/prepare_data.sh

You can train the TensoRF/DVGO model with s3im using the following script:

#for TensoRF
sh scripts/TensoRF/train_replica.sh
#for DVGO
sh scripts/DVGO/train_replica.sh

You can eval the TensoRF/DVGO model with s3im using the following script:

#for TensoRF
sh scripts/TensoRF/eval_replica.sh
#for DVGO
sh scripts/DVGO/eval_replica.sh

You can render a video based on TensoRF/DVGO with s3im using the following script:

#for TensoRF
sh scripts/TensoRF/render_path_replica.sh
#for DVGO
sh scripts/DVGO/render_path_replica.sh

If you want to try other setting in S3IM, you can modify the config file in

#for TensoRF
models/TensoRF/configs/replica_exp/replica_scan1_s3im_1.0.txt
#for DVGO
models/DVGO/configs/replica_exp/replica_scan1_s3im_1.0.txt

Performance

Here we report our results in Replica Dataset using TensoRF. Please refer to our paper for more quantitative results.

results in Replica

Citation

If you find our code or paper helps, please consider citing:

@inproceedings{xie2023s3im,
  title = {S3IM: Stochastic Structural SIMilarity and Its Unreasonable Effectiveness for Neural Fields},
  author = {Xie, Zeke and Yang, Xindi and Yang, Yujie and Sun, Qi and Jiang, Yixiang and Wang, Haoran and Cai, Yunfeng and Sun, Mingming},
  booktitle = {International Conference on Computer Vision},
  year = {2023}
}

Acknowledgement

The code base is adapted from DVGO and TensoRF, thanks for their great work!