PyTorch 1.2 implementation of the paper "Learning Implicit Fields for Generative Shape Modeling" by Zhiqin Chen and Hao (Richard) Zhang.
In short, this repo is an implementation of IM-NET with the framework provided by BSP-NET-pytorch.
The improvements over the original implementation are the same as in IM-NET (improved TensorFlow 1 implementation):
Encoder:
- In IM-AE (autoencoder), changed batch normalization to instance normalization.
Decoder (=generator):
- Changed the first layer from 2048-1024 to 1024-1024-1024.
- Changed latent code size from 128 to 256.
- Removed all skip connections.
- Changed the last activation function from sigmoid to clip ( max(min(h, 1), 0) ).
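The clip activation named in the last bullet can be illustrated with a minimal sketch. The function name `clip_activation` and the sample values are hypothetical, and plain Python floats stand in for tensors. It is not taken from this repo's model code. On a PyTorch tensor, `torch.clamp(h, min=0, max=1)` computes the same element-wise result.

```python
def clip_activation(h):
    """Clamp a raw decoder output to [0, 1], i.e. max(min(h, 1), 0)."""
    return max(min(h, 1.0), 0.0)


# Hypothetical raw outputs of the last decoder layer.
raw_outputs = [-0.5, 0.0, 0.3, 1.0, 1.7]
clipped = [clip_activation(h) for h in raw_outputs]
print(clipped)  # [0.0, 0.0, 0.3, 1.0, 1.0]
```

Unlike sigmoid, the clip leaves values already inside [0, 1] unchanged and maps everything outside that range to exactly 0 or 1.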
Training:
- Trained one model on the 13 ShapeNet categories as most Single-View Reconstruction networks do.
- For each category, sorted the object names and used the first 80% as the training set and the rest as the testing set, same as AtlasNet.
- Reduced the number of sampled points by half in the training set. Points were sampled on 256^3 voxels.
- Removed data augmentation (image crops), same as Occupancy Networks.
- Added coarse-to-fine sampling for inference to speed up testing.
- Added post-processing to make the output mesh smoother. To enable, find and uncomment all "self.optimize_mesh(vertices,model_z)".
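The per-category train/test split described in the list above can be sketched as follows. The function name `split_category`, the example object names, and the choice to round the 80% boundary down are assumptions made for illustration. The repo's data preparation scripts may handle the boundary differently.

```python
def split_category(object_names, train_fraction=0.8):
    """Sort the names, then use the first 80% for training and the rest for testing."""
    names = sorted(object_names)
    # Rounding down at the boundary is an assumption of this sketch.
    num_train = int(len(names) * train_fraction)
    return names[:num_train], names[num_train:]


# Hypothetical object names for one category, deliberately given out of order.
object_names = ["model_{:02d}".format(i) for i in reversed(range(10))]
train_names, test_names = split_category(object_names)
print(len(train_names), len(test_names))  # 8 2
print(test_names)  # ['model_08', 'model_09']
```

Because the split depends only on the sorted names, it is deterministic and can be reproduced without storing a separate split file.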
If you find our work useful in your research, please consider citing:
@article{chen2018implicit_decoder,
title={Learning Implicit Fields for Generative Shape Modeling},
author={Chen, Zhiqin and Zhang, Hao},
journal={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2019}
}
Requirements:
- Python 3.5 with numpy, scipy and h5py
- PyTorch 1.2
- PyMCubes (for marching cubes)
Our code has been tested on Ubuntu 16.04 and Windows 10.
The original voxel models are from HSP.
The rendered views are from 3D-R2N2.
Since our network takes point-value pairs, the voxel models require further sampling.
For data preparation, please see directory point_sampling.
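As a rough illustration of what "point-value pairs" means here, the sketch below draws points from a tiny occupancy grid and records the occupancy at each point. Everything in it is an assumption for illustration: the function name `sample_point_value_pairs`, uniform random sampling, the 4^3 toy grid, and the normalization of coordinates to [-0.5, 0.5). The scripts in point_sampling may use a different sampling strategy and data layout.

```python
import random


def sample_point_value_pairs(voxels, num_points, seed=0):
    """Return a list of ((x, y, z), value) pairs sampled from a cubic occupancy grid."""
    rng = random.Random(seed)
    dim = len(voxels)
    pairs = []
    for _ in range(num_points):
        i, j, k = (rng.randrange(dim) for _ in range(3))
        # Map the voxel centre to [-0.5, 0.5); this normalization is an assumption.
        point = tuple((c + 0.5) / dim - 0.5 for c in (i, j, k))
        pairs.append((point, voxels[i][j][k]))
    return pairs


# Toy 4^3 grid with a solid 2^3 cube in the middle (1 = inside, 0 = outside).
dim = 4
voxels = [[[1 if all(1 <= c <= 2 for c in (i, j, k)) else 0
            for k in range(dim)] for j in range(dim)] for i in range(dim)]

pairs = sample_point_value_pairs(voxels, num_points=8)
for point, value in pairs:
    print(point, value)
```

Each pair is one training example for the decoder: a query coordinate and the inside/outside value the network should predict there.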
We provide the ready-to-use datasets in hdf5 format, together with our pre-trained network weights.
Backup links:
- IM-NET-pytorch (pwd: bqex)
Please use the provided scripts train_ae.sh, train_svr.sh, test_ae.sh, test_svr.sh to train the network on the training set and get output meshes for the testing set.
To train an autoencoder, use the following commands for progressive training.
python main.py --ae --train --epoch 200 --sample_dir samples/all_vox256_img0_16 --sample_vox_size 16
python main.py --ae --train --epoch 200 --sample_dir samples/all_vox256_img0_32 --sample_vox_size 32
python main.py --ae --train --epoch 200 --sample_dir samples/all_vox256_img0_64 --sample_vox_size 64
The above commands will train the AE model for 200 epochs at 16^3 resolution, then 200 epochs at 32^3 resolution, and finally 200 epochs at 64^3 resolution. Training on the 13 ShapeNet categories takes about 3 days on one GeForce RTX 2080 Ti GPU.
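The progressive schedule follows a regular pattern, so the three commands can also be generated rather than typed by hand. The helper below is a hypothetical convenience sketch, not part of the repo. It only builds the command strings, using the flags and sample directories shown in the commands above, and leaves running them to you.

```python
def progressive_ae_commands(resolutions=(16, 32, 64), epochs=200):
    """Build one autoencoder training command per resolution, coarsest first."""
    template = (
        "python main.py --ae --train --epoch {epochs} "
        "--sample_dir samples/all_vox256_img0_{res} --sample_vox_size {res}"
    )
    return [template.format(epochs=epochs, res=res) for res in resolutions]


commands = progressive_ae_commands()
for command in commands:
    print(command)
```

The commands must be run in this order, since each stage continues training at a finer sampling resolution than the previous one.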
After training, you may visualize some results from the testing set.
python main.py --ae --sample_dir samples/im_ae_out --start 0 --end 16
You can specify the start and end indices of the shapes by --start and --end.
To train the network for single-view reconstruction, after training the autoencoder, use the following command to extract the latent codes:
python main.py --ae --getz
Then use the following command to train the SVR model:
python main.py --svr --train --epoch 1000 --sample_dir samples/all_vox256_img1
After training, you may visualize some results from the testing set.
python main.py --svr --sample_dir samples/im_svr_out --start 0 --end 16
This project is licensed under the terms of the MIT license (see LICENSE for details).