An iterative neural autoregressive distribution estimator (NADE-k)

This package contains the accompanying code for the following paper:

Tapani Raiko, Li Yao, KyungHyun Cho, Yoshua Bengio
Iterative Neural Autoregressive Distribution Estimator (NADE-k).
Advances in Neural Information Processing Systems 2014 (NIPS14).

Setup

Install Theano

Download Theano and make sure it's working properly.
All the information you need can be found by following this link:
http://deeplearning.net/software/theano/
Make sure Theano is added to your PYTHONPATH.
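For example, if Theano is checked out to a local directory (the path below is only a placeholder), it can be added to your PYTHONPATH like this:

    export PYTHONPATH=$PYTHONPATH:/path/to/Theano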

Install Jobman

Very detailed information can be found below:
http://deeplearning.net/software/jobman/install.html.
Make sure Jobman is added to your PYTHONPATH.

Prepare the MNIST dataset

You can download the dataset from the links below.
trainset: http://www.cs.toronto.edu/~larocheh/public/datasets/binarized_mnist/binarized_mnist_train.amat
validset: http://www.cs.toronto.edu/~larocheh/public/datasets/binarized_mnist/binarized_mnist_valid.amat
testset: http://www.cs.toronto.edu/~larocheh/public/datasets/binarized_mnist/binarized_mnist_test.amat

After the dataset has been downloaded, make sure to change data_path in utils.py so that it points to the directory containing the downloaded files.
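For example, the relevant line in utils.py might look like the following (the path is only a placeholder for wherever you put the .amat files):

    # utils.py: point data_path at the directory holding the binarized MNIST .amat files
    data_path = '/home/user/data/binarized_mnist/'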

Reproducing the Results

Train the model

  1. Change exp_path in config.py. This is the directory where all training outputs will be placed. For each experiment, also set 'save_model_path' in the same config file.
  2. To run NADE-5 1HL from Table 1 of the paper, make sure 'n_layers': 1 and 'l2': 0.0 (see the config sketch after this list).
  3. To run NADE-5 2HL from Table 1 of the paper, make sure 'n_layers': 2 and 'l2': 0.0012279827881.
  4. To start training, run python train_model.py.
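The exact structure of config.py may differ from this sketch; the option names below come from the steps above, while the dictionary layout and paths are assumptions used only for illustration:

    # sketch of the relevant config.py entries for the NADE-5 2HL run
    # (exp_path and save_model_path values are placeholders)
    exp_path = '/home/user/experiments/nade_k/'
    config = {
        'save_model_path': exp_path + 'nade5_2hl/',  # per-experiment output directory
        'n_layers': 2,                               # use 1 for the NADE-5 1HL model
        'l2': 0.0012279827881,                       # use 0.0 for the NADE-5 1HL model
    }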

It is highly recommended that the code be run on a GPU. For how to set that up, see: http://deeplearning.net/software/theano/tutorial/using_gpu.html.
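As a quick example, with Theano versions of that era a run can typically be placed on the GPU with Theano flags like the following (this is general Theano usage, not something specific to this repository):

    THEANO_FLAGS=device=gpu,floatX=float32 python train_model.py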

Training outputs

During training, progress information is printed to the screen and many files are written to save_model_path. These include a plot of the training cost, samples generated from the model, and the log-likelihood on the validset and testset, computed every valid_freq epochs.

If you use the default setup, the model will be pretrained for 1000 epochs and finetuned for another 3000 epochs. To get a good generative model, one needs to be patient :)

In addition, we have provided some training logs against which you can compare your experiments. See the directory results.

Evaluation

After training is done, it is time to get the numbers reported in Table 1 of the paper.

  1. In config.py, set the option 'action' to 1. Also make sure 'from_path' points to the directory that contains model_params_e*.pkl and model_configs.pkl. The option 'epoch' specifies which of the saved models to use (see the sketch after this list).
  2. Then run python train_model.py.
  3. If all goes well, the evaluation script should produce numbers that match those in the paper.
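Again, the layout below is only illustrative; the option names come from the step above, while the path and epoch number are placeholders:

    # sketch of the evaluation-related config.py entries
    config = {
        'action': 1,                                              # switch from training to evaluation
        'from_path': '/home/user/experiments/nade_k/nade5_2hl/',  # holds model_params_e*.pkl and model_configs.pkl
        'epoch': 3000,                                            # which saved epoch to evaluate (placeholder)
    }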

IMPORTANT: You may be surprised to see better numbers than those reported in our paper. Don't worry, we know this can happen: the longer the model is trained, the more likely you are to get better numbers. Do share your joy with us when this happens.

Benchmarks with this package

NADE-5 1HL model:
testset LL over 10 orderings = -89.43
testset LL over 128 ensembles = -85.77
These numbers are better than those reported in the paper because the model was trained for much longer here.

NADE-5 2HL model:
testset LL over 10 orderings = -87.13
testset LL over 128 ensembles = -84.65

Contact

Questions?
Need a trained model?
Contact us: [email protected]
