
Defining a new distribution in NIMBLE

Peter Sujan edited this page Apr 12, 2016 · 1 revision

This tutorial is intended for NIMBLE developers, and explains how to add a distribution that ships as one of NIMBLE's default distributions. Those interested in creating a "user-defined" distribution should refer to the user manual.

Tip before moving forward: if anything is confusing, usually you can find another distribution to use as an example/template. For example, you could search for ddirch (Dirichlet distribution) in whatever file you're editing, and see how that was implemented.

Creating a new distribution in NIMBLE requires editing quite a few files, which are listed below, with links to their location in the repo. All can be found under packages/nimble, with the exception of packages/CreatingExportList.R.

  1. R/distributions_implementations.R
  2. R/distributions_inputList.R
  3. inst/CppCode/dists.cpp
  4. inst/CppCode/nimDists.cpp
  5. inst/include/nimble/dists.h
  6. inst/include/nimble/nimDists.h
  7. R/genCpp_eigenization.R
  8. R/genCpp_operatorLists.R
  9. R/genCpp_processSpecificCalls.R
  10. R/genCpp_sizeProcessing.R
  11. R/registration.R
  12. packages/CreatingExportList.R

Details

R/distributions_implementations.R

This file provides the user-facing R functions for calculating the density of the distribution and for sampling from it. These functions follow the usual naming convention of d and r (for "density" and "random" respectively) followed by an abbreviation of the distribution name, which I'll refer to as <distr>. Most of your code here will likely dispatch to the C versions of your functions through R's .Call function. Provide detailed documentation, and above all include the @export directive so that users can access your functions after loading NIMBLE.

R/distributions_inputList.R

This file maps the name by which a distribution is referenced in BUGS code to the actual R functions, and declares the return and input types for the distribution. Some distributions have multiple parameterizations; the relationships among them should be specified here.

inst/CppCode/dists.cpp (and dists.h)

This file contains the low-level C implementations of the density and sampling functions. You will need to implement functions d<distr> and r<distr>, which do the actual density calculation and random sampling. These should take native C objects such as arrays of doubles. The density function always produces a scalar, so it can simply return the value. The random sampling function, however, may produce many samples (or a single multi-dimensional sample), so it may make more sense to pass in a pointer to the results array and give the function return type void.
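The two calling conventions described above can be sketched as follows. This is only an illustration under stated assumptions: dtoy and rtoy are hypothetical names for a toy distribution (n independent exponential components sharing one rate), not functions that exist in NIMBLE, and the argument order is invented for the example. The sketch uses std::mt19937 purely to stay self-contained; code that lives inside an R package would normally draw from R's own random number generator instead.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical toy distribution (not part of NIMBLE):
// x has n independent Exponential(rate) components.

// Density: the result is a scalar, so it is simply returned.
double dtoy(const double* x, int n, double rate, int give_log) {
  assert(x != nullptr && n > 0);
  if (!(rate > 0.0)) return std::nan("");  // invalid parameter
  double logDens = 0.0;
  for (int i = 0; i < n; ++i) {
    if (x[i] < 0.0) return give_log ? -INFINITY : 0.0;  // outside the support
    logDens += std::log(rate) - rate * x[i];
  }
  return give_log ? logDens : std::exp(logDens);
}

// Sampler: the draw is multi-dimensional, so the caller supplies the
// storage through a pointer and the function itself returns void.
// The rng argument is an assumption made for this standalone sketch.
void rtoy(double* ans, int n, double rate, std::mt19937& rng) {
  assert(ans != nullptr && n > 0 && rate > 0.0);
  std::exponential_distribution<double> expo(rate);
  for (int i = 0; i < n; ++i) ans[i] = expo(rng);
}
```

A caller would allocate an array of length n, pass it to rtoy to be filled, and could then pass the same array to dtoy to evaluate the density of that draw.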

This file should also contain two functions C_d<distr> and C_r<distr> that take in and return SEXP objects (short for "S-expressions"), which are representations of R objects in C. The majority of the bodies of these functions will consist of converting these R objects into native C objects, and then dispatching to the pure C functions.

Finally, make sure to put the function signatures in the corresponding header file dists.h.

inst/CppCode/nimDists.cpp (and nimDists.h)

In this file, you will define nimArr_d<distr> and nimArr_r<distr>. These two functions are used internally by NIMBLE, and thus need to take NimArr objects wherever arrays are needed. Otherwise, ordinary scalar input types (double, int, etc.) are fine. Most of the body of each function will be devoted to extracting raw pointers from the NimArr objects and passing them to the d<distr> and r<distr> functions defined in dists.cpp.

Don't forget to add the headers to nimDists.h.
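The wrapping pattern for the nimArr_ functions might look roughly like the sketch below. Everything in it is an assumption made for illustration: ToyArr1D is a hypothetical stand-in written for this example and is not NIMBLE's NimArr class (check an existing distribution such as ddirch for the real interface), and dtoy, rtoy, nimArr_dtoy and nimArr_rtoy are hypothetical names for a toy distribution with independent exponential components. The point is only the shape: arrays arrive as array objects, scalars arrive as plain types, and the wrapper hands raw pointers to the pure functions.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Hypothetical stand-in for a one-dimensional array-of-doubles class.
// This is NOT NIMBLE's NimArr; it offers only what the sketch needs.
struct ToyArr1D {
  std::vector<double> values;
  double* getPtr() { return values.data(); }
  int size() const { return static_cast<int>(values.size()); }
};

// Pure functions of the kind that would live in dists.cpp (hypothetical).
double dtoy(const double* x, int n, double rate, int give_log) {
  double logDens = 0.0;
  for (int i = 0; i < n; ++i) {
    if (x[i] < 0.0) return give_log ? -INFINITY : 0.0;  // outside the support
    logDens += std::log(rate) - rate * x[i];
  }
  return give_log ? logDens : std::exp(logDens);
}

void rtoy(double* ans, int n, double rate, std::mt19937& rng) {
  std::exponential_distribution<double> expo(rate);
  for (int i = 0; i < n; ++i) ans[i] = expo(rng);
}

// Array-level density wrapper: extract the raw pointer, then delegate.
double nimArr_dtoy(ToyArr1D& x, double rate, int give_log) {
  assert(x.size() > 0);
  double* xptr = x.getPtr();
  return dtoy(xptr, x.size(), rate, give_log);
}

// Array-level sampler wrapper: the result array is the first argument
// and is filled in place. The rng argument is an assumption made to
// keep this sketch standalone.
void nimArr_rtoy(ToyArr1D& ans, double rate, std::mt19937& rng) {
  assert(ans.size() > 0 && rate > 0.0);
  rtoy(ans.getPtr(), ans.size(), rate, rng);
}
```

Keeping the numerical work in the pure functions means the array-level wrappers stay thin, so the same d<distr> and r<distr> code can serve both the SEXP wrappers and the nimArr_ wrappers.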

R/genCpp_eigenization.R

This file transforms certain functions/objects to use the Eigen library. This only applies to vector-valued distributions. 'nimArr_d<distr>' should be added to the line where eigenizeCallsBeforeRecursing is set.

R/genCpp_operatorLists.R

Add both 'nimArr_d<distr>' and 'nimArr_r<distr>' to the callToSkipInEigenization vector.

R/genCpp_processSpecificCalls.R

Add 'nimArr_d<distr>' and 'nimArr_r<distr>' to specificCallHandlers.

R/genCpp_sizeProcessing.R

For multivariate distributions, the nimArr_r<distr> function takes a result pointer as its first argument rather than returning a value, so add 'nimArr_r<distr>' to assignmentAsFirstArgFuns. Add 'nimArr_d<distr>' and 'nimArr_r<distr>' to sizeCalls. Specify the argument dimensions in mvFirstArgCheckLists.

R/registration.R

Add two lines:

  1. C_d<distr> = 'C_d<distr>'
  2. C_r<distr> = 'C_r<distr>'

packages/CreatingExportList.R

Add your distribution in three places (all under individualMaskedFunctions):

  1. C_d<distr>
  2. C_r<distr>
  3. node_stoch_d<distr>