
Bayesian nonparametrics

Christopher Paciorek edited this page Feb 28, 2020 · 64 revisions

Overview document describing various BNP models and samplers.

Overview document (.tex): source .tex file for the document describing various BNP models and samplers.

Todo items for DP-related work:

  • finish nonconjugate dCRP sampler
    • need MCMC configuration to recognize this case and assign sampler (Chris)
    • finalize sampler (Claudia)
    • allow zero intermediate nodes (Claudia / Chris)
    • what BUGS code could a user write that would make the sampler illegitimate (Claudia)
    • allow upper bound on number of components
    • discuss inefficiency of sampling all tilde variables even though most have no observations
    • add new sampler for CRP that iterates over unique labels instead of n labels.
    • add test that has basic usage of a dmnorm mixture.
    • discuss 'links' version of CRP - see Johnson and Sinclair - Environmetrics. 2017;28:e2440 "Modeling joint abundance of multiple species using Dirichlet process mixtures"
  • [] initial conjugate dCRP sampler (i.e., marginalized with respect to new component)
    • determine conjugacy in setup code, including for zero intermediate nodes (Chris)
    • determine conjugacy for normal-inverse gamma and normal-inverse Wishart priors in normal model (done for normal-IG, need to do for normal-IW)
    • determine conjugacy for dmnorm case - Chris thought we had done this but doesn't see it in codebase.
    • use conjugacy determination to develop conjugate sampler run code - all common univariate cases are now handled (Claudia).
    • immediately sample new component parameters when creating a new component (Claudia)
    • try to reuse setup code and run code so we don't duplicate code (Chris)
    • ask Perry if using model[ something ][i] or model$values(something) makes a difference in efficiency
    • (HIGH) add non-identity-link conjugacy - will likely need variations on 'offset' and 'coeff' from our conjugate sampler setup to allow us to handle things like y[i]~dnorm(c*thetatilde[xi[i]] + x[i]*beta, 1) and correctly account for 'c' and x[i]*beta
      • basic case of y[i]~dnorm(beta0 + beta1tilde[xi[i]]*x[i], 1)
      • y[i] ~ dpois(c*muTilde[xi[i]])
      • y[i] ~ dnorm(mu[xi[i],1:p] %*% x[i,1:p], 1) (or inprod() version of this)
      • y[i] ~ dnorm(beta0[xi[i]] + beta1[xi[i]]*x[i], 1) (this will probably be hard to set up)
      • from Claudia (similar cases to just above): y_i \sim N(X * beta + Z * theta_{xi_i}), where X and Z may or may not be matrices, and beta and theta may or may not be vectors. Conjugacy should also be identified when the same model is written as a linear combination, that is, y_i \sim N(x_{i,1} * beta_1 + x_{i,2} * beta_2 + z_{i,1} * theta_{xi_i, 1} + z_{i,2} * theta_{xi_i, 2} ), for instance.
  • avoid updating cluster parameters for unfilled clusters
  • stickbreaking approach
    • determine syntax for stickbreaking and write stickbreaking function (Claudia)
      • determine possible NaN situations
    • detect conjugacy in this setting (Chris)
    • clean up conjugacy for this setting (Chris)
  • compare speed and mixing of NIMBLE BNP to one or two other popular BNP packages (e.g., DPpackage); we now compare favorably to DPpackage (which as of 2020 is no longer on CRAN)
  • blog post on BNP functionality - Chris/Abel decided the Avandia GLMM example would be good; also possibly add a simple density estimation example and show that NIMBLE supports different kernels (Chris/Claudia)
  • testing
    • formal comparison (e.g., K-L calculation) of density estimates with truth and between CRP and stick-breaking
    • we might also think of other good test cases where we know the right answer (perhaps models Abel has fit previously using his own code)
  • CRP distribution
    • determine possible NaN situations (Claudia)
    • write help (Claudia)
  • standardized output for G when using dCRP (input posterior modelValues and augment with columns for weights and atoms). (Claudia / Nick / Chris)
    • figure out how a user will call this (current rough plan: the user calls an R function, which in turn uses a stand-alone nimbleFunction that sets the number of columns in a matrix (not a modelValues)) (Chris / Claudia / Abel / other nimble-devs)
    • handle multivariate clusters - e.g. mixtures of dmnorm components
  • write help for BNP sampler with some examples.
  • write a quasi-conjugate sampler for the "conc" parameter when a gamma prior is assumed
    • write the sampler (Claudia)
    • write help with some examples (Claudia)
    • automatically assign this sampler (Claudia / Chris)
  • fully marginalized dCRP sampler for conjugate models - low priority
    • write sampler
    • write help
    • add sampling for tilde variables only every 'thin' iterations if monitoring of them is requested
    • determine conjugacy for normal-inverse gamma prior in normal model
  • more complicated cases, in approximate order of decreasing priority (Claudia, except where indicated)
    • allow multiple observations per cluster, e.g., Quinn-style model (essentially done in bnp_moreGeneral2)
    • allow 'cross-clustering' such as mu[xi[i],xi[j]] (flesh out example models and figure out design for how to handle this)
    • HDP (allows a Polya urn so straightforward)
    • check conjugacy detection in stick-breaking case for Pitman-Yor and generalized Dirichlet cases (Chris)
    • Pitman-Yor (i.e., Beta(1-alpha, beta+k*alpha)) (allows Polya urn so straightforward)
    • Other species sampling models (Chris not clear on whether we are handling SSM generally or focusing on Pitman-Yor)
    • nested DP - possibly use Polya urn, but Abel to check with Peter Mueller about sampling (Abel + Claudia)
    • generalized Dirichlet (Beta(a_k, b_k) or Beta(a, b)) with dCRP (no easy Polya urn)
    • (possibly) NRMs with slice sampler (Italian school); Abel is checking with researchers about whether they have C/C++ code we might use or at least learn from
    • dependent DP
  • submit short piece to ISBA Bulletin, based on blog post (discuss whether to include "more general" example)
  • work on rich examples in NIMBLE, possibly for blog post or papers, perhaps by Masters students; some possibilities are:
    • basic analysis of Kevin Quinn model
    • comparison of biclustering with Kevin Quinn model
    • blog post or paper focused on simple use of DPM for random effects to reach out to poli sci or education (or possibly ecology, though we have connections there already) (Abel suggested the spatial cluster model and there is the Kevin Quinn case we could also revisit)
  • revisit our overall framework for basic DP models only after we get user feedback
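
For the stick-breaking items above (writing the stick-breaking function, possible NaN situations, and the Pitman-Yor case), here is a minimal sketch in plain Python rather than NIMBLE code. It assumes the Pitman-Yor breaks follow Beta(1-alpha, beta+k*alpha) as written in the list, with `discount` standing in for alpha and `conc` for beta. The function names and the choice to return NaN for out-of-range input are assumptions of this sketch, not a description of the NIMBLE implementation.

```python
import random


def stick_breaking(z):
    """Map L-1 break proportions z[k] in [0, 1] to L weights that sum to 1."""
    if any(not (0.0 <= v <= 1.0) for v in z):
        # Assumption of this sketch: flag invalid (or NaN) input with NaN weights.
        return [float("nan")] * (len(z) + 1)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for v in z:
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)  # the last weight takes whatever is left
    return weights


def sample_breaks(n_breaks, conc, discount=0.0, rng=random):
    """Draw z[k] ~ Beta(1 - discount, conc + k * discount) for k = 1..n_breaks.

    discount = 0 gives the Dirichlet process case, Beta(1, conc).
    """
    return [
        rng.betavariate(1.0 - discount, conc + k * discount)
        for k in range(1, n_breaks + 1)
    ]


if __name__ == "__main__":
    z = sample_breaks(n_breaks=9, conc=1.5, discount=0.25)
    w = stick_breaking(z)
    print(len(w), sum(w))
```

Truncating at L components means the final weight absorbs all remaining mass, so a density estimate built from these weights depends on L being large enough for that leftover to be small.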
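
For the non-identity-link conjugacy item above, the normal-normal case can be written down directly. The sketch below is plain Python, not NIMBLE code, and assumes y[i] ~ N(offset[i] + coeff[i] * theta, obs_var) with theta ~ N(prior_mean, prior_var), where theta is the parameter of a single cluster. The names `coeff` and `offset` follow the list; the function name and the variance parameterization are assumptions made for illustration.

```python
def normal_posterior(y, coeff, offset, prior_mean, prior_var, obs_var=1.0):
    """Posterior mean and variance of theta for a linear-link normal model.

    Model: y[i] ~ N(offset[i] + coeff[i] * theta, obs_var),
           theta ~ N(prior_mean, prior_var).
    """
    prior_prec = 1.0 / prior_var
    # Each observation contributes coeff[i]^2 / obs_var to the precision.
    data_prec = sum(c * c for c in coeff) / obs_var
    # The offset is subtracted from y before weighting by the coefficient.
    weighted = sum(c * (yi - o) for yi, c, o in zip(y, coeff, offset)) / obs_var
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + weighted)
    return post_mean, post_var


if __name__ == "__main__":
    # Analogue of y[i] ~ dnorm(beta0 + beta1tilde[xi[i]] * x[i], 1) for one
    # cluster: offset[i] = beta0 and coeff[i] = x[i] (values are made up).
    x = [0.5, 1.0, 2.0]
    beta0 = 1.0
    y = [1.4, 2.1, 2.9]
    print(normal_posterior(y, coeff=x, offset=[beta0] * len(x),
                           prior_mean=0.0, prior_var=10.0))
```

With coeff[i] = 1 and offset[i] = 0 this reduces to the usual identity-link update, which makes it a convenient check when testing conjugacy detection.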