Bayesian nonparametrics
Christopher Paciorek edited this page Jun 28, 2018
·
64 revisions
Overview document describing various BNP models and samplers.
Overview document tex: the .tex source file for the document describing various BNP models and samplers.
Todo items for DP-related work:
- finish nonconjugate dCRP sampler
- need MCMC configuration to recognize this case and assign sampler (Chris)
- finalize sampler (Claudia)
- allow zero intermediate nodes (Claudia / Chris)
- what BUGS code could a user write that would make the sampler illegitimate (Claudia)
- allow upper bound on number of components
- discuss inefficiency of sampling all tilde variables even though most have no observations
- only compute curLogProb for unique thetas (Claudia, I think this is the same as next bullet, right?)
- (HIGH) add new (more efficient?) sampler for CRP that iterates over unique labels instead of n labels.
- discuss 'links' version of CRP - see Johnson and Sinclair - Environmetrics. 2017;28:e2440 "Modeling joint abundance of multiple species using Dirichlet process mixtures"
- (HIGH) identify conjugacy for variance in a model with non-deterministic nodes (Claudia, could you check if this is working now and if so, just remove this bullet? I forget exactly what this refers to)
- initial conjugate dCRP sampler (i.e., marginalized with respect to new component)
- determine conjugacy in setup code, including for zero intermediate nodes (Chris)
- determine conjugacy for normal-inverse gamma prior in normal model (Claudia, with Chris possibly writing the bivariate distribution as a nimble distribution)
- any other bivariate priors that are commonly used that we should determine conjugacy for?
- use conjugacy determination to develop conjugate sampler run code - all common univariate cases are now handled (Claudia). List of examples:
- immediately sample new component parameters when creating a new component (Claudia)
- try to reuse setup code and run code so we don't duplicate code (Chris)
- ask Perry if using model[ something ][i] or model$values(something) makes a difference in efficiency
- (HIGH) add non-identity-link conjugacy - will likely need variations on 'offset' and 'coeff' from our conjugate sampler setup to allow us to handle things like
y~dnorm(c*thetatilde[xi[i]] + x[i]*beta, 1)
and correctly account for 'c' and x[i]*beta
(Chris to help get this started)
- stickbreaking approach
- determine syntax for stickbreaking and write stickbreaking function (Claudia)
- determine possible NaN situations
- detect conjugacy in this setting (Chris)
- clean up conjugacy for this setting (Chris)
- (HIGH) compare speed and mixing of NIMBLE BNP to one or two other popular BNP packages (e.g., DPpackage)
- (HIGH) testing
- compare results from stick-breaking and CRP on the same model (or ideally a couple of models) (Claudia, is this already basically done in the testing?)
- we might also think of good test cases where we know the right answer (perhaps models Abel has fit previously using his own code)
- Generalized Dirichlet (Claudia, is Pitman-Yor an example of this? Abel and I discussed that Pitman-Yor would be a good extension to add soon)
- write GenDirichlet distributions (Claudia)
- add GenDirichlet conjugacy (Chris)
- CRP distribution
- determine possible NaN situations (Claudia)
- write help (Claudia)
- standardized output for G when using dCRP (input posterior modelValues and augment with columns for weights and atoms). (Claudia / Nick / Chris)
- figure out how a user will call this (current rough plan: the user calls an R function, which in turn uses a stand-alone nimbleFunction that sets the number of columns in a matrix (not a modelValues)) (Chris / Claudia / Abel / other nimble-devs)
- handle multivariate clusters - e.g. mixtures of dmnorm components
- write help for BNP sampler with some examples.
- write a quasi conjugate sampler for the "conc" parameter when a gamma distribution is assumed
- write the sampler (Claudia)
- write help with some examples (Claudia)
- automatically assign this sampler (Claudia / Chris)
- fully marginalized dCRP sampler for conjugate models
- write sampler
- write help
- add sampling for tilde variables only every 'thin' iterations if monitoring of them is requested
- determine conjugacy for normal-inverse gamma prior in normal model
- more complicated cases (partial list of possibly high priority - Claudia/Abel, please add to as you think about this)
- HDP
- (possibly) NRMs with slice sampler (Italian school); Abel is checking with researchers about whether they have C/C++ code we might use or at least learn from
- ... Chris is not sure where DDP and other structures fit in this list ...
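For reference while working on the stickbreaking and CRP items above, here is a minimal stand-alone sketch in Python (not NIMBLE code) of the two constructions. The names `stick_breaking` and `sample_crp`, their signatures, and the choice to return NaN weights when a break proportion falls outside [0, 1] are illustrative assumptions, not a description of what the NIMBLE implementations do or will do.

```python
import math
import random


def stick_breaking(v):
    """Map K-1 break proportions v to K mixture weights.

    w[0] = v[0], w[k] = v[k] * prod_{j<k} (1 - v[j]), and the last weight
    is the leftover stick, prod_j (1 - v[j]).
    Assumption for this sketch: invalid input (NaN or outside [0, 1])
    yields a vector of NaNs rather than raising an error.
    """
    if any(math.isnan(x) or x < 0.0 or x > 1.0 for x in v):
        return [math.nan] * (len(v) + 1)
    weights = []
    remaining = 1.0  # length of stick not yet broken off
    for x in v:
        weights.append(x * remaining)
        remaining *= 1.0 - x
    weights.append(remaining)
    return weights


def sample_crp(n, conc, rng):
    """Draw n cluster labels sequentially from a CRP prior.

    Observation i joins existing cluster k with probability
    counts[k] / (i + conc) and opens a new cluster with probability
    conc / (i + conc). Labels are 0-based and assigned in order of
    first appearance.
    """
    labels = []
    counts = []  # counts[k] = number of observations in cluster k
    for i in range(n):
        u = rng.random() * (i + conc)
        cum = 0.0
        chosen = len(counts)  # default: open a new cluster
        for k, c in enumerate(counts):
            cum += c
            if u < cum:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(1)
        else:
            counts[chosen] += 1
        labels.append(chosen)
    return labels
```

For example, `stick_breaking([0.5, 0.5])` gives weights that sum to 1, and `sample_crp(10, 1.0, random.Random(1))` gives 10 labels starting at 0. A sketch like this could serve as an independent check when comparing the stick-breaking and CRP representations on the same model.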