
Bayesian nonparametrics

claudiawehrhahn edited this page Mar 13, 2018 · 64 revisions

Overview document describing various BNP models and samplers.

Overview document (.tex): LaTeX source file for the document describing various BNP models and samplers.

Todo items for DP-related work:

  • finish nonconjugate dCRP sampler
    • need MCMC configuration to recognize this case and assign sampler (Chris)
    • finalize sampler (Claudia)
    • allow zero intermediate nodes (Claudia / Chris)
    • what BUGS code could a user write that would make the sampler illegitimate (Claudia)
    • allow upper bound on number of components
    • only compute curLogProb for unique thetas
    • identify conjugacy for variance in a model with non-deterministic nodes (Claudia, could you check if this is working now?)
    • move code from dCRP sampler setup code to MCMC configuration (Chris)
    • discuss inefficiency of sampling all tilde variables even though most have no observations
  • initial conjugate dCRP sampler (i.e., marginalized with respect to new component)
    • determine conjugacy in setup code, including for zero intermediate nodes (Chris)
    • determine conjugacy for normal-inverse gamma prior in normal model (wait for now)
    • use conjugacy determination to develop conjugate sampler run code for a few examples (Claudia). List of examples:
      • normal sampling with unknown normal mean and known variance: dCRP_conjugate_dnorm_dnorm
      • Poisson sampling with unknown gamma rate: dCRP_conjugate_dgamma_dpois
      • Bernoulli sampling with unknown Beta probability: dCRP_conjugate_dbeta_dbern
      • Multinomial sampling with unknown Dirichlet probability: dCRP_conjugate_ddirch_dmulti
      • exponential sampling with unknown gamma rate, known shape: dCRP_conjugate_dgamma_dexp
      • gamma sampling with unknown gamma rate, both shapes known: dCRP_conjugate_dgamma_dgamma
    • immediately sample new component parameters when creating a new component (Claudia)
    • try to reuse setup code and run code so we don't duplicate code (Chris)
    • ask Perry if using model[ something ][i] or model$values(something) makes a difference in efficiency
  • stickbreaking approach
    • determine syntax for stickbreaking and write stickbreaking function (Claudia)
      • determine possible NaN situations
    • detect conjugacy in this setting (Chris)
    • clean up conjugacy for this setting (Chris)
  • Generalized Dirichlet
    • write GenDirichlet distributions (Claudia)
    • add GenDirichlet conjugacy (Chris)
  • CRP distribution
    • determine possible NaN situations (Claudia)
    • write help (Claudia)
  • standardized output for G when using dCRP (input posterior modelValues and augment with columns for weights and atoms). Not in total generality yet. (Claudia / Nick / Chris)
    • figure out how a user will call this (current rough plan: the user calls an R function, which in turn uses a stand-alone nimbleFunction that sets the number of columns in a matrix, not a modelValues) (Chris / Claudia / Abel / other nimble-devs)
  • write help for BNP sampler with some examples.
  • write a quasi-conjugate sampler for the "conc" parameter when a gamma distribution is assumed for it
    • write the sampler (Claudia)
    • write help with some examples (Claudia)
    • automatically assign this sampler (Claudia / Chris)
  • fully marginalized dCRP sampler for conjugate models
    • write sampler
    • write help
    • add sampling of tilde variables only every 'thin' iterations, and only if monitoring of them is requested
    • determine conjugacy for normal-inverse gamma prior in normal model
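
The stickbreaking items above call for a stickbreaking function and a check of its possible NaN situations. As a rough illustration only, here is a minimal sketch in Python (not NIMBLE code) of the standard truncated stick-breaking map from K-1 break proportions to K weights. The function name `stick_breaking` and the choice to return NaN weights for invalid input are assumptions made for this sketch, not the behavior of any existing NIMBLE function.

```python
import math


def stick_breaking(v):
    """Map K-1 break proportions in [0, 1] to K weights that sum to 1.

    Hypothetical helper for illustration; not the NIMBLE implementation.
    """
    # Assumption for this sketch: any NaN or out-of-range proportion makes
    # the whole weight vector NaN, so the problem is visible to the caller.
    if any(math.isnan(x) or x < 0.0 or x > 1.0 for x in v):
        return [math.nan] * (len(v) + 1)

    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for x in v:
        weights.append(x * remaining)  # break off a fraction x of what is left
        remaining *= 1.0 - x
    weights.append(remaining)  # the last component takes the leftover stick
    return weights


if __name__ == "__main__":
    print(stick_breaking([0.5, 0.5]))
    print(stick_breaking([1.5]))
```

Working with the remaining stick length directly, rather than with logarithms, avoids taking log(0) when a proportion equals exactly 1; in that case all later weights are simply 0.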