Dirichlet process mixture models (DP-MM) are mixture models whose mixing distribution is given a Dirichlet process prior, so the number of components does not have to be fixed in advance. The Dirichlet process is a distribution over probability distributions: a draw from it is itself a probability distribution, and that distribution is discrete with probability one. The DP-MM assumes that each observation is generated from a component whose parameters are drawn from such a random distribution, which amounts to a mixture with a countably infinite number of components.
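The Gaussian mixture distribution is defined as follows:
$$ p(z) = \sum_{i=1}^{K}\pi_i \mathcal{N}(z \mid \mu_i,\Sigma_i) $$
where $\pi_i$ are the mixing weights, with $\pi_i \ge 0$ and $\sum_{i=1}^{K}\pi_i = 1$, and $\mu_i$ and $\Sigma_i$ are the mean and covariance of the $i$-th Gaussian component.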
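As a concrete illustration of the mixture density, here is a minimal Python sketch. It assumes the univariate special case, so each covariance $\Sigma_i$ reduces to a variance $\sigma_i^2$. The function names and the two-component parameter values are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(z, mu, sigma):
    """Density of a univariate normal N(mu, sigma^2) evaluated at z."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(z, weights, means, sigmas):
    """Evaluate p(z) = sum_i pi_i * N(z | mu_i, sigma_i^2)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing weights must sum to 1")
    return sum(w * normal_pdf(z, m, s) for w, m, s in zip(weights, means, sigmas))


# Hypothetical two-component mixture (values are assumptions for illustration).
weights = [0.3, 0.7]
means = [-2.0, 1.0]
sigmas = [0.5, 1.5]
density_at_zero = mixture_pdf(0.0, weights, means, sigmas)
```

Because the weights sum to one and each component integrates to one, the mixture is itself a valid density.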
So far, we have introduced for each datapoint
We shall view
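Suppose we have a succession of values $X_1, X_2, \dots$ generated as follows.
Let $H$ be a base distribution and $\alpha > 0$ a real number. For each $n \ge 1$: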
- Draw $X_n$ from $H$ with probability $\dfrac{\alpha}{\alpha+n-1}$.
- Set $X_n = x$ with probability $\dfrac{n_x}{\alpha+n-1}$, where $n_x$ is the number of previous observations equal to $x$, i.e. $n_x := \#\{j : X_j = x,\; j < n\}$.
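The two rules above can be simulated directly. The following is a minimal sketch, not a reference implementation: the function name `polya_urn`, the choice of a standard normal for $H$, and the value $\alpha = 2$ are all assumptions made for illustration.

```python
import random
from collections import Counter


def polya_urn(n_draws, alpha, base_sampler, rng):
    """Generate X_1, ..., X_n with the sequential scheme described above."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    draws = []
    for n in range(1, n_draws + 1):
        # With probability alpha / (alpha + n - 1), draw a fresh value from H.
        # For n = 1 this probability is 1, so the first value always comes from H.
        if rng.random() < alpha / (alpha + n - 1):
            draws.append(base_sampler(rng))
        else:
            # Picking a past draw uniformly selects the value x with probability
            # n_x / (n - 1), which gives n_x / (alpha + n - 1) overall.
            draws.append(rng.choice(draws))
    return draws


# Assumed setup: H is a standard normal and alpha = 2 (illustrative values only).
rng = random.Random(0)
samples = polya_urn(1000, alpha=2.0, base_sampler=lambda r: r.gauss(0.0, 1.0), rng=rng)
cluster_sizes = Counter(samples)  # maps each distinct value x to its count n_x
```

Since values already drawn are repeated with probability proportional to their counts, the number of distinct values in `samples` is typically far smaller than the number of draws, and larger values of `alpha` tend to produce more distinct values.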
The
The Dirichlet process
Consider the model
$$ G \sim DP(\alpha,G_0) $$
$$ \theta_i \sim G $$
Marginalizing out the random distribution $G$, the joint distribution of $n$ replicates
In a (Gaussian) DP mixture model,
Now consider the posterior distribution of