Hi Paul!

We discussed the possibility of specifying mixture models in brms over grouping variables. Right now, mixtures are by default calculated at the level of single observations, but sometimes we encounter use cases in which we want to specify different possible models for an entire sequence of data points (like the data of one participant) and fit a mixture over these models. This is basically what the Stan User's Guide describes as the "erroneous way" of vectorizing mixtures: https://mc-stan.org/docs/2_19/stan-users-guide/vectorizing-mixtures.html. Also see the math in the attached picture for why mixtures over single observations and over sequences of observations lead to different likelihoods.
For implementing this in brms, we discussed adding a new "mix" argument to the aterms of a brmsformula, which specifies the groups over which the mixture should be calculated. For example, to specify a mixture over participants ("ID"), one could write: y | mix(gr = "ID") ~ ...
To translate this into Stan code, one could generate temporary variables that first accumulate the group-level log-likelihood components, which are then passed to the log_mix function. For a two-component mixture over a grouping variable, this could look something like the example code below, where L_1 and L_2 are the two log-likelihood components, N_1 is the number of grouping levels, J_mix is the grouping indicator per observation (similar to how it is done for random effects), N is the total number of observations, and lambda is the mixture proportion:
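As a short sketch of why the two likelihoods differ (this is the standard two-component formulation, not a transcription of the attachment), assume group j has observations y_{j1}, ..., y_{jn_j}, component densities f_1 and f_2, and mixing proportion lambda for component 1. These symbols are chosen here for illustration only.

```latex
% Mixture over single observations: each observation picks its own component.
p_{\text{obs}}(y_j) = \prod_{i=1}^{n_j} \bigl[ \lambda f_1(y_{ji}) + (1-\lambda) f_2(y_{ji}) \bigr]

% Mixture over groups: the whole sequence shares one component.
p_{\text{grp}}(y_j) = \lambda \prod_{i=1}^{n_j} f_1(y_{ji}) + (1-\lambda) \prod_{i=1}^{n_j} f_2(y_{ji})

% For n_j = 2 the observation-level product expands to
p_{\text{obs}}(y_j) = \lambda^2 f_1(y_{j1}) f_1(y_{j2})
  + \lambda(1-\lambda) \bigl[ f_1(y_{j1}) f_2(y_{j2}) + f_2(y_{j1}) f_1(y_{j2}) \bigr]
  + (1-\lambda)^2 f_2(y_{j1}) f_2(y_{j2})

% On the log scale the group-level version is what log_mix computes from the summed terms:
\log p_{\text{grp}}(y_j) = \log\bigl( \lambda e^{L_{1j}} + (1-\lambda) e^{L_{2j}} \bigr),
\qquad L_{kj} = \sum_{i=1}^{n_j} \log f_k(y_{ji})
```

The cross terms in the expansion are the cases where observations of the same group come from different components. The group-level mixture excludes them, so the two likelihoods only coincide when every group has exactly one observation.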
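model {
  vector[N_1] L_1 = rep_vector(0, N_1);  // log-likelihood per group under model 1
  vector[N_1] L_2 = rep_vector(0, N_1);  // log-likelihood per group under model 2
  // accumulate the log-likelihood contributions within each group
  for (i in 1:N) {
    L_1[J_mix[i]] += <log-likelihood of observation i under model 1>;
    L_2[J_mix[i]] += <log-likelihood of observation i under model 2>;
  }
  // mix over the two models once per group
  for (j in 1:N_1) {
    target += log_mix(lambda, L_1[j], L_2[j]);
  }
}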
If the mixture is specified over single observations instead of groups of observations, this would simplify to:
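model {
  vector[N] L_1 = rep_vector(0, N);  // log-likelihood per observation under model 1
  vector[N] L_2 = rep_vector(0, N);  // log-likelihood per observation under model 2
  for (i in 1:N) {
    L_1[i] += <log-likelihood of observation i under model 1>;
    L_2[i] += <log-likelihood of observation i under model 2>;
  }
  // mix over the two models once per observation
  for (j in 1:N) {
    target += log_mix(lambda, L_1[j], L_2[j]);
  }
}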
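To make the group-level version concrete, here is a minimal self-contained sketch that assumes both components are normal distributions with their own mean and standard deviation. The data and parameter blocks, the names mu and sigma, and the priors are assumptions made for illustration only; this is not what brms would generate.

```stan
data {
  int<lower=1> N;                          // total number of observations
  int<lower=1> N_1;                        // number of grouping levels
  array[N] int<lower=1, upper=N_1> J_mix;  // grouping indicator per observation
  vector[N] y;                             // response
}
parameters {
  ordered[2] mu;                  // component means (assumed; ordered for identifiability)
  vector<lower=0>[2] sigma;       // component standard deviations (assumed)
  real<lower=0, upper=1> lambda;  // mixture proportion of component 1
}
model {
  vector[N_1] L_1 = rep_vector(0, N_1);  // log-likelihood per group, component 1
  vector[N_1] L_2 = rep_vector(0, N_1);  // log-likelihood per group, component 2

  // illustrative priors, not a recommendation
  mu ~ normal(0, 5);
  sigma ~ normal(0, 2);  // half-normal because of the lower bound

  // accumulate the log-likelihood contributions within each group
  for (i in 1:N) {
    L_1[J_mix[i]] += normal_lpdf(y[i] | mu[1], sigma[1]);
    L_2[J_mix[i]] += normal_lpdf(y[i] | mu[2], sigma[2]);
  }

  // one mixture term per group, so all observations of a group share a component
  for (j in 1:N_1) {
    target += log_mix(lambda, L_1[j], L_2[j]);
  }
}
```

The array[N] int declaration uses the newer Stan array syntax; with older Stan versions, such as the 2.19 documentation linked above, it would be written as int J_mix[N].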
Thanks for considering this feature, and I hope the summary of what we discussed helps for implementation!
Best,
Philipp
Note 21. May 2024.pdf