Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zi mixture probs (draft) #1555

Closed
wants to merge 3 commits into from

Conversation

jsocolar
Copy link
Contributor

@jsocolar jsocolar commented Oct 28, 2023

This is a draft PR to close #1551. I'm opening the PR as a draft to get feedback on the overall design and implementation before I work on polishing.

This PR adds the argument z_mix = FALSE to pp_mixture with the following effects:

  • the default behavior (z_mix = FALSE) should be unchanged.
  • For z_mix = TRUE:
    • If the model family is a zero-inflation or hurdle family, return the zero-inflation weights in the same format as the weights for mixture families. The first mixture component is the zero-inflation state; the final component is the non-inflated state, and (in the case of zero_one_inflated_beta) the middle component is the one-inflated state.
    • If the model is a mixture family that contains a zero-inflated or hurdle model as one of the mixture components, then the output is like the default output, but the array slice for the zero-inflated component is split into two array slices (three in the case of a zero-one-inflated beta, but this should be irrelevant unless incorrect mixture probabilities returned when mixtures include hurdle families #1552 is resolved in a way that admits the zero-one-inflated-beta in mixture families) following the format described in the previous bullet.
    • If the model family is not itself zero-inflated, and doesn't have any zero-inflated components, then this should error informatively.

TODO (I'll take care of these, but could use feedback from @paul-buerkner where tagged):

  • Verify correctness of output and write tests.
  • Feedback from @paul-buerkner on a good naming convention for the array slices corresponding to different zi/hu mixture components, and implement. Suggestion: P(K = 2 & z = 1 | Y), where z = 1 means a structural zero, z = 2 means no inflation, and z = 3 means a structural 1.
  • Confirm that this works (or fails gracefully) when confronted with multivariate response models.
  • Feedback from @paul-buerkner on whether to retain minor helper functions and where to put them.

A final design note: if preferred, it is possible to construct an implementation that is easier (for me) to grasp, but that supports only garden-variety zi/hu families and not mixture families that include zi/hu families as components of the mixture. It would be possible to insert this simpler implementation inside of pp_mixture() via the same z_mix argument. A functioning example of how this branch of the control flow inside of pp_mixture might look internally is available at #1551.

Here's a snippet to play with:

remotes::install_github("jsocolar/brms", ref = "zi-mixture-probs")
library(brms)
set.seed(2)
dat <- data.frame(
  y = c(rep(0, 40), rpois(30, 3), rnbinom(30, 4, .4)),
  x = rnorm(100)
)
mix <- mixture(poisson, zero_inflated_negbinomial)
fit1 <- brm(y ~ x, dat, family = mix, 
           backend = "cmdstanr", 
           cores = 4)

fit2 <- brm(y ~ 1, dat, family = zero_inflated_poisson, 
            backend = "cmdstanr", 
            cores = 4)

pp_mixture(fit1)
pp_mixture(fit1, z_mix = TRUE)
pp_mixture(fit2, z_mix = FALSE)
pp_mixture(fit2, z_mix = TRUE)

out1 <- pp_mixture(fit1, summary = FALSE)
out2 <- pp_mixture(fit1, summary = FALSE, z_mix = TRUE)
out3 <- pp_mixture(fit2, summary = FALSE, z_mix = TRUE)

all(round(apply(out1, c(1,2), sum), 10) == 1)
all(round(apply(out2, c(1,2), sum), 10) == 1)
all(round(apply(out3, c(1,2), sum), 10) == 1)

@jsocolar
Copy link
Contributor Author

Just a note, there's some kind of numerical instability when calculating the zero inflation mixture components, but it's only cropped up for numerically extreme inputs. For example:

remotes::install_github("jsocolar/brms", ref = "zi-mixture-probs")
library(brms)

set.seed(2)
dat <- data.frame(
  y = c(0, 0:10),
  x = rnorm(12)
)

mix <- mixture(poisson, zero_inflated_negbinomial)
fit1 <- brm(y ~ x, dat, family = mix, 
           backend = "cmdstanr", 
           cores = 4)
fit1

yields a lot of divergences and a numerically absurd fit:

Population-Level Effects: 
                  Estimate   Est.Error      l-95% CI    u-95% CI Rhat Bulk_ESS Tail_ESS
mu1_Intercept        -1.84       36.68       -100.76      104.57 1.53      303       39
mu2_Intercept  20423274.18 25792409.61   -7167867.25 68382835.00 2.43        5       25
mu1_x                11.29      125.03       -352.45      346.55 1.50      312       38
mu2_x         -69804860.36 88156073.74 -233726450.00 24499132.50 2.43        5       25

Family Specific Parameters: 
       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
shape2     2.28      8.93     0.00    17.84 1.48        8       67
zi2        0.42      0.30     0.01     0.97 1.17       18      645
theta1     0.63      0.32     0.01     0.96 1.53        7       34
theta2     0.37      0.32     0.04     0.99 1.53        7       34

Then, pp_mixture(fit1) doesn't yield any problems, but pp_mixture(fit1, z_mix = TRUE, summary = FALSE) yields some NaNs.

@paul-buerkner
Copy link
Owner

paul-buerkner commented Oct 29, 2023

Thank you for your effort! The structure of the log_lik_mixture function looks a bit convoluted now. I don't really want this function to be touched, since it is meant purely for mixture models set up via mixture().

I think I would prefer a different internal approach only affecting pp_mixture, but I cannot fully explain is now because I need to think more about it. Could you provide me with the isolated code how to do pp_mixture for both zi and hu models, if you just implemented it for them in isolation? Without considering the current pp_mixture code? On that basis, I may add the feature myself to pp_mixture. I am sorry to cause you extra effort.

@jsocolar
Copy link
Contributor Author

jsocolar commented Nov 3, 2023

Hi Paul, no worries. I agree this implementation is convoluted, and if you have a vision for how to simplify that's great! The implementation over on the feature request already handles the discrete zero-inflated cases, so I'll just add some logic to recapitulate the data for the hurdle cases and then hand over to you to figure out how to properly expose it through pp_mixture. Might not come for another week or so because I'm currently traveling.

@paul-buerkner
Copy link
Owner

I will close this PR for now due to inactivity but please feel free to reopen it when you should find the time to work on it again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feature suggestion: posterior probability of being in the zero-inflation state
2 participants