Trouble w/ plot_ppc using Weibull family #788
Hello everyone, I'm working on an example using Bambi to fit different distributions to 30 years of maximum precipitation data (a synthetic database is included here for ease). First, I set up a model adopting a Gumbel distribution, and everything worked fine (in line with the same example set up in PyMC). Then I built a second model with default properties using a Weibull distribution. The fitting portion looked good: no divergences, reasonable posteriors, etc. The problem shows up when trying to use az.plot_ppc. I encounter the following error: "Your data appears to have a single value or no finite values". When I try to plot with the "cumulative" option, it looks as though it's only considering one data point at a time too. The strange thing is that, looking at the predicted values 'mm', there doesn't seem to be anything different. The histograms look reasonable and the QQ-plot does too. The results from each model look virtually the same upon inspection when it comes to shape, data type, etc. I'm at a loss as to why this is not working for the Weibull but does work for the custom Gumbel. I'm attaching the code if it helps! Thanks!
Replies: 2 comments 3 replies
Hi @SanBertero! Thanks for the thorough example. I think the problem happens under PyMC's hood. Let me show you two examples:

import numpy as np
import scipy.special as sp
import pymc as pm
rng = np.random.default_rng(123)
# Simulate mean from an only-intercept model. 2 chains, 100 draws, 5 observations.
# So 'mu' is the same for all the observations (because it's intercept-only)
mu_draws = np.abs(150 + np.dstack([rng.normal(size=(2, 100, 1))] * 5))
# Simulate some alpha values
alpha_draws = np.abs(rng.normal(size=(2, 100, 1)))
# With 'mu' and 'alpha' get 'beta', which is what pm.Weibull needs
beta_draws = mu_draws / sp.gamma(1 + 1 / alpha_draws)
# See the draws, for a given chain and draw, they look all the same!
weibull_draws = pm.draw(pm.Weibull.dist(alpha=alpha_draws, beta=beta_draws))
weibull_draws
print((weibull_draws == weibull_draws[:, :, 0][..., None]).all())
# True --> they're in fact all the same

The next question I had was "Are we doing something dumb with shapes?". I decided to verify the result with a Gamma distribution. The meanings of the 'alpha' and 'beta' parameters are not the same, but that's not the important thing here.

# The draws don't look the same
gamma_draws = pm.draw(pm.Gamma.dist(alpha=alpha_draws, beta=beta_draws))
gamma_draws
print((gamma_draws == gamma_draws[:, :, 0][..., None]).all())
# False --> they're indeed not the same

To summarize, I think there's a problem in the code that generates random draws from the Weibull distribution in PyMC. I'll open an issue on their repo to see if it's indeed a problem.
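A side note on why this surfaces in az.plot_ppc: as far as I understand, the density is estimated separately for each posterior predictive draw, so if every observation within a draw has the same value there is nothing to estimate a density from, which would match the "single value" error. Below is a minimal NumPy-only sketch of a check for that symptom. `constant_across_obs` is a hypothetical helper (not part of PyMC, Bambi, or ArviZ), and the arrays are assumed to be laid out as (chain, draw, observation).

```python
import numpy as np


def constant_across_obs(draws):
    """Fraction of (chain, draw) pairs whose values are identical along the last axis."""
    draws = np.asarray(draws)
    # The peak-to-peak range along the observation axis is 0 when all values match.
    return float(np.mean(np.ptp(draws, axis=-1) == 0))


rng = np.random.default_rng(0)

# Draws that vary across observations, as a working sampler would produce.
healthy = rng.gamma(shape=2.0, scale=1.0, size=(2, 100, 5))

# Mimic the symptom: one value per (chain, draw), repeated for every observation.
degenerate = np.repeat(healthy[:, :, :1], 5, axis=-1)

print(constant_across_obs(healthy))
print(constant_across_obs(degenerate))
```

With real results, the array to pass would be something like the posterior predictive values for 'mm' extracted as a NumPy array. A fraction close to 1 suggests the same degenerate draws seen in the Weibull example above.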
@SanBertero FYI this is fixed in pymc-devs/pymc#7288. You can install from the main branch or wait for the next release.
Yes, the problem is the sampling under the hood in PyMC (which uses plain numpy in this case).
See pymc-devs/pymc#7220
So, if you really need this soon, you can go to the PyMC code in your environment and modify the method I show in that issue to include the check suggested by Ricardo. That should be it.
If it's not urgent, then just wait until we update it. It should be at most a few days.
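For anyone patching locally in the meantime, here is a rough NumPy-only sketch of the kind of mechanism and check discussed in the linked issue. It is not copied from PyMC's source: `weibull_draws_buggy` and `weibull_draws_fixed` are hypothetical stand-ins, and the assumption is that the sampler multiplies `beta` by a standard Weibull draw whose shape comes from `alpha` alone when no size is given.

```python
import numpy as np


def weibull_draws_buggy(rng, alpha, beta, size=None):
    # Assumed failure mode: with size=None the standard Weibull draw only has
    # alpha's shape, so multiplying by beta reuses one draw for every entry
    # that beta adds through broadcasting.
    return beta * rng.weibull(alpha, size=size)


def weibull_draws_fixed(rng, alpha, beta, size=None):
    # The check: when no size is given, draw at the shape that both
    # parameters broadcast to, so every entry gets its own random value.
    if size is None:
        size = np.broadcast_shapes(np.shape(alpha), np.shape(beta))
    return beta * rng.weibull(alpha, size=size)


rng = np.random.default_rng(123)
alpha = np.abs(rng.normal(size=(2, 100, 1))) + 0.5  # one value per (chain, draw)
beta = np.full((2, 100, 5), 150.0)                  # repeated over 5 observations

buggy = weibull_draws_buggy(rng, alpha, beta)
fixed = weibull_draws_fixed(rng, alpha, beta)

print((buggy == buggy[:, :, :1]).all())  # True: identical across observations
print((fixed == fixed[:, :, :1]).all())  # False: observations now differ
```

The actual method name and location inside PyMC are in the issue linked above. The sketch only shows why a missing size check can produce identical draws across observations.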