Trouble w/ plot_ppc using Weibull family #788

SanBertero · 2024-03-26T18:20:48Z

SanBertero
Mar 26, 2024

Hello everyone,

I'm working on an example using Bambi to fit different distributions to 30 years of maximum precipitation data (a synthetic dataset is included here for ease). First, I set up a model with a Gumbel distribution, and everything worked fine (in line with the same example set up in PyMC). Then I built a second model with default properties using a Weibull distribution.

The fitting portion looked good: no divergences, reasonable posteriors, etc. The problem shows up when I try to use az.plot_ppc. I get the following error: "Your data appears to have a single value or no finite values". When I try to plot with the "cumulative" option, it looks as though it's only considering one data point at a time too.

The strange thing is that, looking at the predicted values for 'mm', there doesn't seem to be anything different. The histograms look reasonable, and so does the QQ-plot. On inspection, the results from each model look virtually the same in terms of shape, data type, etc.

I'm at a loss as to why this fails for the Weibull but works for the custom Gumbel. I'm attaching the code in case it helps!

bambi_qa.zip

Thanks!

tomicapretto · 2024-03-26T20:31:14Z

tomicapretto
Mar 26, 2024
Maintainer

Hi @SanBertero ! Thanks for the thorough example.

I think the problem happens under PyMC's hood. Let me show you two examples:

import numpy as np
import scipy.special as sp
import pymc as pm

rng = np.random.default_rng(123)

# Simulate the mean from an intercept-only model: 2 chains, 100 draws, 5 observations.
# 'mu' is the same for all observations within a draw (because the model is intercept-only)
mu_draws = np.abs(150 + np.dstack([rng.normal(size=(2, 100, 1))] * 5))

# Simulate some alpha values
alpha_draws = np.abs(rng.normal(size=(2, 100, 1)))

# With 'mu' and 'alpha' get 'beta', which is what pm.Weibull needs
beta_draws = mu_draws / sp.gamma(1 + 1 / alpha_draws)

# Look at the draws: for a given chain and draw, they all look the same!
weibull_draws = pm.draw(pm.Weibull.dist(alpha=alpha_draws, beta=beta_draws))
weibull_draws

print((weibull_draws == weibull_draws[:, :, 0][..., None]).all())
# True --> they're in fact all the same
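
The conversion from 'mu' to 'beta' above relies on the Weibull mean, mu = beta * Gamma(1 + 1/alpha). As a quick sanity check of that formula (a NumPy-only sketch, separate from the PyMC code; the value alpha = 2 is picked only for illustration), the sample mean of scaled Weibull draws should land near the target mean:

```python
import math
import numpy as np

rng = np.random.default_rng(42)

mu = 150.0     # target mean, as in the example above
alpha = 2.0    # illustrative shape value, chosen only for this check

# Weibull mean: mu = beta * Gamma(1 + 1/alpha), so solve for the scale beta
beta = mu / math.gamma(1 + 1 / alpha)

# NumPy draws a standard (scale 1) Weibull; multiply by beta to apply the scale
samples = beta * rng.weibull(alpha, size=200_000)

print(round(beta, 2))   # scale implied by mu and alpha
print(samples.mean())   # should land close to 150
```

So the parameter conversion itself behaves as expected; it is not what makes the draws identical.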

The next question I had was "Are we doing something dumb with shapes?". I decided to verify the result with a Gamma distribution. The meaning of the 'alpha' and 'beta' parameters is not the same, but that's not the important thing here.

# The draws don't look the same
gamma_draws = pm.draw(pm.Gamma.dist(alpha=alpha_draws, beta=beta_draws))
gamma_draws

print((gamma_draws == gamma_draws[:, :, 0][..., None]).all())
# False --> they're indeed not the same

To summarize, I think there's a problem in the code that generates random draws from the Weibull distribution in PyMC.

I'll open an issue on their repo to see if it's indeed a problem.
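
The equality checks above can be wrapped in a small helper that works on any array of draws shaped (chain, draw, observation). This is only a sketch: the function name is made up for illustration, and the "good" and "bad" arrays are synthetic stand-ins rather than output from a fitted model.

```python
import numpy as np


def constant_across_observations(draws: np.ndarray) -> np.ndarray:
    """Return a (chain, draw) boolean array: True where every observation is identical."""
    return (draws == draws[..., :1]).all(axis=-1)


rng = np.random.default_rng(1)

# Healthy case: every observation gets its own random number
good = rng.gamma(shape=2.0, scale=3.0, size=(2, 100, 5))

# Degenerate case: one random number per (chain, draw), repeated for all 5 observations
bad = np.repeat(rng.gamma(shape=2.0, scale=3.0, size=(2, 100, 1)), 5, axis=-1)

print(constant_across_observations(good).mean())  # 0.0
print(constant_across_observations(bad).mean())   # 1.0
```

With a fitted model, the same helper could be applied to the posterior predictive array, for example `idata.posterior_predictive["mm"].values`, assuming `idata` is the InferenceData object and 'mm' is the response name. A fraction of 1.0 would mean each draw contains a single distinct value, which would be consistent with the "single value" message reported by az.plot_ppc.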

3 replies

SanBertero Mar 26, 2024
Author

Thank you for your quick reply!

If it helps, here's the same example solved with PyMC. In this case ArviZ didn't fail. The parameters being inferred for the Weibull case differ between the two libraries, so I don't know how comparable the two setups are, but I figured it may help...

Maybe the issue does indeed start with how the samples are drawn, as you showed.

bambi_qa2.zip

tomicapretto Mar 26, 2024
Maintainer

Yes, the problem is the sampling under the hood in PyMC (which uses plain numpy in this case).
See pymc-devs/pymc#7220

So, if you really need this soon, you can go to the PyMC code in your environment and modify the method I show in that issue to include the check suggested by Ricardo. That should be it.

If it's not urgent, then just wait until we update it. It should be at most a few days.
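
For readers who want to see how identical draws like these can come out of plain NumPy, here is a minimal sketch. It rests on an assumption that this thread does not confirm: that the draw is computed as scale times `rng.weibull(shape)` without an explicit `size`. It is not a copy of PyMC's code, and the size-aware variant is only one plausible form of the kind of check mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes mirroring the earlier example: 2 chains, 100 draws, 5 observations
alpha = np.abs(rng.normal(size=(2, 100, 1))) + 0.5   # shape parameter, one per (chain, draw)
beta = np.full((2, 100, 5), 150.0)                   # scale parameter, repeated per observation

# Naive: without `size`, NumPy returns one value per element of alpha, i.e. shape (2, 100, 1).
# Multiplying by beta only broadcasts that single value across the 5 observations.
naive = beta * rng.weibull(alpha)

# Size-aware: broadcast the parameter shapes first so each observation gets its own draw
size = np.broadcast_shapes(alpha.shape, beta.shape)
fixed = beta * rng.weibull(alpha, size=size)

print(naive.shape, (naive == naive[:, :, :1]).all())   # (2, 100, 5) True
print(fixed.shape, (fixed == fixed[:, :, :1]).all())   # (2, 100, 5) False
```

Both results have the expected shape, which is why nothing looks wrong until the values within a single draw are compared.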

Answer selected by SanBertero

SanBertero Mar 26, 2024
Author

Thank you!

tomicapretto · 2024-04-28T14:50:10Z

tomicapretto
Apr 28, 2024
Maintainer

@SanBertero FYI this is fixed in pymc-devs/pymc#7288. You can install from the main branch or wait for the next release.
