Designs/0028_quantile_functions #42

Open
wants to merge 9 commits into
base: master

@bgoodri bgoodri commented Sep 1, 2021

This is the first design doc for our NSF grant. It addresses the fairly narrow question of how best to add quantile functions to Stan Math for eventual inclusion in the Stan language, so that users can call them (usually in the transformed parameters block). It is related to our ideas about speeding up the log-likelihood evaluation, and to an exciting approach where the log-likelihood could be specified in a different way, but this design doc is independent of those (future) design docs, which will be influenced by this one.

It is essentially what I advocated in my StanCon presentation about how to do Bayesian inference without prior PDFs. The famous probability distributions were not intended to be used as priors and are difficult to use for that purpose, because they were usually constructed to have elementary expressions for their expectation, variance, etc. But prior expectations are much more difficult for users to think about than prior quantiles. That motivates an alternative way to construct Stan programs: prior beliefs about the substantive parameters are conveyed through a quantile function that is applied, in the transformed parameters block, to a cumulative probability that is declared in the parameters block with an implicit standard uniform prior.
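
To make the construction concrete, here is a minimal Python sketch of the same idea outside Stan: a cumulative probability `p` is drawn from a standard uniform and pushed through a quantile function to obtain the parameter. The choice of the normal quantile function, the values of `mu` and `sigma`, and the helper names `normal_qf` and `draw_theta` are illustrative assumptions, not part of the design doc.

```python
import random
from statistics import NormalDist, fmean, median

def normal_qf(p, mu=0.0, sigma=1.0):
    """Quantile function (inverse CDF) of a normal distribution."""
    return NormalDist(mu, sigma).inv_cdf(p)

rng = random.Random(42)  # fixed seed so the sketch is reproducible

def draw_theta(mu, sigma):
    # p plays the role of the cumulative probability with a standard uniform prior.
    p = rng.random()
    while p == 0.0:  # inv_cdf requires 0 < p < 1
        p = rng.random()
    # theta plays the role of the transformed parameter.
    return normal_qf(p, mu, sigma)

thetas = [draw_theta(mu=1.0, sigma=2.0) for _ in range(10_000)]
print(fmean(thetas), median(thetas))  # both should be close to mu = 1.0
```

The draws of `theta` follow whatever distribution the quantile function encodes, so the prior is specified entirely through the quantile function rather than through a density.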

rendered docs here

@jgabry
Member

jgabry commented Sep 2, 2021

@bob-carpenter
Collaborator

  1. I'd prefer a solution that just turns the checking off for users that want checking turned off. That can be shared with our other "turn off checking" routines.

  2. For the complementary case, the problem isn't precision of the function, it's the representational power of floating point. The smallest epsilon for which 1 - epsilon != 1 is about 1e-16. The smallest epsilon for which epsilon > 0 is about 5e-324 in double precision. So no matter what the quantile function does, we can't even formulate the argument 1 - 1e-20, because it's just 1 in floating point. Having said that, I'm OK with less precise versions and no complementary versions to start. We can always improve that later.

  3. I'm OK with either _qf or _icdf.

  4. I think it'll be easier and more coherent to include the locations in the location/scale families.
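
The floating-point point in item 2 can be checked directly. The following is a minimal Python sketch, assuming IEEE 754 double precision and using the exponential distribution purely as an illustration; the names `exponential_qf` and `exponential_cqf` are hypothetical and are not Stan Math signatures.

```python
import math

def exponential_qf(p, rate=1.0):
    """Quantile function: the x with CDF(x) = p, i.e. -log(1 - p) / rate."""
    if p >= 1.0:
        return math.inf
    return -math.log1p(-p) / rate

def exponential_cqf(q, rate=1.0):
    """Complementary version: takes the upper-tail probability q = 1 - p."""
    if q <= 0.0:
        return math.inf
    return -math.log(q) / rate

upper_tail = 1e-20
p = 1.0 - upper_tail  # the subtraction rounds to exactly 1.0

print(p == 1.0)                     # True: 1 - 1e-20 is not representable
print(exponential_qf(p))            # inf: the tail information is already lost
print(exponential_cqf(upper_tail))  # about 46.05, i.e. 20 * log(10)
```

The loss happens when the argument is formed, before any quantile function is called, which is why only a version that accepts the upper-tail probability directly can reach that far into the tail.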

@betanalpha

betanalpha commented Sep 21, 2021 via email

@bob-carpenter
Collaborator

I think that we need to decouple the general question of implementing quantile functions from the more contentious (well at least for me) issue of booting prior specification to quantile interpolation. ... The second issue concerns quantiles for prior modeling, and I think this is independent of the implementation of the standard quantile functions and can be spun off into a separate design doc arguing for the design and inclusion of a metalog function.

I'll second that.

One general comment is that I don’t think we should be referencing grants, client contracts, or any other personal obligations in design docs. Whether or not a feature is developed and then incorporated into Stan should be a decision of the community and not what any one developer might promise to an external entity.

I think it's OK to mention things as supporting docs if there's relevant content, but we're under no obligation as a project to merge features just because one of our devs was funded to do so. That's why we've been reluctant to write contracts that require something be integrated into Stan.

whether or not we also require quantile functions for the one-dimensional discrete families to ensure a uniform interface across all of the one-dimensional families.

Unless there's an easy way to implement them, I don't think we should require them.

I’m not sure I understand the comments about the location parameter not being meaningful for the quantile functions.

We should use the full set of parameters, including location. Anything else is just inviting users to implement their own location adjustments and that's a mess of possible sign errors.

require quantile functions for the one-dimensional discrete families to ensure a uniform interface across all of the one-dimensional families.

I agree that they're super useful for truncated RNGs and should be included. Just not required.
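
As an illustration of the truncated-RNG use case, here is a minimal Python sketch that draws from a truncated Poisson by inverting the CDF. It assumes a small rate so that direct summation of the PMF is adequate; `poisson_cdf`, `poisson_qf`, and `truncated_poisson_rng` are hypothetical helper names, not existing Stan functions.

```python
import math
import random

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), by direct summation of the PMF."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)

def poisson_qf(u, lam):
    """Generalized inverse CDF: the smallest k with P(X <= k) >= u."""
    k = 0
    term = math.exp(-lam)
    total = term
    while total < u and term > 0.0:  # stop if the PMF underflows to zero
        k += 1
        term *= lam / k
        total += term
    return k

def truncated_poisson_rng(lam, lo, hi, rng):
    """Draw from Poisson(lam) conditioned on lo <= X <= hi."""
    a = poisson_cdf(lo - 1, lam)  # probability mass strictly below lo
    b = poisson_cdf(hi, lam)      # probability mass up to and including hi
    u = a + (b - a) * (1.0 - rng.random())  # uniform on (a, b]
    k = poisson_qf(min(u, b), lam)
    return min(max(k, lo), hi)    # guard against rounding at the boundaries

rng = random.Random(0)
draws = [truncated_poisson_rng(3.0, 2, 6, rng) for _ in range(1000)]
print(min(draws), max(draws))
```

Each draw costs one uniform variate regardless of how little mass lies inside the truncation bounds, which is the advantage over rejection sampling from the untruncated `_rng`.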

@bgoodri
Author

bgoodri commented Sep 21, 2021

Thanks @betanalpha. I'm pretty much in agreement with @bob-carpenter's responses.

The quantile functions in the location-scale family definitely depend on the location parameter. The minor question is about the quantile density functions (the derivative of the quantile function with respect to the depth), which don't depend on the location, but I'm happy with keeping the signatures consistent.
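
A small numerical check of that statement, as a Python sketch: the logistic distribution is used only as a convenient location-scale example, and `logistic_qf`, `logistic_qdf`, and `finite_difference` are hypothetical names rather than proposed signatures.

```python
import math

def logistic_qf(p, mu, sigma):
    """Quantile function of the logistic distribution (location mu, scale sigma)."""
    return mu + sigma * (math.log(p) - math.log1p(-p))

def logistic_qdf(p, mu, sigma):
    """Quantile density: derivative of logistic_qf with respect to p.

    mu is accepted only to keep the signature consistent; it is unused.
    """
    return sigma / (p * (1.0 - p))

def finite_difference(f, p, h=1e-6):
    """Central-difference approximation to the derivative of f at p."""
    return (f(p + h) - f(p - h)) / (2.0 * h)

p, sigma = 0.3, 2.0
for mu in (0.0, 100.0):
    numeric = finite_difference(lambda u: logistic_qf(u, mu, sigma), p)
    print(mu, logistic_qdf(p, mu, sigma), numeric)  # same slope for every mu
```

Because the location only shifts the quantile function, it drops out of the derivative, while the scale multiplies it.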

With the discrete distributions, what is the Stan use case for the _qf that isn't already satisfied by the _rng? If we are lacking _rng functions for any discrete distributions, we should add them, but that seems like more of a defect in current Math.

One general thing on these grant-motivated PRs. The grantee could have time management problems if Stan developers suggest adding potentially useful features to the design doc that were not part of the grant proposal. Not that we should be merging half-baked designs or implementations thereof (irrespective of grant funding), but I wanted to make it explicit why I was putting quantile functions of univariate continuous distributions into the design and put other related things in the section on things that are out of scope for this design doc. The latter things might be worth doing by someone eventually, but I don't think their absence should hold up progress on the univariate continuous quantile functions.

@betanalpha

betanalpha commented Oct 1, 2021 via email

@bob-carpenter
Collaborator

It would be great to add functionality to the compiler to fill in these gaps so that if a user calls for some functionality that has not yet been implemented then they get a precise error message noting that gap and not the default error message noting that the function being requested just doesn’t exist.

I think this is a great idea. I opened a stanc3 issue to address it, quoting your comment as motivation: stan-dev/stanc3#987

any proposal should be judged on its merits for the project, and not any extraneous circumstances like a motivating grant.

I agree. It's also supported by Stan policy which says nothing about preferential treatment for funded features.

@spinkney
Contributor

spinkney commented Feb 9, 2022

When or how do we merge these? I really like this idea, hoping to make it "official". I mean, it's still a design doc so things can change or be updated, right?

@bgoodri
Author

bgoodri commented Feb 9, 2022

I want to make some small changes to it first.

5 participants