Designs/0028_quantile_functions #42
|
I think that we need to decouple the general question of implementing quantile functions from the more contentious (well at least for me) issue of booting prior specification to quantile interpolation.
Let me start with comments on quantile functions for the existing families of one-dimensional probability distributions. Given their various uses — including generating pseudo random numbers with truncations which is not discussed in the design doc — I don’t see any reason to not include them.
That said I think we should define some policy for whether or not quantile functions are _required_ for any one-dimensional family. Having a quantile function for every family unifies the interface so that users will always know what functions are available instead of having to look up what might be implemented, similar to the functions implemented in base R (which always have a density, a CDF, a quantile function, and a pseudo random number generator), but that also adds an additional burden to the introduction of new families. At this point we’ve largely converged to a standard set of families so I don’t think it would actually be that much of a burden, but it’s not zero.
A corollary to this is whether or not we also require quantile functions for the one-dimensional discrete families to ensure a uniform interface across all of the one-dimensional families. The design doc argues against this but they are very useful when simulating from truncated discrete distributions, and again it would ensure a uniform interface similar to base R.
As for naming I would vote for `family_qf`.
I’m not sure I understand the comments about the location parameter not being meaningful for the quantile functions. A location parameter translates the quantile function up and down the real line, as demonstrated by the implementation of the Cauchy quantile function earlier.
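To make the translation point concrete, here is a minimal Python sketch of the Cauchy quantile function with both location and scale. The name `cauchy_qf` only mirrors the `family_qf` naming suggested above; it is a hypothetical illustration and not an existing Stan or Stan Math signature.

```python
import math

def cauchy_qf(p, mu=0.0, sigma=1.0):
    """Quantile function of a Cauchy(mu, sigma) distribution for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    # Standard Cauchy quantile, scaled by sigma and translated by mu.
    return mu + sigma * math.tan(math.pi * (p - 0.5))

# Changing only the location moves every quantile by the same amount.
for p in (0.1, 0.5, 0.9):
    shift = cauchy_qf(p, mu=3.0, sigma=2.0) - cauchy_qf(p, mu=0.0, sigma=2.0)
    print(p, round(shift, 12))
```

The difference printed for each `p` is the location shift itself, which is the sense in which the location parameter slides the whole quantile function up and down the real line.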
The second issue concerns quantiles for prior modeling, and I think this is independent of the implementation of the standard quantile functions and can be spun off into a separate design doc arguing for the design and inclusion of a metalog function.
One general comment is that I don’t think we should be referencing grants, client contracts, or any other personal obligations in design docs. Whether or not a feature is developed and then incorporated into Stan should be a decision of the community and not what any one developer might promise to an external entity. In particular it would be nice to have a formal policy that, until something like a design doc has been settled, developers only promise implementations developed on external forks, to keep everything compartmentalized.
Otherwise I have many reservations about jumping hard into metalog approaches. I do agree that quantile elicitation can be extremely powerful (I use it almost exclusively for prior modeling in my courses and consultations), but no finite set of quantiles fully specifies a prior model. Moreover the quantile constraints are only ever extracted with some uncertainty, as with any domain expertise elicitation, and prior models need to be robust to small perturbations of the quantiles.
In practice both of these limitations can be resolved by using enough quantile constraints to select a prior density function from a given family, for example by using the algebraic solver. By restricting consideration to a given family we can use soft elicitation of global properties like tail behavior, skewness, and the like to complete the specification beyond the elicited quantiles. This is facilitated by common properties of the given family, in particular visualizations of the density function. At the same time, because the standard families of density functions are relatively smooth, perturbing the quantiles will almost always result in density functions that are visually equivalent.
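As a rough illustration of selecting a member of a family from elicited quantiles, the Python sketch below handles the simplest case, a normal family, where two quantiles determine the location and scale in closed form. The function name `normal_from_quantiles` and the elicited values are hypothetical; for families without a closed form, a numerical root-finder would play the role of the algebraic solver.

```python
from statistics import NormalDist

def normal_from_quantiles(p_lo, q_lo, p_hi, q_hi):
    """Return (mu, sigma) of the normal whose p_lo and p_hi quantiles are q_lo and q_hi."""
    if not (0.0 < p_lo < p_hi < 1.0 and q_lo < q_hi):
        raise ValueError("need 0 < p_lo < p_hi < 1 and q_lo < q_hi")
    std = NormalDist()
    z_lo, z_hi = std.inv_cdf(p_lo), std.inv_cdf(p_hi)
    # Solve q = mu + sigma * z at both probabilities (two linear equations).
    sigma = (q_hi - q_lo) / (z_hi - z_lo)
    mu = q_lo - sigma * z_lo
    return mu, sigma

# Hypothetical elicitation: 5% quantile at -1, 95% quantile at 3.
mu, sigma = normal_from_quantiles(0.05, -1.0, 0.95, 3.0)
# Perturb the upper quantile slightly to see how far the fitted parameters move.
mu_p, sigma_p = normal_from_quantiles(0.05, -1.0, 0.95, 3.1)
print(mu, sigma)
print(mu_p, sigma_p)
```

Comparing the two fits gives a quick, if crude, check of the robustness to perturbed quantiles discussed above.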
In my opinion the metalog approach makes this all too complicated. The rigidity provided by the standard families is replaced with an interpolation whose details are hidden to the user. Even if an ambitious user does go to the effort of visualizing the induced probability density function what control will they have to modify or tune the resulting shape? If interpolation configurations are exposed then how interpretable, and useful to most users, will they be? Similarly how robust will the interpolation be to the precise location of the input quantiles?
The abstracted interpolation will have particularly significant consequences when only a few quantiles have been elicited, especially out in the tails which is where prior shape is often most important. Eliciting more quantiles can help but that also places a much stronger burden on the user to translate more and more implicit domain expertise into quantitative constraints, which takes time and effort and training that often aren’t available. At the same time there are plenty of studies that discuss how poorly we reason about extreme events which makes eliciting extreme enough quantiles to pin down asymptotic tail behavior a particular challenge.
At the very least I think there needs to be a substantial amount of research and experimentation done to inform how something like metalog can be used robustly, and I personally think it’s premature to consider its inclusion in Stan until then. But again I think that this particular discussion should be separated from the discussion of the standard quantile functions.
|
I'll second that.
I think it's OK to mention things as supporting docs if there's relevant content, but we're under no obligation as a project to merge features just because one of our devs was funded to do so. That's why we've been reluctant to write contracts that require something be integrated into Stan.
Unless there's an easy way to implement them, I don't think we should require them.
We should use the full set of parameters, including location. Anything else is just inviting users to implement their own location adjustments and that's a mess of possible sign errors.
I agree that they're super useful for truncated RNGs and should be included. Just not required. |
Thanks @betanalpha. I'm pretty much in agreement with @bob-carpenter's responses. The quantile functions in the location-scale family definitely depend on the location parameter. The minor question is about the quantile density functions (derivative of the quantile function with respect to the depth), which don't depend on the location, but I'm happy with keeping the signatures consistent.
With the discrete distributions, what is the Stan use case for the _qf that isn't already satisfied by the _rng? If we are lacking _rng functions for any discrete distributions, we should add them, but that seems like more of a defect in current Math.
One general thing on these grant-motivated PRs. The grantee could have time management problems if Stan developers suggest adding potentially useful features to the design doc that were not part of the grant proposal. Not that we should be merging half-baked designs or implementations thereof (irrespective of grant funding), but I wanted to make it explicit why I was putting quantile functions of univariate continuous distributions into the design and put other related things in the section on things that are out of scope for this design doc. The latter things might be worth doing by someone eventually, but I don't think their absence should hold up progress on the univariate continuous quantile functions. |
With the discrete distributions, what is the Stan use case for the _qf that isn't already satisfied by the _rng? If we are lacking _rng functions for any discrete distributions, we should add them, but that seems like more of a defect in current Math.
The main use case I’ve encountered is sampling from truncated discrete distributions, for example a Poisson distribution truncated to values between 0 and 10. Naive rejection sampling approaches are prone to near-infinite loops if the source strength is too large which makes them dangerous for use in the generated quantities block. Because the space is discrete the inverse CDF method can be implemented by summing over the probabilities, but it’s messy. Having explicit quantile functions makes it much easier.
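For concreteness, here is a Python sketch of the inverse CDF approach for a truncated Poisson, done by summing probabilities over the truncation window. The function names are hypothetical and only illustrate the idea; nothing here is an existing Stan function.

```python
import math
import random

def poisson_pmf(k, lam):
    """Poisson probability mass at k, computed on the log scale for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def truncated_poisson_qf(p, lam, lower, upper):
    """Smallest k in [lower, upper] whose truncated CDF is at least p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if lam <= 0.0 or lower < 0 or upper < lower:
        raise ValueError("need lam > 0 and 0 <= lower <= upper")
    masses = [poisson_pmf(k, lam) for k in range(lower, upper + 1)]
    total = sum(masses)
    if total == 0.0:
        raise ValueError("truncation interval has no numerical mass")
    cumulative = 0.0
    for k, mass in zip(range(lower, upper + 1), masses):
        cumulative += mass / total
        if cumulative >= p:
            return k
    return upper  # guards against round-off leaving cumulative just below p

def truncated_poisson_rng(lam, lower, upper, rng=random):
    """One draw by the inverse CDF method: a fixed amount of work, no rejection loop."""
    return truncated_poisson_qf(rng.random(), lam, lower, upper)

# Rate far above the truncation window, where naive rejection would rarely accept.
draws = [truncated_poisson_rng(50.0, 0, 10, random.Random(seed)) for seed in range(5)]
print(draws)
```

Each draw costs one pass over the window regardless of the rate, which is what makes this safer than rejection sampling in something like the generated quantities block. An explicit quantile function would replace the hand-written summation.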
The bigger issue I’m worried about is consistency of the probability library. Most of the probability libraries in R, Python, Mathematica, etc have a pretty uniform coverage of the exposed representations. Once you know the name of a family of distributions you know how to call the corresponding density, CDF, quantile function, rng, etc. At this point the coverage in Stan is pretty uniform, but it’s becoming less and less uniform as more obscure families are added without all of the functionality. If a user assumes that everything will be uniformly available but gets an error message, it can become frustrating: they may assume that there’s a spelling error or other typo somewhere rather than functionality that just hasn’t been implemented.
It would be great to add functionality to the compiler to fill in these gaps so that if a user calls for some functionality that has not yet been implemented then they get a precise error message noting that gap and not the default error message noting that the function being requested just doesn’t exist.
One general thing on these grant-motivated PRs. The grantee could have time management problems if Stan developers suggest adding potentially useful features to the design doc that were not part of the grant proposal. Not that we should be merging half-baked designs or implementations thereof (irrespective of grant funding), but I wanted to make it explicit why I was putting quantile functions of univariate continuous distributions into the design and put other related things in the section on things that are out of scope for this design doc. The latter things might be worth doing by someone eventually, but I don't think their absence should hold up progress on the univariate continuous quantile functions.
Personally I think that any proposal should be judged on its merits for the project, and not any extraneous circumstances like a motivating grant. A discussion of how many univariate quantile functions we need to implement without making the overhead too high for any contribution is a meaningful one independent of that grant context. But if we continue to consider similar contexts then we’re prone to shifting these discussions away from what’s good for the project to what’s good for a single developer.
|
I think this is a great idea. I opened a stanc3 issue to address it, quoting your comment as motivation: stan-dev/stanc3#987
I agree. It's also supported by Stan policy which says nothing about preferential treatment for funded features. |
When or how do we merge these? I really like this idea, hoping to make it "official". I mean, it's still a design doc so things can change or be updated, right? |
I want to make some small changes to it first. |
This is the first design doc for our NSF grant and discusses the fairly narrow question of how best to add quantile functions to Stan Math for eventual inclusion in the Stan language so that they can be called by users (usually in the `transformed parameters` block). Although this is related to our ideas about speeding up the log-likelihood evaluation and also related to an exciting approach where the log-likelihood could be specified in a different way, this design doc is independent of those (future) design docs, although those future design docs will be influenced by this one.
It is essentially what I advocated in my StanCon presentation about how to do Bayesian inference without prior PDFs because the famous probability distributions were not intended to be used as priors and are difficult to use for that purpose because they were usually constructed to have elementary expressions for their expectation, variance, etc. But prior expectations are much more difficult for users to think about than prior quantiles, which motivates an alternative way to construct Stan programs where prior beliefs about the substantive parameters are conveyed through the quantile function that is applied in the `transformed parameters` block to a cumulative probability that is declared in the `parameters` block with an implicit standard uniform prior.
rendered docs here
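
The construction described above, a cumulative probability with a standard uniform prior pushed through a quantile function, can be sketched outside of Stan in a few lines of Python. The exponential family is used here only because its quantile function has a simple closed form; `exponential_qf`, the rate, and the seed are hypothetical choices for illustration.

```python
import math
import random

def exponential_qf(p, rate):
    """Quantile function of an Exponential(rate) distribution for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    return -math.log1p(-p) / rate

rng = random.Random(2021)
rate = 0.5
# p plays the role of the parameter with a standard uniform prior;
# theta plays the role of the transformed parameter.
thetas = [exponential_qf(rng.random(), rate) for _ in range(100_000)]

# The implied prior on theta should reproduce the quantiles it was built from.
target = exponential_qf(0.25, rate)
frac_below = sum(theta <= target for theta in thetas) / len(thetas)
print(frac_below)
```

The printed fraction should sit close to 0.25, reflecting that the prior on `theta` is carried entirely by the quantile function, with no prior density written down.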