Bayesian analytics #3
Oh yes! That would be awesome. As you allude to, priors could be generated through our sizing process. Anything is better than no prior. Also, I'm not sure how relevant older data is, considering seasonality... perhaps the 30 days we use in our sizing calculator is sufficient here too? I don't think we need to size experiments in advance with Bayesian inference, but we'd still need sizing for establishing the prior, I think.
There is no such thing as Bayesian with “no prior”. At a bare minimum there is an “uninformed prior” (which is a bit of a misnomer, imho), but you can’t take the prior out of the equation (or out of the philosophy, for that matter).
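For illustration, here is a minimal sketch (not mojito code; the counts are made up) of why even an "uninformed" Beta(1, 1) prior is still a prior: it shapes the posterior, visibly so at small sample sizes.

```r
# Hypothetical sketch: the prior never leaves the equation.
# With a Beta(a, b) prior and x conversions out of n subjects,
# the posterior is Beta(a + x, b + n - x).
x <- 12; n <- 100                          # made-up conversions / subjects

# "Uninformed" uniform prior: Beta(1, 1)
post_uninformed <- c(1 + x, 1 + n - x)

# Informative prior centred on a ~10% baseline CVR (e.g. from sizing data)
post_informed <- c(20 + x, 180 + n - x)

# The posterior means differ, showing the prior is always in play
post_uninformed[1] / sum(post_uninformed)  # ~0.127
post_informed[1] / sum(post_informed)      # ~0.107
```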
Could you explain what you mean by "sizing"? I am not familiar with this term.
We haven't documented this, but before running experiments at Mint Metrics, we calculate the traffic we need for a minimum detectable effect using some helper functions, e.g.:

```r
> estimateDurationQuery(
+   app_id = "site_name",
+   trigger_clause = "page_urlpath like '/products/%'",
+   conversion_clause = "page_urlpath = '/order/thank-you/'",
+   delta = -0.07,
+   recipes = 2,
+   stat_power = 0.8
+ )
[1] "Days to run: 31.786299299664"
  subjects conversions  base_cvr target_cvr
1    47862        5221 0.1090845  0.1014485
```
It gives us a baseline conversion rate for users who would typically be exposed over the last 30 days. Perhaps this baseline conversion rate will be useful as a prior?
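One hypothetical way that could work (a sketch only, not an agreed design): treat the 30-day baseline as a Beta prior, downweighted so the history doesn't overwhelm the experiment's own data. The counts below are taken from the sizing output above; the `prior_weight` value is an assumption.

```r
# Hypothetical sketch: turn the 30-day baseline into a Beta prior.
# subjects / conversions come from the sizing output above.
subjects    <- 47862
conversions <- 5221

# Downweight the history so the prior carries the weight of, say,
# 1000 pseudo-observations rather than the full 30 days of traffic.
prior_weight <- 1000
alpha0 <- (conversions / subjects) * prior_weight  # ~109 pseudo-conversions
beta0  <- prior_weight - alpha0                    # ~891 pseudo-non-conversions

# Posterior after observing x conversions out of n subjects in a recipe:
#   Beta(alpha0 + x, beta0 + n - x)
```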
Kind of similar to this calculator: https://www.evanmiller.org/ab-testing/sample-size.html. The actual calculation is performed here in our code: https://github.com/mint-metrics/mojito-r-analytics/blob/master/mojito-functions/experiment_sizing.R#L40
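For reference, base R's stock `power.prop.test` performs a comparable two-proportion power calculation (this is not the mojito helper linked above, just the standard function, using the numbers from the sizing output):

```r
# Stock two-proportion power calculation, roughly analogous to the
# sizing helper above: baseline CVR ~10.9%, relative delta of -7%.
p1 <- 0.1090845
p2 <- p1 * (1 - 0.07)   # target CVR under a -7% relative effect

power.prop.test(p1 = p1, p2 = p2, sig.level = 0.05, power = 0.8)
# $n is the required subjects per recipe; dividing the total required
# subjects by daily eligible traffic gives a rough duration in days.
```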
I assume this refers to the "desired statistical power"? In a Frequentist paradigm, power is needed to control the type-II error (false negative) rate. Conversely, in a Bayesian paradigm, there are (afaik) no type-II error rate guarantees.
Indeed there is no need. Sizing (or power) is needed to make guarantees about error rates that Bayesian inference does not consider. (ftr: imho this is a limitation of the Bayesian approach, not a strength.)
Not sure if my thinking is correct, but could we take an approach that leverages the advantages of both paradigms? I.e. Frequentist methods to determine a target sample size / test duration to reduce type-II errors, and Bayesian inference (with strong priors) to reduce type-I errors and produce results that are easier to disseminate? I've seen some other CRO agencies use 'hybrid' approaches, albeit not as simplistic as this, so this train of thought may be completely off.
While a Bayesian approach might empirically reduce type-I errors (when evaluated against some simulated data using a Frequentist lens), there are no guarantees about error rates (type-I or type-II). I really have no idea how one would get the best of both worlds.
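To make the 'hybrid' idea concrete, here is one entirely hypothetical shape for it (all counts below are invented): size and stop the experiment on Frequentist terms, then report a Bayesian posterior summary alongside the significance test.

```r
# Hypothetical hybrid sketch: Frequentist stopping rule, Bayesian summary.
# Assumes counts were collected at the pre-computed sample size.
x_ctrl <- 5200; n_ctrl <- 47900   # invented control conversions / subjects
x_trt  <- 5450; n_trt  <- 47850   # invented treatment conversions / subjects

# Frequentist read-out: two-proportion test at the planned sample size
prop.test(c(x_trt, x_ctrl), c(n_trt, n_ctrl))

# Bayesian read-out: P(treatment CVR > control CVR) under Beta(1, 1) priors,
# estimated via Monte Carlo draws from the two posteriors
draws  <- 1e5
p_ctrl <- rbeta(draws, 1 + x_ctrl, 1 + n_ctrl - x_ctrl)
p_trt  <- rbeta(draws, 1 + x_trt,  1 + n_trt  - x_trt)
mean(p_trt > p_ctrl)   # probability that treatment beats control
```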
@kingo55 should we consider fleshing out Bayesian analytics again? It would be interesting to develop some functionality to run side-by-side with the Frequentist reports we run, to see how it stacks up.
The main thing to put some thought into is how we calculate priors. We could perhaps calculate it (mean + std deviation) based on the past X months' worth of conversion data; a rough sketch follows below.
Another question is how we deal with sizing and presenting the data in our reports.
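As a sketch of the mean + std deviation idea (hypothetical: the daily CVR series is placeholder data, and the method-of-moments Beta fit is just one option):

```r
# Hypothetical sketch: fit a Beta prior to the past X months of
# daily conversion rates via method of moments.
daily_cvr <- c(0.101, 0.112, 0.098, 0.107, 0.115, 0.104)  # placeholder data

m <- mean(daily_cvr)
v <- var(daily_cvr)

# Method-of-moments estimates for Beta(alpha, beta):
#   alpha = m * (m(1-m)/v - 1),  beta = (1-m) * (m(1-m)/v - 1)
common <- m * (1 - m) / v - 1
alpha0 <- m * common
beta0  <- (1 - m) * common

c(alpha = alpha0, beta = beta0)
```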
Refs: