
Notion of "inferential statement" in the paper #202

garyzhubc opened this issue Oct 10, 2023 · 8 comments

@garyzhubc

The paper suggests:

"However, given sufficient data it should be possible to conclude that there are (at least) two effect variables, and that

$$(b_1\ne 0\text{ or }b_2\ne0)\text{ and }(b_3\ne0\text{ or }b_4\ne0)$$

Our goal, in short, is to provide methods that directly produce this kind of inferential statement."

Are we providing such a statement without treating it as a hypothesis to be tested? If not, what is the uncertainty of such a statement? Is it given by the algorithm?

@garyzhubc
Author

garyzhubc commented Oct 10, 2023

Another question: can we get such a statement by constructing tests based on samples from the joint posterior? For example, with $N$ samples, calculating

$$\frac{1}{N}\sum_{n=1}^N\mathbb{1}\left(\left(b_1^{(n)}\ne0\text{ or }b_2^{(n)}\ne0\right)\text{ and }\left(b_3^{(n)}\ne0\text{ or }b_4^{(n)}\ne0\right)\right)$$

If so, why do we still want to introduce the notion of credible sets?
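For concreteness, here is a minimal sketch in plain R (not part of the susieR package) of the Monte Carlo estimate described above; the posterior draws are simulated placeholders so the snippet runs on its own:

```r
# Minimal sketch: Monte Carlo estimate of
# Pr((b1 != 0 or b2 != 0) and (b3 != 0 or b4 != 0)) from posterior samples.
# `b_samples` stands in for N draws of the coefficient vector (b1, ..., b4).
set.seed(1)
N <- 1000
b_samples <- matrix(rnorm(N * 4), nrow = N)   # hypothetical posterior draws
b_samples[abs(b_samples) < 1] <- 0            # force some coefficients to be exactly zero

in_event <- (b_samples[, 1] != 0 | b_samples[, 2] != 0) &
            (b_samples[, 3] != 0 | b_samples[, 4] != 0)
mean(in_event)                                # (1/N) * sum over n of the indicator
```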

@pcarbo
Member

pcarbo commented Oct 11, 2023

@garyzhubc The idea is that the "credible set" (CS) corresponds to the event that a single variable X has an effect on the response variable Y. So with that constraint, the posterior inclusion probabilities (PIPs) are sufficient to quantify uncertainty in which variables affect Y. No other posterior statistic is needed. I hope that helps.
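For illustration, a hedged sketch of how the PIPs would be obtained in practice; `X` and `y` are placeholders for your own data, and `susie()` and `susie_get_pip()` are assumed to be the susieR functions of those names, so please check against your installed version:

```r
# Hedged sketch: computing posterior inclusion probabilities with susieR.
# X is an n x p design matrix and y a length-n response (both placeholders).
library(susieR)
fit <- susie(X, y, L = 10)            # fit the Sum of Single Effects model
pip <- susie_get_pip(fit)             # PIP for each variable
head(sort(pip, decreasing = TRUE))    # variables most likely to have an effect on y
```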

@garyzhubc
Author

So you're saying the uncertainty is quantified via the PIPs, treating the CS as a fixed subset. But what about the uncertainty in the CS itself?

I'm thinking that maybe this is explicitly addressed by the CS, so I'm looking at the software right now as well as the paper. susie_get_cs has a parameter coverage = 0.95 by default. Is coverage the same as $\rho$ in Definition 1 of a credible set in Section 2.2 of the paper?

If so, does it mean the uncertainty in the CS itself is 0.95 by default? I noticed that a smaller coverage value gives smaller subsets, but I was expecting bigger subsets for lower confidence of containment, so can you explain why the subset sizes actually got smaller?

@pcarbo
Member

pcarbo commented Oct 12, 2023

"A level-ρ Credible Set is defined to be a subset of variables that has probability >ρ of containing at least one effect variable."

So if ρ goes down, you will need fewer variables in your CS to satisfy the condition.
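To make this concrete, here is a hedged sketch comparing credible set sizes at two coverage levels, using the `susie_get_cs()` function and `coverage` argument mentioned earlier in this thread (`fit` is assumed to be a fitted susie object, as in the sketch above):

```r
# Hedged sketch: comparing credible set sizes at two coverage (rho) levels.
# `fit` is a susie fit object; susie_get_cs() returns a list whose `cs`
# element holds the variable indices in each credible set.
cs95 <- susie_get_cs(fit, coverage = 0.95)
cs80 <- susie_get_cs(fit, coverage = 0.80)
sapply(cs95$cs, length)   # sets large enough to reach 95% coverage ...
sapply(cs80$cs, length)   # ... are typically no smaller than the 80% sets
```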

@stephens999
Contributor

@garyzhubc it's great that you have so many questions, but this venue is primarily for questions about the software and its usage. Also, you will generally get better answers if you can make your questions more precise. If you have specific questions about the software, please post them here; but for the more open-ended methods questions, I suggest you find someone local to you who is also interested in these methods, so you can discuss them among yourselves and see whether you can find the answers yourselves.

@garyzhubc
Author

garyzhubc commented Oct 21, 2023


Still a bit counterintuitive to me. In my understanding, higher $\rho$ means lower specificity: you are not sure which one among a set of SNPs is causal, versus knowing that a certain SNP is causal.

@garyzhubc
Author


Cool I'll ask around.

@stephens999
Contributor

@garyzhubc a 95% confidence interval will be bigger than an 80% confidence interval. It is the same idea with credible sets. (I'm not sure what you mean by uncertainty in the CS.)
