A more thorough explanation of overfitting, need for balance, etc #124

Closed
malcolmbarrett opened this issue Oct 14, 2022 · 4 comments · Fixed by #274

Comments

@malcolmbarrett
Collaborator

We're not satisfied with the way we usually address these questions

@malcolmbarrett
Collaborator Author

malcolmbarrett commented Sep 18, 2024

Here's where I sit with this today:

  1. Overfitting is not exactly the same problem as in prediction, because we're not taking this model and making predictions with it on different data. It remains true that the right causal model needn't predict particularly well to be unbiased. But:
  2. Overfitting is model misspecification (https://statmodeling.stat.columbia.edu/2017/07/15/what-is-overfitting-exactly/), so from that perspective overfit models will have residual confounding and thus biased estimates.
  3. Overfitting can exacerbate positivity violations, causing bias. (I don't have a good example of this, but it was inspired by some of Sherri's work here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5824732/)
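
A possible toy example for point 3, sketched in plain Python rather than taken from the book. Everything here is an assumption made for illustration: the data-generating process, the helper names `binned_propensity` and `share_extreme`, and the use of a binned treated fraction as a stand-in for a propensity score model. The true propensity is kept inside [0.2, 0.8], so positivity holds in the population, and the only thing that changes is how flexible the model is.

```python
import random


def binned_propensity(x, a, n_bins):
    """Estimate P(A=1 | X) as the treated fraction within equal-width bins of x.

    More bins means a more flexible (and eventually overfit) model.
    """
    treated = [0] * n_bins
    total = [0] * n_bins
    bins = [min(int(xi * n_bins), n_bins - 1) for xi in x]
    for b, ai in zip(bins, a):
        total[b] += 1
        treated[b] += ai
    # Every unit sits in its own bin, so total[b] >= 1 for each b used here.
    return [treated[b] / total[b] for b in bins]


def share_extreme(ps):
    """Share of units whose estimated propensity is exactly 0 or 1."""
    return sum(p in (0.0, 1.0) for p in ps) / len(ps)


rng = random.Random(124)  # arbitrary seed, chosen only for reproducibility
n = 500
x = [rng.random() for _ in range(n)]
# Hypothetical true propensity: 0.2 + 0.6 * x, bounded away from 0 and 1.
a = [1 if rng.random() < 0.2 + 0.6 * xi else 0 for xi in x]

ps_coarse = binned_propensity(x, a, n_bins=5)    # about 100 units per bin
ps_fine = binned_propensity(x, a, n_bins=250)    # about 2 units per bin

extreme_coarse = share_extreme(ps_coarse)
extreme_fine = share_extreme(ps_fine)
print(f"5 bins:   {extreme_coarse:.1%} of units have an estimated PS of 0 or 1")
print(f"250 bins: {extreme_fine:.1%} of units have an estimated PS of 0 or 1")
```

With very fine bins, many bins contain only treated or only untreated units, so the fitted propensities hit 0 or 1 even though the true ones never do. That is a sample-level positivity violation created by the model rather than by the data-generating process.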

@malcolmbarrett
Collaborator Author

malcolmbarrett commented Sep 18, 2024

Bias-variance tradeoff and overfitting of PS model? https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5756087/

My instinct is that overfitting lowers the statistical bias at the expense of bias from the true effect, and that, in a frequentist sense, it increases variance by virtue of the bias from the true effect, leaving poor nominal coverage of the CIs.

Include this either with the callout box or somewhere else on its own.
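
A rough simulation sketch of the "bias from the true effect" part of that instinct, again in plain Python and under assumed conditions: one confounder, a true effect of 1, a binned treated fraction standing in for the PS model, and a normalized (Hajek-style) IPW estimator. The names `binned_propensity`, `hajek_ipw`, and `mean_bias` are made up for this example. It does not check variance or CI coverage.

```python
import random


def binned_propensity(x, a, n_bins):
    """Estimate P(A=1 | X) as the treated fraction within equal-width bins of x."""
    treated = [0] * n_bins
    total = [0] * n_bins
    bins = [min(int(xi * n_bins), n_bins - 1) for xi in x]
    for b, ai in zip(bins, a):
        total[b] += 1
        treated[b] += ai
    return [treated[b] / total[b] for b in bins]


def hajek_ipw(y, a, ps):
    """Weighted difference in means: weights 1/ps (treated), 1/(1-ps) (untreated).

    A treated unit's bin always has ps > 0 and an untreated unit's bin always
    has ps < 1, so the weights below are finite for binned propensities.
    """
    num1 = den1 = num0 = den0 = 0.0
    for yi, ai, pi in zip(y, a, ps):
        if ai == 1:
            w = 1.0 / pi
            num1 += w * yi
            den1 += w
        else:
            w = 1.0 / (1.0 - pi)
            num0 += w * yi
            den0 += w
    return num1 / den1 - num0 / den0


def mean_bias(n_bins, n=400, reps=200, seed=124):
    """Average (estimate - truth) over repeated simulated datasets."""
    rng = random.Random(seed)  # same seed gives the same datasets per n_bins
    true_effect = 1.0          # assumed true causal effect
    total = 0.0
    for _ in range(reps):
        x = [rng.random() for _ in range(n)]
        a = [1 if rng.random() < 0.2 + 0.6 * xi else 0 for xi in x]
        # x confounds the effect: it raises both treatment probability and y.
        y = [true_effect * ai + 2.0 * xi + rng.gauss(0.0, 1.0)
             for xi, ai in zip(x, a)]
        ps = binned_propensity(x, a, n_bins)
        total += hajek_ipw(y, a, ps) - true_effect
    return total / reps


bias_coarse = mean_bias(n_bins=5)
bias_fine = mean_bias(n_bins=200)
print(f"mean bias, 5 bins:   {bias_coarse:+.3f}")
print(f"mean bias, 200 bins: {bias_fine:+.3f}")
```

In this setup, bins that contain only treated or only untreated units get a weight of exactly 1, so the heavily overfit model pulls the estimate back toward the crude, confounded difference. Whether that carries over to realistic PS models is a separate question.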

@malcolmbarrett
Collaborator Author

Callout box in PS model fitting.

@malcolmbarrett
Collaborator Author

What about when you have the right model but can't fit it, e.g., too many confounders relative to the sample size? (lasso, dimension reduction)
