A ridge penalty that doesn't bias toward WT #51

wsdewitt · 2023-03-02T22:47:24Z

Problem

When a mutation $i$ does not appear in the training data and there is no regularization on $\beta_i$, the loss does not depend on $\beta_i$, so its gradient is zero and the estimate stays at its initialized value throughout optimization. This seems undesirable: if that mutation appears in the test set, its prediction is determined by the initialization, which amounts to random noise when the initialization is random. Commonly this sort of thing is dealt with using a ridge regression term
$$R(\beta) = \sum_i \beta_i^2,$$
which is equivalent to putting a normal prior centered at zero on the $\beta_i$. The drawback is that we know typical mutation effects are deleterious rather than WT-like (near zero), so shrinking toward zero seems like a bad idea.
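
To make the two behaviors concrete, here is a minimal sketch on a toy problem. It assumes a purely additive linear model with squared-error loss and plain gradient descent, which is a simplification and not necessarily the model in this repo; `X`, `y`, and `fit` are hypothetical names made up for illustration.

```python
import numpy as np

# Toy one-hot design: X[v, i] = 1 if variant v carries mutation i.
# Mutation 2 never appears in training (its column is all zeros).
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0]])
y = np.array([-1.0, -2.0, -3.0])  # made-up deleterious effects


def fit(X, y, beta_init, ridge=0.0, lr=0.05, steps=5000):
    """Gradient descent on ||X beta - y||^2 + ridge * sum(beta^2)."""
    beta = beta_init.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ beta - y) + 2 * ridge * beta
        beta -= lr * grad
    return beta


beta_init = np.array([0.3, -0.7, 1.9])  # arbitrary initialization

beta_noreg = fit(X, y, beta_init)             # no penalty
beta_ridge = fit(X, y, beta_init, ridge=0.1)  # standard ridge

print("no penalty:", beta_noreg)  # entry 2 is still the initial value
print("ridge:     ", beta_ridge)  # entry 2 has been shrunk to ~0 (WT-like)
```

Without the penalty the unobserved coefficient never moves from `beta_init[2]`; with the standard ridge term it decays to zero, i.e. a WT-like prediction.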

Proposed resolution

Suppose we magically knew that the typical latent mutation effect was $\bar\beta$. In that case we'd want the prior on $\beta_i$ to regularize toward that value instead of zero:
$$R(\beta) = \sum_i (\beta_i-\bar\beta)^2.$$
This is equivalent to a normal prior centered at $\bar\beta$. Instead of encouraging WT-like predictions for unobserved mutations, it will encourage typical (deleterious) predictions for them. So my proposal is to use this offset ridge penalty and include $\bar\beta$ as a learnable scalar parameter representing the typical mutation effect (interpretable as a centering operation in the latent space).
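
A minimal sketch of the offset penalty with a jointly learned $\bar\beta$, again assuming a toy additive linear model with squared-error loss and plain gradient descent (not necessarily how it would be wired into this repo). The names `X`, `y`, `fit_offset_ridge`, and `beta_bar` are hypothetical.

```python
import numpy as np

# Toy one-hot design: X[v, i] = 1 if variant v carries mutation i.
# Mutation 2 never appears in training (its column is all zeros).
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0]])
y = np.array([-1.0, -2.0, -3.0])  # made-up deleterious effects


def fit_offset_ridge(X, y, beta_init, beta_bar_init=0.0,
                     ridge=0.1, lr=0.05, steps=10000):
    """Gradient descent on ||X beta - y||^2 + ridge * sum((beta - beta_bar)^2),
    optimizing beta and the scalar beta_bar jointly."""
    beta = beta_init.copy()
    beta_bar = beta_bar_init
    for _ in range(steps):
        dev = beta - beta_bar
        grad_beta = 2 * X.T @ (X @ beta - y) + 2 * ridge * dev
        grad_bar = -2 * ridge * np.sum(dev)
        beta -= lr * grad_beta
        beta_bar -= lr * grad_bar
    return beta, beta_bar


beta_init = np.array([0.3, -0.7, 1.9])  # arbitrary initialization
beta, beta_bar = fit_offset_ridge(X, y, beta_init)

print("beta:    ", beta)
print("beta_bar:", beta_bar)
```

Setting the gradients to zero shows what to expect in this toy setting: $\partial R/\partial\bar\beta = 0$ gives $\bar\beta = \frac{1}{n}\sum_i \beta_i$, and for an unobserved mutation the data term contributes nothing, so $\partial R/\partial\beta_i = 0$ gives $\beta_i = \bar\beta$. The unobserved mutation is therefore assigned the typical effect learned from the observed ones instead of zero.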

wsdewitt mentioned this issue Jul 16, 2023

If a mutation in the bundle is unmeasured, throw away data for all variants with mutations at this site #84

Closed

jgallowa07 added the enhancement New feature or request label Mar 22, 2024
