using spatial distances masks some escape sites #161

Bernadetadad · 2023-03-23T04:12:36Z

I ran the COV-3600 antibody with the latest polyclonal version, with and without spatial_distances, and the results are very different: adding the spatial distance parameter significantly changes key escape sites seen before (such as sites 420 and 486). It is not clear to me why that is the case. Attached is a zipped HTML file comparing escape with and without spatial distances. The COV-3600 barcode runs are in the BA.2 repo.

One thing I noticed is that the spatial distances table does not include distances between or within all chains. E.g., in the attached image I read in chains A, B, and C, but the spatial distances table only includes distances between sites 371 and 420 for chains A and C. This is the case regardless of which .pdb I use. Is it supposed to work like this? The distances within and between chains should be different.

COV3600.html.zip

jbloom · 2023-03-23T16:55:40Z

@Bernadetadad, just looking at this qualitatively, I don't think it is a bug per se, but perhaps just a limitation of spatial regularization.

First off, the chain thing is not a bug. If you look at the inter_residue_distances function, you see it returns the closest pair of sites across all chains, which are usually the ones in the same monomer. But the Polyclonal fitting does not know about chains, so the chain information isn't used. The intuition here is just that things that are close together should be in the same epitope, and they can be close together either by being in the same monomer or in adjacent monomers. Usually it's the same monomer, but it could be an adjacent one for monomer-bridging antibodies. In any case, the fitting uses the closest pair.
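To make the "closest pair across all chains" idea concrete, here is a minimal sketch. It is not the actual inter_residue_distances implementation: the helper name `closest_pair_distances`, the input layout, and the coordinates are all made up for illustration.

```python
import numpy as np


def closest_pair_distances(coords_by_chain):
    """Hypothetical helper: minimum distance per site pair over all chain pairs.

    coords_by_chain maps chain -> {site: (x, y, z)}. The result maps
    (site1, site2) with site1 < site2 to the smallest distance found
    between any copy of site1 and any copy of site2, in any chains.
    """
    chains = list(coords_by_chain)
    sites = sorted({s for c in chains for s in coords_by_chain[c]})
    dists = {}
    for i, s1 in enumerate(sites):
        for s2 in sites[i + 1:]:
            candidates = [
                np.linalg.norm(
                    np.asarray(coords_by_chain[c1][s1])
                    - np.asarray(coords_by_chain[c2][s2])
                )
                for c1 in chains
                for c2 in chains
                if s1 in coords_by_chain[c1] and s2 in coords_by_chain[c2]
            ]
            if candidates:
                dists[(s1, s2)] = float(min(candidates))
    return dists


# Made-up coordinates (not from any real PDB), chosen so that the
# closest 371-420 pair spans chains A and C.
toy = {
    "A": {371: (0.0, 0.0, 0.0), 420: (30.0, 0.0, 0.0)},
    "C": {371: (50.0, 0.0, 0.0), 420: (8.0, 0.0, 0.0)},
}
result = closest_pair_distances(toy)
print(result)
```

In this toy case the single reported 371-420 distance comes from an A/C pair, which is why a table built this way has one row per site pair rather than one row per chain combination.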

The regularization operates on the mean escape at a site, not the total. So if you click on the mean plots, you see for the non-regularized ones the mean escape is a lot higher for 420 and 487 than for 371, as at 371 only a single mutation escapes. Whether the normalization should actually be on the total is a sort of subjective question.
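A small sketch of why a site with a single escaping mutation looks weak on the mean. The escape values below are invented for illustration and are not from the COV-3600 fits.

```python
import numpy as np

# Invented per-mutation escape values at two sites (not real data):
# at 371 only one mutation escapes, at 420 every measured mutation does.
escape = {
    371: [2.0] + [0.0] * 18,
    420: [1.0] * 19,
}

site_mean = {site: float(np.mean(vals)) for site, vals in escape.items()}
site_total = {site: float(np.sum(vals)) for site, vals in escape.items()}
site_max = {site: float(np.max(vals)) for site, vals in escape.items()}

for site in escape:
    print(site, site_mean[site], site_total[site], site_max[site])
```

Here 371 has the largest single-mutation escape but a much smaller site mean than 420, so any term that acts on the site mean treats 371 as a weak site.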

But I think the main issue here is that site 371 is not spatially proximal to sites 420 and 487 in the RBD if you look at the structure. This is because the 371 mutation probably affects up-down conformation of the RBD and is mostly acting that way. So the spatial regularization argues against them being in the same epitope as it doesn't know about things like RBD up-down.

It may be that the solution is to just drop the regularization weights. Right now reg_spatial2_weight is set to 0.001 by default (see here). You could either set it all the way to zero, or just decrease it modestly, like by another order of magnitude, and see if that helps?
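To get a feel for what changing the weight by orders of magnitude does, here is a toy penalty. This is an assumption for illustration only, not polyclonal's actual regularization term: it assumes a penalty of the form weight × Σ d² × |escape at site 1| × |escape at site 2| over site pairs, and the distances and escape values are invented.

```python
def toy_spatial_penalty(site_escape, distances, weight):
    """Toy penalty (assumed form, not polyclonal's): weight times the sum
    over site pairs of squared distance times both site-level escapes."""
    sites = sorted(site_escape)
    total = 0.0
    for i, r1 in enumerate(sites):
        for r2 in sites[i + 1:]:
            total += (
                distances[(r1, r2)] ** 2
                * abs(site_escape[r1])
                * abs(site_escape[r2])
            )
    return weight * total


# Invented site-level escapes and distances in angstroms (not real data).
site_escape = {371: 0.1, 420: 1.0, 487: 1.0}
distances = {(371, 420): 30.0, (371, 487): 35.0, (420, 487): 10.0}

penalties = {
    weight: toy_spatial_penalty(site_escape, distances, weight)
    for weight in (1e-3, 1e-4, 1e-6, 0.0)
}
for weight, penalty in penalties.items():
    print(weight, penalty)
```

Under this assumed form the penalty scales linearly with the weight, and pairs of distant sites that both have escape contribute the most, so a fit can lower the penalty by shrinking escape at one of them.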

If decreasing it helps, maybe see if you think that also helps for other fitting. If so, let me know and we could potentially change the default.

Anyway, can you report back on this issue what you find?

Bernadetadad · 2023-03-23T18:08:56Z

Setting reg_spatial2_weight=0 works like spatial_distances = None, which makes sense.
I need to set reg_spatial2_weight at least 100x lower to see escape at site 487 similar to what I get with spatial_distances = None, and at least 1000x lower to see escape at site 420 (I'm not sure the weight is doing anything at that low a value).

jbloom · 2023-03-23T18:14:00Z

OK, maybe try on sera etc and see if you think it is better to just set default weight to zero or make it smaller, and if this should be overall polyclonal default or just something we tune.

Bernadetadad · 2023-03-28T05:16:49Z

Just a follow-up: including spatial regularization for antibodies with the pipeline default value (reg_spatial2_weight: 0.001) significantly improves the correlation in escape values between biological replicates (that makes some sense, I think). But in several antibodies I've now seen that this leads to the loss of what should be strong escape sites. So for mAbs I'm now setting reg_spatial2_weight: 0.000001, which retains all escape sites observed without spatial regularization and gives a small increase in correlation between biological replicates relative to reg_spatial2_weight: 0.

jbloom · 2023-03-28T13:08:21Z

I will update defaults on this some after we also decide about antibody count defaults.

jbloom · 2023-03-28T13:08:49Z

@fwelsh, do you have a thought on good spatial regularization defaults for your data?

fwelsh · 2023-03-28T19:13:55Z

@jbloom I don't use spatial regularization if I'm only fitting one epitope. For multiple epitopes, I could get reasonable deconvolution if I set reg_spatial2_weight to around 0.001 or 0.01, which is quite high.

I don't really understand the logic behind using spatial regularization for single-epitope models? Penalizing the model for trying to put distant sites in the same epitope makes sense. But when we're just fitting one epitope, this seems like it would add unreasonable constraints and artificially skew the data towards a more targeted immune profile. I could just be misunderstanding the role of spatial regularization here, though, let me know what you think!

Bernadetadad assigned jbloom Mar 23, 2023