It seems that this tool generates a CDRH3 region with a high likelihood of containing many Gs. #16

Open
semal opened this issue Jan 2, 2024 · 1 comment

semal commented Jan 2, 2024

image

What could possibly cause this phenomenon in the CDRH3 region?

Collaborator

kxz18 commented Jan 3, 2024

From my own trials with the model, I think that when it keeps generating G and Y, especially in a long CDR-H3, the likely cause is out-of-distribution (OOD) test samples. OOD can arise for several reasons. The epitope definition might not suit the interaction pattern of antibodies, or the epitope itself might be very challenging and far from the space observed during model training. Since it is hard to tell in advance whether a definition is good, trying various epitope definitions might help.
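
One way to compare epitope definitions is to measure how G/Y-rich the generated CDR-H3 sequences are for each definition. The snippet below is only a minimal sketch and is not part of this repository: the function names, the example sequences, and the 0.5 threshold are all hypothetical and chosen for illustration.

```python
from collections import Counter


def residue_fractions(seq):
    """Return the fraction of each residue in a CDR-H3 sequence."""
    seq = seq.upper()
    if not seq:
        return {}
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def flag_gy_rich(seqs, threshold=0.5):
    """Return (sequence, G+Y fraction) for sequences at or above the threshold.

    The threshold is an arbitrary assumption, not a value from the model.
    """
    flagged = []
    for s in seqs:
        fractions = residue_fractions(s)
        gy = fractions.get("G", 0.0) + fractions.get("Y", 0.0)
        if gy >= threshold:
            flagged.append((s, gy))
    return flagged


# Hypothetical generated CDR-H3 sequences, for illustration only.
generated = ["ARGGGYYGGYDY", "ARDRSTWELLDY"]
for seq, gy in flag_gy_rich(generated):
    print(f"{seq}: G+Y fraction = {gy:.2f}")
```

If most designs for one epitope definition are flagged while another definition gives a more varied composition, that is a hint (not proof) that the first definition is pushing the model out of distribution.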


2 participants