[WIP] Add FASTA Dataset class #289

Open
wants to merge 1 commit into
base: master

@a-r-j (Owner) commented Mar 29, 2023

Reference Issues/PRs

Waiting on #272

What does this implement/fix? Explain your changes

Adds a Dataset class for working with sequence datasets. It provides utilities for batch folding and embedding with ESM(Fold).
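As a rough illustration of the idea, here is a minimal, dependency-free sketch of a sequence dataset built from FASTA text that yields fixed-size batches, which is the shape of input a batch folding or embedding step would consume. The names `parse_fasta`, `SequenceDataset`, and `batches` are hypothetical and are not taken from this PR or from Graphein's API; the actual class may look quite different.

```python
from typing import Dict, Iterator, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {record_id: sequence} mapping."""
    records: Dict[str, str] = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else f"record_{len(records)}"
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may be wrapped, so concatenate them.
            records[current_id] += line
    return records


class SequenceDataset:
    """Hypothetical container for (id, sequence) pairs with simple batching."""

    def __init__(self, records: Dict[str, str]) -> None:
        self.ids: List[str] = list(records)
        self.sequences: List[str] = [records[i] for i in self.ids]

    def __len__(self) -> int:
        return len(self.ids)

    def __getitem__(self, index: int) -> Tuple[str, str]:
        return self.ids[index], self.sequences[index]

    def batches(self, batch_size: int) -> Iterator[List[Tuple[str, str]]]:
        """Yield consecutive batches, e.g. to feed a folding or embedding model."""
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        for start in range(0, len(self), batch_size):
            stop = min(start + batch_size, len(self))
            yield [self[i] for i in range(start, stop)]
```

A folding or embedding model would then be called once per batch; that call is deliberately left out here because the PR does not show how it is wired up.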

  • Set a representative structure. For protein engineering tasks, we can predict a single wild-type (WT) structure and reuse it as the structure for the mutants, modifying only the residue types accordingly.

  • [ ] FoldComp compression of the predicted structures. Ideally this would run in the ESMFold step, but we can also do it post hoc.
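The representative-structure idea can be sketched as follows: keep the WT sequence (and its single predicted structure) fixed, and derive each mutant's residue types by applying point substitutions. This is only a sketch under stated assumptions: the helper `apply_mutations` is hypothetical, and the mutation notation (`A4G` meaning WT residue `A` at 1-indexed position 4 replaced by `G`, one-letter codes) is an assumption, since the PR does not specify how mutants are encoded.

```python
import re
from typing import Iterable

# Assumed notation: <WT residue><1-indexed position><new residue>, e.g. "A4G".
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(wt_sequence: str, mutations: Iterable[str]) -> str:
    """Return the mutant sequence obtained by substituting residues in the WT."""
    residues = list(wt_sequence)
    for mutation in mutations:
        match = _MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"Unrecognised mutation string: {mutation!r}")
        wt_residue, new_residue = match.group(1), match.group(3)
        index = int(match.group(2)) - 1  # convert to 0-indexed
        if not 0 <= index < len(residues):
            raise ValueError(f"Position out of range in {mutation!r}")
        # Guard against mutations that do not match the WT sequence.
        if wt_sequence[index] != wt_residue:
            raise ValueError(
                f"{mutation!r} expects {wt_residue} at position {index + 1}, "
                f"found {wt_sequence[index]}"
            )
        residues[index] = new_residue
    return "".join(residues)
```

Because only substitutions are handled, the mutant keeps the WT length, so residue indices in the WT structure still line up one-to-one with the mutant's residue types. Insertions and deletions would break that correspondence and are out of scope for this sketch.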

What testing did you do to verify the changes in this PR?

Pull Request Checklist

  • Added a note about the modification or contribution to the ./CHANGELOG.md file (if applicable)
  • Added appropriate unit test functions in the ./graphein/tests/* directories (if applicable)
  • Modified documentation in the corresponding Jupyter Notebook under ./notebooks/ (if applicable)
  • Ran python -m py.test tests/ and made sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., python -m py.test tests/protein/test_graphs.py)
  • Checked for style issues by running black . and isort .

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

  • 0 Bugs (rating A)
  • 0 Vulnerabilities (rating A)
  • 0 Security Hotspots (rating A)
  • 2 Code Smells (rating A)
  • No coverage information
  • 0.0% duplication
