[Suggestion] Add strand information to the gbk files #10

Open
apcamargo opened this issue Jul 14, 2021 · 1 comment


@apcamargo

Gene strand can be very useful for detecting prophages, but it is currently missing from the .gb files. Because of that, there's no way to benchmark a tool that leverages strandedness using proteins/ORFs extracted from this dataset's .gb files (using genbank2sequences.py, for example).

@beardymcjohnface
Collaborator

Sorry for the late response here; I've only just had some time to revisit this project. The strand information is available in GenBank files, encoded in each feature's location.

5' to 3' (forward strand) gene:

gene            9762..10592

3' to 5' (reverse strand) gene:

gene            complement(9762..10592)
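
For anyone extracting proteins/ORFs from these files, here is a minimal sketch of how the strand could be read from the two location forms above. The helper names `parse_location` and `parse_feature_line` are hypothetical and not part of this repository or of genbank2sequences.py. The sketch assumes simple `start..end` locations, optionally wrapped in `complement(...)`, and deliberately rejects compound locations such as `join(...)`.

```python
import re

# Hypothetical helpers for illustration only; not part of this repository.
# Matches simple coordinates such as 9762..10592, allowing the partial
# markers < and > that GenBank uses for incomplete features.
_COORDS = re.compile(r"[<>]?(\d+)\.\.[<>]?(\d+)")


def parse_location(location):
    """Return (start, end, strand) for a simple GenBank location string.

    strand is +1 for a plain location and -1 for complement(...).
    Compound locations such as join(...) are not handled by this sketch.
    """
    text = location.strip()
    strand = 1
    if text.startswith("complement(") and text.endswith(")"):
        strand = -1
        text = text[len("complement("):-1]
    match = _COORDS.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    return int(match.group(1)), int(match.group(2)), strand


def parse_feature_line(line):
    """Split a feature line like 'gene   9762..10592' into key and location."""
    key, location = line.split(None, 1)
    return (key,) + parse_location(location)


print(parse_feature_line("gene            9762..10592"))
print(parse_feature_line("gene            complement(9762..10592)"))
```

In practice a full GenBank parser is likely the safer choice. Biopython, for instance, exposes the same information as the `strand` attribute of a feature's location.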

Programs that run directly on the genome FASTA files generally create their own annotations, so they will have strand info available (and if not, it's their own fault).
