Add keep-contig-names parameter #8865
base: master
@gokalpcelik One TODO for you
@@ -73,6 +73,10 @@ public class FastaReferenceMaker extends ReferenceWalker {
    @Argument(fullName= LINE_WIDTH_LONG_NAME, doc="Maximum length of sequence to write per line", optional=true)
    public int basesPerLine = FastaReferenceWriter.DEFAULT_BASES_PER_LINE;

    public static final String KEEP_CONTIG_NAMES = "keep-contig-names";
Add an integration test that uses this new argument, with an expected output
    writer.appendSequence(lastPosition.getContig(), description, basesPerLine, Bytes.toArray(sequence));
}
else {
    writer.appendSequence(String.valueOf(contigCount), description, basesPerLine, Bytes.toArray(sequence));
If you're using just a slice of a contig, perhaps the name should include the full genomic coordinates
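One way to read this suggestion, as a rough sketch only: when the written sequence covers less than the whole contig, append the interval to the name. The class name, the method name, and the `contig:start-end` format are all assumptions made for illustration; the comment above does not specify a format, and nothing here is taken from the GATK code.

```java
public final class SliceNameSketch {
    private SliceNameSketch() {}

    /**
     * Hypothetical helper: returns the plain contig name when the 1-based,
     * inclusive interval [start, end] spans the whole contig, and
     * "contig:start-end" when it is only a slice.
     */
    public static String nameFor(String contig, int start, int end, int contigLength) {
        boolean wholeContig = (start == 1 && end == contigLength);
        return wholeContig ? contig : contig + ":" + start + "-" + end;
    }

    public static void main(String[] args) {
        // Whole contig: the name is left unchanged.
        System.out.println(nameFor("chr20", 1, 1000, 1000));
        // Slice: the coordinates become part of the name.
        System.out.println(nameFor("chr20", 101, 200, 1000));
    }
}
```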
A forum topic asked whether the FastaAlternateReferenceMaker tool could keep the original contig names.
This PR adds a new parameter, --keep-contig-names. When it is set, each contig in the output is named as
>originalcontigname description
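The naming rule can be sketched in isolation as follows. This is a simplified stand-in, not the tool's code: the class and method names are hypothetical, and the description string used in `main` is only an illustrative value. It models just the choice between the original contig name and the running contig counter that appears in the diff.

```java
public final class ContigNameSketch {
    private ContigNameSketch() {}

    /**
     * Picks the record name: the original contig name when keepContigNames
     * is true, otherwise the running contig counter (the default behavior).
     */
    public static String recordName(boolean keepContigNames, String originalContig, int contigCount) {
        return keepContigNames ? originalContig : String.valueOf(contigCount);
    }

    /** Builds a FASTA header line of the form ">name description". */
    public static String headerLine(String name, String description) {
        if (description == null || description.isEmpty()) {
            return ">" + name;
        }
        return ">" + name + " " + description;
    }

    public static void main(String[] args) {
        String description = "chr20:1-1000"; // illustrative description only
        // With the flag: prints ">chr20 chr20:1-1000"
        System.out.println(headerLine(recordName(true, "chr20", 1), description));
        // Without the flag: prints ">1 chr20:1-1000"
        System.out.println(headerLine(recordName(false, "chr20", 1), description));
    }
}
```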
Here is a small local test and its results:
VCF
Original Fasta
New Fasta with new optional behavior
Sequence dictionary created for the new Fasta.
The default value is false, so the original behavior is kept and existing tests should not be affected.