Currently, we get the number of chromosomes from the config file and then loop over range(1, nChromosomes + 1). But what if there are non-numeric chromosomes in there, such as other contigs or the mitochondrial genome?
I think it would be useful to have a text file with chromosome names and lengths. See the genome file format used by bedtools. Each line holds one chromosome name, a tab, and that chromosome's length:
$ cat my.genome
chr1 1000
chr2 500
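To illustrate how such a file could replace the numeric loop, here is a minimal sketch of a reader. The function name `read_genome_file` and the `chrM` entry are hypothetical and not part of the pipeline. The sketch splits on any whitespace, so it accepts both tab-separated files and the space-separated listing shown above.

```python
from typing import Dict, Iterable


def read_genome_file(lines: Iterable[str]) -> Dict[str, int]:
    """Parse genome-file lines (name, whitespace, length) into a dict.

    Hypothetical helper: insertion order is preserved, so iterating over
    the result visits chromosomes in file order.
    """
    lengths: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, length = line.split()[:2]
        lengths[name] = int(length)
    return lengths


# Example data; chrM is an invented non-numeric entry for illustration.
example = ["chr1\t1000", "chr2\t500", "chrM\t16"]
genome = read_genome_file(example)

# Loop over names instead of range(1, nChromosomes + 1).
for chrom, length in genome.items():
    print(chrom, length)
```

With a real file, the same function could be called as `read_genome_file(open("my.genome"))`.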
Should a genome file be generated as part of this pipeline?
This would be easy if the entry point were a single multi-chromosome VCF file. The file could be parsed, and each chromosome's highest variant position could be used as the chromosome length. Note that this only gives a lower bound on the true length. It would also be easy if a genome FASTA file were available.
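A rough sketch of the VCF approach, using only the standard library. The function name `genome_from_vcf` is hypothetical. It assumes uncompressed, tab-separated VCF text with CHROM in the first column and POS in the second, and it does not read any header metadata.

```python
from typing import Dict, Iterable


def genome_from_vcf(lines: Iterable[str]) -> Dict[str, int]:
    """Return the highest variant position seen for each chromosome.

    Hypothetical helper: the result is a lower bound on each chromosome's
    length, not the true length.
    """
    max_pos: Dict[str, int] = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        chrom, pos_text = line.split("\t", 2)[:2]
        pos = int(pos_text)
        if pos > max_pos.get(chrom, 0):
            max_pos[chrom] = pos
    return max_pos


# Invented records for illustration only.
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\t.\t.",
    "chr1\t950\t.\tC\tT\t.\t.\t.",
    "chrM\t12\t.\tG\tA\t.\t.\t.",
]
print(genome_from_vcf(vcf_lines))
```

Because the function takes any iterable of lines, several per-chromosome VCF files could be fed through it one after another and the results merged by taking the maximum per name.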
But it could be tricky if the entry point is multiple VCF files.
Alternatively, we might require the genome file as an additional input, and we could supply a script to generate such a file from VCF/genome FASTA.
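Such a helper script could look roughly like the following for the FASTA case. The names `genome_from_fasta` and `format_genome` are hypothetical. The sketch assumes plain-text FASTA in which each header line has a non-empty name after `>`, and it takes the first whitespace-delimited token of the header as the chromosome name.

```python
from typing import Dict, Iterable


def genome_from_fasta(lines: Iterable[str]) -> Dict[str, int]:
    """Sum sequence line lengths per FASTA record (hypothetical helper)."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first token of the header as the chromosome name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def format_genome(lengths: Dict[str, int]) -> str:
    """Render name<TAB>length lines, one chromosome per line."""
    return "".join(f"{name}\t{length}\n" for name, length in lengths.items())


# Invented sequences for illustration only.
fasta_lines = [">chr1 example", "ACGT", "AC", ">chrM", "GG"]
print(format_genome(genome_from_fasta(fasta_lines)), end="")
```

For large genomes, an existing FASTA index (for example one produced by `samtools faidx`) already lists names and lengths in its first two columns, so a script like this may only be needed when no index is available.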