File Types
Adam Novak edited this page May 27, 2022 · 15 revisions
The vg ecosystem uses many file formats. Some are new and not yet used consistently; others are old but still required for some less common operations.
Some of these are described in more detail at Index Types.
These formats store genome references, which define the coordinate spaces in which genomic analysis is done.
Name | Description | Extension | Purpose | Status | Notes |
---|---|---|---|---|---|
VG Protobuf | |||||
GFA | |||||
HashGraph | |||||
PackedGraph | |||||
Memory-Mapped PackedGraph | |||||
ODGI (vg flavor) | |||||
VG JSON | |||||
Indexed VG Protobuf | |||||
FASTA | |||||
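
To make the text-based graph formats above more concrete, here is a minimal sketch of reading a GFA 1 file in Python. It only assumes the standard tab-separated `S` (segment) and `L` (link) record layout from the GFA 1 specification; the function name `parse_gfa` and the example records are hypothetical, and this is not how vg itself reads graphs.

```python
from collections import defaultdict


def parse_gfa(lines):
    """Collect segments and links from GFA 1 text lines (illustrative only)."""
    segments = {}              # segment name -> sequence
    links = defaultdict(list)  # (from name, orientation) -> [(to name, orientation)]
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S":
            # S <name> <sequence>
            segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            # L <from> <from orientation> <to> <to orientation> <overlap>
            links[(fields[1], fields[2])].append((fields[3], fields[4]))
        # Other record types (H, P, W, ...) are ignored in this sketch.
    return segments, dict(links)


# Hypothetical two-node graph: node 1 (GATT) links forward into node 2 (ACA).
example = [
    "H\tVN:Z:1.0",
    "S\t1\tGATT",
    "S\t2\tACA",
    "L\t1\t+\t2\t+\t0M",
]
segments, links = parse_gfa(example)
```

Holding the whole graph in plain dictionaries like this is only reasonable for small graphs; the binary formats in the table exist for larger ones.
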
These formats store short or long reads from DNA sequencing machines, and can describe how they fit into references.
Name | Description | Extension | Purpose | Status | Notes |
---|---|---|---|---|---|
GAM Protobuf | |||||
GAF | |||||
Indexed GAM | |||||
GAM JSON | |||||
GAMP Protobuf | |||||
GAMP JSON | |||||
BAM | |||||
SAM | |||||
FASTQ | |||||
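
As an illustration of how an alignment format describes a read's fit into a graph, the sketch below parses one GAF record. It assumes the 12 mandatory tab-separated GAF columns (which mirror PAF, with a graph path in place of the target name) and a path written as oriented node IDs such as `>1>2`. The names `GafRecord`, `parse_gaf_line`, and `path_steps`, and the example record, are hypothetical.

```python
import re
from typing import NamedTuple


class GafRecord(NamedTuple):
    query_name: str
    query_length: int
    query_start: int   # 0-based, inclusive
    query_end: int     # 0-based, exclusive
    strand: str
    path: str
    path_length: int
    path_start: int
    path_end: int
    matches: int
    block_length: int
    mapq: int


def parse_gaf_line(line):
    """Parse the 12 mandatory GAF columns; optional tags are ignored here."""
    f = line.rstrip("\n").split("\t")
    return GafRecord(
        f[0], int(f[1]), int(f[2]), int(f[3]), f[4], f[5],
        int(f[6]), int(f[7]), int(f[8]), int(f[9]), int(f[10]), int(f[11]),
    )


def path_steps(path):
    """Split an oriented path like '>1>2<3' into (orientation, node) pairs."""
    return re.findall(r"([<>])([^<>]+)", path)


# Hypothetical record: a 7 bp read aligned end to end across nodes 1 and 2.
record = parse_gaf_line("read1\t7\t0\t7\t+\t>1>2\t7\t0\t7\t7\t7\t60")
steps = path_steps(record.path)
```

A path column may also hold a stable path name rather than oriented node IDs, in which case `path_steps` returns an empty list.
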
These formats can describe individual people or other organisms and how their genomes fit into or differ from references.
Name | Description | Extension | Purpose | Status | Notes |
---|---|---|---|---|---|
GBWT | |||||
GBZ | |||||
VCF | |||||
Pack File | |||||
Pileup Protobuf | |||||
Pileup JSON | |||||
Locus Protobuf | |||||
Locus JSON | |||||
These formats store other kinds of information, or are precomputed indexes to speed up operations on other data.
Name | Description | Extension | Purpose | Status | Notes |
---|---|---|---|---|---|
Distance Index (v1) | |||||
Distance Index (v2) | |||||
GCSA | |||||
Minimizer Index | |||||
BED | |||||
Snarl Protobuf | |||||
Snarl JSON | |||||
SnarlTraversal Protobuf | |||||
SnarlTraversal JSON | |||||
Node ID Translation | |||||
VG Protobuf Index | |||||
GAM Index | |||||
FASTA Index | |||||
BAM Index | |||||
Tabix VCF Index | |||||
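
To show how a precomputed index can speed up access to another file, here is a small sketch based on the FASTA index. It assumes the "FASTA Index" entry refers to the `.fai` file written by `samtools faidx`, which has five tab-separated columns per sequence: name, length, byte offset of the first base, bases per line, and bytes per line. The names `FaiEntry`, `parse_fai`, and `byte_offset`, and the example entry, are hypothetical.

```python
from typing import NamedTuple


class FaiEntry(NamedTuple):
    name: str
    length: int      # total bases in the sequence
    offset: int      # byte offset of the first base in the FASTA file
    line_bases: int  # bases on each full line
    line_width: int  # bytes on each full line, including the newline


def parse_fai(lines):
    """Read .fai lines into a dict keyed by sequence name."""
    entries = {}
    for line in lines:
        name, length, offset, line_bases, line_width = line.rstrip("\n").split("\t")
        entries[name] = FaiEntry(
            name, int(length), int(offset), int(line_bases), int(line_width)
        )
    return entries


def byte_offset(entry, position):
    """Byte offset in the FASTA file of a 0-based base position."""
    if not 0 <= position < entry.length:
        raise IndexError("position outside sequence")
    full_lines = position // entry.line_bases
    return entry.offset + full_lines * entry.line_width + position % entry.line_bases


# Hypothetical entry: 100 bp sequence after a 6-byte ">chr1\n" header, 60 bases per line.
index = parse_fai(["chr1\t100\t6\t60\t61"])
first = byte_offset(index["chr1"], 0)
second_line = byte_offset(index["chr1"], 60)
```

With the offset in hand, a reader can seek straight to the requested base instead of scanning the FASTA file from the start.
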