update_gff

dytk2134 edited this page Sep 12, 2018 · 1 revision

Introduction

update_gff updates the sequence IDs and coordinates of one or more GFF3 files using an alignment file generated by the fasta_diff program.

Usage

Sort each GFF3 file before running update_gff on it.

You can sort a GFF3 file with gff3_sort:

gff3_sort -g example_file/example.gff3 -og example_file/example_sorted.gff

Example 1:

update_gff -a match.tsv example_file/example1.gff3 example_file/example2.gff3

  • match.tsv: the fasta_diff output file (see the fasta_diff wiki)
  • example1.gff3, example2.gff3: the GFF3 files to update

Example 2 (the alignment is read from STDIN, the default when -a is omitted):

fasta_diff example_file/old.fa example_file/new.fa | update_gff example_file/example1.gff3 example_file/example2.gff3

Example progress output:

INFO       Alignments read: 165618
INFO     Processing GFF3 file: a.gff...
INFO       Total feature count: 227781
INFO       Updated feature count: 227252
INFO       Removed feature count: 529
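
In this example log the updated and removed counts add up to the total (227252 + 529 = 227781), which suggests that each feature ends up in exactly one of the two output files. The following is a minimal Python sketch for checking that on a captured log. It is not part of update_gff; the `parse_counts` helper and the `LOG` variable are hypothetical names, and the log format is assumed to match the sample above.

```python
import re

# Sample log copied from the example output above.
LOG = """\
INFO       Alignments read: 165618
INFO     Processing GFF3 file: a.gff...
INFO       Total feature count: 227781
INFO       Updated feature count: 227252
INFO       Removed feature count: 529
"""


def parse_counts(log_text):
    """Collect 'INFO <label>: <integer>' lines into a dict keyed by label."""
    counts = {}
    for line in log_text.splitlines():
        match = re.match(r"INFO\s+(.+?):\s+(\d+)\s*$", line)
        if match:  # lines without a trailing integer are skipped
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_counts(LOG)
balanced = (
    counts["Updated feature count"] + counts["Removed feature count"]
    == counts["Total feature count"]
)
print(balanced)
```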

Running the program with -h prints the following help:

update_gff -h

usage: update_gff [-h] [-a ALIGNMENT_FILE] [-u UPDATED_POSTFIX]
                  [-r REMOVED_POSTFIX] [-v]
                  GFF_FILE [GFF_FILE ...]

Update the sequence id and coordinates of a GFF3 file using an alignment file generated by the fasta_diff program.
Updated features are written to a new file with '_updated'(default) appended to the original GFF3 file name.
Feature that can not be updated, due to the id being removed completely or the feature contains regions that
are removed or replaced with Ns, are written to a new file with '_removed'(default) appended to the original GFF3 file name.

Example:
    fasta_diff example_file/old.fa example_file/new.fa | update_gff example_file/example1.gff3 example_file/example2.gff3

positional arguments:
  GFF_FILE              List one or more GFF3 files to be updated

optional arguments:
  -h, --help            show this help message and exit
  -a ALIGNMENT_FILE, --alignment_file ALIGNMENT_FILE
                        The alignment file generated by fasta_diff, a TSV file
                        with 6 columns: old_id, old_start, old_end, new_id,
                        new_start, new_end (default: STDIN)
  -u UPDATED_POSTFIX, --updated_postfix UPDATED_POSTFIX
                        The filename postfix for updated features (default:
                        "_updated")
  -r REMOVED_POSTFIX, --removed_postfix REMOVED_POSTFIX
                        The filename postfix for removed features (default:
                        "_removed")
  -v, --version         show program's version number and exit
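
The help text describes the alignment file as a 6-column TSV and says that features which cannot be placed are written to the removed file. The Python sketch below illustrates that idea and is not update_gff's actual implementation. It assumes that the TSV has no header row, that alignment intervals use the same inclusive coordinate convention as the feature, and that a feature is updated only when a single alignment block contains it entirely. The names `Alignment`, `read_alignments`, `remap`, and the sample rows are hypothetical.

```python
import csv
import io
from typing import NamedTuple


class Alignment(NamedTuple):
    old_id: str
    old_start: int
    old_end: int
    new_id: str
    new_start: int
    new_end: int


def read_alignments(handle):
    """Parse the 6-column TSV: old_id, old_start, old_end, new_id, new_start, new_end."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        old_id, old_start, old_end, new_id, new_start, new_end = fields
        rows.append(
            Alignment(old_id, int(old_start), int(old_end),
                      new_id, int(new_start), int(new_end))
        )
    return rows


def remap(seqid, start, end, alignments):
    """Return (new_id, new_start, new_end), or None if no single block contains the feature."""
    for aln in alignments:
        if aln.old_id == seqid and aln.old_start <= start and end <= aln.old_end:
            offset = aln.new_start - aln.old_start  # shift between old and new coordinates
            return aln.new_id, start + offset, end + offset
    return None  # assumed to correspond to a "removed" feature


# Made-up alignment rows, for illustration only.
example_tsv = (
    "chr1_old\t1\t1000\tchr1_new\t101\t1100\n"
    "chr1_old\t1201\t2000\tchr1_new\t1101\t1900\n"
)
alignments = read_alignments(io.StringIO(example_tsv))

print(remap("chr1_old", 50, 150, alignments))    # feature inside the first block
print(remap("chr1_old", 900, 1300, alignments))  # feature spans the gap between blocks
```

Under these assumptions the first feature is shifted by the block's offset and moved to the new sequence ID, while the second returns None because it overlaps a region with no alignment.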