Skip to content
Aaron Mussig edited this page Apr 11, 2024 · 7 revisions

This outlines the steps required to update the website for each release.

Step 1: Transferring the database

On Watson, run the command below. Then copy it across to the database server.

pg_dump -U gtdb --data-only --format=tar --no-owner --no-acl --no-privileges --dbname=gtdb_pierre_r220 --file "/tmp/gtdb_r220.tar"

On the database server, restore the release into a new database:

docker exec -it gtdb-postgres psql -U postgres gtdb_r220 -f /backup/gtdb_r220.sql

Bring forward all materialized views from the previous release!

Step 2: Creating the website database

To do this, dump the previous release website database, then restore it as follows:

docker exec -it gtdb-postgres pg_dump -U postgres --dbname=gtdb_r214_web --no-acl --no-privileges --no-owner --file="/backup/gtdb_r214_web.sql"

docker exec -it gtdb-postgres psql -U postgres gtdb_r220_web -f /backup/gtdb_r214_web.sql

Step 2.1 Updating the Taxon history table

  1. Connect to the gtdb_r220_web.taxon_hist table and run the following command:
DELETE FROM taxon_hist WHERE release_ver = 'NCBI';

Run the following command in the new GTDB release database (not the web database):

select 'NCBI'                            AS release_ver,
       g.formatted_source_id             AS genome_id,
       split_part(ncbi_taxonomy, ';', 1) AS rank_domain,
       split_part(ncbi_taxonomy, ';', 2) AS rank_phylum,
       split_part(ncbi_taxonomy, ';', 3) AS rank_class,
       split_part(ncbi_taxonomy, ';', 4) AS rank_order,
       split_part(ncbi_taxonomy, ';', 5) AS rank_family,
       split_part(ncbi_taxonomy, ';', 6) AS rank_genus,
       split_part(ncbi_taxonomy, ';', 7) AS rank_species
from genomes g
         inner join metadata_taxonomy mt ON mt.id = g.id
where mt.gtdb_domain != 'd__'
  AND mt.gtdb_phylum != 'p__'
  AND mt.gtdb_class != 'c__'
  AND mt.gtdb_order != 'o__'
  AND mt.gtdb_family != 'f__'
  AND mt.gtdb_genus != 'g__'
  AND mt.gtdb_species != 's__'

UNION

select 'R220'                AS release_ver,
       g.formatted_source_id AS genome_id,
       mt.gtdb_domain        AS rank_domain,
       mt.gtdb_phylum        AS rank_phylum,
       mt.gtdb_class         AS rank_class,
       mt.gtdb_order         AS rank_order,
       mt.gtdb_family        AS rank_family,
       mt.gtdb_genus         AS rank_genus,
       mt.gtdb_species       AS rank_species
from genomes g
         inner join metadata_taxonomy mt ON mt.id = g.id
where mt.gtdb_domain != 'd__'
  AND mt.gtdb_phylum != 'p__'
  AND mt.gtdb_class != 'c__'
  AND mt.gtdb_order != 'o__'
  AND mt.gtdb_family != 'f__'
  AND mt.gtdb_genus != 'g__'
  AND mt.gtdb_species != 's__'

Export the results and import them to the gtdb_r220_web.taxon_hist table.

Updating the GTDB website

TODO: Put the commit with the release changes in.