-
Notifications
You must be signed in to change notification settings - Fork 2
Database Backups
NatCom now uses the Crunchy Data Operator
in OpenShift
The Crunchy Data Operator
uses pgBackRest
as its backup and restore tool to create and manage backups, which are binary in nature (non-human readable)
pgBackRest
will create and push both full and incremental binary backups using a repository
layout, where multiple equal backups are published to many repositories for redundancy, and these repositories can be backed by different storage solutions. For NatCom we have:
- An
S3
repo using the NRS object storage. See Object Store Connection Details. The storage path is based on the repository configuration - A
netapp-file-backup
type PVC repo which has backups managed by the data center. See Test Backups PVC for test namespace backups location
Test Backups S3 Path: /backups/test/cluster/version
Production Backups S3 Path: /backups/prod/cluster/version
To perform a restore, the process is handled by the Operator+Custom Resource Definition control loop in OpenShift, rather than manual/bash scripting with psql
and logical backups.
Updating the restore stanza in the crunchydata values file and then synchronizing the manifest changes will trigger a point-in-time restore for the cluster and all of its databases.
Alternatively, you can patch the CRD manually using oc
eg:
oc get postgrescluster -oyaml
to view the cluster configuration, and then manually modify the restore
stanza using oc edit postgrescluster/<clustername>
when the CRD change synchronizes, crunchy will begin the restore process
Reference: https://access.crunchydata.com/documentation/postgres-operator/4.6.1/architecture/disaster-recovery/
Note: This section is being kept until the bitnami pg instances are fully retired, refer to the crunchy section
Backups are copied over to objectstore by extending the Ministry's existing backup-container project to a forked project (https://github.com/bcgov/backup-container-compliance-enforcement).
Rolling Backups have been implemented as follows:
- A week of daily backups
- A month of weekly backups (executed on every Sunday)
- A backup from the previous month
These backups are automated as a cronjob in OpenShift, and require no manual steps to trigger.
The backup to objectstore requires the following environment variables to be set:
- OBJECTSTORE_URL: e.g. nrs.objectstore.gov.bc.ca:443
- OBJECTSTORE_ACCESS_KEY: Obtained after requesting an objectstore bucket
- OBJECTSTORE_SECRET_KEY: Obtained after requesting an objectstore bucket
- OBJECTSTORE_BUCKET: Obtained after requesting an objectstore bucket
- OBJECTSTORE_BACKUP_DIRECTORY: directory within your objectstore bucket (e.g. "backups")
These variables have been incorporated into the OpenShift deployment pipeline. The openshift.deploy.yml file references the OpenShift secrets in each of the dev/test/prod namespaces, so each environment has their own set up backups.
Backups are immediately available by accessing the objectstore bucket for this project (see https://github.com/bcgov/nr-compliance-enforcement/wiki/NRS-Objectstore).
The objectstore bucket is organized by daily, weekly, and monthly backups (see below).
- Pause the backend container
oc get dc
oc scale dc/<deployment-config-name> --replicas=0
- Download and unzip the backup sql file from obectstore (see above)
- Connect to the cluster by copying this and running that command in terminal or command prompt.
- Find the database pod
oc get pods | grep database
) and port forward to it
oc port-forward {podname} 5433:5432
- Open a second terminal/command prompt, navigate to the directory where the backup file was downloaded to in step 1 above
- Connect to postgres
psql -U nr-compliance-enforcement -h localhost -p 5434
- Create a temporary database that we'll connect to so that we can drop the nr-compliance-enforcement database
CREATE DATABASE temp_db;
- Quit the session and connect to the temp_db
\q
psql -U nr-compliance-enforcement -h localhost -p 5434 -d temp_db
- Drop all existing connections
SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'nr-compliance-enforcement';
- Drop the database
DROP DATABASE IF EXISTS "nr-compliance-enforcement";
- Recreate the database
CREATE DATABASE "nr-compliance-enforcement";
- Quit the session
\q
- Drop the temp_db
psql -U nr-compliance-enforcement -h localhost -p 5434 -d nr-compliance-enforcement
DROP DATABASE IF EXISTS 'temp_db';
\q
- Restore the database
psql -U nr-compliance-enforcement -h localhost -p 5434 -f {file name}
- Recreate the pod
oc scale dc/<deployment-config-name> --replicas=1