Skip to content

Database Backups

jon-funk edited this page Jan 22, 2025 · 21 revisions

Introduction

NatCom now uses the Crunchy Data Operator in OpenShift

Backups

The Crunchy Data Operator uses pgBackRest as its backup and restore tool to create and manage backups, which are binary in nature (non-human readable)

pgBackRest will create and push both full and incremental binary backups using a repository layout, where multiple equal backups are published to many repositories for redundancy, and these repositories can be backed by different storage solutions. For NatCom we have:

Test Backups S3 Path: /backups/test/cluster/version

Production Backups S3 Path: /backups/prod/cluster/version

Restore Process

To perform a restore, the process is handled by the Operator+Custom Resource Definition control loop in OpenShift, rather than manual/bash scripting with psql and logical backups.

Updating the restore stanza in the crunchydata values file and then synchronizing the manifest changes will trigger a point-in-time restore for the cluster and all of its databases.

Alternatively, you can patch the CRD manually using oc eg:

oc get postgrescluster -oyaml to view the cluster configuration, and then manually modify the restore stanza using oc edit postgrescluster/<clustername>

when the CRD change synchronizes, crunchy will begin the restore process

Reference: https://access.crunchydata.com/documentation/postgres-operator/4.6.1/architecture/disaster-recovery/

Legacy Method

Note: This section is being kept until the bitnami pg instances are fully retired, refer to the crunchy section

Backups are copied over to objectstore by extending the Ministry's existing backup-container project to a forked project (https://github.com/bcgov/backup-container-compliance-enforcement).

Rolling Backups have been implemented as follows:

  • A week of daily backups
  • A month of weekly backups (executed on every Sunday)
  • A backup from the previous month

These backups are automated as a cronjob in OpenShift, and require no manual steps to trigger.

Configuration

The backup to objectstore requires the following environment variables to be set:

  • OBJECTSTORE_URL: e.g. nrs.objectstore.gov.bc.ca:443
  • OBJECTSTORE_ACCESS_KEY: Obtained after requesting an objectstore bucket
  • OBJECTSTORE_SECRET_KEY: Obtained after requesting an objectstore bucket
  • OBJECTSTORE_BUCKET: Obtained after requesting an objectstore bucket
  • OBJECTSTORE_BACKUP_DIRECTORY: directory within your objectstore bucket (e.g. "backups")

These variables have been incorporated into the OpenShift deployment pipeline. The openshift.deploy.yml file references the OpenShift secrets in each of the dev/test/prod namespaces, so each environment has their own set up backups.

Backups are immediately available by accessing the objectstore bucket for this project (see https://github.com/bcgov/nr-compliance-enforcement/wiki/NRS-Objectstore).

The objectstore bucket is organized by daily, weekly, and monthly backups (see below).

image

Restoration

  1. Pause the backend container

oc get dc

oc scale dc/<deployment-config-name> --replicas=0

  1. Download and unzip the backup sql file from obectstore (see above)
  2. Connect to the cluster by copying this and running that command in terminal or command prompt.
  3. Find the database pod

oc get pods | grep database) and port forward to it

oc port-forward {podname} 5433:5432

  1. Open a second terminal/command prompt, navigate to the directory where the backup file was downloaded to in step 1 above
  2. Connect to postgres

psql -U nr-compliance-enforcement -h localhost -p 5434

  1. Create a temporary database that we'll connect to so that we can drop the nr-compliance-enforcement database

CREATE DATABASE temp_db;

  1. Quit the session and connect to the temp_db

\q psql -U nr-compliance-enforcement -h localhost -p 5434 -d temp_db

  1. Drop all existing connections

SELECT pg_terminate_backend(pg_stat_activity.pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'nr-compliance-enforcement';

  1. Drop the database

DROP DATABASE IF EXISTS "nr-compliance-enforcement";

  1. Recreate the database

CREATE DATABASE "nr-compliance-enforcement";

  1. Quit the session

\q

  1. Drop the temp_db

psql -U nr-compliance-enforcement -h localhost -p 5434 -d nr-compliance-enforcement

DROP DATABASE IF EXISTS 'temp_db';

\q

  1. Restore the database

psql -U nr-compliance-enforcement -h localhost -p 5434 -f {file name}

  1. Recreate the pod

oc scale dc/<deployment-config-name> --replicas=1