Using pgBackRest for Backup and Restore

pgBackRest is a reliable, easy-to-use backup and restore solution that can seamlessly scale up to the largest databases and workloads by utilizing algorithms that are optimized for database-specific requirements. pgEdge installers can configure pgBackRest when you initialize a cluster, or you can install pgBackRest as a supported component in an existing cluster.

Configuring pgBackRest

The cluster module supports two configuration setups for using pgBackRest with pgEdge:

  1. Storing backups on a Posix-compliant file system
  2. Storing backups in an S3 Bucket

If you are storing your pgBackRest repository on a file system, we recommend using a network file share if possible; this enables pgEdge to use the repository from the source node when initializing the target node during an add-node operation.

Configuration Options for pgBackRest

When using the cluster module, you can optionally provide the following configuration settings under each node group to configure pgBackRest on each node:

  • backrest.stanza: The name of the stanza to be used.
  • backrest.repo1_path: The path to the repository for backups.
  • backrest.repo1_retention_full: The number of full backups to retain.
  • backrest.log_level_console: The log level for console output.
  • backrest.repo1_cipher_type: The type of encryption to use for the repository. Options are aes-256-cbc or none.
  • backrest.archive_mode: The mode for archiving WAL (Write-Ahead Logging) files (on / off). If set to on, archiving will be configured.
  • backrest.repo1_type: The type of repository storage. "s3" and "posix" are currently supported.
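As an illustration, the pgBackRest settings for a node group in the cluster JSON might look like the following. This is a hypothetical sketch: the key names mirror the option list above, the exact JSON shape depends on your CLI version, and all values are placeholders.

```json
"backrest": {
  "stanza": "default_stanza_n1",
  "repo1_path": "/var/lib/pgbackrest",
  "repo1_retention_full": "7",
  "log_level_console": "info",
  "repo1_cipher_type": "aes-256-cbc",
  "archive_mode": "on",
  "repo1_type": "posix"
}
```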

Required Node Configuration

If you're using an AWS S3 bucket for your backups, add the following lines to the ~/.bashrc file on each node of the cluster:

export PGBACKREST_REPO1_S3_KEY=AIYFHTUJVLPPE
export PGBACKREST_REPO1_S3_BUCKET=bucket-876t3xpf
export PGBACKREST_REPO1_S3_KEY_SECRET=G9tlpTwj2+yTKLO3qMjeKG9a7GkR4mo
export PGBACKREST_REPO1_S3_ENDPOINT=s3.amazonaws.com
export PGBACKREST_REPO1_S3_REGION=eu-west-2

Note: PGBACKREST_REPO1_S3_KEY and PGBACKREST_REPO1_S3_KEY_SECRET are not required if you use the built-in credential chain with instance roles on AWS virtual machines.

If you are using encryption in your repo1_cipher_type configuration, add this line to the ~/.bashrc file on each node to set the cipher pass for encryption / decryption:

export PGBACKREST_REPO1_CIPHER_PASS=YourCipherPassHere
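Because pgBackRest reads these values from the environment, it is worth confirming they are visible to the OS user that runs pgBackRest. The check below is a sketch; the exported values are placeholders for this demonstration, and you can extend the variable list with the key, secret, and cipher pass as appropriate.

```shell
# Sketch: confirm the pgBackRest S3 variables are visible to the OS user that
# runs pgbackrest. The exported values are placeholders for this demonstration.
export PGBACKREST_REPO1_S3_BUCKET=bucket-876t3xpf
export PGBACKREST_REPO1_S3_ENDPOINT=s3.amazonaws.com
export PGBACKREST_REPO1_S3_REGION=eu-west-2

missing=0
for var in PGBACKREST_REPO1_S3_BUCKET PGBACKREST_REPO1_S3_ENDPOINT PGBACKREST_REPO1_S3_REGION; do
  # printenv prints nothing for unset variables, so an empty result means missing
  [ -n "$(printenv "$var")" ] || { echo "missing: $var" >&2; missing=1; }
done
echo "missing=$missing"
```

Remember that variables set in ~/.bashrc are only picked up by interactive shells; processes started another way (such as cron jobs) may need the variables set explicitly.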

json-create Function

The json-create function in the Cluster module allows for easy configuration of these options when setting up a cluster.

 
./pgedge cluster json-create default 2 defaultdb admin password
 
PostgreSQL version ['15', '16', '17'] (default: '16'):
Spock version ['3.3.6', '3.3.5', '4.0.10', '4.0.9', '4.0.8'] (default: '4.0.10'):
Enable pgBackRest? (Y/N) (default: 'N'): Y
   pgBackRest storage path (default: '/var/lib/pgbackrest'): /backups/
   pgBackRest archive mode (on/off) (default: 'on'):
   pgBackRest repository type (posix/s3) (default: 'posix'):
 
Configuring Node 1
  Public IP address for Node 1 (default: '127.0.0.1'): 172.22.0.4
  Private IP address for Node 1 (default: '172.22.0.4'):
  PostgreSQL port for Node 1 (default: '5432'):
 
Configuring Node 2
  Public IP address for Node 2 (default: '127.0.0.1'): 172.22.0.6
  Private IP address for Node 2 (default: '172.22.0.6'):
  PostgreSQL port for Node 2 (default: '5432'):
 
################################################################################
# Cluster Name       : default
# PostgreSQL Version : 16
# Spock Version      : 4.0.10
# Number of Nodes    : 2
# Database Name      : defaultdb
# User               : admin
# pgBackRest Enabled : Yes
#    Storage Path    : /backups/
#    Archive Mode    : on
#    Repository Type : posix
# Node 1
#    Public IP       : 172.22.0.4
#    Private IP      : 172.22.0.4
#    Port            : 5432
# Node 2
#    Public IP       : 172.22.0.6
#    Private IP      : 172.22.0.6
#    Port            : 5432
################################################################################
Do you want to save this configuration? (Y/N): Y
 

You can further customize the cluster JSON file to adjust default settings before running cluster init.

init Function

When pgBackRest is configured for a given node, the pgBackRest stanza is initialized with the provided configuration during cluster initialization, and an initial backup is taken in the repository. If archiving is enabled, the archive command will be set up on each node to use pgBackRest, enabling you to leverage point-in-time recovery capabilities when performing restores.
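With archiving enabled, the configured archive_command follows the standard pgBackRest form, along these lines (the stanza name here is taken from the examples in this guide, and your command may also include a --config option pointing at your pgbackrest.conf):

```ini
archive_command = 'pgbackrest --stanza=default_stanza_n1 archive-push %p'
```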

Set Up a pgBackRest Config File

⚠️
TODO: This is a temporary step. The CLI will be updated to initialize this file for you when calling cluster init or cluster add-node

To interact with pgBackRest more easily, create a pgBackRest configuration file named pgbackrest.conf on each node.

You can place this file in the pgEdge install directory, which is typically /home/pgedge/default/<node_name>/pgedge when using the cluster module. The contents of this file may vary depending on your installation.

[global]
repo1-path=/path/to/repo/n1/
repo1-retention-full=7
repo1-retention-full-type=count
repo1-type=posix
repo1-cipher-type=aes-256-cbc
repo1-cipher-pass=Really_s3cure_password
repo1-host-user=pgedge
log-level-console=info
process-max=3
compress-level=3
 
[default_stanza_n1]
pg1-path=/home/pgedge/<cluster_name>/<node_name>/pgedge/data/pg<version>
pg1-user=pgedge
pg1-port=5432
db-socket-path=/tmp
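Before pointing pgbackrest at this file, a quick sanity check is to confirm the stanza section is present. The snippet below is a sketch: it creates a trimmed sample file so it is self-contained; on a real node, point the grep at your actual config file instead.

```shell
# Sketch: verify pgbackrest.conf contains the expected stanza section before
# invoking pgbackrest. A trimmed sample file is created so the check is
# self-contained; on a real node, use your actual config file path.
conf=$(mktemp)
cat > "$conf" <<'EOF'
[global]
repo1-path=/path/to/repo/n1/
repo1-type=posix

[default_stanza_n1]
pg1-port=5432
EOF

if grep -q '^\[default_stanza_n1\]' "$conf"; then
  result=found
else
  result=missing
fi
echo "stanza section: $result"
rm -f "$conf"
```

On a real node you can go a step further and run pgbackrest info --config=<path_to_pgbackrest_config> --stanza=default_stanza_n1 to confirm that pgBackRest itself can read the file.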

Scheduling Backups

The pgEdge CLI does not directly provide any scheduling features for pgBackRest backups.

To schedule backups using pgBackRest, you can utilize cron jobs. Cron is a time-based job scheduler in Unix-like operating systems.

Here’s how you can set up a cron job for pgBackRest backups:

  1. Open the crontab file for editing:

    crontab -e
  2. Add a new line to schedule the backup. For example, to schedule a full backup every day at 2 AM, add the following line:

    0 2 * * * LD_LIBRARY_PATH=/home/pgedge/<cluster_name>/<node_name>/pgedge/pg<version>/lib pgbackrest --config <path_to_pgbackrest_config> --stanza=default_stanza_n1 --type=full backup
  3. Save and close the crontab file.

    If cron is not available in your environment, you can use any other scheduling or orchestration mechanism that can invoke pgBackRest in a similar way.
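The cron entry above packs several options into one line. A small wrapper script can keep the pieces readable and reusable for both full and incremental backups; the helper below is hypothetical (not part of the pgEdge CLI), and all paths and names are placeholders. In cron, remember to set LD_LIBRARY_PATH as shown in the entry above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the pgEdge CLI) that assembles the backup
# command used in the cron entry above; all paths and names are placeholders.
build_backup_cmd() {
  stanza="$1"
  conf="$2"
  type="${3:-full}"   # pgBackRest backup types: full, diff, or incr
  printf 'pgbackrest --config %s --stanza=%s --type=%s backup\n' "$conf" "$stanza" "$type"
}

# Example: the command a nightly cron job would run.
build_backup_cmd default_stanza_n1 /home/pgedge/default/n1/pgedge/pgbackrest.conf full
```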

Monitoring Logs

pgBackRest logs its activities, which can be monitored to ensure backups are running as expected. Logs are typically stored in the /var/log/pgbackrest directory.

Restoration Strategies

pgBackRest provides a variety of mechanisms for restoring a PostgreSQL database to a specific backup, or to a specific point in time. You should consult the pgBackRest user guide to become familiar with backup and restoration strategies.

You can follow these steps to restore a specific pgEdge node using pgBackRest:

  1. Stop pgEdge across all nodes

    From the pgEdge install location where the cluster is configured, run this command:

    ./pgedge cluster command default all stop
  2. Connect to the node you wish to restore, and navigate to the directory where pgEdge is installed.

    This is typically /home/pgedge/<cluster_name>/<node_name>/pgedge when using the cluster module
  3. Ensure pgbackrest can be invoked by providing the correct shared libraries

    ⚠️
    TODO: This is a temporary step. We hope to resolve build issues with pgBackRest to ensure the library path does not need to be set
    export LD_LIBRARY_PATH=/home/pgedge/<cluster_name>/<node_name>/pgedge/pg16/lib/
  4. Use pgBackRest to identify a backup you wish to restore using the info command. Alternatively, you may identify a point in time that you wish to restore the database to.

    pgbackrest info --config=$(pwd)/pgbackrest.conf
  5. Run the restore using pgBackRest

    To restore a specific backup:

    pgbackrest restore --config=$(pwd)/pgbackrest.conf --set=<backup-label> --stanza=default_stanza_n1 --delta --archive-mode=off

    To restore to a specific point in time:

    pgbackrest restore --config=$(pwd)/pgbackrest.conf --stanza=default_stanza_n1 --type=time "--target=2025-01-20 14:09:34.918261+00" --target-action=promote --delta --archive-mode=off

    pgBackRest will handle restoring the required files and establishing the restore_command that will be run when PostgreSQL starts.

  6. Start pgEdge on the node you are restoring

    ./pgedge start
  7. Monitor the PostgreSQL log to ensure that the database recovers to the desired state

    If recovery is not achieved, you may need to adjust the restore options and run the restore again; this is especially likely when using PITR and the recovery target could not be reached.

    You may see errors as spock tries to connect to other nodes based on the configured subscriptions - this is expected at this point.

  8. Re-configure your pgBackRest stanza, if necessary.

    You may be able to re-use the existing pgBackRest stanza and repository path from the same node, but in certain situations, it is necessary to establish a new repository. This is useful when you want to maintain the existing repository in the event that you need to perform another restore during your recovery process.

    To establish a new repository path, update the pgbackrest.conf file to set a new repo1-path, and adjust any other configuration as needed.

  9. Re-enable archiving by connecting to your node and unsetting the restore_command that was set up by pgBackRest:

    ALTER SYSTEM SET restore_command TO '';
    ALTER SYSTEM SET archive_mode = 'on';
  10. Restart pgEdge to apply the changes:

    ./pgedge restart
  11. Confirm the archive_command is set up with any changes made in Step 8

    Inspect the archive_command via psql:

    show archive_command;

    If necessary, update this command to set any parameters which you have changed, or point to your pgBackRest configuration file:

    ALTER SYSTEM SET archive_command TO '<desired_command>';

    This ensures that the database is fully backed up and archiving is re-enabled for WAL files.

At this point, your pgEdge node should be restored to your desired state, and backups should once again be running properly via pgBackRest.

In a disaster recovery scenario, you can recover a pgEdge Cluster by restoring one node to a specific backup or point in time via this method, and rebuild the Cluster using the remove-node and add-node functions in the cluster module.

If you are not using the cluster module, you may need to rebuild spock configuration using the pgEdge CLI or spock SQL functions in order to re-establish replication between the nodes.

Every database has a different disaster recovery scenario, dependent on many factors. You may need to adjust your approach, customizing the pgBackRest configuration accordingly, as you test and verify your disaster recovery plan.

Restoring a pgEdge Cluster is a last resort. If nodes are still available, but have become out of sync due to inconsistent updates, consider using the Active Consistency Engine (ACE) to resolve discrepancies between nodes.