Provisioning a PostgreSQL VDB PiT with external logs
With the latest PostgreSQL plugin 4.0.0 and onwards, you can provision a VDB to a point in time by providing the WAL logs on the target host.
Point in Time (PiT) recovery in PostgreSQL - Overview
PostgreSQL allows you to recover a cluster to a point in time (PiT) from a base backup by replaying the archived WAL logs. The main requirements to achieve PiT Recovery are:
A base backup, which is essentially the file system backup. In the Delphix environment, the snapshots are nothing but file system backups.
The archived WAL logs.
The recovery target timestamp.
Understanding the PiT Recovery target timestamp
A recovery point from PostgreSQL’s perspective in the transaction commit timestamp. The recovery is performed till the commit timestamp less than or equal to the provided target time.
From PostgreSQL version 13 and onwards, the point-in-time recovery fails if there are no transactions (no activity) post the requested target time.
The
pg_waldump
(pg_xlogdump
for PostgreSQL 9.x versions) can be used to analyze the WAL Logs and identify commit timestamps.The default timezone of the PiT timestamp (unless mentioned in the timestamp string) will the considered as the
timezone
configured inpostgresql.conf
file.
For more information about
WAL Log archiving and Point in Time (PiT) Recovery, see PostgreSQL Documentation
Setup
Scope of VDB PiT with External Logs feature
Provisioning of VDB PiT with External Logs is supported from the snapshots of:
A dSource created with external backups using streaming replication
A dSource created with Delphix Initiated backup using streaming replication
Provisioning of VDB PiT with External Logs is not supported from the snapshots of:
A dSource created with Delphix Initiated Backup using Single Database Ingestion
A dSource created with Staging Push
Any VDB
Any dSource and VDB created using previous plugin versions (< 4.0.0). After upgrading from the plugin versions 3.2.0 and below, perform a SnapSync operation
Prerequisites
You will have to link a supported dSource (mentioned in the above section) from a staging instance, as described in Linking a PostgreSQL dSource
You should set up the POSTGRES target environment with the necessary requirements as described in PostgreSQL support and requirements
Make sure you have the required Instance Owner permissions on the target instance and environment.
The target host must have the required WAL Logs (at least from the REDO WAL Log captured in the snapshot metadata) to perform point-in-time recovery, possible ways to achieve this are:
Archiving the WAL Logs on the source host and mounting (NFS) the archive directory on the target host. The below steps can be followed to enable WAL archiving:
Set the
wal_level
configuration parameter toreplica
or higher.Set the
archive_mode
toon
.Use the
archive_command
parameter to copy the WAL Logs to an archive directory, ex:cp %p <ARCHIVE_DIRECTORY>/%f
The
pg_receivewal
utility can be used on the target host to start a background process that collects the WAL logs from the source database using streaming replication. Note, this will require increasing themax_wal_senders
count in the source configuration (postgresql.conf
) and will only collect the WAL Logs only from the time we start the process.Copying the required WAL logs from the archive directory on the source host to the target host.
Using Delphix-provided solution Unstructured Files to mount the archive directory on the source host as a dSource and later provision the VFiles on the target host. With this approach, the archived WAL logs can be kept in sync using the dSource SnapSync and VFiles Refresh operations.
Info:
For more information, see Unstructured files and app data
The method for Database Permissions for provisioned PostgreSQL VDBs is decided before provisioning.
Procedure
Login to the Delphix Management application.
Click Manage, and select Datasets.
Select a VDB PiT supported dSource and a snapshot from which you want to provision. Click the provision VDB icon to open the provision VDB wizard.
Select a target environment from the left pane, and an Installation to use from the dropdown list of available PostgreSQL instances on that environment.
Set the Environment User to be the Instance Owner.
Note: The picking of the instance owner is only possible if you have multiple environment users set on that host.You will see the Target Configuration section where you need to specify NFS Mount Location.
If you want to use Privilege Elevation, provide the Privileged OS Account username. Make sure you meet all the requirements described in the Prerequisites for Privilege Elevation using DLPX_DB_EXEC Script section.
Note:
To update the Privileged OS Account after VDB creation, first disable the instance, update the Privileged OS Account, and then enable it.
If the Environment OS user will be used to create the VDB, make sure you have the following privileges:
access to the Postgres commands
read+write+execute permissions on the
unix_socket_directories
configured inpostgresql.conf
[Default directories:/var/run/postgresql
&/tmp
]
Provide a unique Virtual Postgres Port Number for a given instance.
Click on Add (+) to fill the VDB PiT with External Logs Details.
Note: The Add Button provided for VDB PiT with External Logs Details is clickable multiple times but should be used ONLY once since the solution is built only to support a single entry. Providing multiple entries will error out the linking process.
All fields within the VDB PiT with External Logs Details are mandatory.
Provide the absolute path to the archived WAL logs available on the target host in the Path to WAL Logs for PiT field.
Note:If privilege elevation is being used, please make sure the high-privileged user:
Is able to access the directory containing the WAL logs and has a minimum of read+execute permissions on the same.
Has read permission on individual WAL Logs.
Warning:
If the VDB PiT is utilizing the same WAL Logs path provided to the dSource created with External Backup using WAL logs (via the old plugin versions 3.2.0 and below), provisioning of the VDB PiT will fail due to the error cp: cannot stat, which can be seen in the PostgreSQL logs. This is because the restore performed by the dSource renames the WAL Logs with a suffix .done (ex: 000000010000000000000020.done).
In order to proactively prevent the above error, please run the Resynchronise dSource operation on the dSource and then provision the VDB PiT.
Provide the target timestamp in the PiT timestamp field.
Note:The plugin accepts timestamps with ISO 8601 basic format. The allowed timestamp formats can be expressed as follows:
Without the timezone offset from UTC:
YYYY-mm-dd HH:MM:SS
YYYY-mm-dd HH:MM:SS.ssssss
In this case, the timezone is assumed to be the same as the timezone parameter configured in
postgresql.conf
file.
With the timezone offset from UTC:
YYYY-mm-dd HH:MM:SS(+/-)HHMM
YYYY-mm-dd HH:MM:SS.ssssss(+/-)HHMM
Here YYYY represents the year, mm represents the month, dd represents the day, HH represents the hour in 24-hour format, MM represents the minutes, SS represents the seconds and ssssss represents the microseconds.
Also, make sure the PiT timestamp is greater than the checkpoint timestamp available in the snapshot metadata.
Optionally, you can set the database configuration parameters for the VDB using Config Settings. You can disable a configuration parameter by selecting the check box Comment Property. This will comment out the parameter in the
postgresql.conf
configuration file.Click Next.
On the Configuration screen, enter the PostgreSQL VDB name which will be displayed on the UI.
Select a Target Group for the VDB and click add (+), to add a new group, if necessary.
Select a Snapshot Policy for the VDB then click Next.
Specify any desired hook operations.
Review the Provisioning Configuration and Data Management information.
Click Submit.
Once the VDB provisioning has been successfully completed, if the source and target instance ids are not the same, you may want to grant secadm and dbadm on the database to the target instance id. See Database permissions for provisioned PostgreSQL VDBs for more information.
There are two snapshot metadata that can be used to identify the results of the VDB PiT restore operation:
User-Provided Recovery Timestamp records the timestamp provided by the end-user.
PiT recovery Restore Timestamp records the last committed transaction timestamp the recovery has happened till.