Best practices for Validated Sync
The Delphix Engine limits the number of concurrent restore operations that validated sync workers can run on a staging environment. Across all staging databases on that environment, at most five restore operations execute at a time, and the rest wait their turn in first-come, first-served order. This throttling improves overall system performance by reducing resource contention, disk I/O, and network traffic. Note that the limit applies per Delphix Engine connecting to the staging environment.
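The throttling behaviour described above can be illustrated with a small simulation. This is only a sketch of a five-slot, first-come, first-served queue, not the Delphix Engine's implementation: the function name `simulate_fifo_restores`, the assumption that all requests arrive at the same moment, and the 60-second restore durations are hypothetical.

```python
import heapq

# Limit stated in the text: five concurrent restores per Delphix Engine
# on a staging environment.
MAX_CONCURRENT_RESTORES = 5


def simulate_fifo_restores(durations, slots=MAX_CONCURRENT_RESTORES):
    """Return (start, finish) times in seconds for queued restore requests.

    durations: restore durations in seconds, listed in arrival order
    (hypothetical values). All requests are assumed to be queued at time 0.
    """
    # Min-heap holding the time at which each restore slot becomes free.
    free_at = [0.0] * slots
    heapq.heapify(free_at)

    schedule = []
    for duration in durations:
        # The next request in arrival order takes the earliest free slot.
        start = heapq.heappop(free_at)
        finish = start + duration
        heapq.heappush(free_at, finish)
        schedule.append((start, finish))
    return schedule


if __name__ == "__main__":
    # Seven hypothetical restores of 60 s each: five start at once, two wait.
    for i, (start, finish) in enumerate(simulate_fifo_restores([60] * 7), 1):
        print(f"restore {i}: start={start:.0f}s finish={finish:.0f}s")
```

In this model the sixth and seventh requests cannot start until one of the first five finishes, which is the waiting described above.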
Validated sync is only supported for Oracle and SQL Server database types.
The following factors limit the performance of staging databases on a staging environment when validated sync workers run to keep up with the production databases:
Backup generation frequency: With a higher backup frequency, restore times increase because the pre-provisioning worker is still ingesting earlier backups while new backups are being generated.
Staging database count: When multiple staging databases are hosted on the same server, the backup ingestion load on the staging host will increase. Additionally, if the frequency of backups is high, there will be a greater number of candidates (pre-provisioning workers) waiting in the queue.
Number of VDBs hosted on the server
Number of Delphix Engines: Multiple Delphix Engines connecting to the same staging environment increase the number of parallel restore operations running on it, which affects performance.
The following steps can help improve performance:
Use dedicated staging servers to keep staging workloads separate from VDBs
Add CPU/Memory
Decrease backup frequency
Introduce dedicated networks
Below is an example of the effect of concurrency on validated sync performance:
The findings below are from a non-production setup.
Environment details
Staging Host: 64 GB Memory, 8 vCPUs, ESXi version: 7.0.3
Backup File Size: 200 MB
User for linking: Database user
Setup notes:
No operations other than the pre-provisioning worker were running on the Delphix Engine.
No virtual database existed on the staging host.
The source servers and the Delphix Engine are all in the same on-premises data center.
Only one Delphix Engine was connecting to the staging host.
Performance Scenario 1
For a staging host with the above configuration supporting 50 staging databases on a single database instance, with every dSource receiving a backup at a 15-minute interval, the time taken to restore these transaction logs stays under 13 minutes (< 750 seconds) on average, so the staging databases keep up with the backups.
Performance Scenario 2
The same setup could support more frequent backups, that is, every 10 minutes, but only with fewer staging databases. For example, 40 staging databases on a single database instance could support backups every 10 minutes without causing any lag.
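As a rough way to reason about the two scenarios, the sketch below estimates how long a single restore may take on average if every staging database is to finish within one backup interval. It is a back-of-the-envelope model, not a measurement: it assumes one backup per database per interval, equal restore durations, and that all five restore slots stay busy. The function name `restore_budget_seconds` is hypothetical.

```python
def restore_budget_seconds(staging_dbs, backup_interval_min, concurrent_restores=5):
    """Average time (seconds) one restore may take so that all staging
    databases are restored within a single backup interval.

    Simplified model: one backup per database per interval, identical
    restore durations, and all restore slots fully used.
    """
    interval_s = backup_interval_min * 60
    return interval_s * concurrent_restores / staging_dbs


# Scenario 1: 50 staging databases, backups every 15 minutes.
print(restore_budget_seconds(50, 15))  # 90.0

# Scenario 2: 40 staging databases, backups every 10 minutes.
print(restore_budget_seconds(40, 10))  # 75.0
```

Under these assumptions, shortening the backup interval shrinks the per-restore budget unless the number of staging databases is reduced as well, which matches the direction of the trade-off observed in Scenario 2.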