IBM Db2 overview
Introduction to Db2
Db2 for Linux, UNIX, and Windows (LUW) is a database server product developed by IBM. Db2 LUW is the "Common Server" product member of the Db2 family, designed to run on most popular operating systems.
The version numbers in Db2 are non-sequential with v11.1 and 11.5 being the two most recent releases. Specifics of Db2 versions and platforms supported on Delphix are located in the Db2 Compatibility Matrix
IBM Db2 authentication
Db2 User and group authentication is managed in a facility external to Db2 LUW, such as the operating system, a domain controller, or a Kerberos security system. This is different from other database management systems (DBMSs), such as Oracle and SQL Server, where user accounts may be defined and authenticated in the database itself, as well as in an external facility such as the operating system.
Any time a user ID and password is explicitly provided to Db2 LUW as part of an instance attachment or database connection request, Db2 attempts to authenticate that user ID and password using this external security facility. If no user ID or password is provided with the request, Db2 implicitly uses the user ID and password that were used to login to the workstation where the request originated. More information on Db2 authentication and authorization is available via IBM documentation
Delphix Db2 authentication
Delphix for Db2 requires that the staging and target hosts must already have the necessary users and authentication systems created/installed on them. Delphix will neither create users nor change database passwords as part of the provisioning process.
Db2 database level support
Delphix supports a number of Db2 database-level features. An overview of Db2 database level support is as follows:
Support for multiple databases linking in a Single Instance, which allows Delphix Engine users to utilize an available instance on the target more efficiently.
Support for using a customer-supplied directory for the Delphix Plugin, Db2 Mount Points, and Db2 Delphix files and logs.
Support for the in-memory BLU feature of Db2.
Support for Kerberos environments.
Support for VDB provisioning of the same OS version level or one version higher than the source and staging instances.
Intelligent handling of HADR logs.
High Availability Disaster Recovery (HADR)
The HADR feature of IBM Db2 provides a high availability solution for both partial and complete site failures. It protects against data loss by replicating data changes from a source database, called the primary, to one or more target databases, called the standby.
Delphix HADR support
HADR replication takes place at a database level, not at the instance level. Therefore, a standby instance can have multiple databases from multiple different primary servers/instances on it. If the instance ID on the Delphix standby is not the same as the instance ID on the primary, the Delphix standby instance ID must have database permissions secadm and dbadm granted to it on the primary database. These permissions and all HADR settings must be implemented on the primary database before you take the backup on the primary database.
Log transmitting
All changes that take place at the primary database server are written into log files. The individual log records within the log files are then transmitted to the secondary database server, where the recorded changes are replayed to the local copy of the database. This procedure ensures that the primary and the secondary database servers are in a synchronized state. Using two dedicated TCP/IP communication ports and a heartbeat, the primary and the standby databases track where they are processing currently, the current state of replication, and whether the standby database is up-to-date with the status of the primary database. When a log record is "closed" (still in memory, but has yet to be written to disk on the primary), it is immediately transmitted to the HADR standby database(s). Transmission of the logs to the standbys may also be time-delayed.
Multiple standby
Beginning in Db2 v10.1, the HADR feature supports multiple standby databases. This enables an advanced topology where you can deploy HADR in multiple standby mode with up to three standby databases for a single primary. One of the databases is designated as the principal HADR standby database, with the others termed as auxiliary HADR standby databases. As with the standard HADR deployment, both types of HADR standbys are synchronized with the HADR primary database through a direct TCP/IP connection. Furthermore, both types support the reads on standby feature and can be configured for time-delayed log replay. It is possible to issue a forced or unforced takeover on any standby, including the delphix auxiliary standby. However, you should never use the Delphix auxiliary standby as a primary, because this will impact Delphix performance.
Delphix HADR synchronization
The Delphix for Db2 uses the HADR capability of Db2 to synchronize data from a production Db2 database into a Delphix-controlled Db2 "standby" server. By using this mature and existing Db2 capability, the Delphix Engine is able to ingest data and keep the standby server in sync with only a minimal impact on production. The HADR connection is configured to Super-Asynchronous (SUPERASYNC) mode where log writes are considered successfully transmitted when the log records are sent from the primary database. Because the primary database does not wait for acknowledgments from the standby database, there is no delay on the primary and transactions are considered committed regardless of the state of the replication of that transaction. For further information on Delphix synchronization, see Linking a dSource from a Db2 Database: An Overview
Database partitioning feature
Database Partitioning Feature (DPF) lets you partition your database across multiple servers. Since you can add new machines and spread your database across them, this allows users to scale their database. This means more CPUs, more memory, and more disks from each of the additional machines are available for your database. DPF can be used to manage large databases for a variety of use cases including data warehousing, data mining, online analytical processing (OLAP), or online transaction processing (OLTP) workloads.
DPF enables the user to divide a database into database partitions, a database partition is a part of a database that consists of its own data, indexes, configuration files, and transaction logs. Each database partition can be configured on the different physical server having its own set of computing resources, including CPU and storage. When a query is processed, the request is divided so each database partition processes the rows that it is responsible for. DPF can maintain consistent query performance as the table grows by providing the capability to add more processing power in the form of additional database partitions. This capability is often referred to as providing linear scalability using Db2s shared-nothing architecture.
DPF is an approach to sizing and configuring an entire database system. Please follow the recommended practices for optimal performance, reliability, and capacity growth. Please refer to IBM documentation of DPF for more details in IBM knowledge center