17 Using Parallel Replicat

You can add and configure Parallel Replicat in your environment. A Parallel Replicat processes trail information in parallel through all of its internal stages, from beginning to end. This chapter also explains its components: the Mappers, the Master, and the Appliers.

17.1 Overview of Parallel Replicat

Parallel Replicat is a new variant of Replicat that applies transactions in parallel to improve performance.

It takes into account dependencies between transactions, similar to Integrated Replicat. Dependency computation, mapping, and apply are all performed in parallel outside the database, so this work can be off-loaded to another server. Transaction integrity is maintained throughout. In addition, Parallel Replicat supports the parallel apply of large transactions by splitting a large transaction into chunks and applying the chunks in parallel.
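
The chunking idea can be sketched in a few lines of Python. This is only an illustrative model, not how Parallel Replicat is implemented: the function name `split_transaction` and the `piece_size` argument are hypothetical, and the sketch does not model the dependency tracking between pieces.

```python
def split_transaction(records, piece_size):
    """Split one large transaction's records into consecutive pieces.

    Hypothetical helper: each piece holds at most `piece_size` records and
    the pieces keep the original record order.
    """
    if piece_size < 1:
        raise ValueError("piece_size must be at least 1")
    return [records[i:i + piece_size] for i in range(0, len(records), piece_size)]


# A 2500-record transaction split into pieces of at most 1000 records.
pieces = split_transaction(list(range(2500)), 1000)
print([len(p) for p in pieces])
```

Each piece could then be handed to a separate worker, provided that dependencies between the pieces are still honored.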

Parallel Replicat supports all databases using the non-integrated option. Parallel Replicat only supports replicating data from trails with full metadata, which requires the classic trail format.

The components of parallel Replicat are:
  • Mappers operate in parallel to read the trail, map trail records, convert the mapped records to the Integrated Replicat LCR format, and send the LCRs to the Master for further processing. While one Mapper maps one set of transactions, the next Mapper maps the next set of transactions. The trail information is split among the Mappers, but the trail file itself is untouched because it keeps the trail information in order.

  • Master processes have two threads, Collater and Scheduler. The Collater receives mapped transactions from the Mappers and puts them back into trail order for dependency calculation. The Scheduler calculates dependencies between transactions, groups transactions into independent batches, and sends the batches to the Appliers to be applied to the target database.

  • Appliers reorder records within a batch for array execution. They apply the batch to the target database and perform error handling. They also track applied transactions in checkpoint tables.
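
The hand-off from the Mappers to the Collater can be pictured with a small Python sketch. It is a simplified model under stated assumptions, not the product's implementation: `map_transaction`, `collate`, and `run_pipeline` are hypothetical names, the "mapping" is just an upper-case conversion, and each trail item carries a sequence number so that trail order can be restored.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor, as_completed


def map_transaction(seq, record):
    """Stand-in for a Mapper: return the trail sequence number and mapped record."""
    return seq, record.upper()


def collate(mapped):
    """Stand-in for the Collater: yield mapped records in trail order.

    Items may arrive in any order; they wait in a min-heap until the next
    expected sequence number is available.
    """
    heap = []
    expected = 0
    for item in mapped:
        heapq.heappush(heap, item)
        while heap and heap[0][0] == expected:
            yield heapq.heappop(heap)[1]
            expected += 1


def run_pipeline(trail, mappers=2):
    """Map trail items on several threads, then restore trail order."""
    with ThreadPoolExecutor(max_workers=mappers) as pool:
        futures = [pool.submit(map_transaction, i, rec) for i, rec in enumerate(trail)]
        results = (f.result() for f in as_completed(futures))
        return list(collate(results))


print(run_pipeline(["a", "b", "c", "d"], mappers=3))
```

The heap lets the Collater stand-in emit a record as soon as every earlier record has arrived, without waiting for the whole trail to be mapped.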

The following table lists the features supported by the respective Replicats.

Feature                             Classic Replicat  Coordinated Replicat  Integrated Replicat  Parallel Replicat
Batch Processing                    Yes               Yes                   Yes                  Yes
Barrier Transactions                No                Yes                   Yes                  Yes
Dependency Computation              No                No                    Yes                  Yes
Auto-parallelism                    No                No                    Yes                  Yes
DML Handler                         No                No                    Yes                  Yes (Footnote 1)
Procedural Replication              No                No                    Yes                  Yes (Footnote 2)
Auto CDR                            No                No                    Yes                  Yes (Footnote 3)
Dependency-aware Transaction Split  No                No                    No                   Yes
Cross-RAC-node Processing           No                Yes                   No                   Yes

Footnote 1: Integrated mode

Footnote 2: Used for integrated Parallel Replicat (iPR)

Footnote 3: Used by iPR only

17.2 Parallel Replication Architecture

Parallel replication processes leverage the apply processing functionality that is available within the Oracle Database in integrated mode.

Within a single Replicat configuration, multiple inbound server child processes, known as apply servers, apply transactions in parallel while preserving the original transaction atomicity.

The architecture diagram depicts the flow of change records through the various processes of a parallel replication from the trail files to the target database.

[Illustration: para_rep_arch.jpg, parallel replication architecture]

The Mappers read the trail file, map the records, and forward the mapped records to the Master. The Master groups the transactions into batches, which are sent to the Appliers and applied to the target database.

The Master process consists of two separate threads, Collater and Scheduler. The Collater is responsible for managing and communicating with the Mappers, along with receiving the mapped transactions and reordering them into a single in-order stream. The Scheduler is responsible for managing and communicating with the Appliers, along with reading transactions from the Collater, batching them, and scheduling them to Appliers.
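
The Scheduler's batching step can be illustrated with a simplified Python sketch. It assumes a hypothetical model in which each transaction is described by the set of row keys it touches and two transactions are dependent whenever they share a key; the real dependency computation is not described here and is likely more involved. The function name `schedule_batches` is made up for this example.

```python
def schedule_batches(transactions):
    """Group in-order transactions into batches of mutually independent ones.

    `transactions` is a list of (txn_id, keys) pairs in trail order, where
    `keys` is the set of row keys the transaction touches (assumed model).
    A transaction is placed in the first batch that comes after every
    earlier batch touching one of its keys.
    """
    last_batch_for_key = {}
    batches = []
    for txn_id, keys in transactions:
        # Index of the latest earlier batch that shares a key, or -1 if none.
        latest = max((last_batch_for_key.get(k, -1) for k in keys), default=-1)
        batch_no = latest + 1
        if batch_no == len(batches):
            batches.append([])
        batches[batch_no].append(txn_id)
        for k in keys:
            last_batch_for_key[k] = batch_no
    return batches


stream = [("t1", {"a"}), ("t2", {"b"}), ("t3", {"a", "b"}), ("t4", {"c"}), ("t5", {"a"})]
print(schedule_batches(stream))
```

In this model, transactions inside one batch never share a key, so they could be applied in parallel, while the batches themselves are applied in order.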

The Scheduler controller communicates with the Scheduler to gather any necessary information (such as the current low watermark position). The Scheduler controller is required for CDB mode for Oracle Database because it is responsible for aggregating information pertaining to the different target PDBs and reporting a unified picture. The Scheduler controller is created for simplicity and uniformity of implementation, even when not in CDB mode. Every process reads the parameter file and shares a single checkpoint file.

17.3 Basic Parameters for Parallel Replicat

The following table lists the basic parallel Replicat parameters and their descriptions.

Parameter Description
MAP_PARALLELISM

Configures the number of Mappers. This controls the number of threads used to read the trail file. The minimum value is 1, the maximum value is 100, and the default value is 2.

APPLY_PARALLELISM

Configures the number of Appliers. This controls the number of connections in the target database used to apply the changes. The default value is 4.

MIN_APPLY_PARALLELISM

MAX_APPLY_PARALLELISM

Apply parallelism is auto-tuned. You can set a minimum and a maximum value to define the range within which Replicat automatically adjusts its parallelism. There are no defaults. Do not use these parameters together with APPLY_PARALLELISM.

SPLIT_TRANS_RECS

Specifies that large transactions should be broken into pieces of the specified size and applied in parallel. Dependencies between pieces are still honored. Disabled by default.

COMMIT_SERIALIZATION

Enables FULL commit serialization mode, which forces transactions to be committed in trail order.

Advanced Parameters

LOOK_AHEAD_TRANSACTIONS

Controls how far ahead the Scheduler looks when batching transactions. The default value is 10000.

CHUNK_SIZE

Controls how large a transaction must be for parallel Replicat to consider it as large. When parallel Replicat encounters a transaction larger than this size, it will serialize it, resulting in decreased performance. However, increasing this value will also increase the amount of memory consumed by parallel Replicat.

Example Parameter File

replicat repA
userid ggadmin, password ***
MAP_PARALLELISM 3
MIN_APPLY_PARALLELISM 2
MAX_APPLY_PARALLELISM 10
SPLIT_TRANS_RECS 1000
map *.*, target *.*;
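
Before starting a Replicat, it can help to sanity-check the parallelism settings against the rules in the table above. The following Python sketch is a hypothetical helper, not part of Oracle GoldenGate: `check_parallel_params` takes the numeric parameters as a dictionary and reports violations of the documented MAP_PARALLELISM range and of the rule that APPLY_PARALLELISM is not combined with the MIN/MAX pair. The check that the minimum does not exceed the maximum is an assumption added for this example.

```python
def check_parallel_params(params):
    """Return a list of problems found in a dict of parameter name -> int value."""
    problems = []

    # Documented range is 1 to 100, with a default of 2.
    map_par = params.get("MAP_PARALLELISM", 2)
    if not 1 <= map_par <= 100:
        problems.append("MAP_PARALLELISM must be between 1 and 100")

    # Fixed apply parallelism and the auto-tuned range are mutually exclusive.
    auto_tuned = params.keys() & {"MIN_APPLY_PARALLELISM", "MAX_APPLY_PARALLELISM"}
    if "APPLY_PARALLELISM" in params and auto_tuned:
        problems.append("APPLY_PARALLELISM cannot be combined with MIN/MAX_APPLY_PARALLELISM")

    # Assumption for this sketch: a minimum above the maximum is an error.
    low = params.get("MIN_APPLY_PARALLELISM")
    high = params.get("MAX_APPLY_PARALLELISM")
    if low is not None and high is not None and low > high:
        problems.append("MIN_APPLY_PARALLELISM must not exceed MAX_APPLY_PARALLELISM")

    return problems


# Values taken from the example parameter file above.
example = {
    "MAP_PARALLELISM": 3,
    "MIN_APPLY_PARALLELISM": 2,
    "MAX_APPLY_PARALLELISM": 10,
    "SPLIT_TRANS_RECS": 1000,
}
print(check_parallel_params(example))
```

An empty list means that none of the checked rules is violated; it does not guarantee that the parameter file is otherwise valid.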

17.4 Creating a Parallel Replicat

You can create a parallel replication using the graphical user interface or the command line interfaces GGSCI and the Admin Client.

A parallel Replicat requires a checkpoint table, so both the Administration Server UI and the Admin Client issue an error when the parallel Replicat does not include a checkpoint table.

Note:

Parallel replication does not support COMMIT_SERIALIZATION in Integrated Mode. To use this apply process, use Integrated Replicat.

Creating a Non-Integrated Parallel Replicat with the Administration Server

  1. Open a browser and connect to the Service Manager that you created with the Configuration Assistant:

    https://server_name:service_manager_port/
    

    For example, https://localhost:9000/. In a non-secured environment, use http instead of https.

    The Oracle GoldenGate Service Manager is displayed.

  2. Enter the username and password you created and click Sign In.

    In the Service Manager, you can see servers that are running.

  3. In the Services section, click Administration Server, and then log in.

  4. Click the Application Navigation icon to the left of the page title to expand the navigation panel.

  5. Create the checkpoint table by clicking Configuration in the right navigation panel.

  6. Ensure that you have a valid credential and log in to the database by clicking the ‘log in database’ icon under Action.

  7. Click the + sign to add a checkpoint table.

  8. Enter the schema.name of the checkpoint table that you would like to create, and then click Submit.

  9. Validate that the table was created correctly by logging out of the Credential Alias using the log out database icon, and then logging back in.

    Once the log in is complete, your new checkpoint table is listed.

  10. Click Overview to return to the main Administration Server page.

  11. Click the + sign next to Replicats.

  12. Select Nonintegrated Replicat, and then click Next.

  13. Enter the required information making sure that you complete the Credential Domain and Credential Alias fields before completing the Checkpoint Table field, and then select your newly created Checkpoint Table from the list.

  14. Click Next, and then click Create and Run to complete the Replicat creation.

Creating a Non-Integrated Parallel Replicat with the Admin Client

  1. Go to the bin directory of your Oracle GoldenGate installation directory.

    cd $OGG_HOME/bin
    
  2. Start the Admin Client.

    ./adminclient
    

    The Admin Client command prompt is displayed.

    OGG (not connected) 12>
    
  3. Connect to the Service Manager deployment source:

    connect http://localhost:9500 deployment Target1 as oggadmin password welcome1
    

    You must use http or https in the connection string; this example is a non-SSL connection.

  4. Add the Parallel Replicat, which may take a few minutes to complete:

    add replicat R1, parallel, exttrail bb checkpointtable ggadmin.ggcheckpoint
    

    You can use just the two-character trail name in the ADD REPLICAT command, or you can use the full path, such as /u01/oggdeployments/target1/var/lib/data/bb.

  5. Verify that the Replicat is running:

    info replicat R1
    

    Messages similar to the following are displayed:

    REPLICAT   R1        Initialized   2016-12-20 13:56   Status RUNNING
    Parallel
    Checkpoint Lag       00:00:00 (updated 00:00:22 ago)
    Process ID           30007
    Log Read Checkpoint  File ./ra000000000
                         First Record  RBA 0
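
If you capture this output in a script, the group name and status can be pulled from the first line. The Python sketch below is only an illustration: `replicat_status` is a hypothetical helper, and it assumes that the output keeps the layout of the sample above, with a line that starts with REPLICAT and contains the word Status followed by the state.

```python
import re


def replicat_status(info_output):
    """Extract (group name, status) from INFO REPLICAT-style output.

    Returns None when no matching line is found. The layout is assumed
    to match the sample output shown above.
    """
    match = re.search(r"^\s*REPLICAT\s+(\S+).*?\bStatus\s+(\S+)", info_output, re.MULTILINE)
    if match is None:
        return None
    return match.group(1), match.group(2)


sample = """REPLICAT   R1        Initialized   2016-12-20 13:56   Status RUNNING
Parallel
Checkpoint Lag       00:00:00 (updated 00:00:22 ago)
Process ID           30007
"""
print(replicat_status(sample))
```

A script could compare the returned status with RUNNING and raise an alert otherwise.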