
Volume 6, Issue 1, January – 2021 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Disaster Recovery Planning for Oracle Middleware Applications

Ananthan Subburaj
Technology Architect
New Jersey, United States of America

Abstract: Disaster recovery planning is one of the most crucial components of a business, yet it is often ignored or inadequately planned. Business organizations must have a well-structured plan and a documented process for disaster recovery and business continuity before a catastrophe occurs. IT infrastructure disasters can be short-term or long-lasting failures that result in loss of applications and data, but an organization that has planned well, with standby infrastructure and a recovery plan in place, can quickly get the business back on track. This paper outlines disaster recovery planning best practices and the technical process to recover Oracle applications with zero data loss. It aims to provide best practices for effective disaster management planning and technical configurations to achieve faster application recovery.

Keywords: Disaster Recovery; High Availability; Fusion Middleware DR; DR planning.

I. INTRODUCTION

In today's information world, IT applications have become increasingly critical to the operation of a company, and the importance of ensuring their continued operation and rapid recovery has increased. A business organization is severely impacted by a disaster when its IT infrastructure cannot continue to function because of data loss or application infrastructure failure; it may even go out of business. An effective disaster recovery plan ensures quick recovery of data and application infrastructure in the event of a natural or technical disaster. This paper aims to provide a systematic approach to planning the organization-level disaster recovery process, together with a technology implementation technique that reduces the disaster recovery time for Oracle middleware applications by using logical host names.

II. DISASTER RECOVERY ORGANIZATION

In the event of a disaster, the objective of the Disaster Recovery Organization (DRO) is to minimize disruption and downtime of critical business functions, and to minimize data loss, by rapidly recovering business-critical infrastructure and application components. The DRO focuses on two metrics: Recovery Point Objective (RPO) and Recovery Time Objective (RTO). Because disaster recovery work is a response to an unexpected event and cannot be scheduled in advance, the recovery process and the people who are assigned to it should be well documented, and the process should be automated through tools and scripts as much as possible.

The DRO should have well-defined procedures for conducting recovery:
• Notification and initial workforce mobilization process
• Damage assessment process
• Disaster declaration process
• Secondary workforce mobilization process
• DR command center establishment process
• DR support center establishment process
• Application recovery procedures

III. DR RESOURCE PLANNING AND UTILIZATION

An alternate facility should be available to function as a DR command and support center, supporting the senior DR management and DR process teams along with operations, help desk, workstation support and other virtual teams. This facility should be available for use by IT operations, application and business team personnel so that they can perform their respective duties during and after a recovery process. The facility should be maintained and tested annually to keep pace with the DR organization and recovery requirements. The DR resources operate in one of the following modes:

A. Normal Mode
During normal day-to-day operations the DR hardware resources can be used for test, development, and QA activities, with a minimum allocation reserved for DR synchronization activities. All DR servers should be on active maintenance contracts with the supplying vendors, and all code fixes, firmware, and operating systems should be kept at the same level as the primary data center.

B. Test Exercise Mode
The DR test can be done in two ways. The first is to switch over from the primary site to the DR site, run business operations at the DR site for a specific window, and then switch back to the original primary. The second is to test the DR facility in an isolated network, non-invasive to the current production environment, with controlled user testing.

C. Disaster Event Mode
This is an actual DR event, in which all production application systems are recovered at the DR location and made available for business operations.
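
The two metrics that the DRO tracks, RPO and RTO, can be measured after a test exercise or an actual DR event and compared with the agreed targets. The following is a minimal sketch of that comparison, not a prescribed tool: the class and function names, the 15-minute and 4-hour targets, and the timestamps are all illustrative assumptions that do not come from this paper.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Targets agreed by the DRO (hypothetical container, illustrative values)."""
    rpo: timedelta  # maximum tolerable window of lost data
    rto: timedelta  # maximum tolerable downtime


def evaluate_recovery(objectives, last_replicated_at, disaster_at, service_restored_at):
    """Compare the achieved data-loss and downtime windows with the objectives."""
    # Data written after the last replicated change is lost at the standby site.
    achieved_rpo = disaster_at - last_replicated_at
    # Downtime runs from the disaster until the service is available again.
    achieved_rto = service_restored_at - disaster_at
    return {
        "achieved_rpo": achieved_rpo,
        "achieved_rto": achieved_rto,
        "rpo_met": achieved_rpo <= objectives.rpo,
        "rto_met": achieved_rto <= objectives.rto,
    }


if __name__ == "__main__":
    # Assumed targets and timestamps, for illustration only.
    targets = RecoveryObjectives(rpo=timedelta(minutes=15), rto=timedelta(hours=4))
    result = evaluate_recovery(
        targets,
        last_replicated_at=datetime(2021, 1, 10, 8, 50),
        disaster_at=datetime(2021, 1, 10, 9, 0),
        service_restored_at=datetime(2021, 1, 10, 11, 30),
    )
    print(result)
```

Recording these two intervals for every test exercise gives the DRO a simple trend to review when recovery requirements change.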

IJISRT21JAN615 www.ijisrt.com 1476


IV. DISASTER RECOVERY ARCHITECTURE

Production Fusion Middleware applications based on WebLogic Server technology, such as WebCenter Portal, WebCenter Content, Imaging, and the Identity Management products, have product software binaries and domain configuration files on the file system and application metadata schemas in the database. Fusion Middleware application disaster recovery therefore includes replication of the application tier file systems and replication of the metadata store database to the standby site. The recommended process is to use shared storage for all application tier nodes at the production site, use storage replication to synchronize the application tier file systems to the standby site, and replicate the database tier using an Oracle physical standby database.

Fig. 1. Disaster Recovery Topology

The middleware applications have Oracle software binaries, domain configuration files, application deployment files and application metadata files in the application tier. The application tier mount points should be created on a shared storage file system at the production (primary) site, and the application binaries should be installed on that shared file system. When using multiple mount points, create all of them from the same consistency group on the primary site to ensure consistent data replication across all the file systems to the standby site. Set up storage replication at the project level with an appropriate replication schedule. The recommended schedule for the binary and configuration files is once a day, and a one-time on-demand sync can be run whenever a major deployment or upgrade happens on the primary site. Because file system replication with automatically scheduled incremental syncs replicates all changes from the primary site to the standby site, there is no need to install any software or update configurations at the standby site.

The Fusion Middleware applications use metadata schemas and application schemas in the Oracle database. Establish an Oracle physical standby database to replicate the data from the production site to the DR site. Configure the Data Guard broker to monitor the database replication and perform administrative tasks for it.

V. NETWORK CONFIGURATIONS

The application components should be configured with a logical host name (host name alias) instead of the physical host name as the listen address on both the WebLogic Administration Server and the managed servers. The logical host name should resolve to the appropriate physical host IP address at the primary site and also at the DR site. Because the alias host name resolves to the production host's physical IP address at the production site and to the DR host's physical IP address at the DR site, there is no need to update any domain configuration when switching over to the standby site for a disaster recovery operation. Configuring the application components with alias host names also helps with server migration: in the event of a host hardware or operating system failure at the production site, the application can be quickly started on another server by mounting the shared application file systems and updating the alias host name to point to the new server's physical IP address.

A load balancer should be configured at both the primary and the standby site to route traffic to the physical IP addresses of the servers. When the switchover happens, the application DNS entry should be changed to point to the DR load balancer virtual IP address instead of the primary site load balancer virtual IP address. A site load balancer can be used to route traffic between the primary and standby site load balancers, either based on health check rules or on demand when the DRO declares the disaster and the application switchover is completed.

Host name resolution can be achieved either with the local /etc/hosts file on each application and database host involved in the configuration or with DNS servers. When using the /etc/hosts file, maintain the same set of entries on all application servers for better manageability. When using separate DNS servers for the primary site and the DR site, the host name alias entries can be preconfigured in each DNS to point to the respective physical hosts. With a global DNS server, the host name alias needs to be updated to point to the DR site during the recovery process.

The host name resolution method should be decided as part of the design phase of the DR process. The resolution order can be controlled by changing the configuration order in the /etc/nsswitch.conf file on each host. An entry such as "hosts: files dns nis" makes the host use the local hosts file as its primary resolution method.
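
The lookup order set in /etc/nsswitch.conf can be checked on each host before relying on /etc/hosts based aliases. The sketch below is a minimal, hedged illustration: the function names are hypothetical, the sample text mirrors the "hosts: files dns nis" entry quoted in this section, and bracketed action items (for example [NOTFOUND=return]) are simply skipped rather than interpreted.

```python
import os
import re


def hosts_lookup_order(nsswitch_text):
    """Return the sources on the 'hosts:' line of nsswitch.conf, in order."""
    for raw_line in nsswitch_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove bracketed action items; only the source names matter here.
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def files_is_primary(nsswitch_text):
    """True when the local hosts file is consulted before any other source."""
    order = hosts_lookup_order(nsswitch_text)
    return bool(order) and order[0] == "files"


if __name__ == "__main__":
    sample = "passwd: files\nhosts: files dns nis\n"
    print(hosts_lookup_order(sample), files_is_primary(sample))

    # Optionally check the real file when it exists on this host.
    path = "/etc/nsswitch.conf"
    if os.path.exists(path):
        with open(path) as handle:
            print(path, "-> files first:", files_is_primary(handle.read()))
```

Running such a check on every node of the topology helps confirm that all hosts resolve the aliases through the same method chosen in the design phase.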



VI. MANAGING HOST NAME

Depending on the type of host name resolution used, the following are the sample host name aliases required for the middleware applications. When using /etc/hosts file based resolution, make sure that entries for all hosts are maintained on every server that is part of the application topology, and test name resolution with the ping command from each node to all other nodes. When a global DNS change is used for switchover, validate name resolution from all nodes to ensure that the DNS cache is not still pointing to the old IP address.

Table 1. Primary Site hostname resolution

IP Address      Physical Host Name        Alias Host Name
110.24.2.101    PRIWEBHOST1.SAMPLE.COM    WEBHOST1.SAMPLE.COM WEBHOST1
110.24.2.102    PRIWEBHOST2.SAMPLE.COM    WEBHOST2.SAMPLE.COM WEBHOST2
110.24.2.103    PRIAPPHOST1.SAMPLE.COM    APPHOST1.SAMPLE.COM APPHOST1
110.24.2.104    PRIAPPHOST2.SAMPLE.COM    APPHOST2.SAMPLE.COM APPHOST2

Table 2. Standby Site hostname resolution

IP Address      Physical Host Name        Alias Host Name
110.44.2.101    DRWEBHOST1.SAMPLE.COM     WEBHOST1.SAMPLE.COM WEBHOST1
110.44.2.102    DRWEBHOST2.SAMPLE.COM     WEBHOST2.SAMPLE.COM WEBHOST2
110.44.2.103    DRAPPHOST1.SAMPLE.COM     APPHOST1.SAMPLE.COM APPHOST1
110.44.2.104    DRAPPHOST2.SAMPLE.COM     APPHOST2.SAMPLE.COM APPHOST2

VII. DISASTER RECOVERY PROCESS

To activate the standby site application when there is a failure or planned outage of the production site, use the following steps to bring up the application on the standby site so that it can take over business operations:
• Stop the application services on the production site (in an unplanned failure the applications might already be down) and stop the file system replication from the production site to the standby site.
• Apply the last available database redo logs to the standby database and execute a switchover (planned maintenance of the production site) or a failover (unplanned failure of the production site) of the database using Data Guard.
• Mount the replicated file systems on the standby servers in read-write mode.
• Execute the DNS change and validate that the alias host names resolve to the appropriate standby site hosts.
• Start all the application services on the standby nodes.
• Update the application URL DNS entry to point to the standby load balancer, or use a global load balancer to route user connections to the standby site.
• If the original primary site is accessible, enable replication of both the database and the application file systems.
• Establish an appropriate database and application file system backup process to back up the files from the new standby site.

Fig. 2. Disaster Recovery process flow

VIII. LOAD BALANCER SWITCHOVER

The primary site and the standby site have independent load balancers configured to balance application traffic across all the configured application nodes. During a DR event, when the application is switched over to the standby site, client application access should be transparently redirected to the standby site, which is now configured as the new primary. To redirect client application access, the DNS name of the application URL should be updated to point to the DR load balancer virtual IP address. This can be automated by configuring a site load balancer as the front end for the local, site-level application load balancers. At additional infrastructure cost, a global site load balancer provides additional capabilities to monitor and detect local load balancer failures and to automate redirection to the available site based on the load balancer rule configuration. The global load balancer also avoids the impact of DNS cache issues while the DNS alias is being updated to the DR IP address.
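
The step that validates alias host name resolution after the DNS change, and the concern about stale DNS cache entries, can be scripted and run from every node. The following is a minimal sketch under stated assumptions: the alias-to-IP mapping reuses the sample standby values from Table 2, the function name is hypothetical, and the resolver is passed in as a parameter so that the demonstration below uses a simulated resolver instead of querying real DNS for the sample names.

```python
import socket

# Expected alias-to-IP mapping at the standby site (sample values from Table 2).
STANDBY_ALIASES = {
    "WEBHOST1.SAMPLE.COM": "110.44.2.101",
    "WEBHOST2.SAMPLE.COM": "110.44.2.102",
    "APPHOST1.SAMPLE.COM": "110.44.2.103",
    "APPHOST2.SAMPLE.COM": "110.44.2.104",
}


def find_stale_aliases(expected, resolve=socket.gethostbyname):
    """Return {alias: resolved value} for aliases that do not resolve as expected."""
    stale = {}
    for alias, expected_ip in expected.items():
        try:
            actual = resolve(alias)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            actual = "unresolved ({})".format(exc)
        if actual != expected_ip:
            stale[alias] = actual
    return stale


if __name__ == "__main__":
    # Simulated resolver: one alias still returns its primary site address,
    # as it would if a cached DNS answer had not yet expired.
    cached = dict(STANDBY_ALIASES, **{"APPHOST1.SAMPLE.COM": "110.24.2.103"})
    stale = find_stale_aliases(STANDBY_ALIASES, resolve=cached.__getitem__)
    for alias, value in stale.items():
        print("{} resolves to {}, expected {}".format(alias, value, STANDBY_ALIASES[alias]))
```

On a real node the default resolver (socket.gethostbyname) would be used, and an empty result would indicate that the application services can be started against the standby hosts.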



IX. BEST PRACTICES FOR DISASTER RECOVERY

The following are best practices for preparing the disaster recovery site and recovery procedures in readiness for a site failure.
• The disaster recovery site should be geographically separated from the primary site to ensure site availability and to avoid the possibility of losing both sites in a major natural disaster.
• Use Oracle Data Guard to replicate database changes to the standby site database, and use the Data Guard broker to simplify administration tasks.
• Configure the Active Data Guard feature to offload read-only queries to the standby database and so make use of the standby hardware resources.
• Use the Oracle Flashback Database feature to reinstate the old primary database as a standby database in the event of a site failover.
• Replicate the application file systems to the DR site using storage replication technology, establish a procedure to reverse the direction of replication in the event of a switchover, and use a cloned replica for site testing.
• Create role-based database services for application connectivity to the database.
• Test the standby site using a snapshot standby database, which temporarily converts the physical standby database to an updatable copy.
• Create documented operational procedures to streamline the DR test process and the actual DR event.
• To enable faster recovery and reduce human errors, use tools or automation scripts to execute the DR procedure.
• Configure the Data Guard broker to automate Data Guard operations and the database failover and switchover steps.
• Create a DB_ROLE_CHANGE trigger to automate the configuration steps that follow a database switchover or failover.

X. CONCLUSIONS

Every business, large or small, in today's information world depends on its IT infrastructure servers and application data for business operations. Many common risks, such as natural disasters and internal technical failures like hardware failure or human error, can have adverse effects on information systems and hinder business operations. It is essential for a company to create a well-defined disaster recovery plan and test it periodically. It is important for business organizations to plan the disaster recovery IT infrastructure, create recovery procedures, test their readiness to take on a disaster, and make effective use of DR IT assets in the normal mode of business operations. This paper aimed to provide insight into disaster recovery planning, infrastructure utilization, technical architecture and best practices to achieve quick recovery of business applications and reduce the business impact.

REFERENCES

[1]. https://docs.oracle.com/en/middleware/fusion-middleware/12.2.1.3/asdrg/toc.htm
[2]. https://docs.oracle.com/middleware/1212/core/ASADM.pdf
[3]. https://www.oracle.com/technetwork/database/availability/maa-site-guard-exalogic-exadata-1978799.pdf
[4]. https://www.oracle.com/technetwork/database/features/availability/wlsdatasourcefordataguard-1534212.pdf
[5]. https://www.oracle.com/technetwork/database/availability/maa-fmwsharedstoragebestpractices-402094.pdf

