Last Friday there was one
post on
OTN forums that brought my attention. The OP was wondering “Is Data Guard Buggy” with attention to Fast-Start Failover (FSFO) feature that provides an ability of automatic failover to the standby database if the primary database is not available for certain time.
He had some concerns about FSFO being unreliable, very difficult to be implemented and actually doesn't work properly.
The OP got some immediate response from the OTN users. I couldn't agree more with Joseph Meeks's comment on the topic.
In my opinion, FSFO is very nice feature that plays big role in Oracle's
Maximum Availability Architecture. There might be some valid reasons not to implemented it, but if the automatic failover is a requirement, FSFO is the way to go. Should one have any problems implementing it, the best way to go is to get Oracle Support involved.
In this post, I'd like to show that implementation of the FSFO should not be a difficult task once you have the Data Guard environment set up.
I configured data guard physical standby environment on my laptop. Due to hardware limitations, I'll have the observer running on the same machine with the standby databases. Keep in mind this is just an example. In practice, the primary database, the standby database and the observer should run on different hosts. This example also answers one of the questions OP asked: Will it be possible to set it up on one machine? The answer would be it is possible, as shown in the example :-), but it is not the right way to go.
I use DGMGRL utility in the example..
So, I configured a Data Guard environment where DB1.MYDOMAIN.COM is primary database and STDB.MYDOMAIN.COM is physical standby. Both databases are 10.2.x
By default, protection mode of the Data Guard configuration is set to MAXIMUM PERFORMANCE
DGMGRL> connect sys/***@db1
Connected.
DGMGRL> show configuration
Configuration
Name: DRTest
Enabled: YES
Protection Mode: MaxPerformance
Fast-Start Failover: DISABLED
Databases:
db1 - Primary database
stdb - Physical standby database
Current status for "DRTest":
SUCCESS
Enabling Fast-Start Failover requires the following pre-requisites to be met:
Flashback Database feature is enabled on both the primary and the standby database.
The protection mode of the configuration must be set to MAXIMUM AVAILABILITY
tnsnames.ora in the ORACLE_HOME where the observer runs must be set to see both databases, the primary and the standby.
DGMGRL must be available on the observer host.
Enable Flashback DatabaseI will enable flashback database feature on both databases. This assumes that the flash recovery area is configured.
The flashback database feature provides an ability for an easy reinstatement of the failed primary database to new standby database.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL>
SQL> startup mount
ORACLE instance started.
Total System Global Area 285212672 bytes
Fixed Size 1261348 bytes
Variable Size 100663516 bytes
Database Buffers 176160768 bytes
Redo Buffers 7127040 bytes
Database mounted.
SQL> alter database flashback on;
Database altered.
SQL> alter database open;
Database altered.
Set the protection mode to MAXIMUM AVAILABILITYThe Fast-Start Failover can be enabled only if the protection mode is set to MAXIMUM AVAILABILITY.
LogXptMode has to be set to 'SYNC'
DGMGRL> edit database 'db1'
> set property LogXptMode='SYNC';
Property "logxptmode" updated
DGMGRL> edit database 'stdb'
> set property LogXptMode='SYNC';
Property "logxptmode" updated
DGMGRL> edit configuration set protection mode as MaxAvailability;
Succeeded.
DGMGRL> show configuration
Configuration
Name: DRTest
Enabled: YES
Protection Mode: MaxAvailability
Fast-Start Failover: DISABLED
Databases:
db1 - Primary database
stdb - Physical standby database
Current status for "DRTest":
SUCCESS
SQL> conn sys/***@db1 as sysdba
Connected.
SQL> select protection_mode, protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY MAXIMUM AVAILABILITY
SQL> conn sys/***@stdb as sysdba
Connected.
SQL> select protection_mode, protection_level from v$database;
PROTECTION_MODE PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY MAXIMUM AVAILABILITY
SQL>
I ensured that tnsnames.ora are set correctly as well DGMGRL is installed.
Now, since all the prerequisites are met, lets move forward and enable the FSFO.
Before enabling it, make sure each of the databases in the configuration has set a fast start failover target. This is achieved by setting the FastStartFailoverTarget parameter.
DGMGRL> edit database 'db1' set property faststartfailovertarget='stdb';
Property "faststartfailovertarget" updated
DGMGRL> edit database 'stdb' set property faststartfailovertarget='db1';
Property "faststartfailovertarget" updated
Another important parameter that has to be set is FastStartFailoverThreshold. This parameter specifies the amount of time (in seconds) the observers attempts to reconnect to the primary database before starting the fast-start failover to the standby database. The default value is set to 30 seconds. In the example I set this parameter to 120 seconds.
DGMGRL> edit configuration set property FastStartFailoverThreshold=120;
Property "faststartfailoverthreshold" updated
Now lets enable the fast start failover:
DGMGRL> ENABLE FAST_START FAILOVER;
Enabled.
If you check the configuration at this moment you may find the following warning: ORA-16608: one or more databases have warnings.
If you check the status of one of the databases, you can see this warning: ORA-16819: Fast-Start Failover observer not started
DGMGRL> show configuration
Configuration
Name: DRTest
Enabled: YES
Protection Mode: MaxAvailability
Fast-Start Failover: ENABLED
Databases:
db1 - Primary database
stdb - Physical standby database
- Fast-Start Failover target
Current status for "DRTest":
Warning: ORA-16608: one or more databases have warnings
DGMGRL> show database verbose 'db1'
Database
Name: db1
Role: PRIMARY
Enabled: YES
Intended State: ONLINE
Instance(s):
db1
Properties:
InitialConnectIdentifier = 'db1.mydomain.com'
LogXptMode = 'SYNC'
Dependency = ''
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '180'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'auto'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '2'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = 'stdb, db1'
LogFileNameConvert = 'stdb, db1'
FastStartFailoverTarget = 'stdb'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName = 'miki-laptop'
SidName = 'db1'
LocalListenerAddress = '(ADDRESS=(PROTOCOL=tcp)(HOST=miki-laptop)(PORT=1521))'
StandbyArchiveLocation = 'dgsby_db1'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.dbf'
LatestLog = '(monitor)'
TopWaitEvents = '(monitor)'
Current status for "db1":
Warning: ORA-16819: Fast-Start Failover observer not started
So lets start the observer. I will repeat again, the observer should run on a different host, however for the sake of this example it will run on the same machine as the databases.
In order to start the observer one should start DGMGRL utility and login to the data guard configuration.
Once logged in, issue START OBSERVER command. This will start the observer.
Optionally you can set a log file destination while invoking DGMGRL utility and specify name for the observer configuration file (for more information check
Data Guard Command-Line Interface Reference).
Once the observer is started, the control is not returned to the user until the observer is stopped.
DGMGRL for Linux: Version 10.2.0.3.0 - Production
Copyright (c) 2000, 2005, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys/***@db1
Connected.
DGMGRL> start observer;
Observer started
So lets check the configuration now.
$ dgmgrl
DGMGRL for Linux: Version 10.2.0.3.0 - Production
Copyright (c) 2000, 2005, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys/***@db1
Connected.
DGMGRL> show configuration verbose;
Configuration
Name: DRTest
Enabled: YES
Protection Mode: MaxAvailability
Fast-Start Failover: ENABLED
Databases:
db1 - Primary database
stdb - Physical standby database
- Fast-Start Failover target
Fast-Start Failover
Threshold: 120 seconds
Observer: miki-laptop
Current status for "DRTest":
SUCCESS
DGMGRL>
As of this moment my configuration has FSFO enabled.
Now lets test if the FSFO really works.
One should be aware of the conditions that must be satisfied for the observer to attempt FSFO.
The full list of conditions can be found in
Data Guard Broker user guide under 5.5.2.1 What Happens When Fast-Start Failover and the Observer Are Running?
I will issue SHUTDOWN ABORT to the primary database (SHUTDOWN NORMAL/IMMEDIATE/TRANSACTIONAL would not trigger the failover).
Once I crash the database, the status of the configuration will return Error: ORA-16625: cannot reach the database. Be aware that since the primary database is down, the only way to check the configuration is to connect using the standby database credentials.
SQL> conn sys/***@db1 as sysdba
Connected.
SQL> shutdown abort
ORACLE instance shut down.
SQL>
DGMGRL> connect sys/***@stdb
Connected.
DGMGRL> show configuration
Configuration
Name: DRTest
Enabled: YES
Protection Mode: MaxAvailability
Fast-Start Failover: ENABLED
Databases:
db1 - Primary database
stdb - Physical standby database
- Fast-Start Failover target
Current status for "DRTest":
Error: ORA-16625: cannot reach the database
After waiting for two minutes (FSFO threshold was set to 120 seconds), I checked the observer log file and found out that it started the failover.
[W000 08/09 17:58:49.75] Observer started.
18:05:38.31 Saturday, August 09, 2008
Initiating fast-start failover to database "stdb"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "stdb"
18:06:16.82 Saturday, August 09, 2008
DGMGRL> show configuration verbose
Configuration
Name: DRTest
Enabled: YES
Protection Mode: MaxAvailability
Fast-Start Failover: ENABLED
Databases:
db1 - Physical standby database (disabled)
- Fast-Start Failover target
stdb - Primary database
Fast-Start Failover
Threshold: 120 seconds
Observer: miki-laptop
Current status for "DRTest":
Warning: ORA-16608: one or more databases have warnings
DGMGRL>
So the observer started the fast-start failover and has successfully converted the former standby to new primary database.
Once the failover is done the observer will attempt to automatically reinstate the former primary database to new physical standby.
Certain requirements have to be met for the automatic reinstatement to take place. If not, the database can be manually reinstated. Once the reinstatement is successful you can see that the former primary database became new physical standby.
DGMGRL> show configuration
Configuration
Name: DRTest
Enabled: YES
Protection Mode: MaxAvailability
Fast-Start Failover: ENABLED
Databases:
db1 - Physical standby database
- Fast-Start Failover target
stdb - Primary database
Current status for "DRTest":
SUCCESS
DGMGRL>
Finally, you can do switchover to perform role change and to place the configuration in its initial state.
The example above shows that enabling fast-start failover is pretty straight forward. There are many good documents with information how to enable FSFO feature. Some of them are listed below:
Oracle® Data Guard Broker 10g Release 2 (10.2) - 5.5 Fast-Start FailoverWhat Happens When Fast-Start Failover and the Observer Are Running?
7.6 Scenario 5: Enabling Fast-Start Failover and the ObserverMetalink Note#359555.1 IMPLEMENTING FAST-START FAILOVER IN 10GR2 DATAGUARD BROKER ENVIRONMENT
Oracle Maximum Availability Architecture – MAAI hope I didn't miss something.
Cheers!