Pythian Blog: Technical Track

SNAPSHOT CONTROLFILE Isn't Just a Backup Copy of a Control File Anymore

I recently troubleshot failing RMAN backups in an 11.2.0.2 RAC environment. The backups failed with the error message "ORA-00245: control file backup operation failed". The solution is simple. You may know it already, but starting with version 11.2.0.2, the snapshot control file in a RAC environment must reside on a shared location. The 11gR2 documentation states this clearly: "You can specify a cluster file system or a raw device destination for the location of your snapshot control file. This file is shared across all nodes in the cluster and MUST be accessible by all nodes in the cluster." The second part of this blog post lists the error messages I saw in relation to the issue, along with a simple way to resolve it.

More than just a copy

But let's stop for a minute or two and think about why this change took place. Does it mean that, starting with version 11.2.0.2, a snapshot control file is more than just a copy? While I don't know all the details yet (I would like to do some additional research in this area), the Oracle documentation says: "RMAN creates the snapshot control file so that it has a consistent version of a control file to use when either resynchronizing the recovery catalog or backing up the control file." During my troubleshooting, I noticed that backups sometimes failed right in the middle of data file or archived log backups. This could be explained by the fact that at some point during an RMAN backup, Oracle needs to synchronize the control file with the RMAN catalog database. If there are concurrent operations on the current control file, Oracle tries to use the snapshot copy. If the snapshot copy isn't on shared storage, the operation may fail. While the triggers for this issue are not 100% clear to me, SNAPSHOT CONTROLFILE is clearly not just an additional control file copy anymore (as many of us used to think). I'm not sure when I will find time for additional research myself, but if you have information to share or time to shed some light here, please let me know.

Solution and error messages

The solution is simple. Just reconfigure RMAN to place the snapshot control file on a shared location (e.g., an ASM disk group):

[code]configure snapshot controlfile name to '+DISKGROUP/dbname/controlfile/snapcf_dbname.f';[/code]

I observed the following error messages related to the issue:
## alert.log
 Thu Jun 28 01:33:57 2012
 Control file backup creation failed.
 Backup target file size found to be zero.
 Errors in file /d01/DB/11.2.0/admin/DB_host/diag/rdbms/DB/DB2/trace/DB2_arc2_13590.trc:
 ORA-27037: unable to obtain file status
 Linux-x86_64 Error: 2: No such file or directory
 Additional information: 3
 Thu Jun 28 01:37:34 2012
 
 ## trace file
 *** 2012-06-28 01:33:57.986
 Linux-x86_64 Error: 2: No such file or directory
 Additional information: 3
 Redo shipping client performing standby login
 
 ## RMAN log files
 channel t1: finished piece 1 at 2012-06-27:23:21:30
 piece handle=DB_1830_1_p6nekdq4_1_1.incr tag=HOST-DB1-INCRDB comment=API Version 2.0,MMS Version 5.0.0.0
 channel t1: backup set complete, elapsed time: 00:02:57
 released channel: t1
 released channel: t2
 RMAN-00571: ===========================================================
 RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
 RMAN-00571: ===========================================================
 RMAN-03002: failure of backup plus archivelog command at 06/27/2012 23:21:30
 
 RMAN-03009: failure of backup command on t2 channel at 06/27/2012 23:19:15
 ORA-00245: control file backup operation failed
 
 archived log file name=+ARCH/DB/archivelog/2012_06_19/thread_2_seq_434.487.786331921 RECID=1129 STAMP=786331922
 archived log file name=+ARCH/DB/archivelog/2012_06_19/thread_2_seq_435.484.786339421 RECID=1133 STAMP=786339423
 archived log file name=+ARCH/DB/archivelog/2012_06_19/thread_1_seq_398.480.786340595 RECID=1137 STAMP=786340595
 archived log file name=+ARCH/DB/archivelog/2012_06_19/thread_2_seq_436.483.786340593 RECID=1135 STAMP=786340593
 RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process
 archived log file name=+ARCH/DB/archivelog/2012_06_19/thread_2_seq_437.478.786340749 thread=2 sequence=437
 archived log file name=+ARCH/DB/archivelog/2012_06_19/thread_1_seq_399.479.786340749 RECID=1139 STAMP=786340748
 Finished backup at 2012-06-19:04:41:18
 released channel: t1
 released channel: t2
 RMAN-00571: ===========================================================
 RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
 RMAN-00571: ===========================================================
 RMAN-03008: error while performing automatic resync of recovery catalog
 ORA-00245: control file backup operation failed
 
 Starting backup at 2012-06-25:23:22:08
 current log archived
 released channel: t1
 released channel: t2
 RMAN-00571: ===========================================================
 RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
 RMAN-00571: ===========================================================
 RMAN-03002: failure of backup plus archivelog command at 06/25/2012 23:22:17
 RMAN-03014: implicit resync of recovery catalog failed
 RMAN-03009: failure of full resync command on default channel at 06/25/2012 23:22:17
 ORA-00245: control file backup operation failed
 
 archived log file name=+ARCH/DB/archivelog/2012_06_27/thread_2_seq_588.564.787019667 RECID=1721 STAMP=787019669
 archived log file name=+ARCH/DB/archivelog/2012_06_27/thread_2_seq_589.571.787023419 RECID=1725 STAMP=787023421
 Finished backup at 2012-06-27:04:04:34
 
 Starting backup at 2012-06-27:04:04:41
 channel t1: starting full datafile backup set
 channel t1: specifying datafile(s) in backup set
 released channel: t1
 released channel: t2
 RMAN-00571: ===========================================================
 RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
 RMAN-00571: ===========================================================
 RMAN-03009: failure of backup command on t1 channel at 06/27/2012 04:04:46
 ORA-00245: control file backup operation failed
 
 ## OERR description
 oerr ora 00245
 00245, 00000, "control file backup operation failed"
 // *Cause: Failed to create a control file backup because some process
 // signaled an error during backup creation.
 // *Action: Check alert files for further information. This usually happens
 // because some process could not access the backup file during
 // backup creation. Any process of any instance that starts a read/write
 // control file transaction must have an access to the backup control file
 // during backup creation.
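
To apply the fix and check that it took effect, something along the following lines could be run from an RMAN session connected to the target database. This is a minimal sketch, not a tested procedure: the disk group and file name are the same placeholders used in the solution above, and the final backup assumes a usable default channel is configured (the failing jobs shown earlier allocated their own channels t1 and t2).

```rman
# Show where the snapshot control file currently lives
SHOW SNAPSHOT CONTROLFILE NAME;

# Point it at shared storage (placeholder path: substitute your own
# disk group and database name)
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '+DISKGROUP/dbname/controlfile/snapcf_dbname.f';

# Confirm the new setting
SHOW SNAPSHOT CONTROLFILE NAME;

# Run a control file backup, which should exercise the snapshot
# control file path that raised ORA-00245
BACKUP CURRENT CONTROLFILE;
```

Since RMAN keeps its persistent configuration in the target database's control file, the setting should only need to be changed once for the whole cluster rather than per instance.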
