Pythian Blog: Technical Track

World Backup Day - Protect Your Oracle Databases

In recognition of World Backup Day, I will discuss backups as they pertain to Oracle databases.

There are a number of options for backing up, testing backups, and restoring them. In fact, the number of possibilities can be a little overwhelming. Backing up does not necessarily need to be too complex. It is best to start off simple, and then modify the backup procedures as needed. This article is not a tutorial by any means, but just an overview of some of the methods for backing up your database, and testing those backups. For some readers, this may be a refresher, and for those new to Oracle, it is a brief introduction to some features of the Oracle RMAN utility. Making a backup can be as simple as the RMAN command backup database. Most backups, however, are more complex than that. Rather than spend much time on backups though, I actually want to talk about testing backups. Backups that do not work when a restore and recovery attempt is made are the things that nightmares are made of. Making sure the backups are actually useful is something that cannot be overstated.

My interest in backups

First a little background on why I tend to be rather particular about making backups and being certain that a backup can be used to successfully restore a database. The two incidents below are from long ago; pre-Pythian and pre-Oracle (at least for me). The technologies used will make this obvious.
Inventory System
Long ago I was responsible for an inventory system. The fact that this inventory software ran on 8-inch floppy disks will provide clues to a couple of things:
  • it was long ago
  • it could not have been a large inventory (thankfully, as you will see)
One day something happened to the inventory disk. I do not recall just what happened, but I do remember that the entire inventory had to be re-entered. By hand. By me. You would think I would have learned my lesson and start making regular backups. But no. If you have never made backups on 8-inch floppy disks, you do not really understand the meaning of the word 'tedium'. But of course, there was another crash, the disk was trash, and I had to re-enter all of the inventory. Again. By hand. That was powerful lesson. But wait, there's more!
The Orders System
At a somewhat later stage, I was at a different job, working onsite at a client site where they ran our custom software. My job that day was to upgrade the version of Xenix (Google it!) on the client's machine. And of course, before doing that I made a backup. This backup was to tape, and so fairly simple to make. The backup was made via the Unix (Linux not yet invented) tar command, directly to the tape device. Pretty standard stuff for the time. When the backup was complete, I ran 'tar tvf /dev/tape' to get a directory listing of the backup as a means of verification. The command was successful, all the correct files were there. Next steps:
  • format the system drive
  • install later version of Xenix
  • reload backup software and data files
The first two steps were fine. That final, and crucial step, however, was not so good. It seems a colleague from the employer had upgraded the RAM the previous day. One step in that process was missed. Everything appeared to be OK after that. The problem was the memory used by the tape driver was not handled correctly. Every file on that backup was nothing but garbage, except for the header. The headers were fine, but the contents were not. This was not a good day, week, or month. Should I ever seem rather persnickety and nitpicky about backups and restores, there are good reasons for that.

Backups are not that important

I say that only in jest; of course backups are important. But are there some things more important than backups? Restores: restores are very important. Even more important is the recovery, as the database cannot be opened until the recovery step completes. How is recovery different than restore? Recovery is the process that brings the restored files to a point where the database can be opened. Often this point is a particular time or SCN (System Change Number) that is specified at the time the database is recovered. This is referred to as an 'incomplete' recovery. A 'complete recovery' is one where ALL redo from the archive logs is applied to the datafiles, applying all changes to the database. Sure, you need backups to do a restore. But if you never test your backups, you may as well just not make them. A backup that cannot be used when a restore is necessary can make life very difficult indeed. You want to do everything you can to protect your data. There is much more to know about Oracle backups, but for the purpose of discussing the validation of backups, the backup will be kept simple.

Testing Oracle Backups

The best way to test a backup is to restore it. While this is the ideal, it is not always practical, and sometimes not even possible due to the size of the database. Very few companies are willing to spend the money and time needed to have a server available just for testing a restore of a 152-terabyte database. The next best thing is to validate the restores. Here, Oracle has you covered with RMAN methods to validate that backups can indeed be restored.

Backup Validate

First on the list is backup validate. I know, I said this would be about restoring databases. However, the backup validate command can help you determine the condition of your database, spotting physical corruption in the database, and if you choose, it can also locate and record logical discrepancies. The command can be as simple as backup validate check logical database This will take some time, depending on the size of the database, as Oracle will read all required database files. This operation can be performed online and is non-destructive, as no backup files are created, and no files are altered. Read more here: Backup Validate

Make a Backup

The following is the RMAN script used to create a backup that is used for some following commands.
run {
  allocate channel ch1 device type DISK;
  allocate channel ch2 device type DISK;
  backup incremental level=0 tag "RMAN-TEST" format '/mnt/oracle-backups/rac19c01/%d_T%T_db_s%s_p%p_t%t' database filesperset 4 plus archivelog tag "RMAN-TEST";
  alter system switch logfile;
  backup format '/mnt/oracle-backups/rac19c01/%d_T%T_arch_s%s_p%p_t%t' archivelog all filesperset 4 tag "RMAN-TEST";
  backup format '/mnt/oracle-backups/rac19c01/%d_T%T_cf%s_p%p_t%t' current controlfile tag "RMAN-TEST";
 }
 

Restore Preview

Restore preview is fast, very fast. It is fast because it really doesn't do much. The preview simply reads the metadata from the backup catalogue. This is useful just to ensure that the catalogue has captured the files necessary for a restore. This command does not read any files, just the metadata.
RMAN> restore preview database;
 
 Starting restore at 2020-03-02 17:30:50
 using target database control file instead of recovery catalog
 allocated channel: ORA_DISK_1
 channel ORA_DISK_1: SID=300 instance=cdb1 device type=DISK
 
 
 List of Backup Sets
 ===================
 
 
 BS Key Type LV Size Device Type Elapsed Time Completion Time
 ------- ---- -- ---------- ----------- ------------ -------------------
 24 Incr 0 840.32M DISK 00:00:20 2020-03-02 17:14:57
  BP Key: 24 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_db_s33_p1_t1034010877
  List of Datafiles in backup set 24
  File LV Type Ckp SCN Ckp Time Abs Fuz SCN Sparse Name
  ---- -- ---- ---------- ------------------- ----------- ------ ----
  1 0 Incr 80321492 2020-03-02 17:14:37 NO +DATA/CDB/DATAFILE/system.259.1021126503
  4 0 Incr 80321492 2020-03-02 17:14:37 NO +DATA/CDB/DATAFILE/undotbs1.261.1021126575
  9 0 Incr 80321492 2020-03-02 17:14:37 NO +DATA/CDB/DATAFILE/undotbs2.274.1021127203
 
 ...
 
 
 List of Archived Log Copies for database with db_unique_name CDB
 =====================================================================
 
 Key Thrd Seq S Low Time
 ------- ---- ------- - -------------------
 810 1 457 A 2020-03-02 17:14:19
  Name: +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_457.346.1034010983
 
 812 1 458 A 2020-03-02 17:16:23
  Name: +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_458.342.1034010991
 
 813 1 459 A 2020-03-02 17:16:31
  Name: +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_459.340.1034010995
 
 811 2 359 A 2020-03-02 17:14:19
  Name: +DATA/CDB/ARCHIVELOG/2020_03_02/thread_2_seq_359.344.1034010985
 
 814 2 360 A 2020-03-02 17:16:25
  Name: +DATA/CDB/ARCHIVELOG/2020_03_02/thread_2_seq_360.339.1034010997
 
 recovery will be done up to SCN 80321491
 Media recovery start SCN is 80321491
 Recovery must be done beyond SCN 80321782 to clear datafile fuzziness
 Finished restore at 2020-03-02 17:30:52
 
 
At the end of the listing, you will find that media recovery must be done beyond SCN 80321782 for a successful recovery. However, the listing does not include that SCN information. It is there, we just need to look at the archive logs explicitly:
RMAN> list backup of archivelog all;
 
 List of Backup Sets
 ===================
 
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 3 325.40M DISK 00:00:06 2020-03-02 17:05:26
  BP Key: 3 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_db_s11_p1_t1034010320
 
  List of Archived Logs in backup set 3
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  1 451 80282070 2020-03-02 15:00:03 80302686 2020-03-02 16:22:34
  1 452 80302686 2020-03-02 16:22:34 80318226 2020-03-02 17:05:18
  2 355 80302690 2020-03-02 16:22:36 80318223 2020-03-02 17:05:17
 
 ...
 
  List of Archived Logs in backup set 43
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  1 457 80321365 2020-03-02 17:14:19 80321856 2020-03-02 17:16:23
  1 458 80321856 2020-03-02 17:16:23 80321897 2020-03-02 17:16:31
  1 459 80321897 2020-03-02 17:16:31 80321912 2020-03-02 17:16:34
  2 360 80321860 2020-03-02 17:16:25 80321989 2020-03-02 17:16:37
 
 ...
 
We see 80321860, which is well past 80321782 and so we are confident that at least the required files have been recorded in the backup catalogue. Note: By default, RMAN uses a date formate of 'YYYY-MON-DD'. This behavior can be altered by setting the NLS_DATE_FORMAT shell variable:
$ export NLS_DATE_FORMAT='YYYY-MM-DD HH24:MI:SS'
 
 $ rman target /
 ...
 
 

Restore Validate

Does the knowledge that your backup files are recorded in the backup catalogue give you a warm, fuzzy feeling of contentment? It shouldn't. A database cannot be restored with metadata alone. Not knowing that backups have been tested is the kind of thing that should keep a DBA awake at night. Knowing that you need proper rest, let's see another method to test backups, restore validate. RMAN can validate the files used for a restore. This is not just a test to see if the files are available, but a read of all blocks in the backup files to ensure they are valid. The following is an example, using the backup that was made previously:
RMAN> restore validate database;
 
 Starting restore at 2020-03-02 17:52:37
 using channel ORA_DISK_1
 
 channel ORA_DISK_1: starting validation of datafile backup set
 channel ORA_DISK_1: reading from backup piece /mnt/oracle-backups/rac19c01/CDB_T20200302_db_s32_p1_t1034010877
 channel ORA_DISK_1: piece handle=/mnt/oracle-backups/rac19c01/CDB_T20200302_db_s32_p1_t1034010877 tag=RMAN-TEST
 channel ORA_DISK_1: restored backup piece 1
 channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
 
 ...
 
 channel ORA_DISK_1: starting validation of datafile backup set
 channel ORA_DISK_1: reading from backup piece /mnt/oracle-backups/rac19c01/CDB_T20200302_db_s43_p1_t1034010975
 channel ORA_DISK_1: piece handle=/mnt/oracle-backups/rac19c01/CDB_T20200302_db_s43_p1_t1034010975 tag=RMAN-TEST
 channel ORA_DISK_1: restored backup piece 1
 channel ORA_DISK_1: validation complete, elapsed time: 00:00:07
 Finished restore at 2020-03-02 17:54:11
 
This is a small test database; the validation required only about two minutes. It's okay to get some warm and fuzzies now, but don't get too cozy. There are other types of files that must be restored for a recovery to take place. Let's check on the archive log files. We know that all the required archive logs were created after 13:00 today, as the backup was made at about 17:14, and the alert log shows several log switches between 17:00 and 17:14
2020-03-02T17:07:24.248403-08:00
 Thread 1 advanced to log sequence 455 (LGWR switch)
  Current log# 1 seq# 455 mem# 0: +DATA/CDB/ONLINELOG/group_1.265.1021126651
  Current log# 1 seq# 455 mem# 1: +DATA/CDB/ONLINELOG/group_1.268.1021126653
 2020-03-02T17:07:24.320618-08:00
 ARC2 (PID:30093): Archived Log entry 805 added for T-1.S-454 ID 0x7f6be175 LAD:1
 
So now we can validate that archive logs can be read:
RMAN> restore validate archivelog from time "to_date('2020-03-02 17:00:00','yyyy-mm-dd hh24:mi:ss')";
 
 Starting restore at 2020-03-02 18:09:58
 using channel ORA_DISK_1
 
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_452.341.1034010319
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_453.349.1034010441
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_454.352.1034010445
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_455.351.1034010445
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_456.347.1034010859
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_457.346.1034010983
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_458.342.1034010991
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_1_seq_459.340.1034010995
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_2_seq_355.353.1034010317
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_2_seq_356.319.1034010441
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_2_seq_357.350.1034010447
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_2_seq_358.348.1034010859
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_2_seq_359.344.1034010985
 channel ORA_DISK_1: scanning archived log +DATA/CDB/ARCHIVELOG/2020_03_02/thread_2_seq_360.339.1034010997
 Finished restore at 2020-03-02 18:10:12
 
 
Let's also see if the controlfile is there:
RMAN> restore validate controlfile;
 
 Starting restore at 2020-03-02 18:13:09
 using channel ORA_DISK_1
 
 channel ORA_DISK_1: starting validation of datafile backup set
 channel ORA_DISK_1: reading from backup piece +DATA/CDB/AUTOBACKUP/2020_03_02/s_1034011019.338.1034011019
 channel ORA_DISK_1: piece handle=+DATA/CDB/AUTOBACKUP/2020_03_02/s_1034011019.338.1034011019 tag=TAG20200302T171659
 channel ORA_DISK_1: restored backup piece 1
 channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
 Finished restore at 2020-03-02 18:13:11
 
 
Finally, a list of the backup by tag (you do use tags, don't you?) can be used to show that all files are present:
RMAN> list backupset tag "RMAN-TEST";
 
 
 List of Backup Sets
 ===================
 
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 3 325.40M DISK 00:00:06 2020-03-02 17:05:26
  BP Key: 3 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_db_s11_p1_t1034010320
 
 ...
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 44 4.32M DISK 00:00:00 2020-03-02 17:16:54
  BP Key: 44 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_arch_s53_p1_t1034011014
 
  List of Archived Logs in backup set 44
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  1 453 80318226 2020-03-02 17:05:18 80318656 2020-03-02 17:07:20
  1 454 80318656 2020-03-02 17:07:20 80318683 2020-03-02 17:07:24
  2 356 80318223 2020-03-02 17:05:17 80318660 2020-03-02 17:07:21
  2 357 80318660 2020-03-02 17:07:21 80318695 2020-03-02 17:07:27
 
 BS Key Type LV Size Device Type Elapsed Time Completion Time
 ------- ---- -- ---------- ----------- ------------ -------------------
 45 Full 19.08M DISK 00:00:01 2020-03-02 17:16:58
  BP Key: 45 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_cf54_p1_t1034011017
  Control File Included: Ckp SCN: 80322102 Ckp time: 2020-03-02 17:16:57
 
Now you can rest easy, as the data files, archivelogs and controlfile are all present in the backup made earlier. The only way to be more sure of the backup is to use it to restore and recover the database. Docs here: RESTORE

Recover Validate Header

While recover validate may sound like restore validate on steroids, it really isn't. According to the documentation, this command is nearly the same as restore preview. The difference is that when recover validate is used, the file headers are read to ensure they match what is found in the backup catalogue. This would provide a little more assurance than the restore preview, but not as much as the restore validate. The problem here is that in this 19.3 database, recover database validate header is examining only the archive log backups:
MAN> recover database validate header;
 
 Starting recover at 2020-03-02 18:43:11
 using channel ORA_DISK_1
 using channel ORA_DISK_1
 
 
 List of Backup Sets
 ===================
 
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 40 173.87M DISK 00:00:04 2020-03-02 17:16:49
  BP Key: 40 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_arch_s49_p1_t1034011005
 
  List of Archived Logs in backup set 40
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  1 449 80240390 2020-03-02 12:18:51 80261546 2020-03-02 13:39:51
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 38 199.81M DISK 00:00:05 2020-03-02 17:16:43
  BP Key: 38 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_arch_s47_p1_t1034010998
 
  List of Archived Logs in backup set 38
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  2 354 80240394 2020-03-02 12:18:51 80302690 2020-03-02 16:22:36
  1 450 80261546 2020-03-02 13:39:51 80282070 2020-03-02 15:00:03
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 39 174.02M DISK 00:00:05 2020-03-02 17:16:43
  BP Key: 39 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_arch_s48_p1_t1034010998
 
  List of Archived Logs in backup set 39
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  1 451 80282070 2020-03-02 15:00:03 80302686 2020-03-02 16:22:34
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 41 151.39M DISK 00:00:05 2020-03-02 17:16:50
  BP Key: 41 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_arch_s50_p1_t1034011005
 
  List of Archived Logs in backup set 41
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  1 452 80302686 2020-03-02 16:22:34 80318226 2020-03-02 17:05:18
  2 355 80302690 2020-03-02 16:22:36 80318223 2020-03-02 17:05:17
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 44 4.32M DISK 00:00:00 2020-03-02 17:16:54
  BP Key: 44 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_arch_s53_p1_t1034011014
 
  List of Archived Logs in backup set 44
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  2 356 80318223 2020-03-02 17:05:17 80318660 2020-03-02 17:07:21
  1 453 80318226 2020-03-02 17:05:18 80318656 2020-03-02 17:07:20
  1 454 80318656 2020-03-02 17:07:20 80318683 2020-03-02 17:07:24
  2 357 80318660 2020-03-02 17:07:21 80318695 2020-03-02 17:07:27
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 42 15.46M DISK 00:00:00 2020-03-02 17:16:53
  BP Key: 42 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_arch_s51_p1_t1034011013
 
  List of Archived Logs in backup set 42
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  1 455 80318683 2020-03-02 17:07:24 80318691 2020-03-02 17:07:24
  1 456 80318691 2020-03-02 17:07:24 80321365 2020-03-02 17:14:19
  2 358 80318695 2020-03-02 17:07:27 80321362 2020-03-02 17:14:19
  2 359 80321362 2020-03-02 17:14:19 80321860 2020-03-02 17:16:25
 
 BS Key Size Device Type Elapsed Time Completion Time
 ------- ---------- ----------- ------------ -------------------
 43 5.46M DISK 00:00:00 2020-03-02 17:16:53
  BP Key: 43 Status: AVAILABLE Compressed: NO Tag: RMAN-TEST
  Piece Name: /mnt/oracle-backups/rac19c01/CDB_T20200302_arch_s52_p1_t1034011013
 
  List of Archived Logs in backup set 43
  Thrd Seq Low SCN Low Time Next SCN Next Time
  ---- ------- ---------- ------------------- ---------- ---------
  1 457 80321365 2020-03-02 17:14:19 80321856 2020-03-02 17:16:23
  1 458 80321856 2020-03-02 17:16:23 80321897 2020-03-02 17:16:31
  2 360 80321860 2020-03-02 17:16:25 80321989 2020-03-02 17:16:37
  1 459 80321897 2020-03-02 17:16:31 80321912 2020-03-02 17:16:34
 validation succeeded for backup piece
 recovery will be done up to SCN 2163919
 Media recovery start SCN is 2163919
 Recovery must be done beyond SCN 80321989 to clear datafile fuzziness
 Finished recover at 2020-03-02 18:43:13
 
At this time I do not know if this is due to a bug in the code, or a bug in the documentation. Either way, restore validate gives me the kind of assurance I need that my backups can be restored. Learn more about Pythian's Oracle Services.

No Comments Yet

Let us know what you think

Subscribe by email