Pythian Blog: Technical Track

How to Rename a Copied or Cloned ASM Disk Group

ASM is definitely one of the coolest technologies inside the Oracle Database. On the other hand, the ability of the storage arrays to provide a read/write access to a copy or a “snapshot” of its content is something we can easily leverage as an Oracle DBA. For a couple of weeks, I wanted to copy a database stored in an ASM Disk Group with one of those storage technologies and mount it on the same server; unfortunately, this is not supported even with 11.1.0.6. The good news is that I finally overcame all the obstacles to do it in a specific case. This post relates a couple of the tips I used to get to that result.

What You May Want to Know Before You Start

Several people helped me, even if most of them don’t know. Luca Canali did an awesome presentation about ASM at UKOUG. Alexandro Vargas’s Blog is a must-have in the list of your RSS feed if you work with ASM or RAC. Last but not least, Alex Fatkulin rocks when you have a problem with Oracle. I’m very lucky to work 10 metres away from him and other amazing folks at Pythian.

N.B.: What I describe below is among the worst things you can ever do with ASM. You can use it to play around, but never use it with anything other than test data. If you lose something because of me, you’ll be the only one to blame!

I’ve tested the whole process with Oracle 11.1.0.6 on Ubuntu Gutsy Gibbon, based on a previous post I made on this blog. In order to copy the ASM disks, I’ve used the dd command, but it should not make any difference with a storage array command.

The Principle

The only way I’ve found to have the original Disk Group and its copy mounted all together on the same server is to rename the cloned Disk Groups and Disks. Running the copy of the Disk Group on separate ASM instances just failed for me. The error will come up later in this post. Obviously, there is no command for now to rename the Disk Group or Disk, and I have no idea how to generate those Metadata with a C program, though.

Here’s a simple idea: let Oracle create the Disk Group and Group Metadata by creating a brand new Disk Group with its set of Disks that exactly fit the layout of the original Disk Group. Once this is done, you’ll:

  1. Copy the Metadata with a couple of dd commands in a set of files
  2. Turn off ASM rebalancing
  3. Set your database in backup mode
  4. Clone/Copy the disks into the set of disks you used to create the new Disk Group
  5. End the backup mode for the database and turn on ASM rebalancing
  6. Inject the New Disk and Disk Group metadata in the copy/clone of your disk
  7. Mount the copy of the Disk Group with its new name in ASM
  8. End with the database cloning process (i.e.: make sure you’ve deleted the copies of control files, spfiles, redo logs and file alias in the new disk group, and created a new one to mount a new instance with the different file paths)

Step 1: Create a Separate ASM Instance

Why use a separate ASM instance? It may be related to the fact the test case was not running on a supported platform, but once a new the diskgroup created, and even after changing the asm_diskstring parameter and querying v$asm_disk, the devices couldn’t get dropped. The lsof command was always showing ASM DBWR, and LGWR processes using them, and losetup -d was complaining because of that. Anyway, to prevent us from stopping the primary ASM instance, you can always remove one device from the ASM discovery string by changing the asm_diskstring on the primary ASM instance:

SQL> alter system set asm_diskgroup='/dev/loop[0-6]';

Once you have done this, you can create a new ASM instance on the same server. To do that, all you need is a separate parameter file. Here is an example with an instance named +CLONE, and its init.ora file:

*.asm_diskstring='/dev/loop7'
*.diagnostic_dest='/u01/app/oracle'
*.instance_type='asm'
*.large_pool_size=12M
*.db_unique_name='+CLONE'

Note that, if you want to mount a Disk Group with the same name as one Disk Group this is already mounted in the primary ASM instance (i.e. +ASM) in a separate instance, you get an error like the one below:

SQL> alter diskgroup DGBLUJ mount;
alter diskgroup DGBLUJ mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15003: diskgroup "DGBLUJ" already mounted in another lock name space

We tried a couple of things but haven’t been able to overcome this situation. This is just a shame; it would have enabled us to use a copy of the Disk Group without renaming it with the _asmsid parameter on a new database instance.

Step 2: Create a Disk Group in the New ASM Instance

To help you understand the test case, I’ll assume the original Disk Group is named DGBLUJ and it has only one disk in it. Its redundancy is set to external. We’ll create a new Disk Group named DGREDX. You can look at V$ASM_DISKGROUP and V$ASM_DISK to find what your next mount should look like. Once you have done this, create the new Disk Group in the ASM secondary instance. The set of commands to do so, in my case, is the following:

$ . oraenv
ORACLE_SID = [+CLONE] ? +CLONE
The Oracle base for ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1

$ sqlplus / as sysdba

SQL> create diskgroup DGREDX
  external redundancy
  disk '/dev/loop7';

You can then shut down the secondary ASM instance and delete your device from the server (in this test with losetup):

SQL> shutdown immediate;
SQL> exit;

$ losetup -d /dev/loop7

Step 3: Copy the Disk Group Information from DGREDX

I wish I could tell you more about why I caught those data. I may not have spent enough time with strace yet. What I suspected was that the ASM Disk’s and Disk Group’s information was in the file #0 and file #2 headers. Despite the block size being 4k for the ASM files in my case, it worked perfectly when I copied the first 2k of those files. I’m pretty sure many of you know more than me. Assuming the disk.bis file is actually the disk with the DGREDX data, I use the dd commands as below:

$ dd if=disk.bis of=dgredx-file0.2k bs=2048 count=1
1+0 records in
1+0 records out
2048 bytes (2.0 kB) copied, 6.4291e-05 seconds, 31.9 MB/s

$ dd if=disk.bis of=dgredx-file2.2k bs=2048 count=1 skip=1536
1+0 records in
1+0 records out
2048 bytes (2.0 kB) copied, 0.0437474 seconds, 46.8 kB/s

To view the content of this segment, you can use vi as below:

$ vi -b dgredx-file0.2k

[Esc]+:%!xxd
[Esc]+:w dgredx-file0.2k.txt
[Esc]+:q!

If you want to investigate further, I suggest you read the web sites I earlier pointed out. It will give you many useful information about X$KFFXP, X$KFFI, and many more. You may also want to use the amdu utility that comes with 11g. Try amdu -help to start. Here is an example of such commands to look at the content of the Disk Group MetaData:

$ amdu -diskstring /dev/loop0 -dump DGBLUJ -noimage
$ amdu -diskstring /dev/loop7 -dump DGREDX -noimage

Step 4: Make Your Clone

As I mentioned earlier, this is important. You turn off the ASM rebalancing and start the the database BEGIN backup. Once this is done, you’ll have to use your favourite storage commands. In my case, the command is:

$ dd if=disk of=disk.bis bs=4096

Then you can end with the database backup mode and restart the ASM automatic I/O rebalancing.

Step 5: Merge the Meta Data from DGREDX Into the DGBLUJ Copy

Once you’ve got the Disk Group clone, you can merge the metadata you’ve copied from DGREDX into the DGBLUJ clone. You can use the dd command as below:

$ dd if=dgredx-file0.2k of=disk.bis bs=2048 conv=notrunc,nocreat
1+0 records in
1+0 records out
2048 bytes (2.0 kB) copied, 0.0156065 seconds, 131 kB/s

$ dd if=dgredx-file2.2k of=disk.bis bs=2048 seek=1536 conv=notrunc,nocreat
1+0 records in
1+0 records out
2048 bytes (2.0 kB) copied, 0.0488208 seconds, 41.9 kB/s

Step 6: Create the Device on the Server

Once again, this operation is very specific to this test case; to create a device with losetup, use the commands below:

# losetup /dev/loop1 /asmdisks/disk.bis
# losetup -a
/dev/loop0: [0802]:7438421 (/asmdisks/disk)
/dev/loop1: [0802]:7438425 (/asmdisks/disk.bis)

To avoid any problem, check the privileges of the files and of the devices.

Step 7: Mount the Diskgroup in the Primary ASM Instance

Once the disk is copied, patched, and mounted to a system device, you can simply mount it on the primary ASM instance as below:

$ . oraenv
ORACLE_SID = [BLACK] ? +ASM
The Oracle base for ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1

$ sqlplus / as sysdba

SQL> alter diskgroup DGREDX mount;

Diskgroup altered.

SQL> exit;

The safer way to continue is probably to delete the files that pointed to files in the primary Disk Group. Here is an example showing how to delete the controlfile copies:

$ asmcmd

ASMCMD> ls
DGBLUJ/
DGREDX/

ASMCMD> ls DGREDX/BLUJ/CONTROLFILE/*
Current.260.646592475
Current.261.646592475

ASMCMD> rm DGREDX/BLUJ/CONTROLFILE/*
You may delete multiple files and/or directories. 
Are you sure? (y/n) y

ASMCMD> exit

This not only includes the controlfiles but also:

  • the ASM aliases
  • the SPFILE
  • the Redo Logs — because ASM uses OMF and you’ll have to open the clone of the database you’ll create with RESETLOGS. You shouldn’t be able to rename those files anyway.

Step 8: End by Cloning the Database

I guess if you’re come this far, this step should be easy for you.

Anyway, this time more than ever, I’ll be very interested if you go further into this test case.

No Comments Yet

Let us know what you think

Subscribe by email