
ASM internals: tracking down failed ASM reads

On a live customer system, we've encountered repeated incidents of errors such as the following:
WARNING: Read Failed. group:1 disk:3 AU:86753 offset:524288 size:262144
 
Since Oracle doesn't tell us what exactly is failing here, some research was in order. There are a few posts out there about mapping ASM allocation units (AUs) to database extents, but I felt that some of them weren't entirely clear on what is being done, how, and why. This prompted me to do some digging of my own. This is our starting point. We know that:
  • The error happened on RAC instance 1 (since it was logged in the alert log of said instance).
  • The ASM disk group number is 1.
  • The ASM disk number is 3.
  • The AU number is 86753.
  • We can't read that AU.
  • Database version is 11.2.0.4 on Linux.
  • ASM disk group redundancy is external.
We can further tell that the failed read was at byte offset 524288 (512KB) into the AU, and that it was a multi-block read of 32 blocks (262144 / 8192). Thus it was likely a full table scan.

Disclaimer: what follows is undocumented, and the usual disclaimers apply: check with Oracle Support before running any of this against your production system.

In an ASM instance, Oracle exposes the ASM AU map in the fixed table X$KFFXP. We can query it for some additional details, using the information we already have:

[code lang="sql"]
select inst_id, group_kffxp, number_kffxp, pxn_kffxp
  from x$kffxp
 where group_kffxp = 1
   and disk_kffxp  = 3
   and au_kffxp    = 86753;

   INST_ID GROUP_KFFXP NUMBER_KFFXP  PXN_KFFXP
---------- ----------- ------------ ----------
         1           1          287       5526
[/code]

Note: you have to run this in an ASM instance. On a database instance, the table doesn't contain any rows (at least on the version I tested this on). The columns in this table aren't officially documented, but my own testing confirms that the information that can be found on Google is fairly reliable in current versions. What we used here is:
  • GROUP_KFFXP - the ASM disk group number, 1 in our case.
  • DISK_KFFXP - the ASM disk number, 3.
  • AU_KFFXP - the AU number, 86753.
The query gives us the next two pieces of the puzzle that we need:
  • NUMBER_KFFXP - the ASM file number (not to be confused with the Oracle data file number).
  • PXN_KFFXP - the physical extent number in that file.
Armed with this information, we can now determine the details of the file that's experiencing read errors:

[code lang="sql"]
set lines 200 pages 999
col dg for a12
col name for a20
col fname for a40

select t.name,
       substr(f.name, instr(f.name,'/',-1) + 1) as fname,
       a.file_number,
       f.file# as DBFILE#,
       f.bytes/1024/1024 as file_mb
  from v$datafile f, v$tablespace t, v$asm_diskgroup g, v$asm_alias a, v$asm_file af
 where g.name(+) = substr(f.name,2,instr(f.name,'/')-2)
   and a.name(+) = upper(substr(f.name, instr(f.name,'/',-1) + 1))
   and a.file_number = af.file_number
   and a.group_number = af.group_number
   and f.ts# = t.ts#
   and af.file_number = 287
/

NAME                 FNAME                FILE_NUMBER    DBFILE#    FILE_MB
-------------------- -------------------- ----------- ---------- ----------
USERS                users.287.795706011          287          4      11895
[/code]

We can see that the file is part of the USERS tablespace and has a data file ID of 4. Let's double-check our environment:

[code lang="sql"]
select allocation_unit_size from v$asm_diskgroup where group_number=1;

ALLOCATION_UNIT_SIZE
--------------------
             1048576

select block_size from dba_tablespaces where tablespace_name='USERS';

BLOCK_SIZE
----------
      8192
[/code]

Now we have all that we need to get the final piece of the puzzle. We can use the following formula to calculate the position of the extent in the file and, from there, hit DBA_EXTENTS to see what it belongs to.

 [ AU_SIZE ] * [ PXN ] / [ BLOCK_SIZE ]
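As a quick sanity check of the arithmetic: with a 1MB AU and an 8KB block size, each AU holds 128 blocks, so physical extent 5526 corresponds to file block 1048576 * 5526 / 8192 = 707328, which is the block number we are about to look up in DBA_EXTENTS. The same computation in SQL:

[code lang="sql"]
-- 1048576 / 8192 = 128 blocks per AU, so PXN 5526 lands at file block 5526 * 128
select 1048576 * 5526 / 8192 as target_block from dual;

TARGET_BLOCK
------------
      707328
[/code]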
 
In our case, that becomes the following query:

[code lang="sql"]
select owner, segment_name, segment_type
  from dba_extents
 where file_id = 4
   and 1048576 * 5526 / 8192 between block_id and block_id + blocks - 1;

OWNER                          SEGMENT_NAME                    SEGMENT_TYPE
------------------------------ ------------------------------- ------------------
FOO                            CUSTOMER_OLD                    TABLE
[/code]

We can also confirm that the result is correct by attempting to read the segment (note that we are forcing a full scan to make sure we're actually reading the table segment):

[code lang="sql"]
select /*+ full(t) */ count(*) from FOO.CUSTOMER_OLD t
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 741
Session ID: 695 Serial number: 40797
[/code]

And sure enough, we instantly see our familiar error message in the alert log:
Thu Dec 10 04:24:23 2015
 WARNING: Read Failed. group:1 disk:3 AU:86753 offset:524288 size:262144
 ERROR: unrecoverable error ORA-15188 raised in ASM I/O path; terminating process 741 
 
We found the affected segment and can now proceed with the usual recovery scenarios available to us. In this particular case, the table can likely be dropped, as it was a backup copy. Nonetheless, it is quite clear that the underlying disk (disk number 3 in disk group 1) is faulty and must be replaced.

There is one more thing, though, that we need to be mindful of. In order to replace the disk, Oracle has to be able to read all the allocated AUs on that disk as part of the rebalance that is triggered when dropping or adding disks. How do we tell whether there are any other segments that can't be read? We'd have to retrieve a list of all extents located on the disk in question.

Of course, we could simply go for it and let the drop/rebalance operation fail, which would also tell us that there are additional problem areas on that disk. But since this is production, I prefer to be in the know instead of running something blindly. Additionally, you may hit one error during the rebalance, correct it, retry, and then hit another one. Rinse and repeat. Doesn't sound too comforting, does it?

So let's see how we can put that information together. There is one problem we need to solve first: the data we need is not available in the same place:
  1. X$KFFXP is only available on an ASM instance.
  2. DBA_EXTENTS is only available on a database instance.
I opted to go for the external table approach and pull the data out of ASM first, by creating the file /tmp/asm_map.sql with these contents:

[code lang="sql"]
set echo off
set feedback off
set termout off
set pages 0
spool /tmp/asm_map.txt
select x.number_kffxp || ',' || x.pxn_kffxp as data
  from x$kffxp x
 where x.group_kffxp = 1
   and x.disk_kffxp  = 3
   and x.number_kffxp > 255
/
spool off
[/code]

Again, we are specifying the group number from our error message (GROUP_KFFXP=1) and the problematic disk (DISK_KFFXP=3). Execute that script while connected to your ASM instance. Beware: if you have huge LUNs, this may write a lot of data, so you may want to relocate the spool file to an alternate location. And again, please verify with Oracle Support before running this against your production database, as with anything that involves underscore parameters or X$ tables.

Next, switch to the database instance and run the following:

[code lang="sql"]
create directory tmp_asm as '/tmp'
/

create table asm_map
(
  asm_file number,
  asm_pxn  number
)
organization external
(
  type oracle_loader
  default directory tmp_asm
  access parameters
  (
    records delimited by newline
    fields terminated by ','
  )
  location ( 'asm_map.txt' )
)
/
[/code]

Ensure that the data is properly readable:

[code lang="sql"]
select * from asm_map where rownum < 10
/

  ASM_FILE    ASM_PXN
---------- ----------
       256          4
       256          7
       256         21
       256         28
       256         35
       256         49
       256         52
       256         75
       256         88

9 rows selected.
[/code]

Now we can join into v$datafile and dba_extents to get the actual data we're after. Let's first build a list of tablespaces and data files that are stored on this disk:

[code lang="sql"]
col ts_name for a30
col fname for a50
set lines 200 pages 999

select unique t.name as TS_NAME,
       substr(f.name, instr(f.name,'/',-1) + 1) as fname,
       a.file_number,
       f.file# as DBFILE#
  from v$datafile f, v$tablespace t, v$asm_diskgroup g, v$asm_alias a, v$asm_file af,
       ( select distinct asm_file from asm_map ) m
 where g.name(+) = substr(f.name,2,instr(f.name,'/')-2)
   and a.name(+) = upper(substr(f.name, instr(f.name,'/',-1) + 1))
   and a.file_number = af.file_number
   and a.group_number = af.group_number
   and f.ts# = t.ts#
   and af.file_number = m.asm_file
 order by 1,2
/
[/code]

Now let's expand that to also include dba_extents. I am creating a copy of the contents of dba_extents, which is known to often perform poorly, particularly on large databases; without this extra step, the query may take an extremely long time. The copy is especially worthwhile if you want to query the extent data repeatedly, and an exercise like this is a good example of that.

[code lang="sql"]
create table tmp_extents tablespace users
as select * from dba_extents
/
[/code]

And now we're ready to get the list of all segments that would be affected by problems on this one disk. This query gives us a list of everything stored on that disk:

[code lang="sql"]
col ts_name for a30
col obj for a100
col segment_type for a18
set lines 200 pages 999

select unique t.name as TS_NAME,
       e.owner || '.' || e.segment_name as obj,
       e.segment_type,
       a.file_number,
       f.file# as DBFILE#
  from v$datafile f, v$tablespace t, v$asm_diskgroup g, v$asm_alias a, v$asm_file af,
       asm_map m, tmp_extents e
 where g.name(+) = substr(f.name,2,instr(f.name,'/')-2)
   and a.name(+) = upper(substr(f.name, instr(f.name,'/',-1) + 1))
   and a.file_number = af.file_number
   and a.group_number = af.group_number
   and f.ts# = t.ts#
   and af.file_number = m.asm_file
   and f.file# = e.file_id
   and t.name = e.tablespace_name
   and g.allocation_unit_size * m.asm_pxn / f.block_size between e.block_id and e.block_id + e.blocks - 1
 order by 1,3,2
/
[/code]

With this information, you can proceed to verify whether any other segments exist that are unreadable because they sit on defective sectors:
  • Tables can be full scanned.
  • Indexes can either be rebuilt online, or read with a fast full scan plan.
  • Lobs can be read with a small PL/SQL block (see the sketch after this list).
  • Clusters should be okay as well if the contained tables are scanned, as that will read the respective blocks.
  • Partitioned tables and indexes can be treated analogously to their non-partitioned counterparts.
  • If undo segments are affected and can't be read, you may want to involve Oracle support at this point.
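To illustrate what those checks could look like in practice, here is a minimal sketch. The owner, table, index, and LOB column names below (FOO.CUSTOMER, CUSTOMER_IDX, FOO.DOCUMENTS, DOC_BLOB) are placeholders for whatever the query above reports; the hint forces Oracle to physically read the index segment, and the PL/SQL loop reads every LOB from start to finish:

[code lang="sql"]
-- force a fast full scan of an index (table and index names are placeholders)
select /*+ index_ffs(t CUSTOMER_IDX) */ count(*) from FOO.CUSTOMER t;

-- read every LOB in a table end to end (table/column names are placeholders)
-- for CLOB columns, use a varchar2(32767) buffer instead of raw
declare
  l_buf    raw(32767);
  l_len    integer;
  l_offset integer;
  l_amount integer;
begin
  for r in (select doc_blob from FOO.DOCUMENTS) loop
    if r.doc_blob is not null then
      l_len    := dbms_lob.getlength(r.doc_blob);
      l_offset := 1;
      while l_offset <= l_len loop
        l_amount := least(32767, l_len - l_offset + 1);
        l_buf    := dbms_lob.substr(r.doc_blob, l_amount, l_offset);
        l_offset := l_offset + l_amount;
      end loop;
    end if;
  end loop;
end;
/
[/code]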
By doing that, you can ensure that any potential problems are detected before applications or end users are affected by them. And if you don't detect any other problems, you can feel fairly safe that swapping out the disk won't hit you with any unexpected errors.

Once we are done, let's not forget to clean up:

[code lang="sql"]
drop table tmp_extents
/
drop table asm_map
/
drop directory tmp_asm
/
[/code]
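For completeness, once all checks come back clean, the disk swap itself is a plain add/drop against the disk group on the ASM instance. This is only a sketch: the disk group name (DATA), disk names, device path, and rebalance power are hypothetical, and combining the add and drop into one statement means a single rebalance covers both:

[code lang="sql"]
-- run on the ASM instance; names, paths, and power level are examples only
alter diskgroup DATA
  add  disk '/dev/oracleasm/disks/DATA_0007' name DATA_0007
  drop disk DATA_0003
  rebalance power 4;

-- watch the rebalance progress
select group_number, operation, state, power, sofar, est_work, est_minutes
  from v$asm_operation;
[/code]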
