
Analyze index validate structure - the dark side

Recently a co-worker wanted to discuss a problem he had encountered after upgrading a database. The upgrade plan included steps to verify object integrity; this was being done with analyze table <tablename> validate structure cascade. All was fine until one particular table was analyzed; suddenly the process appeared to hang. The job was killed, and separate commands were created to analyze each object individually. That went well up until one of the last indexes was reached.

Me: How long has it been running?

Coworker: Three days.

Yes, you read that correctly: it had been running for three days. My friend ran a 10046 trace to see what the process was doing; nearly all the work was 'db file sequential read' on the table. At this point I suspected the problem was related to the clustering_factor of the index in question.

The analyze process for an index verifies each row in the index. If the table is well ordered relative to the index, the number of blocks read from the table will be similar to the number of blocks making up the table. If however the table is not well ordered relative to the columns in the index, the number of blocks read from the table can be many times the total number of blocks actually in the table.

Consider for a moment that we have rows with an ID of 1, 2, 3, 4 and 5, and that our index is created on the ID column. If these rows are stored in order in the table, it is very likely they will all be in the same block, and a single block read will fetch all of them. If however the rows are stored in some random order, a separate block read may be required for each lookup.
ID    Block Number
 1              22
 2              75
 3              16
 4               2
 5             104
In this case five separate blocks must be read to retrieve these rows. Some minutes later, in the course of walking the index, these rows must also be read:
     ID    Block Number
1048576              22
1048577              75
1048578              16
1048579               2
1048580             104
The blocks where these rows reside are the same blocks as in the earlier example. The problem, of course, is that these blocks have quite likely been aged out of the cache by this time and must be read from disk again. Now imagine performing this for millions of rows: with a poor clustering factor, the analyze command on an index can take quite some time to complete.

This seemed worthy of a test, so we could get a better idea of just how bad this issue might be. The test was run with 1E7 rows. The SQL shown below creates the test data; you can simply change the value of level_2 to 1e3 to reduce the total rows to 1E6, or even smaller if you like.
[sourcecode language="sql" padlinenumbers="true"] -- keep this table small and the rows easily identifiable -- or not... -- 1e3 x 1e4 = 1e7 def level_1=1e3 def level_2=1e4 drop table validate_me purge; create table validate_me tablespace alloctest_a -- EXTENT MANAGEMENT LOCAL AUTOALLOCATE SEGMENT SPACE MANAGEMENT AUTO --alloctest_m -- EXTENT MANAGEMENT LOCAL AUTOALLOCATE SEGMENT SPACE MANAGEMENT MANUAL --alloctest_u -- EXTENT MANAGEMENT LOCAL UNIFORM SIZE 65536 SEGMENT SPACE MANAGEMENT AUTO pctfree 0 as select -- for a good clustering factor --id -- -- for a bad clustering factor floor(dbms_random.value(1,1e6)) id , substr('ABCDEFGHIJKLMNOPQRSTUVWXYZ',mod(id,10),15) search_data , to_char(id,'99') || '-' || rpad('x',100,'x') padded_data from ( select rownum id from ( select null from dual connect by level <= &level_1 ) a, ( select null from dual connect by level <= &level_2 ) b ) / create index validate_me_idx1 on validate_me(id,search_data); exec dbms_stats.gather_table_stats(user,'VALIDATE_ME',method_opt => 'for all columns size 1') [/sourcecode]
Let's see just what the clustering factor is for this index. The following script, cluster-factor.sql, will get this information for us.
[sourcecode language="sql"] col v_tablename new_value v_tablename noprint col v_owner new_value v_owner noprint col table_name format a20 head 'TABLE NAME' col index_name format a20 head 'INDEX NAME' col index_rows format 9,999,999,999 head 'INDEX ROWS' col table_rows format 9,999,999,999 head 'TABLE ROWS' col clustering_factor format 9,999,999,999 head 'CLUSTERING|FACTOR' col leaf_blocks format 99,999,999 head 'LEAF|BLOCKS' col table_blocks format 99,999,999 head 'TABLE|BLOCKS' prompt prompt Owner: prompt set term off feed off verify off select upper('&1') v_owner from dual; set term on feed on prompt prompt Table: prompt set term off feed off verify off select upper('&2') v_tablename from dual; set term on feed on select t.table_name , t.num_rows table_rows , t.blocks table_blocks , i.index_name , t.num_rows index_rows , i.leaf_blocks , clustering_factor from all_tables t join all_indexes i on i.table_owner = t.owner and i.table_name = t.table_name where t.owner = '&v_owner' and t.table_name = '&v_tablename' / undef 1 2 [/sourcecode]
  Output from the script:  
[sourcecode language="sql"] SQL> @cluster-factor jkstill validate_me Owner: Table: TABLE LEAF CLUSTERING TABLE NAME TABLE ROWS BLOCKS INDEX NAME INDEX ROWS BLOCKS FACTOR -------------------- -------------- ----------- -------------------- -------------- ----------- -------------- VALIDATE_ME 10,000,000 164,587 VALIDATE_ME_IDX1 10,000,000 45,346 10,160,089 1 row selected. Elapsed: 00:00:00.05 [/sourcecode]
  On my test system creating the table for 1E7 rows required about 2 minutes and 15 seconds, while creating the index took 28 seconds. You may be surprised at just how long it takes to analyze that index.  
[sourcecode language="sql"] SQL> analyze index jkstill.validate_me_idx1 validate structure online; Index analyzed. Elapsed: 00:46:06.49 [/sourcecode]
With the trace in place, there is a record of how Oracle spent its time on this command. If you are wondering how much of the 46 minutes was consumed by the tracing and writing of the trace file, it was about 6 minutes; nearly all of the rest was 'db file sequential read':
[sourcecode language="bash"] $> grep "WAIT #48004569509552: nam='db file sequential read'"; oravm1_ora_2377_VALIDATE.trc | awk '{ x=x+$8 } END { printf ("%3.2f\n",x/1000000/60) }' 40.74 [/sourcecode]

A Well-Ordered Table

Now let's see how analyze index validate structure performs when the table is well ordered. The table uses the same DDL as in the previous example, but rather than using dbms_random to generate the ID column, the rows are loaded in ID order. This is done by uncommenting id in the DDL and commenting out the call to dbms_random.
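For reference, this is the only part of the CTAS select list that changes; the rest of the DDL is identical:

[sourcecode language="sql"]
-- for a good clustering factor
id
--
-- for a bad clustering factor
--floor(dbms_random.value(1,1e6)) id
[/sourcecode]

With the table rebuilt this way, the analyze was repeated: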
[sourcecode language="sql"] SQL> analyze index jkstill.validate_me_idx1 validate structure online; Index analyzed. Elapsed: 00:01:40.53 [/sourcecode]
That was a lot faster: 1 minute and 40 seconds, whereas the same command previously ran for about 40 minutes (46 minutes elapsed, roughly 6 of which were tracing overhead). Using some simple command line tools, we can see how many times each block was visited. First, find the cursor for the analyze command and check whether that cursor number was reused later in the session:
[sourcecode language="bash"] $> grep -B1 '^analyze index' oravm1_ora_19987_VALIDATE.trc PARSING IN CURSOR #47305432305952 len=64 dep=0 uid=90 oct=63 lid=90 tim=1462922977143796 hv=2128321230 ad='b69cfe10' sqlid='318avy9zdr6qf' analyze index jkstill.validate_me_idx1 validate structure online $> grep -nA1 'PARSING IN CURSOR #47305432305952' oravm1_ora_19987_VALIDATE.trc 63:PARSING IN CURSOR #47305432305952 len=64 dep=0 uid=90 oct=63 lid=90 tim=1462922977143796 hv=2128321230 ad='b69cfe10' sqlid='318avy9zdr6qf' 64-analyze index jkstill.validate_me_idx1 validate structure online -- 276105:PARSING IN CURSOR #47305432305952 len=55 dep=0 uid=90 oct=42 lid=90 tim=1462923077576482 hv=2217940283 ad='0' sqlid='06nvwn223659v' 276106-alter session set events '10046 trace name context off' [/sourcecode]
As the cursor number was reused (for the alter session command that turned tracing off), we need to limit which lines of the trace file we consider. A wait line looks like this:

WAIT #47305432305952: nam='db file sequential read' ela= 317 file#=8 block#=632358 blocks=1 obj#=335456 tim=1462923043050233

As the entire table is known to reside in a single file, it is not necessary to check the file#. The following command makes it clear that no block was read more than once during the analyze index validate structure when the table was well ordered relative to the index.
[sourcecode language="bash"] $> tail -n +64 oravm1_ora_19987_VALIDATE.trc| head -n +$((276105-64)) | grep "WAIT #47305432305952: nam='db file sequential read'" | awk '{ print $10 }' | awk -F= '{ print $2 }' | sort | uniq -c | sort -n | tail 1 742993 1 742994 1 742995 1 742996 1 742997 1 742998 1 742999 1 743000 1 743001 1 743002 [/sourcecode]
That command line may look a little daunting, but it is not difficult when each piece is considered separately.

From the grep commands used to find the cursor, we know the cursor of interest first appears at line 64 in the trace file, so start reading there:

tail -n +64 oravm1_ora_19987_VALIDATE.trc

The cursor number was reused at line 276105, so output only the lines up to that point in the file:

head -n +$((276105-64))

The interesting information in this case is the 'db file sequential read' waits on the cursor of interest:

grep "WAIT #47305432305952: nam='db file sequential read'"

awk is used to output the block#=N portion of each line:

awk '{ print $10 }'

awk is used again, this time to split block#=N at the '=' and output only the block number (the cut command could have been used here as well, e.g. cut -d= -f2):

awk -F= '{ print $2 }'

Sort the block numbers:

sort

Use the uniq command to get a count of how many times each value appears in the output:

uniq -c

Use sort -n to sort the output from uniq; if there are any counts greater than 1, they will appear at the end of the output:

sort -n

Finally, pipe the output through tail, as we only care whether any block was read more than once:

tail
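As an aside, the middle of that pipeline can be collapsed into a single awk invocation. This is just a sketch, not how the analysis was originally done; it assumes the same trace file, line range, and cursor number, and uses '.' in the regular expression to stand in for the quote characters around the event name:

[sourcecode language="bash"]
# hypothetical one-awk version: match the wait lines for this cursor,
# split each line at 'block#=' and print just the block number
tail -n +64 oravm1_ora_19987_VALIDATE.trc | head -n $((276105-64)) |
   awk -F'block#=' '/WAIT #47305432305952: nam=.db file sequential read./ { split($2,a," "); print a[1] }' |
   sort | uniq -c | sort -n | tail
[/sourcecode]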
[sourcecode language="bash"] [/sourcecode]
[sourcecode language="bash"] $> grep -B1 '^analyze index' oravm1_ora_2377_VALIDATE.trc PARSING IN CURSOR #48004569509552 len=64 dep=0 uid=90 oct=63 lid=90 tim=1462547433220254 hv=2128321230 ad='aad620f0' sqlid='318avy9zdr6qf' analyze index jkstill.validate_me_idx1 validate structure online $> grep -nA1 'PARSING IN CURSOR #48004569509552' oravm1_ora_2377_VALIDATE.trc 51:PARSING IN CURSOR #48004569509552 len=64 dep=0 uid=90 oct=63 lid=90 tim=1462547433220254 hv=2128321230 ad='aad620f0' sqlid='318avy9zdr6qf' 52-analyze index jkstill.validate_me_idx1 validate structure online -- 6076836:PARSING IN CURSOR #48004569509552 len=55 dep=0 uid=90 oct=42 lid=90 tim=1462550199668869 hv=2217940283 ad='0' sqlid='06nvwn223659v' 6076837-alter session set events '10046 trace name context off' [/sourcecode]
The top 30 most active blocks were each read 53 or more times when the table was not well ordered in relation to the index:
[sourcecode language="bash"] $> tail -n +51 oravm1_ora_2377_VALIDATE.trc | head -n +$((6076836-51)) | grep "WAIT #48004569509552: nam='db file sequential read'" | awk '{ print $10 }' | awk -F= '{ print $2 }' | sort | uniq -c | sort -n | tail -30 53 599927 53 612399 53 613340 53 633506 53 640409 53 644099 53 649054 53 659198 53 659620 53 662600 53 669176 53 678119 53 682177 53 683409 54 533294 54 533624 54 537977 54 549041 54 550178 54 563206 54 568045 54 590132 54 594809 54 635330 55 523616 55 530064 55 532693 55 626066 55 638284 55 680250 [/sourcecode]
 

Use RMAN

There is a feature of RMAN that allows checking for logical and physical corruption of an Oracle database: backup check logical validate database. This command does not actually create a backup; it just reads the database looking for corrupt blocks. Following is an (edited) execution of this command on the same database where the analyze index commands were run, including a portion of the block corruption report.
[sourcecode language="sql" padlinenumbers="true"] RMAN> backup check logical validate database; 2> Starting backup at 06-MAY-16 allocated channel: ORA_DISK_1 channel ORA_DISK_1: SID=29 instance=oravm1 device type=DISK channel ORA_DISK_1: starting full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set input datafile file number=00008 name=+DATA/oravm/datafile/alloctest_a.273.789580415 input datafile file number=00009 name=+DATA/oravm/datafile/alloctest_u.272.789582305 input datafile file number=00024 name=+DATA/oravm/datafile/swingbench.375.821472595 input datafile file number=00023 name=+DATA/oravm/datafile/swingbench.374.821472577 input datafile file number=00019 name=+DATA/oravm/datafile/bh08.281.778786819 input datafile file number=00002 name=+DATA/oravm/datafile/sysaux.257.770316147 input datafile file number=00004 name=+DATA/oravm/datafile/users.259.770316149 input datafile file number=00001 name=+DATA/oravm/datafile/system.256.770316143 input datafile file number=00011 name=+DATA/oravm/datafile/alloctest_m.270.801310167 input datafile file number=00021 name=+DATA/oravm/datafile/ggs_data.317.820313833 input datafile file number=00006 name=+DATA/oravm/datafile/undotbs2.265.770316553 input datafile file number=00026 name=+DATA/oravm/datafile/undotbs1a.667.850134899 input datafile file number=00005 name=+DATA/oravm/datafile/example.264.770316313 input datafile file number=00014 name=+DATA/oravm/datafile/bh03.276.778786795 input datafile file number=00003 name=+DATA/oravm/datafile/rcat.258.861110361 input datafile file number=00012 name=+DATA/oravm/datafile/bh01.274.778786785 input datafile file number=00013 name=+DATA/oravm/datafile/bh02.275.778786791 input datafile file number=00022 name=+DATA/oravm/datafile/ccdata.379.821460707 input datafile file number=00007 name=+DATA/oravm/datafile/hdrtest.269.771846069 input datafile file number=00010 name=+DATA/oravm/datafile/users.271.790861829 input datafile file number=00015 name=+DATA/oravm/datafile/bh04.277.778786801 input datafile file number=00016 name=+DATA/oravm/datafile/bh05.278.778786805 input datafile file number=00017 name=+DATA/oravm/datafile/bh06.279.778786809 input datafile file number=00018 name=+DATA/oravm/datafile/bh07.280.778786815 input datafile file number=00020 name=+DATA/oravm/datafile/bh_legacy.282.778787059 input datafile file number=00025 name=+DATA/oravm/datafile/baseline_dat.681.821717827 input datafile file number=00027 name=+DATA/oravm/datafile/sqlt.668.867171675 input datafile file number=00028 name=+DATA/oravm/datafile/bh05.670.878914399 channel ORA_DISK_1: backup set complete, elapsed time: 00:25:27 List of Datafiles ================= File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 1 OK 0 75632 256074 375655477 File Name: +DATA/oravm/datafile/system.256.770316143 Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 158478 Index 0 17160 Other 0 4730 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 2 OK 0 36332 394240 375655476 File Name: +DATA/oravm/datafile/sysaux.257.770316147 Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 170007 Index 0 138603 Other 0 49298 [/sourcecode]
As shown in the report, only about 25 minutes were required to check the entire database for physically or logically corrupt blocks, as opposed to the 40 minutes needed for analyze index validate structure on a single index. While the RMAN corruption check is not the same as the check performed by analyze index validate structure, it is a test that can be completed in a much more timely manner, particularly if some indexes are both large and have a high clustering factor.
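Should the validate flag any problems, the affected blocks are recorded in v$database_block_corruption and can be queried afterward; and if only one tablespace or datafile is suspect, the check can be narrowed with, for example, backup validate check logical datafile 8. A quick sketch of the follow-up query:

[sourcecode language="sql"]
-- blocks flagged by the most recent RMAN validate, if any
select file#, block#, blocks, corruption_type
from v$database_block_corruption
order by file#, block#;
[/sourcecode]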

Rebuild the Index?

If you have strong suspicions that a large index with an unfavorable clustering factor has corrupt blocks, it may be more expedient to simply rebuild the index. If the database is on Oracle Enterprise Edition, the rebuild can also be done with the ONLINE option. Consider again the index on the test table with 1E7 rows: creating the index required 28 seconds, while validating its structure required 40 minutes.
[sourcecode language="bash"] SQL> alter index validate_me_idx1 rebuild online; Index altered. Elapsed: 00:00:59.88 [/sourcecode]
The conclusion is quite clear: the use of analyze index validate structure needs to be carefully considered when it is contemplated for large indexes. The command can be very resource intensive and take quite some time to complete. It is worthwhile to consider alternatives that may be much less resource intensive and time consuming.
