Oracle file extent map, the old-fashioned way

Have you ever wanted to know exactly where segments are placed in a datafile? Have you ever wondered just how fragmented a tablespace is? How about how much space you could reclaim if the last segment of a datafile were moved out of the tablespace? These questions, and many more, can be easily answered with a detailed extent map of the datafiles.

Yes, I know, this subject is a rather old one, and a few different solutions have been provided by several professionals, including the famous Tom Kyte. But none of the answers I found did exactly what I wanted, and therefore, I chose to write my own solution. OEM does provide this, but for a price — the Tablespace Map is part of the Oracle Tuning Pack — and I like the free stuff and the extra flexibility I have using queries.

As all of you know, Oracle has been providing this extent map via the dba_extents view forever (or at least since v6, the first version I worked with). The problem is that today's super-large databases tend to have segments with thousands of extents. These extents are either of fixed size (uniform-sized LMTs, DMTs with pctincrease=0) or of variable sizes (DMTs, system-managed LMTs). In both cases, the dba_extents view provides one entry per extent, but many of these extents are in fact contiguous. To answer the questions I began with, a view of this extent list aggregated by segment is much more useful.
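
To see what the raw view gives you, here is a quick illustrative aside of my own (the file number is arbitrary) that flags extents immediately followed by another extent of the same segment:

-- Flag raw extents that are physically adjacent to the next
-- extent of the same segment within one datafile.
select owner, segment_name, block_id, blocks,
       case when block_id + blocks =
                 lead(block_id) over (order by block_id)
             and segment_name =
                 lead(segment_name) over (order by block_id)
            then 'contiguous with next'
       end note
  from dba_extents
 where file_id = 7   -- arbitrary example file
 order by block_id;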

To get a fully aggregated map, one needs to union the contents of dba_extents and dba_free_space. Piece of cake, right? No! The basic hierarchical query is relatively simple to understand, but getting Oracle to run it efficiently is something else. No matter how I wrote the query, with or without hints, I was simply not able to get it to produce the aggregated map within an acceptable amount of time, or without putting an insane load on the server.

The main problem is the underlying complexity of the dba_extents and dba_free_space views. The solution is simple: copy the data you want into a plain table (I used global temporary tables), and the case is solved. The queries that produce the map can then run in just a few minutes, if not seconds.

Here is what I wrote to produce the aggregated extent map. From this map, you can run all kinds of interesting and useful queries. I’ve also provided some of those to get you started.

The base script:

--
-- Create the global temporary tables.
--
create global temporary table seg_list
(file_id number,
 block_id number,
 owner varchar2(30),
 segment_name varchar2(30),
 segment_type varchar2(30),
 blocks number,
 constraint seg_list_pk primary key
   (file_id, block_id, owner, segment_name, segment_type))
on commit preserve rows  -- keep rows across commits, until logoff
;
create global temporary table aggregated_extent_map
(file_id number,
 root_block_id number,
 owner varchar2(30),
 segment_name varchar2(30),
 segment_type varchar2(30),
 total_blocks number)
on commit preserve rows  -- keep rows across commits, until logoff
--
-- Load the base extent data 
-- from dba_extents and dba_free_space
--
insert into seg_list
select file_id,block_id,owner,
       segment_name,segment_type,blocks
  from dba_extents
-- this is optional, you can load all your tablespaces
-- where tablespace_name = 'MY_TS'  
union all
select file_id,block_id,
       'free space',
       'free space',
       'free space',
       blocks 
  from dba_free_space
-- this is optional, you can load all your tablespaces
-- where tablespace_name = 'MY_TS'
;
--
-- Generate the aggregated extent map using a hierarchical query.
-- Be patient: this will take a while, depending on the number
-- of extents to process and your system's speed. It took
-- 5 minutes 2.69 seconds on a dev server to process 18033
-- extents, generating 11848 aggregated extents.
--
insert into aggregated_extent_map
select file_id, root, owner, segment_name, segment_type,
       sum(blocks)
  from
(
select owner, segment_name, segment_type, file_id,
       blocks, block_id,
       substr(sys_connect_by_path(block_id,'/'),2,
       decode(instr(sys_connect_by_path(block_id,'/'),'/',2)
       -2,-2,length(sys_connect_by_path(block_id,'/')),
       instr(sys_connect_by_path(block_id,'/'),'/',2)-2))
       root
  from seg_list a
 start with (file_id, block_id) in
            (select file_id, block_id
              from seg_list
             where (file_id,block_id) in 
                   (select file_id, min(block_id)
                      from seg_list group by file_id)
            union all
            select b.file_id, b.block_id
              from seg_list a, seg_list b
             where b.block_id = a.block_id + a.blocks
               and a.file_id = b.file_id
               and (a.owner <> b.owner or 
                    a.segment_name <> b.segment_name)
           )
connect by owner = prior owner
       and segment_name = prior segment_name
       and file_id = prior file_id
       and block_id = prior block_id + prior blocks
) c
group by owner, segment_name, segment_type, file_id, root
;

-- >>> run all your queries here...
-- Don't forget to re-populate the temporary tables if you
-- log out or roll back.
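
If you want to double-check the load, here is a quick sanity query of my own (not part of the script above): the aggregated map should cover exactly the same number of blocks as the raw extent list, with fewer rows.

-- Compare the raw extent list with the aggregated map.
select (select count(*)    from seg_list)              raw_extents,
       (select count(*)    from aggregated_extent_map) aggr_extents,
       (select sum(blocks) from seg_list)              raw_blocks,
       (select sum(total_blocks)
          from aggregated_extent_map)                  aggr_blocks
  from dual;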

Here are some sample queries.

Query 1

This query lists the last five aggregated extents for each datafile.

break on file_id skip 1
set linesize 140 pagesize 10000
col file_id for 9999
col top_n noprint
col segment_type for a12
col size_mb for 999999.99
select * from (
select a.file_id, 
       rank() over (partition by a.file_id 
                    order by root_block_id desc) top_n,
       segment_name, 
       segment_type,
       root_block_id,
       total_blocks*(b.bytes/b.blocks)/1048576 size_mb
  from aggregated_extent_map a, dba_data_files b
 where a.file_id = b.file_id
-- use this if you loaded more than one TS in the seg_list
-- and tablespace_name = 'MY_TS'
-- use this to list a single datafile
-- and file_id = 7
) where top_n <= 5
;

Sample output with segment names masked:

FILE_ID SEGMENT_NAME    SEGMENT_TYPE ROOT_BLOCK_ID  SIZE_MB
------- --------------- ------------ -------------  -------
      1 free space      free space           16965   246.94
        IDL_UB2$        TABLE                16901     1.00
        SOURCE$         TABLE                16837     1.00
        free space      free space           16809      .44
        PLAN_TABLE      TABLE                16805      .06

      2 free space      free space          260261    29.44
        _SYSSMU64$      TYPE2 UNDO          260257      .06
        free space      free space          260253      .06
        _SYSSMU44$      TYPE2 UNDO          260249      .06
        free space      free space          242577   276.13

      3 JOB_LOGS_PK     INDEX                13381     1.00
        JOB_MASTERS     TABLE                13317     1.00
        JOB_LOGS        TABLE                13253     1.00
        JOB_MASTERS_PK  INDEX                13189     1.00
        JOB_MASTERS     TABLE                13125     1.00

      4 T1              TABLE               511997      .06
        T2              TABLE               511993      .06
        IDX1            INDEX               511989      .06
        IDX2            INDEX               511981      .13
        T3              TABLE               511973      .13

      5 free space      free space               5    64.00

      6 free space      free space           19465   103.75
        IDX3            INDEX                19401      .50
        T4              TABLE                19241     1.25
        free space      free space           18921     2.50
        T4              TABLE                18889      .25

      7 free space      free space          319493   124.00
        IDX4            INDEX               304133   240.00
        T5              TABLE               301573    40.00
        IDX4            INDEX               298757    44.00
        free space      free space          258821   624.00

      8 free space      free space           40965   320.00
        T6              TABLE                32773   128.00
        T7              TABLE                12293   320.00
        T8              TABLE                    5   192.00
...

The most interesting observation I can make of the first listing is that datafile 7 has a 624MB free space extent in the last position. Should I need to reclaim some or all of that space, I could run a move of table T5 and a rebuild of IDX4 in place, or better yet, to another tablespace. Assuming I move it to another tablespace, I will be able to reclaim over 1GB of disk space by shrinking datafile 7.

Running that query for a single datafile and extracting more aggregated extents might show that even more space could be freed (that was not the case for me, as the sixth-to-last extent was used by another table).
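
For the record, the reorganization I am describing would look something like the sketch below. The target tablespace name, datafile path, and new size are hypothetical, and remember that moving a table marks its indexes unusable until they are rebuilt.

-- Move the segments sitting above the free space (names from listing 1).
alter table t5 move tablespace other_ts;
alter index idx4 rebuild tablespace other_ts;
-- Then shrink the datafile (path and size are hypothetical).
alter database datafile '/u01/oradata/mydb/my_ts_07.dbf' resize 4000m;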

Query 2

This query shows the “real” fragmentation of each datafile.

break on file_id skip 1
compute sum of cnt on file_id
compute sum of blocks on file_id
compute sum of size_mb on file_id
set linesize 140 pagesize 10000
col file_id for 9999
col segment_type for a32
col cnt for 999999
col blocks for 999999999
col size_mb for 999999.99
select a.file_id,
       segment_type,
       count(*) cnt,
       sum(total_blocks) blocks,
       sum(total_blocks*(b.bytes/b.blocks)/1048576) size_mb
  from aggregated_extent_map a, dba_data_files b
 where a.file_id = b.file_id
 group by a.file_id, segment_type
order by file_id, cnt desc
;

Sample output:

FILE_ID SEGMENT_TYPE         CNT     BLOCKS    SIZE_MB
------- ---------------- ------- ---------- ----------
      1 INDEX                679       3084      48.19
        TABLE                621      11540     180.31
        LOBSEGMENT            65        396       6.19
        CLUSTER               63       1468      22.94
        LOBINDEX              51        204       3.19
        TABLE PARTITION       27        108       1.69
        INDEX PARTITION       24         96       1.50
        ROLLBACK               2         28        .44
        free space             2      15832     247.38
        CACHE                  1          4        .06
        NESTED TABLE           1          4        .06
*******                  ------- ---------- ----------
sum                         1536      32764     511.94

      2 TYPE2 UNDO           521      51023     797.23
        free space           120     210076    3282.44
*******                  ------- ---------- ----------
sum                          641     261099    4079.67

      3 TABLE                130       8640     135.00
        INDEX                 73       4800      75.00
*******                  ------- ---------- ----------
sum                          203      13440     210.00

      4 INDEX               2113     147512    2304.88
        TABLE               1614     187228    2925.44
        free space           348     176808    2762.63
        LOBINDEX              28        224       3.50
        LOBSEGMENT            28        224       3.50
*******                  ------- ---------- ----------
sum                         4131     511996    7999.94

      5 free space             1       4096      64.00
*******                  ------- ---------- ----------
sum                            1       4096      64.00

      6 TABLE                124      15456     120.75
        free space            43      16000     125.00
        LOBSEGMENT            11       1184       9.25
        INDEX                  1         64        .50
        LOBINDEX               1         32        .25
*******                  ------- ---------- ----------
sum                          180      32736     255.75

      7 TABLE                 71      44544     696.00
        free space            44     264704    4136.00
        INDEX                  2      18176     284.00
*******                  ------- ---------- ----------
sum                          117     327424    5116.00

      8 TABLE                  3      40960     640.00
        free space             1      20480     320.00
*******                  ------- ---------- ----------
sum                            4      61440     960.00

...

The second listing shows that file 4 is much more fragmented than the other ones, and that the fragmentation is spread over both tables and indexes. Further analysis is required to know whether this is abnormal, as it is possible that file 4 stores lots of very small objects (see listing 3).

We can also see that datafile 5 is totally empty and is very small. Perhaps it can be dropped.
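
On 10gR2 and later, an empty datafile can be dropped directly, as sketched below with a hypothetical tablespace name and path (this does not work on the first or only file of a tablespace; in that case, consider dropping the tablespace itself):

alter tablespace small_ts drop datafile '/u01/oradata/mydb/small_ts_05.dbf';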

Datafile 7 is very interesting. It has 4.1GB free out of 5.1GB. But, if you refer to listing 1, this datafile as it stands can only shrink by 124MB. Some objects would need to be moved and rebuilt to get all that free space sitting at the end of the datafile, so that the datafile can be shrunk significantly.

Query 3

This query shows the number of distinct segments in each datafile.

clear break
col file_id for 9999
col cnt for 999999
select file_id, 
       count(distinct segment_name) cnt
from aggregated_extent_map
group by file_id
;

Sample output:

FILE_ID     CNT
------- -------
      1    1039
      2     361
      3      18
      4     552
      5       1
      6      28
      7      32
      8       4
...

The third listing shows why there are so many extents in datafile 4: it hosts 552 distinct objects. Listing 2 shows that this datafile has a total of 4131 contiguous extents. That's an average of 7.5 contiguous extents per object. Not that bad. A quick look at dba_data_files reveals that this file belongs to the USERS tablespace, and the dba_tablespaces view shows that extent allocation for USERS is system-managed (allocation_type = SYSTEM). System-managed extents start small and grow to a maximum of about 64MB. It is therefore not surprising at all that the datafile has so many extents.
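
The dictionary lookup I just described is something like this simple join (file 4 from the listings above):

-- Which tablespace owns file 4, and how does it manage extents?
select f.file_id, t.tablespace_name,
       t.allocation_type, t.segment_space_management
  from dba_data_files f, dba_tablespaces t
 where f.tablespace_name = t.tablespace_name
   and f.file_id = 4;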

The problem here is that this datafile has 2.76GB of free space spread over 348 chunks, interspersed with 3783 (4131-348) segment extents. In the event that you’d need to release it back to the system, that space is not currently reclaimable without a serious reorganization.

Query 4

This query shows the largest segment, and the first and last aggregated extents per datafile.

break on file_id skip 1
set linesize 140 pagesize 10000
col file_id for 9999
col top_n noprint
col tb head ''
col segment_name head ''
col fe head ''
col le head ''
col ls head ''
col nl newline head ''
col filler1 for a42 head ''
col filler2 for a10 head ''
col size_mb for 999999.99
select aggr.file_id, 
       'total blocks:' tb, 
       '' filler1, 
       file_blocks,
       null nl,
       '      First extent:' fe,
       minext.segment_name,
       aggr.min_rbi,
       minext.total_blocks,
       null nl,
       '      Last  extent:' le,
       maxext.segment_name, 
       aggr.max_rbi, 
       maxext.total_blocks,
       null nl,
       '      largest seg :' ls,
       lg.segment_name,
       '' filler2,
       lg.total_blocks
from (select file_id, min(root_block_id) min_rbi,
             max(root_block_id) max_rbi,
             sum(total_blocks) file_blocks
        from aggregated_extent_map group by file_id) aggr,
     aggregated_extent_map minext,
     aggregated_extent_map maxext,
     (select z.*,
             rank() over (partition by file_id 
                          order by total_blocks desc) top_n
        from (select file_id, segment_name,
                     sum(total_blocks) total_blocks
                from aggregated_extent_map 
               where segment_name <> 'free space'
               group by file_id, segment_name) z
     ) lg
where minext.file_id = aggr.file_id 
  and minext.root_block_id = aggr.min_rbi
  and maxext.file_id = aggr.file_id 
  and maxext.root_block_id = aggr.max_rbi
  and lg.file_id = aggr.file_id
  and lg.top_n = 1
;

Sample output:

FILE_ID                                         FILE_BLOCKS
------- ------------- ------------------------- -----------
                                       MIN_RBI TOTAL_BLOCKS
- ------------------- ------------- ---------- ------------
                                       MAX_RBI TOTAL_BLOCKS
- ------------------- ------------- ---------- ------------
                                               TOTAL_BLOCKS
- ------------------- ------------- ---------- ------------
      1 total blocks:                                 32764
        First extent: SYSTEM                 5            8
        Last  extent: free space         16965        15804
        largest seg : FGA_LOG$                         4608

      2 total blocks:                                261227
        First extent: _SYSSMU1$              6            3
        Last  extent: free space        260261         1884
        largest seg : _SYSSMU266$                     28096

      3 total blocks:                                 13440
        First extent: free space             5          192
        Last  extent: JOB_LOGS_PK        13381           64
        largest seg : JOB_MASTERS                      4032

      4 total blocks:                                511996
        First extent: free space             5            8
        Last  extent: BP                511997            4
        largest seg : HOLD_LOGMINER                   34144

      6 total blocks:                                 32736
        First extent: CHAT_ROOM              9          288
        Last  extent: free space         19401        13344
        largest seg : SMSMESSAGE                      10752

      7 total blocks:                                 31744
        First extent: CHAT_EVENT             5          256
        Last  extent: free space         31493          256
        largest seg : PODCONTACTS                      5632

      8 total blocks:                                 61440
        First extent: PODFILEACL             5        12288
        Last  extent: free space         40965        20480
        largest seg : PODFILEDATA                     20480

...

OK, you get the idea. You can run queries against this aggregated_extent_map table to get information about datafiles, segments, and tablespaces. Heck, if you're up to it, you can probably extract the list of aggregated extents for a tablespace, load it into an Excel sheet, and chart it with nice colors and all, à la OEM.
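
If you want to try the chart, here is one way to export the map from SQL*Plus into a spreadsheet-friendly file (the spool file name and datafile number are arbitrary):

set colsep ','
set pagesize 0 linesize 200 trimspool on
spool extent_map.csv
select file_id, root_block_id, owner,
       segment_name, segment_type, total_blocks
  from aggregated_extent_map
 where file_id = 7
 order by root_block_id;
spool off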

I think this mapping is very useful to anyone who wants to understand where the space is being used within their database, and to see whether maintenance would be beneficial. Remember, space is not cheap: the MBs are, but DBAs' and SAs' time is not, and neither is backup storage capacity, nor the time to back up (and restore) the datafiles.

For maximum performance, keep your databases lean and clean!

Enjoy!
Marc.
