Pythian Blog: Technical Track

Old Dog with New Tricks: Indexing Null Columns in Oracle

I have been an Oracle DBA for quite some time now. Like most humans, when I find something that works well, I tend to use it and use it and use it.

Take, for instance, Oracle Function-Based Indexes. When you need to index a column based on data that is not there, you can create a  Function-Based Index. Oracle will create a pseudocolumn for the results of the Function-Based Indexes. 

As a quick recap on pseudocolumns, there are two types of Oracle psuedocolumns: default pseudocolumns (such as rownum) and those created by Oracle when you create Function-Based Indexes. These “manually created” pseudocolumns will hold a copy of the converted data.  

In order to be able to use Function-Based Indexes, both QUERY_REWRITE_ENABLED and QUERY_REWRITE_INTEGRITY must be set. QUERY_REWRITE_ENABLED will allow Oracle to rewrite the query. and then find the most efficient path using the rewritten query.

QUERY_REWRITE_INTEGRITY determines the degree to which Oracle must enforce query rewriting.

Two common scenarios of when you would want to use a Function-Based Index:

a) Users can enter their name in any format, uppercase, lowercase, or mixed case:
  • Smith
  • SMITH
  • smith

When you search, you want to find every Smith.  So you would use a where clause, such as:

  • upper(lastname) = ‘SMITH’

 

b) Columns are nullable, and you are looking for any records where there is a null in a column. To search this, you would need to use:

  • where mycolumn is null

In both scenarios A and B, Oracle would have to do a full table scan to find the relevant records. To avoid doing full table scans to identify a few records, I would create indexes such as:

  1. create index users_lastname on users_table ( upper (lastname) );
  2. create index find_null on users_table ( nvl ( mycolumn, ‘ISNULL’ );

Oracle would then create a psuedocolumn and create an index based on the pseudocolumn, making queries more efficient. 

So, what is the new trick for this old dog? If I want to easily identify all records with a NULL value in a column, I can use this syntax: 

  • create index find_null on users_table ( mycolumn, 1) ;  

This functionality has actually been available since Oracle 11g. It has been a long time since I have had to create an index to identify the records with null values, and only recently had the need to revisit this scenario. Normally, Oracle will not index columns with null values. However, by adding an extra character to the index (‘1),  Oracle will index the nullable values.

The following example demonstrates the above:

a) Create a procedure to create and load test table (  logic modified from “How to Quickly Create Large Tables for Testing (Doc ID 810826.1”)

create or replace  PROCEDURE TESTPROC
( val1 IN NUMBER DEFAULT 1
, val2 IN NUMBER DEFAULT 1) as
   col1_val varchar2(20);
   col2_val varchar2(20);
   table_exists number(1,0);
BEGIN
-- Check whether the table exists or not
-- If yes, then the table is truncated
select count(*) into table_exists from user_objects where object_name = 'BABETTE_TEST' and object_type='TABLE';
if table_exists=1 then
 execute immediate 'truncate table BABETTE_TEST';
else
  execute immediate 'CREATE TABLE BABETTE_TEST (COL1 VARCHAR2(20), COL2 VARCHAR2(20) )';
end if;


for i in 1..val1 loop
   col1_val := 'A'||i;
   for j in 1..val2 loop
   col2_val := 'B'||j;
    execute immediate 'insert into BABETTE_TEST values(:col1_val,:col2_val )' 
  using col1_val,col2_val;
   end loop;
   commit;
   end loop;
   commit;
END TESTPROC;
/


SQL> exec testproc (100,2000);

b) Set a few rows to null value and query for records with null values. Notice that a full table scan is done :

SQL> update babette_test set col2 = null where rownum < 100;
SQL> exec  DBMS_STATS.GATHER_TABLE_STATS('SYS','BABETTE_TEST');
SQL> set autotrace on

ORC1 (SYS) SQL> update babette_test set col2 = 'GOOD' where col2 is null;

99 rows updated.

Execution Plan
----------------------------------------------------------
Plan hash value: 3281971041

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |              |     1 |     6 |   117   (7)| 00:00:01 |
|   1 |  UPDATE            | BABETTE_TEST |       |       |            |          |
|*  2 |   TABLE ACCESS FULL| BABETTE_TEST |     1 |     6 |   117   (7)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("COL2" IS NULL)
c) Add a function-based index using the new syntax. Verify that a pseudocolumn was added to the table.
SQL> create index babette_test_ix on babette_test (col2, 1) ; 

SQL> select * from dba_ind_expressions where table_name = 'BABETTE_TEST';   
   
INDEX_OWNER  INDEX_NAME         TABLE_OWNER  TABLE_NAME       COLUMN_EXPRESSION    POS
------------ ----------------   ------------ --------------   ------------------   -----
SYS          BABETTE_TEST_IX    SYS          BABETTE_TEST     1          

d) Query for records with null values and notice that an index scan is done.

SQL> update babette_test set col2 = null where rownum < 100;
SQL> exec  DBMS_STATS.GATHER_TABLE_STATS('SYS','BABETTE_TEST');

SQL> update babette_test set col2 = 'GOOD' where col2 is null;

Execution Plan
----------------------------------------------------------
Plan hash value: 4024672828

-------------------------------------------------------------------------------------
| Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT  |                 |    99 |   594 |     3   (0)| 00:00:01 |
|   1 |  UPDATE           | BABETTE_TEST    |       |       |            |          |
|*  2 |   INDEX RANGE SCAN| BABETTE_TEST_IX |    99 |   594 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COL2" IS NULL)

 

No Comments Yet

Let us know what you think

Subscribe by email