Pythian Blog: Technical Track

Tackling time troubles - how to use dates correctly in Oracle

For the Oracle Code series of events I have delivered an older but still relevant talk called "Tackling Time Troubles." Half of it is nice-to-know geek trivia and the other half is a few points on how to avoid very common mistakes when dealing with these datatypes in an Oracle database. https://www.youtube.com/watch?v=jB1CvFqUVyY&w=560&h=315 If you don't have the patience to watch the full talk, here is an outtake.

Avoid functions on datetime columns in your predicates - part 1

So this first one is a query against a somewhat large table. I created it by selecting dba_objects into a new table several times and created a regular index on the last_ddl_time column. This results in a "date" column. Unlike MySQL and postgres, "date" in Oracle also stores the hours, minutes and seconds and a common (but far from great) way to find all rows that have a date on a given date is to use the trunc() function which will set all components that are more specific than the day (hours, minutes and seconds) to 00, making the comparison to a single date an equality check.
SQL> SELECT COUNT(*)
 FROM bigtab
 WHERE TRUNC(last_ddl_time) = to_date('20170908','YYYYMMDD');
  COUNT(*)
 ----------
  1772
 Elapsed: 00:00:07.23
 
 Execution Plan
 ----------------------------------------------------------
 Plan hash value: 2140185107
 -----------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 -----------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1 | 8 | 427 (1) | 00:00:01 |
 | 1 | SORT AGGREGATE | | 1 | 8 | | |
 |* 2 | TABLE ACCESS FULL| BIGTAB | 908 | 7264 | 427 (1) | 00:00:01 |
 -----------------------------------------------------------------------------
 Predicate Information (identified by operation id):
 ---------------------------------------------------
  2 - filter(TRUNC(INTERNAL_FUNCTION("LAST_DDL_TIME"))=TO_DATE('
  2017-09-08 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
At this point you are wondering what's wrong with that query. Below I have another one that looks somewhat more complicated and longer and uses " between conditions". It does the same thing as the query above and returns the same results. But it does it 35x faster.
SQL> SELECT COUNT(*)
 FROM bigtab
 WHERE last_ddl_time BETWEEN to_date('20170908','YYYYMMDD')
  AND to_date('20170908-23:59:59','YYYYMMDD-HH24:MI:SS');
  COUNT(*)
 ----------
  1772
 Elapsed: 00:00:00.21
 
 Execution Plan
 ----------------------------------------------------------
 Plan hash value: 397841597
 --------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 --------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1 | 8 | 7 (0) | 00:00:01 |
 | 1 | SORT AGGREGATE | | 1 | 8 | | |
 |* 2 | INDEX RANGE SCAN | BIGTAB_IDX | 228 | 1824 | 7 (0) | 00:00:01 |
 --------------------------------------------------------------------------------
 Predicate Information (identified by operation id):
 ---------------------------------------------------
  2 - access("LAST_DDL_TIME">=TO_DATE(' 2017-09-08 00:00:00',
  'syyyy-mm-dd hh24:mi:ss') AND "LAST_DDL_TIME"<=TO_DATE(' 2017-09-08
  23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
The explanation for the difference here is that Oracle cannot use a regular index if you wrap the indexed column inside of a function. One may argue that a function based index on trunc(last_ddl_time) would solve this but replacing the index would also mean that all queries would now have to use the trunc() function or do without an index. As a bonus you also have the chance for partition pruning if you are not using a function. If you are lucky enough to be running on Exadata the same is true for smart scans and storage indexes.

Avoid functions on datetime columns in your predicates - part 2

For this second example, let's consider that there is no index on the column in the predicate and compare the two syntaxes again. First, the good example with "between" taking about 5s.
SQL> SELECT COUNT(*)
 FROM bigtab
 WHERE created BETWEEN to_date('20170908','YYYYMMDD')
  AND to_date('20170908-23:59:59','YYYYMMDD-HH24:MI:SS');
  COUNT(*)
 ----------
  408
 Elapsed: 00:00:05.13
In this case, the lazier syntax with trunc() took more than twice as long to run even though both queries are performing a (fully cached) full table scan.
SQL> SELECT COUNT(*)
 FROM bigtab
 WHERE TRUNC(created) = to_date('20110908','YYYYMMDD');
  COUNT(*)
 ----------
  408
 Elapsed: 00:00:12.32
The reason for the difference in execution time here is that with the trunc() syntax Oracle still has to apply the function to every row it scans, taking up extra context switches and CPU cycles even though the amount of logical IOs is the same.

Conclusion

I hope I made the point clearly enough to not use functions like trunc() in the where-clause of your queries. The "between" syntax is more verbose and requires a little more typing but you get rewarded with better performance because Oracle may still use regular indexes, partition pruning, smart scans and at the very least avoid context switching to evaluate a function.

No Comments Yet

Let us know what you think

Subscribe by email