Oracle uses or ignores indexed column depending on

2020-05-21 06:45发布

问题:

I'm using an indexed column used as a filter by putting it 'between' two literal values. (The column is in the second position of the index and actually makes execution slower; I will deal with that later).

What's confusing me is that Oracle (11.2.0.3.0) uses or ignores said index depending on the format of the value and format strings supplied to to_date:

This ignores the index:

SQL> SELECT *
  2  FROM gprs_history_import  gh
  3  WHERE start_call_date_time BETWEEN
  4      to_date('20140610 000000','yyyymmdd hh24miss') AND
  5      to_date('20140610 235959','yyyymmdd hh24miss')
  6  /

Execution Plan
----------------------------------------------------------
Plan hash value: 990804809

--------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                     |   350 |   219K|   242K  (1)| 00:56:42 |       |       |
|   1 |  PARTITION RANGE SINGLE|                     |   350 |   219K|   242K  (1)| 00:56:42 |    74 |    74 |
|   2 |   PARTITION LIST ALL   |                     |   350 |   219K|   242K  (1)| 00:56:42 |     1 |     3 |
|*  3 |    TABLE ACCESS FULL   | GPRS_HISTORY_IMPORT |   350 |   219K|   242K  (1)| 00:56:42 |   220 |   222 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))

This one does use the index (notice the space after the date part in line 4):

SQL> SELECT *
  2  FROM gprs_history_import  gh
  3  WHERE start_call_date_time BETWEEN
  4      to_date('20140610 ','yyyymmdd ') AND
  5      to_date('20140610 235959','yyyymmdd hh24miss')
  6  /

Execution Plan
----------------------------------------------------------
Plan hash value: 464458373

---------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                          |   350 |   219K|  2795K  (1)| 10:52:15 |       |       |
|*  1 |  FILTER                              |                          |       |       |            |          |       |       |
|   2 |   PARTITION RANGE ITERATOR           |                          |   350 |   219K|  2795K  (1)| 10:52:15 |   KEY |    74 |
|   3 |    PARTITION LIST ALL                |                          |   350 |   219K|  2795K  (1)| 10:52:15 |     1 |     3 |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID| GPRS_HISTORY_IMPORT      |   350 |   219K|  2795K  (1)| 10:52:15 |   KEY |   222 |
|*  5 |      INDEX SKIP SCAN                 | GPRS_HISTORY_IMPORT_IDX1 |     1 |       |  2795K  (1)| 10:52:15 |   KEY |   222 |
---------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TO_DATE('20140610 ','yyyymmdd ')<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
   5 - access("START_CALL_DATE_TIME">=TO_DATE('20140610 ','yyyymmdd ') AND "START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10
              23:59:59', 'syyyy-mm-dd hh24:mi:ss'))
       filter("START_CALL_DATE_TIME">=TO_DATE('20140610 ','yyyymmdd ') AND "START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10
              23:59:59', 'syyyy-mm-dd hh24:mi:ss'))

(The filter in (1) seems a bit silly, as if Oracle didn't understand the expression)

Again, this one doesn't (I removed the trailing space):

SQL> SELECT *
  2  FROM gprs_history_import  gh
  3  WHERE start_call_date_time BETWEEN
  4      to_date('20140610','yyyymmdd') AND
  5      to_date('20140610 235959','yyyymmdd hh24miss')
  6  /

Execution Plan
----------------------------------------------------------
Plan hash value: 990804809

--------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                     |   350 |   219K|   242K  (1)| 00:56:42 |       |       |
|   1 |  PARTITION RANGE SINGLE|                     |   350 |   219K|   242K  (1)| 00:56:42 |    74 |    74 |
|   2 |   PARTITION LIST ALL   |                     |   350 |   219K|   242K  (1)| 00:56:42 |     1 |     3 |
|*  3 |    TABLE ACCESS FULL   | GPRS_HISTORY_IMPORT |   350 |   219K|   242K  (1)| 00:56:42 |   220 |   222 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("START_CALL_DATE_TIME"<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))

Putting quotes around the space precludes the index from ever being used.

What gives?

回答1:

Ok - I'll give it a try, this is mostly deduction from the availabe Information:

Why does Oracle choose a different execution plan?

It seems in your second query with the unusual Date-Format, the optimizer has no idea what the value of the resulting date is. You see the Filter Predicate:

1 - filter(TO_DATE('20140610 ','yyyymmdd ')<=TO_DATE(' 2014-06-10 23:59:59', 'syyyy-mm-dd hh24:mi:ss'))

Which means the optimizer is not even sure that the first date is smaller than the second! That means the optimizer has no idea about the number of returned rows and will just use a generic plan without taking specific statistics into account. It would be the same if you had a user-defined function xyt() which would return a date for the range. The optimizer has no way to know what date-value will result - This means you get a generall all purpose plan, which should be pretty decent for any date-range specified.

In the first and third case, the optimizer seems to understand the date directly and can guess the number of rows which are in the date range by using statistics. So while the second Query was to the Optimizer like BETWEEN X AND 3 this Query is like BETWEEN 1 AND 3 So he optimizes the query plan for the predicted number of returned rows!

The Strange thing seems to be, that the query optimizer has such problems with a strange date format, could be filed as a bug/request for improvement...

But an important Point:

  1. A full table scan does not have to be a BAD plan... As well as using an index is not always faster!
  2. The cost in the query plan is in no way directly related to the actual execution time or performance - it is an internal measurement to compare different plans for the SAME QUERY (So you cannot compare the cost of different querys like your queries 1,2 and 3)

Basically if you return a high number of rows from a table a full table scan without index access will in many cases be much faster, especially when operating on certain partitions! - The Table scan will only access the pertition for the matching date range - so only for the date in question and returns all rows from this partition. This is much faster than queriyng the index for each single row and then extracting the row by index access... Try to profile the querys - the full table scan on partition should be 3 times as fast with much less IO



回答2:

Bugs in the optimizer or the parser cause some date formats to downgrade static partition pruning to dynamic partition pruning. Partition pruning changes lead to different cardinality and costs, which then lead to significant changes to many other parts of the plan.

This answer only partially explains the problem and contains some speculation. Hopefully it will at least shed a little light on what the problem is and is not. It is at least a good starting point if you really need a complete explanation and want to submit a service request to Oracle.

Terminology and some background reading

Static partition pruning is when the optimizer determines at compile time which partition will be used. Statistics are per-partition, leading to better cardinality estimates, leading to better plans. For example, think of a table partitioned by status where the partition for CANCELLED is tiny and the partition for ACTIVE is large. Knowing which partition is used can completely change the join order and access methods of the optimal plan. Pstart and Pstop will be numeric values when static partition pruning is used.

Dynamic partition pruning is when the optimizer cannot determine the partition until runtime. Data is only retrieved from the required partitions but the execution plan is built without special knowledge of which partition is used. Some partition statistic estimates will be a simple average of all available partitions. In the above example of a table partitioned by status, the average of a tiny partition and a large partition does not accurately represent either. Either Pstart or Pstop will include the word KEY when dynamic partition pruning is used.

The Oracle® Database VLDB and Partitioning Guide includes a section about Datatype Conversions that is worth reading. For example, one relevant quote from the manual:

Only a properly applied TO_DATE function guarantees that the database is capable of uniquely determining the date value and using it potentially for static pruning, which is especially beneficial for single partition access.

Sample schema and data

This simple test case demonstrates the problem. It also rules out common performance issues, such as missing statistics.

First, create a sample table with 2 partitions, one large and one small.

create table gprs_history_import(id number, start_call_date_time date)
partition by range (start_call_date_time)
(
    partition p_large values less than (date '2014-06-01'),
    partition p_small values less than (date '2014-07-01')
);

insert into gprs_history_import
select level, date '2014-05-01'
from dual connect by level <= 1000;

insert into gprs_history_import
select level, date '2014-06-01'
from dual connect by level <= 10;

begin
    dbms_stats.gather_table_stats(user, 'GPRS_HISTORY_IMPORT');
end;
/

select count(*) from gprs_history_import partition (p_large); -- 1000
select count(*) from gprs_history_import partition (p_small); --   10

Static to dynamic causes bad cardinality estimates

The static cardinality estimate is a perfect 1000. The extra space in second date format changes Pstop from 1 to KEY. The plan changes from static to dynamic partition pruning. The dynamic estimate is an inaccurate 505, the average of 1000 and 10

For simplicity this example only shows a bad cardinality estimate. It is not necessary to show a query running slow, since bad row estimates inevitably lead to bad execution plans for many reasons.

explain plan for select /* static partition pruning */ *
from gprs_history_import
where start_call_date_time < to_date('20140601 000000','yyyymmdd hh24miss');

select * from table(dbms_xplan.display);

Plan hash value: 452971246

--------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                     |  1000 | 12000 |    16   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE|                     |  1000 | 12000 |    16   (0)| 00:00:01 |     1 |     1 |
|   2 |   TABLE ACCESS FULL    | GPRS_HISTORY_IMPORT |  1000 | 12000 |    16   (0)| 00:00:01 |     1 |     1 |
--------------------------------------------------------------------------------------------------------------

explain plan for select /* dybnamic partition pruning */ *
from gprs_history_import
where start_call_date_time < to_date('20140601 ','yyyymmdd ');

select * from table(dbms_xplan.display);


Plan hash value: 2464174375

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                     |   505 |  6060 |    29   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR|                     |   505 |  6060 |    29   (0)| 00:00:01 |     1 |   KEY |
|*  2 |   TABLE ACCESS FULL      | GPRS_HISTORY_IMPORT |   505 |  6060 |    29   (0)| 00:00:01 |     1 |   KEY |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("START_CALL_DATE_TIME"<TO_DATE('20140601 ','yyyymmdd '))

Date format parsing problems

Now for some speculation as to why the query moves from static to dynamic partition pruning.

It is not always obvious when the optimizer can use static and dynamic partitioning. In general, literals allow static pruning and variables require dynamic pruning.

--#1: Obviously static: It uses an unambiguous ANSI date literal.
select * from gprs_history_import where start_call_date_time = date '2000-11-01';

--#2: Obviously dyanmic: It uses a bind variable.
select * from gprs_history_import where start_call_date_time = :date;

--#3: Probably dynamic: The optimizer cannot always infer the literal value. 
select * from gprs_history_import where start_call_date_time = 
    (select date '2000-11-01' from dual);

--#4: Probably static: FEB is not always valid, but Oracle can figure that out.
select * from gprs_history_import where start_call_date_time = 
    to_date('01-FEB-2000', 'DD-MON-YYYY');

When you consider all the performance and internationalization issues around case #4 it becomes clear how difficult it is to parse dates. The value of to_date('01-FEB-2000', 'DD-MON-YYYY') depends on several NLS parameters, such as NLS_DATE_LANGUAGE. The query is valid for English but not German. And if NLS_CALENDAR is not set to GREGORIAN than even an all-numeric date format can be wrong. The to_date string is not a bind value but it's not clearly a literal either.

The difference between true date literals and formatted strings is more apparent if hard parses are counted. Query #1 will not force a hard-parse even if the language is changed, but query #4 will. This can be demonstrated by running several variations of each, changing the language, and then running select value from v$sesstat natural join v$statname where name = 'parse count (hard)' and sid = userenv('SID');.

Oracle must have a variable somewhere to denote "this is not a bind variable but could lead to different plans based on NLS settings". That variable does not always lead to dynamic partition pruning but there must be some bugs somewhere that occasionally break it.