可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
We have Oracle 10g and we need to query 1 table (no joins) and filter out rows where 1 of the columns is null. When we do this - WHERE OurColumn IS NOT NULL - we get a full table scan on a very large table - BAD BAD BAD. The column has an index on it but it gets ignored in this instance. Are there any solutions to this?
Thanks
回答1:
The optimizer thinks that the full table scan will be better.
If there are just a few NULL
rows, the optimizer is right.
If you are absolutely sure that the index access will be faster (that is, you have more than 75%
rows with col1 IS NULL
), then hint your query:
SELECT /*+ INDEX (t index_name_on_col1) */
*
FROM mytable t
WHERE col1 IS NOT NULL
Why 75%
?
Because using INDEX SCAN
to retrieve values not covered by the index implies a hidden join on ROWID
, which costs about 4
times as much as table scan.
If the index range includes more than 25%
of rows, the table scan is usually faster.
As mentioned by Tony Andrews
, clustering factor is more accurate method to measure this value, but 25%
is still a good rule of thumb.
回答2:
The optimiser will make its decision based on the relative cost of the full table scan and using the index. This mainly comes down to how many blocks will have to be read to satisfy the query. The 25%/75% rule of thumb mentioned in another answer is simplistic: in some cases a full table scan will make sense even to get 1% of the rows - i.e. if those rows happen to be spread around many blocks.
For example, consider this table:
SQL> create table t1 as select object_id, object_name from all_objects;
Table created.
SQL> alter table t1 modify object_id null;
Table altered.
SQL> update t1 set object_id = null
2 where mod(object_id,100) != 0
3 /
84558 rows updated.
SQL> analyze table t1 compute statistics;
Table analyzed.
SQL> select count(*) from t1 where object_id is not null;
COUNT(*)
----------
861
As you can see, only approximately 1% of the rows in T1 have a non-null object_id. But due to the way I built the table, these 861 rows will be spread more or less evenly around the table. Therefore, the query:
select * from t1 where object_id is not null;
is likely to visit almost every block in T1 to get data, even if the optimiser used the index. It makes sense then to dispense with the index and go for a full table scan!
A key statistic to help identify this situation is the index clustering factor:
SQL> select clustering_factor from user_indexes where index_name='T1_IDX';
CLUSTERING_FACTOR
-----------------
460
This value 460 is quite high (compared to the 861 rows in the index), and suggests that a full table scan will be used. See this DBAZine article on clustering factors.
回答3:
If you are doing a select *, then it would make sense to do a table scan rather than using the index. If you know which columns you are interested in, you could create a covered index with those colums plus the one you are applying the IS NOT NULL condition.
回答4:
It can depend on the type of index you have on the table.
Most B-tree indexes do not store null entries. Bitmap indexes do store null entries.
So, if you have:
select * from mytable
where mycolumn is null
and you have a standard B-tree index on mycolumn
, then the query can't use the index as the "null" isn't in the index.
(If the index is against multiple columns, and one of the indexed columns is not null then there will be an entry in the index.)
回答5:
Create an index on that column.
To make sure the index is used, it should be on the index and other columns in the where.
ocdecio answered:
If you are doing a select *, then it would make sense to do a table scan rather than using the index.
That's not strictly true; an index will be used if there is an index that fits your where clause, and the query optimizer decides using that index would be faster than doing a table scan. If there is no index, or no suitable index, only then must a table scan be done.
回答6:
It's also worth checking whether Oracle's statistics on the table are up to date. It may not know that a full table scan will be slower.
回答7:
Oracle database don't index null values at all in regular (b-tree) indexes, so it can't use it nor you can't force oracle database to use it.
BR
回答8:
Using hints should be done only as a work around rather than a solution.
As mentioned in other answers, the null value is not available in B-TREE indexes.
Since you know that you have mostly null values in this column, would you be able to replace the null value by a range for instance.
That really depends on your column and the nature of your data but typically, if your column is a date type for instance:
where mydatecolumn is not null
Can be translated in a rule saying: I want all rows which have a date.
Then you can most definitely do this:
where mydatecolumn <=sysdate (in oracle)
This will return all rows with a date and ommit null values while taking advantage of the index on that column without using any hints.
回答9:
See http://www.oracloid.com/2006/05/using-index-for-is-null/
If your index is on one single field, it will NOT be used. Try to add a dummy field or a constant in the index:
create index tind on t(field_to_index, 1);