Filtered index condition is ignored by optimizer

Posted 2019-01-15 05:55

Question:

Assume I'm running a website that shows funny cat pictures. I have a table called CatPictures with the columns Filename, Awesomeness, and DeletionDate, and the following index:

create nonclustered index CatsByAwesomeness
on CatPictures (Awesomeness) 
include (Filename)
where DeletionDate is null

My main query is this:

select Filename from CatPictures where DeletionDate is null and Awesomeness > 10

I, as a human being, know that the above index is all that SQL Server needs, because the index filter condition already ensures the DeletionDate is null part.

SQL Server, however, doesn't know this; the execution plan for my query does not use my index.

Even with an index hint, it still explicitly checks DeletionDate by looking at the actual table data (and, in addition, complains about a missing index that would include DeletionDate).
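
For reference, the hinted form of the query would look roughly like this. It is only a sketch that reuses the table and index names from the question; the resulting plan may vary with the SQL Server version and with how the table is clustered.

```sql
-- Force the filtered index with a table hint.
-- As described above, the plan still fetches DeletionDate from the base table
-- to re-check the predicate, even though the index filter already guarantees it.
select Filename
from CatPictures with (index(CatsByAwesomeness))
where DeletionDate is null
  and Awesomeness > 10;
```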

Of course I could

include (Filename, DeletionDate)

instead, and then it works.

But it seems a waste to include that column, since this just uses up space without adding any new information.
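
Written out in full, the variant with the extra included column might look like the sketch below. The drop_existing option is an assumption added here so that the statement replaces the index of the same name instead of failing; it is not part of the original definition.

```sql
-- Same filtered index, with the filter column added to the INCLUDE list
-- so that the predicate can be checked from the index alone.
create nonclustered index CatsByAwesomeness
on CatPictures (Awesomeness)
include (Filename, DeletionDate)
where DeletionDate is null
with (drop_existing = on);  -- assumption: rebuild over the existing index
```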

Is there a way to make SQL Server aware that the filter condition is already doing the job of checking DeletionDate?

Answer 1:

No, not currently.

See this Connect item. It is closed as Won't Fix. (Or see this one for the IS NULL case specifically.)

The connect item does provide a workaround shown below.

Posted by RichardB CFCU on 29/09/2011 at 9:15 AM

A workaround is to INCLUDE the column that is being filtered on.

Example:

CREATE NONCLUSTERED INDEX [idx_FilteredKey1] ON [dbo].[TABLE] 
(
    [TABLE_ID] ASC,
    [TABLE_ID2] ASC
)
INCLUDE ( [REMOVAL_TIMESTAMP]) --explicitly include the column here
WHERE ([REMOVAL_TIMESTAMP] IS NULL)


Answer 2:

Is there a way to make SQL Server aware that the filter condition is already doing the job of checking DeletionDate?

No.

Filtered indexes were designed to solve certain problems, not all of them. Things evolve, and some day SQL Server may support the feature you expect of filtered indexes, but it is also possible that it never will.

There are several good reasons I can see for how it works.

What it improves on:

  1. Storage. The index contains only keys matching the filter condition.
  2. Performance. This follows from the above: less to write and fewer pages mean faster retrieval.

What it does not do:

  1. Change the query engine radically

Putting them together, and considering that SQL Server is a heavily pipelined, parallelism-capable beast, we get the following behaviour when it services a query:

  1. Pre-condition to the query optimizer selecting indexes: check whether a filtered index is applicable against the WHERE clause.
  2. The query optimizer continues its normal work of determining selectivity from statistics, weighing up an index seek plus bookmark lookup against a clustered index or heap scan depending on whether the index is covering, etc.

I suspect that threading the filtered index's condition into the query optimizer "core" would be a much bigger job than leaving it at step 1.

Personally, I respect the SQL Server dev team, and if it were easy enough, they might pull it into a not-too-distant sprint and get it done. However, what is there currently has achieved what it was intended to do, and that makes me quite happy.



Answer 3:

I just ran into this "gap in functionality". It's really sad that the filter condition of a filtered index is ignored by the optimizer. I think I'll try to use indexed views for that. Take a look at this article:

http://www.sqlperformance.com/2013/04/t-sql-queries/optimizer-limitations-with-filtered-indexes
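
The indexed-view idea could be sketched roughly as follows. This is an illustration under stated assumptions, not something taken from the question: the dbo schema, the Id column (assumed to be a unique key of CatPictures), and all view and index names are hypothetical.

```sql
-- Hypothetical names throughout: dbo schema, Id key column, view and index names.
-- A view must be schema-bound before it can be indexed,
-- and CREATE VIEW must be the first statement in its batch.
create view dbo.ActiveCatPictures
with schemabinding
as
select Id, Filename, Awesomeness
from dbo.CatPictures
where DeletionDate is null;
go

-- The first index on a view has to be a unique clustered index.
create unique clustered index IX_ActiveCatPictures_Id
on dbo.ActiveCatPictures (Id);
go

-- Mirrors the shape of the filtered index from the question.
create nonclustered index IX_ActiveCatPictures_Awesomeness
on dbo.ActiveCatPictures (Awesomeness)
include (Filename);
go

-- NOEXPAND tells the optimizer to use the view's own indexes
-- rather than expanding the view into its base table.
select Filename
from dbo.ActiveCatPictures with (noexpand)
where Awesomeness > 10;
```

Here go is the batch separator used by tools such as SSMS and sqlcmd, not a T-SQL statement. The view stores only the non-deleted rows, so the query no longer needs to mention DeletionDate at all, at the cost of maintaining the view's indexes on every write to CatPictures.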