How can I speed up this indexed view?

Posted 2020-02-09 08:38

I have a simple indexed view. When I query against it, it's pretty slow. First I'll show you the schema and indexes, then the simple queries, and finally a screenshot of the query plan.

Update: Proof of Solution at the bottom of this post.

Schema

This is what it looks like :-

CREATE view [dbo].[PostsCleanSubjectView] with SCHEMABINDING AS
    SELECT PostId, PostTypeId, 
        [dbo].[ToUriCleanText]([Subject]) AS CleanedSubject
    FROM [dbo].[Posts]

My UDF ToUriCleanText just replaces various characters with an empty string, e.g. it replaces all '#' chars with ''.

Then I added two indexes on the view :-

Indexes

Primary Key Index (ie. Clustered Index)

CREATE UNIQUE CLUSTERED INDEX [PK_PostCleanSubjectView] ON 
    [dbo].[PostsCleanSubjectView] 
(
    [PostId] ASC
)
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, 
      SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,
      ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

And a Non-Clustered Index

CREATE NONCLUSTERED INDEX [IX_PostCleanSubjectView_PostTypeId_Subject] ON 
    [dbo].[PostsCleanSubjectView] 
(
    [CleanedSubject] ASC,
    [PostTypeId] ASC
)
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, 
      SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,
      ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

Now, this has around 25K rows. Nothing big at all.

When I run the following queries, they both take around 4 seconds. WTF? This should be basically instant!

Query 1

SELECT a.PostId
FROM PostsCleanSubjectView a 
WHERE a.CleanedSubject = 'Just-out-of-town'

Query 2 (added another where clause item)

SELECT a.PostId
FROM PostsCleanSubjectView a 
WHERE a.CleanedSubject = 'Just-out-of-town' AND a.PostTypeId = 1

What have I done wrong? Is the UDF screwing things up? I thought that, because I have indexed this view, it would be materialised. As such, it would not have to recalculate that string column.

Here's a screenshot of the query plan, if this helps :- [image: query plan]

Also, notice the index it's using? Why is it using that index?

That index is...

CREATE NONCLUSTERED INDEX [IX_Posts_PostTypeId_Subject] ON [dbo].[Posts] 
(
    [PostTypeId] ASC,
    [Subject] ASC
)
WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, 
      SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, 
      ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

So yeah, any ideas folks?

Update 1: Added schema for the udf.

CREATE FUNCTION [dbo].[ToUriCleanText]
(
    @Subject NVARCHAR(300)
)
RETURNS NVARCHAR(350) WITH SCHEMABINDING
AS 
BEGIN
   -- <snip>
   -- Nothing interesting in here.
   -- Just lots of SET @foo = REPLACE(@foo, '$', ''), etc.
END

Update 2: Solution

Yep, it was because the index on the view wasn't being used; I had to manually make sure the view wasn't expanded. The server is SQL Server 2008 Standard Edition. The full answer is below. Here's the proof, using WITH (NOEXPAND): [image]

Thank you all for helping me solve this problem :)

6 answers
迷人小祖宗
Reply #2 · 2020-02-09 09:04

I suspect it has to call that function for every row before it can do the comparison in your where clause. I'd expose subject, run the query checking against that directly and see how the times work out. I've generally seen a lot of slowness whenever I modify a value using a function and then use it in the where clause...
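
A minimal sketch of that comparison, assuming the uncleaned title is stored as N'Just out of town' (a hypothetical value; the raw Subject text isn't shown in the question). It filters on the unmodified Subject column of dbo.Posts, so the UDF is never called, and the timing can be compared with the view queries:

```sql
-- Report elapsed and CPU time for the statement in the Messages tab.
SET STATISTICS TIME ON;

-- Filter on the raw column; N'Just out of town' is an assumed example value.
SELECT p.PostId
FROM dbo.Posts AS p
WHERE p.PostTypeId = 1
  AND p.Subject = N'Just out of town';

SET STATISTICS TIME OFF;
```

If this returns quickly while the view query does not, the time is probably going into evaluating the function per row rather than into reading the data.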

Luminary・发光体
Reply #3 · 2020-02-09 09:10

I see an @ sign in the query code in your execution plan. There's a string variable involved.

SQL Server has a NASTY behavior if the string variable's type does not match the type of the string column in the index. SQL Server will ... convert the whole column to that type, perform the fast lookup, and then throw away the converted index so it can do the whole thing again next query.
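
One way to rule that out is to declare the variable with the same type as the column. A short sketch, assuming CleanedSubject is NVARCHAR(350) because that is the UDF's declared return type; the variable name is made up for illustration:

```sql
-- Match the column's type so no implicit conversion is needed for the comparison.
-- @CleanedSubject is a hypothetical variable name.
DECLARE @CleanedSubject NVARCHAR(350) = N'Just-out-of-town';

SELECT a.PostId
FROM dbo.PostsCleanSubjectView AS a
WHERE a.CleanedSubject = @CleanedSubject;
```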


Simon figured it out - but here's more helpful detail: http://msdn.microsoft.com/en-us/library/ms187373.aspx

If a query contains references to columns that are present both in an indexed view and base tables, and the query optimizer determines that using the indexed view provides the best method for executing the query, the query optimizer uses the index on the view. This function is called indexed view matching, and is supported only in the SQL Server Enterprise and Developer editions.

However, for the optimizer to consider indexed views for matching or use an indexed view that is referenced with the NOEXPAND hint, the following SET options must be set to ON:

So, what's happening here is that indexed view matching is not working. Make sure you're using the Enterprise or Developer edition of SQL Server (pretty likely). Then check your SET options according to the article.
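
Both checks can be run from the session that executes the query. This is only a sketch: the documented requirement for indexed views is that ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT, CONCAT_NULL_YIELDS_NULL and QUOTED_IDENTIFIER are ON and NUMERIC_ROUNDABORT is OFF; the column aliases are arbitrary.

```sql
-- Which edition is this instance?
SELECT SERVERPROPERTY('Edition') AS Edition;

-- Current session settings: 1 = ON, 0 = OFF.
-- The first six should be 1; NumericRoundAbort should be 0.
SELECT
    SESSIONPROPERTY('ANSI_NULLS')              AS AnsiNulls,
    SESSIONPROPERTY('ANSI_PADDING')            AS AnsiPadding,
    SESSIONPROPERTY('ANSI_WARNINGS')           AS AnsiWarnings,
    SESSIONPROPERTY('ARITHABORT')              AS ArithAbort,
    SESSIONPROPERTY('CONCAT_NULL_YIELDS_NULL') AS ConcatNullYieldsNull,
    SESSIONPROPERTY('QUOTED_IDENTIFIER')       AS QuotedIdentifier,
    SESSIONPROPERTY('NUMERIC_ROUNDABORT')      AS NumericRoundAbort;
```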

放我归山
Reply #4 · 2020-02-09 09:14

I recently built a large database containing hundreds of millions of call detail records and there are some functions I was using in queries and views that I turned into persisted computed columns. This worked out much better because I could index on the computed column.

I wasn't using SQL Enterprise though so I didn't get the opportunity to use indexed views. Is the indexed view supposed to be able to index the deterministic results of the UDF?

叼着烟拽天下
Reply #5 · 2020-02-09 09:15

What benefit are you looking for by using an indexed view? Is it not possible to properly index the table(s) themselves? Without good justification, you're adding complexity and asking the optimizer to deal with more database objects with less flexibility.

Have you evaluated the same query logic with standard indexes?

Mixing in UDF logic muddies things even more.

Root(大扎)
Reply #6 · 2020-02-09 09:18

What edition of SQL Server? I believe that only Enterprise and Developer Edition will use indexed views automatically, while the others support it using query hints.

SELECT a.PostId
FROM PostsCleanSubjectView a WITH (NOEXPAND)
WHERE a.CleanedSubject = 'Just-out-of-town' AND a.PostTypeId = 1

From Query Hints (Transact SQL) on MSDN:

The indexed view is not expanded only if the view is directly referenced in the SELECT part of the query and WITH (NOEXPAND) or WITH (NOEXPAND, INDEX( index_value [ ,...n ] ) ) is specified.
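
The second form in that quote also names the index to use. A sketch using the nonclustered index from the question; whether forcing it actually helps would need to be checked against the real plan:

```sql
-- NOEXPAND keeps the view from being expanded to dbo.Posts;
-- INDEX(...) additionally forces the named index on the view.
SELECT a.PostId
FROM dbo.PostsCleanSubjectView AS a
    WITH (NOEXPAND, INDEX(IX_PostCleanSubjectView_PostTypeId_Subject))
WHERE a.CleanedSubject = N'Just-out-of-town'
  AND a.PostTypeId = 1;
```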

小情绪 Triste *
Reply #7 · 2020-02-09 09:23

If all you want is to persist the return value of a UDF, consider a persisted computed column rather than an indexed view.
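
A rough sketch of what that could look like for the table in the question. It assumes SQL Server treats dbo.ToUriCleanText as deterministic (it is schema-bound, and it was accepted in an indexed view, so that seems likely); the index name is hypothetical:

```sql
-- Store the cleaned subject on the base table itself.
ALTER TABLE dbo.Posts
    ADD CleanedSubject AS dbo.ToUriCleanText([Subject]) PERSISTED;
GO

-- Index the persisted value; IX_Posts_CleanedSubject_PostTypeId is a made-up name.
CREATE NONCLUSTERED INDEX IX_Posts_CleanedSubject_PostTypeId
    ON dbo.Posts (CleanedSubject, PostTypeId);
GO

-- The query then targets the table directly, with no view or hint involved.
SELECT p.PostId
FROM dbo.Posts AS p
WHERE p.CleanedSubject = N'Just-out-of-town'
  AND p.PostTypeId = 1;
```

Unlike automatic indexed view matching, an index on a persisted computed column is not restricted to Enterprise or Developer edition, so no NOEXPAND-style hint should be needed.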
