I have a simple Indexed View. When I query against it, it's pretty slow. First I show you the schema's and indexes. Then the simple queries. Finally a query plan screnie.
Update: Proof of Solution at the bottom of this post.
Schema
This is what it looks like :-
CREATE view [dbo].[PostsCleanSubjectView] with SCHEMABINDING AS
SELECT PostId, PostTypeId,
[dbo].[ToUriCleanText]([Subject]) AS CleanedSubject
FROM [dbo].[Posts]
My udf ToUriCleanText
just replaces various characters with an empty character. Eg. replaces all '#' chars with ''.
Then i've added two indexes on this :-
Indexes
Primary Key Index (ie. Clustered Index)
CREATE UNIQUE CLUSTERED INDEX [PK_PostCleanSubjectView] ON
[dbo].[PostsCleanSubjectView]
(
[PostId] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
And a Non-Clustered Index
CREATE NONCLUSTERED INDEX [IX_PostCleanSubjectView_PostTypeId_Subject] ON
[dbo].[PostsCleanSubjectView]
(
[CleanedSubject] ASC,
[PostTypeId] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
Now, this has around 25K rows. Nothing big at all.
When i do the following queries, they both take around 4 odd seconds. WTF? This should be.. basically instant!
Query 1
SELECT a.PostId
FROM PostsCleanSubjectView a
WHERE a.CleanedSubject = 'Just-out-of-town'
Query 2 (added another where clause item)
SELECT a.PostId
FROM PostsCleanSubjectView a
WHERE a.CleanedSubject = 'Just-out-of-town' AND a.PostTypeId = 1
What have I done wrong? Is the UDF screwing things up? I thought that, because i have index'd this view, it would be materialised. As such, it would not have to calculate that string column.
Here's a screenie of the query plan, if this helps :-
Also, notice the index it's using? Why is it using that index?
That index is...
CREATE NONCLUSTERED INDEX [IX_Posts_PostTypeId_Subject] ON [dbo].[Posts]
(
[PostTypeId] ASC,
[Subject] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
So yeah, any ideas folks?
Update 1: Added schema for the udf.
CREATE FUNCTION [dbo].[ToUriCleanText]
(
@Subject NVARCHAR(300)
)
RETURNS NVARCHAR(350) WITH SCHEMABINDING
AS
BEGIN
<snip>
// Nothing insteresting in here.
//Just lots of SET @foo = REPLACE(@foo, '$', ''), etc.
END
Update 2: Solution
Yep, it was because i wasn't using the index on the view and had to manually make sure i didn't expand the view. The server is Sql Server 2008 Standard Edition. The full answer is below.
Here's the proof, WITH (NOEXPAND)
Thank you all for helping me solve this problem :)
I suspect it has to call that function for every row before it can do the comparison in your where clause. I'd expose subject, run the query checking against that directly and see how the times work out. I've generally seen a lot of slowness whenever I modify a value using a function and then use it in the where clause...
I see an @ sign in the query code in your execution plan. There's a string variable involved.
Sql Server has a NASTY behavior if the string variable's type does not match the type of the string column in the index. Sql Server will ... convert the whole column to that type, perform the fast lookup, and then throw away the converted index so it can do the whole thing again next query.
Simon figured it out - but here's more helpful detail: http://msdn.microsoft.com/en-us/library/ms187373.aspx
So, what's happening here is that indexed view matching is not working. Make sure you're using Enterprise or Developer editions of Sql Server (pretty likely). Then check your SET options according to the article.
I recently built a large database containing hundreds of millions of call detail records and there are some functions I was using in queries and views that I turned into persisted computed columns. This worked out much better because I could index on the computed column.
I wasn't using SQL Enterprise though so I didn't get the opportunity to use indexed views. Is the indexed view supposed to be able to index the deterministic results of the UDF?
What benefit are you looking for by using an indexed view? Is it not possible to properly index the table(s) themselves? Without good justification, you're adding complexity and asking the optimizer to deal with more database objects with less flexibility.
Have you evaluated the same query logic with standard indexes?
Mixing in UDF logic muddies things even more.
What edition of SQL Server? I believe that only Enterprise and Developer Edition will use indexed views automatically, while the others support it using query hints.
From Query Hints (Transact SQL) on MSDN:
If all you want is to persist the return value of a UDF, consider a persisted computed column rather than an indexed view.