So, I just found out that SQL Server 2008 doesn't let you index a view with a CTE in the definition, but it allows you to alter
the query to add with schemabinding
in the view definition. Is there a good reason for this? Does it make sense for some reason I am unaware of? I was under the impression that WITH SCHEMABINDING
s main purpose was to allow you to index a view
new and improved with more query action
;with x
as
(
select rx.pat_id
,rx.drug_class
,count(*) as counts
from rx
group by rx.pat_id,rx.drug_class
)
select x.pat_id
,x.drug_class
,x.counts
,SUM(c.std_cost) as [Healthcare Costs]
from x
inner join claims as c
on claims.pat_id=x.pat_id
group by x.pat_id,x.drug_class,x.counts
And the code to create the index
create unique clustered index [TestIndexName] on [dbo].[MyView]
( pat_id asc, drug_class asc, counts asc)
You can't index a view with a CTE. Even though the view can have SCHEMABINDING
. Think of it this way. In order to index a view, it must meet two conditions (and many others): (a) that it has been created WITH SCHEMABINDING
and (b) that it does not contain a CTE. In order to schemabind a view, it does not need to meet the condition that it does not contain a CTE.
I'm not convinced there is a scenario where a view has a CTE and will benefit from being indexed. This is peripheral to your actual question, but my instinct is that you are trying to index this view to magically make it faster. An indexed view isn't necessarily going to be any faster than a query against the base tables - there are restrictions for a reason, and there are only particular use cases where they make sense. Please be careful to not just blindly index all of your views as a magic "go faster" button. Also remember that an indexed view requires maintenance. So it will increase the cost of any and all DML operations in your workload that affect the base table(s).
Schemabinding is not just for indexing views. It can also be used
on things like UDFs to help persuade determinism, can be used on
views and functions to prevent changes to the underlying schema, and
in some cases it can improve performance (for example, when a UDF is
not schema-bound, the optimizer may have to create a table spool to
handle any underlying DDL changes). So please don't think that it is
weird that you can schema-bind a view but you can't index it.
Indexing a view requires it, but the relationship is not mutual.
For your specific scenario, I recommend this:
CREATE VIEW dbo.PatClassCounts
WITH SCHEMABINDING
AS
SELECT pat_id, drug_class,
COUNT_BIG(*) AS counts
FROM dbo.rx
GROUP BY pat_id, drug_class;
GO
CREATE UNIQUE CLUSTERED INDEX ON dbo.PatClassCounts(pat_id, drug_class);
GO
CREATE VIEW dbo.ClaimSums
WITH SCHEMABINDING
AS
SELECT pat_id,
SUM(c.std_cost) AS [Healthcare Costs],
COUNT_BIG(*) AS counts
FROM dbo.claims
GROUP BY pat_id;
GO
CREATE UNIQUE CLUSTERED INDEX ON dbo.ClaimSums(pat_id);
GO
Now you can create a non-indexed view that just does a join between these two indexed views, and it will utilize the indexes (you may have to use NOEXPAND
on a lower edition, not sure):
CREATE VIEW dbo.OriginalViewName
WITH SCHEMABINDING
AS
SELECT p.pat_id, p.drug_class, p.counts, c.[Healthcare Costs]
FROM dbo.PatClassCounts AS p
INNER JOIN dbo.ClaimSums AS c
ON p.pat_id = c.pat_id;
GO
Now, this all assumes that it is worthwhile to pre-aggregate this information - if you run this query infrequently, but the data is modified a lot, it may be better to NOT create indexed views.
Also note that the SUM(std_cost)
from the ClaimSums
view will be the same for every pat_id
+ drug_class
combination, since it's only aggregated to pat_id
. I guess there might be a drug_class
in the claims
table that should be part of the join criteria too, but I'm not sure. If that is the case, I think this could be collapsed to a single indexed view.