In SQLSERVER 2005, I'm using table-valued function as a convenient way to perform arbitrary aggregation on subset data from large table (passing date range or such parameters).
I'm using theses inside larger queries as joined computations and I'm wondering if the query plan optimizer work well with them in every condition or if I'm better to unnest such computation in my larger queries.
- Does query plan optimizer unnest table-valued functions if it make sense?
- If it doesn't, what do you recommend to avoid code duplication that would occur by manually unnesting them?
- If it does, how do you identify that from the execution plan?
code sample:
create table dbo.customers (
[key] uniqueidentifier
, constraint pk_dbo_customers
primary key ([key])
)
go
/* assume large amount of data */
create table dbo.point_of_sales (
[key] uniqueidentifier
, customer_key uniqueidentifier
, constraint pk_dbo_point_of_sales
primary key ([key])
)
go
create table dbo.product_ranges (
[key] uniqueidentifier
, constraint pk_dbo_product_ranges
primary key ([key])
)
go
create table dbo.products (
[key] uniqueidentifier
, product_range_key uniqueidentifier
, release_date datetime
, constraint pk_dbo_products
primary key ([key])
, constraint fk_dbo_products_product_range_key
foreign key (product_range_key)
references dbo.product_ranges ([key])
)
go
.
/* assume large amount of data */
create table dbo.sales_history (
[key] uniqueidentifier
, product_key uniqueidentifier
, point_of_sale_key uniqueidentifier
, accounting_date datetime
, amount money
, quantity int
, constraint pk_dbo_sales_history
primary key ([key])
, constraint fk_dbo_sales_history_product_key
foreign key (product_key)
references dbo.products ([key])
, constraint fk_dbo_sales_history_point_of_sale_key
foreign key (point_of_sale_key)
references dbo.point_of_sales ([key])
)
go
create function dbo.f_sales_history_..snip.._date_range
(
@accountingdatelowerbound datetime,
@accountingdateupperbound datetime
)
returns table as
return (
select
pos.customer_key
, sh.product_key
, sum(sh.amount) amount
, sum(sh.quantity) quantity
from
dbo.point_of_sales pos
inner join dbo.sales_history sh
on sh.point_of_sale_key = pos.[key]
where
sh.accounting_date between
@accountingdatelowerbound and
@accountingdateupperbound
group by
pos.customer_key
, sh.product_key
)
go
-- TODO: insert some data
-- this is a table containing a selection of product ranges
declare @selectedproductranges table([key] uniqueidentifier)
-- this is a table containing a selection of customers
declare @selectedcustomers table([key] uniqueidentifier)
declare @low datetime
, @up datetime
-- TODO: set top query parameters
.
select
saleshistory.customer_key
, saleshistory.product_key
, saleshistory.amount
, saleshistory.quantity
from
dbo.products p
inner join @selectedproductranges productrangeselection
on p.product_range_key = productrangeselection.[key]
inner join @selectedcustomers customerselection on 1 = 1
inner join
dbo.f_sales_history_..snip.._date_range(@low, @up) saleshistory
on saleshistory.product_key = p.[key]
and saleshistory.customer_key = customerselection.[key]
I hope the sample makes sense.
Much thanks for your help!
In this case, it's an "inline table valued function" The optimiser simply expands (unnests) it if it's useful (or view).
If the function is treated as "black box" by the outer query, the quickest way is to compare IO shown in SSMS vs IO in profiler. Profler captures "black box" IO that SSMS does not.
Blog post by Adam Mechanic (his book is in my drawer at work)
1) Yes, using your syntax, it does. If you happened to use a UDF that returned a table which had conditional logic in it, it would not, though.
3) The optimizer won't point out what part of your query it's optimizing, because it may see fit to combine chunks of the plan with your function, or to optimize bits away.