Though it's a quite subjective question but I feel it necessary to share on this forum.
I have personally experienced that when I create a UDF (even if that is not complex) and use it into my SQL it drastically decrease the performance. But when I use SQL inbuild function they happen to work pretty faster. Conversion , logical & string functions are clear example of that.
So, my question is "Why SQL in build functions are faster than UDF"? and it would be an advantage if someone can guide me how can I judge/manipulate function cost either mathematically or logically.
This is a well known issue with scalar UDFs in SQL Server.
They are not inlined into the plan and calling them adds overhead compared with having the same logic inline.
The following takes just under 2 seconds on my machine
WITH T10(N) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
) --10 rows
, T(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM T10 a, T10 b, T10 c, T10 d, T10 e, T10 f, T10 g) -- 10 million rows
SELECT MAX(N - N)
FROM T
OPTION (MAXDOP 1)
Creating the simple scalar UDF
CREATE FUNCTION dbo.F1 (@N BIGINT)
RETURNS BIGINT
WITH SCHEMABINDING
AS
BEGIN
RETURN (@N - @N)
END
And changing the query to MAX(dbo.F1(N))
instead of MAX(N - N)
it takes around 26 seconds with STATISTICS TIME OFF
and 37 with it on.
An average increase of 2.6μs / 3.7μs for each of the 10 million function calls.
Running the Visual Studio profiler shows that the vast majority of time is taken under UDFInvoke
. The names of the methods in the call stack gives some idea of what the additional overhead is doing (copying parameters, executing statements, setting up security context).
Moving the logic into an inline table valued function
CREATE FUNCTION dbo.F2 (@N BIGINT)
RETURNS TABLE
RETURN(SELECT @N - @N AS X)
And rewriting the query as
SELECT MAX(X)
FROM Nums
CROSS APPLY dbo.F2(N)
executes in as fast as a time as the original query that does not use any functions.