I have a pivot query on a table with millions of rows. Running the query normally, it runs in 2 seconds and returns 2983 rows. If I add TOP 1000 to the query it takes 10 seconds to run.
What could be causing this?
SELECT *
FROM (SELECT l.PatientID,
l.LabID,
l.Result
FROM dbo.Labs l
JOIN (SELECT MAX(LabDate) maxDate,
PatientID,
LabID
FROM dbo.Labs
GROUP BY PatientID, LabID) s ON l.PatientID = s.PatientID
AND l.LabID = s.LabID
AND l.LabDate = s.maxDate) A
PIVOT(MIN(A.Result) FOR A.LabID IN ([1],[2],[3],[4],[5],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17])) p
Execution plans:
This alternate formulation has the same problem:
select
*
FROM (
SELECT
l.PatientID,
l.LabID,
l.Result
FROM dbo.Labs l
where l.LabDate = (
select
MAX(LabDate)
from Labs l2
where l2.PatientID = l.PatientID
and l2.LabID = l.LabID
)
) A
PIVOT(MIN(A.Result) FOR A.LabID IN ([1],[2],[3],[4],[5],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17])) p
You may also try creating an indexed view like this:
and use it in the query:
After doing some googling about suggesting an execution plan, I found the solution.
OPTION (HASH JOIN)
being the thing. The resulting execution plan for the version with TOP looks like the original non-top one, with a TOP tacked on at the end.Since I was originally doing this in a view what I actually ended up doing was changing
JOIN
toINNER HASH JOIN
There is a specific order in which queries are processed.
A normal SQL query will be written as follows:
But the processing order is different:
When using
SELECT DISTINCT [...]
orSELECT TOP [...]
the processing order will be as follows:Hence it's taking longer as your
SELECT TOP 1000
is processed last.Take a look at this link for further details: http://blogs.msdn.com/b/sqlqueryprocessing/