I'm working on a query where I need to look at patient vitals (specifically blood pressure) that were entered when a patient visited a clinic. I'm pulling results for the entire year of 2015, and of course there are certain patients who visited multiple times, and I need to only see the vitals that were entered at the most recent visit. Another slight twist is that systolic and diastolic pressures are entered separately, so I end up with results like:
Patient ID Name DOB Test Results Date
---------------------------------------------------------------------------------
1000 John Smith 1/1/1955 BP - Diastolic 120 2/10/2015
1000 John Smith 1/1/1955 BP - Systolic 70 2/10/2015
1000 John Smith 1/1/1955 BP - Diastolic 128 7/12/2015
1000 John Smith 1/1/1955 BP - Systolic 75 7/12/2015
1000 John Smith 1/1/1955 BP - Diastolic 130 10/22/2015
1000 John Smith 1/1/1955 BP - Systolic 76 10/22/2015
9999 Jane Doe 5/4/1970 BP - Diastolic 130 4/2/2015
9999 Jane Doe 5/4/1970 BP - Systolic 60 4/2/2015
9999 Jane Doe 5/4/1970 BP - Diastolic 127 11/20/2015
9999 Jane Doe 5/4/1970 BP - Systolic 65 11/20/2015
There are 26,000+ results so obviously I don't want to go through every single patient and see when their most recent results were. I'd like my results to look like this:
Patient ID Name DOB Test Results Date
---------------------------------------------------------------------------------
1000 John Smith 1/1/1955 BP - Diastolic 130 10/22/2015
1000 John Smith 1/1/1955 BP - Systolic 76 10/22/2015
9999 Jane Doe 5/4/1970 BP - Diastolic 127 11/20/2015
9999 Jane Doe 5/4/1970 BP - Systolic 65 11/20/2015
I know the name and date of birth and whatnot would get repeated, but I'm mainly focused on the results column.
Here's my query:
SELECT DISTINCT
pd.PatientID as [Patient ID],
pd.PatientName as Name,
pd.DateOfBirth as DOB,
v.Test as Test,
v.Results as Results,
v.TestDate as Date
FROM PatientDemographic pd JOIN Vitals v ON pd.PatientID = v.PatientID
WHERE v.TestDate BETWEEN '01/01/2015' AND '12/31/2015'
AND v.Test LIKE 'BP%'
ORDER BY pd.PatientID, v.TestDate
After looking around for other answers, I tried doing a GROUP BY
and the MAX()
aggregate function for the v.TestDate
column in the SELECT
statement (I was specifically referencing this link, though it's for Oracle and I'm using SQL Server so I'm not entirely sure if the syntax would be the same). My query then looked like:
SELECT DISTINCT
pd.PatientID as [Patient ID],
pd.PatientName as Name,
pd.DateOfBirth as DOB,
v.Test as Test,
v.Results as Results,
MAX(v.TestDate) as Date
FROM PatientDemographic pd JOIN Vitals v ON pd.PatientID = v.PatientID
WHERE v.TestDate BETWEEN '01/01/2015' AND '12/31/2015'
AND v.Test LIKE 'BP%'
GROUP BY pd.PatientID
Admittedly, I've always struggled a little with using GROUP BY
. In this particular case, I get an error stating that I need to add the Patient Name column to the GROUP BY
clause as well, so I do, and then it asks for DOB. Then the Test name. Basically, it wants me to add everything from my SELECT
statement to the GROUP BY
.
What's the best way to proceed and get my most recent patient visits?