As the title says, I'm using SQL Server 2008. Apologies if this question is very basic. I've only been using SQL for a few days. Right now I have the following query:
SELECT TOP 10 p.id, pl.nm, pl.val, pl.txt_val
from dm.labs pl
join mas_data.patients p
on pl.id = p.id
where pl.nm like '%LDL%'
and val is not null
What I want to do is use select top n together with distinct values in the id column. Searching through some forums says to use
SELECT DISTINCT TOP 10 ...
but when I replace the first line with
SELECT DISTINCT TOP 10 p.id, pl.nm, pl.val, pl.txt_val
I get the same results as without the word distinct. What should I be doing to only get to filter out duplicate id entries?
Thank you.
I think the problem is that you want one result for each p.id?
But you are getting "duplicate" results for some p.id's, is that right?
The DISTINCT keyword applies to the entire result set, so applies to pl.nm, pl.val, pl.txt_val, not just p.id.
You need something like
Won't need the distinct keyword then.
You could use a Common Table Expression to get the top 10 distinct ID's and then join those to the rest of your data:
That should work. Mind you: if you have a "TOP x" clause, you typically also need an ORDER BY clause - if you want the TOP 10, you need to tell the system in what order that "TOP" is.
PS: why do you even join the "patients" table, if you never select any fields from it??
I know this thread is old, but figured I would throw in what came up with since I just ran into this same issue. It may not be efficient, but I believe it gets the job done.
Few ideas:
Try something like this:
Note that i commented out some of the SELECT to limit your result set and DISTINCT logic.
The easy option is to use group by and select min/max for all other fields
This can get quite tedious for wide table so the other option is to use rank over and partiion