i've found a lot of answers on how to find duplicates including the PK-column or without focus on it as this:
If you have a table called T1, and the columns are c1, c2 and c3 then this query would show you the duplicate values.
SELECT C1, C2, C3, count(*)as DupCount
from T1
GROUP BY C1, C2, C3
HAVING COUNT(*) > 1
But a more common requirement would be to get the ID of the all duplicates that have equal c1,c2,c3 values.
So i need following what doesn't work because the id must be aggregated:
SELECT ID
from T1
GROUP BY C1, C2, C3
HAVING COUNT(*) <> 1
(The ID of all duplicates must be different but the columns must be equal)
Edit:
Thank you all. I'm always suprised how fast people give excellent answers on Stackoverflow!
There is a lot of versions suggested here but I think I came up with a new one.
select *
from @T as T1
where exists (select *
from @T as T2
where
T1.ID <> T2.ID and
T1.C1 = T2.C1 and
T1.C2 = T2.C2 and
T1.C3 = T2.C3)
;WITH CTE
AS (SELECT ID,
C1,
C2,
C3,
COUNT(*) OVER (PARTITION BY C1, C2, C3) AS Cnt
FROM T1)
SELECT ID,
C1,
C2,
C3
FROM CTE
WHERE Cnt > 1
To get all the rows that are duplicates:
Use this:
WITH Dups AS
(
SELECT *,
COUNT(1) OVER(PARTITION BY C1, C2, C3) AS CNT
FROM T1
)
SELECT *
FROM Dups
WHERE CNT > 1
and to unique row (i.e. retain one row and filter the other duplicate rows) use this:
WITH NoDups AS
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY C1, C2, C3 ORDER BY ID) AS RN
FROM T1
)
SELECT *
FROM NoDups
WHERE RN = 1
Assuming at least SQL 2005 for the CTE:
;with cteDuplicates as (
select c1, c2, c3
from t1
group by c1, c2, c3
having count(*) > 1
)
select id
from t1
inner join cteDuplicates d
on t1.c1 = d.c1
and t1.c2 = d.c2
and t1.c3 = d.c3
I don't totally understand your problem, but here is a shot at a different style of solution:
select id
from t1 a
join t1 b on a.c1 = b.c2
join t1 c on b.c2 = c.c3
where a.id <> b.id and b.id <> c.id and a.id <> c.id
You can store the C1, C2, C3 combination for duplicates into a temp table and then join it to get the IDs.
select C1, C2, C3
into #duplicates
from T1
group by C1, C2, C3
having count(*) > 1
select ID
from T1 t
inner join #duplicates d
on t.C1 = d.C1
and t.C2 = d.C2
and t.C3 = d.C3