How can I delete duplicate rows where no unique row id exists?
My table is:
col1   col2  col3  col4  col5  col6  col7
john   1     1     1     1     1     1
john   1     1     1     1     1     1
sally  2     2     2     2     2     2
sally  2     2     2     2     2     2
I want to be left with the following after the duplicate removal:
john   1     1     1     1     1     1
sally  2     2     2     2     2     2
I've tried a few queries, but I think they depend on a row id, as I don't get the desired result. For example:
DELETE FROM table WHERE col1 IN (
SELECT id FROM table GROUP BY id HAVING ( COUNT(col1) > 1 )
)
Please see the below way of deletion too.
Created a sample table named @table and loaded it with the given data.
Note: If you list all columns in the PARTITION BY part, then the ORDER BY does not have much significance.
I know the question was asked three years ago, and my answer is another version of what Tim has posted, but I am posting it just in case it is helpful for anyone.
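A minimal sketch of what this answer describes, assuming SQL Server 2008 or later. The column types and the alias `numbered` are my own assumptions, not taken from the answer:

```sql
-- Sample table variable loaded with the question's data (column types are assumed)
DECLARE @table TABLE (
    col1 VARCHAR(10),
    col2 INT, col3 INT, col4 INT, col5 INT, col6 INT, col7 INT
);

INSERT INTO @table VALUES
    ('john',  1, 1, 1, 1, 1, 1),
    ('john',  1, 1, 1, 1, 1, 1),
    ('sally', 2, 2, 2, 2, 2, 2),
    ('sally', 2, 2, 2, 2, 2, 2);

-- Number the rows inside each group of identical rows, then delete
-- everything except the first row of each group.
DELETE numbered
FROM (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY col1, col2, col3, col4, col5, col6, col7
               ORDER BY col1   -- arbitrary: every partition column is identical anyway
           ) AS rn
    FROM @table
) AS numbered
WHERE rn > 1;

-- One 'john' row and one 'sally' row should remain
SELECT * FROM @table;
```

Because the table variable only lives for the duration of the batch, the whole script has to be run in one go.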
Oh wow, I feel so stupid reading all these answers; they read like experts' answers, with CTEs, temp tables, and so on.
All I did to get it working was aggregate the ID column using MAX.
NOTE: you might need to run it multiple times to remove all duplicates, as this deletes only one set of duplicate rows at a time.
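A sketch of how the MAX idea could look. Note that it assumes the table does have a numeric `id` column, which the original question says is missing; `dbo.MyTable` is a placeholder name:

```sql
-- Assumption: dbo.MyTable has a unique numeric id column in addition to col1..col7.
-- For every group of identical rows that occurs more than once,
-- delete the row with the highest id.
DELETE FROM dbo.MyTable
WHERE id IN (
    SELECT MAX(id)
    FROM dbo.MyTable
    GROUP BY col1, col2, col3, col4, col5, col6, col7
    HAVING COUNT(*) > 1
);
```

Each run removes only one row per duplicate group, which is why a row that occurs three or more times needs the statement to be run again until no rows are affected.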
If you know the number of duplicate rows, for instance a row occurs n times, then use this command:
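The command itself is not shown here. One way to do this in T-SQL is `DELETE TOP (n - 1)`, which may or may not be what the answer had in mind; `dbo.MyTable` and the value of `@n` are placeholders:

```sql
-- Assumption: the 'john' row is known to occur @n times (2 in the question's data).
DECLARE @n INT = 2;

-- Delete @n - 1 copies, leaving exactly one behind.
DELETE TOP (@n - 1)
FROM dbo.MyTable
WHERE col1 = 'john'
  AND col2 = 1 AND col3 = 1 AND col4 = 1
  AND col5 = 1 AND col6 = 1 AND col7 = 1;
```

This has to be repeated per distinct duplicated row, so it is only practical when there are a few of them.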
For more info, I suggest this.
Another way of removing duplicate rows without losing information, in one step, is the following:
I like CTEs and ROW_NUMBER, as the two combined allow us to see which rows are deleted (or updated); to do so, just change the DELETE FROM CTE... to SELECT * FROM CTE:
DEMO (result is different; I assume that it's due to a typo on your part)
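A sketch of the CTE and ROW_NUMBER approach described here, assuming SQL Server and a placeholder table name `dbo.Table1`:

```sql
WITH CTE AS (
    SELECT col1, col2, col3, col4, col5, col6, col7,
           -- Rows sharing the same col1 are numbered 1, 2, 3, ...
           ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col1) AS RN
    FROM dbo.Table1   -- placeholder table name
)
-- Deleting through the CTE removes the underlying rows of dbo.Table1.
DELETE FROM CTE
WHERE RN > 1;

-- To preview the rows first, swap the DELETE for:
--   SELECT * FROM CTE WHERE RN > 1;
```

Running the SELECT variant before the DELETE is a cheap way to confirm that only the surplus copies have RN > 1.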
This example determines duplicates by a single column, col1, because of the PARTITION BY col1. If you want to include multiple columns, simply add them to the PARTITION BY:
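For the table in the question, where a row counts as a duplicate only if all seven columns match, the multi-column version might look like this (`dbo.Table1` is again a placeholder):

```sql
WITH CTE AS (
    SELECT col1, col2, col3, col4, col5, col6, col7,
           ROW_NUMBER() OVER (
               -- A row is a duplicate only if every listed column matches
               PARTITION BY col1, col2, col3, col4, col5, col6, col7
               ORDER BY col1
           ) AS RN
    FROM dbo.Table1   -- placeholder table name
)
DELETE FROM CTE
WHERE RN > 1;
```

If some column should decide which copy survives (for example the newest row), put that column in the ORDER BY instead of col1.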