I have rows in an Oracle database table which should be unique for a combination of two fields but the unique constrain is not set up on the table so I need to find all rows which violate the constraint myself using SQL. Unfortunately my meager SQL skills aren't up to the task.
My table has three columns which are relevant: entity_id, station_id, and obs_year. For each row the combination of station_id and obs_year should be unique, and I want to find out if there are rows which violate this by flushing them out with an SQL query.
I have tried the following SQL (suggested by this previous question) but it doesn't work for me (I get ORA-00918 column ambiguously defined):
SELECT
entity_id, station_id, obs_year
FROM
mytable t1
INNER JOIN (
SELECT entity_id, station_id, obs_year FROM mytable
GROUP BY entity_id, station_id, obs_year HAVING COUNT(*) > 1) dupes
ON
t1.station_id = dupes.station_id AND
t1.obs_year = dupes.obs_year
Can someone suggest what I'm doing wrong, and/or how to solve this?
Re-write of your query
I think the ambiguous column error (ORA-00918) was because you were
select
ing columns whose names appeared in both the table and the subquery, but you did not specifiy if you wanted it fromdupes
or frommytable
(aliased ast1
).I thought a lot of the solutions here were cumbersome and tough to understand since I had a 3 column primary key constraint and needed to find the duplicates. So here's an option
Specify the fields to find duplicates on both the SELECT and the GROUP BY.
It works by using
GROUP BY
to find any rows that match any other rows based on the specified Columns. TheHAVING COUNT(*) > 1
says that we are only interested in seeing any rows that occur more than 1 time (and are therefore duplicates)