I have rows in an Oracle database table which should be unique for a combination of two fields but the unique constrain is not set up on the table so I need to find all rows which violate the constraint myself using SQL. Unfortunately my meager SQL skills aren't up to the task.
My table has three columns which are relevant: entity_id, station_id, and obs_year. For each row the combination of station_id and obs_year should be unique, and I want to find out if there are rows which violate this by flushing them out with an SQL query.
I have tried the following SQL (suggested by this previous question) but it doesn't work for me (I get ORA-00918 column ambiguously defined):
SELECT
entity_id, station_id, obs_year
FROM
mytable t1
INNER JOIN (
SELECT entity_id, station_id, obs_year FROM mytable
GROUP BY entity_id, station_id, obs_year HAVING COUNT(*) > 1) dupes
ON
t1.station_id = dupes.station_id AND
t1.obs_year = dupes.obs_year
Can someone suggest what I'm doing wrong, and/or how to solve this?
You need to specify the table for the columns in the main select. Also, assuming entity_id is the unique key for mytable and is irrelevant to finding duplicates, you should not be grouping on it in the dupes subquery.
Try:
Could you not create a new table that includes the unique constraint, and then copy across the data row by row, ignoring failures?
by Quassnoi is the most efficient for large tables. I had this analysis of cost :
gave a cost of 1322341
gave a cost of 1271699
while
gave a cost of 1021984
The table was not indexed....
Change the 3 fields in the initial select to be