Removing duplicate rows from table in Oracle

I'm testing something in Oracle and populated a table with some sample data, but in the process I accidentally loaded duplicate records, so now I can't create a primary key using some of the columns.

How can I delete all duplicate rows and leave only one of them?

Tags: oracle duplicate-removal delete-row

21 answers

路过你的时光

Reply #2 · 2019-01-01 02:04

delete from dept
where rowid in (
     select rowid
     from dept
     minus
     select max(rowid)
     from dept
     group by DEPTNO, DNAME, LOC
);
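
Before running the delete, it can help to see which groups are actually duplicated. A minimal sketch, assuming the same dept table and the same three columns as above; the extra_copies alias is only illustrative:

```sql
-- List each duplicated (DEPTNO, DNAME, LOC) combination and how many
-- surplus rows the delete above would remove for it.
select DEPTNO, DNAME, LOC, count(*) - 1 as extra_copies
from dept
group by DEPTNO, DNAME, LOC
having count(*) > 1;
```

If this returns no rows after the delete, the three columns are unique and the primary key can be created.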

余欢

Reply #3 · 2019-01-01 02:06

From Ask Tom

delete from t
 where rowid IN ( select rid
                    from (select rowid rid, 
                                 row_number() over (partition by 
                         companyid, agentid, class , status, terminationdate
                                   order by rowid) rn
                            from t)
                   where rn <> 1);

(fixed the missing parenthesis)

流年柔荑漫光年

Reply #4 · 2019-01-01 02:10

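You could write a small PL/SQL block that uses a cursor FOR loop and deletes the rows you don't want to keep. For instance:

declare
  prev_var my_table.var1%TYPE;
begin
  -- Walk the rows in var1 order so that duplicates are adjacent.
  for t in (select rowid rid, var1 from my_table order by var1) loop
    if t.var1 = prev_var then
      -- Same value as the previous row: delete this duplicate.
      delete from my_table where rowid = t.rid;
    else
      -- New value (or first row): remember it and keep going.
      -- Rows with a NULL var1 never compare equal, so they are all kept.
      prev_var := t.var1;
    end if;
  end loop;
end;
/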

忆尘夕之涩

Reply #5 · 2019-01-01 02:10

5. solution

delete from emp where rowid in 
    (
      select  rid from
       (
         select rowid rid,rank() over (partition by emp_id order by rowid)rn from emp     
       )
     where rn > 1
    );

不再属于我。

Reply #6 · 2019-01-01 02:10

I didn't see any answers that use common table expressions and window functions. This is what I find easiest to work with.

DELETE FROM
 YourTable
WHERE
 ROWID IN
    (WITH Duplicates
          AS (SELECT
               ROWID RID, 
               ROW_NUMBER() 
               OVER(
               PARTITION BY First_Name, Last_Name, Birth_Date
               ORDER BY ROWID)
                  AS RN,
               SUM(1)
               OVER(
               PARTITION BY First_Name, Last_Name, Birth_Date
               ORDER BY ROWID ROWS BETWEEN UNBOUNDED PRECEDING 
                                       AND UNBOUNDED FOLLOWING)
                   AS CNT
              FROM
               YourTable
              WHERE
               Load_Date IS NULL)
     SELECT
      RID
     FROM
      duplicates
     WHERE
      RN > 1);

Some things to note:

1) We are only checking for duplication on the fields in the partition clause.

2) If you have a reason to prefer one duplicate over the others, you can use the ORDER BY clause of ROW_NUMBER() to make sure that row gets row_number() = 1.

3) You can change the number of duplicates preserved by changing the final where clause to "WHERE RN > N" with N >= 1 (I was thinking N = 0 would delete all rows that have duplicates, but it would just delete all rows).

4) I added the SUM partition column to the CTE query, which tags each row with the number of rows in its group. So to select the rows that have duplicates, including the first item, use "WHERE cnt > 1".
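
Note 4 can be used to preview the affected groups before deleting anything. A minimal sketch, assuming the same YourTable columns as above; COUNT(*) is used here in place of SUM(1) and gives the same per-group count:

```sql
-- Show every row that belongs to a duplicated group, first copy included.
-- RN = 1 marks the row the DELETE above would keep.
WITH Duplicates AS (
    SELECT ROWID AS RID,
           First_Name, Last_Name, Birth_Date,
           ROW_NUMBER() OVER (PARTITION BY First_Name, Last_Name, Birth_Date
                              ORDER BY ROWID) AS RN,
           COUNT(*) OVER (PARTITION BY First_Name, Last_Name, Birth_Date) AS CNT
    FROM YourTable
    WHERE Load_Date IS NULL)
SELECT RID, First_Name, Last_Name, Birth_Date, RN, CNT
FROM Duplicates
WHERE CNT > 1
ORDER BY First_Name, Last_Name, Birth_Date, RN;
```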

人间绝色

Reply #7 · 2019-01-01 02:11

The fastest way for really big tables

Create an exceptions table, here called exceptions_table, with the structure below:

ROW_ID ROWID
OWNER VARCHAR2(30)
TABLE_NAME VARCHAR2(30)
CONSTRAINT VARCHAR2(30)
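
The structure above can be written as a CREATE TABLE statement. This is a minimal sketch that mirrors the column list exactly as given here. Oracle also ships a script, utlexcpt.sql (under $ORACLE_HOME/rdbms/admin), that creates a table of similar shape named EXCEPTIONS, so that can be used instead.

```sql
-- Table that the EXCEPTIONS INTO clause fills with the offending rowids.
create table exceptions_table (
    row_id     rowid,         -- rowid of a row that violates the constraint
    owner      varchar2(30),  -- schema that owns the table
    table_name varchar2(30),  -- table the row belongs to
    constraint varchar2(30)   -- name of the violated constraint
);
```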

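Try to create a unique constraint or primary key that the duplicates will violate. The statement fails with an error because of the duplicates, and the exceptions table is populated with the rowids of the rows that violate the constraint. Note that this includes every copy of each duplicated key, not only the surplus ones.
```
-- original_dups is the table holding the duplicates; the constraint name is a placeholder
alter table original_dups add constraint original_dups_uk
unique --or primary key
(dupfield1,dupfield2) exceptions into exceptions_table;
```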

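Join your table with exceptions_table by rowid and delete the duplicates. Because exceptions_table lists every copy, exclude one row per key (here the one with the lowest rowid) so that a single copy survives:

delete original_dups where rowid in (select ROW_ID from exceptions_table)
  and rowid not in (select min(rowid) from original_dups group by dupfield1, dupfield2);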

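If the number of rows to delete is large, it is faster to create a new table (recreating all grants and indexes on it) by anti-joining with exceptions_table on rowid while keeping one row per duplicated key. Then rename the original table to original_dups and rename new_table_with_no_dups to the original table's name.
```
create table new_table_with_no_dups AS (
    select field1, field2 ........ 
    from original_dups t1
    where not exists ( select null from exceptions_table T2 where t1.rowid = t2.row_id )
       -- exceptions_table lists every copy of a duplicated key, so keep one of them
       or t1.rowid in ( select min(t3.rowid) from original_dups t3
                        where t3.rowid in ( select row_id from exceptions_table )
                        group by t3.dupfield1, t3.dupfield2 )
)
```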
