Delete Duplicate Rows in a Table

This note provides a method for deleting duplicate rows from a table in an Oracle database. The ROWID pseudocolumn is used to distinguish the rows so that one record in each group of duplicates is kept and the remaining copies are deleted. Several effective ways of detecting duplicate rows are also described.

Purpose:
========

The purpose of this article is to provide you with an example of how to
delete duplicate rows in a table.

 
Scope & Application:
====================

This article is most useful for Oracle Developers.


How to Delete Duplicate Rows in a Table:
========================================
 
1. Table dept_copy is a copy of the dept table and contains a few duplicate
   values.

   SQL> select * from dept_copy;
 

       DEPTNO DNAME          LOC
   ---------- -------------- -------------
           10 ACCOUNTING     NEW YORK
           20 RESEARCH       DALLAS
           30 SALES          CHICAGO
           40 OPERATIONS     BOSTON
           10 ACCOUNTING     NEW YORK
           20 RESEARCH       DALLAS
           30 SALES          CHICAGO
           40 OPERATIONS     BOSTON
           10 ACCOUNTING     NEW YORK
           20 RESEARCH       DALLAS
           30 SALES          CHICAGO
           40 OPERATIONS     BOSTON
           10 ACCOUNTING     NEW YORK
           20 RESEARCH       DALLAS
           30 SALES          CHICAGO
           40 OPERATIONS     BOSTON
 
   16 rows selected.
   SQL>


2. Delete the duplicate rows using the ROWID pseudocolumn, because the
   rowid is the only thing that distinguishes these rows.
 
   SQL> delete from dept_copy                                            
        where rowid in
        (select rowid from dept_copy
        minus
        select max(rowid) from dept_copy
        group by DEPTNO, DNAME, LOC);

   12 rows deleted.
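
   Note: an alternative sketch of the same delete, assuming a release where
   analytic functions are available (they appear in 8i Release 2), ranks the
   duplicates with ROW_NUMBER and keeps the row with the lowest rowid in
   each group:

        delete from dept_copy
        where rowid in
        (select rid
         from (select rowid rid,
                      row_number() over
                        (partition by deptno, dname, loc
                         order by rowid) rn    -- rn = 1 is the row to keep
               from dept_copy)
         where rn > 1);

   Whether the lowest or the highest rowid survives is arbitrary; either way
   exactly one row per group remains.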


3. Verify the table is correct by doing the following:

   SQL> select *  from dept_copy
        group by DEPTNO, DNAME, LOC having count(*) > 1;
 
   no rows selected

   Note that this query shows you another method to display duplicate rows.


References:
===========

Oracle 8i Release 2 (8.1.6).
SQL Reference, Volume 1 and Volume 2.
Note 65080.1 Using SQL to Delete Duplicate Rows in a Table.
Note 1015631.6 How to Select Duplicate Rows Without Using ROWID.
Note 1004425.6 How to Find or Delete Duplicate Rows in a Table.

Keywords
========
duplicate
delete
rows
ora-1

There's an Oracle-specific trick to delete the duplicates from a table.

Suppose your table FOO has the following fields: bar, baz, blat, blong, blork

To get rid of the dupes, you want to identify the rows that are identical on -all- the fields in the table. So the query to ID the dupes would be something like

select bar, baz, blat, blong, blork
from FOO
group by bar, baz, blat, blong, blork
having count(*) > 1 ;

So now you have a list of dupes. Fine and dandy. Now there's a neat trick you can use to get rid of all but 1 of each of these rows! Oracle provides a "pseudocolumn," ROWID, which is a representation of the physical location of each row and so is unique for each row in the table. So the following query gets you the rows that you want to keep:

select min(rowid), bar, baz, blat, blong, blork
from FOO
group by bar, baz, blat, blong, blork ;

For rows that AREN'T dupes, this will just return the row with its rowid. For rows that ARE dupes, this will return the row data along with the rowid for ONE copy of that row, no matter how many copies are in the table. Neat eh? So all you have to do is get rid of all the rows that AREN'T in this resultset:

delete from FOO
where rowid not in
  ( select min(rowid)
    from FOO
    group by bar, baz, blat, blong, blork
  )
;

To detect duplicates:
 select x.rowid, x.a, x.b, ...
 from <tbl> x
 where x.rowid > any
  (select y.rowid
   from <tbl> y
   where x.a = y.a);

To delete duplicates:
 delete from <tbl> x
 where x.rowid > any
  (select y.rowid
   from <tbl> y
   where y.a = x.a);
 

Other ways to detect duplicates:

 select <key cols>, count(*)
 from <tbl>
 group by <key cols>
 having count(*) > 1;

 select *
 from <tbl> a, <tbl> b
 where a.key = b.key
 and a.rowid != b.rowid;
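
For example, a sketch of the self-join form against the same dept_copy table
(all three columns taken as the key):

 select distinct a.*
 from dept_copy a, dept_copy b
 where a.deptno = b.deptno
 and   a.dname  = b.dname
 and   a.loc    = b.loc
 and   a.rowid != b.rowid;

DISTINCT collapses the output; without it each duplicate row is listed once
for every other copy it matches.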

 
