How do you delete rows with duplicate values in one column of a table?

License: CC 4.0 BY-SA

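This article covers four ways to remove duplicate records from a database table: copying out the distinct rows with the DISTINCT keyword, doing the same without DISTINCT, deleting duplicates with a DELETE statement run in a loop, and deleting every row that does not have the largest id in its group with a single statement. The fourth method is considered the best of the four.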

Take table T as an example:
id   name
1    aa
2    bb
3    cc
4    aa
5    aa
6    cc

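How can it be reduced to one row per name, keeping either the largest or the smallest id for each name? Keeping the smallest id gives:
id   name
1    aa
2    bb
3    cc
Keeping the largest id instead would leave rows 2 (bb), 5 (aa) and 6 (cc).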
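
To try the methods that follow, the sample table can be recreated with a script along these lines. This is only a sketch: the column types (int and varchar(10)) and the primary key are assumptions, since the post shows only the data, and the multi-row VALUES syntax needs SQL Server 2008 or later.

```sql
-- Sample table T; the column types are assumed, not taken from the post.
CREATE TABLE T (
    id   int         NOT NULL PRIMARY KEY,
    name varchar(10) NOT NULL
);

INSERT INTO T (id, name) VALUES
    (1, 'aa'), (2, 'bb'), (3, 'cc'),
    (4, 'aa'), (5, 'aa'), (6, 'cc');

-- List the names that occur more than once, with their row counts.
SELECT name, COUNT(*) AS row_count
FROM T
GROUP BY name
HAVING COUNT(*) > 1;
```

With the data above, the last query reports aa (3 rows) and cc (2 rows), which are the groups the methods have to reduce.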

1. Use the DISTINCT keyword to copy the distinct values into a temporary table, drop the original table, then reload it from the temporary table:
select distinct name into #temp from T
drop table T
select identity(int,1,1) as id, * into T from #temp
drop table #temp
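This works when the id in the original table is not referenced by any other table and the table has only the id and name columns, because the ids are regenerated starting from 1. It is also best kept to tables that do not hold too much data.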

2. Without the DISTINCT keyword, copy one row per name into a temporary table, drop the original table, then reload it from the temporary table:
select name into #temp from T where id in (select max(id) from T group by name)
drop table T
select identity(int,1,1) as id, * into T from #temp
drop table #temp

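3. Use a DELETE statement to remove the row with the largest id (or, using min, the smallest) from every group of rows that share a name and still has more than one row. Run it repeatedly until the number of affected rows is 0: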
delete from T where id in (select max(id) from T group by name having count(*)>1)
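
The repeated execution can be scripted rather than done by hand. The following is a minimal sketch, assuming SQL Server 2008 or later (for the inline DECLARE initialisation); the variable name @deleted is illustrative and not from the original post.

```sql
-- Repeat the method 3 DELETE until it stops removing rows.
DECLARE @deleted int = 1;  -- non-zero seed so the loop body runs at least once

WHILE @deleted > 0
BEGIN
    -- Remove the largest id from every name that still has duplicates.
    DELETE FROM T
    WHERE id IN (SELECT MAX(id) FROM T GROUP BY name HAVING COUNT(*) > 1);

    -- @@ROWCOUNT must be read immediately after the DELETE.
    SET @deleted = @@ROWCOUNT;
END;
```

On the sample table the first pass deletes ids 5 and 6, the second deletes id 4, and the third affects 0 rows and ends the loop, leaving ids 1, 2 and 3.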

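4. Use a DELETE statement again, this time removing every row except the one with the largest id (or, using min, the smallest) in each group of rows that share a name. Running it once is enough: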
delete from T where id not in (select max(id) from T group by name )
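
Because this statement removes all duplicates in one go, it can be worth previewing the affected rows first. The sketch below wraps the same DELETE in a transaction so the result can be inspected before it is made permanent; the preview and inspection queries are additions for illustration, not part of the original method.

```sql
BEGIN TRANSACTION;

-- Preview: the rows method 4 will delete (everything except the largest id per name).
SELECT id, name
FROM T
WHERE id NOT IN (SELECT MAX(id) FROM T GROUP BY name);

-- The method 4 statement itself.
DELETE FROM T
WHERE id NOT IN (SELECT MAX(id) FROM T GROUP BY name);

-- Inspect what is left.
SELECT id, name FROM T ORDER BY id;

ROLLBACK TRANSACTION;  -- change to COMMIT TRANSACTION to keep the result
```

On the sample table the preview lists ids 1, 3 and 4, and the rows left after the DELETE are 2 (bb), 5 (aa) and 6 (cc). Replacing MAX with MIN in both statements keeps ids 1, 2 and 3 instead.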

Method 4 seems like the best of these. I am not sure whether there are other good approaches.
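
One other approach often used on SQL Server 2005 and later is to number the rows within each name using ROW_NUMBER() and delete every row after the first. This sketch is not from the original post; the CTE name ranked and the column alias rn are illustrative.

```sql
-- Number the rows of each name from the largest id down, then delete all but the first.
WITH ranked AS (
    SELECT id,
           name,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
    FROM T
)
DELETE FROM ranked
WHERE rn > 1;
```

As written this keeps the largest id per name, like method 4. Changing ORDER BY id DESC to ORDER BY id keeps the smallest id instead. If the statement follows another statement in the same batch, the preceding statement has to end with a semicolon so that WITH is parsed as the start of a CTE.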