数据库去除重复数据

最新推荐文章于 2025-02-28 16:39:03 发布

原创最新推荐文章于 2025-02-28 16:39:03 发布 · 3.7k 阅读

6 ·

CC 4.0 BY-SA版权

sql 专栏收录该内容

2 篇文章

订阅专栏

背景：去除数据库中字段重复的数据

表t_bmk 包含三个字段 id,ksno,fenzu_code;现在去除t_bmk中ksno和fenzu_code两个字段重复的数据，步骤如下：

方法一：

1、根据重复字段用group by函数进行分组，查出所有重复的数据，用max函数获取每组重复数据最大的id
select fenzu_code,zhiwen_status,count(fenzu_code),max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)>1;

只取id:

select max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)>1;

2、查询出剩下的不重复的数据的id
select fenzu_code,zhiwen_status,count(fenzu_code),max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)=1;

只取id:

select id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)=1;

3、每一组重复数据的最大id加上不重复数据的id就是所有的不包含重复数据的id，根据剩下的id可以去掉重复数据：
delete from t_bmk where id not in(
select max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)>1 union
select max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)=1);

方法二：

只取id:

select max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)>1;

2、查询出所有重复数据的id

select id from t_bmk where (fenzu_code,zhiwen_status) in(
select fenzu_code,zhiwen_status from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)>1
);

3、从这些重复数据的id中去除，每组重复数据的最大id,就是应该删除数据的id
select id from t_bmk where (fenzu_code,zhiwen_status) in(
select fenzu_code,zhiwen_status from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)>1
) and id not in(select max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)>1);

4、直接根据id删除重复数据：
delete from t_bmk where id in(
select id from t_bmk where (fenzu_code,zhiwen_status) in(
select fenzu_code,zhiwen_status from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)>1
) and id not in(select max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status having count(1)>1));

方法三：

表t_bmk 包含三个字段 id,ksno,fenzu_code;现在去除t_bmk中ksno和fenzu_code两个字段重复的数据，步骤如下：

方法一：

取每一组的最大id:

select max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status;

2、每一组数据的最大id就是所有的不包含重复数据的id，根据剩下的id可以去掉重复数据：
delete from t_bmk where id not in(
select max(id) id from t_bmk where fenzu_code is not null group by fenzu_code,zhiwen_status);