MySql数据库:数据查重、去重的实现

数据库的查重去重是java面试经常会被问到的问题,下面讲讲怎么实现这些,因为本人能力有限,所以只管实现,不考虑效率问题;

假设有一个表user,字段分别有id–nick_name–password–email–phone,分情况如下(注意删除多余记录时要创建临时表,不然会报错):

单字段(nick_name)

查出所有有重复记录的所有记录

select * from user where nick_name in
     (select nick_name from user group by nick_name having count(nick_name)>1);

查出有重复记录的各个记录组中id最大的记录

select * from user where id in (select max(id) from user group by nick_name having count(nick_name)>1);

查出多余的记录,不查出id最小的记录

select * from user where nick_name in

(select nick_name from user group by nick_name having count(nick_name)>1)

and id not in

(select min(id) from user group by nick_name having count(nick_name)>1);

删除多余的重复记录,只保留id最小的记录

delete from user where nick_name in
     (select nick_name from

(select nick_name from user group by nick_name having count(nick_name)>1) as tmp1)

and id not in

(select id from

(select min(id) from user group by nick_name having count(nick_name)>1) as tmp2);

多字段(nick_name,password)

查出所有有重复记录的记录

select * from user where (nick_name,password) in

(select nick_name,password from user group by nick_name,password where having count(nick_name)>1);

查出有重复记录的各个记录组中id最大的记录

select * from user where id in

(select max(id) from user group by nick_name,password where having count(nick_name)>1);

查出各个重复记录组中多余的记录数据,不查出id最小的一条

select * from user where (nick_name,password) in

(select nick_name,password from user group by nick_name,password having count(nick_name)>1)

and id not in

(select min(id) from user group by nick_name,password having count(nick_name)>1);

删除多余的重复记录,只保留id最小的记录

delete from user where (nick_name,password) in

(select nick_name,password from

(select nick_name,password from user group by nick_name,password having count(nick_name)>1) as tmp1)

and id not in

(select id from

(select min(id) id from user group by nick_name,password having count(nick_name)>1) as tmp2);

作者:zilie972803341
来源:优快云
原文:https://blog.youkuaiyun.com/zilie972803341/article/details/78444703
版权声明:本文为博主原创文章,转载请附上博文链接!

### 数据库中检查字段复值或进行操作 在数据库管理过程中,检查字段复值以及执行操作是一项常见的需求。以下是针对不同数据库系统的具体方法。 #### MySQL 中的操作 对于 MySQL 数据库,可以利用 SQL 查询来查找并除逗号分隔字符串中的复项。通过 `FIND_IN_SET` 和 `GROUP_CONCAT` 函数组合使用,能够有效地完成这一目标[^1]: ```sql SELECT GROUP_CONCAT(DISTINCT FIND_IN_SET(substring, 'your_column') ORDER BY substring SEPARATOR ',') FROM ( SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(your_column, ',', numbers.n), ',', -1)) AS substring FROM your_table CROSS JOIN ( SELECT a.N + b.N * 10 + 1 n FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a, (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b ) numbers WHERE LENGTH(REPLACE(your_column, ',' , '')) <= LENGTH(your_column) - LENGTH(REPLACE(your_column, ',' , '')) AND numbers.n <= 1 + LENGTH(your_column) - LENGTH(REPLACE(your_column, ',' , '')) ) sub; ``` 上述查询会返回一个无复的逗号分隔字符串列表。 #### 使用 Redis 实现数据 Redis 供了一种高效的方式用于存储和检索唯一键值对。为了实现数据功能,可以通过哈希结构配合命令如 `HSET` 完成。当尝试向某个特定字段插入新值时,如果该字段已存在则不会覆盖原有记录而是保持不变[^2]: ```bash HSET myhash field value ``` 此命令会在名为 `myhash` 的散列里设定键为 `"field"` 值为 `"value"` 。只有当指定字段尚不存在于散列之中才会被创建出来;反之只会更新现有条目而无需担心引入冗余副本。 #### Oracle 删除多字段复记录 Oracle 数据库允许我们基于单个或者多个字段条件来定位那些具有相同属性集合却拥有不同 ROWID 的行,并从中挑选出唯一的保留对象加以保存其余全部清除掉。下面展示了一个例子说明怎样依据两个字段 (`tid`, `tname`) 找到所有出现次数超过一次的情况之后再进一步筛选排除最早录入的那个实例之外的所有其他版本: ```sql DELETE FROM test112101 t WHERE (t.tid, t.tname) IN ( SELECT tid, tname FROM test112101 t1 GROUP BY t1.tid, t1.tname HAVING COUNT(*) > 1 ) AND rowid NOT IN ( SELECT MIN(rowid) FROM test112101 GROUP BY tid, tname HAVING COUNT(1) > 1 ); ``` 这段脚本首先识别出了哪些 `(tid, tname)` 组合出现了多余的一次以上,接着剔除了这些组当中除最小ROWID外的一切成员[^3]。
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值