表DELETE删除124亿行数据,耗时不到1分钟!

在这里插入图片描述

问题现象:

ClickHouse数据库占用磁盘空间较高,经排查,业务数据占用很低,system系统库日志表很大,需要清理日志数据。
日志表包括:

metric_log,part_log,query_views_log,asynchronous_metric_log,query_log,trace_log。

问题分析:

clickhouse@CJC-DB-01:/clickhouse/9000/scripts$df -h /clickhouse/
Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/vg_data-lv_clickhouse   40G   29G   11G  75% /clickhouse

查看库大小

SELECT sum(rows) AS `总行数`, formatReadableSize(sum(data_uncompressed_bytes)) AS `原始大小`, formatReadableSize(sum(data_compressed_bytes)) AS `压缩大小`, round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS `压缩率` FROM system.parts;

查看表大小

SELECT table,database,sum(rows) AS `总行数`, formatReadableSize(sum(data_uncompressed_bytes)) AS `原始大小`, formatReadableSize(sum(data_compressed_bytes)) AS `压缩大小`, round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS `压缩率` FROM system.parts where database='system' group by table,database;

┌─table───────────────────┬─database─┬──────总行数─┬─原始大小───┬─压缩大小───┬─压缩率─┐
│ metric_log              │ system   │    36963799 │ 164.90 GiB │ 3.58 GiB   │      2 │
│ part_log                │ system   │    31767213 │ 14.04 GiB  │ 1.18 GiB   │      8 │
│ query_views_log         │ system   │    12654040 │ 9.14 GiB   │ 626.44 MiB │      7 │
│ asynchronous_metric_log │ system   │ 12445955729 │ 177.01 GiB │ 3.87 GiB   │      2 │
│ query_log               │ system   │    11389155 │ 11.49 GiB  │ 1.14 GiB   │     10 │
│ trace_log               │ system   │   581048589 │ 184.38 GiB │ 8.69 GiB   │      5 │
└─────────────────────────┴──────────┴─────────────┴────────────┴────────────┴────────┘

其中:asynchronous_metric_log表行数12445955729(124亿),原大小177GB,压缩大小3.87GB。

查询:

select count(*) from system.trace_log; ---580079339
select count(*) from system.asynchronous_metric_log; ---12289709359
select count(*) from system.part_log;  ---31704145
select count(*) from system.query_log;  ---11376778
select count(*) from system.metric_log; ---36814266
select count(*) from system.query_views_log; ---12653720

解决方案:

注意:仅供参考,请勿用于生产!
1.删除一个月前的所有数据,保留少量数据

## alter table system.trace_log delete where event_date < '2024-10-28';
## alter table system.query_log delete where event_date < '2024-10-28';
## alter table system.part_log delete where event_date < '2024-10-28';
## alter table system.asynchronous_metric_log delete where event_date < '2024-10-28';
## alter table system.metric_log delete where event_date < '2024-10-28';
## alter table system.query_views_log delete where event_date < '2024-10-28';

前台执行直接返回,后台异步执行,查看执行结果,删除速度很快,每张表基本不超过1分钟。

查看后台执行情况:

CJC-DB-01 :) select mutation_id,table,mutation_id,command,create_time,is_done from system.mutations;

SELECT
    mutation_id,
    table,
    mutation_id,
    command,
    create_time,
    is_done
FROM system.mutations

Query id: 968d5bc4-ceaf-4765-be0b-3dc91f3fcf37

┌─mutation_id──────────┬─table───────────────────┬─mutation_id──────────┬─command────────────────────────────────┬─────────create_time─┬─is_done─┐
│ mutation_5281938.txt │ asynchronous_metric_log │ mutation_5281938.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:14 │       1 │
│ mutation_4913986.txt │ metric_log              │ mutation_4913986.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:19 │       1 │
│ mutation_627962.txt  │ part_log                │ mutation_627962.txt  │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:30:49 │       1 │
│ mutation_928174.txt  │ query_log               │ mutation_928174.txt  │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:30:43 │       1 │
│ mutation_925037.txt  │ query_views_log         │ mutation_925037.txt  │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:25 │       1 │
│ mutation_4357949.txt │ trace_log               │ mutation_4357949.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:29:13 │       1 │
└──────────────────────┴─────────────────────────┴──────────────────────┴────────────────────────────────────────┴─────────────────────┴─────────┘

6 rows in set. Elapsed: 0.002 sec. 

添加日志表自动删除:

##ALTER TABLE trace_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE query_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE part_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE asynchronous_metric_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE metric_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE query_views_log MODIFY TTL event_date + toIntervalDay(30);

查看后台执行情况:

CJC-DB-01 :) select mutation_id,table,mutation_id,command,create_time,is_done from system.mutations;

SELECT
    mutation_id,
    table,
    mutation_id,
    command,
    create_time,
    is_done
FROM system.mutations

Query id: 849ee890-810f-42b1-9690-5e726969ccfc

┌─mutation_id──────────┬─table───────────────────┬─mutation_id──────────┬─command────────────────────────────────┬─────────create_time─┬─is_done─┐
│ mutation_5281938.txt │ asynchronous_metric_log │ mutation_5281938.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:14 │       1 │
│ mutation_5281999.txt │ asynchronous_metric_log │ mutation_5281999.txt │ MATERIALIZE TTL                        │ 2024-11-28 17:42:14 │       1 │
│ mutation_4913986.txt │ metric_log              │ mutation_4913986.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:19 │       1 │
│ mutation_4914064.txt │ metric_log              │ mutation_4914064.txt │ MATERIALIZE TTL                        │ 2024-11-28 17:44:54 │       1 │
│ mutation_627962.txt  │ part_log                │ mutation_627962.txt  │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:30:49 │       1 │
│ mutation_627973.txt  │ part_log                │ mutation_627973.txt  │ MATERIALIZE TTL                        │ 2024-11-28 17:42:07 │       1 │
│ mutation_928174.txt  │ query_log               │ mutation_928174.txt  │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:30:43 │       1 │
│ mutation_928200.txt  │ query_log               │ mutation_928200.txt  │ MATERIALIZE TTL                        │ 2024-11-28 17:40:37 │       1 │
│ mutation_925037.txt  │ query_views_log         │ mutation_925037.txt  │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:25 │       1 │
│ mutation_925053.txt  │ query_views_log         │ mutation_925053.txt  │ MATERIALIZE TTL                        │ 2024-11-28 17:45:29 │       1 │
│ mutation_4357949.txt │ trace_log               │ mutation_4357949.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:29:13 │       1 │
│ mutation_4358037.txt │ trace_log               │ mutation_4358037.txt │ MATERIALIZE TTL                        │ 2024-11-28 17:40:04 │       1 │
└──────────────────────┴─────────────────────────┴──────────────────────┴────────────────────────────────────────┴─────────────────────┴─────────┘

12 rows in set. Elapsed: 0.002 sec.

检查磁盘使用率,使用率由75%下降到27%。

root@CJC-DB-01:/root#df -h /clickhouse
/dev/mapper/vg_data-lv_clickhouse   40G   11G   29G  27% /clickhouse

###chenjuchao 20241203###
欢迎关注我的公众号《IT小Chen

评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值