
问题现象:
ClickHouse数据库占用磁盘空间较高,经排查,业务数据占用很低,system系统库日志表很大,需要清理日志数据。
日志表包括:
metric_log,part_log,query_views_log,asynchronous_metric_log,query_log,trace_log。
问题分析:
clickhouse@CJC-DB-01:/clickhouse/9000/scripts$df -h /clickhouse/
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_data-lv_clickhouse 40G 29G 11G 75% /clickhouse
查看库大小
SELECT sum(rows) AS `总行数`, formatReadableSize(sum(data_uncompressed_bytes)) AS `原始大小`, formatReadableSize(sum(data_compressed_bytes)) AS `压缩大小`, round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS `压缩率` FROM system.parts;
查看表大小
SELECT table,database,sum(rows) AS `总行数`, formatReadableSize(sum(data_uncompressed_bytes)) AS `原始大小`, formatReadableSize(sum(data_compressed_bytes)) AS `压缩大小`, round((sum(data_compressed_bytes) / sum(data_uncompressed_bytes)) * 100, 0) AS `压缩率` FROM system.parts where database='system' group by table,database;
┌─table───────────────────┬─database─┬──────总行数─┬─原始大小───┬─压缩大小───┬─压缩率─┐
│ metric_log │ system │ 36963799 │ 164.90 GiB │ 3.58 GiB │ 2 │
│ part_log │ system │ 31767213 │ 14.04 GiB │ 1.18 GiB │ 8 │
│ query_views_log │ system │ 12654040 │ 9.14 GiB │ 626.44 MiB │ 7 │
│ asynchronous_metric_log │ system │ 12445955729 │ 177.01 GiB │ 3.87 GiB │ 2 │
│ query_log │ system │ 11389155 │ 11.49 GiB │ 1.14 GiB │ 10 │
│ trace_log │ system │ 581048589 │ 184.38 GiB │ 8.69 GiB │ 5 │
└─────────────────────────┴──────────┴─────────────┴────────────┴────────────┴────────┘
其中:asynchronous_metric_log表行数12445955729(124亿),原大小177GB,压缩大小3.87GB。
查询:
select count(*) from system.trace_log; ---580079339
select count(*) from system.asynchronous_metric_log; ---12289709359
select count(*) from system.part_log; ---31704145
select count(*) from system.query_log; ---11376778
select count(*) from system.metric_log; ---36814266
select count(*) from system.query_views_log; ---12653720
解决方案:
注意:仅供参考,请勿用于生产!
1.删除一个月前的所有数据,保留少量数据
## alter table system.trace_log delete where event_date < '2024-10-28';
## alter table system.query_log delete where event_date < '2024-10-28';
## alter table system.part_log delete where event_date < '2024-10-28';
## alter table system.asynchronous_metric_log delete where event_date < '2024-10-28';
## alter table system.metric_log delete where event_date < '2024-10-28';
## alter table system.query_views_log delete where event_date < '2024-10-28';
前台执行直接返回,后台异步执行,查看执行结果,删除速度很快,每张表基本不超过1分钟。
查看后台执行情况:
CJC-DB-01 :) select mutation_id,table,mutation_id,command,create_time,is_done from system.mutations;
SELECT
mutation_id,
table,
mutation_id,
command,
create_time,
is_done
FROM system.mutations
Query id: 968d5bc4-ceaf-4765-be0b-3dc91f3fcf37
┌─mutation_id──────────┬─table───────────────────┬─mutation_id──────────┬─command────────────────────────────────┬─────────create_time─┬─is_done─┐
│ mutation_5281938.txt │ asynchronous_metric_log │ mutation_5281938.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:14 │ 1 │
│ mutation_4913986.txt │ metric_log │ mutation_4913986.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:19 │ 1 │
│ mutation_627962.txt │ part_log │ mutation_627962.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:30:49 │ 1 │
│ mutation_928174.txt │ query_log │ mutation_928174.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:30:43 │ 1 │
│ mutation_925037.txt │ query_views_log │ mutation_925037.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:25 │ 1 │
│ mutation_4357949.txt │ trace_log │ mutation_4357949.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:29:13 │ 1 │
└──────────────────────┴─────────────────────────┴──────────────────────┴────────────────────────────────────────┴─────────────────────┴─────────┘
6 rows in set. Elapsed: 0.002 sec.
添加日志表自动删除:
##ALTER TABLE trace_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE query_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE part_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE asynchronous_metric_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE metric_log MODIFY TTL event_date + toIntervalDay(30);
##ALTER TABLE query_views_log MODIFY TTL event_date + toIntervalDay(30);
查看后台执行情况:
CJC-DB-01 :) select mutation_id,table,mutation_id,command,create_time,is_done from system.mutations;
SELECT
mutation_id,
table,
mutation_id,
command,
create_time,
is_done
FROM system.mutations
Query id: 849ee890-810f-42b1-9690-5e726969ccfc
┌─mutation_id──────────┬─table───────────────────┬─mutation_id──────────┬─command────────────────────────────────┬─────────create_time─┬─is_done─┐
│ mutation_5281938.txt │ asynchronous_metric_log │ mutation_5281938.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:14 │ 1 │
│ mutation_5281999.txt │ asynchronous_metric_log │ mutation_5281999.txt │ MATERIALIZE TTL │ 2024-11-28 17:42:14 │ 1 │
│ mutation_4913986.txt │ metric_log │ mutation_4913986.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:19 │ 1 │
│ mutation_4914064.txt │ metric_log │ mutation_4914064.txt │ MATERIALIZE TTL │ 2024-11-28 17:44:54 │ 1 │
│ mutation_627962.txt │ part_log │ mutation_627962.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:30:49 │ 1 │
│ mutation_627973.txt │ part_log │ mutation_627973.txt │ MATERIALIZE TTL │ 2024-11-28 17:42:07 │ 1 │
│ mutation_928174.txt │ query_log │ mutation_928174.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:30:43 │ 1 │
│ mutation_928200.txt │ query_log │ mutation_928200.txt │ MATERIALIZE TTL │ 2024-11-28 17:40:37 │ 1 │
│ mutation_925037.txt │ query_views_log │ mutation_925037.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:35:25 │ 1 │
│ mutation_925053.txt │ query_views_log │ mutation_925053.txt │ MATERIALIZE TTL │ 2024-11-28 17:45:29 │ 1 │
│ mutation_4357949.txt │ trace_log │ mutation_4357949.txt │ DELETE WHERE event_date < '2024-10-28' │ 2024-11-28 17:29:13 │ 1 │
│ mutation_4358037.txt │ trace_log │ mutation_4358037.txt │ MATERIALIZE TTL │ 2024-11-28 17:40:04 │ 1 │
└──────────────────────┴─────────────────────────┴──────────────────────┴────────────────────────────────────────┴─────────────────────┴─────────┘
12 rows in set. Elapsed: 0.002 sec.
检查磁盘使用率,使用率由75%下降到27%。
root@CJC-DB-01:/root#df -h /clickhouse
/dev/mapper/vg_data-lv_clickhouse 40G 11G 29G 27% /clickhouse
###chenjuchao 20241203###
欢迎关注我的公众号《IT小Chen》
5427

被折叠的 条评论
为什么被折叠?



