Explain Plans
Explain
https://www.postgresql.org/docs/10/sql-explain.html
Basic Usage
Uses the planner to produce the query's execution plan and estimate its execution cost.
EXPLAIN only estimates the plan without executing the SQL; it also does not account for the network cost of transmitting results to the client.
explain select * from crm_t_archive_person
# Result
Seq Scan on crm_t_archive_person (cost=0.00..3059.75 rows=53675 width=2739)
- Seq Scan: the table is read sequentially (full table scan)
- 0.00: startup cost, the cost incurred before this plan node can begin producing output
- 3059.75: total cost, the cost when the node has finished; total cost minus startup cost is the cost consumed by this node itself
- 53675: estimated number of rows output by this node
- 2739: estimated average width of the output rows, in bytes
The ANALYZE Option
Disabled by default.
Estimates the plan and actually executes the SQL, recording the real run-time figures; the network cost of transmitting results is still not included. Because the statement really runs, take care with data-modifying statements (see the sketch after the field list below).
explain analyze select * from crm_t_archive_person where cl_name_cn like '%AND%'
# Result
Seq Scan on crm_t_archive_person (cost=0.00..1448.71 rows=1 width=827) (actual time=0.813..3.560 rows=2 loops=1)
Filter: ((cl_name_cn)::text ~~ '%AND%'::text)
Rows Removed by Filter: 5735
Planning time: 0.195 ms
Execution time: 3.597 ms
- actual time=0.813..3.560: 0.813 is the actual startup time in milliseconds; 3.560 is the actual total time in milliseconds
- rows: actual number of rows output by this node
- loops: number of times this node was actually executed
- Rows Removed by Filter: number of rows actually removed by the filter condition
- Planning time: time spent planning the query
- Execution time: time spent executing the query
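Since ANALYZE really executes the statement, a data-modifying statement would change the table. The usual precaution, shown here as a minimal sketch (the table and column names are placeholders, not taken from this document), is to wrap it in a transaction and roll back:
BEGIN;
EXPLAIN ANALYZE UPDATE some_table SET some_col = 0 WHERE some_id = '1';  -- placeholder UPDATE: it executes, so the timings are real
ROLLBACK;  -- undo the data change made during the measurement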
The BUFFERS Option
Disabled by default; on PostgreSQL 10 it can only be enabled together with ANALYZE (usage sketch after the lists below).
What it reports
- Shared blocks hit, read, dirtied, and written; shared blocks hold the data of regular tables and indexes
- Local blocks hit, read, dirtied, and written; local blocks hold the data of temporary tables and their indexes
- Temp blocks read and written; temp blocks hold short-lived data used by sorts, hashes, Materialize nodes, and similar operations
Block operations
- hit: the block was found in the cache, so no disk read was needed
- dirtied: number of previously clean cached blocks that this query modified (marked dirty)
- written: number of previously dirtied blocks that this query evicted from the cache (wrote out) while it ran
- read: the block had to be read directly from disk
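A minimal sketch against the table used above; the Buffers line shows only the shape of the output, since the actual counts depend on the cache state:
explain (ANALYZE, BUFFERS) select * from crm_t_archive_person where cl_name_cn like '%AND%'
# Result (shape only)
Seq Scan on crm_t_archive_person (cost=...) (actual time=... rows=... loops=1)
  Filter: ((cl_name_cn)::text ~~ '%AND%'::text)
  Buffers: shared hit=... read=...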
Other Options
Option | Default | Description |
---|---|---|
VERBOSE | FALSE | Shows additional detail, such as the output column list of each node |
COSTS | TRUE | Shows each node's startup cost, total cost, estimated row count, and row width |
TIMING | TRUE | Only takes effect with ANALYZE; outputs the actual startup time and total time spent in each node |
SUMMARY | FALSE (TRUE when ANALYZE is used) | Outputs summary information such as total planning and execution time |
FORMAT | TEXT | Output format: TEXT, XML, JSON, or YAML |
explain (VERBOSE,BUFFERS FALSE) select * from crm_t_archive_person where cl_name_cn like '%AND%'
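As a further sketch of the option syntax, FORMAT switches the plan to a machine-readable document (the JSON output itself is not reproduced here):
explain (FORMAT JSON) select * from crm_t_archive_person where cl_name_cn like '%AND%'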
Single-Table Node Types
Aggregation
HashAggregate
Hash aggregation builds a hash table keyed on the GROUP BY values, which allows the aggregates to be accumulated quickly.
EXPLAIN select count(1) from inv_t_invoice group by cl_sale_unit
# Result
HashAggregate (cost=2323.58..2323.59 rows=1 width=57)
Group Key: cl_sale_unit
-> Seq Scan on inv_t_invoice (cost=0.00..2278.72 rows=8972 width=49)
GroupAggregate
The rows are first sorted by the GROUP BY key, then aggregated while scanning the sorted stream.
EXPLAIN select count(1) from pms_t_cash_register group by cl_id
# Result
Finalize GroupAggregate (cost=293888.00..435273.53 rows=1581285 width=41)
Group Key: cl_id
-> Gather Merge (cost=293888.00..414809.84 rows=930168 width=41)
Workers Planned: 1
-> Partial GroupAggregate (cost=292887.99..309165.93 rows=930168 width=41)
Group Key: cl_id
-> Sort (cost=292887.99..295213.41 rows=930168 width=33)
Sort Key: cl_id
-> Parallel Seq Scan on pms_t_cash_register (cost=0.00..175240.68 rows=930168 width=33)
- GroupAggregate uses less memory; pms_t_cash_register is a large table and hashing all of its groups would consume a lot of memory, so the planner chose GroupAggregate
- HashAggregate is faster; inv_t_invoice is a small table, so the memory needed for the hash table is not an issue and the planner chose HashAggregate (to compare the two strategies on one query, see the sketch below)
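To see the alternative strategy for the same query, the planner can be steered for the current session with the enable_hashagg setting; this is a diagnostic sketch, not a production configuration:
SET enable_hashagg = off;  -- discourage HashAggregate so the planner falls back to sort + GroupAggregate
EXPLAIN select count(1) from inv_t_invoice group by cl_sale_unit;
RESET enable_hashagg;      -- restore the default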
Seq Scan
Sequential scan of the whole table (full table scan).
EXPLAIN select * from crm_t_archive_person
# Result
Seq Scan on crm_t_archive_person (cost=0.00..3059.75 rows=53675 width=2739)
Index Scan
Index scan: the index is searched for the TIDs (tuple IDs) of the matching rows, and each TID is then used to fetch the corresponding row from the table.
EXPLAIN select * from crm_t_archive_person where cl_id = ''
# Result
Index Only Scan using crm_t_archive_person_pkey on crm_t_archive_person (cost=0.28..8.30 rows=1 width=36)
Index Cond: (cl_id = ''::text)
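The plan captured above happens to be an Index Only Scan; a plain Index Scan node usually looks like the following sketch (using the tenk1 regression-test table from the later examples, with cost figures omitted because they depend on the data):
EXPLAIN SELECT * FROM tenk1 WHERE unique1 = 42;
# Result (shape only)
Index Scan using tenk1_unique1 on tenk1 (cost=... rows=1 width=...)
  Index Cond: (unique1 = 42)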
Bitmap Index Scan
Bitmap index scan: the index is first searched for the TIDs of all matching rows, the resulting bitmap is sorted into heap order, and the rows are then fetched from the table by the Bitmap Heap Scan node.
explain select * from crm_t_archive_person where cl_id < '1'
# Result
Bitmap Heap Scan on crm_t_archive_person (cost=47.37..919.51 rows=398 width=827)
Recheck Cond: ((cl_id)::text < '1'::text)
-> Bitmap Index Scan on crm_t_archive_person_pkey (cost=0.00..47.27 rows=398 width=0)
Index Cond: ((cl_id)::text < '1'::text)
Index Only Scan
All the requested data can be returned from the index alone, so the table (heap) does not need to be visited.
explain select count(1) from crm_t_archive_person where cl_id = ''
# Result
Aggregate (cost=8.30..8.31 rows=1 width=8)
-> Index Only Scan using crm_t_archive_person_pkey on crm_t_archive_person (cost=0.28..8.30 rows=1 width=0)
Index Cond: (cl_id = ''::text)
BitmapAnd
Combines the results (bitmaps) of multiple index scans with AND.
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;
# Result
Bitmap Heap Scan on tenk1 (cost=25.08..60.21 rows=10 width=244)
Recheck Cond: ((unique1 < 100) AND (unique2 > 9000))
-> BitmapAnd (cost=25.08..25.08 rows=10 width=0)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0)
Index Cond: (unique1 < 100)
-> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0)
Index Cond: (unique2 > 9000)
BitmapOr
Combines the results (bitmaps) of multiple index scans with OR.
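No plan was captured for this case, so the following is a minimal sketch in the same style, assuming the tenk1 table used above; cost and row estimates are omitted because they depend on the data:
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 OR unique2 > 9000;
# Result (shape only)
Bitmap Heap Scan on tenk1 (cost=... rows=... width=...)
  Recheck Cond: ((unique1 < 100) OR (unique2 > 9000))
  -> BitmapOr (cost=... rows=... width=0)
      -> Bitmap Index Scan on tenk1_unique1 (cost=... rows=... width=0)
          Index Cond: (unique1 < 100)
      -> Bitmap Index Scan on tenk1_unique2 (cost=... rows=... width=0)
          Index Cond: (unique2 > 9000)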
Join Types
Nested Loop
For each row of the outer relation A, the inner relation B is scanned for matching rows.
Complexity is O(A×B).
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
# Result
Nested Loop (cost=4.65..118.62 rows=10 width=488)
-> Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.47 rows=10 width=244)
Recheck Cond: (unique1 < 10)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0)
Index Cond: (unique1 < 10)
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.91 rows=1 width=244)
Index Cond: (unique2 = t1.unique2)
A Nested Loop may include a Materialize node: the sub-plan beneath it is executed only once and its result set is materialized (stored) so that later iterations of the loop can rescan it cheaply.
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t2.unique2 < 10 AND t1.hundred < t2.hundred;
# Result
Nested Loop (cost=4.65..49.46 rows=33 width=488)
Join Filter: (t1.hundred < t2.hundred)
-> Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.47 rows=10 width=244)
Recheck Cond: (unique1 < 10)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0)
Index Cond: (unique1 < 10)
-> Materialize (cost=0.29..8.51 rows=10 width=244)
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..8.46 rows=10 width=244)
Index Cond: (unique2 < 10)
Hash Join
When the join condition is an equality, the planner may build a hash table over one of the two inputs (here the smaller result set from tenk1) and probe it with the other, which makes the comparisons fast; complexity is roughly O(A) + O(B).
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
# Result
Hash Join (cost=230.47..713.98 rows=101 width=488)
Hash Cond: (t2.unique2 = t1.unique2)
-> Seq Scan on tenk2 t2 (cost=0.00..445.00 rows=10000 width=244)
-> Hash (cost=229.20..229.20 rows=101 width=244)
-> Bitmap Heap Scan on tenk1 t1 (cost=5.07..229.20 rows=101 width=244)
Recheck Cond: (unique1 < 100)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0)
Index Cond: (unique1 < 100)
Merge Join
Both inputs A and B are first sorted (or read in sorted order from an index), then merged to perform the join.
Used for equality join conditions (merge-joinable operators); the pre-sorted inputs are what make the merge possible.
EXPLAIN SELECT *
FROM tenk1 t1, onek t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
# Result
Merge Join (cost=198.11..268.19 rows=10 width=488)
Merge Cond: (t1.unique2 = t2.unique2)
-> Index Scan using tenk1_unique2 on tenk1 t1 (cost=0.29..656.28 rows=101 width=244)
Filter: (unique1 < 100)
-> Sort (cost=197.83..200.33 rows=1000 width=244)
Sort Key: t2.unique2
-> Seq Scan on onek t2 (cost=0.00..148.00 rows=1000 width=244)
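To compare these join strategies on one query, the planner can likewise be steered per session with the enable_* settings; a diagnostic sketch only:
SET enable_hashjoin = off;   -- rule out Hash Join
SET enable_mergejoin = off;  -- rule out Merge Join, leaving Nested Loop as the likely choice
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;
RESET enable_hashjoin;
RESET enable_mergejoin;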