Postgres10高性能开发(2)解释计划

本文深入探讨PostgreSQL的查询优化技巧,包括如何使用EXPLAIN分析SQL执行计划,理解SeqScan、IndexScan等扫描类型,以及HashJoin、MergeJoin等连接策略。通过具体案例,解析不同查询条件下最优的执行路径。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

解释计划

Explain

https://www.postgresql.org/docs/10/sql-explain.html

基本使用

利用规划器评估解释计划及其执行成本

explain不执行SQL只进行评估,同时不会评估数据传输的网络开销

explain select * from crm_t_archive_person

# Result
Seq Scan on crm_t_archive_person  (cost=0.00..3059.75 rows=53675 width=2739)
  • Seq Scan:表示对表顺序扫描
  • 0.00:启动成本,SQL段执行开始前已消耗的成本
  • 3059.72:总费用,SQL段执行完成总消耗成本,总费用-启动成本=本SQL段自己消耗的成本
  • 53675:预计本SQL段输出的行数
  • 2739: 预计输出行的平均宽度(单位:byte)
ANALYZE选项

默认关闭

评估并执行SQL记录真实运行结果,不会评估数据传输的网络开销

explain analyze select * from crm_t_archive_person where cl_name_cn like '%AND%'

# Result
Seq Scan on crm_t_archive_person  (cost=0.00..1448.71 rows=1 width=827) (actual time=0.813..3.560 rows=2 loops=1)
  Filter: ((cl_name_cn)::text ~~ '%AND%'::text)
  Rows Removed by Filter: 5735
Planning time: 0.195 ms
Execution time: 3.597 ms
  • actual time=0.026…5.354:0.026为实际启动时间(毫秒),5.354为实际总时间(毫秒)
  • rows:实际本SQL段输出的行数
  • loops:本SQL段实际执行次数
  • Rows Removed by Filter:查询条件实际过滤掉的行数
  • Planning time:规划耗时
  • Execution time: 执行耗时
BUFFFERS选项

默认关闭

显示说明

  • 共享块(shared blocks)命中、读取、弄脏、写入的数量,共享块保存表和索引的数据
  • 本地块(local blocks)命中、读取、弄脏、写入的数量,本地块保存临时表和索引数据
  • 临时块(temp blocks)读取和写入数量,临时块保存短期生成的排序、Hash、Materialize等数据

块操作说明

  • 命中(hits):从cache中读取,避免了从磁盘直接读取。
  • 弄脏(dirtied):SQL修改了数据使Cache块失效的数量
  • 写入(written):SQL从cache中逐出先前脏块的数量
  • 读取(read):直接从磁盘读取数据
其他选项
选项默认值说明
VERBOSEFALSE显示每个节点输出的列名等
COSTTRUE显示每个节点启动成本、总成本、行数、行宽度
TIMINGTRUEANALYZE下才能启用,输出每个节点花费的实际启动时间和总时间
SUMMARYFALSE,在ANALYZE下默认为TRUE输出摘要信息,如总时间
FORMATTEXT指定输出格式,包括TEXT、XML、JSON、YAML
explain (VERBOSE,BUFFERS FALSE) select * from crm_t_archive_person where cl_name_cn like '%AND%'

单表扫描类型

聚合

HashAggregate

hash聚合根据每个group字段值生成一个hash结果,来实现快速统计

EXPLAIN select count(1) from inv_t_invoice group by cl_sale_unit

# Result
HashAggregate  (cost=2323.58..2323.59 rows=1 width=57)
  Group Key: cl_sale_unit
  ->  Seq Scan on inv_t_invoice  (cost=0.00..2278.72 rows=8972 width=49)
GroupAggregate

根据group字段值排序,然后进行统计

EXPLAIN select count(1) from pms_t_cash_register group by cl_id

# Result
Finalize GroupAggregate  (cost=293888.00..435273.53 rows=1581285 width=41)
  Group Key: cl_id
  ->  Gather Merge  (cost=293888.00..414809.84 rows=930168 width=41)
        Workers Planned: 1
        ->  Partial GroupAggregate  (cost=292887.99..309165.93 rows=930168 width=41)
              Group Key: cl_id
              ->  Sort  (cost=292887.99..295213.41 rows=930168 width=33)
                    Sort Key: cl_id
                    ->  Parallel Seq Scan on pms_t_cash_register  (cost=0.00..175240.68 rows=930168 width=33)
  • 内存消耗GroupAggregate更少,表pms_t_cash_register数据量较大,内存消耗会很多,因此采用了GroupAggregate
  • 效率上HashAggregate更快,表inv_t_invoice数据量较小,内存消耗问题不大,因此采用HashAggregate

Seq Scan

全表扫描

EXPLAIN select * from crm_t_archive_person

# Result
Seq Scan on crm_t_archive_person  (cost=0.00..3059.75 rows=53675 width=2739)

Index Scan

索引扫描,先通过索引找出符合条件的TID(数据行ID),在通过RowID去表中查找数据

EXPLAIN select * from crm_t_archive_person where cl_id = ''

# Result
Index Only Scan using crm_t_archive_person_pkey on crm_t_archive_person  (cost=0.28..8.30 rows=1 width=36)
  Index Cond: (cl_id = ''::text)

Bitmap Index Scan

位图索引扫描,先通过索引找出所有符合条件的TID(数据行ID),并进行排序,然后去表中查找数据

explain select * from crm_t_archive_person where cl_id < '1'

# Result
Bitmap Heap Scan on crm_t_archive_person  (cost=47.37..919.51 rows=398 width=827)
  Recheck Cond: ((cl_id)::text < '1'::text)
  ->  Bitmap Index Scan on crm_t_archive_person_pkey  (cost=0.00..47.27 rows=398 width=0)
        Index Cond: ((cl_id)::text < '1'::text)

Index Only Scan

通过索引就能查到想要的数据,不需要再访问表

explain select count(1) from crm_t_archive_person where cl_id = ''

# Result
Aggregate  (cost=8.30..8.31 rows=1 width=8)
  ->  Index Only Scan using crm_t_archive_person_pkey on crm_t_archive_person  (cost=0.28..8.30 rows=1 width=0)
        Index Cond: (cl_id = ''::text)

BitmapAnd

多索引结果And

EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;

# Result
Bitmap Heap Scan on tenk1  (cost=25.08..60.21 rows=10 width=244)
   Recheck Cond: ((unique1 < 100) AND (unique2 > 9000))
   ->  BitmapAnd  (cost=25.08..25.08 rows=10 width=0)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
               Index Cond: (unique1 < 100)
         ->  Bitmap Index Scan on tenk1_unique2  (cost=0.00..19.78 rows=999 width=0)
               Index Cond: (unique2 > 9000)

BitmapOr

多索引结果Or

连表索引

Nested Loop

先遍历A,再遍历B
性能为O(A*B)

EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;

# Result
 Nested Loop  (cost=4.65..118.62 rows=10 width=488)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.47 rows=10 width=244)
         Recheck Cond: (unique1 < 10)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0)
               Index Cond: (unique1 < 10)
   ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.91 rows=1 width=244)
         Index Cond: (unique2 = t1.unique2)

Nested Loop中可能会用到Materialize,Materialize表示该查询段只会执行一次,对结果进行物化存储用于后续处理

EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t2.unique2 < 10 AND t1.hundred < t2.hundred;

# Result
 Nested Loop  (cost=4.65..49.46 rows=33 width=488)
   Join Filter: (t1.hundred < t2.hundred)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.47 rows=10 width=244)
         Recheck Cond: (unique1 < 10)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0)
               Index Cond: (unique1 < 10)
   ->  Materialize  (cost=0.29..8.51 rows=10 width=244)
         ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..8.46 rows=10 width=244)
               Index Cond: (unique2 < 10)

Hash Join

联表条件为等于时,有可能将两个结果集(A,B)进行Hash,Hash后能够加快比较速度,性能O(A)+O(B)

EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;

# Result
 Hash Join  (cost=230.47..713.98 rows=101 width=488)
   Hash Cond: (t2.unique2 = t1.unique2)
   ->  Seq Scan on tenk2 t2  (cost=0.00..445.00 rows=10000 width=244)
   ->  Hash  (cost=229.20..229.20 rows=101 width=244)
         ->  Bitmap Heap Scan on tenk1 t1  (cost=5.07..229.20 rows=101 width=244)
               Recheck Cond: (unique1 < 100)
               ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
                     Index Cond: (unique1 < 100)

Merge Join

先对结果集A,B进行排序,再进行联表操作
可用于等于和范围联表条件

EXPLAIN SELECT *
FROM tenk1 t1, onek t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;

# Result
 Merge Join  (cost=198.11..268.19 rows=10 width=488)
   Merge Cond: (t1.unique2 = t2.unique2)
   ->  Index Scan using tenk1_unique2 on tenk1 t1  (cost=0.29..656.28 rows=101 width=244)
         Filter: (unique1 < 100)
   ->  Sort  (cost=197.83..200.33 rows=1000 width=244)
         Sort Key: t2.unique2
         ->  Seq Scan on onek t2  (cost=0.00..148.00 rows=1000 width=244)
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值