Postgres10高性能开发（2）解释计划_本地块和临时块的区别-优快云博客

本文链接：https://blog.youkuaiyun.com/supermancoke/article/details/104613418

本文深入探讨PostgreSQL的查询优化技巧，包括如何使用EXPLAIN分析SQL执行计划，理解SeqScan、IndexScan等扫描类型，以及HashJoin、MergeJoin等连接策略。通过具体案例，解析不同查询条件下最优的执行路径。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

解释计划

Explain

https://www.postgresql.org/docs/10/sql-explain.html

基本使用

利用规划器评估解释计划及其执行成本

explain不执行SQL只进行评估，同时不会评估数据传输的网络开销

explain select * from crm_t_archive_person

# Result
Seq Scan on crm_t_archive_person  (cost=0.00..3059.75 rows=53675 width=2739)

Seq Scan：表示对表顺序扫描
0.00：启动成本，SQL段执行开始前已消耗的成本
3059.72：总费用，SQL段执行完成总消耗成本，总费用-启动成本=本SQL段自己消耗的成本
53675：预计本SQL段输出的行数
2739: 预计输出行的平均宽度（单位：byte）

ANALYZE选项

默认关闭

评估并执行SQL记录真实运行结果，不会评估数据传输的网络开销

explain analyze select * from crm_t_archive_person where cl_name_cn like '%AND%'

# Result
Seq Scan on crm_t_archive_person  (cost=0.00..1448.71 rows=1 width=827) (actual time=0.813..3.560 rows=2 loops=1)
  Filter: ((cl_name_cn)::text ~~ '%AND%'::text)
  Rows Removed by Filter: 5735
Planning time: 0.195 ms
Execution time: 3.597 ms

actual time=0.026…5.354：0.026为实际启动时间（毫秒），5.354为实际总时间（毫秒）
rows：实际本SQL段输出的行数
loops：本SQL段实际执行次数
Rows Removed by Filter：查询条件实际过滤掉的行数
Planning time：规划耗时
Execution time：执行耗时

BUFFFERS选项

默认关闭

显示说明

共享块（shared blocks）命中、读取、弄脏、写入的数量，共享块保存表和索引的数据
本地块（local blocks）命中、读取、弄脏、写入的数量，本地块保存临时表和索引数据
临时块（temp blocks）读取和写入数量，临时块保存短期生成的排序、Hash、Materialize等数据

块操作说明

命中（hits）：从cache中读取，避免了从磁盘直接读取。
弄脏（dirtied）：SQL修改了数据使Cache块失效的数量
写入（written）：SQL从cache中逐出先前脏块的数量
读取（read）：直接从磁盘读取数据

其他选项

选项	默认值	说明
VERBOSE	FALSE	显示每个节点输出的列名等
COST	TRUE	显示每个节点启动成本、总成本、行数、行宽度
TIMING	TRUE	ANALYZE下才能启用，输出每个节点花费的实际启动时间和总时间
SUMMARY	FALSE，在ANALYZE下默认为TRUE	输出摘要信息，如总时间
FORMAT	TEXT	指定输出格式，包括TEXT、XML、JSON、YAML

explain (VERBOSE,BUFFERS FALSE) select * from crm_t_archive_person where cl_name_cn like '%AND%'

单表扫描类型

聚合

HashAggregate

hash聚合根据每个group字段值生成一个hash结果，来实现快速统计

EXPLAIN select count(1) from inv_t_invoice group by cl_sale_unit

# Result
HashAggregate  (cost=2323.58..2323.59 rows=1 width=57)
  Group Key: cl_sale_unit
  ->  Seq Scan on inv_t_invoice  (cost=0.00..2278.72 rows=8972 width=49)

GroupAggregate

根据group字段值排序，然后进行统计

EXPLAIN select count(1) from pms_t_cash_register group by cl_id

# Result
Finalize GroupAggregate  (cost=293888.00..435273.53 rows=1581285 width=41)
  Group Key: cl_id
  ->  Gather Merge  (cost=293888.00..414809.84 rows=930168 width=41)
        Workers Planned: 1
        ->  Partial GroupAggregate  (cost=292887.99..309165.93 rows=930168 width=41)
              Group Key: cl_id
              ->  Sort  (cost=292887.99..295213.41 rows=930168 width=33)
                    Sort Key: cl_id
                    ->  Parallel Seq Scan on pms_t_cash_register  (cost=0.00..175240.68 rows=930168 width=33)

内存消耗GroupAggregate更少，表pms_t_cash_register数据量较大，内存消耗会很多，因此采用了GroupAggregate
效率上HashAggregate更快，表inv_t_invoice数据量较小，内存消耗问题不大，因此采用HashAggregate

Seq Scan

全表扫描

EXPLAIN select * from crm_t_archive_person

# Result
Seq Scan on crm_t_archive_person  (cost=0.00..3059.75 rows=53675 width=2739)

Index Scan

索引扫描，先通过索引找出符合条件的TID(数据行ID)，在通过RowID去表中查找数据

EXPLAIN select * from crm_t_archive_person where cl_id = ''

# Result
Index Only Scan using crm_t_archive_person_pkey on crm_t_archive_person  (cost=0.28..8.30 rows=1 width=36)
  Index Cond: (cl_id = ''::text)

Bitmap Index Scan

位图索引扫描，先通过索引找出所有符合条件的TID(数据行ID)，并进行排序，然后去表中查找数据

explain select * from crm_t_archive_person where cl_id < '1'

# Result
Bitmap Heap Scan on crm_t_archive_person  (cost=47.37..919.51 rows=398 width=827)
  Recheck Cond: ((cl_id)::text < '1'::text)
  ->  Bitmap Index Scan on crm_t_archive_person_pkey  (cost=0.00..47.27 rows=398 width=0)
        Index Cond: ((cl_id)::text < '1'::text)

Index Only Scan

通过索引就能查到想要的数据，不需要再访问表

explain select count(1) from crm_t_archive_person where cl_id = ''

# Result
Aggregate  (cost=8.30..8.31 rows=1 width=8)
  ->  Index Only Scan using crm_t_archive_person_pkey on crm_t_archive_person  (cost=0.28..8.30 rows=1 width=0)
        Index Cond: (cl_id = ''::text)

BitmapAnd

多索引结果And

EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000;

# Result
Bitmap Heap Scan on tenk1  (cost=25.08..60.21 rows=10 width=244)
   Recheck Cond: ((unique1 < 100) AND (unique2 > 9000))
   ->  BitmapAnd  (cost=25.08..25.08 rows=10 width=0)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
               Index Cond: (unique1 < 100)
         ->  Bitmap Index Scan on tenk1_unique2  (cost=0.00..19.78 rows=999 width=0)
               Index Cond: (unique2 > 9000)

BitmapOr

多索引结果Or

连表索引

Nested Loop

先遍历A，再遍历B
性能为O(A*B)

EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;

# Result
 Nested Loop  (cost=4.65..118.62 rows=10 width=488)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.47 rows=10 width=244)
         Recheck Cond: (unique1 < 10)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0)
               Index Cond: (unique1 < 10)
   ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.91 rows=1 width=244)
         Index Cond: (unique2 = t1.unique2)

Nested Loop中可能会用到Materialize,Materialize表示该查询段只会执行一次，对结果进行物化存储用于后续处理

EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t2.unique2 < 10 AND t1.hundred < t2.hundred;

# Result
 Nested Loop  (cost=4.65..49.46 rows=33 width=488)
   Join Filter: (t1.hundred < t2.hundred)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.47 rows=10 width=244)
         Recheck Cond: (unique1 < 10)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0)
               Index Cond: (unique1 < 10)
   ->  Materialize  (cost=0.29..8.51 rows=10 width=244)
         ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..8.46 rows=10 width=244)
               Index Cond: (unique2 < 10)

Hash Join

联表条件为等于时，有可能将两个结果集(A,B)进行Hash，Hash后能够加快比较速度，性能O(A)+O(B)

EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;

# Result
 Hash Join  (cost=230.47..713.98 rows=101 width=488)
   Hash Cond: (t2.unique2 = t1.unique2)
   ->  Seq Scan on tenk2 t2  (cost=0.00..445.00 rows=10000 width=244)
   ->  Hash  (cost=229.20..229.20 rows=101 width=244)
         ->  Bitmap Heap Scan on tenk1 t1  (cost=5.07..229.20 rows=101 width=244)
               Recheck Cond: (unique1 < 100)
               ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
                     Index Cond: (unique1 < 100)

Merge Join

先对结果集A，B进行排序，再进行联表操作
可用于等于和范围联表条件

EXPLAIN SELECT *
FROM tenk1 t1, onek t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2;

# Result
 Merge Join  (cost=198.11..268.19 rows=10 width=488)
   Merge Cond: (t1.unique2 = t2.unique2)
   ->  Index Scan using tenk1_unique2 on tenk1 t1  (cost=0.29..656.28 rows=101 width=244)
         Filter: (unique1 < 100)
   ->  Sort  (cost=197.83..200.33 rows=1000 width=244)
         Sort Key: t2.unique2
         ->  Seq Scan on onek t2  (cost=0.00..148.00 rows=1000 width=244)