Bitmap Index Scan

最新推荐文章于 2024-05-04 13:45:04 发布

weixin_33724659

最新推荐文章于 2024-05-04 13:45:04 发布

阅读量1.9k

点赞数 1

CC 4.0 BY-SA版权

文章标签：数据库

原文链接：https://yq.aliyun.com/articles/662100

Bitmap Index Scan是数据库中的一种扫描方式，它通过一次性获取所有符合条件的tuple-pointers并进行内存排序，以提高数据读取的局部性，从而在大数据量下提升效率。相比传统的Index Scan，Bitmap Scan在数据量大时更具优势，尤其是在AND和OR条件查询中。然而，对于LIMIT等操作，Bitmap Index Scan可能不是最佳选择。通过调整参数如enable_bitmapscan和random_page_cost，可以影响查询计划选择哪种扫描方式。在某些情况下，需要留意recheck cond过程，因为数据量大时可能会出现有损的Bitmap Heap Scan，这时可能需要调整work_mem来优化。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

Bitmap Index Scan

数据库里面的表的扫描方式主要是以下几种方式：sequential scans, index scans, and bitmap index scans，当然还有index only scan，这种算是index scans中比较特殊的一种，需要的信息在索引中都能找到，扫描索引即可，不需要去扫描表。

这篇文章主要谈谈Bitmap Index Scan的原理及适用的场景。

定义

A plain indexscan fetches one tuple-pointer at a time from the index,and immediately visits that tuple in the table. A bitmap scan fetches all the tuple-pointers from the index in one go, sorts them using an in-memory "bitmap" data structure, and then visits the table tuples in physical tuple-location order. The bitmap scan improves locality of reference to the table at the cost of more bookkeeping overhead to manage the "bitmap" data structure --- and at the cost that the data is no longer retrieved in index order, which doesn't matter for your query but would matter if you said ORDER BY.

A bitmapped index scan works in two stages. First the index or indexes are scanned to create a bitmap representing matching tuple.

这段解释是tom lane在postgres邮件组中的回答，我觉得是比较权威而且通俗易懂的解释。

核心是传统的index scan每次从索引中去取一个tuple的指针，然后立马去表中取数据，每一次会造成一次随机io。如果数据量较多的情况下，会比较低效。而bitmap scan一次性将符合条件的tuple-pointers全部取出来，然后在内存中进行地址排序，然后去取出数据，这时的读取数据由于进行的地址排序，读取时就变成了顺序的读。其实就是一个随机读转化为顺序读取的过程，但是取出的数据由于进行了地址的排序，就没有顺序。同时，对于limit这种sql，bitmap index scan这种就不适合，因为它一次会取出所有数据。

和传统索引扫描对比：

_2018_10_31_3_39_38

在读取数据量比较小时，index scan比较合适，在读取数据量比较大的情况下，bitmap index scan会更有优势。

下面通过实验验证

建立测试表

创建测试表
CREATE TABLE sampletable (x numeric);

插入数据
INSERT INTO sampletable SELECT random() * 10000 FROM generate_series(1, 10000000);

全表扫描

postgres=# explain (analyze,buffers) select * from sampletable where x = 12;
                                                           QUERY PLAN
--------------------------------------------------------------