❝ As usual, a quick word about the groups first: if you are interested in PolarDB, MongoDB, MySQL, PostgreSQL, Redis, OceanBase, SQL Server, etc., or have questions or needs, you are welcome to join — the groups include well-known experts from across the database industry who can help solve your problems. To join, contact liuaustin3 (about 3,300 members in total across groups 1 through 9; groups 1–7 are full, group 8 is near 400, group 9 is 200+, and the newly opened group 10, a dedicated PolarDB study group, is at 100+).
PolarDB for PostgreSQL ships a plugin called polar_csi; installing this extension on PolarDB for PostgreSQL enables the vectorized engine and columnar indexes.
A few prerequisites to keep in mind before using columnar indexes:
1 The wal_level parameter must be set to logical.
2 A table can have only one columnar index.
3 A columnar index cannot be modified after creation; it can only be rebuilt.
4 After installing the polar_csi plugin, you must run CREATE EXTENSION polar_csi in the database.
5 Whether polar_csi is enabled can be verified through a switch parameter.
The relevant parameters also need to be adjusted.
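Taken together, the setup can be sketched roughly as follows. Note that wal_level and shared_preload_libraries are standard PostgreSQL parameters, but the polar_csi-specific names below (the preload entry and the polar_csi.enable switch) are assumptions for illustration — check the PolarDB documentation for the exact parameter names in your version.

```sql
-- postgresql.conf (changes require a restart):
--   wal_level = logical
--   shared_preload_libraries = 'polar_csi'   -- assumed entry name

-- then, in the target database:
CREATE EXTENSION polar_csi;

-- verify / toggle the switch (the GUC name is an assumption):
SHOW polar_csi.enable;
SET polar_csi.enable = on;
```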
test=> create extension polar_csi;
CREATE EXTENSION
test=>
test=> -- create the customers table
test=> CREATE TABLE customers (
test(> customer_id SERIAL PRIMARY KEY,
test(> name VARCHAR(100),
test(> email VARCHAR(100),
test(> created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
test(> );
CREATE TABLE
test=>
test=> -- create the products table
test=> CREATE TABLE products (
test(> product_id SERIAL PRIMARY KEY,
test(> name VARCHAR(100),
test(> price DECIMAL(10, 2),
test(> created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
test(> );
CREATE TABLE
test=>
test=> -- create the orders table
test=> CREATE TABLE orders (
test(> order_id SERIAL PRIMARY KEY,
test(> customer_id INT REFERENCES customers(customer_id),
test(> product_id INT REFERENCES products(product_id),
test(> order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
test(> quantity INT,
test(> total_amount DECIMAL(10, 2)
test(> );
We then load each of these three tables with 1 to 2 million rows.
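The test data can be generated with generate_series; a minimal sketch is below (the article does not show the exact value distributions used, so the names, prices, and random spreads here are illustrative assumptions):

```sql
-- 2,000,000 customers
INSERT INTO customers (name, email)
SELECT 'customer_' || i, 'customer_' || i || '@example.com'
FROM generate_series(1, 2000000) AS i;

-- 2,000,000 products with random prices
INSERT INTO products (name, price)
SELECT 'product_' || i, round((random() * 100)::numeric, 2)
FROM generate_series(1, 2000000) AS i;

-- 1,000,000 orders referencing random customers/products
INSERT INTO orders (customer_id, product_id, quantity, total_amount)
SELECT (random() * 1999999 + 1)::int,
       (random() * 1999999 + 1)::int,
       (random() * 10 + 1)::int,
       round((random() * 500)::numeric, 2)
FROM generate_series(1, 1000000) AS i;
```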
test=> \d
List of relations
Schema | Name | Type | Owner
--------+---------------------------+----------+----------
public | customers | table | dba_test
public | customers_customer_id_seq | sequence | dba_test
public | orders | table | dba_test
public | orders_order_id_seq | sequence | dba_test
public | products | table | dba_test
public | products_product_id_seq | sequence | dba_test
(6 rows)
test=> select count(*) from customers;
count
---------
2000000
(1 row)
test=> explain analyze select count(*) from customers;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=27472.22..27472.23 rows=1 width=8) (actual time=139.527..142.490 rows=1 loops=1)
-> Gather (cost=27472.00..27472.21 rows=2 width=8) (actual time=136.042..142.482 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=26472.00..26472.01 rows=1 width=8) (actual time=125.038..125.038 rows=1 loops=3)
-> Parallel Index Only Scan using customers_pkey on customers (cost=0.43..24388.62 rows=833354 width=0) (actual time=0.202..90.587 rows=666667 loops=3)
Heap Fetches: 0
Planning Time: 0.040 ms
Execution Time: 142.566 ms
(9 rows)
test=> select count(*) from orders;
count
---------
1000000
(1 row)
test=> select count(*) from products;
count
---------
2000000
(1 row)
test=>
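Before comparing plans, the columnar index itself has to be created. The exact DDL varies with the polar_csi version; the sketch below assumes an index access method named csi — both the method name and the column list are assumptions, so consult the plugin documentation for the actual syntax.

```sql
-- one columnar index per table (a polar_csi restriction),
-- covering the columns the analytical query touches
CREATE INDEX orders_csi ON orders USING csi (customer_id, order_id, total_amount);
```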
Meanwhile, some SQL simply cannot be optimized on the row store: an aggregation wrapped in subqueries, with no filtering predicate on the data, will fall back to full table scans even when indexes exist.
test=> explain analyze SELECT
test-> c.customer_id,
test-> c.name,
test-> COALESCE(order_counts.total_orders, 0) AS total_orders,
test-> COALESCE(order_totals.total_spent, 0) AS total_spent
test-> FROM
test-> customers c
test-> LEFT JOIN (
test(> SELECT
test(> customer_id,
test(> COUNT(order_id) AS total_orders
test(> FROM
test(> orders
test(> GROUP BY
test(> customer_id
test(> ) AS order_counts ON c.customer_id = order_counts.customer_id
test-> LEFT JOIN (
test(> SELECT
test(> customer_id,
test(> SUM(total_amount) AS total_spent
test(> FROM
test(> orders
test(> GROUP BY
test(> customer_id
test(> ) AS order_totals ON c.customer_id = order_totals.customer_id
test-> ORDER BY
test-> total_spent DESC;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather Merge (cost=217678.61..412136.56 rows=1666666 width=77) (actual time=1407.023..2104.550 rows=2000000 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=216678.59..218761.92 rows=833333 width=77) (actual time=1323.073..1454.480 rows=666667 loops=3)
Sort Key: (COALESCE(order_totals.total_spent, '0'::numeric)) DESC
Sort Method: external merge Disk: 46088kB
Worker 0: Sort Method: external merge Disk: 52096kB
Worker 1: Sort Method: external merge Disk: 21328kB
-> Hash Left Join (cost=47408.07..91348.40 rows=833333 width=77) (actual time=633.505..938.891 rows=666667 loops=3)
Hash Cond: (c.customer_id = order_totals.customer_id)
-> Hash Left Join (cost=23704.03..65456.86 rows=833333 width=45) (actual time=292.432..479.223 rows=666667 loops=3)
Hash Cond: (c.customer_id = order_counts.customer_id)
-> Parallel Seq Scan on customers c (cost=0.00..39565.33 rows=833333 width=37) (actual time=0.003..65.482 rows=666667 loops=3)
-> Hash (cost=23704.02..23704.02 rows=1 width=12) (actual time=292.407..292.409 rows=1 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Subquery Scan on order_counts (cost=23704.00..23704.02 rows=1 width=12) (actual time=292.398..292.400 rows=1 loops=3)
-> HashAggregate (cost=23704.00..23704.01 rows=1 width=12) (actual time=292.397..292.398 rows=1 loops=3)
Group Key: orders.customer_id
Batches: 1 Memory Usage: 25kB
Worker 0: Batches: 1 Memory Usage: 25kB
Worker 1: Batches: 1 Memory Usage: 25kB
