Query the cluster status:
[Ruby@gs01 ~]$ cm_ctl query -Cv
[ CMServer State ]
node instance state
------------------------
1 gs01 1 Primary
2 gs02 2 Standby
3 gs03 3 Standby
[ ETCD State ]
node instance state
------------------------------
1 gs01 7001 StateLeader
2 gs02 7002 StateFollower
3 gs03 7003 StateFollower
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Coordinator State ]
node instance state
------------------------
1 gs01 5001 Normal
2 gs02 5002 Normal
3 gs03 5003 Normal
[ Central Coordinator State ]
node instance state
------------------------
2 gs02 5002 Normal
[ GTM State ]
node instance state sync_state
--------------------------------------------------------
1 gs01 1001 P Primary Connection ok Sync
2 gs02 1002 S Standby Connection ok Sync
3 gs03 1003 S Standby Connection ok Sync
[ Datanode State ]
node instance state | node instance state | node instance state
------------------------------------------------------------------------------------------------------------
1 gs01 6001 P Primary Normal | 2 gs02 6002 S Standby Normal | 3 gs03 6003 S Standby Normal
2 gs02 6004 P Primary Normal | 3 gs03 6005 S Standby Normal | 1 gs01 6006 S Standby Normal
3 gs03 6007 P Primary Normal | 1 gs01 6008 S Standby Normal | 2 gs02 6009 S Standby Normal
[Ruby@gs01 ~]$
I. Join Column Is Not the Distribution Column
Build test tables and test data, run the same SQL, and compare how different distribution columns affect the execution plan.
1. Log in to the database.
gsql -d postgres -p 8000 -r
2. Set max_datanode_for_plan so that the detailed execution plan can be viewed later.
SET max_datanode_for_plan = 3;
3. Create the test tables.
DROP TABLE IF EXISTS test_2_1;
CREATE TABLE test_2_1(
cno int,
cname varchar(10),
tno int
)DISTRIBUTE BY hash(cno);
DROP TABLE IF EXISTS test_2_2;
CREATE TABLE test_2_2(
sno int,
cno int,
degree int
) DISTRIBUTE BY hash(sno);
4. Insert the test data (takes about 60 s).
INSERT INTO test_2_1 SELECT generate_series(1,100000), generate_series(1,100000);
INSERT INTO test_2_2 SELECT generate_series(1,1000), generate_series(1,1000);
5. Set the display style to pretty.
set explain_perf_mode=pretty;
6. With the join condition on a non-distribution column, run the test SQL.
EXPLAIN (costs off, ANALYZE on)
SELECT t1.cno, t2.sno
FROM test_2_1 t1, test_2_2 t2
WHERE t1.cno = t2.cno;
As shown in the plan below, a REDISTRIBUTE operator appears. With large data volumes this operator significantly degrades performance; it can be eliminated by changing the distribution key, improving efficiency.
gaussdb=# EXPLAIN (costs off, ANALYZE on)
gaussdb-# SELECT t1.cno, t2.sno
gaussdb-# FROM test_2_1 t1, test_2_2 t2
gaussdb-# WHERE t1.cno = t2.cno;
id | operation | A-time | A-rows | Peak Memory | A-width
----+-----------------------------------------+-----------------+--------+-------------+---------
1 | -> Streaming (type: GATHER) | 59.315 | 1000 | 82KB |
2 | -> Hash Join (3,5) | [12.769,50.412] | 1000 | [8KB,8KB] |
3 | -> Streaming(type: REDISTRIBUTE) | [0.150,38.386] | 1000 | [44KB,44KB] |
4 | -> Seq Scan on test_2_2 t2 | [0.082,0.094] | 1000 | [53KB,53KB] |
5 | -> Hash | [11.482,12.073] | 100000 | [1MB,1MB] | [20,20]
6 | -> Seq Scan on test_2_1 t1 | [5.817,6.285] | 100000 | [52KB,52KB] |
(6 rows)
Predicate Information (identified by plan id)
-----------------------------------------------
2 --Hash Join (3,5)
Hash Cond: (t2.cno = t1.cno)
(2 rows)
Memory Information (identified by plan id)
----------------------------------------------------------------------
Coordinator Query Peak Memory:
Query Peak Memory: 0MB
Datanode:
Max Query Peak Memory: 0MB
Min Query Peak Memory: 0MB
5 --Hash
Max Buckets: 32768 Max Batches: 1 Max Memory Usage: 1181kB
Min Buckets: 32768 Min Batches: 1 Min Memory Usage: 1163kB
(8 rows)
User Define Profiling
---------------------------------------------------------------------------
Segment Id: 2 Track name: Datanode build connection
(actual time=[0.357, 5.080], calls=[1, 1])
Plan Node id: 1 Track name: coordinator get datanode connection
(actual time=[2.641, 2.641], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator check and update node definition
(actual time=[0.003, 0.003], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator serialize plan
(actual time=[0.740, 0.740], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator send query id with sync
(actual time=[0.257, 0.257], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator send begin command
(actual time=[0.002, 0.002], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator start transaction and send query
(actual time=[0.102, 0.102], calls=[1, 1])
Plan Node id: 2 Track name: Datanode start up stream thread
(actual time=[0.087, 0.203], calls=[1, 1])
(16 rows)
====== Query Summary =====
------------------------------------------------------------------------------------------
Datanode executor start time [dn_6001_6002_6003, dn_6007_6008_6009]: [2.671 ms,5.298 ms]
Datanode executor run time [dn_6007_6008_6009, dn_6001_6002_6003]: [12.902 ms,50.542 ms]
Datanode executor end time [dn_6007_6008_6009, dn_6004_6005_6006]: [0.103 ms,0.139 ms]
Coordinator executor start time: 0.371 ms
Coordinator executor run time: 59.377 ms
Coordinator executor end time: 0.022 ms
Planner runtime: 0.552 ms
Plan size: 4157 byte
Query Id: 72339069014829424
Total runtime: 59.783 ms
(10 rows)
When creating a table in a distributed deployment, consider the SQL that will commonly reference the table, and choose the join or aggregation columns of that SQL as the distribution column, thereby reducing the overhead of data redistribution.
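The effect of aligning the distribution key with the join key can be sketched in a few lines of Python. This is an illustrative model only: `dn_for` and `NUM_DN` are hypothetical stand-ins, and a simple modulo replaces the engine's real hash function.

```python
NUM_DN = 3

def dn_for(value):
    # Hypothetical placement function; a modulo stands in for the real hash.
    return value % NUM_DN

# test_2_2 rows modeled as (sno, cno) pairs where cno differs from sno.
rows = [(sno, sno + 1) for sno in range(9)]

# Distributed by sno: each row lives on dn_for(sno), but the join probes by
# cno, so the matching test_2_1 row (placed by cno) may sit on another DN.
moved = sum(1 for sno, cno in rows if dn_for(cno) != dn_for(sno))
print(moved)        # every one of these rows must be shipped between DNs

# Distributed by cno instead: placement now matches the join key.
moved_after = sum(1 for sno, cno in rows if dn_for(cno) != dn_for(cno))
print(moved_after)  # 0 -- the join completes locally on each DN
```

In this toy model, changing the distribution key drops the number of shipped rows to zero, which is exactly what eliminating the REDISTRIBUTE operator buys in the real plan.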
7. Change the distribution key of test table test_2_2 to cno.
DROP TABLE test_2_2;
CREATE TABLE test_2_2(
sno int,
cno int,
degree int
) DISTRIBUTE BY hash(cno);
8. Re-insert the test data (takes about 1 s).
INSERT INTO test_2_2 SELECT generate_series(1,1000), generate_series(1,1000);
9. Run the test SQL.
EXPLAIN ANALYZE
SELECT t1.cno, t2.sno
FROM test_2_1 t1, test_2_2 t2
WHERE t1.cno = t2.cno;
Re-running the test after the change shows that the SQL execution time has improved.
The execution plan and execution time with the new distribution key are as follows:
gaussdb=# EXPLAIN ANALYZE
gaussdb-# SELECT t1.cno, t2.sno
gaussdb-# FROM test_2_1 t1, test_2_2 t2
gaussdb-# WHERE t1.cno = t2.cno;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Data Node Scan (cost=0.00..0.00 rows=0 width=0) (actual time=9.436..9.687 rows=1000 loops=1)
Node/s: All datanodes
Total runtime: 9.821 ms
(3 rows)
10. View the detailed execution plan on the DNs.
EXPLAIN VERBOSE
SELECT t1.cno, t2.sno
FROM test_2_1 t1, test_2_2 t2
WHERE t1.cno = t2.cno;
The detailed execution plan is as follows:
Because the table's distribution column was changed, the join column is now also the distribution column. This means the join operator no longer needs to fetch data from other DN nodes; the computation completes locally on each node, eliminating the extra transfer overhead of the REDISTRIBUTE operator.
gaussdb=# EXPLAIN VERBOSE
gaussdb-# SELECT t1.cno, t2.sno
gaussdb-# FROM test_2_1 t1, test_2_2 t2
gaussdb-# WHERE t1.cno = t2.cno;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Data Node Scan (cost=0.00..0.00 rows=0 width=0)
Output: t1.cno, t2.sno
Node/s: All datanodes
Remote query: SELECT t1.cno, t2.sno FROM public.test_2_1 t1, public.test_2_2 t2 WHERE t1.cno = t2.cno
Remote SQL: SELECT t1.cno, t2.sno FROM public.test_2_1 t1, public.test_2_2 t2 WHERE t1.cno = t2.cno
Datanode Name: dn_6001_6002_6003
Hash Join (cost=53.76..2821.58 rows=209282 distinct=[200, 200] width=8)
Output: t1.cno, t2.sno
Hash Cond: (t1.cno = t2.cno)
-> Seq Scan on public.test_2_1 t1 (cost=0.00..352.20 rows=21520 width=4)
Output: t1.cno, t1.cname, t1.tno
-> Hash (cost=29.45..29.45 rows=1945 width=8)
Output: t2.sno, t2.cno
-> Seq Scan on public.test_2_2 t2 (cost=0.00..29.45 rows=1945 width=8)
Output: t2.sno, t2.cno
Datanode Name: dn_6004_6005_6006
Hash Join (cost=53.76..2821.58 rows=209282 distinct=[200, 200] width=8)
Output: t1.cno, t2.sno
Hash Cond: (t1.cno = t2.cno)
-> Seq Scan on public.test_2_1 t1 (cost=0.00..352.20 rows=21520 width=4)
Output: t1.cno, t1.cname, t1.tno
-> Hash (cost=29.45..29.45 rows=1945 width=8)
Output: t2.sno, t2.cno
-> Seq Scan on public.test_2_2 t2 (cost=0.00..29.45 rows=1945 width=8)
Output: t2.sno, t2.cno
Datanode Name: dn_6007_6008_6009
Hash Join (cost=53.76..2801.26 rows=207745 distinct=[200, 200] width=8)
Output: t1.cno, t2.sno
Hash Cond: (t1.cno = t2.cno)
-> Seq Scan on public.test_2_1 t1 (cost=0.00..349.62 rows=21362 width=4)
Output: t1.cno, t1.cname, t1.tno
-> Hash (cost=29.45..29.45 rows=1945 width=8)
Output: t2.sno, t2.cno
-> Seq Scan on public.test_2_2 t2 (cost=0.00..29.45 rows=1945 width=8)
Output: t2.sno, t2.cno
(39 rows)
What factors should be considered when choosing the distribution column of a hash-distributed table?
Answer:
1. Use a column whose values are relatively discrete (high cardinality) as the distribution key, so that data spreads evenly across the DNs.
2. On top of rule 1, avoid columns that are usually filtered by a constant; otherwise every such query is dispatched to the same single DN.
3. On top of rules 1 and 2, prefer a column that appears in the query's join conditions, so that the rows relevant to a JOIN are co-located on the same DN, reducing the cost of data movement between DNs.
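Rule 2 can be sketched concretely. In this hypothetical model (`dn_for` and `NUM_DN` are stand-ins, modulo replaces the real hash), every query filtering the distribution key by the same constant lands on the same DN:

```python
NUM_DN = 3

def dn_for(value):
    # Hypothetical placement function; modulo stands in for the real hash.
    return value % NUM_DN

# Suppose the distribution key is a status-like column and queries always
# filter it by a constant, e.g. WHERE status = 1.
queries = [1] * 100                       # 100 queries, same constant filter
target_dns = {dn_for(v) for v in queries}
print(len(target_dns))  # 1 -- a single DN absorbs the entire workload
```

However even the data is spread, constant-filter queries on the distribution key cannot be parallelized across DNs, which is why such columns make poor distribution keys.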
II. Use a Global Index to Avoid Data Redistribution
Build test tables and test data, run the same SQL, and compare how a global index affects the execution plan.
1. Log in to the gaussdb database.
gsql -d postgres -p 8000 -r
2. Set max_datanode_for_plan so that the detailed execution plan can be viewed later.
SET max_datanode_for_plan = 3;
3. Create the test tables.
DROP TABLE IF EXISTS test_3_1;
CREATE TABLE test_3_1(c1 int, c2 int) distribute by hash(c2);
DROP TABLE IF EXISTS test_3_2;
CREATE TABLE test_3_2(c3 int, c4 int) distribute by hash(c3);
4. Insert the test data (takes about 10 s).
INSERT INTO test_3_1 select x, x + 500 from generate_series(1,100000) AS x;
INSERT INTO test_3_2 select x, x + 500 from generate_series(1,10000) AS x;
5. Set the display style to pretty.
set explain_perf_mode=pretty;
6. Run the test SQL.
EXPLAIN ANALYZE SELECT c2, c4 FROM test_3_1, test_3_2 WHERE c2 = c4;
The execution plan is as follows:
gaussdb=# EXPLAIN ANALYZE SELECT c2, c4 FROM test_3_1, test_3_2 WHERE c2 = c4;
id | operation | A-time | A-rows | E-rows | Peak Memory | A-width | E-width | E-costs
----+-----------------------------------------+-----------------+--------+--------+-------------+---------+---------+---------
1 | -> Streaming (type: GATHER) | 24.446 | 10000 | 30 | 82KB | | 8 | 31.21
2 | -> Hash Join (3,5) | [14.222,15.096] | 10000 | 30 | [8KB,8KB] | | 8 | 29.96
3 | -> Streaming(type: REDISTRIBUTE) | [0.654,0.794] | 10000 | 30 | [68KB,68KB] | | 4 | 15.49
4 | -> Seq Scan on test_3_2 | [0.606,0.617] | 10000 | 30 | [53KB,53KB] | | 4 | 14.14
5 | -> Hash | [11.351,11.919] | 100000 | 29 | [1MB,1MB] | [20,20] | 4 | 14.14
6 | -> Seq Scan on test_3_1 | [5.977,6.247] | 100000 | 30 | [52KB,52KB] | | 4 | 14.14
(6 rows)
Predicate Information (identified by plan id)
------------------------------------------------
2 --Hash Join (3,5)
Hash Cond: (test_3_2.c4 = test_3_1.c2)
(2 rows)
Memory Information (identified by plan id)
----------------------------------------------------------------------
Coordinator Query Peak Memory:
Query Peak Memory: 0MB
Datanode:
Max Query Peak Memory: 0MB
Min Query Peak Memory: 0MB
5 --Hash
Max Buckets: 32768 Max Batches: 1 Max Memory Usage: 1181kB
Min Buckets: 32768 Min Batches: 1 Min Memory Usage: 1163kB
(8 rows)
User Define Profiling
---------------------------------------------------------------------------
Segment Id: 2 Track name: Datanode build connection
(actual time=[0.235, 0.401], calls=[1, 1])
Plan Node id: 1 Track name: coordinator get datanode connection
(actual time=[3.378, 3.378], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator check and update node definition
(actual time=[0.004, 0.004], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator serialize plan
(actual time=[0.676, 0.676], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator send query id with sync
(actual time=[0.193, 0.193], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator send begin command
(actual time=[0.001, 0.001], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator start transaction and send query
(actual time=[0.103, 0.103], calls=[1, 1])
Plan Node id: 2 Track name: Datanode start up stream thread
(actual time=[0.074, 0.116], calls=[1, 1])
(16 rows)
====== Query Summary =====
------------------------------------------------------------------------------------------
Datanode executor start time [dn_6007_6008_6009, dn_6001_6002_6003]: [0.585 ms,0.669 ms]
Datanode executor run time [dn_6001_6002_6003, dn_6004_6005_6006]: [15.247 ms,16.553 ms]
Datanode executor end time [dn_6004_6005_6006, dn_6007_6008_6009]: [0.114 ms,0.136 ms]
Coordinator executor start time: 0.128 ms
Coordinator executor run time: 24.989 ms
Coordinator executor end time: 0.021 ms
Planner runtime: 0.419 ms
Plan size: 3999 byte
Query Id: 72339069014842974
Total runtime: 25.154 ms
As before, because the join column is not the distribution column, a Streaming(type: REDISTRIBUTE) operator appears in the plan and data redistribution is triggered.
7. Add a global index to test_3_2.
In a distributed deployment, a global index is declared with its own distribution key, and its index entries are distributed across the DNs by that key. When data is needed, instead of redistributing it with a Stream operator, each DN can fetch it directly from the global index via an Index Only Scan. A normal local index, by contrast, does not distribute its entries by the index key: the entries follow the data placement determined by the table's distribution key, and the index simply covers the specified columns within each DN.
DROP INDEX IF EXISTS idx_1;
CREATE GLOBAL INDEX idx_1 ON test_3_2(c4) DISTRIBUTE BY HASH(c4);
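The placement idea behind the global index can be sketched as follows. This is a hypothetical model: `dn_for` and `NUM_DN` are stand-ins and modulo replaces GaussDB's actual hash function.

```python
NUM_DN = 3

def dn_for(value):
    # Hypothetical placement function; modulo stands in for the real hash.
    return value % NUM_DN

# test_3_2(c3, c4): the base table is distributed by c3, while the global
# index idx_1 is distributed by its own key, c4.
table_rows = [(c3, c3 + 500) for c3 in range(12)]

index_slices = {dn: [] for dn in range(NUM_DN)}
for c3, c4 in table_rows:
    index_slices[dn_for(c4)].append(c4)   # index entry placed by c4, not c3

# A join probing on c4 therefore finds every needed entry in its local slice:
local = all(dn_for(c4) == dn for dn, keys in index_slices.items() for c4 in keys)
print(local)  # True
```

Because the index entries are co-located with the join key's hash bucket, the plan below can replace the REDISTRIBUTE stream with a local Index Only Scan.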
8. Run the test SQL.
EXPLAIN ANALYZE SELECT c2, c4 FROM test_3_1, test_3_2 WHERE c2 = c4;
The execution plan is as follows:
gaussdb=# EXPLAIN ANALYZE SELECT c2, c4 FROM test_3_1, test_3_2 WHERE c2 = c4;
QUERY PLAN
-------------------------------------------------------------------------------------------------
Data Node Scan (cost=0.00..0.00 rows=0 width=0) (actual time=6.303..13.290 rows=10000 loops=1)
Node/s: All datanodes
Total runtime: 13.821 ms
(3 rows)
9. View the detailed execution plan.
EXPLAIN VERBOSE SELECT c2, c4 FROM test_3_1, test_3_2 WHERE c2 = c4;
The detailed execution plan is as follows:
The data exchange between DNs has been eliminated; the c4 values are obtained through an Index Only Scan on the global index.
gaussdb=# EXPLAIN VERBOSE SELECT c2, c4 FROM test_3_1, test_3_2 WHERE c2 = c4;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Data Node Scan (cost=0.00..0.00 rows=0 width=0)
Output: test_3_1.c2, idx_1.c4
Node/s: All datanodes
Remote query: SELECT test_3_1.c2, idx_1.c4 FROM public.test_3_1, ONLY public.idx_1 WHERE test_3_1.c2 = idx_1.c4
Remote SQL: SELECT test_3_1.c2, idx_1.c4 FROM public.test_3_1, ONLY public.idx_1 WHERE test_3_1.c2 = idx_1.c4
Datanode Name: dn_6001_6002_6003
Hash Join (cost=2.48..592.34 rows=1 distinct=[200, 200] width=8)
Output: test_3_1.c2, idx_1.c4
Hash Cond: (test_3_1.c2 = idx_1.c4)
-> Seq Scan on public.test_3_1 (cost=0.00..463.89 rows=33589 width=4)
Output: test_3_1.c1, test_3_1.c2
-> Hash (cost=2.46..2.46 rows=1 width=4)
Output: idx_1.c4
-> Index Only Scan using idx_1 on public.idx_1 (cost=0.00..2.46 rows=1 width=4)
Output: idx_1.c4
Datanode Name: dn_6004_6005_6006
Hash Join (cost=2.48..588.16 rows=1 distinct=[200, 200] width=8)
Output: test_3_1.c2, idx_1.c4
Hash Cond: (test_3_1.c2 = idx_1.c4)
-> Seq Scan on public.test_3_1 (cost=0.00..460.58 rows=33358 width=4)
Output: test_3_1.c1, test_3_1.c2
-> Hash (cost=2.46..2.46 rows=1 width=4)
Output: idx_1.c4
-> Index Only Scan using idx_1 on public.idx_1 (cost=0.00..2.46 rows=1 width=4)
Output: idx_1.c4
Datanode Name: dn_6007_6008_6009
Hash Join (cost=2.48..583.97 rows=1 distinct=[200, 200] width=8)
Output: test_3_1.c2, idx_1.c4
Hash Cond: (test_3_1.c2 = idx_1.c4)
-> Seq Scan on public.test_3_1 (cost=0.00..457.53 rows=33053 width=4)
Output: test_3_1.c1, test_3_1.c2
-> Hash (cost=2.46..2.46 rows=1 width=4)
Output: idx_1.c4
-> Index Only Scan using idx_1 on public.idx_1 (cost=0.00..2.46 rows=1 width=4)
Output: idx_1.c4
(39 rows)
In which scenarios should a global index be considered?
When the distribution column differs from the join or aggregation columns.
III. Optimize the Table Design
Build test tables and test data, run the same SQL, and compare how a hash-distributed table versus a replicated table affects the execution plan.
1. Log in to the gaussdb database.
gsql -d postgres -p 8000 -r
2. Set max_datanode_for_plan so that the detailed execution plan can be viewed later.
SET max_datanode_for_plan = 3;
3. Create the test tables.
DROP TABLE IF EXISTS test_4_1;
CREATE TABLE test_4_1(
id int,
num int
) DISTRIBUTE BY hash(id);
DROP TABLE IF EXISTS test_4_2;
CREATE TABLE test_4_2(
id int,
num int
) DISTRIBUTE BY hash(num);
4. Insert the test data (takes about 30 s).
INSERT INTO test_4_1 select generate_series(1,100000), generate_series(1,100000);
INSERT INTO test_4_2 select generate_series(1,10000), generate_series(1,10000);
5. Set the display style to pretty.
set explain_perf_mode=pretty;
6. Join the hash-distributed tables and run the test.
EXPLAIN (costs off, ANALYZE on)
SELECT *
FROM test_4_1 t1,test_4_2 t2
WHERE t1.id=t2.id;
The SQL execution plan and execution time are as follows:
gaussdb=# EXPLAIN (costs off, ANALYZE on)
gaussdb-# SELECT *
gaussdb-# FROM test_4_1 t1,test_4_2 t2
gaussdb-# WHERE t1.id=t2.id;
id | operation | A-time | A-rows | Peak Memory | A-width
----+-----------------------------------------+-----------------+--------+-------------+---------
1 | -> Streaming (type: GATHER) | 22.581 | 10000 | 85KB |
2 | -> Hash Join (3,5) | [13.796,14.661] | 10000 | [10KB,10KB] |
3 | -> Streaming(type: REDISTRIBUTE) | [0.613,0.681] | 10000 | [52KB,52KB] |
4 | -> Seq Scan on test_4_2 t2 | [0.508,0.521] | 10000 | [53KB,53KB] |
5 | -> Hash | [10.874,11.266] | 100000 | [1MB,1MB] | [24,24]
6 | -> Seq Scan on test_4_1 t1 | [4.880,5.221] | 100000 | [53KB,53KB] |
(6 rows)
Predicate Information (identified by plan id)
-----------------------------------------------
2 --Hash Join (3,5)
Hash Cond: (t2.id = t1.id)
(2 rows)
Memory Information (identified by plan id)
----------------------------------------------------------------------
Coordinator Query Peak Memory:
Query Peak Memory: 0MB
Datanode:
Max Query Peak Memory: 0MB
Min Query Peak Memory: 0MB
5 --Hash
Max Buckets: 32768 Max Batches: 1 Max Memory Usage: 1312kB
Min Buckets: 32768 Min Batches: 1 Min Memory Usage: 1292kB
(8 rows)
User Define Profiling
---------------------------------------------------------------------------
Segment Id: 2 Track name: Datanode build connection
(actual time=[0.295, 0.502], calls=[1, 1])
Plan Node id: 1 Track name: coordinator get datanode connection
(actual time=[0.682, 0.682], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator check and update node definition
(actual time=[0.003, 0.003], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator serialize plan
(actual time=[0.784, 0.784], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator send query id with sync
(actual time=[0.239, 0.239], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator send begin command
(actual time=[0.001, 0.001], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator start transaction and send query
(actual time=[0.084, 0.084], calls=[1, 1])
Plan Node id: 2 Track name: Datanode start up stream thread
(actual time=[0.023, 0.034], calls=[1, 1])
(16 rows)
====== Query Summary =====
------------------------------------------------------------------------------------------
Datanode executor start time [dn_6004_6005_6006, dn_6001_6002_6003]: [0.482 ms,0.611 ms]
Datanode executor run time [dn_6001_6002_6003, dn_6007_6008_6009]: [15.261 ms,17.065 ms]
Datanode executor end time [dn_6004_6005_6006, dn_6007_6008_6009]: [0.096 ms,0.142 ms]
Coordinator executor start time: 0.182 ms
Coordinator executor run time: 23.042 ms
Coordinator executor end time: 0.022 ms
Planner runtime: 0.537 ms
Plan size: 4192 byte
Query Id: 72339069014857668
Total runtime: 23.273 ms
(10 rows)
As shown, joining hash-distributed tables on a column that is not the distribution column produces a REDISTRIBUTE operator. In the previous experiment we eliminated it by changing the distribution key; we can also eliminate it by using a replicated table instead. Note, however, that replicated tables are only suitable for tables with little data that change infrequently.
7. Replace the hash-distributed table with a replicated table.
DROP TABLE IF EXISTS test_4_2;
CREATE TABLE test_4_2(
id int,
num int
) DISTRIBUTE BY replication;
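The trade-off behind replication can be sketched in a few lines; `NUM_DN` and the dictionary-of-copies layout are hypothetical illustrations, not GaussDB internals.

```python
NUM_DN = 3

# A replicated table keeps a complete copy on every DN, so any join against
# it can run locally; the trade-off is NUM_DN copies of the data and a write
# to every DN on each update.
small_table = [(i, i) for i in range(5)]
dns = {dn: list(small_table) for dn in range(NUM_DN)}  # identical copy per DN

fully_replicated = all(dns[dn] == small_table for dn in dns)
print(fully_replicated)  # True -- no DN ever fetches these rows remotely
```

This is why replication suits small, rarely updated tables: the redundant storage and write amplification are cheap, while every join against the table becomes local.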
8. Insert data into the replicated table and update its statistics (takes about 1 s).
INSERT INTO test_4_2 select generate_series(1,10000), generate_series(1,10000);
ANALYZE test_4_2;
9. Run the test SQL.
EXPLAIN (costs off, ANALYZE on)
SELECT *
FROM test_4_1 t1,test_4_2 t2
WHERE t1.id=t2.id;
The SQL execution plan and execution time are as follows:
gaussdb=# EXPLAIN (costs off, ANALYZE on)
gaussdb-# SELECT *
gaussdb-# FROM test_4_1 t1,test_4_2 t2
gaussdb-# WHERE t1.id=t2.id;
QUERY PLAN
----------------------------------------------------------------
Data Node Scan (actual time=13.400..17.736 rows=10000 loops=1)
Node/s: All datanodes
Total runtime: 18.279 ms
(3 rows)
10. View the detailed execution plan on the DNs.
EXPLAIN VERBOSE
SELECT *
FROM test_4_1 t1,test_4_2 t2
WHERE t1.id=t2.id;
The DN execution plan is as follows:
gaussdb=# EXPLAIN VERBOSE
gaussdb-# SELECT *
gaussdb-# FROM test_4_1 t1,test_4_2 t2
gaussdb-# WHERE t1.id=t2.id;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Data Node Scan (cost=0.00..0.00 rows=0 width=0)
Output: t1.id, t1.num, t2.id, t2.num
Node/s: All datanodes
Remote query: SELECT t1.id, t1.num, t2.id, t2.num FROM public.test_4_1 t1, public.test_4_2 t2 WHERE t1.id = t2.id
Remote SQL: SELECT t1.id, t1.num, t2.id, t2.num FROM public.test_4_1 t1, public.test_4_2 t2 WHERE t1.id = t2.id
Datanode Name: dn_6001_6002_6003
Hash Join (cost=883.57..1121.67 rows=8383 distinct=[1, 33581] width=16)
Output: t1.id, t1.num, t2.id, t2.num
Hash Cond: (t2.id = t1.id)
-> Seq Scan on public.test_4_2 t2 (cost=0.00..122.83 rows=8383 width=8)
Output: t2.id, t2.num
-> Hash (cost=463.81..463.81 rows=33581 width=8)
Output: t1.id, t1.num
-> Seq Scan on public.test_4_1 t1 (cost=0.00..463.81 rows=33581 width=8)
Output: t1.id, t1.num
Datanode Name: dn_6004_6005_6006
Hash Join (cost=877.53..1115.63 rows=8383 distinct=[1, 33357] width=16)
Output: t1.id, t1.num, t2.id, t2.num
Hash Cond: (t2.id = t1.id)
-> Seq Scan on public.test_4_2 t2 (cost=0.00..122.83 rows=8383 width=8)
Output: t2.id, t2.num
-> Hash (cost=460.57..460.57 rows=33357 width=8)
Output: t1.id, t1.num
-> Seq Scan on public.test_4_1 t1 (cost=0.00..460.57 rows=33357 width=8)
Output: t1.id, t1.num
Datanode Name: dn_6007_6008_6009
Hash Join (cost=869.89..1107.99 rows=8383 distinct=[1, 33062] width=16)
Output: t1.id, t1.num, t2.id, t2.num
Hash Cond: (t2.id = t1.id)
-> Seq Scan on public.test_4_2 t2 (cost=0.00..122.83 rows=8383 width=8)
Output: t2.id, t2.num
-> Hash (cost=456.62..456.62 rows=33062 width=8)
Output: t1.id, t1.num
-> Seq Scan on public.test_4_1 t1 (cost=0.00..456.62 rows=33062 width=8)
Output: t1.id, t1.num
(39 rows)
In which scenarios are replicated tables appropriate?
Tables with small data volumes or very low update frequency.
IV. Operator Pushdown
Build a test table and test data, run the original SQL, change the function's volatility attribute, and compare the execution plans before and after the change.
1. Log in to the gaussdb database.
gsql -d postgres -p 8000 -r
2. Create the test table.
DROP TABLE IF EXISTS test_5;
CREATE TABLE test_5(c1 int,c2 int,c3 int) distribute by hash(c1);
3. Insert the data (takes about 3 s).
insert into test_5 select floor(random() * 5000), floor(random() * 5000), floor(random() * 5000) from generate_series(1,10000);
ANALYZE test_5;
4. Set the display style to pretty.
set explain_perf_mode=pretty;
5. Create the function.
create or replace function
func_t (num1 int, num2 int) return int
as
begin
return num1 + num2;
end;
/
6. Run test SQL 1.
This query uses func_t to sum c2 and c3 of test_5 and orders by the result.
EXPLAIN (VERBOSE ON, ANALYZE) select func_t(c2,c3) as ans from test_5 order by ans;
The execution plan is as follows:
The plan shows _REMOTE_TABLE_QUERY_, meaning func_t is evaluated on the CN.
gaussdb=# EXPLAIN (VERBOSE ON, ANALYZE) select func_t(c2,c3) as ans from test_5 order by ans;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=3164.39..3189.39 rows=10000 width=8) (actual time=46.263..47.686 rows=10000 loops=1)
Output: (func_t(test_5.c2, test_5.c3))
Sort Key: (func_t(test_5.c2, test_5.c3))
Sort Method: quicksort Memory: 1010kB
-> Data Node Scan on test_5 "_REMOTE_TABLE_QUERY_" (cost=0.00..2500.00 rows=10000 width=8) (actual time=2.160..41.364 rows=10000 loops=1)
Output: func_t(test_5.c2, test_5.c3)
Node/s: All datanodes
Remote query: SELECT c2, c3 FROM ONLY public.test_5 WHERE true
Total runtime: 48.171 ms
(9 rows)
7. Analyze func_t: it simply returns the sum of its two arguments (num1 + num2), so its result is stable for a given input. Functions created this way are not pushdown-able by default; marking it IMMUTABLE allows func_t to be pushed down.
alter function func_t(int, int) immutable;
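Why immutability matters for pushdown can be sketched outside the database. In this hedged Python model, `func_t` mirrors the SQL function above, while `volatile_t` is a hypothetical counter-example:

```python
import random

def func_t(num1, num2):
    # Immutable: the result depends only on the arguments, so evaluating it
    # on the CN or on any DN yields the same value -- safe to push down.
    return num1 + num2

def volatile_t(num1, num2):
    # Volatile: the result can differ between calls (and between nodes), so
    # pushing it down could change the query's answer.
    return num1 + num2 + random.randint(0, 1000000)

cn_result = func_t(3, 4)
dn_results = [func_t(3, 4) for _ in range(3)]   # "evaluated" on three DNs
print(all(r == cn_result for r in dn_results))  # True
```

Because an immutable function is a pure mapping from inputs to output, the planner may evaluate it wherever the data lives; a volatile function must stay on the CN to guarantee a single consistent result.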
8. Run test SQL 2.
EXPLAIN (VERBOSE ON, ANALYZE) select func_t(c2,c3) as ans from test_5 order by ans;
The execution plan is as follows; func_t has been pushed down, and the operator now runs on the DNs:
gaussdb=# EXPLAIN (VERBOSE ON, ANALYZE) select func_t(c2,c3) as ans from test_5 order by ans;
id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | A-width | E-width | E-costs
----+-------------------------------------+-----------------+--------+--------+------------+---------------+---------------+---------+---------
1 | -> Streaming (type: GATHER) | 22.719 | 10000 | 9999 | | 171KB | | 8 | 1499.56
2 | -> Sort | [12.136,13.509] | 10000 | 9999 | | [297KB,318KB] | [63260,70080] | 8 | 1082.94
3 | -> Seq Scan on public.test_5 | [10.392,11.411] | 10000 | 10000 | | [55KB,55KB] | | 8 | 879.58
(3 rows)
Memory Information (identified by plan id)
-------------------------------------------------------
Coordinator Query Peak Memory:
Query Peak Memory: 0MB
Datanode:
Max Query Peak Memory: 0MB
Min Query Peak Memory: 0MB
2 --Sort
Sort Method: quicksort Memory: 294kB ~ 316kB
Sort Method: quicksort Disk: 1024kB ~ 0kB
(8 rows)
Targetlist Information (identified by plan id)
--------------------------------------------------------
1 --Streaming (type: GATHER)
Output: (func_t(c2, c3))
Merge Sort Key: (func_t(test_5.c2, test_5.c3))
Node/s: All datanodes
2 --Sort
Output: (func_t(c2, c3))
Sort Key: (func_t(test_5.c2, test_5.c3))
3 --Seq Scan on public.test_5
Output: func_t(c2, c3)
Distribute Key: c1
(10 rows)
User Define Profiling
---------------------------------------------------------------------------
Plan Node id: 1 Track name: coordinator get datanode connection
(actual time=[3.338, 3.338], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator serialize plan
(actual time=[0.606, 0.606], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator send begin command
(actual time=[0.002, 0.002], calls=[1, 1])
Plan Node id: 1 Track name: Coordinator start transaction and send query
(actual time=[0.102, 0.102], calls=[1, 1])
(8 rows)
====== Query Summary =====
------------------------------------------------------------------------------------------
Datanode executor start time [dn_6001_6002_6003, dn_6004_6005_6006]: [0.047 ms,0.165 ms]
Datanode executor run time [dn_6004_6005_6006, dn_6007_6008_6009]: [12.913 ms,14.493 ms]
Datanode executor end time [dn_6001_6002_6003, dn_6001_6002_6003]: [0.009 ms,0.009 ms]
Coordinator executor start time: 0.048 ms
Coordinator executor run time: 23.215 ms
Coordinator executor end time: 0.039 ms
Planner runtime: 0.166 ms
Plan size: 3189 byte
Query Id: 72339069014886763
Total runtime: 23.314 ms
(10 rows)
Where should one look for performance bottlenecks in a distributed deployment?
The performance advantage of a distributed architecture comes mainly from parallel computation across DNs. When reading an execution plan, check whether the operators have been pushed down to the DNs.
V. Data Skew
Build a test table and test data, and compare SQL performance and elapsed time with evenly distributed versus skewed data.
1. Log in to the gaussdb database.
gsql -d postgres -p 8000 -r
2. Create the test table.
DROP TABLE IF EXISTS test_6;
CREATE TABLE test_6(
a int,
b int
) DISTRIBUTE BY hash(a);
3. Insert the test data (takes about 60 s).
INSERT INTO test_6 select 1, generate_series(1,150000);
4. Set the display style to pretty.
set explain_perf_mode=pretty;
5. Run the test SQL.
EXPLAIN performance
SELECT count(b)
FROM test_6;
Analyzing the execution plan shows that the SQL time is spent in the DN-side scan and aggregation. The per-DN statistics show that the dn_6004_6005_6006 shard scans far more rows and spends far more time in aggregation than the others, suggesting the table's data may be skewed.
gaussdb=# EXPLAIN performance
gaussdb-# SELECT count(b)
gaussdb-# FROM test_6;
WARNING: Statistics in some tables or columns(public.test_6.a) are not collected.
HINT: Do analyze for them in order to generate optimized plan.
id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | A-width | E-width | E-costs
----+----------------------------------------+----------------+--------+--------+------------+--------------+---------+---------+---------
1 | -> Aggregate | 35.273 | 1 | 1 | | 11KB | | 12 | 14.31
2 | -> Streaming (type: GATHER) | 35.209 | 3 | 3 | | 81KB | | 12 | 14.31
3 | -> Aggregate | [0.006,32.448] | 3 | 3 | | [11KB, 11KB] | | 12 | 14.19
4 | -> Seq Scan on public.test_6 | [0.002,19.859] | 150000 | 30 | | [11KB, 53KB] | | 4 | 14.14
(4 rows)
Memory Information (identified by plan id)
-------------------------------------------------------------------------------
Coordinator Query Peak Memory:
Query Peak Memory: 0MB
DataNode Query Peak Memory
dn_6001_6002_6003 Query Peak Memory: 0MB
dn_6004_6005_6006 Query Peak Memory: 0MB
dn_6007_6008_6009 Query Peak Memory: 0MB
1 --Aggregate
Peak Memory: 11KB, Estimate Memory: 64MB
2 --Streaming (type: GATHER)
Peak Memory: 81KB, Estimate Memory: 64MB
3 --Aggregate
dn_6001_6002_6003 Peak Memory: 11KB, Estimate Memory: 64MB
dn_6004_6005_6006 Peak Memory: 11KB, Estimate Memory: 64MB
dn_6007_6008_6009 Peak Memory: 11KB, Estimate Memory: 64MB
dn_6001_6002_6003 Stream Send time: 0.000; Data Serialize time: 0.003
dn_6004_6005_6006 Stream Send time: 0.000; Data Serialize time: 0.012
dn_6007_6008_6009 Stream Send time: 0.000; Data Serialize time: 0.003
4 --Seq Scan on public.test_6
dn_6001_6002_6003 Peak Memory: 11KB, Estimate Memory: 64MB
dn_6004_6005_6006 Peak Memory: 53KB, Estimate Memory: 64MB
dn_6007_6008_6009 Peak Memory: 11KB, Estimate Memory: 64MB
(21 rows)
Targetlist Information (identified by plan id)
------------------------------------------------
1 --Aggregate
Output: count((count(b)))
2 --Streaming (type: GATHER)
Output: (count(b))
Node/s: All datanodes
3 --Aggregate
Output: count(b)
4 --Seq Scan on public.test_6
Output: a, b
Distribute Key: a
(10 rows)
Datanode Information (identified by plan id)
-----------------------------------------------------------------------------------------------
1 --Aggregate
(actual time=35.273..35.273 rows=1 loops=1)
(Buffers: 0)
(CPU: ex c/r=44760, ex row=3, ex cyc=134282, inc cyc=74496032)
2 --Streaming (type: GATHER)
(actual time=2.286..35.209 rows=3 loops=1)
(Buffers: 0)
(CPU: ex c/r=24787250, ex row=3, ex cyc=74361750, inc cyc=74361750)
3 --Aggregate
dn_6001_6002_6003 (actual time=0.006..0.006 rows=1 loops=1)
dn_6004_6005_6006 (actual time=32.448..32.448 rows=1 loops=1)
dn_6007_6008_6009 (actual time=0.006..0.006 rows=1 loops=1)
dn_6001_6002_6003 (Buffers: 0)
dn_6004_6005_6006 (Buffers: shared hit=573)
dn_6007_6008_6009 (Buffers: 0)
dn_6001_6002_6003 (CPU: ex c/r=0, ex row=0, ex cyc=8137, inc cyc=11366)
dn_6004_6005_6006 (CPU: ex c/r=239, ex row=150000, ex cyc=35886864, inc cyc=68529573)
dn_6007_6008_6009 (CPU: ex c/r=0, ex row=0, ex cyc=8792, inc cyc=12748)
4 --Seq Scan on public.test_6
dn_6001_6002_6003 (actual time=0.002..0.002 rows=0 loops=1)
dn_6004_6005_6006 (actual time=0.026..19.859 rows=150000 loops=1)
dn_6007_6008_6009 (actual time=0.002..0.002 rows=0 loops=1)
dn_6001_6002_6003 (Buffers: 0)
dn_6004_6005_6006 (Buffers: shared hit=573)
dn_6007_6008_6009 (Buffers: 0)
dn_6001_6002_6003 (CPU: ex c/r=0, ex row=0, ex cyc=3229, inc cyc=3229)
dn_6004_6005_6006 (CPU: ex c/r=217, ex row=150000, ex cyc=32642709, inc cyc=32642709)
dn_6007_6008_6009 (CPU: ex c/r=0, ex row=0, ex cyc=3956, inc cyc=3956)
(28 rows)
User Define Profiling
---------------------------------------------------------------------------
Plan Node id: 2 Track name: coordinator get datanode connection
cn_5001: (time=0.747 total_calls=1 loops=1)
Plan Node id: 2 Track name: Coordinator serialize plan
cn_5001: (time=0.655 total_calls=1 loops=1)
Plan Node id: 2 Track name: Coordinator send begin command
cn_5001: (time=0.002 total_calls=1 loops=1)
Plan Node id: 2 Track name: Coordinator start transaction and send query
cn_5001: (time=0.087 total_calls=1 loops=1)
(8 rows)
====== Query Summary =====
------------------------------------------------------------------------------------------
Datanode executor start time [dn_6001_6002_6003, dn_6004_6005_6006]: [0.183 ms,0.278 ms]
Datanode executor run time [dn_6001_6002_6003, dn_6004_6005_6006]: [0.011 ms,32.467 ms]
Datanode executor end time [dn_6001_6002_6003, dn_6004_6005_6006]: [0.006 ms,0.019 ms]
Remote query poll time: 0.000 ms, Deserialze time: 0.000 ms
Coordinator executor start time: 0.076 ms
Coordinator executor run time: 35.286 ms
Coordinator executor end time: 0.023 ms
Planner runtime: 0.300 ms
Plan size: 3149 byte
Query Id: 72339069014898956
Total runtime: 35.420 ms
(11 rows)
6. Check the table's data skew as follows.
gaussdb=# SELECT table_skewness('test_6');
table_skewness
------------------------------------------
("dn_6004_6005_6006 ",150000,100.000%)
("dn_6001_6002_6003 ",0,0.000%)
("dn_6007_6008_6009 ",0,0.000%)
(3 rows)
The current distribution column is skewed: all data is concentrated on a single DN. To avoid the skew, choose a different distribution column for the table.
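What table_skewness() reports is simply each DN's share of the table's rows. A minimal sketch (the helper name and the dictionary input are illustrative, not GaussDB's API), using the per-DN counts shown in this experiment:

```python
def table_skewness(rows_per_dn):
    # Each DN's share of the table's rows, as a percentage.
    total = sum(rows_per_dn.values())
    return {dn: round(100.0 * n / total, 3) for dn, n in rows_per_dn.items()}

# Before: every row had a = 1, so all 150000 rows hashed to one DN.
before = table_skewness({"dn_6001": 0, "dn_6004": 150000, "dn_6007": 0})
print(before["dn_6004"])  # 100.0

# After distributing by b (the counts from step 9), the shares are near-even.
after = table_skewness({"dn_6001": 50362, "dn_6004": 49959, "dn_6007": 49679})
print(after)
```

A share near 100/NUM_DN percent on every DN indicates an even distribution; any DN far above that is a skew hotspot and will dominate the query's elapsed time.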
7. Re-create the table, choosing an evenly distributed column as the distribution column.
DROP TABLE IF EXISTS test_6;
CREATE TABLE test_6(
a int,
b int
) DISTRIBUTE BY hash(b);
8. Insert the same test data (takes about 60 s).
INSERT INTO test_6 select 1, generate_series(1,150000);
9. Check the table's data skew.
gaussdb=# SELECT table_skewness('test_6');
table_skewness
----------------------------------------
("dn_6001_6002_6003 ",50362,33.575%)
("dn_6004_6005_6006 ",49959,33.306%)
("dn_6007_6008_6009 ",49679,33.119%)
(3 rows)
10. Run the test SQL.
EXPLAIN performance
SELECT count(b)
FROM test_6;
After adjusting the table's distribution column, the SQL execution time is significantly reduced (in the runs shown here, from roughly 35 ms to roughly 16 ms). The execution plan:
gaussdb=# EXPLAIN performance
gaussdb-# SELECT count(b)
gaussdb-# FROM test_6;
WARNING: Statistics in some tables or columns(public.test_6.b) are not collected.
HINT: Do analyze for them in order to generate optimized plan.
id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | A-width | E-width | E-costs
----+----------------------------------------+-----------------+--------+--------+------------+--------------+---------+---------+---------
1 | -> Aggregate | 16.325 | 1 | 1 | | 11KB | | 12 | 14.31
2 | -> Streaming (type: GATHER) | 16.285 | 3 | 3 | | 81KB | | 12 | 14.31
3 | -> Aggregate | [11.061,11.218] | 3 | 3 | | [11KB, 11KB] | | 12 | 14.19
4 | -> Seq Scan on public.test_6 | [6.703,7.271] | 150000 | 30 | | [53KB, 53KB] | | 4 | 14.14
(4 rows)
Memory Information (identified by plan id)
-------------------------------------------------------------------------------
Coordinator Query Peak Memory:
Query Peak Memory: 0MB
DataNode Query Peak Memory
dn_6001_6002_6003 Query Peak Memory: 0MB
dn_6004_6005_6006 Query Peak Memory: 0MB
dn_6007_6008_6009 Query Peak Memory: 0MB
1 --Aggregate
Peak Memory: 11KB, Estimate Memory: 64MB
2 --Streaming (type: GATHER)
Peak Memory: 81KB, Estimate Memory: 64MB
3 --Aggregate
dn_6001_6002_6003 Peak Memory: 11KB, Estimate Memory: 64MB
dn_6004_6005_6006 Peak Memory: 11KB, Estimate Memory: 64MB
dn_6007_6008_6009 Peak Memory: 11KB, Estimate Memory: 64MB
dn_6001_6002_6003 Stream Send time: 0.000; Data Serialize time: 0.011
dn_6004_6005_6006 Stream Send time: 0.000; Data Serialize time: 0.011
dn_6007_6008_6009 Stream Send time: 0.000; Data Serialize time: 0.011
4 --Seq Scan on public.test_6
dn_6001_6002_6003 Peak Memory: 53KB, Estimate Memory: 64MB
dn_6004_6005_6006 Peak Memory: 53KB, Estimate Memory: 64MB
dn_6007_6008_6009 Peak Memory: 53KB, Estimate Memory: 64MB
(21 rows)
Targetlist Information (identified by plan id)
------------------------------------------------
1 --Aggregate
Output: count((count(b)))
2 --Streaming (type: GATHER)
Output: (count(b))
Node/s: All datanodes
3 --Aggregate
Output: count(b)
4 --Seq Scan on public.test_6
Output: a, b
Distribute Key: b
(10 rows)
Datanode Information (identified by plan id)
----------------------------------------------------------------------------------------------
1 --Aggregate
(actual time=16.325..16.325 rows=1 loops=1)
(Buffers: 0)
(CPU: ex c/r=28218, ex row=3, ex cyc=84655, inc cyc=34477556)
2 --Streaming (type: GATHER)
(actual time=15.776..16.285 rows=3 loops=1)
(Buffers: 0)
(CPU: ex c/r=11464300, ex row=3, ex cyc=34392901, inc cyc=34392901)
3 --Aggregate
dn_6001_6002_6003 (actual time=11.062..11.063 rows=1 loops=1)
dn_6004_6005_6006 (actual time=11.218..11.218 rows=1 loops=1)
dn_6007_6008_6009 (actual time=11.060..11.061 rows=1 loops=1)
dn_6001_6002_6003 (Buffers: shared hit=192)
dn_6004_6005_6006 (Buffers: shared hit=190)
dn_6007_6008_6009 (Buffers: shared hit=189)
dn_6001_6002_6003 (CPU: ex c/r=243, ex row=50362, ex cyc=12244336, inc cyc=23363045)
dn_6004_6005_6006 (CPU: ex c/r=245, ex row=49959, ex cyc=12258915, inc cyc=23692690)
dn_6007_6008_6009 (CPU: ex c/r=244, ex row=49679, ex cyc=12157254, inc cyc=23358595)
4 --Seq Scan on public.test_6
dn_6001_6002_6003 (actual time=0.026..6.703 rows=50362 loops=1)
dn_6004_6005_6006 (actual time=0.028..7.271 rows=49959 loops=1)
dn_6007_6008_6009 (actual time=0.027..7.070 rows=49679 loops=1)
dn_6001_6002_6003 (Buffers: shared hit=192)
dn_6004_6005_6006 (Buffers: shared hit=190)
dn_6007_6008_6009 (Buffers: shared hit=189)
dn_6001_6002_6003 (CPU: ex c/r=220, ex row=50362, ex cyc=11118709, inc cyc=11118709)
dn_6004_6005_6006 (CPU: ex c/r=228, ex row=49959, ex cyc=11433775, inc cyc=11433775)
dn_6007_6008_6009 (CPU: ex c/r=225, ex row=49679, ex cyc=11201341, inc cyc=11201341)
(28 rows)
User Define Profiling
---------------------------------------------------------------------------
Plan Node id: 2 Track name: coordinator get datanode connection
cn_5001: (time=3.384 total_calls=1 loops=1)
Plan Node id: 2 Track name: Coordinator serialize plan
cn_5001: (time=0.650 total_calls=1 loops=1)
Plan Node id: 2 Track name: Coordinator send begin command
cn_5001: (time=0.002 total_calls=1 loops=1)
Plan Node id: 2 Track name: Coordinator start transaction and send query
cn_5001: (time=0.101 total_calls=1 loops=1)
(8 rows)
====== Query Summary =====
------------------------------------------------------------------------------------------
Datanode executor start time [dn_6001_6002_6003, dn_6007_6008_6009]: [0.083 ms,0.090 ms]
Datanode executor run time [dn_6007_6008_6009, dn_6004_6005_6006]: [11.078 ms,11.235 ms]
Datanode executor end time [dn_6007_6008_6009, dn_6004_6005_6006]: [0.014 ms,0.025 ms]
Remote query poll time: 0.000 ms, Deserialze time: 0.000 ms
Coordinator executor start time: 0.072 ms
Coordinator executor run time: 16.336 ms
Coordinator executor end time: 0.018 ms
Planner runtime: 0.195 ms
Plan size: 3142 byte
Query Id: 72339069014905495
Total runtime: 16.464 ms
(11 rows)
When choosing a distribution key, which kinds of columns should be avoided in order to prevent data skew?
Prefer columns with high cardinality, i.e. many distinct and evenly spread values. Avoid low-cardinality columns (constants, flags, status codes) and columns dominated by a single value or by NULLs, since hashing them concentrates rows on a few DNs.
