The Greenplum function gp_dist_random

Latest recommended article published on 2024-04-16 16:42:24

weixin_34032621

Tags: runtime, database

Original link: http://www.cnblogs.com/xibuhaohao/p/11133294.html

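Greenplum's gp_dist_random function runs a query on every segment node, for example gp_dist_random('gp_id'). It is a way around the case where a function such as random() would otherwise execute only on the master. It is commonly used for distributed statistics, such as querying the size of a database or table. Because the query executes on each segment, it returns one result per segment, so multiple rows come back. To limit the number of rows returned, combine it with LIMIT, but execution still visits every segment.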

Reposted from: https://yq.aliyun.com/articles/7593

What the function does:

gp_dist_random('gp_id') essentially queries gp_id on every segment node,
and gp_dist_random('pg_authid') likewise queries pg_authid on every segment node.
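
A minimal sketch of what "querying on every node" looks like in practice, assuming a superuser session (reading pg_authid normally requires one) and that the gp_segment_id system column is available on the rows returned through gp_dist_random. The column alias role_rows is made up for illustration.

```sql
-- Count the pg_authid rows held by each segment instance.
-- gp_segment_id identifies the segment that produced each row.
SELECT gp_segment_id, count(*) AS role_rows
FROM gp_dist_random('pg_authid')
GROUP BY gp_segment_id
ORDER BY gp_segment_id;
```

Each segment should contribute one row to the result. A count that differs between segments would suggest the catalog copies are out of sync.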

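When you call a function in Greenplum, it will often execute on the master and never be sent to the segments.
The random() function is one example.
A plain select random() does not need to be dispatched to the segment nodes, so the execution plan below contains no Gather Motion step.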

postgres=# explain analyze select random();  
                                       QUERY PLAN                                         
----------------------------------------------------------------------------------------  
 Result  (cost=0.01..0.02 rows=1 width=0)  
   Rows out:  1 rows with 0.017 ms to end, start offset by 0.056 ms.  
   InitPlan  
     ->  Result  (cost=0.00..0.01 rows=1 width=0)  
           Rows out:  1 rows with 0.004 ms to end of 2 scans, start offset by 0.059 ms.  
 Slice statistics:  
   (slice0)    Executor memory: 29K bytes.  
   (slice1)    Executor memory: 29K bytes.  
 Statement statistics:  
   Memory used: 128000K bytes  
 Total runtime: 0.074 ms  
(11 rows)

So how do you make the function execute on the segments?
Call it through gp_dist_random('gp_id'). The argument to gp_dist_random is a queryable view or table.

postgres=# explain analyze select random() from gp_dist_random('gp_id');  
                                                               QUERY PLAN                                                                  
-----------------------------------------------------------------------------------------------------------------------------------------  
 Gather Mot
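
The distributed-statistics use mentioned in the summary can be sketched as follows. This is an illustration rather than the original author's exact queries: it assumes gp_id holds one row on each segment instance, so each segment evaluates the function once, and the aliases bytes and total_bytes are made up for the example.

```sql
-- Size of the current database as seen by each segment, one row per segment.
SELECT gp_segment_id, pg_database_size(current_database()) AS bytes
FROM gp_dist_random('gp_id');

-- Add up the per-segment sizes (segments only; the master is not included).
SELECT sum(pg_database_size(current_database())) AS total_bytes
FROM gp_dist_random('gp_id');

-- LIMIT trims the rows returned to the client;
-- per the summary, the query is still executed on every segment.
SELECT pg_database_size(current_database()) AS bytes
FROM gp_dist_random('gp_id')
LIMIT 1;
```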
