PostgreSQL aggregate function customize

Postgres2015全国用户大会将于11月20至21日在北京丽亭华苑酒店召开。本次大会嘉宾阵容强大,国内顶级PostgreSQL数据库专家将悉数到场,并特邀欧洲、俄罗斯、日本、美国等国家和地区的数据库方面专家助阵:

  • Postgres-XC项目的发起人铃木市一(SUZUKI Koichi)
  • Postgres-XL的项目发起人Mason Sharp
  • pgpool的作者石井达夫(Tatsuo Ishii)
  • PG-Strom的作者海外浩平(Kaigai Kohei)
  • Greenplum研发总监姚延栋
  • 周正中(德哥), PostgreSQL中国用户会创始人之一
  • 汪洋,平安科技数据库技术部经理
  • ……
 
  • 2015年度PG大象会报名地址:http://postgres2015.eventdove.com/
  • PostgreSQL中国社区: http://postgres.cn/
  • PostgreSQL专业1群: 3336901(已满)
  • PostgreSQL专业2群: 100910388
  • PostgreSQL专业3群: 150657323


昨天聊到了一个关于多行转换成单行以提高查询性能的场景。
这里用到了PostgreSQL的聚合函数, 将多行聚合成单行.
当然其他数据库也有聚合函数, 但是PostgreSQL 支持自建聚合函数, 这就给开发带来了较大的便利.
例如PostgreSQL 8.2 并没有昨天用到的array_agg聚合函数.
那么要实现同样的场景怎么办呢?
例如在GreenPlum 3.3.6中, 实际上是PostgreSQL 8.2.13的版本.
要实现同样的功能, 可以简单通过的创建一个自定义聚合函数来实现。
例如我们的目标是整合成app_id||'_'||rating 逗号隔开的一个text值. 
那么可以通过||这个操作符或者函数来实现, 如下.
-- 创建聚合函数

digoal=# create aggregate agg_append (text) (
sfunc = textcat,
stype = text);

-- 创建测试表

digoal=# create table recommendation_mpt (user_id int8, app_id numeric, rating numeric) distributed randomly;
CREATE TABLE

-- 插入测试数据

digoal=# insert into recommendation_mpt select generate_series(1,1000000), generate_series(1,41), random();
INSERT 0 41000000

-- 创建聚合后的表

digoal=# create table recommendation_mpt_new ( user_id int8, app_id text ) distributed randomly;
CREATE TABLE

-- 插入聚合后的数据

digoal=# insert into recommendation_mpt_new select user_id, agg_append(app_id||'_'||rating||',')  from  recommendation_mpt group by user_id;
INSERT 0 1000000

-- 查询结果

digoal=# \x
Expanded display is on.
digoal=# select * from recommendation_mpt_new limit 2;
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
user_id | 1
app_id  | 21_0.936229802202433,19_0.241772226057947,27_0.956521810032427,9_0.8754295501858,7_0.338058620225638,41_0.226744973100722,38_0.204116075765342,40_0.884842214174569,17_0.659971259534359,10_0.767988855019212,37_0.901241634506732,6_0.853149639908224,16_0.745244240853935,39_0.813799708615988,1_0.868315862957388,28_0.228956982959062,31_0.311310657765716,26_0.180641509592533,30_0.0854418091475964,29_0.289730631746352,36_0.90720307873562,18_0.533597331959754,8_0.459905118215829,20_0.144313692115247,11_0.18922403594479,25_0.202533770818263,24_0.994741758797318,2_0.458359555341303,34_0.530500751920044,4_0.789138578809798,14_0.557009539101273,15_0.853544842917472,22_0.110565081238747,13_0.238142486196011,3_0.734461918473244,32_0.154231588821858,5_0.299331326968968,33_0.53225368168205,35_0.422972090076655,12_0.806360349990427,23_0.380616633687168,
-[ RECORD 2 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
user_id | 346
app_id  | 21_0.923598561901599,39_0.534792800433934,29_0.0692782052792609,22_0.206713933963329,9_0.9313296796754,31_0.67749793222174,1_0.0934103117324412,10_0.695828275289387,41_0.621980169788003,11_0.87125027552247,30_0.079912081360817,40_0.478423351421952,19_0.387606668751687,20_0.496157462708652,8_0.552907709032297,24_0.0219633430242538,23_0.76717281434685,28_0.927976955194026,14_0.818206876516342,16_0.478273885324597,17_0.306423192843795,3_0.193629603367299,33_0.418421885930002,13_0.748691521584988,5_0.0270167640410364,36_0.358466576784849,34_0.444800178986043,27_0.790532193612307,26_0.380120156798512,35_0.933182283304632,2_0.253765208646655,37_0.922488207463175,4_0.0538468402810395,38_0.768641801085323,15_0.406475386582315,18_0.493758061900735,25_0.463594998698682,7_0.884454429149628,12_0.0971177448518574,32_0.395383660681546,6_0.936530550476164,

注意
因为aggregate函数支持order by是从9.0开始才有的, 所以在greenplum 3.3.6也就是PostgreSQL8.2中创建的aggregate当然无法使用orderby来对rating排序. 语法错误如下 : 

digoal=# insert into recommendation_mpt_new select user_id, agg_append(app_id||'_'||rating||',' order by rating desc)  from  recommendation_mpt group by user_id;
ERROR:  syntax error at or near "order"
LINE 1: ...lect user_id, agg_append(app_id||'_'||rating||',' order by r...
                                                             ^

当然array_agg也是可以实现的, 需要用到的函数是array_append. 如下 : 

digoal=# create aggregate array_agg (anyelement) (
  sfunc = array_append,
  stype = anyarray);

select user_id, array_agg(app_id||'_'||rating) from recommendation_mpt group by user_id;
                                              array_agg                                                                             
                                                                                                                                    
                                                                                                                                    
                                                                                            
---------+--------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------
       3 | {12_0.337564039975405,18_0.605327501893044,39_0.0606710785068572,30_0.249253052286804,31_0.68522938946262,1_0.58626456419
006,11_0.342015425674617,3_0.383131602779031,29_0.485176796559244,9_0.489787132013589,21_0.478506665676832,22_0.881940244697034,19_0
.501696343999356,38_0.611436820123345,23_0.740357991773635,40_0.657328587025404,2_0.973510192707181,13_0.987239361740649,20_0.827145
711518824,28_0.930344801396132,32_0.726034674327821,41_0.175069946330041,10_0.876182099804282,7_0.609756181947887,33_0.3557396410033
11,8_0.861955059692264,27_0.173304431606084,15_0.544581302441657,25_0.661107395309955,16_0.671136009506881,24_0.384254441130906,6_0.
0688453107140958,34_0.290453566238284,17_0.787856891285628,14_0.0598205337300897,26_0.78198639722541,5_0.641678368207067,37_0.560569
697991014,4_0.314639654941857,35_0.1592174815014,36_0.154369496740401}
...


那么排序怎么来实现呢?
我们来看看aggregate的语法, 有一个finalfunc这个是在每次分组的聚合处理完后执行的函数. 例如上面得到的结果

12_0.337564039975405,18_0.605327501893044,39_0.0606710785068572,30_0.249253052286804,31_0.68522938946262,1_0.58626456419
006,11_0.342015425674617,3_0.383131602779031,29_0.485176796559244,9_0.489787132013589,21_0.478506665676832,22_0.881940244697034,19_0
.501696343999356,38_0.611436820123345,23_0.740357991773635,40_0.657328587025404,2_0.973510192707181,13_0.987239361740649,20_0.827145
711518824,28_0.930344801396132,32_0.726034674327821,41_0.175069946330041,10_0.876182099804282,7_0.609756181947887,33_0.3557396410033
11,8_0.861955059692264,27_0.173304431606084,15_0.544581302441657,25_0.661107395309955,16_0.671136009506881,24_0.384254441130906,6_0.
0688453107140958,34_0.290453566238284,17_0.787856891285628,14_0.0598205337300897,26_0.78198639722541,5_0.641678368207067,37_0.560569
697991014,4_0.314639654941857,35_0.1592174815014,36_0.154369496740401

对这个结果进行处理, 每个分组都会调用一次
我们可以利用这个finalfunc指定的函数来对每个分组的聚合结果进行排序.
例如 : 

digoal=# select split_part(i,'_',1) as app_id, split_part(i,'_',2) as rating from regexp_split_to_table('21_0.923598561901599,39_0.534792800433934,29_0.0692782052792609,22_0.206713933963329,9_0.9313296796754,31_0.67749793222174,1_0.0934103117324412,10_0.695828275289387,41_0.621980169788003,11_0.87125027552247,30_0.079912081360817,40_0.478423351421952,19_0.387606668751687,20_0.496157462708652,8_0.552907709032297,24_0.0219633430242538,23_0.76717281434685,28_0.927976955194026,14_0.818206876516342,16_0.478273885324597,17_0.306423192843795,3_0.193629603367299,33_0.418421885930002,13_0.748691521584988,5_0.0270167640410364,36_0.358466576784849,34_0.444800178986043,27_0.790532193612307,26_0.380120156798512,35_0.933182283304632,2_0.253765208646655,37_0.922488207463175,4_0.0538468402810395,38_0.768641801085323,15_0.406475386582315,18_0.493758061900735,25_0.463594998698682,7_0.884454429149628,12_0.0971177448518574,32_0.395383660681546,6_0.936530550476164',',') t(i) order by 2 desc;
 app_id |       rating       
--------+--------------------
 6      | 0.936530550476164
 35     | 0.933182283304632
 9      | 0.9313296796754
 28     | 0.927976955194026
 21     | 0.923598561901599
 37     | 0.922488207463175
 7      | 0.884454429149628
 11     | 0.87125027552247
 14     | 0.818206876516342
 27     | 0.790532193612307
 38     | 0.768641801085323
 23     | 0.76717281434685
 13     | 0.748691521584988
 10     | 0.695828275289387
 31     | 0.67749793222174
 41     | 0.621980169788003
 8      | 0.552907709032297
 39     | 0.534792800433934
 20     | 0.496157462708652
 18     | 0.493758061900735
 40     | 0.478423351421952
 16     | 0.478273885324597
 25     | 0.463594998698682
 34     | 0.444800178986043
 33     | 0.418421885930002
 15     | 0.406475386582315
 32     | 0.395383660681546
 19     | 0.387606668751687
 26     | 0.380120156798512
 36     | 0.358466576784849
 17     | 0.306423192843795
 2      | 0.253765208646655
 22     | 0.206713933963329
 3      | 0.193629603367299
 12     | 0.0971177448518574
 1      | 0.0934103117324412
 30     | 0.079912081360817
 29     | 0.0692782052792609
 4      | 0.0538468402810395
 5      | 0.0270167640410364
 24     | 0.0219633430242538
(41 rows)

但是greenplum中有点问题, 如下 : 

create table recommendation_mpt (user_id int8, app_id numeric, rating numeric);
insert into recommendation_mpt select generate_series(1,10000), generate_series(1,41), random();

create aggregate agg_append (text) (
sfunc = textcat,
stype = text,
FINALFUNC = final_array_agg);

create aggregate array_agg (anyelement) (
sfunc = array_append,
stype = anyarray);

create or replace function final_array_agg (i_text text) returns text[] as $$
declare
  result text[];
begin
  select array_agg(app_id||'_'||rating) into result from 
    (select split_part(i,'_',1) as app_id, split_part(i,'_',2) as rating from 
      regexp_split_to_table(i_text,',') t(i)
      order by 2 desc) t;
  return result;
end;
$$ language plpgsql strict;

select user_id, agg_append(app_id||'_'||rating||',') from recommendation_mpt group by user_id;

digoal=# select user_id, agg_append(app_id||'_'||rating||',') from recommendation_mpt group by user_id;
ERROR:  Functions that execute SQL statements from the segDBs are not yet supported (spi.c:203)  (seg10 slice2 dw-host6-if0:50001 pid=19758)
DETAIL:  
         SQL statement "select array_agg(app_id||'_'||rating) from (select split_part(i,'_',1) app_id, split_part(i,'_',2) rating from regexp_split_to_table( $1 ,',') t(i) order by 2 desc) t"
PL/pgSQL function "final_array_agg" line 4 at SQL statement

在PostgreSQL 8.3的版本中测试正常, 如下 : 

digoal=# select user_id, agg_append(app_id||'_'||rating||',') from recommendation_mpt group by user_id;
 user_id |                                                                                                                          
                                                                                                                                    
                                                                                                                                    
                                           agg_append                                                                               
                                                                                                                                    
                                                                                                                                    
                                                                                       
---------+--------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------
       1 | {25_0.981541138608009,15_0.955886580049992,17_0.955500768963248,28_0.950485301204026,41_0.941958157345653,8_0.93016316043
213,34_0.903705002740026,24_0.898381588049233,38_0.884811857715249,14_0.84847314003855,2_0.812469538301229,21_0.789327840320766,19_0
.777864037547261,22_0.730009535793215,9_0.729899833910167,10_0.662983400281519,32_0.638831827323884,12_0.624156034551561,5_0.6165286
74494475,31_0.61100034089759,16_0.598527556750923,27_0.596998091321439,7_0.546153449919075,37_0.539897245820612,1_0.534418095834553,
26_0.485467836726457,20_0.484848829451948,35_0.434873293153942,36_0.328121873084456,30_0.325269365217537,40_0.27276889514178,6_0.267
069734167308,11_0.221765073016286,29_0.164679736364633,33_0.162994640879333,3_0.161364886444062,18_0.151077626738697,13_0.1455773869
52937,39_0.126275546848774,4_0.113149816170335,23_0.109499100595713,_}
...

以上的解决办法是 聚合(agg_append) - 每条聚合结果打散(regexp_split_to_table) - 每条打散后的结果集再次进行聚合(array_agg) 的过程. 所以效率不高.
注意到前面的排序后的结果中末尾出现了一个array元素='_', 这个是怎么造成的呢?

digoal=# select split_part(i,'_',1) as app_id, split_part(i,'_',2) as rating from regexp_split_to_table('21_0.923598561901599,39_0.534792800433934,29_0.0692782052792609,22_0.206713933963329,9_0.9313296796754,31_0.67749793222174,1_0.0934103117324412,10_0.695828275289387,41_0.621980169788003,11_0.87125027552247,30_0.079912081360817,40_0.478423351421952,19_0.387606668751687,20_0.496157462708652,8_0.552907709032297,24_0.0219633430242538,23_0.76717281434685,28_0.927976955194026,14_0.818206876516342,16_0.478273885324597,17_0.306423192843795,3_0.193629603367299,33_0.418421885930002,13_0.748691521584988,5_0.0270167640410364,36_0.358466576784849,34_0.444800178986043,27_0.790532193612307,26_0.380120156798512,35_0.933182283304632,2_0.253765208646655,37_0.922488207463175,4_0.0538468402810395,38_0.768641801085323,15_0.406475386582315,18_0.493758061900735,25_0.463594998698682,7_0.884454429149628,12_0.0971177448518574,32_0.395383660681546,6_0.936530550476164,',',') t(i) where split_part(i,'_',1) = '' order by 2 desc;
 app_id | rating 
--------+--------
        | 
(1 row)

因为agg_append聚合在处理finalfunc前的值是末尾带了个逗号的, 在 regexp_split_to_table处理时会多出一行app_id, rating = ('','')的结果. 因此在final_array_agg中过滤即可.

create or replace function final_array_agg (i_text text) returns text[] as $$
declare
  result text[];
begin
  select array_agg(app_id||'_'||rating) into result from 
    (select split_part(i,'_',1) as app_id, split_part(i,'_',2) as rating from 
      regexp_split_to_table(i_text,',') t(i) where split_part(i,'_', 2) <> ''
      order by 2 desc) t;
  return result;
end;
$$ language plpgsql strict;

现在的结果就正常了, 如下 : 

postgres=# select user_id, agg_append(app_id||'_'||rating||',') from (select * from recommendation_mpt order by user_id,rating desc) t group by user_id;
 user_id |                                                                                                                          
                                                                                                                                    
                                                                                                                                    
                                          agg_append                                                                                
                                                                                                                                    
                                                                                                                                    
                                                                                     
---------+--------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------
       1 | {25_0.981541138608009,15_0.955886580049992,17_0.955500768963248,28_0.950485301204026,41_0.941958157345653,8_0.93016316043
213,34_0.903705002740026,24_0.898381588049233,38_0.884811857715249,14_0.84847314003855,2_0.812469538301229,21_0.789327840320766,19_0
.777864037547261,22_0.730009535793215,9_0.729899833910167,10_0.662983400281519,32_0.638831827323884,12_0.624156034551561,5_0.6165286
74494475,31_0.61100034089759,16_0.598527556750923,27_0.596998091321439,7_0.546153449919075,37_0.539897245820612,1_0.534418095834553,
26_0.485467836726457,20_0.484848829451948,35_0.434873293153942,36_0.328121873084456,30_0.325269365217537,40_0.27276889514178,6_0.267
069734167308,11_0.221765073016286,29_0.164679736364633,33_0.162994640879333,3_0.161364886444062,18_0.151077626738697,13_0.1455773869
52937,39_0.126275546848774,4_0.113149816170335,23_0.109499100595713}

同样可以不需要rating的值, 也是修改final_array_agg函数, 如下 : 

create or replace function final_array_agg (i_text text) returns text[] as $$
declare
  result text[];
begin
  select array_agg(app_id) into result from 
    (select split_part(i,'_',1) as app_id, split_part(i,'_',2) as rating from 
      regexp_split_to_table(i_text,',') t(i) where split_part(i,'_',2) <> ''
      order by 2 desc) t;
  return result;
end;
$$ language plpgsql strict;

修改后执行结果如下 : 

digoal=# select user_id, agg_append(app_id||'_'||rating||',') from recommendation_mpt group by user_id;
 user_id |                                                     agg_append                                                      
---------+---------------------------------------------------------------------------------------------------------------------
       1 | {25,15,17,28,41,8,34,24,38,14,2,21,19,22,9,10,32,12,5,31,16,27,7,37,1,26,20,35,36,30,40,6,11,29,33,3,18,13,39,4,23}
2 | {18,36,32,8,16,39,40,34,3,41,6,14,31,33,27,5,13,4,26,37,22,10,15,12,9,21,1,20,11,35,17,25,28,24,30,7,2,19,38,23,29}
3 | {2,11,15,34,24,39,30,28,17,20,41,23,7,25,36,26,8,5,22,9,1,3,12,13,40,29,4,37,27,19,21,14,38,10,31,33,18,35,6,16,32}
4 | {25,2,28,14,4,11,31,20,27,35,40,21,7,32,29,6,36,8,15,26,9,41,39,18,19,22,30,13,38,37,23,5,12,16,34,3,17,10,33,24,1}
5 | {18,31,22,29,8,30,38,39,32,25,15,13,21,10,40,27,36,12,6,23,3,1,4,33,16,17,2,35,5,19,7,26,11,37,41,24,28,20,14,34,9}
...

与PostgreSQL 9.0开始自带的array_agg比较, 效率如何呢? 下面在PostgreSQL 9.2.1 同一个数据库中进行的测试 : 

create table recommendation_mpt (user_id int8, app_id numeric, rating numeric, primary key (user_id,app_id));
insert into recommendation_mpt select generate_series(1,100000), generate_series(1,41), random();
-- 为了插入时避免xlog产生瓶颈, 使用unlogged table, 并且不使用PK : 
create unlogged table recommendation_mpt_new (user_id int8, app_id numeric[]);

PostgreSQL 9.0 自带的array_agg : 

digoal=# \timing
Timing is on.
digoal=# insert into recommendation_mpt_new select user_id, array_agg(app_id order by rating desc)  from  recommendation_mpt group by user_id;
INSERT 0 100000
Time: 10670.576 ms
执行计划
digoal=# explain select user_id, array_agg(app_id order by rating desc) from recommendation_mpt group by user_id;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=0.00..130574.58 rows=100338 width=24)
-> Index Scan using recommendation_mpt_pkey on recommendation_mpt (cost=0.00..108820.36 rows=4100000 width=24)
(2 rows)
digoal=# select * from recommendation_mpt_new limit 5;
 user_id |                                                       app_id                                                        
---------+---------------------------------------------------------------------------------------------------------------------
       1 | {19,3,38,36,2,15,16,31,39,14,6,29,37,8,33,40,25,21,23,34,4,18,10,1,27,13,11,20,9,32,26,17,35,24,12,5,22,30,28,41,7}
       2 | {9,11,5,4,10,27,22,31,30,33,3,23,32,20,28,14,39,18,35,15,36,13,7,24,1,6,34,8,40,2,19,37,21,38,17,26,12,41,25,29,16}
       3 | {11,32,25,35,34,21,40,19,24,38,12,37,20,2,31,17,5,1,8,33,28,22,14,10,29,9,23,18,6,41,27,13,16,4,26,36,15,39,3,30,7}
       4 | {7,8,14,20,27,38,16,11,25,39,13,40,31,9,19,29,24,4,41,10,22,12,34,6,17,18,23,36,33,3,37,35,28,2,21,5,15,32,1,30,26}
       5 | {2,31,40,20,27,16,30,5,32,26,28,33,41,29,14,7,39,12,1,19,13,22,17,4,10,9,24,3,8,11,15,36,34,23,25,6,38,18,21,37,35}
(5 rows)

结果HASH : 

digoal=# select sum(hashtext((t.*)::text)) from recommendation_mpt_new t;
     sum      
--------------
 122980440213
(1 row)


自定义的agg_append : 

create aggregate agg_array (anyelement) (
sfunc = array_append,
stype = anyarray);

create or replace function final_agg_array (i_text text) returns text[] as $$
declare
  result text[];
begin
  select array_agg(app_id) into result from 
    (select split_part(i,'_',1) as app_id, split_part(i,'_',2) as rating from 
      regexp_split_to_table(i_text,',') t(i) where split_part(i,'_',2) <> ''
      order by 2 desc) t;
  return result;
end;
$$ language plpgsql strict;

create aggregate agg_append (text) (
sfunc = textcat,
stype = text,
FINALFUNC = final_agg_array);

digoal=# truncate recommendation_mpt_new ;
TRUNCATE TABLE

digoal=# insert into recommendation_mpt_new 
select user_id, (agg_append(app_id||'_'||rating||','))::numeric[] from recommendation_mpt group by user_id;
INSERT 0 100000
Time: 86763.557 ms
执行计划
digoal=# explain select user_id, agg_append(app_id||'_'||rating||',') from recommendation_mpt group by user_id;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=0.00..227158.24 rows=100338 width=24)
-> Index Scan using recommendation_mpt_pkey on recommendation_mpt (cost=0.00..108820.36 rows=4100000 width=24)
(2 rows)
digoal=# select * from recommendation_mpt_new limit 5;
 user_id |                                                       app_id                                                        
---------+---------------------------------------------------------------------------------------------------------------------
       1 | {19,3,38,36,2,15,16,31,39,14,6,29,37,8,33,40,25,21,23,34,4,18,10,1,27,13,11,20,9,32,26,17,35,24,12,5,22,30,28,41,7}
       2 | {9,11,5,4,10,27,22,31,30,33,3,23,32,20,28,14,39,18,35,15,36,13,7,24,1,6,34,8,40,2,19,37,21,38,17,26,12,41,25,29,16}
       3 | {11,32,25,35,34,21,40,19,24,38,12,37,20,2,31,17,5,1,8,33,28,22,14,10,29,9,23,18,6,41,27,13,16,4,26,36,15,39,3,30,7}
       4 | {7,8,14,20,27,38,16,11,25,39,13,40,31,9,19,29,24,4,41,10,22,12,34,6,17,18,23,36,33,3,37,35,28,2,21,5,15,32,1,30,26}
       5 | {2,31,40,20,27,16,30,5,32,26,28,33,41,29,14,7,39,12,1,19,13,22,17,4,10,9,24,3,8,11,15,36,34,23,25,6,38,18,21,37,35}
(5 rows)

结果HASH : 

digoal=# select sum(hashtext((t.*)::text)) from recommendation_mpt_new t;
     sum      
--------------
 122980440213
(1 row)

hash结果一致,
由此可见, 两者实现了同样的功能但是效率相差了8倍多.

【参考】
4. 查看当前所有的aggregate函数

\da *.*
digoal=> \da *.*
                                                                                 List of aggregate functions
   Schema   |      Name      |      Result data type       |        Argument data types         |                                   
     Description                                         
------------+----------------+-----------------------------+------------------------------------+-----------------------------------
---------------------------------------------------------
 digoal     | agg_append     | text                        | text                               | 
 pg_catalog | array_agg      | anyarray                    | anyelement                         | concatenate aggregate input into a
n array
 pg_catalog | avg            | numeric                     | bigint                             | the average (arithmetic mean) as n
umeric of all bigint values
 pg_catalog | avg            | double precision            | double precision                   | the average (arithmetic mean) as f
loat8 of all float8 values
....

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值