今天看到网上的例子,可以参考PostgreSQL聚合例子,如下:
postgres=# create table test(id int, info text);
CREATE TABLE
postgres=# insert into test values (1,'test1');
INSERT 0 1
postgres=# insert into test values (1,'test1');
INSERT 0 1
postgres=# insert into test values (1,'test2');
INSERT 0 1
postgres=# insert into test values (1,'test3');
INSERT 0 1
postgres=# insert into test values (2,'test1');
INSERT 0 1
postgres=# insert into test values (2,'test1');
INSERT 0 1
postgres=# insert into test values (2,'test1');
INSERT 0 1
postgres=# insert into test values (3,'test4');
INSERT 0 1
postgres=# insert into test values (3,'test4');
INSERT 0 1
postgres=# insert into test values (3,'test4');
INSERT 0 1
postgres=# insert into test values (3,'test4');
INSERT 0 1
postgres=# insert into test values (3,'test4');
INSERT 0 1
postgres=# select * from test;
id | info
----+-------
1 | test1
1 | test1
1 | test2
1 | test3
2 | test1
2 | test1
2 | test1
3 | test4
3 | test4
3 | test4
3 | test4
3 | test4
(12 rows)
根据id分组并统计每个分组的info列出现频率最高的值以及其出现的次数。
可以使用 subquery解决。
sde=# select id,info,cnt from
sde-# (select id,info,cnt,row_number() over(partition by id order by cnt desc) as rn from
sde(# (select id,info,count(*) cnt from test group by id,info) t) t
sde-# where t.rn=1;
id | info | cnt
----+-------+-----
1 | test1 | 2
2 | test1 | 3
3 | test4 | 5
(3 行记录)
如果使用with语法,结构会更清晰,如下所示:
sde=# with t1 as
sde-# (
sde(# select id,info,count(*) cnt from test group by id,info
sde(# ),
sde-# t2 as
sde-# (
sde(# select id,info,cnt,row_number() over (partition by id order by cnt desc) as rn from t1
sde(# ),
sde-# t3 as
sde-# (
sde(# select id,info,cnt from t2 where rn=1
sde(# )
sde-#
sde-# select * from t3;
id | info | cnt
----+-------+-----
1 | test1 | 2
2 | test1 | 3
3 | test4 | 5
(3 行记录)