GROUP BY子句及其扩展再续


 
 
 
 
 
 
 
 
 
 
 
 
 
GROUP BY子句及其扩展续

分析函数

 
字号:    

GROUP BY子句及其扩展再续

3、三个grouping函数扩展

    三个grouping函数是为了更清楚的显示哪些行是再聚合的结果,因此它们都只支持GROUP BY ROLLUP/CUBE/GROUPING SETS()子句,它们有着“美化”这三个子句的效果。

    GROUP_ID()

GROUP_ID distinguishes duplicate groups resulting from a GROUP BY specification. It returns an Oracle NUMBER to uniquely identify duplicate groups. If n duplicates exist for a particular grouping, then GROUP_ID returns numbers in the range 0 to n-1.一句话,就是这个函数用来标记经过GROUP BY ROLLUP/CUBE帅选之后的重复记录的行数。

先看个简单的例子:

23:13:31 SQL> select job,comm,count(*) num,group_id() from emp group by job,rollup(comm,job) order by 1,2,num;

JOB             COMM        NUM GROUP_ID()

--------- ---------- ---------- ----------

ANALYST                       2          0

ANALYST                       2          0

ANALYST                       2          1

CLERK                         4          1

CLERK                         4          0

CLERK                         4          0

MANAGER                       3          0

MANAGER                       3          1

MANAGER                       3          0

PRESIDENT                     1          1

PRESIDENT                     1          0

PRESIDENT                     1          0

SALESMAN           0          2          1

SALESMAN           0          2          0

SALESMAN         300          2          1

SALESMAN         300          2          0

SALESMAN                      4          0

已选择17行。

    上面这个例子证明了GROUP_ID的含义,再请看下面这个例子

23:06:06 SQL> select * from t;

A B                   C          D          E

- ---------- ---------- ---------- ----------

a test1               1         11        111

a test1               2         11        111

23:06:42 SQL> select a,b,count(*) num,group_id() from t group by rollup(a,b);

A B                 NUM GROUP_ID()

- ---------- ---------- ----------

a test1               2          0

a                     2          0

                      2          0

23:07:18 SQL> select a,b,count(*) num,group_id() from t group by rollup(a,b,d);

A B                 NUM GROUP_ID()

- ---------- ---------- ----------

a test1               2          0  -- 第一行

a test1               2          0  -- 第二行

a                     2          0

                      2          0

这里最后一个查询的前两行(粉红色部分)不是一摸一样吗?为什么在第二行的GROUP_ID()还是0而不是1呢?请注意看这里group by rollup(a,b,d),多了个d列,由此可以知道:假设第一行是通过GROUP BY A,B,D而来,那么第二行就是通过GROUP BY A,B而来,这是两个不同的GROUP BY帅选条件了,即使得到同样的结果,GROUP_ID()也不认为它们是重复的。

    因此,GROUP_ID()函数是标记相同的GROUP BY条件帅选出来的结果的重复值

   

GROUPING(exp)

GROUPING distinguishes superaggregate rows from regular grouped rows. GROUP BY extensions such as ROLLUP and CUBE produce superaggregate rows where the set of all values is represented by null.

也就是说,GROUPING函数是标记出除了exp而外的GROUP BY中表达式的被再聚合的行,即exp IS NULL并且是其他列被再聚合的行,而不管这种再聚合是ROLLUP还是CUBEexpNULL的时候,就标记为1,否则为0

下面看个例子来说明这个问题:

23:33:11 SQL> select index_type,status,count(*) num,grouping(status) from test group by rollup(index_type,status);

INDEX_TYPE                  STATUS                 NUM GROUPING(STATUS)

--------------------------- --------------- ---------- ----------------

LOB                         N/A                      1                0

LOB                         VALID                  566                0

LOB                                                567                1  -- 第一行,

因为statusnull,故GROUPING(STATUS)标记为1

BITMAP                      N/A                      7                0

BITMAP                      VALID                    8                0

BITMAP                                              15                1

DOMAIN                      VALID                    1                0

DOMAIN                                               1                1

NORMAL                      N/A                     56                0

NORMAL                      VALID                 1565                0

NORMAL                                            1621                1

CLUSTER                     VALID                   10                0

CLUSTER                                             10                1

IOT - TOP                   VALID                  114                0

IOT - TOP                                          114                1

FUNCTION-BASED DOMAIN       VALID                    1                0

FUNCTION-BASED DOMAIN                                1                1

FUNCTION-BASED NORMAL       VALID                   17                0

FUNCTION-BASED NORMAL                               17                1

                                                  2346                1

已选择20行。

-- 注意看上面的粉红色部分,被标记为1的都是对除了status列外GROUP BY表达式中剩余部分,即IDNEX_TYPE列,进行再聚合的行。我以第一行做为讲例,它是通过GROUP BY INDEX_TYPE,即GROUP BY ROLLUP(INDEX_TYPE,STATUS)GROUP BY (INDEX)。所有被标记为1的都是GROUP BY ROLLUP(INDEX_TYPE,STATUS)去除STATUSGROUP BY ROLLUP(INDEX_TYPE)情况。

这里其实就是标记出INDEX_TYPE的小计。

-- 我们来看相反的情况,把INDEX_TYPE列设置为要去除的列,看看结果:

23:36:33 SQL> select index_type,status,count(*) num,grouping(index_type) from test group by rollup(index_type,status);

INDEX_TYPE                  STATUS                 NUM GROUPING(INDEX_TYPE)

--------------------------- --------------- ---------- --------------------

LOB                         N/A                      1                    0

LOB                         VALID                  566                    0

LOB                                                567                    0

BITMAP                      N/A                      7                    0

BITMAP                      VALID                    8                    0

BITMAP                                              15                    0

DOMAIN                      VALID                    1                    0

DOMAIN                                               1                    0

NORMAL                      N/A                     56                    0

NORMAL                      VALID                 1565                    0

NORMAL                                            1621                    0

CLUSTER                     VALID                   10                    0

CLUSTER                                             10                    0

IOT - TOP                   VALID                  114                    0

IOT - TOP                                          114                    0

FUNCTION-BASED DOMAIN       VALID                    1                    0

FUNCTION-BASED DOMAIN                                1                    0

FUNCTION-BASED NORMAL       VALID                   17                    0

FUNCTION-BASED NORMAL                               17                    0

                                                  2346                    1

已选择20行。

-- 完全正确,因为GROUP BY ROLLUP(INDEX_TYPE,STAUS)去除INDEX_TYPE而得到的GROUP BY ROLLUP(STATUS),它只要GROUP BY(0)这种情况。

-- 下面我们看看CUBE,去除STATUS,即STATUS IS NULL INDEX_TYPE的再聚合情况

23:33:25 SQL> select index_type,status,count(*) num,grouping(status) from test group by cube(index_type,status);

INDEX_TYPE                  STATUS                 NUM GROUPING(STATUS)

--------------------------- --------------- ---------- ----------------

                                                  2346                1

                            N/A                     64                0

                            VALID                 2282                0

LOB                                                567                1

LOB                         N/A                      1                0

LOB                         VALID                  566                0

BITMAP                                              15                1

BITMAP                      N/A                      7                0

BITMAP                      VALID                    8                0

DOMAIN                                               1                1

DOMAIN                      VALID                    1                0

NORMAL                                            1621                1

NORMAL                      N/A                     56                0

NORMAL                      VALID                 1565                0

CLUSTER                                             10                1

CLUSTER                     VALID                   10                0

IOT - TOP                                          114                1

IOT - TOP                   VALID                  114                0

FUNCTION-BASED DOMAIN                                1                1

FUNCTION-BASED DOMAIN       VALID                    1                0

FUNCTION-BASED NORMAL                               17                1

FUNCTION-BASED NORMAL       VALID                   17                0

已选择22行。

-- 下面是去除INDEX_TYPE,即INDEX_TYPE IS NULLSTATUS的再聚合情况

23:39:02 SQL> select index_type,status,count(*) num,grouping(index_type) from test group by cube(index_type,status);

 

INDEX_TYPE                  STATUS                 NUM GROUPING(INDEX_TYPE)

--------------------------- --------------- ---------- --------------------

                                                  2346                    1

                            N/A                     64                    1

                            VALID                 2282                    1

LOB                                                567                    0

LOB                         N/A                      1                    0

LOB                         VALID                  566                    0

BITMAP                                              15                    0

BITMAP                      N/A                      7                    0

BITMAP                      VALID                    8                    0

DOMAIN                                               1                    0

DOMAIN                      VALID                    1                    0

NORMAL                                            1621                    0

NORMAL                      N/A                     56                    0

NORMAL                      VALID                 1565                    0

CLUSTER                                             10                    0

CLUSTER                     VALID                   10                    0

IOT - TOP                                          114                    0

IOT - TOP                   VALID                  114                    0

FUNCTION-BASED DOMAIN                                1                    0

FUNCTION-BASED DOMAIN       VALID                    1                    0

FUNCTION-BASED NORMAL                               17                    0

FUNCTION-BASED NORMAL       VALID                   17                    0

已选择22行。

-- 下面是说明GROUPING函数的参数只能是一个表达式

23:39:19 SQL> select index_type,status,count(*) num,grouping(index_type,status) from test group by cube(index_type,status);

select index_type,status,count(*) num,grouping(index_type,status) from test group by cube(index_type,status)

                                      *

1 行出现错误:

ORA-00909: 参数个数无效

 

23:40:12 SQL> select index_type,status,count(*) num,grouping(status,index_type) from test group by cube(index_type,status);

select index_type,status,count(*) num,grouping(status,index_type) from test group by cube(index_type,status)

                                      *

1 行出现错误:

ORA-00909: 参数个数无效

     通过运用GROUPING(exp),就可以很简单的找出被再聚合的那些列的聚合结果,即GROUPING(exp)=1,反应了GROUP BY子句中其它列的聚合结果。通俗的说法,即GROUPING(exp)=1所在的那一行,是GROUP BY子句中除了exp外其它部分的一个小计统计。具体请看上面的例子。

 

GROUPING_ID(exp[,...])

GROUPING_ID is functionally equivalent to taking the results of multiple GROUPING functions and concatenating them into a bit vector (a string of ones and zeros).

GROUPING_ID returns a number corresponding to the GROUPING bit vector associated

with a row. GROUPING_ID is applicable only in a SELECT statement that contains a GROUP BY extension, such as ROLLUP or CUBE, and a GROUPING function.

    它的含义是:

GROUPING_ID(exp1,exp2,...,expN)={GROUPING(exp1)||GROUPING(exp2)||...||GROUPING(expN)}变成十进制数,如:

如果GROUPING(A)=1,GROUPING(B)=0,GROUPING(C)=1,那么

GROUPING_ID(A,B,C) = [101]二进制 = 5

GROUPING_ID(B,A,C) = [011]二进制 = 3.

    因此,你就可以很简单得通过GROUPING_ID函数来得到列的再聚合情况,如上面的GROUPING_ID(A,B,C) = 5,并且我们假设查询语句的GROUP BY子句是GROUP BY CUBE(A,B,C),那么它的含义就表示对A,C列进行了GROUPING结果,即B列进行的聚合小计(除了A,C以外的GROUP BY中的剩余列的聚合小计,这个是根据GROUPING函数的含义而来)。

    我们来看个例子:

08:39:20 SQL> select * from t;

A B                   C

- ---------- ----------

a test1               1

a test1               2

a test2               1

a test2               2

08:39:26 SQL> select a,b,sum(c) sc from t group by rollup(a,b);

A B                  SC

- ---------- ----------

a test1               3

a test2               3

a                     6

                      6

08:40:11 SQL> select a,b,sum(c) sc,grouping(a),grouping(b),grouping_id(a,b),grouping_id(b,a) from t group by rollup(a,b);

A B                  SC GROUPING(A) GROUPING(B) GROUPING_ID(A,B) GROUPING_ID(B,A)

- ---------- ---------- ----------- ----------- ---------------- ----------------

a test1               3           0           0                0                0

a test2               3           0           0                0                0

a                     6           0           1                1                2

                      6           1           1                3                3

-- 注意这里的粉红色部分,GROUPING_ID(B,A)=2,可推导出GROUPING(B)=1,GROUPING(A)=0,而查询语句中GROUP BY ROLLUP(A,B),这就说明这行是对A做了聚合小计(除B以外的GROUP BY子句中的列,A,做聚合小计)。

-- 同样,也可以这样推导:GROUPING_ID(A,B)=1,可推导出GROUPING(A)=0,GROUPING(B)=1,而查询语句中GROUP BY ROLLUP(A,B),这就说明这行是对A做了聚合小计(除B以外的GROUP BY子句中的列,A,做聚合小计)。

-- 看看CUBE的情况,根据GROUPING_ID的解释,也很明了。

08:40:48 SQL> select a,b,sum(c) sc,grouping(a),grouping(b),grouping_id(a,b),grouping_id(b,a) from t group by cube(a,b);

A B                  SC GROUPING(A) GROUPING(B) GROUPING_ID(A,B) GROUPING_ID(B,A)

- ---------- ---------- ----------- ----------- ---------------- ----------------

                      6           1           1                3                3

  test1               3           1           0                2                1

  test2               3           1           0                2                1

a                     6           0           1                1                2

a test1               3           0           0                0                0

a test2               3           0           0                0                0

6 rows selected.

    我们常将GROUPING_ID函数结合HAVING子句运用,把部署聚合小计的部分去除掉,如:

08:56:04 SQL> select a,b,sum(c) sc,grouping(a),grouping(b),grouping_id(a,b)

10:34:39   2  from t group by rollup(a,b) having grouping_id(a,b)>0;

A B                  SC GROUPING(A) GROUPING(B) GROUPING_ID(A,B)

- ---------- ---------- ----------- ----------- ----------------

a                     6           0           1                1

                      6           1           1                3

    这样就能很方便的查看聚合小计。

FAQ

    我们可以通过灵活运用GROUPING SETS,ROLLUP,CUBE来实现显示不同层次的聚合。

下面来看这个例子,在这个句子中GROUPING SETS相当于ROLLUP

10:34:40 SQL> select a,b,sum(c) sc,grouping(a),grouping(b),grouping_id(a,b)

10:40:49   2  from t group by rollup(a,b);

A B                  SC GROUPING(A) GROUPING(B) GROUPING_ID(A,B)

- ---------- ---------- ----------- ----------- ----------------

a test1               3           0           0                0

a test2               3           0           0                0

a                     6           0           1                1

                      6           1           1                3

 

10:40:50 SQL> select a,b,sum(c) sc,grouping(a),grouping(b),grouping_id(a,b)

10:40:57   2  from t group by grouping sets((a,b),(a),null);

A B                  SC GROUPING(A) GROUPING(B) GROUPING_ID(A,B)

- ---------- ---------- ----------- ----------- ----------------

a test1               3           0           0                0

a test2               3           0           0                0

a                     6           0           1                1

                      6           1           1                3

    下面这个例子是GROUPING SETS相当于CUBE

10:42:58 SQL> select a,b,sum(c) sc,grouping(a),grouping(b),grouping_id(a,b)

10:43:26   2  from t group by cube(a,b)

10:43:26   3  order by 1,2,3;

A B                  SC GROUPING(A) GROUPING(B) GROUPING_ID(A,B)

- ---------- ---------- ----------- ----------- ----------------

a test1               3           0           0                0

a test2               3           0           0                0

a                     6           0           1                1

  test1               3           1           0                2

  test2               3           1           0                2

                      6           1           1                3

6 rows selected.

 

10:43:27 SQL> select a,b,sum(c) sc,grouping(a),grouping(b),grouping_id(a,b)

10:43:34   2  from t group by grouping sets((a,b),(a),(b),())

10:43:34   3  order by 1,2,3;

A B                  SC GROUPING(A) GROUPING(B) GROUPING_ID(A,B)

- ---------- ---------- ----------- ----------- ----------------

a test1               3           0           0                0

a test2               3           0           0                0

a                     6           0           1                1

  test1               3           1           0                2

  test2               3           1           0                2

                      6           1           1                3

6 rows selected.

    这个里面需要说明的就是蓝色加粗部分,它是两种不同的表达方式,其实就是相当于做GROUP BY(0),请看操作例子:

10:46:32 SQL> select null,null,sum(c) sc,null,null,null

10:46:49   2  from t

10:46:49   3  order by 1,2,3;

N N         SC N N N

- - ---------- - - -

             6

参考文献

1、  official documents

2、  http://xsb.itpub.net/post/419/29159

3、  http://tomszrp.itpub.net/post/11835/64788


<script language="JavaScript" type="text/javascript">function initUD(){ UD.body=$("_$_inner_layer"); UD.layer=$("_$_outer_layer"); UD.window=$("blog-163-com-body"); UD.bar=$("theme_selector_pointer"); } </script> <script language="JavaScript" type="text/javascript"> // =0;i--){ _o = $(_arr[i]); if(i==0)_o && (_o.οnclick=function(){exitAfterLogged();return false;}); else if(i==1)_o && (_o.οnclick=function(){showLoginDlg('chenzs19850728.blog.163.com');return false;}); } topFromUrsLogin(); } function g_initUtil(){ } // ]]> </script> <script src="http://st.blog.163.com/js/utils/InfoAlertPad.js" type="text/javascript"></script> <script language="javascript" type="text/javascript"> function initAll(){ try{ initUD(); g_initCommon(); initFlash(0); showPageTopBar({serverName:'chenzs19850728.blog.163.com',serverHostName:'blog.163.com'}); initPage(); window.setTimeout("g_initUtil();",3000); }catch(e){} } initAll(); </script> <script language="JavaScript" type="text/javascript"> var ver = new Date().getTime(); document.write('
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值