常用的窗口函数

原始数据

0: jdbc:hive2://localhost:10000> select * from t_access;
+----------------+---------------------------------+-----------------------+--------------+--+
|  t_access.ip   |          t_access.url           | t_access.access_time  | t_access.dt  |
+----------------+---------------------------------+-----------------------+--------------+--+
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20   | 20170804     |
| 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20   | 20170804     |
| 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20   | 20170804     |
| 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20   | 20170804     |
| 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20   | 20170804     |
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20   | 20170805     |
| 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20   | 20170805     |
| 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20   | 20170805     |
| 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20   | 20170805     |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20   | 20170805     |
| 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20   | 20170806     |
| 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20   | 20170806     |
| 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20   | 20170806     |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20   | 20170806     |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20   | 20170806     |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20   | 20170806     |
| 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20   | 20170806     |
| 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20   | 20170806     |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20   | 20170806     |
+----------------+---------------------------------+-----------------------+--------------+--+

## LAG() over() 函数

求每个人的每次访问记录后加上他的上次访问时间

select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
lag(access_time,1,0) over(partition by ip order by access_time)as last_access_time
from t_access;

+----------------+---------------------------------+----------------------+-----+----------------------+--+
|       ip       |               url               |     access_time      | rn  |   last_access_time   |
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20  | 1   | 0                    |
| 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20  | 1   | 0                    |
| 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 1   | 0                    |
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | 0                    |
| 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20  | 2   | 2017-08-04 15:30:20  |
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 3   | 2017-08-04 15:35:20  |
| 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 1   | 0                    |
| 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | 0                    |
| 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20  | 2   | 2017-08-04 15:30:20  |
| 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 1   | 0                    |
| 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20  | 1   | 0                    |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 2   | 2017-08-05 16:30:20  |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 3   | 2017-08-06 16:30:20  |
| 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20  | 1   | 0                    |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20  | 1   | 0                    |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 2   | 2017-08-05 15:40:20  |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 3   | 2017-08-06 15:40:20  |
| 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20  | 1   | 0                    |
| 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20  | 1   | 0                    |
+----------------+---------------------------------+----------------------+-----+----------------------+--+


LEAD() over() 函数

求每个人的每次访问记录后加上他的下次访问时间

select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
lead(access_time,1,0) over(partition by ip order by access_time)as pre_access_time
from t_access;
+----------------+---------------------------------+----------------------+-----+----------------------+--+
|       ip       |               url               |     access_time      | rn  |   pre_access_time   |
+----------------+---------------------------------+----------------------+-----+----------------------+--+
| 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20  | 1   | 0                    |
| 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20  | 1   | 0                    |
| 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 1   | 0                    |
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | 2017-08-04 15:35:20  |
| 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20  | 2   | 2017-08-05 15:30:20  |
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 3   | 0                    |
| 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 1   | 0                    |
| 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | 2017-08-04 16:30:20  |
| 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20  | 2   | 0                    |
| 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 1   | 0                    |
| 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20  | 1   | 2017-08-06 16:30:20  |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 2   | 2017-08-06 16:30:20  |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 3   | 0                    |
| 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20  | 1   | 0                    |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20  | 1   | 2017-08-06 15:40:20  |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 2   | 2017-08-06 15:40:20  |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 3   | 0                    |
| 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20  | 1   | 0                    |
| 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20  | 1   | 0                    |
+----------------+---------------------------------+----------------------+-----+----------------------+--+

FIRST_VALUE() over() 函数

例:取每个用户访问的第一个页面

select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
first_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following)as last_access_time
from t_access;
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
|       ip       |               url               |     access_time      | rn  |        last_access_time         |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20  | 1   | http://www.edu360.cn/register   |
| 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20  | 1   | http://www.edu360.cn/register   |
| 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 1   | http://www.edu360.cn/job        |
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | http://www.edu360.cn/stu        |
| 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20  | 2   | http://www.edu360.cn/stu        |
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 3   | http://www.edu360.cn/stu        |
| 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 1   | http://www.edu360.cn/excersize  |
| 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | http://www.edu360.cn/stu        |
| 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20  | 2   | http://www.edu360.cn/stu        |
| 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 1   | http://www.edu360.cn/stu        |
| 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20  | 1   | http://www.edu360.cn/job        |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 2   | http://www.edu360.cn/job        |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 3   | http://www.edu360.cn/job        |
| 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20  | 1   | http://www.edu360.cn/job        |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20  | 1   | http://www.edu360.cn/job        |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 2   | http://www.edu360.cn/job        |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 3   | http://www.edu360.cn/job        |
| 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20  | 1   | http://www.edu360.cn/pay        |
| 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20  | 1   | http://www.edu360.cn/teach      |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+

LAST_VALUE() over() 函数

例:取每个用户访问的最后一个页面

select ip,url,access_time,
row_number() over(partition by ip order by access_time) as rn,
last_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following)as last_access_time
from t_access;
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
|       ip       |               url               |     access_time      | rn  |        last_access_url         |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+
| 192.168.111.3  | http://www.edu360.cn/register   | 2017-08-06 15:35:20  | 1   | http://www.edu360.cn/register   |
| 192.168.133.3  | http://www.edu360.cn/register   | 2017-08-06 15:30:20  | 1   | http://www.edu360.cn/register   |
| 192.168.33.25  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 1   | http://www.edu360.cn/job        |
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | http://www.edu360.cn/stu        |
| 192.168.33.3   | http://www.edu360.cn/teach      | 2017-08-04 15:35:20  | 2   | http://www.edu360.cn/stu        |
| 192.168.33.3   | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 3   | http://www.edu360.cn/stu        |
| 192.168.33.36  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 1   | http://www.edu360.cn/excersize  |
| 192.168.33.4   | http://www.edu360.cn/stu        | 2017-08-04 15:30:20  | 1   | http://www.edu360.cn/stu        |
| 192.168.33.4   | http://www.edu360.cn/job        | 2017-08-04 16:30:20  | 2   | http://www.edu360.cn/stu        |
| 192.168.33.44  | http://www.edu360.cn/stu        | 2017-08-05 15:30:20  | 1   | http://www.edu360.cn/stu        |
| 192.168.33.46  | http://www.edu360.cn/job        | 2017-08-05 16:30:20  | 1   | http://www.edu360.cn/job        |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 2   | http://www.edu360.cn/job        |
| 192.168.33.46  | http://www.edu360.cn/excersize  | 2017-08-06 16:30:20  | 3   | http://www.edu360.cn/job        |
| 192.168.33.5   | http://www.edu360.cn/job        | 2017-08-04 15:40:20  | 1   | http://www.edu360.cn/job        |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-05 15:40:20  | 1   | http://www.edu360.cn/job        |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 2   | http://www.edu360.cn/job        |
| 192.168.33.55  | http://www.edu360.cn/job        | 2017-08-06 15:40:20  | 3   | http://www.edu360.cn/job        |
| 192.168.34.44  | http://www.edu360.cn/pay        | 2017-08-06 15:30:20  | 1   | http://www.edu360.cn/pay        |
| 192.168.44.3   | http://www.edu360.cn/teach      | 2017-08-05 15:35:20  | 1   | http://www.edu360.cn/teach      |
+----------------+---------------------------------+----------------------+-----+---------------------------------+--+

SUM() OVER()

组内累计求和

+-----+----------+-------+--+
| id  |  month   | cnts  |
+-----+----------+-------+--+
| A   | 2015-01  | 33    |
| A   | 2015-02  | 10    |
| A   | 2015-03  | 20    |
| B   | 2015-01  | 30    |
| B   | 2015-02  | 15    |
| B   | 2015-03  | 45    |
| C   | 2015-01  | 30    |
| C   | 2015-02  | 40    |
| C   | 2015-03  | 30    |
+-----+----------+-------+--+

select id,month,cnts,
sum(cnts) over(partition by id order by month rows between unbounded preceding and current row ) as accumulate_cnts

from (select id,month,sum(cnt) as cnts 

from t_access_time
group by id,month) tmp;

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值