拼多多秋招学霸批(数据分析)笔试 7.28
第二题:
第二题:
表的结构如下:
User_id createtime opration
cookie1,2015-04-10,A
cookie1,2015-04-11,B
cookie1,2015-04-12,C
cookie1,2015-04-13,A
cookie1,2015-04-14,B
cookie1,2015-04-15,A
cookie1,2015-04-16,A
cookie2,2015-04-10,G
cookie2,2015-04-11,A
cookie2,2015-04-12,A
cookie2,2015-04-13,A
cookie2,2015-04-14,A
cookie2,2015-04-15,A
cookie2,2015-04-16,C
cookie3,2015-04-10,A
cookie3,2015-04-11,B
cookie3,2015-04-12,C
cookie3,2015-04-13,D
cookie3,2015-04-14,D
cookie3,2015-04-15,E
cookie3,2015-04-16,F
题目要求:
把A操作后是B,AB必须相连的客户找出来
步骤一:hive 创建表
create table pddtest2(cookieid string,
createtime string,
opration string)
row format delimited fields terminated by ',';
步骤二:导入数据
load data local inpath "/opt/data_w/pddtest2.txt" into table pddtest2;
步骤三:中间表pddtest2m
hive> create table pddtest2m as
> select cookieid,createtime,opration,
> row_number() over (partition by cookieid order by createtime) as rn,
> LEAD(opration,1) over (partition by cookieid order by createtime) as nextop
> from pddtest2;
结果如图所示:
具体的开窗函数(lead,lag等用法详见:)
https://www.cnblogs.com/qingyunzong/p/8798606.html