1.多行转换为一行的函数collect_list,collect_set
collect_list:不去重,多行变成一行,实际上就是变成一个数组
collect_set:去重
2.concat_ws将数组中的值用指定分隔符拼接起来
2.一行转成多行
explode(col):将hive一列中复杂的array或者map结构拆分成多行
lateral view
用法:lateral view udtf(expression) tablealias as columnalias
解释:lateral view 经常于explode,split等UDTF函数一起使用,它能将一列数据拆成多行数据
split能够将字符串转换为数组
3.分组函数与collect_set的结合使用
select name,month,cost,
sum(cost) over () cost1,--整体求和
sum(cost) over (partition by name) cost2,--分组求和
sum(cost) over (partition by name ,month order by date) cost3,--分组求累计值
sum(cost) over (partition by name,month order by date rows between unbounded preceding and current row) cost4,--cost4=cost3,分组求累计值
sum(cost) over (partition by name,month order by date rows between 1 preceding and current row) cost6,--当前行和前一行求和
sum(cost) over (partition by name,month order by date rows between 1 preceding and 1 following) cost6,--当前行和前一行和后一行求和
from sale_details;
案例展示:求各个月都有哪些客户来过
select name,orderdate,cost,
concat_ws(",",collect_set(name) over (partition by substr(orderdate,1,7)))
from sale_details;
备注:collect_set:将多行数据变成一个数组
concat_ws:将数组变成,逗号隔开的字符串