Hive / Presto: Rows to Columns and Columns to Rows

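This article explains how to convert rows to columns in Hive and Presto (with functions such as collect_set and array_agg) and columns to rows (with split plus explode, or cross join unnest), with example code showing how both transformations are used in practice.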

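Rows to columns

1. Hive:

collect_set aggregates the values into an array and removes duplicates; concat_ws joins the array elements into a comma-separated string.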

select 
    fuid,
    concat_ws(',', collect_set(cast(fdeal_id as string) )) as order_ids
from tmp.tmp_test
where dt = '2022-03-31'
    and event_type = 1
group by fuid
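
One detail worth knowing: collect_set does not guarantee the order of the collected elements, so the joined string can differ between runs. A minimal sketch, reusing the table and columns of the query above, wraps the set in sort_array for a stable result. Because the values are cast to string, the sort is lexicographic ('10' comes before '9'), not numeric.

```sql
-- Hive: same aggregation as above, with a deterministic element order.
-- sort_array sorts the string elements in ascending lexicographic order.
select
    fuid,
    concat_ws(',', sort_array(collect_set(cast(fdeal_id as string)))) as order_ids
from tmp.tmp_test
where dt = '2022-03-31'
    and event_type = 1
group by fuid
```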

2. Presto:

array_agg aggregates the values into an array, array_distinct removes duplicates, and array_join joins the array elements into a comma-separated string.

select
    fuid,
    array_join(array_distinct(array_agg(cast(fdeal_id as varchar))), ',') as order_ids
from tmp.tmp_test
where dt = '2022-03-31'
    and event_type = 1
group by fuid
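
A possible variant, shown here as a sketch on inline sample rows (the VALUES data is made up for illustration): array_agg(distinct ...) deduplicates during aggregation, and array_sort gives the joined string a stable order, since array_agg alone does not guarantee one.

```sql
-- Presto: deduplicate inside the aggregate, then sort before joining.
-- The sample rows are hypothetical; fuid 1 has a repeated fdeal_id (20).
select
    fuid,
    array_join(array_sort(array_agg(distinct cast(fdeal_id as varchar))), ',') as order_ids
from (
    values (1, 20), (1, 10), (1, 20), (2, 30)
) as t(fuid, fdeal_id)
group by fuid
-- fuid 1 -> '10,20', fuid 2 -> '30'
```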

Columns to rows

Hive

1. split breaks order_ids into an array; lateral view explode expands the array into one row per element.

select a.fuid
    , b.fdeal_id
from tmp.tmp_test a
lateral view explode(split(order_ids, ',')) b as fdeal_id
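
Two related options may be useful here. With a plain lateral view, a row whose order_ids is NULL produces no output at all; adding the outer keyword keeps that row with NULL in the generated columns. posexplode additionally returns each element's position, starting at 0. The sketch below uses made-up inline rows to show both.

```sql
-- Hive: keep rows with a NULL list (outer) and expose the element position (posexplode).
-- The two sample rows are hypothetical.
select a.fuid
    , b.pos
    , b.fdeal_id
from (
    select 1001 as fuid, '11,12,13' as order_ids
    union all
    select 1002 as fuid, cast(null as string) as order_ids
) a
lateral view outer posexplode(split(a.order_ids, ',')) b as pos, fdeal_id
-- fuid 1001 yields three rows (pos 0, 1, 2);
-- fuid 1002 yields one row with pos and fdeal_id both NULL
```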

2. Explode + map

select  model_code,
        item_code,
        refer_enum,
        busi_cnt
from
(select model_code,
       item_code,
       count(distinct if(item_value >= 2 and item_value <= 5,business_id,null)) as cnt2,
       count(distinct if(item_value >= 6 and item_value <= 9,business_id,null)) as cnt3,
       count(distinct if(item_value >= 10 and item_value <= 12,business_id,null)) as cnt4
from tmp.tmp_test
where dt = '2021-05-24'
group by model_code,
         item_code) a
lateral view explode(map('2-5', cnt2,
                         '6-9', cnt3,
                         '10-12', cnt4)) b as refer_enum, busi_cnt
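
As an alternative to building a map, Hive's stack UDTF can turn a fixed set of columns into rows directly. The following is a sketch on a single made-up row (the values of cnt2, cnt3, and cnt4 are illustrative only); stack(3, ...) splits its six remaining arguments into three rows of two columns each.

```sql
-- Hive: unpivot three count columns with stack instead of explode(map(...)).
-- The inline row is hypothetical sample data.
select a.model_code,
       a.item_code,
       b.refer_enum,
       b.busi_cnt
from
(select 'm1' as model_code,
        'i1' as item_code,
        3 as cnt2,
        5 as cnt3,
        0 as cnt4) a
lateral view stack(3,
                   '2-5', cnt2,
                   '6-9', cnt3,
                   '10-12', cnt4) b as refer_enum, busi_cnt
```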

Presto

1. split breaks order_ids into an array; cross join unnest expands the array into one row per element.

select a.fuid
    , b.fdeal_id
from tmp.tmp_test a
cross join unnest(split(order_ids, ',')) as b(fdeal_id) 
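
If the position of each element inside the original string matters, unnest accepts a with ordinality clause that appends a 1-based index column. A minimal sketch on one made-up inline row (the column name pos is chosen here for illustration):

```sql
-- Presto: expand the list and keep each element's 1-based position.
-- The sample row is hypothetical.
select a.fuid
    , b.fdeal_id
    , b.pos
from (
    values (1001, '11,12,13')
) as a(fuid, order_ids)
cross join unnest(split(a.order_ids, ',')) with ordinality as b(fdeal_id, pos)
order by b.pos
-- (1001, '11', 1), (1001, '12', 2), (1001, '13', 3)
```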

2. Explode + map

select
    t1.fuid,
    t2.label_name,
    t2.label_value
from (
        -- cast non-string columns to varchar so that every element
        -- of the value array below has the same type
        select t1.fuid,
               cast(t1.bus_type as varchar) bus_type,
               t1.dept_code,
               t1.dept_name,
               cast(t1.black_gold as varchar) black_gold,
               cast(t1.chat_tag as varchar) chat_tag
          from tmp.tmp_test t1
         where t1.dt = '2021-06-30'
       ) t1
 -- the two arrays are expanded in parallel: the i-th name is paired with the i-th value
 cross join unnest (
  array['bus_type', 'dept_code', 'dept_name', 'black_gold', 'chat_tag'],
  array[bus_type, dept_code, dept_name, black_gold, chat_tag]
 ) t2 (label_name, label_value)
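
The query above pairs two parallel arrays rather than an actual map. For comparison, here is a sketch that builds a real map and unnests it, which yields one key column and one value column. The inline row and the alias kv are made up for illustration, and the row order of an unnested map should not be relied on.

```sql
-- Presto: unnest a map into (key, value) rows.
-- The sample row is hypothetical; all map values must share one type (varchar here).
select t.fuid,
       kv.label_name,
       kv.label_value
from (
    values (1001, 'A', 'D01')
) as t(fuid, bus_type, dept_code)
cross join unnest(
    map(array['bus_type', 'dept_code'], array[t.bus_type, t.dept_code])
) as kv(label_name, label_value)
```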

In Presto, CROSS JOIN and UNNEST are used together to split an array into rows. CROSS JOIN on its own pairs every row of one relation with every row of another, while UNNEST expands the elements of an array into multiple rows. In the first Presto query above, UNNEST expands the array that split produces from order_ids and exposes each element as the fdeal_id column of b, and CROSS JOIN pairs each row of tmp.tmp_test with the rows expanded from its own array, which yields the final result. [1][2][3]

References

[1] presto行转列转行 (Presto row/column conversion): https://blog.youkuaiyun.com/lz6363/article/details/124557442
[2] Hive/Spark/Presto/标准SQL实现行转列转行 (row/column conversion in Hive, Spark, Presto, and standard SQL): https://blog.youkuaiyun.com/soaring0121/article/details/99870447
[3] datax支持presto读取 (DataX support for reading from Presto): https://download.youkuaiyun.com/download/qq_27048639/13489400