Create the table as follows:
# Create the mapping table between products and promotions
hive -e "set mapred.job.queue.name=pms;
set hive.exec.reducers.max=32;
set mapred.reduce.tasks=32;
drop table if exists product_promotion;
create table product_promotion(product_id bigint, promotion_id String);
insert into table product_promotion
select p2.product_id, p2.promotion_id
from pms.promotionv2 p1 inner join pms.promotionv2_main_product_sku p2
on (p1.id=p2.promotion_id)
where from_unixtime(unix_timestamp(),'yyyy-MM-dd HH:mm:ss') between p1.start_date and p1.end_date;"
The records in the table are as follows:
Merge the promotion_id values for each product:
select product_id, concat_ws('_',collect_set(promotion_id)) as promotion_ids from product_promotion group by product_id;
Execution result:
hive > select product_id, concat_ws('_',collect_set(promotion_id)) as promotion_ids from product_promotion group by product_id;
OK
5112 960024_960025_960026_960027_960028
5113 960043_960044_960045_960046
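Time taken: 3.116 seconds
Here collect_set deduplicates the promotion_id values within each product_id group. Note that promotion_id must be of type string, because concat_ws accepts only string or array<string> arguments.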
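If promotion_id were stored as a numeric type instead, one possible workaround is to cast it before collecting. The query below is a minimal sketch under that assumption (in the table created above the column is already a string, so the cast is a no-op there). It also wraps the result in sort_array, since collect_set does not guarantee the order of the elements it returns; that addition is optional and not part of the original query.

```sql
-- Sketch: assumes promotion_id might be a non-string type such as bigint
-- (hypothetical; product_promotion above declares it as String).
select product_id,
       concat_ws('_',
                 sort_array(                                   -- optional: stable, lexicographic order
                   collect_set(cast(promotion_id as string))   -- deduplicate after casting to string
                 )) as promotion_ids
from product_promotion
group by product_id;
```

Because the values are sorted as strings, the order is lexicographic rather than numeric, which only matters when the IDs have different lengths.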
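This article shows how to use Hive SQL to create a mapping table between products and promotions, and uses an example to demonstrate how to merge and deduplicate the promotion_id field.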