最近学习Hive。实现了其中几个作业,如下。Hql语句没怎么调整格式,见谅。
Hive练习二中的题目
(1). 2017 年4 月1 日各个商品品牌的交易笔数,按照销售交易从多到少排序
1 select 2 brand, 3 count(*) as totalCount 4 from 5 record 6 join brand_dimension on record.bid = brand_dimension.bid 7 where record.trancation_date= '2017-04-01' 8 group by brand_dimension.brand 9 order by totalCount desc;
输出
+------------+-------------+--+ | brand | totalcount | +------------+-------------+--+ | SAMSUNG | 2 | | WULIANGYE | 1 | | PUMA | 1 | | OPPO | 1 | | DELL | 1 | +------------+-------------+--+
(2). 不同性别消费的商品类别情况(不同性别消费不同商品类别的总价)
1 select 2 gender, 3 category, 4 sum(price) as totalPrice 5 from record 6 join user_dimension on record.uid = user_dimension.uid 7 join brand_dimension on record.bid = brand_dimension.bid 8 group by gender, 9 category 10 order by 11 gender, 12 category, 13 totalPrice;
输出
+---------+------------+-------------+--+ | gender | category | totalprice | +---------+------------+-------------+--+ | M | computer | 252 | | M | food | 429 | | M | sports | 120 | | M | telephone | 1669 | +---------+------------+-------------+--+
Hive练习三中的题目
(1). 谁不是经理
1 select 2 name 3 from 4 employees 5 where 6 size(subordinates)<=0;
输出
+-------------------+---------------+--+ | name | subordinates | +-------------------+---------------+--+ | Todd Jones | [] | | Bill King | [] | | Stacy Accountant | [] | +-------------------+---------------+--+ 3 rows selected (0.092 seconds)
(2). 谁住在邮编比60500 大的地区
1 select name,address.zip from employees where address.zip> 60500;
输出
+-------------------+--------+--+ | name | zip | +-------------------+--------+--+ | John Doe | 60600 | | Mary Smith | 60601 | | Todd Jones | 60700

这篇博客主要展示了使用Hive进行数据处理的几个实际案例,包括交易品牌排名、性别消费类别分析、雇员筛选及商品好评率计算等。通过实例详细解释了Hive SQL查询语句的编写和执行,帮助理解Hive在大数据分析中的应用。
最低0.47元/天 解锁文章
1762

被折叠的 条评论
为什么被折叠?



