Hive (Part 3.2) Case Study: JD.com Store Visit Metrics

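This article uses Hive to analyze visit metrics for JD.com stores: it computes each store's UV (unique visitors) and finds the three visitors with the most visits per store, covering table creation and data insertion along the way.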

I. Preparation

1.0 Start the cluster, services, and client

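# 1) Comment out the hive.metastore.uris property in the config file; with it unset, Hive runs the metastore in local mode, so a separate metastore service does not need to be started
    vim $HIVE_HOME/conf/hive-site.xml
    # Comment out the following block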
	<!--    
	<property>
        <name>hive.metastore.uris</name>
        <value>thrift://hadoop11:9083</value>
    </property>
	-->

# 2) Start the Hadoop cluster
# 3) Start the HiveServer2 service
     nohup hive --service hiveserver2 > /opt/logs/hiveserver2.log  &
     # or
     nohup $HIVE_HOME/bin/hiveserver2 > /opt/logs/hiveserver2.log  &
# 4) Open DataGrip and connect to HiveServer2

1.1 Requirements

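There are 500,000 JD.com stores. Every time a customer views any product in any store, one visit log record is produced. The visit logs are stored in a table named Visit, with the visitor's user ID in user_id, the name of the visited store in shop, and the visit time in visit_time. Sample data:
'huawei','1001','2017-02-10'
'apple','1001','2017-02-11'
……
Compute:
1) The UV (number of unique visitors) of each store.
2) The top 3 visitors by visit count for each store. Output the store name, visitor ID, and visit count.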

1.2 Table

Store name    User ID    Visit time
shop          user_id    visit_time
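-- Create the table (Hive table names are case-insensitive, so Visit and visit refer to the same table)
drop table if exists Visit;
create table Visit(
    shop string COMMENT 'store name',
    user_id string COMMENT 'user ID',
    visit_time string COMMENT 'visit time'
)
row format delimited fields terminated by '\t';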
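
Because the table is declared as tab-delimited text, rows could also be loaded from a file rather than inserted one statement at a time. The following is only a sketch: the path /tmp/visit.txt is a hypothetical file that is not part of the original article, assumed to hold one record per line with shop, user_id, and visit_time separated by tabs.

```sql
-- Hypothetical path: /tmp/visit.txt is an assumed file, with lines of the form
-- shop<TAB>user_id<TAB>visit_time. When the statement is sent through HiveServer2,
-- "local" refers to the filesystem of the HiveServer2 host.
load data local inpath '/tmp/visit.txt' into table Visit;

-- Inspect a few rows to confirm the columns were split as expected.
select * from Visit limit 5;
```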

1.3 Insert data

insert into table Visit values ('huawei','1005','2017-02-10');
insert into table Visit values ('huawei','1005','2017-02-10');
insert into table Visit values ('huawei','1005','2017-02-10');
insert into table Visit values ('huawei','1005','2017-02-10');
insert into table Visit values ('huawei','1004','2017-02-10');
insert into table Visit values ('huawei','1004','2017-02-10');
insert into table Visit values ('huawei','1003','2017-02-10');
insert into table Visit values ('huawei','1003','2017-02-10');
insert into table Visit values ('huawei','1001','2017-02-10');
insert into table Visit values ('huawei','1002','2017-02-10');
insert into table Visit values ('huawei','1006','2017-02-10');
insert into table Visit values ('apple','1001','2017-02-10');
insert into table Visit values ('apple','1001','2017-02-10');
insert into table Visit values ('apple','1001','2017-02-10');
insert into table Visit values ('apple','1001','2017-02-10');
insert into table Visit values ('apple','1002','2017-02-10');
insert into table Visit values ('apple','1002','2017-02-10');
insert into table Visit values ('apple','1005','2017-02-10');
insert into table Visit values ('apple','1005','2017-02-10');
insert into table Visit values ('apple','1006','2017-02-10');
insert into table Visit values ('apple','1004','2017-02-10');
insert into table Visit values ('meizu','1006','2017-02-10');
insert into table Visit values ('meizu','1006','2017-02-10');
insert into table Visit values ('meizu','1006','2017-02-10');
insert into table Visit values ('meizu','1006','2017-02-10');
insert into table Visit values ('meizu','1003','2017-02-10');
insert into table Visit values ('meizu','1003','2017-02-10');
insert into table Visit values ('meizu','1003','2017-02-10');
insert into table Visit values ('meizu','1002','2017-02-10');
insert into table Visit values ('meizu','1002','2017-02-10');
insert into table Visit values ('meizu','1004','2017-02-10');
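
Each single-row insert typically runs as its own job and writes its own small file, so loading the rows this way can be slow. Hive also accepts several rows in one values clause. The sketch below rewrites only the first three statements above in that form, followed by a row-count check; the alias row_cnt is an illustrative name.

```sql
-- Alternative to the first three single-row statements above
-- (run one form or the other, not both, or the rows will be duplicated).
insert into table Visit values
    ('huawei','1005','2017-02-10'),
    ('huawei','1005','2017-02-10'),
    ('huawei','1005','2017-02-10');

-- Sanity check once all 31 sample rows are loaded:
-- expect huawei 11, apple 10, meizu 10.
select shop,
       count(*) as row_cnt
from Visit
group by shop;
```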

II. Implementation

2.1 UV (unique visitors) per store

-- 1) UV (unique visitors) per store: count each visitor once per store
select shop,
       count(distinct user_id) shop_user_view
from visit
group by shop;
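
UV counts each visitor once per store, whereas a plain row count gives the total number of visits (PV). The sketch below puts the two side by side and then shows an equivalent UV query that deduplicates (shop, user_id) pairs before counting. The aliases shop_pv and shop_uv are illustrative names, and the second query assumes user_id is never NULL.

```sql
-- PV counts every visit row; UV counts each user_id once per store.
select shop,
       count(*)                as shop_pv,
       count(distinct user_id) as shop_uv
from visit
group by shop;
-- On the sample rows from section 1.3 (pv/uv): huawei 11/6, apple 10/5, meizu 10/4.

-- Equivalent UV query without count(distinct):
-- deduplicate (shop, user_id) pairs first, then count the pairs per store.
select t.shop,
       count(*) as shop_uv
from (
         select shop, user_id
         from visit
         group by shop, user_id
     ) t
group by t.shop;
```

The two-step form is sometimes preferred on large tables because the deduplication is spread across an ordinary group by, but which one performs better depends on the data and the Hive version.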

2.2 Top 3 visitors by visit count per store (output: store name, visitor ID, visit count)

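-- 2) Top 3 visitors by visit count per store. Output the store name, visitor ID, and visit count
-- Group by store and visitor to get each visitor's visit count per store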
select shop,
       user_id,
       count(user_id) shop_user_count
from visit
group by shop, user_id;
-- Rank visitors within each store by visit count (rank() gives tied visitors the same rank)
select t1.shop,
       t1.user_id,
       t1.shop_user_count,
       rank() over (partition by shop order by shop_user_count desc) shop_user_count_rk
from (
         select shop,
                user_id,
                count(user_id) shop_user_count
         from visit
         group by shop, user_id
     ) t1;
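
The choice of ranking function matters when visitors are tied on visit count. As a sketch, the query below computes rank(), dense_rank(), and row_number() over the same per-store counts so the difference is visible; the aliases rk, dense_rk, and row_num are illustrative names.

```sql
-- Compare the three ranking functions on the per-store visit counts.
select t1.shop,
       t1.user_id,
       t1.shop_user_count,
       rank()       over (partition by t1.shop order by t1.shop_user_count desc) as rk,
       dense_rank() over (partition by t1.shop order by t1.shop_user_count desc) as dense_rk,
       row_number() over (partition by t1.shop order by t1.shop_user_count desc) as row_num
from (
         select shop,
                user_id,
                count(user_id) shop_user_count
         from visit
         group by shop, user_id
     ) t1;
-- For huawei the sample counts are 4, 2, 2, 1, 1, 1, so:
--   rank()       -> 1, 2, 2, 4, 4, 4
--   dense_rank() -> 1, 2, 2, 3, 3, 3
--   row_number() -> 1, 2, 3, 4, 5, 6 (order among tied rows is not guaranteed)
```

Filtering on a rank of 3 or less therefore returns three huawei rows with rank() or row_number(), but all six with dense_rank().
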
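-- Keep the top three per store; row_number() numbers rows 1, 2, 3, ... without ties, so each store returns at most three rows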
select *
from (
         select t1.shop,
                t1.user_id,
                t1.shop_user_count,
                row_number() over (partition by shop order by shop_user_count desc) shop_user_count_rk
         from (
                  select shop,
                         user_id,
                         count(user_id) shop_user_count
                  from visit
                  group by shop, user_id
              ) t1
     ) t2
where t2.shop_user_count_rk <= 3;
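
When several visitors share the same visit count, row_number() picks among them in an unspecified order, so the top three can differ between runs. One way to make the result repeatable, sketched here with common table expressions, is to add user_id as a tie-breaker. The CTE names visit_counts and ranked are illustrative, and breaking ties by user_id is an assumption rather than part of the stated requirement.

```sql
with visit_counts as (
    -- Visits per (store, visitor) pair.
    select shop,
           user_id,
           count(*) as shop_user_count
    from visit
    group by shop, user_id
),
ranked as (
    -- Number visitors within each store; ties on count fall back to user_id order.
    select shop,
           user_id,
           shop_user_count,
           row_number() over (partition by shop
                              order by shop_user_count desc, user_id) as shop_user_count_rk
    from visit_counts
)
select shop,
       user_id,
       shop_user_count
from ranked
where shop_user_count_rk <= 3;
-- On the sample rows from section 1.3 this returns (row order not guaranteed):
--   huawei: 1005/4, 1003/2, 1004/2
--   apple:  1001/4, 1002/2, 1005/2
--   meizu:  1006/4, 1003/3, 1002/2
```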