SQL面试题

1. 表如下,请计算每个月每个部门评分大于等于90的人数,评分大于等于90的人数环比增长率,评分有提升的人数。

年月	姓名	部门	 评分
202101	张三	销售	90
202101	李四	技术	90
202101	王五	运营	80
202101	赵六	销售	70
202101	孙七	技术	95
202101	周八	运营	93
202101	吴九	销售	84
202101	郑十	技术	83
202102	张三	销售	95
202102	李四	技术	95
202102	王五	运营	95
202102	赵六	销售	95

思路:

1、获取每个月每个部门评分大于等于90的人数 -- t1

2、将 t1 和 t1 自关联,求增长率

3、将自身的表和自身的表关联,求分数有提升的人数

4、将所有的结果展示一下

本地模式:
set hive.exec.mode.local.auto=true;
//开启本地mr
//设置local mr的最大输入数据量,当输入数据量小于这个值时采用local  mr的方式,默认为134217728,即128M
set hive.exec.mode.local.auto.inputbytes.max=50000000;
//设置local mr的最大输入文件个数,当输入文件个数小于这个值时采用local mr的方式,默认为4
set hive.exec.mode.local.auto.input.files.max=10;

CREATE TABLE `empgo` (
  dt STRING  ,
  name STRING  ,
  dept STRING  ,
  score int
);
set hive.stats.column.autogather=false;
insert into empgo values ('202101','张三','销售','90');
insert into empgo values ('202101','李四','财务','90');
insert into empgo values ('202101','王五','运营','80');
insert into empgo values ('202101','赵六','销售','70');
insert into empgo values ('202101','孙七','技术','95');
insert into empgo values ('202101','周八','运营','93');
insert into empgo values ('202101','吴九','销售','84');
insert into empgo values ('202101','郑十','技术','83');
insert into empgo values ('202102','张三','销售','95');
insert into empgo values ('202102','李四','技术','95');
insert into empgo values ('202102','王五','运营','95');
insert into empgo values ('202102','赵六','销售','95');

第一步:

select dt,dept,count(1) peopleNum from empgo where score >=90 group by dt,dept;

第二步:将 t1 和 t1 自关联,求增长率

with t as (
    select dt,dept,count(1) peopleNum from empgo where score >=90 group by dt,dept
)
select t2.dt,t2.dept,round((t2.peopleNum-t1.peopleNum)/t1.peopleNum-1,2) from t t1 join t t2 on t1.dept = t2.dept and t1.dt = date_format(add_months(from_unixtime(unix_timestamp(t2.dt,'yyyyMM')),1),'yyyyMM')
;

结果:

还有另一个做法:

with t as (
    select dt,dept,count(1) peopleNum from empgo where score >=90 group by dt,dept
),t2 as(
    select *,lag(peopleNum,1,peopleNum) over(partition by dept order by dt ) lastMonthNum from t
)
select t2.dept,t2.dt,peopleNum/lastMonthNum-1 from t2 ;

第三步:将自身的表和自身的表关联,求分数有提升的人数

with getlastscore as (
    select *,lag(score,1,score) over(partition by name order by dt ) lastScore from empgo
)
select * from getlastscore ;

with getlastscore as (
    select *,lag(score,1,score) over(partition by name order by dt ) lastScore from empgo
)
select dt,count(1) score_growth_num from getlastscore where score > lastScore group by dt ;


-- 请计算每个月每个部门评分大于等于90的人数,评分大于等于90的人数环比增长率,评分有提升的人数。
with t as (
    select dt,dept,count(1) peopleNum from empgo where score >=90 group by dt,dept
),rate_growth as(
    select t2.dt,t2.dept,round((t2.peopleNum/t1.peopleNum)-1,2) grow_rate,t1.dt tt from t t1 join t t2 on t1.dept = t2.dept and t1.dt = date_format(add_months(from_unixtime(unix_timestamp(t2.dt,'yyyyMM')),1),'yyyyMM')
),getlastscore as (

    select *,lag(score,1,score) over(partition by name order by dt ) lastScore from empgo
),score_growth as (
    select dt,count(1) score_growth_num from getlastscore where score > lastScore group by dt
)
select
    t.dt,
    t.dept,
    t.peopleNum,
    rate_growth.grow_rate,
    score_growth_num
    from t left join rate_growth on  t.dt=rate_growth.tt and t.dept = rate_growth.dept
      left join score_growth on t.dt = score_growth.dt
;

最终结果:

2.交易记录表,表结构如下,请计算每个月各产品的购货人数,购货金额,购货金额排名,复购人数

(复购:前5个月曾购买此产品,本月有购买;购买: 购货金额大于0)

年月	订单号	原定单号	是否退货单	产品	购货人	金额	数量
202101	s100000	s100000	0	苹果	A	10	2
202101	s100001	s100000	1	苹果	A	-10	2
202101	s100002	s100002	0	西瓜	C	50	1
202106	s100003	s100003	0	西瓜	C	50	1
202102	...	...	...	...	...	...	...
create table deal (
  dt string,
  orderId string,
  oldOrderId string,
  isReturn int,
  producee string,
  customer string,
  money int,
  num int
);

insert into deal values ('202101','s100000','s100000',0,'苹果','A',10,2);
insert into deal values ('202101','s100001','s100000',1,'苹果','A',-10,2);
insert into deal values ('202101','s100002','s100002',0,'西瓜','C',50,1);
insert into deal values ('202106','s100003'	,'s100003',0,'西瓜','C',	50,1);
insert into deal values ('202101','s100004'	,'s100004',0,'西瓜','A',	50,1);
-- 请计算每个月各产品的购货人数,购货金额,购货金额排名,复购人数
-- (复购:前5个月曾购买此产品,本月有购买;购买: 购货金额大于0)

-- 第一步:求每个月各产品的购货人数,购货金额
select dt,producee,sum(money) from deal where money > 0 group by dt,producee ;
-- 第二步:每个月每个产品的购货排名
with t as (
    select dt,producee,sum(money) summoney from deal where money > 0 group by dt,producee
),t2 as(
    select dt,producee,dense_rank() over(partition by dt order by summoney desc) paiming from t
)
select * from t2;
-- 第三步:求每一个月每个产品的复购人数
-- 复购人数  (复购:前5个月曾购买此产品,本月有购买;购买: 购货金额大于0)
select d1.dt,d1.producee,count(1) fugou from deal d1 left join  deal d2
   on d1.customer = d2.customer and d1.producee = d2.producee
   where d2.dt >=  date_format(add_months(from_unixtime(unix_timestamp(d1.dt,'yyyyMM')),-5),'yyyyMM')
      and d2.dt < d1.dt group by d1.dt,d1.producee;

-- 综合一下sql语句即可:
-- 请计算每个月各产品的购货人数,购货金额,购货金额排名,复购人数
-- -- (复购:前5个月曾购买此产品,本月有购买;购买: 购货金额大于0)

with t as (
    select dt,producee,sum(money) summoney,count(distinct customer) peopleNum  from deal where money > 0 group by dt,producee
),t2 as(
    select dt,producee,dense_rank() over(partition by dt order by summoney desc) paiming from t
),t3 as (
    select d1.dt,d1.producee,count(1) fugou from deal d1 left join  deal d2
   on d1.customer = d2.customer and d1.producee = d2.producee
   where d2.dt >=  date_format(add_months(from_unixtime(unix_timestamp(d1.dt,'yyyyMM')),-5),'yyyyMM')
      and d2.dt < d1.dt group by d1.dt,d1.producee
)
select t.dt,
       t.producee,
       t.summoney,
       t.peopleNum,
       t2.paiming,
       nvl(t3.fugou,0)
   from t join t2 on t.dt=t2.dt and t.producee=t2.producee left join t3 on t.dt=t3.dt and t.producee=t3.producee;

最终结果:

3、交易记录表,表结构如下,请计算每个月购货人同时购买苹果和西瓜的金额(购货人单月只购买其中一样不计算,需在一个月内两个都购买)

年月	订单号	原定单号	是否退货单	产品	购货人	金额	数量
202101	s100000	s100000	0	苹果	A	10	2
202101	s100001	s100000	1	苹果	A	-10	2
202101	s100002	s100002	0	西瓜	C	50	1
202101	s100003	s100003	0	苹果	C	10	1
202102	...	...	...	...	...	...	...
CREATE TABLE IF NOT EXISTS orders_info (
    year_month string,
    order_number string,
    original_order_number string,
    is_return_order int,
    product string,
    purchaser string,
    amount double,
    quantity int
);

INSERT INTO TABLE orders_info
VALUES
    ('202101','s100000','s100000',0,'苹果','A',10.0,2),
    ('202101','s100001','s100000',1,'苹果','A',-10.0,2),
    ('202101','s100002','s100002',0,'西瓜','C',50.0,1),
    ('202101','s100003','s100003',0,'苹果','C',10.0,1);

思路:

1、先获取购买西瓜的人的信息

2、再获取购买苹果的人的信息

3、将两个表关联,谁的条数等于 2 就是谁

with t as (

select year_month,purchaser,product,sum(amount) total_money from orders_info where product='西瓜' or product='苹果' and amount >0 group by  year_month,purchaser,product
)
select count(product) cc,purchaser,year_month,sum(total_money) sumMoney from t group by year_month,purchaser having cc = 2;

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值