1.商品推荐题目
1.思路:
- 通过取差集 得出要推荐的商品
- 差集的选取:except直接取差集 或者a left join b on where b = null
2.知识点
1.except
select
friendship_info.user1_id as user_id,
sku_id
from
friendship_info
join favor_info on friendship_info.user2_id = favor_info.user_id
except
select
user_id,
sku_id
from
favor_info
2.交并差
2.连续日期
1. 思路:
1.通过lag(向上) lead(向下)求出2天,三天比较
2.通过rk和日期进行相减
- 其实就是连续的日期进行分组,将连续的放入一组,不连续的放入另一组
- 第一步, 使用开窗 构造rk
- 计算日期和rk的差值,算出差值日期,这个差值日期就保证同一个分组,
- 使用开窗count 算出每个分组是否有 符合要求的 >=2
with
s1 as (
select
user_id,
date (login_ts) start_date,
row_number() OVER (
PARTITION by
user_id
order by
date (login_ts)
) rk
from
user_login_detail
GROUP by
user_id,
date (login_ts)
),
s11 as (
select
user_id,
start_date,
date_sub (start_date, rk) head_date
from
s1
)
select
user_id,
min(start_date) `start_date`,
max(start_date) `end_date`
from s11
GROUP by user_id,
head_date
HAVING datediff(max(start_date) ,min(start_date))>=1
3. 截止到某天统计最近几天的总和和平均值
1.思路
其实想法都是一样的,要求截止到当天,最近3天的数据,就要把前两天的数据拿过来,
方法有两个,一个是自连接join 一个是直接开窗
1. join
join需要考虑空值,分母就少了,所以需要一个if判断
2.开窗函数
开窗直接sum avg 但是首先注意,这里时间是String类型,无法进行between and
需要转化成日期类型或者时间戳(秒 毫秒 都是int类型)
注意range的用法,range 的between and 是根据 前面的值进行减法 加法
比如 order by timestamp_day RANGE BETWEEN 172800 PRECEDING and current row 就是 范围是 当前行的timestamp_day -172800 到当前行 如果后面用following就是加
with
s1 as (
select
create_date,
sum(total_amount) total_Day,
unix_timestamp(date(create_date)) timestamp_day
from
order_info
GROUP by
create_date
)
select
create_date,
cast(sum(total_Day) over(order by timestamp_day RANGE BETWEEN 172800 PRECEDING and current row ) as decimal(16,2)) total_3d,
cast(avg(total_Day) over(order by timestamp_day RANGE BETWEEN 172800 PRECEDING and current row ) as DECIMAl(16,2) )avg_3d
from s1
4. 关于奇数偶数的分配问题
1. 思路
1.直接将奇数和偶数分开然后union
with s1 as (select
category_id,
price,
count(*) over(PARTITION by category_id) cnt ,
row_number() over(partition by category_id order by price) rk
from sku_info)
-- 偶数
select
category_id,
cast(avg(if(rk=(cnt/2) or rk = (cnt/2)+1,price,null)) as DECIMAL(16,2)) medprice
from s1
where cnt%2=0
GROUP by category_id
union
-- 奇数
select
category_id,
-- 奇数是向上取整取1个, 偶数是向上取整,然后再向上找一个
cast(avg(if(rk=ceil(cnt/2),price,null)) as DECIMAL(16,2)) medprice
from s1
where cnt%2=1
GROUP by category_id
2.优化 因为奇数是取结果向下+1, 偶数是取结果和结果+1,可以过滤
因为rk是整数
奇数rk = floor(cnt/2)+1 cnt/2<floor(cnt/2)+1 <cnt/2+1
偶数rk = floor(cnt/2)+1 floor(cnt/2) 两边相等
如果把条件改 大于等于 小于等于即可
with s1 as (select
category_id,
price,
count(*) over(PARTITION by category_id) cnt ,
row_number() over(partition by category_id order by price) rk
from sku_info)
-- 偶数
select
category_id,
cast(avg(price) as DECIMAL(16,2)) medprice
from s1
where cnt/2<=rk and rk<= (cnt/2+1)
GROUP by category_id