Platform address:vue-sqleditor
题目需求:
现需要请向所有用户推荐其朋友收藏但是用户自己未收藏的商品,请从好友关系表(friendship_info)和收藏表(favor_info)中查询出应向哪位用户推荐哪些商品。
期望结果如下:
user_id(用户id)sku_id(应向该用户推荐的商品id)
101 2
101 4
101 7
101 9
101 8
101 11
101 1
需要用到的表:
好友关系表:friendship_info
user1_id(用户1 id) user2_id(用户2 id)
101 1010
101 108
101 106
收藏表:favor_info
user_id(用户id) sku_id(商品id) create_date(收藏日期)
101 3 2021-09-23
101 12 2021-09-23
101 6 2021-09-25
建表语句:
CREATE TABLE `friendship_info`(
`user1_id` string,
`user2_id` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
CREATE TABLE `favor_info`(
`user_id` string,
`sku_id` string,
`create_date` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
插入样例数据:
Insert into friendship_info
Select '101','1010'
Union all
Select '101','108'
Union all
Select '101','106';
Insert into favor_info
Select '101','3','2021-09-23'
Union all
Select '101','12','2021-09-23'
Union all
Select '101','6','2021-09-25';
需求分析:
用户朋友收藏的商品 s3
select user1_id,sku_id from friendship_info s1 join favor_info s2 on s1.user2_id=s2.user_id group by user1_id,sku_id
101 1
101 10
101 11
101 12
101 2
101 3
101 4
101 5
101 6
101 7
101 8
101 9
用户自己收藏的商品 s4
select user1_id,sku_id from friendship_info s1 join favor_info s2 on s1.user1_id=s2.user_id group by user1_id,sku_id
101 10
101 12
101 3
101 5
101 6
其朋友收藏但是用户自己未收藏的,即应被推荐的商品
自己收藏的商品总数与未匹配到的商品总数相等,证明朋友收藏的此商品,自己没有,应该被推荐
select s3.user1_id as user_id,s3.sku_id from (select user1_id,sku_id from friendship_info s1 join favor_info s2 on s1.user2_id=s2.user_id group by user1_id,sku_id)s3 join (select user1_id,sku_id from friendship_info s1 join favor_info s2 on s1.user1_id=s2.user_id group by user1_id,sku_id)s4 on s3.user1_id=s4.user1_id group by s3.user1_id,s3.sku_id having count(1)-sum(if(s3.sku_id!=s4.sku_id,1,0))=0
最终SQL:
select
s3.user1_id as user_id,
s3.sku_id
from
(
select
user1_id,
sku_id
from
friendship_info s1
join favor_info s2 on s1.user2_id = s2.user_id
group by
user1_id,
sku_id
) s3
join (
select
user1_id,
sku_id
from
friendship_info s1
join favor_info s2 on s1.user1_id = s2.user_id
group by
user1_id,
sku_id
) s4 on s3.user1_id = s4.user1_id
group by
s3.user1_id,
s3.sku_id
having
count(1) - sum(if(s3.sku_id ! = s4.sku_id, 1, 0)) = 0
AC时间:
2023年01月05日13:55:23