背景
一、构建问题:
1、识别问题:
对上述的数据集进行观察,可以得到olist平台的运营情况,主要是如下3个维度展开:
① 订单维度: 订单销售额,销售量、
订单品类: 销量最高和最高sku(立足核心数据集olist_Order Dataset.csv)
② 用户维度:用户城市分布、用户的付款方式、用户满意度(评价)?
二、 解决问题:
整体维度
1、16-18年销售额跟订单量、客单价、情况:
-- 销售额跟订单量、客单价、情况
SELECT
Year(a.order_purchase_timestamp) as '年'
,MONTH(a.order_purchase_timestamp) as '月'
--,c.customer_city as '城市'
,COUNT(a.order_id) as '订单量'
,sum(b.payment_value) as '订单金额'
,sum(b.payment_value)/COUNT(a.order_id) as '客单价'
from orders_dataset as a
left join order_payments_dataset as b
on a.order_id = b.order_id
-- left join customers_dataset as c
---on a.customer_id = c.customer_id---
where a.order_status = 'delivered'
group by Year(a.order_purchase_timestamp),MONTH(a.order_purchase_timestamp)
order by Year(a.order_purchase_timestamp),MONTH(a.order_purchase_timestamp);
2