需求
今天接到一个需求,需要得到消费者的每次消费行为是其消费的第几个消费天,由于需要对每个用户单独单独计算,着实是让我思考了一番。现把利用clickhouse arrayjoin函数的实现与大家分享。
我的表结构
CREATE TABLE miniLLJ.trade_drop_table_conv_bobo (
`id` Int32,
`create_on` Int64,
`insert_time` DateTime,
`reg` Nullable(Int64),
`reg_time` Nullable(DateTime),
`user_id` Nullable(Int32),
`device_id` Nullable(Int32),
`product_id` Nullable(Int32),
`consume_type` Nullable(UInt8),
`start_coins` Nullable(UInt16),
`catch_status` Nullable(UInt8),
`store_id` Nullable(UInt8),
`card_id` Nullable(String),
`card_name` Nullable(String),
`card_type` Nullable(String),
`store_name` Nullable(String),
`device_name` Nullable(String),
`balance` Nullable(UInt32),
`platform` Nullable(UInt8),
`product_name` Nullable(String),
`product_cost` Nullable(Float32),
`product_claw` Nullable(String),
`product_brand` Nullable(String),
`min_power` Nullable(UInt16),
`max_power` Nullable(UInt16),
`play_code` UInt8,
`device_type` UInt16,
`device_claw` String,
`is_manual` UInt8,
`gugu_create_on` Nullable(Int64),
`trade_no` Nullable(String),
`product_conv_bobo` Nullable(UInt32)
) ENGINE = MergeTree PARTITION BY toYYYYMM(insert_time) ORDER BY create_on SETTINGS index_granularity = 8192
select
user_id,
arraySort(groupUniqArray(toDate(insert_time))) as consume_date
from
miniLLJ.trade_drop_table_conv_bobo tdt
GROUP by user_id
order by user_id
得到的结果:
然后将其拆到每一行
select user_id as luid,consume_date,datee,day_number from (select
user_id,
arraySort(groupUniqArray(toDate(insert_time))) as consume_date
from
miniLLJ.trade_drop_table_conv_bobo tdt
GROUP by user_id
order by user_id) user_consume_days
array join
consume_date as datee,
arrayEnumerate(consume_date) AS day_number;
得到我们期望的结果
之后我们将其与原先的表innerjoin起来就可以了,注意小表放右边 XD。
用到的函数:
arraySort() 将array内元素进行排序
groupUniqArray() 将array中元素去重
arrayEnumerate() 获得array元素的下标。