User Behavior Analysis for an E-commerce Site [Anonymized]

User behavior analysis

final_data is the anonymized dataset.

final_data.head()
     user_id    item_id  behavior_type  user_geohash  item_category           time
0   54007195   79633535              1           NaN           3940  2014-11-24 16
1  136952642  337800294              1           NaN           4830  2014-11-22 12
2  121255158  108926788              1           NaN           1970  2014-11-22 08
3   72256073  144090786              1           NaN           4008  2014-12-09 20
4   65645933  250029185              1       9t4qqgn           2825  2014-11-25 17
data = final_data[['user_id', 'item_id', 'behavior_type', 'time']]
data.head()
     user_id    item_id  behavior_type           time
0   54007195   79633535              1  2014-11-24 16
1  136952642  337800294              1  2014-11-22 12
2  121255158  108926788              1  2014-11-22 08
3   72256073  144090786              1  2014-12-09 20
4   65645933  250029185              1  2014-11-25 17
data.shape
(12256906, 4)
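The column selection above keeps four of the six columns. As a minimal sketch of how that choice could be checked, the snippet below measures the share of missing values per column and then takes the subset as an explicit copy. The frame `toy` and its values are made up for illustration and are not rows from final_data; the names `missing_share` and `subset` are likewise hypothetical.

```python
import numpy as np
import pandas as pd

# Toy stand-in for final_data; the values are illustrative, not real records.
toy = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "item_id": [10, 11, 12, 13],
    "behavior_type": [1, 1, 3, 4],
    "user_geohash": [np.nan, np.nan, np.nan, "9t4qqgn"],
    "item_category": [3940, 4830, 1970, 2825],
    "time": ["2014-11-24 16", "2014-11-22 12", "2014-11-22 08", "2014-11-25 17"],
})

# Fraction of missing values in each column (isna() gives booleans, mean() averages them).
missing_share = toy.isna().mean()

# Select the analysis columns and take an explicit copy, so that later
# column assignments act on an independent DataFrame rather than a slice.
subset = toy[["user_id", "item_id", "behavior_type", "time"]].copy()
```

Taking the copy at selection time is one way to avoid SettingWithCopyWarning when new columns are added to the subset later.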
# Work on an explicit copy so the assignments below do not raise SettingWithCopyWarning
data = data.copy()
# Split "YYYY-MM-DD HH" into a date string and an hour string
data['date'] = data['time'].map(lambda x: x.split(' ')[0])
data['hour'] = data['time'].map(lambda x: x.split(' ')[1])
data.head()
     user_id    item_id  behavior_type           time        date  hour
0   54007195   79633535              1  2014-11-24 16  2014-11-24    16
1  136952642  337800294              1  2014-11-22 12  2014-11-22    12
2  121255158  108926788              1  2014-11-22 08  2014-11-22    08
3   72256073  144090786              1  2014-12-09 20  2014-12-09    20
4   65645933  250029185              1  2014-11-25 17  2014-11-25    17
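The split above leaves date and hour as strings. An alternative, sketched here on a toy frame and not taken from the original post, is to parse the column once with `pd.to_datetime` and derive both fields from the parsed values. The frame `toy` and the variable `parsed` are hypothetical names; the format string assumes every value follows the "YYYY-MM-DD HH" layout shown in the output.

```python
import pandas as pd

# Toy frame using the same "YYYY-MM-DD HH" layout as the time column.
toy = pd.DataFrame({"time": ["2014-11-24 16", "2014-11-22 08"]})

# Parse once with an explicit format; malformed values would raise an error here.
parsed = pd.to_datetime(toy["time"], format="%Y-%m-%d %H")

# Derive the date as a string and the hour as an integer.
toy["date"] = parsed.dt.strftime("%Y-%m-%d")
toy["hour"] = parsed.dt.hour
```

Note that the hour becomes an integer here (8 rather than "08"), which sorts numerically in later group-by or plotting steps.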
data.drop(['time'], axis=1, inplace=True)
data.head()