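vg = valid1.groupby(['Coupon_id'])
for i in vg:
    print(i)     # the whole (label, DataFrame) tuple for this group
    print(i[1])  # only the DataFrame part of the tuple
    break        # stop after the first group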
The output is as follows:
('1',          User_id  Merchant_id  Coupon_id  Discount_rate  Distance  Date_received  \
768069    472146         6889          1           20:1         9       20160522
962551   2266597         6889          1           20:1         0       20160603
964821   3057133         6889          1           20:1         0       20160606
1665538  5555255         6889          1           20:1         3       20160530

             Date  discount_type  discount_rate  discount_man  ...  \
768069   20160602              1           0.95            20  ...
962551       null              1           0.95            20  ...
964821       null              1           0.95            20  ...
1665538      null              1           0.95            20  ...

         weekday_type  weekday_1  weekday_2  weekday_3  weekday_4  weekday_5  \
768069              1          0          0          0          0          0
962551              0          0          0          0          0          1
964821              0          1          0          0          0          0
1665538             0          1          0          0          0          0

         weekday_6  weekday_7  label  pred_prob
768069           0          1      1   0.013437
962551           0          0      0   0.106432
964821           0          0      0   0.101014
1665538          0          0      0   0.053949

[4 rows x 23 columns])
         User_id  Merchant_id  Coupon_id  Discount_rate  Distance  Date_received  \
768069    472146         6889          1           20:1         9       20160522
962551   2266597         6889          1           20:1         0       20160603
964821   3057133         6889          1           20:1         0       20160606
1665538  5555255         6889          1           20:1         3       20160530

             Date  discount_type  discount_rate  discount_man  ...  \
768069   20160602              1           0.95            20  ...
962551       null              1           0.95            20  ...
964821       null              1           0.95            20  ...
1665538      null              1           0.95            20  ...

         weekday_type  weekday_1  weekday_2  weekday_3  weekday_4  weekday_5  \
768069              1          0          0          0          0          0
962551              0          0          0          0          0          1
964821              0          1          0          0          0          0
1665538             0          1          0          0          0          0

         weekday_6  weekday_7  label  pred_prob
768069           0          1      1   0.013437
962551           0          0      0   0.106432
964821           0          0      0   0.101014
1665538          0          0      0   0.053949

[4 rows x 23 columns]
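Calling groupby() on a DataFrame returns a GroupBy object, not a new DataFrame. Iterating over that object yields one item per distinct value of the grouping column, and each item is a two-element tuple rather than a matrix: i[0] is the group label (here the Coupon_id value "1"), and i[1] is a DataFrame containing exactly the rows of the original DataFrame that carry that label.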
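To make the (label, DataFrame) structure concrete, here is a minimal sketch on a toy DataFrame. The data values are hypothetical and only the column names mirror the ones above. It groups by the plain string 'Coupon_id' instead of the one-element list used earlier, because on newer pandas versions a one-element list may make each label come back as a one-element tuple.

```python
import pandas as pd

# Hypothetical toy data; only the column names follow the post.
valid1 = pd.DataFrame({
    "User_id": [472146, 2266597, 3057133, 100001],
    "Coupon_id": ["1", "1", "1", "2"],
    "pred_prob": [0.013437, 0.106432, 0.101014, 0.5],
})

vg = valid1.groupby("Coupon_id")

# Each item is a 2-tuple, so it can be unpacked directly
# instead of indexing with i[0] and i[1].
for coupon_id, group in vg:
    print(coupon_id, type(group).__name__, group.shape)

# Take only the first (label, DataFrame) pair, like the loop with break.
first_key, first_group = next(iter(vg))

# Number of rows per group, as a plain dict.
group_sizes = vg.size().to_dict()

# Fetch one group by its label without looping.
coupon_1 = vg.get_group("1")
```

Unpacking into two names reads more clearly than i[0] and i[1], and get_group() avoids the loop entirely when only one label is of interest.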