A Brief Look at DataFrame.groupby()

The groupby grouping function

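  Return value: groupby() itself returns a GroupBy object; once an aggregation such as sum() is applied, the result is a reshaped Series or DataFrame. Note in particular that the values of the columns passed to groupby become the index of that result.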

  groupby() is usually used together with sum(), as in the following example:

from pandas import DataFrame

# Each row holds: name, sex ('男' means male), course, score
a = [['Li', '男', 'PE', 98.], ['Li', '男', 'MATH', 60.], ['liu', '男', 'MATH', 60.], ['yu', '男', 'PE', 100.]]

af = DataFrame(a, columns=['name', 'sex', 'course', 'score'])
af

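This produces the following DataFrame:

      name sex course  score
    0   Li   男     PE   98.0
    1   Li   男   MATH   60.0
    2  liu   男   MATH   60.0
    3   yu   男     PE  100.0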

    

 

af.groupby(['name','course'])['score'].sum()  # group af by name, then by course, and sum score within each group

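The result is a new Series (not a DataFrame, because a single column was selected) with the following structure:

    name  course
    Li    MATH       60.0
          PE         98.0
    liu   MATH       60.0
    yu    PE        100.0
    Name: score, dtype: float64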

    

Important: after the reshaping, the values of the columns passed to groupby become the index.
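If the grouping keys are needed as ordinary columns again, there are two common ways to get them back. The sketch below rebuilds the same sample data (the variable names `rows`, `flat`, and `flat_direct` are illustrative, not from the original post) and shows `reset_index()` and `as_index=False`.

```python
import pandas as pd

# Same sample data as in the article ('男' means male).
rows = [
    ["Li", "男", "PE", 98.0],
    ["Li", "男", "MATH", 60.0],
    ["liu", "男", "MATH", 60.0],
    ["yu", "男", "PE", 100.0],
]
af = pd.DataFrame(rows, columns=["name", "sex", "course", "score"])

# Selecting one column and aggregating gives a Series whose index
# has two levels, named after the grouping columns.
grouped = af.groupby(["name", "course"])["score"].sum()
index_names = list(grouped.index.names)

# reset_index() moves the index levels back into ordinary columns.
flat = grouped.reset_index()

# as_index=False keeps the grouping keys as columns from the start.
flat_direct = af.groupby(["name", "course"], as_index=False)["score"].sum()

print(index_names)
print(flat)
```

Both `flat` and `flat_direct` should be plain DataFrames with the columns name, course, and score; which form to use is mostly a matter of taste.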

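  Before groupby() is applied, name and course are ordinary data columns, so their values cannot be used as keys to look up rows. For example, run the following before grouping: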

af['Li']

  This raises an error (a KeyError), because 'Li' is a value in the name column, not a column label.
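A minimal sketch of this failure, and of one way to select the rows without grouping, is shown here. It assumes the same sample data as above; `lookup_failed` and `li_rows` are names made up for illustration.

```python
import pandas as pd

# Same sample data as in the article ('男' means male).
rows = [
    ["Li", "男", "PE", 98.0],
    ["Li", "男", "MATH", 60.0],
    ["liu", "男", "MATH", 60.0],
    ["yu", "男", "PE", 100.0],
]
af = pd.DataFrame(rows, columns=["name", "sex", "course", "score"])

# af["Li"] looks for a COLUMN labelled "Li", which does not exist.
try:
    af["Li"]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Without grouping, rows can still be selected by value with a boolean mask.
li_rows = af[af["name"] == "Li"]

print(lookup_failed)
print(li_rows)
```

The boolean mask returns the raw rows for Li, whereas the groupby approach described next returns aggregated scores indexed by name.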

  After grouping, name and course both become index levels: name is the outer level and course is the inner level. Run the following:

af.groupby(['name','course'])['score'].sum()['Li']

  The data is accessed successfully, and the result is:

    course
    MATH    60.0
    PE      98.0
    Name: score, dtype: float64
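Indexing by the outer level is only one of the lookups a two-level index allows. As a rough sketch on the same sample data (the names `li_scores`, `li_pe`, and `math_scores` are illustrative), a tuple selects a single entry and `xs()` selects by the inner level:

```python
import pandas as pd

# Same sample data as in the article ('男' means male).
rows = [
    ["Li", "男", "PE", 98.0],
    ["Li", "男", "MATH", 60.0],
    ["liu", "男", "MATH", 60.0],
    ["yu", "男", "PE", 100.0],
]
af = pd.DataFrame(rows, columns=["name", "sex", "course", "score"])
grouped = af.groupby(["name", "course"])["score"].sum()

# Outer level only: a Series indexed by course.
li_scores = grouped["Li"]

# Both levels at once: a single value.
li_pe = grouped.loc[("Li", "PE")]

# Inner level only: xs() takes a cross-section, leaving a Series indexed by name.
math_scores = grouped.xs("MATH", level="course")

print(li_scores)
print(li_pe)
print(math_scores)
```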

   

 

Reposted from: https://www.cnblogs.com/2017Crown/p/7247694.html
