
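Problem: in a recent development task I ran into a grouped, weighted-sum calculation. Group the rows by the field 'CPN'. For each PO line, multiply its Actual CT2R by the line's share of the group's total quantity, and call the result 'Weighted(QTY)CT2R'. Then sum the per-row 'Weighted(QTY)CT2R' values over all rows with the same 'CPN' to get the total 'Weighted(QTY)CT2R' for that CPN. In the figure below, the cells filled in yellow are the target values.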
The calculation logic is as follows:
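Written as a formula, the logic can be sketched as below. This is inferred from the code in this post rather than taken from the original figure, and the symbols are assumptions introduced here: $q_i$ stands for the quantity of PO line $i$ (`po_line_qty`), $c_i$ for its Actual CT2R (`actual_ct2r`), and $G$ for the set of lines sharing one CPN.

```latex
\[
w_i = \frac{q_i}{\sum_{j \in G} q_j} \, c_i ,
\qquad
\text{Weighted(QTY)CT2R}_G = \sum_{i \in G} w_i
= \frac{\sum_{i \in G} q_i \, c_i}{\sum_{i \in G} q_i}
\]
```

For example, CPN '02-0437' in the sample data has three lines, each with quantity 20 and CT2R 80, so each line contributes $\frac{20}{60} \times 80 \approx 26.67$ and the group total is 80.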
The Pandas code below implements this requirement:
import pandas as pd
df = pd.DataFrame([['01-0989',10,90],
['01-0989',10,90],
['01-0989',10,90],
['01-0989',10,90],
['01-0989',10,90],
['01-0989',10,90],
['01-0989',10,90],
['01-0989',10,90],
['01-0989',10,90],
['01-0989',200,50],
['02-0437',20,80],
['02-0437',20,80],
['02-0437',20,80]
],columns = ['cpn','po_line_qty','actual_ct2r'])
# Group by 'cpn' and sum 'po_line_qty'; store the result as total
total = df.groupby('cpn').agg({'po_line_qty': 'sum'}).reset_index()
# Rename 'po_line_qty' to 'total_po_line_qty'
total = total.rename(columns={'po_line_qty': 'total_po_line_qty'})
# Left-join df with total on 'cpn'; store the result as new_res
new_res = pd.merge(df, total, how='left', on='cpn')
def weighted_qty_ct2r(row):
    # Share of this line's quantity in its CPN group's total quantity
    scale = row['po_line_qty'] / row['total_po_line_qty']
    # Weight the line's actual CT2R by that share
    return scale * row['actual_ct2r']
# Create the column 'weighted_qty_ct2r'
new_res['weighted_qty_ct2r'] = new_res.apply(weighted_qty_ct2r, axis=1)
# Group by 'cpn' and sum 'weighted_qty_ct2r'; store the result as df_result
df_result = new_res.groupby('cpn').agg({'weighted_qty_ct2r': 'sum'})
# Inspect the input, the intermediate tables, and the final result
print(df)
print(total)
print(new_res)
print(df_result)
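The same result can probably be reached without the merge and the row-wise apply. The sketch below is an alternative, not the approach used in this post: it relies on `groupby(...).transform('sum')` to broadcast each group's total quantity back onto its rows, then does the arithmetic column-wise. The names `group_total` and `result` are illustrative, and the sample data is the same as above, just built more compactly.

```python
import pandas as pd

# Same sample data as in the post, built from column lists.
df = pd.DataFrame(
    {
        "cpn": ["01-0989"] * 10 + ["02-0437"] * 3,
        "po_line_qty": [10] * 9 + [200] + [20] * 3,
        "actual_ct2r": [90] * 9 + [50] + [80] * 3,
    }
)

# Total quantity of each CPN group, aligned row by row with df (no merge needed).
group_total = df.groupby("cpn")["po_line_qty"].transform("sum")

# Per-row weighted value: the line's share of the group quantity times its CT2R.
df["weighted_qty_ct2r"] = df["po_line_qty"] / group_total * df["actual_ct2r"]

# Sum the per-row values within each CPN.
result = df.groupby("cpn")["weighted_qty_ct2r"].sum()
print(result)
```

Column-wise arithmetic like this tends to be faster than `apply(..., axis=1)` on large tables, since it avoids calling a Python function once per row.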