pandas groupby: grouping made easy

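This article covers the key pandas methods for processing data in groups: creating a GroupBy by column name or by condition, and using agg, transform, and filter. agg performs aggregation, such as computing the max or mean; transform converts the original data, for example with a cumulative sum; filter keeps or drops whole groups according to the boolean the function returns for each group. apply runs a custom function on each group of the DataFrame.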


GroupBy

df.groupby(cols, dropna)
df.groupby(condition)

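cols: a str or a list of column names.
condition: a Series of group names, aligned with the rows.
dropna: a bool (default True). When True, rows whose group key is NaN are dropped; pass False to keep NaN as its own group.
Returns a DataFrameGroupBy (from a DataFrame) or a SeriesGroupBy (from a Series).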

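gb[cols] selects columns from a GroupBy and returns another GroupBy: a SeriesGroupBy for a single column name, a DataFrameGroupBy for a list of names.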
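The calls above can be sketched on a small frame. The data and the column names `team` and `score` are made up for illustration; this is a minimal example, not the only way to build a GroupBy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "team": ["a", "a", "b", np.nan],
    "score": [1, 2, 3, 4],
})

# Group by a column name; rows whose key is NaN are dropped by default.
by_col = df.groupby("team")
n_default = by_col.ngroups

# dropna=False keeps NaN as its own group.
n_with_nan = df.groupby("team", dropna=False).ngroups

# Group by a condition: a boolean Series aligned with the rows.
by_cond = df.groupby(df["score"] > 2)
cond_sizes = by_cond.size()

# Selecting a single column on the GroupBy gives a SeriesGroupBy.
score_sum = by_col["score"].sum()
```

Grouping by a condition uses the values of the Series (here False and True) as the group names.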

attributes

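ngroups: the number of groups.
groups: a dict mapping each group name to the index labels of its rows.

Methods:
size(): the number of rows in each group.
get_group(name): the rows of the original data that belong to the given group.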
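A short sketch of these attributes and methods, again on invented data (the frame and its column names are assumptions):

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({
    "team": ["a", "b", "a", "b", "a"],
    "score": [1, 2, 3, 4, 5],
})
gb = df.groupby("team")

n = gb.ngroups              # number of groups
groups = gb.groups          # maps group name -> index labels of its rows
sizes = gb.size()           # Series with the number of rows per group
team_a = gb.get_group("a")  # rows of the original frame in group "a"
```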

Aggregate function

Each returns a new object with one entry per group: a DataFrame from a DataFrameGroupBy, a Series from a SeriesGroupBy.

Common built-ins:

  1. max, min, idxmax, idxmin
  2. mean, sum, prod, std, var, size, count, nunique
  3. all, any
  4. median, quantile
  5. sem: standard error of the mean within each group.
  6. skew: unbiased skewness.
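A few of the built-ins listed above, shown on a SeriesGroupBy. The data is invented for illustration.

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "score": [1.0, 3.0, 2.0, 6.0],
})
gb = df.groupby("team")["score"]

means = gb.mean()        # one value per group
maxes = gb.max()
best_rows = gb.idxmax()  # index label of the max within each group
medians = gb.median()
uniques = gb.nunique()   # number of distinct values per group
```

Each result is indexed by group name, so `means["a"]` looks up the mean of group "a".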

agg

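gb.agg(list): list is a list of function names (or functions), applied to every column.
gb.agg(dict): dict maps a column name to a list of functions or of (name, func) pairs.

agg returns a DataFrame whose columns are labelled by the original column and the function's __name__ (or the name given in the pair).

Inner function input: one column of one group;
output: a single aggregate value.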
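A minimal sketch of the list and dict forms of agg. The frame and the helper `value_range` are hypothetical names introduced for illustration only.

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "score": [1, 3, 2, 6],
    "age": [20, 30, 40, 50],
})
gb = df.groupby("team")

# A list of function names applies each function to every non-key column.
by_list = gb.agg(["min", "max"])

# A dict chooses the functions per column.
by_dict = gb.agg({"score": ["sum", "mean"], "age": ["max"]})

# A custom function receives one column of one group and returns a scalar.
def value_range(col):
    return col.max() - col.min()

custom = gb.agg({"score": [value_range]})
```

With lists of functions the result columns are two-level, so a single cell is addressed as `by_dict.loc["b", ("score", "mean")]`; the custom function appears under its own name, `("score", "value_range")`.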

Transform function

Transform functions return a result with the same index as the original data, one value per row.

Common built-ins: cumcount, cumsum, cumprod, cummax, cummin.

transform

gb.transform(func)
func, input: one column of one group; output: a column of the same length (a scalar is broadcast across the group).
transform puts the per-group outputs back together in the original row order.
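The built-in cumulative functions and a custom transform can be sketched as follows. The data is invented, and the interleaved row order is chosen to show that results come back aligned with the original rows.

```python
import pandas as pd

# Hypothetical toy data for illustration; groups are interleaved on purpose.
df = pd.DataFrame({
    "team": ["a", "b", "a", "b"],
    "score": [1, 2, 3, 4],
})
gb = df.groupby("team")["score"]

running = gb.cumsum()     # running total within each group
position = gb.cumcount()  # 0-based position of each row within its group

# The function is called on each group's column; the pieces are put back
# together in the original row order.
centered = gb.transform(lambda col: col - col.mean())

# A per-group scalar is broadcast to every row of its group.
group_mean = gb.transform("mean")
```

Because the output has the same index as `df`, it can be assigned straight back as a new column, for example `df["centered"] = centered`.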

Filter function

gb.filter(func)
func, input: one group's data (a DataFrame for a DataFrameGroupBy); output: a single bool.
filter keeps the rows of every group for which func returns True and drops the other groups.
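In pandas, the function passed to DataFrameGroupBy.filter is called once per group and should return a single True or False; whole groups are kept or dropped. A minimal sketch on invented data:

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "c"],
    "score": [1, 3, 2, 6, 10],
})

# Keep only the groups whose mean score exceeds 2. The function sees one
# group's rows at a time and returns a single bool.
kept = df.groupby("team").filter(lambda g: g["score"].mean() > 2)
```

Group "a" has a mean of exactly 2, so all of its rows are dropped; the surviving rows keep their original index labels.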

apply

gb.apply(func)
func, input: one group's DataFrame; output: a scalar, Series, or DataFrame.
apply automatically combines the per-group outputs into a single result.
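The three output kinds can be sketched as below. The data is invented, and selecting the value columns before calling apply is a choice made here so that the grouping column is not part of the frames passed to the function, which sidesteps version differences in how pandas handles it.

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "score": [1, 3, 2, 6],
    "age": [20, 30, 40, 50],
})
# Select the value columns explicitly (an assumption made for this sketch).
gb = df.groupby("team")[["score", "age"]]

# Scalar per group -> Series indexed by group name.
score_range = gb.apply(lambda g: g["score"].max() - g["score"].min())

# Series per group -> DataFrame with one row per group.
summary = gb.apply(
    lambda g: pd.Series({"total": g["score"].sum(), "oldest": g["age"].max()})
)

# DataFrame per group -> the per-group frames are concatenated.
top1 = gb.apply(lambda g: g.nlargest(1, "score"))
```

apply is the most flexible of the four but usually the slowest; when a built-in aggregate, agg, or transform expresses the same thing, those are generally the better choice.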
