The meaning of the axis parameter in Python Pandas and NumPy

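This article explains what the axis parameter means for the DataFrame data structure in Python's Pandas library. Worked examples show which direction of operation axis=0 and axis=1 each correspond to, so that readers can use the parameter correctly in data analysis tasks.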


Reposted from stackoverflow.com
https://stackoverflow.com/questions/25773245/ambiguity-in-pandas-dataframe-numpy-array-axis-definition/25774395#25774395?newreg=547fe2de322e46a3a20d6d1aeeba4df9
Reference: https://www.jianshu.com/p/9aa448ea397c

Original question:
I’ve been very confused about how python axes are defined, and whether they refer to a DataFrame’s rows or columns. Consider the code below:

>>> import pandas as pd
>>> df = pd.DataFrame([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]], columns=["col1", "col2", "col3", "col4"])
>>> df
   col1  col2  col3  col4
0     1     1     1     1
1     2     2     2     2
2     3     3     3     3

So if we call df.mean(axis=1), we’ll get a mean across the rows:

>>> df.mean(axis=1)
0    1.0
1    2.0
2    3.0
dtype: float64

However, if we call df.drop(name, axis=1), we actually drop a column, not a row:

>>> df.drop("col4", axis=1)
   col1  col2  col3
0     1     1     1
1     2     2     2
2     3     3     3

One of the answers there is particularly good:

It’s perhaps simplest to remember it as 0=down and 1=across.

This means:

Use axis=0 to apply a method down each column, or to the row labels (the index).
Use axis=1 to apply a method across each row, or to the column labels.
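
The two rules above can be checked on the DataFrame from the question. The following is a minimal sketch that rebuilds that frame; the names `col_means` and `row_means` are illustrative only and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis=0: go down each column, so there is one result per column
col_means = df.mean(axis=0)

# axis=1: go across each row, so there is one result per row
row_means = df.mean(axis=1)

# The result is labelled by whatever was NOT collapsed
print(col_means.index.tolist())  # the column labels
print(row_means.index.tolist())  # the row labels (the index)
```

A quick way to read the result: with axis=0 the output is indexed by column labels, and with axis=1 it is indexed by row labels.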

Here’s a picture to show the parts of a DataFrame that each axis refers to:
[Image: diagram of a DataFrame, with axis=0 pointing down the rows and axis=1 pointing across the columns]

It’s also useful to remember that Pandas follows NumPy’s use of the word axis. The usage is explained in NumPy’s glossary of terms:

Axes are defined for arrays with more than one dimension. A 2-dimensional array has two corresponding axes: the first running vertically downwards across rows (axis 0), and the second running horizontally across columns (axis 1). [my emphasis]

So the method in the question, df.mean(axis=1), seems to be correctly defined. It takes the mean of entries horizontally across columns, that is, along each individual row. On the other hand, df.mean(axis=0) would be an operation acting vertically downwards across rows.
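
Since Pandas follows the NumPy convention, the same behaviour can be seen on a plain array. The sketch below reuses the values from the question as a 3x4 array; the names `arr`, `down` and `across` are illustrative assumptions, not part of the original answer.

```python
import numpy as np

arr = np.array([[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]])  # shape (3, 4)

# axis=0 collapses the first dimension (the rows): shape (3, 4) -> (4,)
down = arr.mean(axis=0)

# axis=1 collapses the second dimension (the columns): shape (3, 4) -> (3,)
across = arr.mean(axis=1)

print(down.shape, across.shape)
```

Another way to phrase it is that the axis passed to a reduction is the dimension that disappears from the shape of the result.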

Similarly, df.drop(name, axis=1) refers to an action on column labels, because they intuitively go across the horizontal axis. Specifying axis=0 would make the method act on rows instead.
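
For drop, the axis says which set of labels the name is looked up in. The sketch below, built on the frame from the question, contrasts the two axes and also shows the `columns=` and `index=` keywords of `DataFrame.drop`, which name the target labels directly; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

no_col4 = df.drop("col4", axis=1)  # "col4" is looked up among the column labels
no_row0 = df.drop(0, axis=0)       # 0 is looked up among the row labels (the index)

# Keyword forms that avoid the axis argument altogether
same_cols = df.drop(columns="col4")
same_rows = df.drop(index=0)

print(no_col4.columns.tolist())
print(no_row0.index.tolist())
```

The keyword forms can be easier to read when the axis numbering is hard to remember, since they state outright whether a row or a column is being removed.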

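Drawing on the Chinese translation, in summary:
axis=0 runs the operation down each column from top to bottom, and axis=1 runs it across each row from left to right. The emphasis is on the idea of traversal.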
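
The traversal reading in the summary above can be spelled out with explicit loops. This is only a sketch of the idea, not how Pandas computes the result internally; the names `down` and `across` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]],
    columns=["col1", "col2", "col3", "col4"],
)

# axis=0: visit each column and walk down it from top to bottom
down = {name: sum(df[name]) / len(df[name]) for name in df.columns}

# axis=1: visit each row and walk across it from left to right
across = {label: sum(df.loc[label]) / len(df.loc[label]) for label in df.index}

# The hand-written traversals agree with the built-in reductions
print(down == df.mean(axis=0).to_dict())
print(across == df.mean(axis=1).to_dict())
```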
