Translated from: How to reset index in a pandas data frame? [duplicate]
I have a data frame from which I removed some rows. As a result, its index looks something like [1,5,6,10,11], and I would like to reset it to [0,1,2,3,4]. How can I do it?
The following seems to work:
df = df.reset_index()
del df['index']
The following does not work:
df = df.reindex()
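To make the difference between the two attempts concrete, here is a minimal sketch. The frame and its column name `a` are made up for illustration; only the index values come from the question.

```python
import pandas as pd

# Hypothetical frame whose index has gaps after some rows were removed.
df = pd.DataFrame({"a": [10, 20, 30, 40, 50]}, index=[1, 5, 6, 10, 11])

# reindex() conforms the frame to the labels it is given; with no labels
# passed, the existing index is kept as it is.
unchanged = df.reindex()
print(list(unchanged.index))     # [1, 5, 6, 10, 11]

# reset_index() builds a fresh 0..n-1 index and moves the old labels
# into a column named "index", which is why the question deletes it.
renumbered = df.reset_index()
print(list(renumbered.index))    # [0, 1, 2, 3, 4]
print(list(renumbered.columns))  # ['index', 'a']
```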
Reply #1
Reference: https://stackoom.com/question/1nYsi/如何重置熊猫数据框中的索引-重复
Reply #2
reset_index() is what you're looking for. If you don't want the old index saved as a column, then do:
df = df.reset_index(drop=True)
If you don't want to reassign:
df.reset_index(drop=True, inplace=True)
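The two forms differ in what they return, which is easy to trip over. A small sketch, using a made-up three-row frame:

```python
import pandas as pd

# Hypothetical frame with a gappy index.
df = pd.DataFrame({"a": [10, 20, 30]}, index=[1, 5, 6])

# Reassigning: reset_index returns a new frame and leaves the original alone.
new_df = df.reset_index(drop=True)
print(list(df.index))      # [1, 5, 6]
print(list(new_df.index))  # [0, 1, 2]

# In place: the frame itself is modified and the call returns None,
# so combining inplace=True with `df = ...` would replace df with None.
result = df.reset_index(drop=True, inplace=True)
print(result)              # None
print(list(df.index))      # [0, 1, 2]
```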
Reply #3
Another solution is to assign a RangeIndex or a range directly to the index:
df.index = pd.RangeIndex(len(df.index))
df.index = range(len(df.index))
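As a quick sanity check, both assignments should give the same frame as reset_index(drop=True). The sketch below works on copies (the names `by_range_index` and `by_range` are only illustrative) because assigning to .index modifies the frame itself.

```python
import pandas as pd

# Hypothetical frame with a non-default index.
df = pd.DataFrame({"a": [8, 7, 6]}, index=[7, 8, 9])
expected = df.reset_index(drop=True)

# Assign a RangeIndex to a copy.
by_range_index = df.copy()
by_range_index.index = pd.RangeIndex(len(by_range_index.index))

# Assign a plain Python range to another copy.
by_range = df.copy()
by_range.index = range(len(by_range.index))

print(by_range_index.equals(expected))  # True
print(by_range.equals(expected))        # True
print(list(df.index))                   # [7, 8, 9], the original is untouched
```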
Assigning the index directly is faster than reset_index(drop=True):
import pandas as pd

# 20,000-row test frame with a repeating, non-default index
df = pd.DataFrame({'a':[8,7], 'c':[2,4]}, index=[7,8])
df = pd.concat([df]*10000)
print(df.head())
In [298]: %timeit df1 = df.reset_index(drop=True)
The slowest run took 7.26 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 105 µs per loop
In [299]: %timeit df.index = pd.RangeIndex(len(df.index))
The slowest run took 15.05 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.84 µs per loop
In [300]: %timeit df.index = range(len(df.index))
The slowest run took 7.10 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 14.2 µs per loop
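The %timeit figures above come from an IPython session and will vary by machine and pandas version. A rough way to repeat the comparison in a plain script is sketched below, using the standard timeit module; the helper names and the loop count are arbitrary choices, and each variant gets its own copy of the frame so they do not affect one another.

```python
import timeit

import pandas as pd

base = pd.DataFrame({"a": [8, 7], "c": [2, 4]}, index=[7, 8])
base = pd.concat([base] * 10000)

# Separate copies so one variant does not change the input of another.
frame_reset = base.copy()
frame_range_index = base.copy()
frame_range = base.copy()

def use_reset_index():
    return frame_reset.reset_index(drop=True)

def use_range_index():
    frame_range_index.index = pd.RangeIndex(len(frame_range_index.index))

def use_range():
    frame_range.index = range(len(frame_range.index))

loops = 1000  # arbitrary; raise it for steadier numbers
timings = {
    func.__name__: timeit.timeit(func, number=loops) / loops
    for func in (use_reset_index, use_range_index, use_range)
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.2f} µs per loop")
```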
Reply #4
data1.reset_index(inplace=True)