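This article picks 23 commonly used pandas functions, one for each letter of the alphabet, and shows how to use each of them. The letters o, y, and z have no corresponding function.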
import pandas as pd
import numpy as np
Each function is introduced below. For more detail, see the official documentation: pandas.pydata.org/docs/refere…
The assign function
df = pd.DataFrame({
'temp_c': [17.0, 25.0]},
index=['Portland', 'Berkeley'])
df
| | temp_c |
|---|---|
| Portland | 17.0 |
| Berkeley | 25.0 |
# Create a new column
df.assign(temp_f=df['temp_c'] * 9 / 5 + 32)
| | temp_c | temp_f |
|---|---|---|
| Portland | 17.0 | 62.6 |
| Berkeley | 25.0 | 77.0 |
df  # the original DataFrame is unchanged
| | temp_c |
|---|---|
| Portland | 17.0 |
| Berkeley | 25.0 |
df["temp_f1"] = df["temp_c"] * 9 / 5 + 32
df
| | temp_c | temp_f1 |
|---|---|---|
| Portland | 17.0 | 62.6 |
| Berkeley | 25.0 | 77.0 |
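The difference between assign and direct column assignment can be sketched as follows. This is a minimal illustration, not part of the original walkthrough: the column name temp_k and the variable result are made up for the example. It relies on assign accepting callables, which receive the DataFrame being built, so a later keyword can use a column created earlier in the same call.

```python
import pandas as pd

df = pd.DataFrame({"temp_c": [17.0, 25.0]}, index=["Portland", "Berkeley"])

# Each callable receives the DataFrame as it stands at that point, so
# temp_k (a hypothetical extra column) can be derived from temp_f.
result = df.assign(
    temp_f=lambda d: d["temp_c"] * 9 / 5 + 32,
    temp_k=lambda d: (d["temp_f"] + 459.67) * 5 / 9,
)

print(result.columns.tolist())  # columns of the new DataFrame
print(df.columns.tolist())      # df itself still has only temp_c
```

Because assign returns a new DataFrame, it fits method chaining well, while df["col"] = ... modifies the existing object in place.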
The bool function
Returns the boolean value (True or False) of a Series or DataFrame that contains exactly one element.
pd.Series([True]).bool()
True
pd.Series([False]).bool()
False
pd.DataFrame({'col': [True]}).bool()
True
pd.DataFrame({'col': [False]}).bool()
False
# More than one element raises an error
# pd.DataFrame({'col': [True,False]}).bool()
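The pandas documentation marks bool as deprecated since version 2.1, so on newer installations the calls above may warn or be unavailable. The sketch below shows a substitute technique, not the bool method itself: Series.item() for a one-element Series and positional indexing for a one-cell DataFrame. The variable names are illustrative only.

```python
import pandas as pd

single = pd.Series([True])
value = single.item()  # extracts the only element as a scalar

frame = pd.DataFrame({"col": [False]})
frame_value = bool(frame.iloc[0, 0])  # read the single cell by position

# item() refuses to guess when there is more than one element
try:
    pd.Series([True, False]).item()
    raised = False
except ValueError:
    raised = True

print(value, frame_value, raised)
```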
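The concat function
This function concatenates multiple DataFrames, either vertically (stacking rows) or horizontally (placing columns side by side).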
df1 = pd.DataFrame({
"sid":["s1","s2"],
"name":["xiaoming","Mike"]})
df1
| | sid | name |
|---|---|---|
| 0 | s1 | xiaoming |
| 1 | s2 | Mike |
df2 = pd.DataFrame({
"sid":["s3","s4"],
"name":["Tom","Peter"]})
df2
| | sid | name |
|---|---|---|
| 0 | s3 | Tom |
| 1 | s4 | Peter |
df3 = pd.DataFrame({
"address":["北京","深圳"],
"sex":["Male","Female"]})
df3
| | address | sex |
|---|---|---|
| 0 | 北京 | Male |
| 1 | 深圳 | Female |
# Usage 1: vertical (stack rows)
pd.concat([df1,df2])
| | sid | name |
|---|---|---|
| 0 | s1 | xiaoming |
| 1 | s2 | Mike |
| 0 | s3 | Tom |
| 1 | s4 | Peter |
# Usage 2: horizontal (side by side)
pd.concat([df1,df3],axis=1)
| | sid | name | address | sex |
|---|---|---|---|---|
| 0 | s1 | xiaoming | 北京 | Male |
| 1 | s2 | Mike | 深圳 | Female |
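Note that the vertical result above repeats the index labels 0 and 1. Two common ways to handle that can be sketched as follows; the variable names stacked and labelled and the key strings "first" and "second" are chosen for illustration only.

```python
import pandas as pd

df1 = pd.DataFrame({"sid": ["s1", "s2"], "name": ["xiaoming", "Mike"]})
df2 = pd.DataFrame({"sid": ["s3", "s4"], "name": ["Tom", "Peter"]})

# ignore_index=True discards the original row labels and renumbers from 0
stacked = pd.concat([df1, df2], ignore_index=True)

# keys= keeps the original labels and adds an outer index level
# that records which input each row came from
labelled = pd.concat([df1, df2], keys=["first", "second"])

print(stacked.index.tolist())
print(labelled.loc["second"])
```

Renumbering is usually enough when the old labels carry no meaning; keys is the safer option when the source of each row needs to stay traceable.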
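The dropna function
Drops missing values. By default, any row that contains at least one missing value is removed.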
df4 = pd.DataFrame({
"sid":["s1","s2", np.nan],
"name":["xiaoming",np.nan, "Mike"]})
df4
| | sid | name |
|---|---|---|
| 0 | s1 | xiaoming |
| 1 | s2 | NaN |
| 2 | NaN | Mike |
df4.dropna()
| | sid | name |
|---|---|---|
| 0 | s1 | xiaoming |
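The default call drops every row with any missing value, which is often stricter than needed. The sketch below shows a few other dropna parameters (how, thresh, axis) on the same data; the result variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df4 = pd.DataFrame({
    "sid": ["s1", "s2", np.nan],
    "name": ["xiaoming", np.nan, "Mike"],
})

# Drop a row only when every value in it is missing
all_missing = df4.dropna(how="all")

# Keep rows that have at least 2 non-missing values
enough_values = df4.dropna(thresh=2)

# Drop columns (instead of rows) that contain any missing value
complete_cols = df4.dropna(axis=1)

print(all_missing.shape, enough_values.shape, complete_cols.shape)
```

Here no row is entirely empty, so how="all" keeps all three rows, while axis=1 removes both columns because each contains one missing value.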
df4.dropna(subset=["name"])
sid |
---|