I. Sorting
1. Sorting by index
sort_index()
Sorting is ascending by default; pass ascending=False to sort in descending order.
import pandas as pd
import numpy as np
# Series
s4 = pd.Series(range(10,15), index=np.random.randint(5, size=5))
print(s4)
# Sort by index (ascending by default); the labels are random,
# so the sorted index looks like, e.g., 0 0 1 3 3
print(s4.sort_index())
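Because the snippet above uses random index labels, its output changes on every run. The following is a minimal sketch with a fixed index (the labels and values are made up for illustration) that shows the ascending default and ascending=False side by side.

```python
import pandas as pd

# Fixed, partly duplicated index labels so the result is reproducible
s = pd.Series([10, 11, 12, 13, 14], index=[3, 0, 1, 3, 0])

asc = s.sort_index()                  # ascending order (the default)
desc = s.sort_index(ascending=False)  # descending order

print(asc.index.tolist())   # [0, 0, 1, 3, 3]
print(desc.index.tolist())  # [3, 3, 1, 0, 0]
```

Note that sort_index() returns a new Series; the original s keeps its order.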
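When working with a DataFrame, pay attention to the axis: axis=0 (the default) sorts by the row index, axis=1 sorts by the column labels.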
import pandas as pd
import numpy as np
# DataFrame
df4 = pd.DataFrame(np.random.randn(3, 5),
                   index=np.random.randint(3, size=3),
                   columns=np.random.randint(5, size=5))
print(df4)
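# Sort the columns by label in descending order; the labels are random,
# so the column order looks like, e.g., 4 2 1 1 0
df4_isort = df4.sort_index(axis=1, ascending=False)
print(df4_isort)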
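To make the effect of the axis argument easier to see, here is a small sketch with fixed labels (the data is invented for illustration). It sorts once along the rows and once along the columns.

```python
import numpy as np
import pandas as pd

# 2 x 3 frame with unsorted row and column labels
df = pd.DataFrame(np.arange(6).reshape(2, 3), index=[1, 0], columns=[2, 0, 1])

by_rows = df.sort_index()                         # axis=0: reorder rows by index label
by_cols = df.sort_index(axis=1, ascending=False)  # axis=1: reorder columns, descending

print(by_rows.index.tolist())    # [0, 1]
print(by_cols.columns.tolist())  # [2, 1, 0]
```

In both cases the values move together with their labels; only the order of rows or columns changes.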
2. Sorting by value
sort_values(by='column name')
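Sorts the rows by the values in one column, which must have a unique name; if another column carries the same name, an error is raised.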
import pandas as pd
import numpy as np
# DataFrame
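# Column labels are a random permutation of 0..4, so label 0 exists exactly once
# (with randint, label 0 could be missing or duplicated, and by=0 would fail)
df4 = pd.DataFrame(np.random.randn(3, 5),
                   index=np.random.randint(3, size=3),
                   columns=np.random.permutation(5))
# Sort the rows by the values in column 0, in descending order
df4_vsort = df4.sort_values(by=0, ascending=False)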
print(df4_vsort)
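The uniqueness requirement can be checked with a small sketch. The frame below is made up for illustration and deliberately has two columns named "a"; in recent pandas versions, sorting by the duplicated name raises a ValueError, while sorting by the unique column "b" works.

```python
import pandas as pd

# Two columns share the name "a"; "b" is unique
df = pd.DataFrame([[3, 1, 2], [1, 2, 3], [2, 3, 1]], columns=["a", "a", "b"])

# Sorting by the unique column works as usual
sorted_b = df.sort_values(by="b", ascending=False)
print(sorted_b["b"].tolist())  # [3, 2, 1]

# Sorting by the duplicated name is ambiguous and fails
try:
    df.sort_values(by="a")
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

If the column names cannot be made unique, one option is to rename the columns before sorting.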
II. Handling missing data
import numpy as np
import pandas as pd
df_data = pd.DataFrame([np.random.randn(3), [1., 2., np.nan],
                        [np.nan, 4., np.nan], [1., 2., 3.]])
print(df_data.head())
1. Detecting missing values: isnull()
import numpy as np
import pandas as pd
df_data = pd.DataFrame([np.random.randn(3), [1., 2., np.nan],
                        [np.nan, 4., np.nan], [1., 2., 3.]])
# isnull() returns a boolean DataFrame: True where a value is missing
print(df_data.isnull())
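The boolean mask returned by isnull() is usually reduced further. As a rough sketch (with a fixed frame invented for illustration), the mask can be summed to count missing values per column, or combined with any() to find the affected rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1., 2., np.nan],
                   [np.nan, 4., np.nan],
                   [1., 2., 3.]])

mask = df.isnull()                       # True where a value is NaN
missing_per_column = mask.sum()          # True counts as 1, so this counts NaNs
rows_with_missing = mask.any(axis=1)     # True for rows with at least one NaN
has_any_missing = bool(mask.any().any()) # is there any NaN at all?

print(missing_per_column.tolist())  # [1, 0, 2]
print(rows_with_missing.tolist())   # [True, True, False]
print(has_any_missing)              # True
```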
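2. Dropping missing data: dropna()
Depending on the axis, dropna() drops the rows (axis=0, the default) or the columns (axis=1) that contain NaN.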
import numpy as np
import pandas as pd
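df_data = pd.DataFrame([np.random.randn(3), [1., 2., np.nan],
                        [np.nan, 4., np.nan], [1., 2., 3.]])
# Drop every row that contains at least one NaN (axis=0 is the default)
print(df_data.dropna())
# Drop every column that contains at least one NaN
print(df_data.dropna(axis=1))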
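Dropping every row with a single NaN can discard a lot of data. The sketch below, on a fixed frame invented for illustration, compares the default behaviour with two other dropna() parameters, how="all" and thresh, which are less strict.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1., 2., np.nan],
                   [np.nan, 4., np.nan],
                   [1., 2., 3.],
                   [np.nan, np.nan, np.nan]])

complete_rows = df.dropna()              # keep rows without any NaN
not_all_missing = df.dropna(how="all")   # drop only rows that are entirely NaN
at_least_two = df.dropna(thresh=2)       # keep rows with >= 2 non-NaN values

print(complete_rows.index.tolist())    # [2]
print(not_all_missing.index.tolist())  # [0, 1, 2]
print(at_least_two.index.tolist())     # [0, 2]
```

Like the sorting methods, dropna() returns a new object and leaves df unchanged.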
3. Filling missing data: fillna()
import numpy as np
import pandas as pd
df_data = pd.DataFrame([np.random.randn(3), [1., 2., np.nan],
                        [np.nan, 4., np.nan], [1., 2., 3.]])
# Replace every NaN with -100.0
print(df_data.fillna(-100.))
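A single constant is not always a sensible replacement. As a sketch on a fixed frame (values invented for illustration), fillna() also accepts a dict that maps column labels to fill values, or a Series such as the column means.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1., 2., np.nan],
                   [np.nan, 4., np.nan],
                   [1., 2., 3.]])

# Different fill value per column: 0.0 for column 0, -1.0 for column 2
filled_by_col = df.fillna({0: 0.0, 2: -1.0})

# Fill each column with its own mean (NaN is skipped when computing the mean)
filled_mean = df.fillna(df.mean())

print(filled_by_col[2].tolist())  # [-1.0, -1.0, 3.0]
print(filled_mean[0].tolist())    # [1.0, 1.0, 1.0]
```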