For a fuller demo, see how the country category is handled in the E-Commerce Data post.
Here we switch to a simpler dataset.
Demo:
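import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Small example frame: a categorical column, a numeric column with a missing value, and a boolean column
df = pd.DataFrame({
    "Person": ["John", "Myla", "Lewis", "John", "Myla"],
    "Age": [24., np.nan, 21., 33, 26],
    "Single": [False, True, True, True, False],
})

# Fit the encoder on the Person column; classes_ holds the sorted unique labels
le = LabelEncoder().fit(df["Person"])
classes = le.classes_

# Label -> integer code mapping, sized from the classes rather than a hard-coded 3
mapping = dict(zip(classes, range(len(classes))))

# Replace the names with their integer codes
df["Person"] = le.transform(df["Person"])
print(df.head())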
Out:
Encoding complete
   Person   Age  Single
0       0  24.0   False
1       2   NaN    True
2       1  21.0    True
3       0  33.0    True
4       2  26.0   False
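The integer codes above follow the sorted order of the labels (John, Lewis, Myla), which is why Myla becomes 2 and Lewis becomes 1. As a minimal sketch of how to inspect that mapping and undo the encoding, the snippet below uses the same five names in a plain list; the variable names `people`, `mapping`, and `restored` are illustrative and not from the original demo.

```python
from sklearn.preprocessing import LabelEncoder

# Same names as the Person column in the demo (illustrative standalone list)
people = ["John", "Myla", "Lewis", "John", "Myla"]

le = LabelEncoder().fit(people)
codes = le.transform(people)

# classes_ is sorted, so each label's position in it is its integer code
mapping = {str(label): int(code) for code, label in enumerate(le.classes_)}
print(mapping)   # {'John': 0, 'Lewis': 1, 'Myla': 2}

# inverse_transform maps the integer codes back to the original labels
restored = le.inverse_transform(codes).tolist()
print(restored)  # ['John', 'Myla', 'Lewis', 'John', 'Myla']
```

Keeping the fitted `le` object around is what makes the round trip possible; the integers alone carry no record of which name they stood for.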