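A Guide to Data Modeling and Building a Custom News Feed
Feature Importance Analysis with a Random Forest Classifier
When building a model, it is essential to know which features actually drive its predictions. The feature importances reported by a random forest classifier can give a more accurate picture of a given feature's real influence. The following code fits a random forest classifier and analyzes its feature importances: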
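import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
# Fit a forest with many trees so the importance estimates are stable.
clf_rf = RandomForestClassifier(n_estimators=1000)
clf_rf.fit(X_train, y_train)
# Mean importance per feature, its name (X_train is a pandas DataFrame),
# and the spread of that importance across the individual trees.
f_importances = clf_rf.feature_importances_
f_names = X_train.columns
f_std = np.std([tree.feature_importances_ for tree in clf_rf.estimators_], axis=0)
# Sort (importance, name, std) triples from most to least important.
zz = zip(f_importances, f_names, f_std)
zzs = sorted(zz, key=lambda x: x[0], reverse=True)
# Keep the top features for plotting (at most 10, fewer if the data has fewer columns).
n_features = min(10, len(zzs))
imps = [x[0] for x in zzs[:n_features]]
labels = [x[1] for x in zzs[:n_features]]
errs = [x[2] for x in zzs[:n_features]]
# Bar chart of the top importances; the error bars show the per-tree spread.
fig, ax = plt.subplots(figsize=(16, 8))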
ax.bar(range(n_features), imps, color="r", yerr=er
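The listing above is cut off partway through the plotting call. As a rough, self-contained sketch of the same analysis, the block below fits a forest on synthetic data and draws the top importances with error bars. Everything not in the original is an assumption made for illustration: the `make_classification` data, the `feat_i` column names, the smaller `n_estimators=100`, the fixed `random_state`, and all axis labels and tick formatting. It is not the author's remaining code.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Assumption: synthetic data stands in for the article's X_train / y_train.
X, y_train = make_classification(
    n_samples=300, n_features=12, n_informative=4, random_state=0
)
# Hypothetical column names, used only so the bars can be labeled.
X_train = pd.DataFrame(X, columns=[f"feat_{i}" for i in range(X.shape[1])])

# Fewer trees than the article's 1000, to keep the sketch quick to run.
clf_rf = RandomForestClassifier(n_estimators=100, random_state=0)
clf_rf.fit(X_train, y_train)

# Mean importance per feature and its standard deviation across trees.
f_importances = clf_rf.feature_importances_
f_std = np.std([tree.feature_importances_ for tree in clf_rf.estimators_], axis=0)

# Indices of the top features, most important first.
n_features = min(10, X_train.shape[1])
order = np.argsort(f_importances)[::-1][:n_features]
imps = f_importances[order]
errs = f_std[order]
labels = list(X_train.columns[order])

# Bar chart with one error bar per feature.
fig, ax = plt.subplots(figsize=(16, 8))
ax.bar(range(n_features), imps, color="r", yerr=errs)
ax.set_xticks(range(n_features))
ax.set_xticklabels(labels, rotation=45, ha="right")
ax.set_ylabel("Importance")
fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the chart. Sorting with `np.argsort` keeps importances, error bars, and labels aligned through a single index array, which is an alternative to the zip-and-sort approach in the original listing.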