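I. Some aspects of xgboost's output have long puzzled me, so I am recording my notes here
1. The output of xgboost's predict interface (the pred_leaf and pred_contribs parameters)
2. The output-related parameters of the training process (evals, evals_result, verbose_eval)
3. How multiclass classification works internally (without going into the source code)
4. Using GBDT for feature combination (GBDT+LR)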
II. Load data and verify each question
For question 1
import xgboost
# load_iris is a multiclass dataset; load_breast_cancer is a binary one
from sklearn.datasets import load_iris, load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder
iris_data = load_iris()
x = iris_data.data
y = iris_data.target
dtrain = xgboost.DMatrix(data=x, label=y, weight=None, missing=None)
params = {
    'booster': 'gbtree',
    'eta': 0.3,
    'gamma': 0,
    'max_depth': 6,
    'objective': 'multi:softmax',
    'num_class': 3,
}
xgb_model = xgboost.train(params=params, dtrain=dtrain, num_boost_round=3)