Steps:
1. Build the objective function (the objective function for Bayesian optimization)
2. Optimize the parameters with Bayesian optimization
3. Fit the model with the best parameters
Build the objective function
The objective function (the quantity the Bayesian optimizer maximizes) is -1 * MSE: run 10-fold cross-validation with cross_val_score and take the mean of the 10 folds' negative MSE as the final objective value.
from xgboost import XGBRegressor
from sklearn.model_selection import cross_val_score
from bayes_opt import BayesianOptimization

def xgb_cv(max_depth, learning_rate, n_estimators, min_child_weight,
           subsample, colsample_bytree, reg_alpha, gamma):
    # The optimizer proposes real-valued points, so cast/clip the parameters
    # that XGBoost requires to be integers or to lie in [0, 1].
    model = XGBRegressor(max_depth=int(max_depth),
                         learning_rate=learning_rate,
                         n_estimators=int(n_estimators),
                         min_child_weight=min_child_weight,
                         subsample=max(min(subsample, 1), 0),
                         colsample_bytree=max(min(colsample_bytree, 1), 0),
                         reg_alpha=max(reg_alpha, 0),
                         gamma=gamma,
                         objective='reg:squarederror',
                         booster='gbtree',
                         random_state=888)
    # Mean 'neg_mean_squared_error' over 10 folds; BayesianOptimization maximizes this value.
    val = cross_val_score(estimator=model, X=x_train, y=y_train,
                          scoring='neg_mean_squared_error', cv=10).mean()
    return val
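A quick sanity check of the objective (a minimal sketch: the parameter values below are arbitrary, and x_train / y_train are the training data used above). The returned value is the mean 'neg_mean_squared_error' over the 10 folds, so it is negative and closer to 0 is better.

# Hypothetical spot check of xgb_cv at one point of the search space.
score = xgb_cv(max_depth=6, learning_rate=0.1, n_estimators=200,
               min_child_weight=1, subsample=0.8, colsample_bytree=0.8,
               reg_alpha=0.1, gamma=0.1)
print(score)  # negative MSE; values closer to 0 indicate a better fit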
Parameter optimization
pbounds: the parameter search space (a lower/upper bound for each parameter)
n_iter: number of Bayesian optimization iterations; init_points: number of random initial points (adds exploration diversity)
By default every parameter is searched as a continuous real value within its bounds
# Parameter search space; every dimension is treated as continuous.
xgb_bo = BayesianOptimization(
    f=xgb_cv,
    pbounds={'max_depth': (1, 10),
             'learning_rate': (0.01, 0.3),
             'n_estimators': (1, 1000),
             'min_child_weight': (0, 20),
             'subsample': (0.001, 1),
             'colsample_bytree': (0.01, 1),
             'reg_alpha': (0.001, 20),
             'gamma': (0.001, 10)})

# 10 random exploration points followed by 100 Bayesian optimization iterations.
xgb_bo.maximize(n_iter=100, init_points=10)
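After maximize() finishes, the optimizer keeps the whole evaluation history. A minimal sketch of inspecting it with the bayes_opt API (xgb_bo.max holds the best target/params pair, xgb_bo.res the per-iteration results):

# Best objective value (mean negative MSE) and the parameters that produced it.
print(xgb_bo.max['target'])
print(xgb_bo.max['params'])

# Full history: one dict per evaluated point, each with 'target' and 'params'.
for i, res in enumerate(xgb_bo.res):
    print(i, res['target'])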
Fit the model with the best parameters
The int() casts inside xgb_cv only apply while evaluating the objective; the optimizer itself searches continuous ranges, so the stored best values are floats and the integer parameters (max_depth, n_estimators) have to be cast to int again.
xgb_bo.max['params']: the dictionary holding the best parameter values
params = xgb_bo.max['params']

# Rebuild the regressor with the best parameters; max_depth and
# n_estimators come back as floats, so cast them to int here.
xgb1 = XGBRegressor(max_depth=int(params['max_depth']),
                    learning_rate=params['learning_rate'],
                    n_estimators=int(params['n_estimators']),
                    min_child_weight=params['min_child_weight'],
                    subsample=params['subsample'],
                    colsample_bytree=params['colsample_bytree'],
                    reg_alpha=params['reg_alpha'],
                    gamma=params['gamma'],
                    objective='reg:squarederror',
                    booster='gbtree',
                    n_jobs=4)
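To fit and check the tuned model, a minimal evaluation sketch (x_train / y_train come from the code above; x_test / y_test are an assumed held-out split not shown in the original):

from sklearn.metrics import mean_squared_error

# Refit on the training data with the best parameters, then score the held-out set.
xgb1.fit(x_train, y_train)
y_pred = xgb1.predict(x_test)
print('test MSE:', mean_squared_error(y_test, y_pred))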