Steps:
1. Build the objective function (the objective function for Bayesian optimization)
2. Optimize the parameters with Bayesian optimization
3. Fit the model with the best parameters
Build the objective function
The objective function (the quantity the Bayesian optimizer maximizes) is -1 * MSE: run 10-fold cross-validation with cross_val_score and take the mean of the 10 folds' negative MSE as the final objective value.
from xgboost import XGBRegressor
from sklearn.model_selection import cross_val_score
from bayes_opt import BayesianOptimization

def xgb_cv(max_depth, learning_rate, n_estimators, min_child_weight,
           subsample, colsample_bytree, reg_alpha, gamma):
    # The optimizer proposes real-valued points, so cast/clip the parameters
    # that XGBoost requires to be integers or to lie in [0, 1].
    model = XGBRegressor(max_depth=int(max_depth),
                         learning_rate=learning_rate,
                         n_estimators=int(n_estimators),
                         min_child_weight=min_child_weight,
                         subsample=max(min(subsample, 1), 0),
                         colsample_bytree=max(min(colsample_bytree, 1), 0),
                         reg_alpha=max(reg_alpha, 0),
                         gamma=gamma,
                         objective='reg:squarederror',
                         booster='gbtree',
                         random_state=888)
    # Mean 'neg_mean_squared_error' over 10 folds; BayesianOptimization maximizes this value.
    val = cross_val_score(estimator=model, X=x_train, y=y_train,
                          scoring='neg_mean_squared_error', cv=10).mean()
    return val
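A quick sanity check of the objective (a minimal sketch: the parameter values below are arbitrary, and x_train / y_train are the training data used above). The returned value is the mean 'neg_mean_squared_error' over the 10 folds, so it is negative and closer to 0 is better.

# Hypothetical spot check of xgb_cv at one point of the search space.
score = xgb_cv(max_depth=6, learning_rate=0.1, n_estimators=200,
               min_child_weight=1, subsample=0.8, colsample_bytree=0.8,
               reg_alpha=0.1, gamma=0.1)
print(score)  # negative MSE; values closer to 0 indicate a better fit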
Parameter optimization
pbounds: the parameter search space (a lower/upper bound for each parameter)
n_iter: number of Bayesian optimization iterations; init_points: number of random initial points (adds exploration diversity)
By default every parameter is searched as a continuous real value within its bounds
# Parameter search space; every dimension is treated as continuous.
xgb_bo = BayesianOptimization(
    f=xgb_cv,
    pbounds={'max_depth': (1, 10),
             'learning_rate': (0.01, 0.3),
             'n_estimators': (1, 1000),
             'min_child_weight': (0, 20),
             'subsample': (0.001, 1),
             'colsample_bytree': (0.01, 1),
             'reg_alpha': (0.001, 20),
             'gamma': (0.001, 10)})

# 10 random exploration points followed by 100 Bayesian optimization iterations.
xgb_bo.maximize(n_iter=100, init_points=10)
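After maximize() finishes, the optimizer keeps the whole evaluation history. A minimal sketch of inspecting it with the bayes_opt API (xgb_bo.max holds the best target/params pair, xgb_bo.res the per-iteration results):

# Best objective value (mean negative MSE) and the parameters that produced it.
print(xgb_bo.max['target'])
print(xgb_bo.max['params'])

# Full history: one dict per evaluated point, each with 'target' and 'params'.
for i, res in enumerate(xgb_bo.res):
    print(i, res['target'])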
Fit the model with the best parameters
The int() casts inside xgb_cv only apply while evaluating the objective; the optimizer itself searches continuous ranges, so the stored best values are floats and the integer parameters (max_depth, n_estimators) have to be cast to int again.
xgb_bo.max['params']: the dictionary holding the best parameter values
params = xgb_bo.max['params']

# Rebuild the regressor with the best parameters; max_depth and
# n_estimators come back as floats, so cast them to int here.
xgb1 = XGBRegressor(max_depth=int(params['max_depth']),
                    learning_rate=params['learning_rate'],
                    n_estimators=int(params['n_estimators']),
                    min_child_weight=params['min_child_weight'],
                    subsample=params['subsample'],
                    colsample_bytree=params['colsample_bytree'],
                    reg_alpha=params['reg_alpha'],
                    gamma=params['gamma'],
                    objective='reg:squarederror',
                    booster='gbtree',
                    n_jobs=4)
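To fit and check the tuned model, a minimal evaluation sketch (x_train / y_train come from the code above; x_test / y_test are an assumed held-out split not shown in the original):

from sklearn.metrics import mean_squared_error

# Refit on the training data with the best parameters, then score the held-out set.
xgb1.fit(x_train, y_train)
y_pred = xgb1.predict(x_test)
print('test MSE:', mean_squared_error(y_test, y_pred))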