Hyperparameter Tuning: Bayesian Optimization (BayesianOptimization)

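This article shows how to use Bayesian optimization to tune the hyperparameters of a support vector machine (SVM) and a random forest (Random Forest). It defines a cross-validation function for each model and applies a Bayesian optimizer to search for the parameter combination that maximizes the SVM's AUC and minimizes the random forest's log loss.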
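Because the optimizer used here only maximizes, "minimizing log loss" has to be expressed as maximizing its negation, which is what scikit-learn's `neg_log_loss` scorer reports. The following is a minimal standard-library sketch of that sign convention. The helper names `log_loss` and `neg_log_loss` and the sample probabilities are illustrative assumptions, not part of the script in this article.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss; lower is better."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def neg_log_loss(y_true, y_prob):
    """Negated log loss; higher is better, so a maximizer can use it."""
    return -log_loss(y_true, y_prob)


# Illustrative labels and predicted probabilities (made up for this sketch).
y_true = [1, 0, 1, 1]
confident = [0.9, 0.1, 0.8, 0.7]
hesitant = [0.6, 0.4, 0.5, 0.5]

print(neg_log_loss(y_true, confident), neg_log_loss(y_true, hesitant))
```

The more confident, correct predictions have the lower log loss and therefore the higher negated score, so maximizing the negated score selects the same model that minimizing the loss would.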


from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier as RFC
from sklearn.svm import SVC

from bayes_opt import BayesianOptimization
from bayes_opt.util import Colours


def get_data():
    """Synthetic binary classification dataset."""
    data, targets = make_classification(
        n_samples=1000,
        n_features=45,
        n_informative=12,
        n_redundant=7,
        random_state=134985745,
    )
    return data, targets


def svc_cv(C, gamma, data, targets):
    """SVC cross validation.
    This function will instantiate a SVC classifier with parameters C and
    gamma. Combined with data and targets this will in turn be used to perform
    cross validation. The result of cross validation is returned.
    Our goal is to find combinations of C and gamma that maximize the roc_auc
    metric.
    """
    estimator = SVC(C=C, gamma=gamma, random_state=2)
    cval = cross_val_score(estimator, data, targets, scoring='roc_auc', cv=4)
    return cval.mean()


def rfc_cv(n_estimators, min_samples_split, max_features, data, targets):
    """Random Forest cross validation.
    This function will instantiate a random forest classifier with parameters
    n_estimators, min_samples_split, and max_features. Combined with data and
    targets this will in turn be used to perform cross validation. The result
    of cross validation is returned.
    Our goal is to find combinations of n_estimators, min_samples_split, and
    max_features that minimize the log loss. Because the optimizer maximizes,
    this function returns the negated log loss ('neg_log_loss').
    """
    estimator = RFC(
        n_estimators=n_estimators,
        min_samples_split=min_samples_split,
        max_features=max_features,
        random_state=2
    )
    cval = cross_val_score(estimator, data, targets, scoring='neg_log_loss', cv=4)
    return cval.mean()


def optimize_svc(data, targets):
    """Apply Bayesian Optimization to SVC parameters."""

    def svc_crossval(expC, expGamma):
        """Wrapper of SVC cross validation.
        Notice how we transform between regular and log scale. While this
        is not technically necessary, it greatly improves the performance
        of the optimizer.
        """
        C = 10 ** expC
        gamma = 10 ** expGamma
        return svc_cv(C=C, gamma=gamma, data=data, targets=targets)

    optimizer = BayesianOptimization(
        f=svc_crossval,
        pbounds={"expC": (-3, 2), "expGamma": (-4, -1)},
        random_state=1234,
        verbose=2
    )
    optimizer.maximize(n_iter=10)

    print("Final result:", optimizer.max)


def optimize_rfc(data, targets):
    """Apply Bayesian Optimization to Random Forest parameters."""

    def rfc_crossval(n_estimators, min_samples_split, max_features):
        """Wrapper of RandomForest cross validation.
        Notice how we ensure n_estimators and min_samples_split are cast
        to integers before we pass them along. Moreover, to keep max_features
        from taking values outside the (0, 1) range, we also clamp it
        accordingly.
        """
        return rfc_cv(
            n_estimators=int(n_estimators),
            min_samples_split=int(min_samples_split),
            max_features=max(min(max_features, 0.999), 1e-3),
            data=data,
            targets=targets,
        )

    optimizer = BayesianOptimization(
        f=rfc_crossval,
        pbounds={
            "n_estimators": (10, 250),
            "min_samples_split": (2, 25),
            "max_features": (0.1, 0.999),
        },
        random_state=1234,
        verbose=2
    )
    optimizer.maximize(n_iter=10)

    print("Final result:", optimizer.max)


if __name__ == "__main__":
    data, targets = get_data()

    print(Colours.yellow("--- Optimizing SVM ---"))
    optimize_svc(data, targets)

    print(Colours.green("--- Optimizing Random Forest ---"))
    optimize_rfc(data, targets)
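The wrappers `svc_crossval` and `rfc_crossval` in the script above convert the raw values proposed by the optimizer before any model sees them: exponents become powers of ten, continuous values become integers, and `max_features` is clamped. The sketch below isolates those conversions as plain functions so they can be checked without training anything. The function names `decode_svc_params` and `decode_rfc_params` are hypothetical and do not appear in the original script.

```python
def decode_svc_params(expC, expGamma):
    """Map log10-scale exponents to the C and gamma values SVC receives."""
    return {"C": 10 ** expC, "gamma": 10 ** expGamma}


def decode_rfc_params(n_estimators, min_samples_split, max_features):
    """Cast and clamp continuous proposals into valid random forest settings."""
    return {
        # int() truncates toward zero, so 137.8 becomes 137.
        "n_estimators": int(n_estimators),
        "min_samples_split": int(min_samples_split),
        # Keep the feature fraction strictly inside (0, 1).
        "max_features": max(min(max_features, 0.999), 1e-3),
    }


print(decode_svc_params(-3, -1))
print(decode_rfc_params(137.8, 2.9, 1.2))
```

Searching over the exponent rather than the raw value means a uniform step in the search space covers each order of magnitude equally, which is the log-scale transformation that the `svc_crossval` docstring refers to.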

 

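Bayesian optimization is an optimization algorithm based on Bayes' theorem that searches a given parameter space for the best parameter combination. In Python, it can be done with the `BayesianOptimization` library.

First, install the library with the following command:

```
pip install bayesian-optimization
```

Next, define an objective function that takes the parameters to be tuned as input and returns an evaluation metric (for example, the model's accuracy or F1 score). Here is a sample objective function:

```python
def target_function(x, y):
    # Put your model training and evaluation code here,
    # using x and y as the parameters to be tuned.
    # Placeholder score so the snippet runs; replace it with a real
    # evaluation metric such as the model's accuracy.
    accuracy = -(x - 0.5) ** 2 - y ** 2
    return accuracy
```

Then define the parameter space, that is, the range of each parameter to be tuned. In `bayes_opt` this is a dictionary that maps each parameter name to a `(lower, upper)` tuple. Here is a sample parameter space:

```python
from bayes_opt import BayesianOptimization

# Define the parameter space
bounds = {'x': (0, 1), 'y': (-1, 1)}
```

Now use the `BayesianOptimization` class to run the search. Here is a sample:

```python
# Create the Bayesian optimization object
optimizer = BayesianOptimization(f=target_function, pbounds=bounds)

# Run the optimization
optimizer.maximize(init_points=5, n_iter=10)

# Print the best parameters and the corresponding metric
print(optimizer.max)
```

In the code above, `init_points` is the number of initial random sampling points and `n_iter` is the number of optimization iterations. When the run finishes, `optimizer.max` holds the best parameters and the metric they achieved.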
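`optimizer.max` is a dictionary with a `target` entry (the best score) and a `params` entry (the parameter values that produced it). When the search ran on a log scale, as with `expC` and `expGamma` in the SVM example, those values still need converting back before they can be passed to a model. The sketch below shows one way to do that. The helper name `best_svc_settings` is hypothetical, and the numbers in `example_best` are made up for illustration; they are not output from a real run.

```python
def best_svc_settings(best):
    """Convert an optimizer.max-style dict back to SVC's natural scale."""
    params = best["params"]
    return {
        "score": best["target"],
        "C": 10 ** params["expC"],
        "gamma": 10 ** params["expGamma"],
    }


# Made-up values with the same shape as optimizer.max (illustration only).
example_best = {"target": 0.9, "params": {"expC": 1.0, "expGamma": -2.0}}

settings = best_svc_settings(example_best)
print(settings)
```

With a real run, `optimizer.max` would be passed in place of `example_best`, and the resulting `C` and `gamma` could then be used to fit a final `SVC` on the full training data.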