XGBoost and LightGBM

Latest recommended article published on 2025-06-15 20:46:55

chenguiyuan1234

License: CC 4.0 BY-SA

Category: python

Article link: https://blog.youkuaiyun.com/chenguiyuan1234/article/details/87913290

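This article covers the key parameters, workflow, and practical use of XGBoost and LightGBM. XGBoost's key parameters include eta and max_depth, while LightGBM parameters such as num_iterations and num_leaves have a significant effect on model performance. It also covers how to train, predict with, save, and load models, along with parameter tuning strategies for both libraries.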

This article draws mainly on the following pages:
https://cloud.tencent.com/developer/article/1389899
https://cloud.tencent.com/developer/article/1052678
https://cloud.tencent.com/developer/article/1052664

XGBoost
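1. Key parameters explained
booster [default=gbtree]: the base learner to use, gbtree or gblinear
nthread: number of threads
eta [default=0.3]: shrinkage step size (learning rate) applied to each new tree; smaller values help prevent overfitting
max_depth [default=6]: maximum depth of a tree
min_child_weight [default=1]: minimum sum of instance weights required in a child node
subsample [default=1]: fraction of the full training set sampled to train the model
lambda [default=1]: L2 regularization penalty coefficient
alpha [default=0]: L1 regularization penalty coefficient
objective [default=reg:linear]: defines the learning task and the corresponding learning objective (recent XGBoost releases name this objective reg:squarederror)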
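The available objective functions are:
“reg:linear”: linear regression.
“reg:logistic”: logistic regression.
“binary:logistic”: logistic regression for binary classification; the output is a probability.
“binary:logitraw”: logistic regression for binary classification; the output is the raw score wTx before the logistic transformation.
“count:poisson”: Poisson regression for count data; the output is the mean of the Poisson distribution. In Poisson regression, max_delta_step defaults to 0.7.
“multi:softmax”: makes XGBoost use the softmax objective for multiclass classification; the parameter num_class (number of classes) must also be set.
“multi:softprob”: same as softmax, but the output is a vector of ndata * nclass values that can be reshaped into a matrix with ndata rows and nclass columns. Each row gives the probabilities that the sample belongs to each class.
“rank:pairwise”: sets XGBoost to do a ranking task by minimizing the pairwise loss.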
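
To make the difference between the objective outputs concrete, here is a minimal sketch in plain Python that does not call XGBoost. It assumes the raw score and the probability are related by the logistic (sigmoid) function, and that a multi:softprob result arrives as a flat list as described above (depending on the XGBoost version, it may already be returned as a matrix). The helper names and all numbers are hypothetical and used only for illustration.

```python
import math


def sigmoid(margin):
    """Map a raw score (as binary:logitraw would output) to a probability."""
    return 1.0 / (1.0 + math.exp(-margin))


def reshape_softprob(flat, n_class):
    """Split a flat list of ndata * nclass probabilities into ndata rows."""
    return [flat[i:i + n_class] for i in range(0, len(flat), n_class)]


# Hypothetical raw scores, in the form binary:logitraw reports them
raw_margins = [-2.0, 0.0, 1.5]
# The corresponding probabilities, in the form binary:logistic reports them
probabilities = [sigmoid(m) for m in raw_margins]

# Hypothetical flattened multi:softprob output for 2 samples and 3 classes
flat_probs = [0.7, 0.2, 0.1, 0.1, 0.3, 0.6]
prob_matrix = reshape_softprob(flat_probs, n_class=3)

# Taking the most probable class per row gives the label multi:softmax reports
predicted_classes = [row.index(max(row)) for row in prob_matrix]

print(probabilities)
print(prob_matrix)
print(predicted_classes)
```

In short, binary:logistic and multi:softprob are the choice when probabilities are needed downstream, while binary:logitraw and multi:softmax return scores or hard labels directly.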
eval_metric [default according to objective]: the evaluation metric used on the validation data
“rmse”: root mean square error
“logloss”: negative log-likelihood
“error”: Binary classification error rate
“merror”: Multiclass classification error rate.
“mlogloss”: Multiclass logloss.
“auc”: Area under the curve for ranking evaluation.
“ndcg”: Normalized Discounted Cumulative Gain
“map”: Mean average precision
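
As a rough illustration of what a few of these metrics measure, the sketch below computes rmse, logloss, and error by hand in plain Python. It assumes binary labels in {0, 1}, a 0.5 decision threshold for error, and a small clipping constant eps to avoid log(0); the function names and sample data are hypothetical. The params dictionary at the end only shows how the metric names would typically be supplied alongside the parameters above (for example as the first argument to xgboost.train); it is not used here.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def logloss(y_true, prob, eps=1e-15):
    """Mean negative log-likelihood for binary labels in {0, 1}."""
    total = 0.0
    for t, p in zip(y_true, prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


def error_rate(y_true, prob, threshold=0.5):
    """Share of samples whose thresholded prediction disagrees with the label."""
    wrong = sum(1 for t, p in zip(y_true, prob) if (p > threshold) != (t == 1))
    return wrong / len(y_true)


# Hypothetical validation labels and predicted probabilities
labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]

print("rmse   :", rmse(labels, probs))
print("logloss:", logloss(labels, probs))
print("error  :", error_rate(labels, probs))

# Illustrative parameter dictionary combining settings from this section
params = {
    "objective": "binary:logistic",
    "eta": 0.3,
    "max_depth": 6,
    "eval_metric": ["logloss", "error"],
}
```

Computing a metric by hand on a small sample is a quick way to confirm that the number reported during training means what you expect.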