Alibaba Cloud Tianchi Financial Risk Control Training Camp [Task5 Model Fusion] Study Notes

优岚岚

Tags: machine learning, python, artificial intelligence, deep learning, data mining

Article link: https://blog.youkuaiyun.com/weixin_49270402/article/details/116400621

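These study notes cover the model fusion techniques from Task5 of the Alibaba Cloud Tianchi financial risk control training camp: averaging, voting, stacking, and blending. Averaging comes in simple and weighted forms; voting comes in hard and soft forms. Stacking improves predictive performance through multiple layers of models, while blending predicts by combining predicted values with the original features. The notes discuss how to choose a fusion method and compare the strengths and weaknesses of stacking and blending.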

Financial Risk Control Training Camp Task5: Model Fusion Study Notes

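These study notes cover material from the financial risk control training camp of the Alibaba Cloud Tianchi Longzhu Plan (龙珠计划). Course link: https://tianchi.aliyun.com/specials/activity/promotion/aicampfr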

I. Overview of Key Points

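II. Study Content

1. Ways to fuse models

Averaging: simple averaging / weighted averaging
Voting: simple voting / weighted voting
Combined: rank fusion / log fusion
stacking: build multiple layers of models, then fit a further model on their predictions
blending: train on one part of the data, take the resulting predictions as new features, and carry them into the remaining data for prediction
boosting/bagging (covered in Task4)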
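The list above names rank fusion without describing it. The sketch below assumes the term refers to the common rank-averaging technique, where each model's scores are replaced by their ranks before averaging, so that models with different score scales contribute equally. The function name `rank_average` and the sample scores are hypothetical, and ties are broken by position rather than averaged. Log fusion is not illustrated because the notes do not define it.

```python
import numpy as np

def rank_average(preds):
    """Average the per-model ranks of each sample, scaled to [0, 1]."""
    preds = np.asarray(preds, dtype=float)
    # Applying argsort twice gives each sample's ordinal rank
    # (0 = lowest score) within its own model.
    ranks = preds.argsort(axis=1).argsort(axis=1)
    return ranks.mean(axis=0) / (preds.shape[1] - 1)

# Hypothetical scores: same ordering of samples, very different scales
pre1 = np.array([0.10, 0.80, 0.40, 0.60])
pre2 = np.array([0.02, 0.09, 0.03, 0.05])

fused = rank_average([pre1, pre2])
print(fused)
```

Because only the ordering matters, this kind of fusion is usually considered when the evaluation metric is rank-based, such as AUC.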

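2. Averaging

Commonly used

Fast and simple

Simple averaging takes the plain mean of the predictions

pre = (pre1 + pre2 + pre3 + … + pren) / n

Weighted averaging takes a weighted mean of the predictions, with weights that sum to 1, for example

pre = 0.3 * pre1 + 0.3 * pre2 + 0.4 * pre3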
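As a minimal sketch of the two formulas, the snippet below averages three prediction vectors with NumPy. The arrays `pre1`, `pre2`, `pre3` hold made-up probabilities for four samples, and the helper names are hypothetical. The weights 0.3, 0.3, 0.4 are the ones from the example formula.

```python
import numpy as np

# Hypothetical predicted default probabilities from three models on four samples
pre1 = np.array([0.10, 0.80, 0.40, 0.60])
pre2 = np.array([0.20, 0.70, 0.50, 0.90])
pre3 = np.array([0.30, 0.90, 0.30, 0.60])

def simple_average(preds):
    """Plain mean of the model predictions, one value per sample."""
    return np.mean(np.stack(preds), axis=0)

def weighted_average(preds, weights):
    """Weighted mean; np.average divides by the sum of the weights."""
    return np.average(np.stack(preds), axis=0, weights=weights)

pre_simple = simple_average([pre1, pre2, pre3])
pre_weighted = weighted_average([pre1, pre2, pre3], [0.3, 0.3, 0.4])
print(pre_simple)
print(pre_weighted)
```

In practice the weights are often chosen from each model's validation score, giving the stronger model the larger weight.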

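3. Voting

Simple voting (hard voting classifier)

Aggregates the class predicted by each classifier and takes the class with the most votes as the final prediction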

from xgboost import XGBClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, VotingClassifier

# Three different base classifiers
clf1 = LogisticRegression(random_state=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = XGBClassifier(learning_rate=0.1, n_estimators=150, max_depth=4, min_child_weight=2, subsample=0.7, objective='binary:logistic')

# voting='hard' is the default: each model casts one vote per sample
vclf = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('xgb', clf3)], voting='hard')
# x_train, y_train and x_test are assumed to be defined beforehand
vclf = vclf.fit(x_train, y_train)
print(vclf.predict(x_test))
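To make the majority rule concrete, here is a small hand-written sketch of hard voting on made-up labels. The `votes` array and the `hard_vote` helper are hypothetical and not part of scikit-learn. On a tie, `argmax` picks the smallest label.

```python
import numpy as np

# Hypothetical class labels (0 = no default, 1 = default)
# predicted by three classifiers for four samples
votes = np.array([
    [0, 1, 1, 0],  # classifier 1
    [0, 1, 0, 0],  # classifier 2
    [1, 1, 0, 1],  # classifier 3
])

def hard_vote(votes):
    """Return the most frequent label for each sample (each column)."""
    return np.array([np.bincount(col).argmax() for col in votes.T])

majority = hard_vote(votes)
print(majority)
```

Each classifier counts equally here, no matter how confident it was in its prediction.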

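Weighted voting (soft voting classifier)

Averages the probabilities that all models predict for each class; the class with the highest average probability is the final prediction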
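The probability averaging can be sketched by hand as follows. The `probas` array is made-up data and `soft_vote` is a hypothetical helper. The optional `weights` argument is an assumption added for illustration, in case some models should count more than others.

```python
import numpy as np

# Hypothetical per-class probabilities, shape (n_models, n_samples, n_classes)
probas = np.array([
    [[0.90, 0.10], [0.40, 0.60], [0.55, 0.45]],  # model 1
    [[0.80, 0.20], [0.30, 0.70], [0.60, 0.40]],  # model 2
    [[0.60, 0.40], [0.60, 0.40], [0.20, 0.80]],  # model 3
])

def soft_vote(probas, weights=None):
    """Average class probabilities over models, then pick the top class."""
    avg = np.average(probas, axis=0, weights=weights)
    return avg, avg.argmax(axis=1)

avg_proba, labels = soft_vote(probas)
print(avg_proba)
print(labels)
```

For the third sample, two of the three models lean toward class 0, so a majority vote would give 0. The third model is confident enough in class 1 to pull the average probability above 0.5, so soft voting returns class 1.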

from xgboost import XGBClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
clf1 = LogisticRegression(random_state=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = XGBClassifier(learning_rate=0.1, n_estimators=150, max_depth=4, min_child_weight