python_xgboost Example 02: predict_leaf_indices (obtaining leaf indices)

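This article uses an XGBoost example to show how to obtain the index of the leaf node that each sample lands in for every tree, and walks through the prediction process: loading the data, training the model, and then predicting leaf indices, first for the first two trees and then for all trees. Looking at leaf indices is a useful way to understand how GBDT works.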

All example data for this column is available at https://download.youkuaiyun.com/download/u012338969/85439555
%matplotlib inline

Demo for obtaining leaf index

import xgboost as xgb

# load the data and train the model
dtrain = xgb.DMatrix('./data/agaricus.txt.train')
dtest = xgb.DMatrix('./data/agaricus.txt.test')
param = {'max_depth': 2, 'eta': 1, 'objective': 'binary:logistic'}
watchlist = [(dtest, 'eval'), (dtrain, 'train')]
num_round = 20
# pass the watchlist via the `evals` keyword; passing it positionally raises a FutureWarning
bst = xgb.train(param, dtrain, num_round, evals=watchlist)
print('start testing predict the leaf indices')
[0]	eval-logloss:0.22669	train-logloss:0.23338
[1]	eval-logloss:0.13787	train-logloss:0.13666
[2]	eval-logloss:0.08046	train-logloss:0.08253
[3]	eval-logloss:0.05833	train-logloss:0.05647
[4]	eval-logloss:0.03829	train-logloss:0.04151
[5]	eval-logloss:0.02663	train-logloss:0.02961
[6]	eval-logloss:0.01388	train-logloss:0.01919
[7]	eval-logloss:0.01020	train-logloss:0.01332
[8]	eval-logloss:0.00848	train-logloss:0.01113
[9]	eval-logloss:0.00692	train-logloss:0.00663
[10]	eval-logloss:0.00544	train-logloss:0.00504
[11]	eval-logloss:0.00445	train-logloss:0.00420
[12]	eval-logloss:0.00336	train-logloss:0.00356
[13]	eval-logloss:0.00277	train-logloss:0.00281
[14]	eval-logloss:0.00252	train-logloss:0.00244
[15]	eval-logloss:0.00177	train-logloss:0.00194
[16]	eval-logloss:0.00157	train-logloss:0.00161
[17]	eval-logloss:0.00135	train-logloss:0.00142
[18]	eval-logloss:0.00123	train-logloss:0.00125
[19]	eval-logloss:0.00107	train-logloss:0.00107
start testing predict the leaf indices

# predict using the first 2 trees
leafindex = bst.predict(
    dtest, iteration_range=(0, 2), pred_leaf=True, strict_shape=True
)
print(leafindex.shape)

(1611, 2, 1, 1)
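
With strict_shape=True and pred_leaf=True, the four axes of the result are (samples, boosting rounds, classes, parallel trees per round). Here that gives 1611 test rows, 2 rounds, 1 output group for binary classification, and 1 tree per round. The sketch below shows one way to flatten such an array into the more familiar (samples, trees) layout. It uses a small made-up array as a stand-in for leafindex, so the values are illustrative only.

```python
import numpy as np

# Toy stand-in for the strict_shape=True result (values are made up):
# 3 samples, 2 boosting rounds, 1 class, 1 parallel tree.
leaf_strict = np.array([4, 3, 3, 3, 5, 4], dtype=np.float32).reshape(3, 2, 1, 1)

n_samples = leaf_strict.shape[0]
# Collapse the round, class and parallel-tree axes into one column per tree,
# and cast to int because leaf indices are node ids, not real numbers.
leaf_2d = leaf_strict.reshape(n_samples, -1).astype(np.int32)

print(leaf_2d.shape)
print(leaf_2d)
```

Each row of leaf_2d is one sample and each column is one tree, which is the same layout that predict returns when strict_shape is left at its default.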
print(leafindex)
[[[[4.]]

  [[3.]]]


 [[[3.]]

  [[3.]]]


 [[[4.]]

  [[3.]]]


 ...


 [[[3.]]

  [[3.]]]


 [[[5.]]

  [[4.]]]


 [[[3.]]

  [[3.]]]]
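
Each printed value is the id of the node, within its own tree, where the sample ends up. With max_depth set to 2 a tree has at most 4 leaves, so only a handful of distinct ids appear per tree. As a rough sketch of how to read such output, the snippet below counts how many samples fall into each leaf of each tree. The toy matrix copies only the six rows visible in the truncated printout above, already flattened to (samples, trees), so the counts do not describe the full test set.

```python
import numpy as np

# The six rows visible in the printout, flattened to (samples, trees).
# This is only a small illustrative subset, not the full 1611 rows.
leaf_2d = np.array([[4, 3], [3, 3], [4, 3], [3, 3], [5, 4], [3, 3]])

occupancy_per_tree = []
for tree_id in range(leaf_2d.shape[1]):
    # Distinct leaf ids in this tree and the number of samples in each.
    leaves, counts = np.unique(leaf_2d[:, tree_id], return_counts=True)
    occupancy = {int(leaf): int(count) for leaf, count in zip(leaves, counts)}
    occupancy_per_tree.append(occupancy)
    print(f"tree {tree_id}: {occupancy}")
```

Applied to the real leafindex array (after flattening it to two dimensions), the same loop shows how the test samples are distributed over the leaves of every tree.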
# predict all trees
leafindex = bst.predict(dtest, pred_leaf=True)
print(leafindex.shape)
(1611, 20)
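
Row i, column t of this matrix is the leaf that sample i reaches in tree t. One common use of such a matrix, which the original example does not cover, is to one-hot encode the leaf ids so that they can be fed to another model as features. The snippet below is a minimal sketch of that idea on a made-up 3 x 2 matrix. The variable names are illustrative, and the set of leaves is taken from the toy data itself.

```python
import numpy as np

# Toy leaf-index matrix (made-up values): rows are samples, columns are trees.
leaf_2d = np.array([[4, 3], [3, 3], [5, 4]])

columns = []
for tree_id in range(leaf_2d.shape[1]):
    # Leaf ids observed for this tree, in sorted order.
    leaves = np.unique(leaf_2d[:, tree_id])
    # Broadcasting (n, 1) == (k,) gives one indicator column per observed leaf.
    columns.append((leaf_2d[:, [tree_id]] == leaves).astype(np.int8))

one_hot = np.hstack(columns)
print(one_hot.shape)
print(one_hot)
```

Every row of one_hot contains exactly one 1 per tree. In practice the list of leaves per tree should be fixed from the training data and reused for the test data, so that both sets get the same columns.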
