XGBoost Case Study

Latest recommended article published on 2024-10-21 22:00:55

小橙不吃辣椒

License: CC 4.0 BY-SA

Category: Machine Learning. Tags: Artificial Intelligence, Machine Learning

Article link: https://blog.youkuaiyun.com/Jochen12345/article/details/139740981

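I recently found some XGBoost-related code on Kaggle that makes good material for hands-on learning.

The code uses XGBoost for large-scale molecule binding prediction, and I added TPU detection through TensorFlow. Note that XGBoost itself trains on CPU or GPU; the TPU setup only affects TensorFlow code.

Competition: NeurIPS 2024 - Predict New Medicines with BELKA | Kaggle

Training sample code:

The figure above shows the preprocessed training samples. The three rightmost columns are the binding labels for the proteins we want to predict.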

import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import pickle
import random, os, gc
from scipy import sparse
from sklearn.metrics import average_precision_score
from sklearn.feature_selection import VarianceThreshold
from xgboost import DMatrix
import xgboost as xgb
from sklearn.model_selection import StratifiedKFold
import tensorflow as tf

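This part imports the required libraries: NumPy, Pandas, Pickle, SciPy, Scikit-learn, XGBoost, and TensorFlow. `VarianceThreshold` is a Scikit-learn feature selector that removes every feature whose variance does not exceed a given threshold (by default, features that are constant across all samples).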

# Detect hardware, return appropriate distribution strategy
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver.connect(tpu="local")  # "local" for 1VM TPU
    strategy = tf.distribute.TPUStrategy(tpu)
    print("Running on TPU")
    print("REPLICAS: ", strategy.num_replicas_in_sync)
except tf.errors.NotFoundError:
    strategy = tf.distribute.get_strategy()  # Default strategy for CPU/GPU
    print("Not on TPU, running on ", strategy)

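I added this TPU detection on top of the original code. It tries to connect to a TPU cluster: if the connection succeeds it uses `TPUStrategy`; otherwise it falls back to the default distribution strategy.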

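1. TPUClusterResolver:
- `TPUClusterResolver` is the TensorFlow class used to connect to a TPU cluster. Here, `TPUClusterResolver.connect(tpu="local")` tries to connect to the local TPU. The argument `"local"` means the TPU is attached to the same single virtual machine (1VM) that runs the code.
- If the connection succeeds, a `TPUStrategy` object named `strategy` is created for distributed training on the TPU.

2. TPUStrategy:
- `TPUStrategy` is the TensorFlow distribution strategy designed for TPUs. It manages and distributes computation across the TPU cores and provides tools and interfaces that simplify training a model on a TPU.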

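3. Fallback to CPU/GPU:
- If no TPU can be reached (a `tf.errors.NotFoundError` is caught), the code calls `tf.distribute.get_strategy()`, which returns the strategy currently in effect.
- Outside any strategy scope, `get_strategy()` returns TensorFlow's default strategy, which runs on a single device with one replica. It is not `MirroredStrategy`; to train on several GPUs in one machine, `tf.distribute.MirroredStrategy()` has to be created explicitly.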

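4. Output:
- If the connection succeeds, the code prints "Running on TPU" followed by the number of replicas on the TPU.
- If the connection fails (no TPU found), it prints "Not on TPU, running on " followed by the default strategy object.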
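As a quick check of the fallback path, here is a minimal sketch that inspects what `tf.distribute.get_strategy()` returns outside a strategy scope and shows one way to opt in to `MirroredStrategy`. The `len(gpus) > 1` condition and the variable names are illustrative assumptions, not part of the original notebook.

```python
import tensorflow as tf

# Outside any strategy scope, get_strategy() returns the default strategy.
default_strategy = tf.distribute.get_strategy()
print("strategy class:", type(default_strategy).__name__)
print("replicas:", default_strategy.num_replicas_in_sync)
print("is MirroredStrategy:",
      isinstance(default_strategy, tf.distribute.MirroredStrategy))

# Assumption for illustration: use MirroredStrategy only when more than
# one GPU is visible; otherwise keep the default single-device strategy.
gpus = tf.config.list_logical_devices("GPU")
if len(gpus) > 1:
    strategy = tf.distribute.MirroredStrategy()
else:
    strategy = default_strategy

# Variables created inside the scope are placed according to the strategy.
with strategy.scope():
    weight = tf.Variable(1.0)

print("replicas in use:", strategy.num_replicas_in_sync)
```

Any TensorFlow model would be built inside `strategy.scope()` in the same way. XGBoost training calls are not affected by the chosen strategy.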

In short, this code shows how to use `TPUClusterResolver` and `TPUStrategy` in TensorFlow for distributed training, and how to fall back to CPU/GPU when no TPU is available.
Example output