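pytorch-frame: an open-source tabular deep learning library for PyTorch, a modular deep learning framework for building neural network models on heterogeneous tabular data.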

Latest recommended article published on 2025-12-21 10:13:38

Original

Licensed under CC 4.0 BY-SA

Article tags:

#deep-learning #pytorch #neural-networks #python

1. Software Introduction

Program and source code downloads are provided at the end of the article.

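PyTorch Frame is a deep learning extension for PyTorch, designed for heterogeneous tabular data with different column types, including numerical, categorical, time, text, and images. It offers a modular framework for implementing existing and future methods. The library features methods from state-of-the-art models, user-friendly mini-batch loaders, benchmark datasets, and interfaces for custom data integration.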

2. Library Highlights

PyTorch Frame builds directly upon PyTorch, ensuring a smooth transition for existing PyTorch users. Key features include:

Diverse column types: PyTorch Frame supports learning across various column types: numerical, categorical, multicategorical, text_embedded, text_tokenized, timestamp, image_embedded, and embedding. See the detailed tutorial for more.
Modular model design: Enables modular deep learning model implementations, promoting reusability, clear coding, and experimentation flexibility. Further details are in the architecture overview.
Models: Implements many state-of-the-art deep tabular models as well as strong GBDTs (XGBoost, CatBoost, and LightGBM) with hyper-parameter tuning.
Datasets: Comes with a collection of readily usable tabular datasets. Also supports custom datasets to solve your own problem. We benchmark deep tabular models against GBDTs.
PyTorch integration: Integrates with other PyTorch libraries, enabling end-to-end training of PyTorch Frame with downstream PyTorch models. For example, by integrating with PyG, a PyTorch library for GNNs, we can perform deep learning over relational databases. Learn more in RelBench and the example code.
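
To make the idea of per-column semantic types concrete, here is a minimal sketch in plain Python. It does not use PyTorch Frame: the `SemanticType` enum and the `guess_type` heuristic are hypothetical names introduced only for illustration, and the enum covers just a subset of the column types listed above.

```python
from datetime import datetime
from enum import Enum


class SemanticType(Enum):
    # Hypothetical enum; names mirror a subset of the column types listed above.
    NUMERICAL = "numerical"
    CATEGORICAL = "categorical"
    MULTICATEGORICAL = "multicategorical"
    TIMESTAMP = "timestamp"


def guess_type(values):
    """Hypothetical heuristic: pick a semantic type from a column's values."""
    if all(isinstance(v, datetime) for v in values):
        return SemanticType.TIMESTAMP
    if all(isinstance(v, (list, tuple, set)) for v in values):
        return SemanticType.MULTICATEGORICAL
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return SemanticType.NUMERICAL
    # Fall back to treating everything else as a category label.
    return SemanticType.CATEGORICAL


# A tiny heterogeneous table stored column by column (made-up values).
table = {
    "age": [34, 51, 27],
    "plan": ["free", "pro", "free"],
    "tags": [["a", "b"], ["b"], []],
    "signup": [datetime(2024, 1, 5), datetime(2024, 3, 9), datetime(2024, 7, 1)],
}

col_to_type = {name: guess_type(col) for name, col in table.items()}
for name, sem_type in col_to_type.items():
    print(f"{name}: {sem_type.value}")
```

The point of such a mapping is that each column can later be routed to an encoder suited to its type, rather than forcing every column into one numeric representation.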

3. Architecture Overview

Models in PyTorch Frame follow a modular design of FeatureEncoder, TableConv, and Decoder, as shown in the figure below:

In essence, this modular setup lets users experiment with a wide range of architectures:

Materialization converts the raw pandas DataFrame into a TensorFrame that is amenable to PyTorch-based training and modeling.
FeatureEncoder encodes the TensorFrame into hidden column embeddings of size [batch_size, num_cols, channels].
TableConv models column-wise interactions over the hidden embeddings.
Decoder generates an embedding or prediction per row.
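
The data flow through these stages can be traced with a dependency-free toy in plain Python. This is only a sketch of the shapes involved, not PyTorch Frame's implementation: the functions `encode`, `table_conv`, and `decode`, the random weights, and the simple mean-based column mixing are all assumptions made for illustration, whereas the real components are learnable PyTorch modules.

```python
import random

random.seed(0)
batch_size, num_cols, channels = 4, 3, 8

# Stand-in for materialized input: one float per cell, shape [batch_size, num_cols].
rows = [[random.random() for _ in range(num_cols)] for _ in range(batch_size)]

# Toy encoder weights: each column lifts its scalar to a vector of `channels` values.
enc_w = [[random.gauss(0, 1) for _ in range(channels)] for _ in range(num_cols)]
# Toy decoder weights: project a pooled row vector to a single number.
dec_w = [random.gauss(0, 1) for _ in range(channels)]


def encode(rows):
    # [batch_size, num_cols] -> [batch_size, num_cols, channels]
    return [[[x * w for w in enc_w[c]] for c, x in enumerate(row)] for row in rows]


def table_conv(x):
    # Column-wise interaction: add the mean over columns to every column embedding.
    # Input and output shapes are identical, so layers can be stacked.
    out = []
    for row in x:
        mean = [sum(col[k] for col in row) / len(row) for k in range(channels)]
        out.append([[v + m for v, m in zip(col, mean)] for col in row])
    return out


def decode(x):
    # Pool over columns, then project: [batch_size, num_cols, channels] -> [batch_size]
    preds = []
    for row in x:
        pooled = [sum(col[k] for col in row) / len(row) for k in range(channels)]
        preds.append(sum(p * w for p, w in zip(pooled, dec_w)))
    return preds


def shape(t):
    # Helper that reports the nesting sizes of a regular nested list.
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return dims


h = encode(rows)
h = table_conv(table_conv(h))  # two stacked interaction layers
y = decode(h)
print(shape(rows), shape(h), shape(y))  # [4, 3] [4, 3, 8] [4]
```

Because the interaction step preserves the [batch_size, num_cols, channels] shape, any number of such layers can sit between the encoder and the decoder, which is what makes the stages interchangeable.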

4. Quick Tour

In this quick tour, we show how to create and train a deep tabular model with only a few lines of code.

Build and train your own deep tabular model

As an example, we implement a simple ExampleTransformer following the modular architecture of PyTorch Frame. In the example below:

self.encoder maps an input TensorFrame to an embedding of size [batch_size, num_cols, channels].
self.convs iteratively transforms the embedding of size [batch_size, num_cols, channels] into an embedding of the same size.