A quick tour of Torch internals

Recently, I have been kind of lost. I couldn’t find anything for myself to work on and had no ideas for new projects (apparently, I just had to wait for the new academic year to start - I have plenty of ideas now, but no time for them).

Anyway, I often get the impression that many people use Machine Learning libraries as black boxes with only a high-level API. It’s as if they weren’t interested at all in how they work, but solely in the output (this is why I like Torch so much - it’s hackable to the bone). I’ve been using Torch for a few months now and I’ve always been curious how it’s built. This is why I decided to get down to it and browse the code of the TH library, which is at the core of Torch.

It’s really a great thing to do. I’ll write more about it at the end of this post, but you should seriously consider doing it with your favourite library or framework too.

Torch’s source is written in plain C, which was very pleasing to me. I don’t really like many C++ features and although I find it very powerful and flexible, it often seems confusing. C’s extremely minimal syntax allows you to read and quickly grasp what exactly happens at any moment. However, if C++ is the way to go for you, there is also a wrapper around TH called thpp.

Where can you get it?

You can find the TH library in two places:

  • in the main Torch repository, under torch7/lib/TH
  • in a standalone repository at github.com/torch/TH

The folder structure is very simple. There are some cmake tests and definitions in cmake directory while the code is located both in generic directory and at the repo root.

Interesting findings

Before going into the details and describing functionality implemented in individual files I’d like to point out some really cool techniques that I’ve found in the implementation.

Code generation

First thing that appeared really strange to me was that many files existed both in the root folder as well as in generic. If you opened them, you would quickly notice that copies in generic contain the actual code, while at the root they all look very similar. Here is THStorage.c for example:

#include "THAtomic.h"
#include "THStorage.h"

#include "generic/THStorage.c"
#include "THGenerateAllTypes.h"

#include "generic/THStorageCopy.c"
#include "THGenerateAllTypes.h"

Quite unusual for a .c file, right?

THGenerateAllTypes sounds interesting, so I looked it up, and this is what I found:

#ifndef TH_GENERIC_FILE
#error "You must define TH_GENERIC_FILE before including THGenerateAllTypes.h"
#endif

#define real unsigned char
#define accreal long
#define Real Byte
#define TH_REAL_IS_BYTE
#line 1 TH_GENERIC_FILE
#include TH_GENERIC_FILE
#undef real
#undef accreal
#undef Real
#undef TH_REAL_IS_BYTE

#define real char
#define accreal long
#define Real Char
#define TH_REAL_IS_CHAR
#line 1 TH_GENERIC_FILE
#include TH_GENERIC_FILE
#undef real
#undef accreal
#undef Real
#undef TH_REAL_IS_CHAR
...

This continues for a few more types. At first I was puzzled, but then I suddenly realized what it does and how brilliant it is! There are no templates in C, but objects like THStorage should be available for different types. It would be a terrible waste to repeat the same implementation with just a few words replaced, and that is exactly what this piece avoids! In the generic files you can see variables of type real all over the place. At first I assumed that real was simply chosen at compile time to be either a float or a double as some kind of optimization, but apparently it’s different - it allows code generation for many other types too!

Clever usage of macros also makes the generic files more readable. Take this example from generic/THStorage.c:

real* THStorage_(data)(const THStorage *self)

It looks nice, but what about name conflicts for different types? It can’t be THStorage and THStorage_data all the time! Worry not, macros take care of that as well:

#define THStorage        TH_CONCAT_3(TH,Real,Storage)
#define THStorage_(NAME) TH_CONCAT_4(TH,Real,Storage_,NAME)

During preprocessing this function name will be expanded to something like THByteStorage_data, and THStorage will be replaced with THByteStorage. Super cool x2!

It’s also smart to use the #line 1 TH_GENERIC_FILE directive: if there are any errors, the compiler reports them as if they were in the original generic file - not in the middle of the implementation pasted over and over.

I think that these are some awesome ways to make C code more type-agnostic.

OOP & Virtual tables

TH also implements a file API, where you can find good examples of how you could implement some basic OOP patterns in C. There are four files that I’ll be talking about here:

  • THFilePrivate.h - defines basic structs
  • THFile.c - contains some generic implementation
  • THDiskFile.c - code for handling disk files
  • THMemoryFile.c - implementation of in-memory files

Let’s start with the private header file.

struct THFile__
{
    struct THFileVTable *vtable;

    int isQuiet;
    ...
};

struct THFileVTable
{
    int (*isOpened)(THFile *self);

    long (*readByte)(THFile *self, unsigned char *data, long n);
    long (*readChar)(THFile *self, char *data, long n);
    long (*readShort)(THFile *self, short *data, long n);
    long (*readInt)(THFile *self, int *data, long n);
    ...
};

You can see that it defines a virtual method table with pointers to functions that THFile subclasses will have to implement (THFile is an abstract class - it has no constructors). Other structs are defined as such:

typedef struct THDiskFile__
{
    THFile file;

    FILE *handle;
    char *name;
    int isNativeEncoding;

} THDiskFile;

What makes this struct interesting is that because its first member is of type THFile, it’s actually valid to cast a struct THDiskFile * to struct THFile * and use it normally. What’s more, because THDiskFile’s constructor fills in the function pointers in the file field’s virtual table, it will behave as a THDiskFile object even when cast to THFile!

Shared memory

I had little knowledge about UNIX process management and threading until now, when I took an operating systems course at my university, so it was really interesting to learn about mmap (which maps a file into memory, so you can use it like an array) and to see how memory can be shared between processes with shm_open. I even wrote a piece of code to try it out. You can find it here.

SIMD

Another cool thing you can find in TH is vector instructions. There are some cmake tests that check if they are available on your CPU (cmake/FindSSE.cmake) and several files implementing convolution operations using them (generic/simd/*). I can’t fully understand it yet - a function that takes 10 lines of code is expanded into an unrolled vectorized loop of more than 120 lines, using APIs with function names unreadable to an SSE beginner. This code spans 134 lines after macro expansion:

void convolve_5x5_1_avx(float* output, float* image, float* weight, long count, long outputStride, long inputStride) {
  long i = 0;
  long alignedCount = count & 0xFFFFFFF8;
  DECLARE_OUTPUT_1()
  for (; i < alignedCount; i+=8) {
    CONVOLVE_8COLS_XROWS(1, i)
  }
}

Anyway, it’s definitely a thing worth learning so I will probably write more about it soon!

Allocators

TH declares its own function for memory allocation called THAlloc. It tries to allocate properly aligned chunks for big blocks and handles out-of-memory errors. Before reading Torch’s source I didn’t know about the concept of allocators. They are just small virtual tables providing their own memory management API (alloc, realloc, free). It’s cool that you can pass an Allocator to THStorage or THTensor and construct it not only on the regular heap, but also in shared memory.

Random module

It’s natural for a programming language to ship a pseudorandom number generator, but I had never read an implementation of one (ok, except the linear congruential generator). In THRandom.c you can find a full implementation of the Mersenne twister, which (according to Wikipedia) is the default generator for R, Python, Ruby, PHP, CMU Common Lisp, GLib, MATLAB and some more. There are also several methods which reshape the returned uniform distribution into other distributions.

Quick library overview

In this section I will briefly describe most of the functionalities provided by TH.

  • THAllocator
creates a default allocator, which just calls TH memory management functions, and, if possible, a THMapAllocator that can map files or shared memory objects into memory.
  • THAtomic
    • multiplatform implementation of atomic operations
  • THTensor
    • defines a general Tensor type
    • supports lots of indexing, linear algebra and math operations
    • available for all primitive datatypes (TH<type>Tensor, e.g. THFloatTensor)
  • THBlas
    • wraps BLAS library for use in THTensor
    • provides a general implementation as a fallback
  • THLapack
    • wraps LAPACK library for use in THTensor
DOESN’T provide fallbacks - throws errors if called when LAPACK is unavailable
  • THFile
    • abstract file class
    • only creates wrappers for calling methods contained in virtual table
  • THDiskFile
    • concrete file class
    • wraps disk file APIs
  • THMemoryFile
    • concrete file class
    • operates on an in-memory buffer and fakes file operations
  • THGeneral
    • implements general utilities
    • contains memory management routines
    • can notify external GCs
  • THRandom
    • implements a random number generator
    • can sample from many distributions
  • THStorage
    • defines a general storage object
    • contains mainly bookkeeping code
    • available for all primitive datatypes (TH<type>Storage, e.g. THFloatStorage)

How to use it

If you want to install TH you can either perform a full Torch installation or you can follow these steps:

# clone Torch repository
git clone https://github.com/torch/torch7
mkdir th_build
cd th_build
# configure TH build
cmake ../torch7/lib/TH
# compile library
make
# install shared library and header files
make install

Then, you only have to #include <TH/TH.h> in your program and link the library during the compilation process (-lTH).

Example program

To wrap up I just wanted to show you an example program using TH. It will simply load 10 floats from two files into tensors, compute their dot product and add to it a sum of all values in one of them. This is the code:

#include "TH/TH.h"

int main()
{
    THFile *x_file = THDiskFile_new("x", "r", 0);
    THFile *y_file = THDiskFile_new("y", "r", 0);

    THFloatTensor *x = THFloatTensor_newWithSize1d(10);
    THFloatTensor *y = THFloatTensor_newWithSize1d(10);

    THFile_readFloat(x_file, x->storage);
    THFile_readFloat(y_file, y->storage);

    double result = THFloatTensor_dot(x, y) + THFloatTensor_sumall(x);

    printf("%f\n", result);

    THFloatTensor_free(x);
    THFloatTensor_free(y);
    THFile_free(x_file);
    THFile_free(y_file);
    return 0;
}

All input parsing and possible errors are handled by Torch. Convenient, isn’t it?

Afterthoughts

I actually enjoy reading others’ source code - especially if it’s well written. If you have some spare time, then seriously, consider picking your favourite library or framework, and try to understand how it works - even the tiniest bits of it. I guarantee that you will find many fascinating things and learn many concepts and ways of structuring your code that you had no idea existed. I haven’t learned so much in such a short period of time for a while. I liked it so much that I’m thinking about doing this on a more regular basis.

TH has no documentation at the moment. Since I’ve already studied most of its code, I’ll probably try to write at least a bit. I’ve used Torch for so long that it’s time to make some contribution myself.

Thanks for reading! I hope that you liked it!


Source: https://apaszke.github.io/torch-internals.html
