How to add NPU support to TVM


License: CC 4.0 BY-SA


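This document describes in detail how to add NPU support for the ethos hardware in TVM, including environment setup, get_pattern_table, MergeComposite, AnnotateTarget, and related steps. By analyzing the TVM compilation flow, it shows the path from model conversion to generating code for ethos-n, and offers guidance for adding support for custom NPU hardware.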


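TVM has a fairly large community with many users, and it works well. Zhihu has many analyses of the TVM stack and the Relay flow. Many companies and research institutes now design their own AI hardware, yet I could not find a post on adding custom hardware (maybe I just didn't find it ^-!), so below I describe how to do that.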

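To make the flow easier to understand and to reproduce, I use the ethos hardware as the example. While reading, pay attention to the inline comments in the source code.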

Tips: To be honest, I'm not a great writer, but I will try to cover all the key points.

Code preparation

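I use TVM 0.7, and the matching ethos stack version is 20.05. The versions must match, because newer ethos stack releases changed some class definitions and the build no longer passes out of the box.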

TVM environment

I won't cover setting up the TVM environment here; there are plenty of tutorials online.

ethos stack

Download the ethos stack
1. From GitHub:
git clone https://github.com/Arm-software/ethos-n-driver-stack ethos-stack
2. From the Gitee mirror, which downloads faster:
git clone https://gitee.com/mirrors_ARM-software/ethos-n-driver-stack.git ethos-stack
Install some dependencies

sudo apt install git scons make sparse  gcc bc \
    gcc-aarch64-linux-gnu g++-aarch64-linux-gnu

Switch to the matching version and create your own branch

    cd  ethos-stack
    git checkout 20.05
    git branch mydev
    git checkout mydev

Build
I develop on the host here, so no cross compilation is needed.

Copy ethosn.bin to the host

    cd ethos-stack
    sudo cp firmware/ethosn.bin /lib/firmware/

Build the .ko and insmod it

    cd ethos-stack/kernel-module 
    make -C /usr/src/linux-headers-4.15.0-136-generic M=$PWD  EXTRA_CCFLAGS=" -DETHOSN_NS" modules

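linux-headers-4.15.0-136-generic is the kernel headers path on my system; run 'uname -a' to find the kernel version you are running.
EXTRA_CCFLAGS=" -DETHOSN_NS" selects non-secure mode.
When the build finishes, ethosn.ko is generated in the current directory.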
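Since the headers path depends on the running kernel, the make invocation above can be assembled programmatically. The following is a minimal sketch, not part of the ethos stack: the helper name `kernel_module_make_cmd` is hypothetical, and it assumes the headers live under /usr/src/linux-headers-<release>, as on the Ubuntu host used in this article.

```python
import os
import platform


def kernel_module_make_cmd(module_dir, release=None, extra_ccflags="-DETHOSN_NS"):
    """Build the argument list for compiling ethosn.ko out of tree."""
    # Default to the running kernel's release string (what `uname -r` prints).
    release = release or platform.release()
    # Assumption: headers are installed under /usr/src/linux-headers-<release>.
    headers_dir = "/usr/src/linux-headers-" + release
    return [
        "make",
        "-C", headers_dir,
        "M=" + os.path.abspath(module_dir),
        "EXTRA_CCFLAGS=" + extra_ccflags,
        "modules",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(kernel_module_make_cmd("ethos-stack/kernel-module")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the printed command looks right for your system.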
Next, load the driver into the system:

    sudo insmod ethosn.ko

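Tips:
After ethosn.ko is loaded, it probes for the device and, if one matches, creates the /dev/ethosn0 device node, which the runtime then accesses. On the host this node does not exist, because the host has no ethosn hardware. We could, however, modify the driver in ethos-stack/kernel-module to implement a fake NPU, so that insmod ethosn.ko creates the /dev/ethosn0 node, and even implement the driver's ioctl functions.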
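A quick way to see whether the driver created the node is to check for a character device at that path. This is a small sketch using only the Python standard library; the function name `ethosn_device_present` is hypothetical, and the default path is the /dev/ethosn0 node mentioned above.

```python
import os
import stat


def ethosn_device_present(node="/dev/ethosn0"):
    """Return True if `node` exists and is a character device."""
    try:
        mode = os.stat(node).st_mode
    except OSError:
        # Missing node or no permission to stat it: treat as absent.
        return False
    return stat.S_ISCHR(mode)


if __name__ == "__main__":
    print("ethosn device present:", ethosn_device_present())
```

On a host without ethosn hardware and with the unmodified driver, this is expected to report that the node is absent.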
3. Build the ethos driver

    cd ethos-stack/ethosn-driver/driver
    scons install_prefix=<install_directory> install

install_directory is where the driver is placed once the build finishes,
for example:

    scons install_prefix=/home/xxx/opt/arm/ethosn-driver install

When building TVM, you have probably seen the following passage in config.cmake:

# Whether to build with Arm Ethos-N support
# Possible values:
# - OFF: disable Arm Ethos-N support
# - path/to/arm-ethos-N-stack: use a specific version of the
#   Ethos-N driver stack
set(USE_ETHOSN OFF)

That's right: to use ethos, change this to

set(USE_ETHOSN /home/xxx/opt/arm/ethosn-driver)
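If you prefer to script this change rather than edit the file by hand, the substitution can be sketched as below. This is only an illustration: `set_use_ethosn` is a hypothetical helper, it works on the text of the config file, and it assumes the option appears on a single line of the form `set(USE_ETHOSN <value>)`.

```python
import re


def set_use_ethosn(config_text, driver_path):
    """Return config_text with the USE_ETHOSN value replaced by driver_path."""
    # Match only real set(...) lines, not the "# - OFF: ..." comment lines.
    pattern = re.compile(r"^(\s*set\(USE_ETHOSN\s+)\S+(\s*\))", re.MULTILINE)
    # A function replacement avoids backslash escaping issues in the path.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + driver_path + m.group(2), config_text
    )
    if count == 0:
        raise ValueError("no set(USE_ETHOSN ...) line found")
    return new_text


if __name__ == "__main__":
    sample = "# - OFF: disable Arm Ethos-N support\nset(USE_ETHOSN OFF)\n"
    print(set_use_ethosn(sample, "/home/xxx/opt/arm/ethosn-driver"))
```

The path passed in should be the same install_prefix used when building the driver with scons.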

relay.build for ethosn

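Next we walk through the build flow for the ethos NPU. Once you understand this flow, you are one step closer to adding support for your own accelerator hardware.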

Let's start with a piece of test code.

# Source: "<tvm_path>/tests/python/contrib/test_ethosn/test_networks.py"
def _test_image_network(
    model_url,    model_sub_path,    input_dict,    compile_hash,
    output_count,    host_ops=0,    npu_partitions=1,    run=False,
):
    if not ethosn_available():
        return

    def get_model():
        if model_url[-3:] in ("tgz", "zip"):
            model_path = tf_testing.get_workload_official(
                model_url,
                model_sub_path,
            )
        else:
            model_path = download.download_testdata(
                model_url,
                model_sub_path,
            )
        return _get_tflite_model(model_path, input_dict, "uint8")

    inputs = {}
    for input_name in input_dict:
        input_shape = input_dict[input_name]
        inputs[input_name] = tei.get_real_image(input_shape[1], input_shape[2])
    print(input_shape)

    mod, params = get_model()
    m = tei.build(mod, params, npu=True, expected_host_ops=host_ops,
                  npu_partitions=npu_partitions)  # calls the build function shown below
    tei.assert_lib_hash(m.get_lib(), compile_hash)
    if run:
        tei.run(m, inputs, output_count, npu=True)

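The function above calls three important interfaces: get_model, tei.build, and tei.run.
get_model calls the frontend module to map the model and weights from various frameworks into TVM IR.
tei.build, as the name suggests, performs optimization and compilation; it calls the build function below.
tei.run loads the build output into the runtime and runs inference.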
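To give some intuition for the host_ops and npu_partitions arguments of the test, here is a toy model of what partitioning means. It is not TVM's implementation: the real passes work on a dataflow graph, while this sketch assumes a purely linear chain of operators, and both the function `partition_ops` and the supported-operator set are made up for illustration.

```python
def partition_ops(ops, supported):
    """Group consecutive supported ops into NPU partitions; the rest stay on the host."""
    partitions, host_ops, current = [], [], []
    for op in ops:
        if op in supported:
            # Extend the NPU region that is currently open.
            current.append(op)
        else:
            # An unsupported op closes the open region and runs on the host.
            if current:
                partitions.append(current)
                current = []
            host_ops.append(op)
    if current:
        partitions.append(current)
    return partitions, host_ops


if __name__ == "__main__":
    # Hypothetical operator chain and supported set, for illustration only.
    ops = ["qnn.conv2d", "qnn.conv2d", "softmax", "qnn.dense"]
    partitions, host_ops = partition_ops(ops, {"qnn.conv2d", "qnn.dense"})
    print(len(partitions), "NPU partitions,", len(host_ops), "host ops")
```

In this simplified picture, one unsupported operator in the middle of the chain splits the network into two NPU partitions and leaves one operator on the host, which is the kind of count the test compares against its expected values.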

# Source: "<tvm_path>/tests/python/contrib/test_ethosn/infrastructure.py"
def build(mod, params, npu=True, expected_host_ops=0, npu_partitions=1):
    """Build a network with or without Ethos-N offloading.
    """
    relay.backend.compile_engine.get().clear()
    with tvm.transform.PassContext(
        opt_level=3, config={"relay.ext.ethos-n.options": {"variant": 0}}
    ):
        with tvm.target.Target("llvm"):
            if npu:
                f = relay.build_module.bind_params_by_name(mod["main"], params)
                mod = tvm.IRModule()
                mod["main"] = f
                # step 1
                pattern = get_pattern_table("ethos-n")
                # step 2
                mod = relay.transform.MergeComposite(pattern)(mod)
                mod = relay.transform.AnnotateTarget("ethos-n")(mod)
                # step 3
                mod = relay.transform.MergeCompilerRegions()(mod)
                mod = relay.transform.PartitionGraph()(mod)
                host_op_count = get_host_op_count(mod)
                assert (
                    host_op_count == expected_host_ops
                ), "Got {} h