Notes on errors when exporting SAM to an ONNX model

License: CC 4.0 BY-SA

Tags:

#pytorch #object-detection

Problems recorded:

1. Unsupported ONNX opset version: 17

2. Exporting the operator repeat_interleave to ONNX opset version 11 is not supported

System and versions

Windows 10, with a dedicated Anaconda virtual environment created for the SAM project: python=3.8, torch=1.8, cuda=10.2

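Background

In a fabric defect detection project, I came across model files in several formats: .pt, .onnx, .trt, and .engine. After my colleagues on the algorithm side update a model, they get a .pt file and export it to .onnx for me; when deploying on site, I convert it to a .trt model for real-time inference.
ONNX (Open Neural Network Exchange) is an open format intended to make machine learning models interoperable across frameworks and platforms.
An ONNX opset (operator set version) is the version of the operators that the ONNX model format supports. It defines which operators are available when a model is exported or imported, and how they behave. Every ONNX model carries an associated opset version, which identifies the operator versions used in that model.
With the ONNX opset in mind, the two problems above become easier to understand.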

Unsupported ONNX opset version: 17

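The error message indicates that the torch version in the environment cannot export to ONNX opset 17. You can either lower the ONNX opset version or upgrade torch.
SAM requires an ONNX opset version >= 11; the default is 17.
My inference setup uses torch=1.8 and cuda=10.2, and I did not want to upgrade torch, so I tried opset = 11, 12, 13, 14, and 15. None of them worked. With opset = 11, 12, or 13, the export failed with problem 2: Exporting the operator repeat_interleave to ONNX opset version 11 is not supported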
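The trial-and-error over opset versions described above can be automated. The following is a minimal sketch, not part of the SAM export script: the helper name export_with_opset_fallback and its export_fn parameter are hypothetical, introduced only for illustration. It assumes the export goes through torch.onnx.export with its opset_version argument, and it records the error for each opset that fails so the two kinds of failure can be told apart.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def export_with_opset_fallback(
    model: Any,
    example_inputs: Any,
    path: str,
    opsets: Sequence[int] = (17, 16, 15, 14, 13, 12, 11),
    export_fn: Optional[Callable[..., Any]] = None,
) -> Tuple[Optional[int], List[Tuple[int, str]]]:
    """Try each opset in order and stop at the first successful export.

    Returns (opset that worked or None, list of (opset, error message)).
    `export_fn` is a hypothetical hook so the loop can be exercised
    without a real model; by default it is torch.onnx.export.
    """
    if export_fn is None:
        import torch  # imported lazily; only needed for a real export

        export_fn = torch.onnx.export

    errors: List[Tuple[int, str]] = []
    for opset in opsets:
        try:
            export_fn(model, example_inputs, path, opset_version=opset)
            return opset, errors
        except Exception as exc:  # unsupported opset or unsupported operator
            errors.append((opset, f"{type(exc).__name__}: {exc}"))
    return None, errors
```

If every candidate fails, as happened here with torch=1.8, the collected messages show which opsets were rejected outright and which ones failed on a specific operator such as repeat_interleave.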

Exporting the operator repeat_interleave to ONNX opset version 11 is not supported

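The error message indicates that, during export, the PyTorch operator repeat_interleave is not supported in ONNX opset version 11.
Fix: replace the repeat_interleave operation.
Search the SAM code for the repeat_interleave function.
[Screenshot: mask_decoder.py, with the repeat_interleave calls marked by a red box]
The search shows that mask_decoder.py under both the build folder and the segment_anything folder contains repeat_interleave. Experimenting showed that the ONNX export script calls the copy under the segment_anything folder.
Replace the code on lines 126 and 132 (inside the red box in the screenshot) with the code on lines 127 and 133, and add print statements to confirm that the tensor dimensions are correct. After that, the ONNX model exports successfully.
The following example illustrates repeat_interleave and its replacement: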

import torch

# Assume image_pe has shape (2, 3, 4, 4): 2 samples, each a tensor of shape (3, 4, 4) (e.g. an RGB image).
image_pe = torch.randn(2, 3, 4, 4)
# Assume tokens.shape[0] is 3
tokens = torch.randn(3, 4)
# Repeat image_pe tokens.shape[0] times along dim=0 (the sample dimension), so that it lines up with other tensors such as tokens
pos_src = torch.repeat_interleave(image_pe, tokens.shape[0], dim=0)
print(pos_src.shape)  # Output: torch.Size([6, 3, 4, 4])
# repeat tiles image_pe tokens.shape[0] times along dim=1, giving shape (2, 9, 4, 4)
# view then regroups it into the required shape; -1 lets PyTorch infer the size of that dimension
pos_src = image_pe.repeat(1, tokens.shape[0], 1, 1).view(
    -1, image_pe.shape[1], image_pe.shape[2], image_pe.shape[3]
)
print(pos_src.shape)  # Output: torch.Size([6, 3, 4, 4])
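Matching shapes alone do not prove that the replacement produces the same values in the same order. The sketch below checks element-wise equality on a tensor with distinct entries. The helper name repeat_interleave_dim0 is hypothetical and only wraps the repeat + view expression used above; it assumes a 4-D input.

```python
import torch


def repeat_interleave_dim0(x: torch.Tensor, repeats: int) -> torch.Tensor:
    """repeat + view stand-in for torch.repeat_interleave(x, repeats, dim=0).

    Hypothetical helper for illustration; assumes x is 4-D (B, C, H, W).
    """
    # Tile along dim=1: (B, C, H, W) -> (B, repeats * C, H, W).
    tiled = x.repeat(1, repeats, 1, 1)
    # Regroup so each sample's copies become consecutive rows along dim=0.
    return tiled.view(-1, x.shape[1], x.shape[2], x.shape[3])


# Distinct values make any ordering mistake visible.
image_pe = torch.arange(2 * 3 * 4 * 4, dtype=torch.float32).reshape(2, 3, 4, 4)
expected = torch.repeat_interleave(image_pe, 3, dim=0)
actual = repeat_interleave_dim0(image_pe, 3)

print(actual.shape)                    # torch.Size([6, 3, 4, 4])
print(torch.equal(actual, expected))   # True
```

The check passes because tiling along dim=1 places the copies of each sample next to each other in memory, so the view splits them into consecutive rows, which is the same ordering repeat_interleave produces along dim=0.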