Calling AI models with onnxruntime in Python and C++


CC 4.0 BY-SA license

Tags: #deep-learning #onnxruntime

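This article explains how to install and configure onnxruntime for Python and C++, including how to choose between the CPU and GPU builds and how to handle compatibility problems between versions. It focuses on running model inference with onnxruntime from C++, covering model input and output handling, the C++ API, and TensorRT optimization.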

The Python version of onnxruntime is fairly easy to use. Upgrade pip to the latest version first, then install onnxruntime:

pip install --upgrade pip
# install the CPU build
pip install onnxruntime
# or install the GPU build
# pip install onnxruntime-gpu

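If you only need to validate a model, the CPU build is sufficient and is the simpler of the two. Do not install both builds at the same time. With both installed, every call failed for me, whatever I tried, with an error complaining about the arguments: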

incompatible constructor arguments. The following argument types are supported

After I uninstalled onnxruntime-gpu, onnxruntime.InferenceSession(...) then failed with:

module 'onnxruntime' has no attribute 'InferenceSession'

The fix is to uninstall the CPU onnxruntime package as well and then reinstall it.
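To check for this situation before debugging further, the installed distributions can be inspected from Python. The following is a minimal sketch using only the standard library; the function names `installed_onnxruntime_packages` and `has_conflict` are made up for this illustration and are not part of onnxruntime.

```python
from importlib import metadata

# The two PyPI distributions that both provide the `onnxruntime` module.
CONFLICTING = ("onnxruntime", "onnxruntime-gpu")


def installed_onnxruntime_packages(names=None):
    """Return the onnxruntime distributions found in `names`, sorted.

    If `names` is None, the names are read from the current environment.
    """
    if names is None:
        names = [dist.metadata.get("Name") for dist in metadata.distributions()]
    # Normalize case and underscores so "onnxruntime_gpu" also matches.
    normalized = {n.lower().replace("_", "-") for n in names if n}
    return sorted(p for p in CONFLICTING if p in normalized)


def has_conflict(names=None):
    """True when the CPU and GPU packages are installed side by side."""
    return len(installed_onnxruntime_packages(names)) == 2


if __name__ == "__main__":
    print("installed:", installed_onnxruntime_packages())
    if has_conflict():
        print("both builds found: uninstall both, then reinstall only one")
```

If the check reports both packages, running pip uninstall for both and then installing only the one you need matches the fix described above.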

Write your own code by following the API documentation and sample code here:

https://www.onnxruntime.ai/python/api_summary.html

https://www.onnxruntime.ai/python/tutorial.html

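When the model has a single input and a single output, following the sample code is usually enough. When it has several, you can follow the example in the API documentation and use iobinding to specify each input and output.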

Here is my own code. The calling part looks quite simple:

import onnxruntime
import torch

ort_session = onnxruntime.InferenceSession("models/efficientdet-d0-s.onnx")
...
# preprocess input data to get x
...
# to_numpy is a helper (not shown) assumed to convert the tensor x to a NumPy array
ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(x)}
ort_outs = ort_session.run(None, ort_inputs)
regression = torch.from_numpy(ort_outs[0])
classification = torch.from_numpy(ort_outs[1])
anchors = torch.from_numpy(ort_outs[2])
# postprocess regression, classification and anchors to get bboxes
...

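Calling from C++ is more work. Experiments and production deployment of AI models are generally done on Linux, but Microsoft, as usual, gives priority to its own Windows and .NET.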

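So you first have to build the .so libraries for your Linux distribution and CPU architecture (the Python whl package is built along the way). For build instructions, see my previous article.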

After the build, onnxruntime/build/Linux/MinSizeRel/ contains many files, including several .so libraries. libonnxruntime.so is the one to link against, by adding -lonnxruntime when you build your calling program.
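Before wiring -lonnxruntime into a build, it can help to confirm that the library was actually produced. The following is a small sketch, assuming the build directory layout mentioned above; the function name `find_onnxruntime_libs` is made up for this illustration.

```python
from pathlib import Path


def find_onnxruntime_libs(build_dir):
    """Return the sorted file names of libonnxruntime.so* in `build_dir`.

    Versioned names such as libonnxruntime.so.<version> are included;
    other libraries in the directory are ignored.
    """
    root = Path(build_dir)
    return sorted(p.name for p in root.glob("libonnxruntime.so*") if p.is_file())


if __name__ == "__main__":
    libs = find_onnxruntime_libs("onnxruntime/build/Linux/MinSizeRel")
    print(libs if libs else "libonnxruntime.so not found; check the build output")
```

If the list is empty, the link step with -lonnxruntime will fail, so it is worth checking the build log first.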

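The header files are under onnxruntime/include/onnxruntime/core/ and must be on the include path at compile time. For convenience, you can copy or symlink the whole onnxruntime directory under onnxruntime/include/ into your project's include path: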

CFLAGS+= -DUSE_TENSORRT -Wno-deprecated-declarations -fPIC -DDS_VERSION=\"5.0.0\" -DBOOST_ALL_DYN_LINK \
         -I /usr/local/cuda-$(CUDA_VER)/include \
         -I /opt/nvidia/deepstream/deepstream/sources/includes/ \
         -I $(PROJECT_ROOT)/plugin/common \
         -I $(PROJECT_ROOT)/ext/inc \
         -I $(PROJECT_ROOT)/ext/inc/onnxruntime \
         -I $(PROJECT_ROOT)/ext/inc/onnxruntime/core/common \
         -I $(PROJECT_ROOT)/ext/inc/onnxruntime/core/common/logging \
         -I $(PROJECT_ROOT)/ext/inc/onnxruntime/core/framework \
         -I $(PROJECT_ROOT)/ext/inc/onnxruntime/core/graph \
         -I $(PROJECT_ROOT)/ext/inc/onnxruntime/core/optimizer \
         -I $(PROJECT_ROOT)/ext/inc/onnxruntime/core/platform \
         -I $(PROJECT_ROOT)/ext/inc/onnxruntime/core/session \
         -I /usr/local/cuda-10.2/targets/aarch64-linux/include/ \
         -I /usr/include/opencv4

The C++ API is documented in the header itself: https://github.com/microsoft/onnxruntime/blob/master/include/onnxruntime/core/session/onnxruntime_cxx_api.h

Among the C++ samples, these two simple examples are worth a look:

https://gi