Converting ONNX to TensorFlow to take advantage of GPU mixed precision

mania_yan

Article link: https://blog.youkuaiyun.com/yyw794/article/details/135206499

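This article compares the performance of mixed-precision inference with TensorFlow and with ONNX in Triton. Although ONNX has the advantage when processing a single sequence, in batched processing (for example, 50 sequences) the TensorFlow mixed-precision model performs significantly better than ONNX, by more than a factor of two.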

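Currently, TensorFlow is the only backend in Triton that can turn on mixed precision and make full use of the mixed-precision compute units on T4 and V100 GPUs.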

#import os
#os.environ["CUDA_VISIBLE_DEVICES"] = "1"
import onnx
from onnx_tf.backend import prepare
onnx_model = onnx.load("onnx/model.onnx")  # load onnx model
# strict must be False; otherwise the converted TF model fails at runtime
tf_rep = prepare(onnx_model, strict=False)  # prepare tf representation
# tf_folder now holds the TF model and can be copied directly into Triton
tf_rep.export_graph("tf_folder")  # export the model
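
The post does not show how the exported folder is placed into Triton or how mixed precision is switched on. The following is a minimal sketch of that step, not the author's actual setup. The model name `my_tf_model`, the repository path, and the helper `build_triton_repo` are hypothetical. The sketch assumes the export is a TensorFlow SavedModel, that Triton can auto-complete the input and output definitions, and that mixed precision is requested through the `auto_mixed_precision` GPU execution accelerator in `config.pbtxt`.

```python
import shutil
from pathlib import Path

# Hypothetical model name; Triton uses the directory name as the model name.
MODEL_NAME = "my_tf_model"

# Minimal config (an assumption): inputs and outputs are left for Triton to
# auto-complete, and automatic mixed precision is requested for the GPU.
CONFIG_PBTXT = """\
platform: "tensorflow_savedmodel"
optimization {
  execution_accelerators {
    gpu_execution_accelerator : [ { name : "auto_mixed_precision" } ]
  }
}
"""


def build_triton_repo(tf_folder, repo_root, model_name=MODEL_NAME, version="1"):
    """Copy an exported SavedModel into a Triton model repository layout."""
    model_dir = Path(repo_root) / model_name
    saved_model_dir = model_dir / version / "model.savedmodel"
    saved_model_dir.parent.mkdir(parents=True, exist_ok=True)
    # The TensorFlow backend looks for <version>/model.savedmodel by default.
    shutil.copytree(tf_folder, saved_model_dir, dirs_exist_ok=True)
    (model_dir / "config.pbtxt").write_text(CONFIG_PBTXT)
    return model_dir


# Example call (paths are placeholders):
# build_triton_repo("tf_folder", "model_repository")
```

With the placeholder paths above, the result would be `model_repository/my_tf_model/config.pbtxt` next to `model_repository/my_tf_model/1/model.savedmodel/`, and the `model_repository` directory is what the Triton server is pointed at. If auto-completion is unavailable in a given Triton version, the `input` and `output` sections have to be written out explicitly in `config.pbtxt`.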