1. Create a Logger, used to record logs during the conversion process
2. Create an INetwork
There are two ways to create a Network: 1. build the network directly with the TensorRT API; 2. use a parser to convert an existing model into a Network.
2.1 Creating a Network with the API (omitted)
See the official sample:
https://github.com/NVIDIA/TensorRT/blob/master/samples/opensource/sampleMNISTAPI/sampleMNISTAPI.cpp
2.2 Converting a model with a parser
The end goal of conversion is to produce an ICudaEngine. The ICudaEngine can be serialized and saved, or used to create an IExecutionContext for inference.
The main steps of the conversion:
IBuilder ---> parser ---> Network ---> BuilderConfig ---> ICudaEngine
First, create an IBuilder;
then create the parser that matches the model format (Caffe, ONNX, or UFF);
then call builder->createNetworkV2(0U); to create an empty INetwork;
then call builder->createBuilderConfig(); to create an IBuilderConfig.
IBuilderConfig holds the build settings, such as the workspace size and whether to quantize (quantization additionally requires setting a calibrator).
Finally, call ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config); to produce the ICudaEngine.
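The steps above can be put together into one function. A minimal sketch for the ONNX path, assuming a gLogger defined elsewhere and the pre-8.x TensorRT API in which destroy() releases objects:

```cpp
#include "NvInfer.h"
#include "NvOnnxParser.h"

using namespace nvinfer1;

// Illustrative sketch of the IBuilder -> parser -> Network -> BuilderConfig
// -> ICudaEngine chain for an ONNX model. gLogger and the model path are
// assumed to exist elsewhere.
ICudaEngine* buildEngineFromOnnx(const char* onnxFile, ILogger& gLogger)
{
    IBuilder* builder = createInferBuilder(gLogger);
    // The ONNX parser requires an explicit-batch network
    const auto explicitBatch =
        1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    INetworkDefinition* network = builder->createNetworkV2(explicitBatch);
    auto parser = nvonnxparser::createParser(*network, gLogger);
    parser->parseFromFile(onnxFile, static_cast<int>(ILogger::Severity::kWARNING));

    IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1 << 20); // 1 MiB workspace; adjust as needed
    ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);

    // Release the build-time objects; the engine outlives them
    parser->destroy();
    network->destroy();
    config->destroy();
    builder->destroy();
    return engine;
}
```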
Depending on the format, create the corresponding network/parser:
ONNX
nvinfer1::INetworkDefinition* network = builder->createNetworkV2(1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
auto parser = nvonnxparser::createParser(*network, gLogger);
UFF
nvinfer1::INetworkDefinition* network = builder->createNetworkV2(0U);
auto parser = nvuffparser::createUffParser();
Caffe
nvinfer1::INetworkDefinition* network = builder->createNetworkV2(0U);
auto parser = nvcaffeparser1::createCaffeParser();
From <https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#c_topics>
Model-loading examples:
1. Load a Caffe model with parser->parse:
1.1 create
IBuilder* builder = createInferBuilder(gLogger);
INetworkDefinition* network = builder->createNetworkV2(0U);
1.2 parser:
ICaffeParser* parser = createCaffeParser();
1.3 parse and get the blob-to-tensor map
const IBlobNameToTensor* blobNameToTensor = parser->parse("deployFile", "modelFile", *network, DataType::kFLOAT);
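The Caffe parser only maps blob names to tensors; the network's outputs still have to be marked before building, or the builder has nothing to produce. A short sketch — the blob name "prob" is hypothetical; use the real output name from your deploy file:

```cpp
// After parser->parse(...), look up each output blob by name in the returned
// map and mark it as a network output.
// "prob" is a placeholder name for illustration only.
nvinfer1::ITensor* outputTensor = blobNameToTensor->find("prob");
if (outputTensor != nullptr)
{
    network->markOutput(*outputTensor);
}
```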
2. Load a TensorFlow model with the UFF parser:
2.1 Create the builder and network
IBuilder* builder = createInferBuilder(gLogger);
INetworkDefinition* network = builder->createNetworkV2(0U);
2.2 Create the UFF parser
IUffParser* parser = createUffParser();
2.3 Register the inputs and outputs:
parser->registerInput("Input_0", DimsCHW(1, 28, 28), UffInputOrder::kNCHW);
parser->registerOutput("Binary_3");
2.4 parse:
parser->parse(uffFile, *network, nvinfer1::DataType::kFLOAT);
3. Load a model with the ONNX parser:
3.1 create builder/network
IBuilder* builder = createInferBuilder(gLogger);
const auto explicitBatch = 1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
INetworkDefinition* network = builder->createNetworkV2(explicitBatch);
3.2 Create the parser
nvonnxparser::IParser* parser =
nvonnxparser::createParser(*network, gLogger);
3.3 parse:
parser->parseFromFile(onnx_filename, static_cast<int>(ILogger::Severity::kWARNING));
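parseFromFile returns false on failure, and the parser keeps a list of errors that can be inspected before giving up. A sketch:

```cpp
// parseFromFile() returns false on failure; iterate the recorded errors
// to see what went wrong.
if (!parser->parseFromFile(onnx_filename, static_cast<int>(ILogger::Severity::kWARNING)))
{
    for (int i = 0; i < parser->getNbErrors(); ++i)
    {
        std::cout << parser->getError(i)->desc() << std::endl;
    }
}
```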
3. ICudaEngine
Creation:
IBuilderConfig* config = builder->createBuilderConfig();
config->setMaxWorkspaceSize(1 << 20);
ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
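Once the engine exists, inference goes through an execution context created from it. A minimal sketch, again assuming the pre-8.x destroy()-based API:

```cpp
// An engine can create one or more execution contexts; each context holds
// the per-inference state (bindings, batch size, etc.).
nvinfer1::IExecutionContext* context = engine->createExecutionContext();
// ... allocate and fill the input/output device buffers, then run:
// context->executeV2(bindings);  // synchronous inference
context->destroy();
```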
Note:
call destroy() on the parser/network/config/builder to release their resources
Serialization and deserialization:
Serialization:
IHostMemory *serializedModel = engine->serialize();
// store model to disk
// IHostMemory holds raw bytes: data() returns a pointer to the buffer, size() returns its length in bytes.
serializedModel->destroy();
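Writing the serialized bytes to disk needs only standard C++. A hypothetical helper (the name saveEngineBlob is ours), fed with serializedModel->data() and serializedModel->size():

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: write a serialized engine blob (the data()/size()
// pair from IHostMemory) to disk. Pure standard C++, no TensorRT needed.
bool saveEngineBlob(const void* data, std::size_t size, const std::string& path)
{
    std::ofstream out(path, std::ios::binary);
    if (!out)
        return false;
    out.write(static_cast<const char*>(data), static_cast<std::streamsize>(size));
    return out.good();
}
```

Usage: saveEngineBlob(serializedModel->data(), serializedModel->size(), "model.engine");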
Deserialization:
IRuntime* runtime = createInferRuntime(gLogger);
ICudaEngine* engine = runtime->deserializeCudaEngine(modelData, modelSize, nullptr);
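Reading the engine file back into memory is likewise plain standard C++. A hypothetical helper (loadEngineBlob is our name) whose result can be passed as the modelData/modelSize arguments above:

```cpp
#include <cstddef>
#include <fstream>
#include <vector>
#include <string>

// Hypothetical helper: read a serialized engine file into memory, so it can
// be handed to runtime->deserializeCudaEngine(buf.data(), buf.size(), nullptr).
std::vector<char> loadEngineBlob(const std::string& path)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        return {};
    std::streamsize size = in.tellg(); // opened at end, so tellg() is the file size
    in.seekg(0, std::ios::beg);
    std::vector<char> buf(static_cast<std::size_t>(size));
    in.read(buf.data(), size);
    return buf;
}
```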