Official configuration reference: https://github.com/dusty-nv/jetson-inference/blob/master/docs/building-repo-2.md
I. Installing jetson-inference on the TX2:
1. Install dependencies
$ sudo apt-get install git cmake
2. Clone the jetson-inference repo
$ git clone https://github.com/dusty-nv/jetson-inference
$ cd jetson-inference
$ git submodule update --init
3. Configure with CMake
1. In CMakePreBuild.sh, comment out the model-download and model-extraction lines containing the following keywords, so those model files are not downloaded:
#sudo apt-get install -y libopencv-calib3d-dev libopencv-dev
#wget
#mv
#tar -xzvf
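Commenting those lines out by hand works; it can also be scripted. A minimal sketch, run here against a generated stand-in file (the real jetson-inference/CMakePreBuild.sh differs between releases, so check the match patterns before running it there, and back the original up first):

```shell
# Stand-in file; the lines below are illustrative, not the real script contents.
cat > CMakePreBuild.sh <<'EOF'
sudo apt-get install -y libopencv-calib3d-dev libopencv-dev
wget https://example.com/networks/model.tar.gz
mv model.tar.gz networks/
tar -xzvf networks/model.tar.gz
EOF
# Prefix each download/unpack line with '#'
sed -i -E 's/^[[:space:]]*(sudo apt-get install -y libopencv|wget|mv |tar -xzvf)/#\1/' CMakePreBuild.sh
grep -c '^#' CMakePreBuild.sh   # prints 4: all four lines are now commented out
```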
2. Run cmake
$ mkdir build
$ cd build
$ cmake ../
## Summary of cmake errors ##
=========================================================================================
Error 1: if cmake fails with the output below (the Jetson Nano board does not hit this), re-run cmake with the command given in Solution 1:
$ cmake ../
CMake Error at /usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
Could NOT find CUDA (missing: CUDA_CUDART_LIBRARY) (found suitable exact
version "9.0")
Call Stack (most recent call first):
/usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:388 (_FPHSA_FAILURE_MESSAGE)
/usr/share/cmake-3.5/Modules/FindCUDA.cmake:949 (find_package_handle_standard_args)
/usr/local/share/OpenCV/OpenCVConfig.cmake:86 (find_package)
/usr/local/share/OpenCV/OpenCVConfig.cmake:105 (find_host_package)
trt-console/CMakeLists.txt:4 (find_package)
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_CUDART_LIBRARY (ADVANCED)
linked by target "jetson-inference" in directory /home/nvidia/jetson-inference
linked by target "imagenet-console" in directory /home/nvidia/jetson-inference/imagenet-console
linked by target "imagenet-camera" in directory /home/nvidia/jetson-inference/imagenet-camera
linked by target "detectnet-console" in directory /home/nvidia/jetson-inference/detectnet-console
linked by target "detectnet-camera" in directory /home/nvidia/jetson-inference/detectnet-camera
linked by target "segnet-console" in directory /home/nvidia/jetson-inference/segnet-console
linked by target "segnet-camera" in directory /home/nvidia/jetson-inference/segnet-camera
linked by target "trt-bench" in directory /home/nvidia/jetson-inference/trt-bench
-- Configuring incomplete, errors occurred!
See also "/home/nvidia/jetson-inference/build/CMakeFiles/CMakeOutput.log".
Solution 1:
$ cmake -DCUDA_CUDART_LIBRARY=/usr/local/cuda/lib64/libcudart.so ../
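If libcudart.so is not at that exact path on your image, locate it first. A small sketch (/usr/local/cuda/lib64 is the JetPack default; the fallback message is illustrative, not a real tool message):

```shell
# Find where libcudart actually lives before passing its path to cmake.
find /usr -name 'libcudart.so*' 2>/dev/null | grep . \
  || echo "libcudart not found under /usr"
```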
========================================================================================
Error 2: the build is not compatible with OpenCV 4.0.0.
Solution 2:
Edit jetson-inference/tools/trt-console/CMakeLists.txt and change the find_package line to:
find_package(OpenCV 4.0.0 COMPONENTS core calib3d REQUIRED)
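Before pinning the version in find_package, it is worth confirming which OpenCV pkg-config actually sees; note the package name changed from opencv (3.x) to opencv4 (4.x). A hedged sketch (the fallback message is illustrative):

```shell
# Report the installed OpenCV version, trying the 4.x name first.
pkg-config --modversion opencv4 2>/dev/null \
  || pkg-config --modversion opencv 2>/dev/null \
  || echo "OpenCV not visible to pkg-config"
```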
========================================================================================
Error 3: no bindings are built for Python 3.5.
Solution 3:
3.1 Edit jetson-inference/python/CMakeLists.txt, line 8, to:
set(PYTHON_BINDING_VERSIONS 3.5 3.6 3.7)
3.2 Edit jetson-inference/utils/python/CMakeLists.txt, line 8, likewise:
set(PYTHON_BINDING_VERSIONS 3.5 3.6 3.7)
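A version listed in PYTHON_BINDING_VERSIONS only yields bindings if that interpreter and its development headers exist on the board. A quick check (the pythonX.Y-dev package name in the hint is the usual Ubuntu one, assumed here):

```shell
# For each candidate binding version, report whether the interpreter and
# its Python.h are present (bindings only build where both exist).
for v in 3.5 3.6 3.7; do
  if command -v "python$v" >/dev/null 2>&1; then
    inc=$("python$v" -c 'import sysconfig; print(sysconfig.get_paths()["include"])')
    if [ -f "$inc/Python.h" ]; then
      echo "python$v: interpreter and headers found"
    else
      echo "python$v: headers missing (install python$v-dev)"
    fi
  else
    echo "python$v: interpreter not installed"
  fi
done
```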
========================================================================================
4. Compile the project
$ cd jetson-inference/build # omit if pwd is already jetson-inference/build from above
$ make
$ sudo make install
II. Segmentation model inference with jetson-inference on the TX2:
1. Where to place the trained model folder:
Put it under xxx/jetson-inference/build/aarch64/bin/
For example, my model folder is:
/home/ljm/jetson-inference/build/aarch64/bin/20190625-152849-e5e9_epoch_16.0
2. About the network
TensorRT already supports the key layers in FCN-AlexNet, so keep the deconv and crop layers at the end of deploy.prototxt.
The last layer of the network shows that the output blob is named: score
3. Segmentation inference test: run the following
$ cd ~/jetson-inference/build/aarch64/bin
$ NET=20190625-152849-e5e9_epoch_16.0
$ ./segnet-console 11.png output_11.png \
--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_54680.caffemodel \
--labels=$NET/label_names.txt \
--colors=$NET/color_map.txt \
--input_blob=data \
--output_blob=score
4. Log output from the inference test:
nvidia@tegra-ubuntu:~/jetson-inference/build/aarch64/bin$ ./segnet-console \
11.png output_11.png \
--prototxt=$NET/deploy.prototxt \
--model=$NET/snapshot_iter_54680.caffemodel \
--labels=$NET/label_names.txt \
--colors=$NET/color_map.txt \
--input_blob=data \
--output_blob=score
segnet-console
args (9): 0 [./segnet-console]
1 [11.png]
2 [output_11.png]
3 [--prototxt=20190625-152849-e5e9_epoch_16.0/deploy.prototxt]
4 [--model=20190625-152849-e5e9_epoch_16.0/snapshot_iter_54680.caffemodel]
5 [--labels=20190625-152849-e5e9_epoch_16.0/label_names.txt]
6 [--colors=20190625-152849-e5e9_epoch_16.0/color_map.txt]
7 [--input_blob=data]
8 [--output_blob=score]
segNet -- loading segmentation network model from:
-- prototxt: 20190625-152849-e5e9_epoch_16.0/deploy.prototxt
-- model: 20190625-152849-e5e9_epoch_16.0/snapshot_iter_54680.caffemodel
-- labels: 20190625-152849-e5e9_epoch_16.0/label_names.txt
-- colors: 20190625-152849-e5e9_epoch_16.0/color_map.txt
-- input_blob 'data'
-- output_blob 'score'
-- batch_size 2
-- precision 2
[TRT] TensorRT version 4.0.2
[TRT] detected model format - caffe (extension '.caffemodel')
[TRT] desired precision specified for GPU: FP32
[TRT] native precisions detected for GPU: FP32, FP16
[TRT] attempting to open engine cache file 20190625-152849-e5e9_epoch_16.0/snapshot_iter_54680.caffemodel.2.1.GPU.FP32.engine
[TRT] loading network profile from engine cache... 20190625-152849-e5e9_epoch_16.0/snapshot_iter_54680.caffemodel.2.1.GPU.FP32.engine
[TRT] device GPU, 20190625-152849-e5e9_epoch_16.0/snapshot_iter_54680.caffemodel loaded
[TRT] device GPU, CUDA engine context initialized with 2 bindings
[TRT] binding -- index 0
-- name 'data'
-- type FP32
-- in/out INPUT
-- # dims 3
-- dim #0 3 (CHANNEL)
-- dim #1 480 (SPATIAL)
-- dim #2 640 (SPATIAL)
[TRT] binding -- index 1
-- name 'score'
-- type FP32
-- in/out OUTPUT
-- # dims 3
-- dim #0 21 (CHANNEL)
-- dim #1 480 (SPATIAL)
-- dim #2 640 (SPATIAL)
[TRT] binding to input 0 data binding index: 0
[TRT] binding to input 0 data dims (b=2 c=3 h=480 w=640) size=7372800
[cuda] cudaAllocMapped 7372800 bytes, CPU 0x101540000 GPU 0x101540000
[TRT] binding to output 0 score binding index: 1
[TRT] binding to output 0 score dims (b=2 c=21 h=480 w=640) size=51609600
[cuda] cudaAllocMapped 51609600 bytes, CPU 0x101c50000 GPU 0x101c50000
device GPU, 20190625-152849-e5e9_epoch_16.0/snapshot_iter_54680.caffemodel initialized.
[cuda] cudaAllocMapped 336 bytes, CPU 0x101340200 GPU 0x101340200
[TRT] segNet outputs -- s_w 640 s_h 480 s_c 21
[cuda] cudaAllocMapped 307200 bytes, CPU 0x104d90000 GPU 0x104d90000
segNet -- class 00 color 0 0 0 255
segNet -- class 01 color 0 255 255 255
segNet -- loaded 2 class colors
segNet -- class 00 label '_background_'
segNet -- class 01 label 'water'
segNet -- loaded 2 class labels
loaded image 11.png (640 x 480) 4915200 bytes
[cuda] cudaAllocMapped 4915200 bytes, CPU 0x104f90000 GPU 0x104f90000
[cuda] cudaAllocMapped 4915200 bytes, CPU 0x105440000 GPU 0x105440000
segnet-console: beginning processing (1561975363278)
[TRT] layer shift - 1.396928 ms
[TRT] layer conv1 + relu1 - 76.159103 ms
[TRT] layer pool1 - 1.299744 ms
[TRT] layer norm1 - 0.325120 ms
[TRT] layer conv2 + relu2 - 21.611776 ms
[TRT] layer pool2 - 0.905824 ms
[TRT] layer norm2 - 0.253536 ms
[TRT] layer conv3 + relu3 - 13.983904 ms
[TRT] layer conv4 + relu4 - 10.975136 ms
[TRT] layer conv5 + relu5 - 6.627584 ms
[TRT] layer pool5 - 0.246560 ms
[TRT] layer fc6 + relu6 - 72.884415 ms
[TRT] layer fc7 + relu7 - 25.698463 ms
[TRT] layer score_fr - 0.800960 ms
[TRT] layer upscore - 1590.975830 ms
[TRT] layer score - 1.977600 ms
[TRT] layer network time - 1826.122437 ms
segnet-console: finished processing (1561975366445)
segnet-console: completed saving 'output_11.png'
shutting down...
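The cudaAllocMapped sizes in the log can be sanity-checked by hand: bytes = batch x channels x height x width x 4 bytes per FP32 element.

```shell
# Recompute the mapped buffer sizes reported in the log above.
echo $(( 2 * 3  * 480 * 640 * 4 ))   # input 'data':   prints 7372800
echo $(( 2 * 21 * 480 * 640 * 4 ))   # output 'score': prints 51609600
```

Note also that the upscore deconvolution layer accounts for 1590.98 ms of the 1826.12 ms total network time, so it dwarfs every other layer.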
5. Test result:
6. Changing inference precision from FP16 to FP32:
6.1 Available precision options: see tensorNet.h, line 67
enum precisionType
{
TYPE_DISABLED = 0, /**< Unknown, unspecified, or disabled type */
TYPE_FASTEST, /**< The fastest detected precision should be use (i.e. try INT8, then FP16, then FP32) */
TYPE_FP32, /**< 32-bit floating-point precision (FP32) */
TYPE_FP16, /**< 16-bit floating-point half precision (FP16) */
TYPE_INT8, /**< 8-bit integer precision (INT8) */
NUM_PRECISIONS /**< Number of precision types defined */
};
6.2 In segNet.h, change the default precision in two places to precisionType precision=TYPE_FP32:
static segNet* Create( NetworkType networkType=FCN_ALEXNET_CITYSCAPES_SD, uint32_t maxBatchSize=2,
precisionType precision=TYPE_FP32, deviceType device=DEVICE_GPU, bool allowGPUFallback=true ); // changed from TYPE_FASTEST to TYPE_FP32
/**
* Load a new network instance
* @param prototxt_path File path to the deployable network prototxt
* @param model_path File path to the caffemodel
* @param class_labels File path to list of class name labels
* @param class_colors File path to list of class colors
* @param input Name of the input layer blob. @see SEGNET_DEFAULT_INPUT
* @param output Name of the output layer blob. @see SEGNET_DEFAULT_OUTPUT
* @param maxBatchSize The maximum batch size that the network will support and be optimized for.
*/
static segNet* Create( const char* prototxt_path, const char* model_path,
const char* class_labels, const char* class_colors=NULL,
const char* input = SEGNET_DEFAULT_INPUT,
const char* output = SEGNET_DEFAULT_OUTPUT,
uint32_t maxBatchSize=2, precisionType precision=TYPE_FP32, // changed from TYPE_FASTEST to TYPE_FP32
deviceType device=DEVICE_GPU, bool allowGPUFallback=true );