Building torchvision 0.5.0 + libtorch 1.4.0 on Windows (CMake 3.17, VS2017, CUDA 10.1, cuDNN 7.6.5, Python 3.6.5)
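1. Build environment
- Install the pthread library beforehand. This guide uses pthread-w32-2-9-1-release; for installation steps see https://blog.youkuaiyun.com/June_Xixi/article/details/83214450
- Install Python beforehand, including its debug binaries: a debug build links against pythonXY_d.lib. This guide uses Python 3.6.
2. Modify CMakeLists.txt
- Add the .cu files to the build. The original reads: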
set(CMAKE_CXX_STANDARD 14)
find_package(Torch REQUIRED)
file(GLOB HEADERS torchvision/csrc/vision.h)
file(GLOB MODELS_HEADERS torchvision/csrc/models/*.h)
file(GLOB MODELS_SOURCES torchvision/csrc/models/*.h torchvision/csrc/models/*.cpp)
add_library (${PROJECT_NAME} SHARED ${MODELS_SOURCES})
target_link_libraries(${PROJECT_NAME} PUBLIC "${TORCH_LIBRARIES}")
Change it to:
set(CMAKE_CXX_STANDARD 14)
set(TORCHVISION_VERSION 0.5.0)
option(WITH_CUDA "Enable CUDA support" ON)
if(WITH_CUDA)
  enable_language(CUDA)
endif()
find_package(Torch REQUIRED)
file(GLOB HEADERS torchvision/csrc/*.h)
file(GLOB OPERATOR_SOURCES torchvision/csrc/cpu/*.h torchvision/csrc/cpu/*.cpp torchvision/csrc/*.cpp)
if(WITH_CUDA)
  file(GLOB OPERATOR_SOURCES ${OPERATOR_SOURCES} torchvision/csrc/cuda/*.h torchvision/csrc/cuda/*.cu)
endif()
file(GLOB MODELS_HEADERS torchvision/csrc/models/*.h)
file(GLOB MODELS_SOURCES torchvision/csrc/models/*.h torchvision/csrc/models/*.cpp)
add_library (${PROJECT_NAME} SHARED ${MODELS_SOURCES} ${OPERATOR_SOURCES})
target_link_libraries(${PROJECT_NAME} PUBLIC "${TORCH_LIBRARIES}")
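3. CMake configure
- CMake may report that Torch_DIR cannot be found: manually set it to the directory containing TorchConfig.cmake, e.g. <libtorch-win-shared-with-deps-debug-1.4.0>|<libtorch-win-shared-with-deps-1.4.0>\libtorch\share\cmake\Torch. Pick the debug or release package to match your build configuration.
- CMake may fail to detect the cuDNN version: manually edit <libtorch_install_path>\libtorch\share\cmake\Caffe2\public\cuda.cmake, where the original reads: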
if(NOT CUDNN_VERSION_MAJOR)
  set(CUDNN_VERSION "?")
else()
  set(CUDNN_VERSION
      "${CUDNN_VERSION_MAJOR}.${CUDNN_VERSION_MINOR}.${CUDNN_VERSION_PATCH}")
endif()
Replace it with your own cuDNN version (cuDNN 7.6.5 is used here):
set(CUDNN_VERSION "7.6.5")
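The Torch_DIR fix above can also be pinned in the CMakeLists.txt itself rather than typed into the CMake GUI. The following is only a sketch: LIBTORCH_ROOT and the path assigned to it are assumptions made for illustration, so substitute the location where you extracted the debug or release libtorch package.

```cmake
# Assumed location of the extracted libtorch package (hypothetical path).
set(LIBTORCH_ROOT "C:/libs/libtorch-win-shared-with-deps-1.4.0/libtorch"
    CACHE PATH "Root of the extracted libtorch package")

# Torch_DIR must name the folder that contains TorchConfig.cmake.
set(Torch_DIR "${LIBTORCH_ROOT}/share/cmake/Torch"
    CACHE PATH "Directory containing TorchConfig.cmake")

# Fail early with a readable message if the path is wrong.
if(NOT EXISTS "${Torch_DIR}/TorchConfig.cmake")
  message(FATAL_ERROR "TorchConfig.cmake not found in ${Torch_DIR}")
endif()

find_package(Torch REQUIRED)
```

Because both values are cache variables, a value already entered in the CMake GUI or passed with -D takes precedence over these defaults.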
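4. Building with VS2017
1) Project file changes
- Remove all additional NVCC options, because CUDA 10.1 does not support the -openmp flag among them. Delete the following from the vcxproj: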
<AdditionalOptions>%(AdditionalOptions) /Z7 /EHa /wd4267 /wd4251 /wd4522 /wd4838 /wd4305 /wd4244 /wd4190 /wd4101 /wd4996 /wd4275 /bigobj -openmp -Xcompiler="/EHsc -Zi -Ob0"</AdditionalOptions>
- Exclude vision.cpp from compilation. Delete the following from the vcxproj:
<ClCompile Include="<vision-0.5.0 path>\vision-0.5.0\torchvision\csrc\vision.cpp" />
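2) Source code changes
- Add DLL export declarations to the functions in vision_cpu.h and vision_cuda.h
- Some of the _cpu.cpp/_cuda.cu implementations of layers such as ROIAlign do not include vision_cpu.h or vision_cuda.h; add the include to every one of them
- vision_cuda.h includes torch/extension.h, which makes nvcc fail with an error about a static member variable being initialized in a header. Change
#include <torch/extension.h>
to #include <torch/cuda.h>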
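The notes above do not say how the export declarations were added. One possible approach, shown here purely as a sketch, is to let CMake's GenerateExportHeader module produce the export macro. The macro name VISION_API and the file name vision_export.h are hypothetical and do not appear in torchvision 0.5.0.

```cmake
# Sketch only: VISION_API and vision_export.h are hypothetical names.
include(GenerateExportHeader)

# Must come after add_library(${PROJECT_NAME} SHARED ...).
generate_export_header(${PROJECT_NAME}
  EXPORT_MACRO_NAME VISION_API
  EXPORT_FILE_NAME "${CMAKE_CURRENT_BINARY_DIR}/vision_export.h")

# Let the library sources (and consumers) find the generated header.
target_include_directories(${PROJECT_NAME} PUBLIC "${CMAKE_CURRENT_BINARY_DIR}")
```

With this in place, vision_cpu.h and vision_cuda.h would include vision_export.h and prefix each function declaration with VISION_API. Under MSVC the generated macro expands to __declspec(dllexport) while the library itself is being built and to __declspec(dllimport) for consumers. It is worth checking that the .cu files also pick up the macro correctly, and the generated header has to ship with the other headers.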
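3) Build settings changes
- Add WITH_CUDA to the preprocessor definitions
- Add the directory containing Python.h to Additional Include Directories: <python_install_dir>\include
- Add the Python library to Additional Dependencies (release vs. debug): <python_install_dir>\libs\<python36.lib | python36_d.lib>
- nvcc may fail on the CUDA sources with: more than one operator "==" matches these operands…
See https://blog.youkuaiyun.com/longma666666/article/details/81190065 and add __CUDA_NO_HALF_OPERATORS__ to the nvcc preprocessor definitions
- Sometimes the linker cannot find torch.lib; in that case enter the full path to torch.lib under Properties -> Linker -> Input
- Add support for as many GPU architectures as possible, otherwise the library fails at run time on newer GPUs. Under CUDA C/C++ -> Code Generation, change compute_30,sm_30 to compute_30,sm_30;compute_35,sm_35;compute_37,sm_37;compute_50,sm_50;compute_52,sm_52;compute_60,sm_60;compute_61,sm_61;compute_70,sm_70;compute_75,sm_75;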
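Most of the settings in this step can also be expressed in CMakeLists.txt, so they survive regenerating the Visual Studio project. The sketch below is not what the original author did (the notes edit the project properties by hand). PYTHON_ROOT and its default path are assumptions introduced for illustration.

```cmake
# --- Place before add_library(): global nvcc flags ---------------------------
# Works around the ambiguous operator== error on half types.
string(APPEND CMAKE_CUDA_FLAGS " -D__CUDA_NO_HALF_OPERATORS__")

# One -gencode entry per architecture listed in the step above.
foreach(arch 30 35 37 50 52 60 61 70 75)
  string(APPEND CMAKE_CUDA_FLAGS " -gencode arch=compute_${arch},code=sm_${arch}")
endforeach()

# --- Place after add_library(): per-target settings --------------------------
# Hypothetical Python install location; adjust to your machine.
set(PYTHON_ROOT "C:/Python36" CACHE PATH "Python installation directory")

target_compile_definitions(${PROJECT_NAME} PRIVATE WITH_CUDA)
target_include_directories(${PROJECT_NAME} PRIVATE "${PYTHON_ROOT}/include")

# Link the debug or release Python import library to match the configuration.
target_link_libraries(${PROJECT_NAME} PRIVATE
  debug     "${PYTHON_ROOT}/libs/python36_d.lib"
  optimized "${PYTHON_ROOT}/libs/python36.lib")
```

CMake 3.17 has no CMAKE_CUDA_ARCHITECTURES variable (it arrived in 3.18), which is why the architectures are appended to CMAKE_CUDA_FLAGS here.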
5. Assemble the release package
- Header files:
In <vision-0.5.0 path>\vision-0.5.0/torchvision/csrc/cpu, keep only vision_cpu.h;
In <vision-0.5.0 path>\vision-0.5.0/torchvision/csrc/cuda, keep only vision_cuda.h;
In <vision-0.5.0 path>\vision-0.5.0/torchvision/csrc/models, keep only the .h files;
In <vision-0.5.0 path>\vision-0.5.0/torchvision/csrc itself, keep every file.
Copy all directories and files of the trimmed <vision-0.5.0 path>\vision-0.5.0/torchvision/csrc to <vision_install_path>/include/torchvision
- lib/dll files: place them wherever you like
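The same layout could be produced with install rules instead of copying files by hand. This is a sketch under assumptions: CMAKE_INSTALL_PREFIX plays the role of <vision_install_path>, and the bin and lib destination names are arbitrary choices, since the notes leave the lib/dll location open.

```cmake
# DLL goes to bin, import library (.lib) goes to lib; both names are assumptions.
install(TARGETS ${PROJECT_NAME}
  RUNTIME DESTINATION bin
  ARCHIVE DESTINATION lib)

# Every file directly under csrc is kept, including vision.cpp.
file(GLOB CSRC_TOP_LEVEL torchvision/csrc/*.h torchvision/csrc/*.cpp)
install(FILES ${CSRC_TOP_LEVEL} DESTINATION include/torchvision)

# Only the umbrella headers from cpu/ and cuda/.
install(FILES torchvision/csrc/cpu/vision_cpu.h
  DESTINATION include/torchvision/cpu)
install(FILES torchvision/csrc/cuda/vision_cuda.h
  DESTINATION include/torchvision/cuda)

# Only the headers from models/.
file(GLOB MODEL_HEADER_FILES torchvision/csrc/models/*.h)
install(FILES ${MODEL_HEADER_FILES} DESTINATION include/torchvision/models)
```

Building the INSTALL project in Visual Studio would then populate the prefix directory.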
6. Using the library in your own project
- Add to the include directories:
<python_install_dir>\include
<vision_install_path>/include/
<vision_install_path>/include\torch\csrc\api\include
- Add to the library directories:
<python_install_dir>\libs
<vision_install_path>/<your libs dir>
- Add to Additional Dependencies:
python36_d.lib or python36.lib
torchvision.lib
- Add vision.cpp to your own project's source files
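For a consumer project driven by CMake rather than a hand-edited Visual Studio project, the settings above might translate roughly as follows. The project name my_app, the source file main.cpp, the lib subdirectory, and the default values of VISION_INSTALL_PATH and PYTHON_ROOT are all hypothetical. Linking against ${TORCH_LIBRARIES} is an assumption too: the notes do not list it, but the consumer needs libtorch as well.

```cmake
cmake_minimum_required(VERSION 3.17)
project(my_app LANGUAGES CXX)  # hypothetical project name
set(CMAKE_CXX_STANDARD 14)

# Hypothetical locations; adjust to your machine.
set(VISION_INSTALL_PATH "C:/libs/torchvision" CACHE PATH "torchvision release package")
set(PYTHON_ROOT "C:/Python36" CACHE PATH "Python installation directory")

find_package(Torch REQUIRED)

# vision.cpp is compiled into the consumer, as the last step above requires.
add_executable(my_app
  main.cpp
  "${VISION_INSTALL_PATH}/include/torchvision/vision.cpp")

target_include_directories(my_app PRIVATE
  "${PYTHON_ROOT}/include"
  "${VISION_INSTALL_PATH}/include"
  "${VISION_INSTALL_PATH}/include/torch/csrc/api/include")

# Full paths avoid a separate library-directory setting.
target_link_libraries(my_app PRIVATE
  "${TORCH_LIBRARIES}"
  "${VISION_INSTALL_PATH}/lib/torchvision.lib"
  debug     "${PYTHON_ROOT}/libs/python36_d.lib"
  optimized "${PYTHON_ROOT}/libs/python36.lib")
```

At run time the torchvision and libtorch DLLs still have to be on PATH or next to the executable.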