caffe源码分析--math_functions.cu代码研究

本文深入解析CUDA宏定义CUDA_KERNEL_LOOP的使用方式,并介绍如何利用这一技巧实现一个向量点积运算的CUDA核函数。讨论了当向量维数超过线程总数时,如何通过循环使单个线程处理多个元素。
部署运行你感兴趣的模型镜像



其中用到一个宏定义CUDA_KERNEL_LOOP

common.hpp中有。


#defineCUDA_KERNEL_LOOP(i,n) \

for(inti = blockIdx.x * blockDim.x + threadIdx.x; \

i < (n); \

i +=blockDim.x * gridDim.x)



先看看caffe采取的线程格和线程块的维数设计,

还是从common.hpp可以看到

CAFFE_CUDA_NUM_THREADS

CAFFE_GET_BLOCKS(constintN)

明显都是一维的。


整理一下CUDA_KERNEL_LOOP格式看看,

for(inti = blockIdx.x * blockDim.x + threadIdx.x;

i< (n);

i+= blockDim.x * gridDim.x)

blockDim.x* gridDim.x表示的是该线程格所有线程的数量。

n表示核函数总共要处理的元素个数。

有时候,n会大于blockDim.x* gridDim.x,因此并不能一个线程处理一个元素。

由此通过上面的方法,让一个线程串行(for循环)处理几个元素。

这其实是常用的伎俩,得借鉴学习一下。




再来看一下这个核函数的实现。


template<typename Dtype>

__global__void mul_kernel(const int n, const Dtype* a,

constDtype* b, Dtype* y)

{

CUDA_KERNEL_LOOP(index,n)

{

y[index]= a[index] * b[index];

}

}


明显就是算两个向量的点积了。

由于向量的维数可能大于该kernel函数线程格的总线程数量。

因此有些线程可以要串行处理几个元素。





您可能感兴趣的与本文相关的镜像

PyTorch 2.5

PyTorch 2.5

PyTorch
Cuda

PyTorch 是一个开源的 Python 机器学习库,基于 Torch 库,底层由 C++ 实现,应用于人工智能领域,如计算机视觉和自然语言处理

-- Build files have been written to: /root/Workspace/openpose/build/caffe/src/openpose_lib-build [ 75%] Performing build step for 'openpose_lib' [ 1%] Running C++/Python protocol buffer compiler on /root/Workspace/openpose/3rdparty/caffe/src/caffe/proto/caffe.proto [ 1%] Building CXX object src/caffe/CMakeFiles/caffeproto.dir/__/__/include/caffe/proto/caffe.pb.cc.o [ 1%] Linking CXX static library ../../lib/libcaffeproto.a [ 1%] Built target caffeproto [ 1%] Building NVCC (Device) object src/caffe/CMakeFiles/cuda_compile_1.dir/util/cuda_compile_1_generated_math_functions.cu.o [ 1%] Building NVCC (Device) object src/caffe/CMakeFiles/cuda_compile_1.dir/layers/cuda_compile_1_generated_absval_layer.cu.o [ 2%] Building NVCC (Device) object src/caffe/CMakeFiles/cuda_compile_1.dir/layers/cuda_compile_1_generated_base_data_layer.cu.o [ 2%] Building NVCC (Device) object src/caffe/CMakeFiles/cuda_compile_1.dir/layers/cuda_compile_1_generated_accuracy_layer.cu.o [ 2%] Building NVCC (Device) object src/caffe/CMakeFiles/cuda_compile_1.dir/layers/cuda_compile_1_generated_batch_norm_layer.cu.o [ 2%] Building NVCC (Device) object src/caffe/CMakeFiles/cuda_compile_1.dir/layers/cuda_compile_1_generated_batch_reindex_layer.cu.o [ 4%] Building NVCC (Device) object src/caffe/CMakeFiles/cuda_compile_1.dir/layers/cuda_compile_1_generated_bnll_layer.cu.o [ 4%] Building NVCC (Device) object src/caffe/CMakeFiles/cuda_compile_1.dir/layers/cuda_compile_1_generated_bias_layer.cu.o In file included from /root/Workspace/openpose/3rdparty/caffe/src/caffe/util/math_functions.cu:1: /usr/local/cuda-11.8/include/math_functions.h:54:2: warning: #warning "math_functions.h is an internal header file and must not be used directly. This file will be removed in a future CUDA release. Please use cuda_runtime_api.h or cuda_runtime.h instead." [-Wcpp] 54 | #warning "math_functions.h is an internal header file and must not be used directly. This file will be removed in a future CUDA release.
03-15
好像卡住了”(.venv) PS E:\PyTorch_Build\pytorch> # 1. 准备工作空间 (.venv) PS E:\PyTorch_Build\pytorch> $sourceDir = "E:\PyTorch_Build\pytorch" (.venv) PS E:\PyTorch_Build\pytorch> $buildDir = "$sourceDir\build" (.venv) PS E:\PyTorch_Build\pytorch> $installDir = "$sourceDir\install" (.venv) PS E:\PyTorch_Build\pytorch> (.venv) PS E:\PyTorch_Build\pytorch> # 清理工作区 (.venv) PS E:\PyTorch_Build\pytorch> Remove-Item -Path $buildDir -Recurse -Force -ErrorAction SilentlyContinue (.venv) PS E:\PyTorch_Build\pytorch> Remove-Item -Path $installDir -Recurse -Force -ErrorAction SilentlyContinue (.venv) PS E:\PyTorch_Build\pytorch> New-Item -Path $buildDir -ItemType Directory -Force | Out-Null (.venv) PS E:\PyTorch_Build\pytorch> New-Item -Path $installDir -ItemType Directory -Force | Out-Null (.venv) PS E:\PyTorch_Build\pytorch> (.venv) PS E:\PyTorch_Build\pytorch> # 2. 设置环境变量 (.venv) PS E:\PyTorch_Build\pytorch> $env:Path = "C:\Program Files\CMake\bin;E:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin;E:\Python310;$env:Path" (.venv) PS E:\PyTorch_Build\pytorch> $env:CUDA_PATH = "E:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0" (.venv) PS E:\PyTorch_Build\pytorch> (.venv) PS E:\PyTorch_Build\pytorch> # 3. 安装正确的 CMake 版本 (.venv) PS E:\PyTorch_Build\pytorch> Write-Host "安装 CMake 3.28..." 安装 CMake 3.28... (.venv) PS E:\PyTorch_Build\pytorch> Invoke-WebRequest -Uri "https://github.com/Kitware/CMake/releases/download/v3.28.3/cmake-3.28.3-windows-x86_64.msi" -OutFile "$env:TEMP\cmake.msi" Invoke-WebRequest: One or more errors occurred. (The response ended prematurely. (ResponseEnded)) (.venv) PS E:\PyTorch_Build\pytorch> Start-Process msiexec -ArgumentList "/i `"$env:TEMP\cmake.msi`" /quiet" -Wait (.venv) PS E:\PyTorch_Build\pytorch> $env:Path = "C:\Program Files\CMake\bin;$env:Path" (.venv) PS E:\PyTorch_Build\pytorch> (.venv) PS E:\PyTorch_Build\pytorch> # 4. 生成 CMake 配置 (.venv) PS E:\PyTorch_Build\pytorch> Set-Location $buildDir (.venv) PS E:\PyTorch_Build\pytorch\build> (.venv) PS E:\PyTorch_Build\pytorch\build> $cmakeCommand = @" >> cmake $sourceDir ` >> -G "Visual Studio 17 2022" -A x64 ` >> -DCMAKE_INSTALL_PREFIX="$installDir" ` >> -DCMAKE_TOOLCHAIN_FILE="$sourceDir/cmake/modules/win_toolchain.cmake" ` >> -DBUILD_PYTHON=ON ` >> -DPython_EXECUTABLE="E:\Python310\python.exe" ` >> -DUSE_CUDA=ON ` >> -DUSE_CUDNN=ON ` >> -DCUDNN_INCLUDE_DIR="E:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" ` >> -DCUDNN_LIBRARY="E:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\lib\x64\cudnn.lib" ` >> -DUSE_MKLDNN=OFF ` >> -DUSE_DNNL=OFF ` >> -DUSE_STATIC_OPENMP=ON ` >> -DBUILD_TEST=OFF ` >> -DBUILD_TORCH=ON ` >> -DCMAKE_BUILD_TYPE=Release >> "@ (.venv) PS E:\PyTorch_Build\pytorch\build> (.venv) PS E:\PyTorch_Build\pytorch\build> Invoke-Expression $cmakeCommand 2>&1 | Tee-Object -FilePath "$buildDir\cmake_configure.log" -- Building for: Visual Studio 17 2022 -- Selecting Windows SDK version 10.0.26100.0 to target Windows . -- The CXX compiler identification is MSVC 19.44.35215.0 -- The C compiler identification is MSVC 19.44.35215.0 -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe - skipped -- Detecting C compile features -- Detecting C compile features - done -- Not forcing any particular BLAS to be found -- Build type not set - defaulting to Release -- Performing Test C_HAS_AVX_1 -- Performing Test C_HAS_AVX_1 - Success -- Performing Test C_HAS_AVX2_1 -- Performing Test C_HAS_AVX2_1 - Success -- Performing Test C_HAS_AVX512_1 -- Performing Test C_HAS_AVX512_1 - Success -- Performing Test CXX_HAS_AVX_1 -- Performing Test CXX_HAS_AVX_1 - Success -- Performing Test CXX_HAS_AVX2_1 -- Performing Test CXX_HAS_AVX2_1 - Success -- Performing Test CXX_HAS_AVX512_1 -- Performing Test CXX_HAS_AVX512_1 - Success -- Current compiler supports avx2 extension. Will build perfkernels. -- Performing Test COMPILER_SUPPORTS_HIDDEN_VISIBILITY -- Performing Test COMPILER_SUPPORTS_HIDDEN_VISIBILITY - Failed -- Performing Test COMPILER_SUPPORTS_HIDDEN_INLINE_VISIBILITY -- Performing Test COMPILER_SUPPORTS_HIDDEN_INLINE_VISIBILITY - Failed -- Could not find hardware support for NEON on this machine. -- No OMAP3 processor on this machine. -- No OMAP4 processor on this machine. -- Compiler does not support SVE extension. Will not build perfkernels. -- Performing Test HAS/UTF_8 -- Performing Test HAS/UTF_8 - Success -- Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR) (found version "13.0") -- Building using own protobuf under third_party per request. -- Use custom protobuf build. -- -- 3.13.0.0 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed -- Looking for pthread_create in pthreads -- Looking for pthread_create in pthreads - not found -- Looking for pthread_create in pthread -- Looking for pthread_create in pthread - not found -- Found Threads: TRUE -- Caffe2 protobuf include directory: $<BUILD_INTERFACE:E:/PyTorch_Build/pytorch/third_party/protobuf/src>$<INSTALL_INTERFACE:include> -- Trying to find preferred BLAS backend of choice: MKL -- MKL_THREADING = OMP -- Looking for sys/types.h -- Looking for sys/types.h - found -- Looking for stdint.h -- Looking for stdint.h - found -- Looking for stddef.h -- Looking for stddef.h - found -- Check size of void* -- Check size of void* - done -- MKL_THREADING = OMP -- MKL_THREADING = OMP -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_sequential - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_core - libiomp5md - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - libiomp5md - pthread] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_core - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - pthread] -- Library mkl_intel: not found -- Checking for [mkl - guide - pthread - m] -- Library mkl: not found -- MKL library not found -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Checking for [Accelerate] -- Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND -- Checking for [vecLib] -- Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND -- Checking for [flexiblas] -- Library flexiblas: BLAS_flexiblas_LIBRARY-NOTFOUND -- Checking for [openblas] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m - gomp] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [libopenblas] -- Library libopenblas: BLAS_libopenblas_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran - pthread] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [acml - gfortran] -- Library acml: BLAS_acml_LIBRARY-NOTFOUND -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Could NOT find Atlas (missing: Atlas_CBLAS_INCLUDE_DIR Atlas_CLAPACK_INCLUDE_DIR Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY Atlas_LAPACK_LIBRARY) -- Checking for [ptf77blas - atlas - gfortran] -- Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND -- Checking for [] -- Looking for sgemm_ -- Looking for sgemm_ - not found -- Cannot find a library with BLAS API. Not using BLAS. -- Using pocketfft in directory: E:/PyTorch_Build/pytorch/third_party/pocketfft/ -- Using third party subdirectory Eigen. -- Found Python: E:/Python310/python.exe (found version "3.10.10") found components: Interpreter Development.Module NumPy -- Using third_party/pybind11. -- pybind11 include dirs: E:/PyTorch_Build/pytorch/cmake/../third_party/pybind11/include -- Could NOT find OpenTelemetryApi (missing: OpenTelemetryApi_INCLUDE_DIRS) -- Using third_party/opentelemetry-cpp. -- opentelemetry api include dirs: E:/PyTorch_Build/pytorch/cmake/../third_party/opentelemetry-cpp/api/include -- Could NOT find MPI_C (missing: MPI_C_LIB_NAMES MPI_C_HEADER_DIR MPI_C_WORKS) -- Could NOT find MPI_CXX (missing: MPI_CXX_LIB_NAMES MPI_CXX_HEADER_DIR MPI_CXX_WORKS) -- Could NOT find MPI (missing: MPI_C_FOUND MPI_CXX_FOUND) -- MKL_THREADING = OMP -- Check OMP with lib C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/lib/x64/libomp.lib and flags -openmp:experimental -- MKL_THREADING = OMP -- Check OMP with lib C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/lib/x64/libomp.lib and flags -openmp:experimental -- Found OpenMP_C: -openmp:experimental -- Found OpenMP_CXX: -openmp:experimental -- Found OpenMP: TRUE -- Adding OpenMP CXX_FLAGS: -openmp:experimental -- Will link against OpenMP libraries: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/lib/x64/libomp.lib -- Found nvtx3: E:/PyTorch_Build/pytorch/third_party/NVTX/c/include -- ROCM_PATH environment variable is not set and C:/opt/rocm does not exist. Building without ROCm support. -- Found Python3: E:/Python310/python.exe (found version "3.10.10") found components: Interpreter -- ONNX_PROTOC_EXECUTABLE: $<TARGET_FILE:protobuf::protoc> -- Protobuf_VERSION: Protobuf_VERSION_NOTFOUND -- -- ******** Summary ******** -- CMake version : 4.1.0 -- CMake command : E:/Python310/Lib/site-packages/cmake/data/bin/cmake.exe -- System : Windows -- C++ compiler : C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe -- C++ compiler version : 19.44.35215.0 -- CXX flags : /DWIN32 /D_WINDOWS /EHsc /Zc:__cplusplus /MP /bigobj /FS /utf-8 /EHsc /wd26812 -- Build type : Release -- Compile definitions : ONNX_ML=1;ONNXIFI_ENABLE_EXT=1 -- CMAKE_PREFIX_PATH : -- CMAKE_INSTALL_PREFIX : C:/Program Files (x86)/Torch -- CMAKE_MODULE_PATH : E:/PyTorch_Build/pytorch/cmake/Modules;E:/PyTorch_Build/pytorch/cmake/public/../Modules_CUDA_fix -- -- ONNX version : 1.18.0 -- ONNX NAMESPACE : onnx_torch -- ONNX_USE_LITE_PROTO : OFF -- USE_PROTOBUF_SHARED_LIBS : OFF -- ONNX_DISABLE_EXCEPTIONS : OFF -- ONNX_DISABLE_STATIC_REGISTRATION : OFF -- ONNX_WERROR : OFF -- ONNX_BUILD_TESTS : OFF -- BUILD_SHARED_LIBS : OFF -- -- Protobuf compiler : $<TARGET_FILE:protobuf::protoc> -- Protobuf includes : -- Protobuf libraries : -- ONNX_BUILD_PYTHON : OFF -- Found CUDA with FP16 support, compiling with torch.cuda.HalfTensor -- Adding -DNDEBUG to compile flags -- Could not find hardware support for NEON on this machine. -- No OMAP3 processor on this machine. -- No OMAP4 processor on this machine. -- MKL_THREADING = OMP -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_sequential - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_core - libiomp5md - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - libiomp5md - pthread] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_core - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - pthread] -- Library mkl_intel: not found -- Checking for [mkl - guide - pthread - m] -- Library mkl: not found -- MKL library not found -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Checking for [Accelerate] -- Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND -- Checking for [vecLib] -- Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND -- Checking for [flexiblas] -- Library flexiblas: BLAS_flexiblas_LIBRARY-NOTFOUND -- Checking for [openblas] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m - gomp] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [libopenblas] -- Library libopenblas: BLAS_libopenblas_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran - pthread] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [acml - gfortran] -- Library acml: BLAS_acml_LIBRARY-NOTFOUND -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Could NOT find Atlas (missing: Atlas_CBLAS_INCLUDE_DIR Atlas_CLAPACK_INCLUDE_DIR Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY Atlas_LAPACK_LIBRARY) -- Checking for [ptf77blas - atlas - gfortran] -- Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND -- Checking for [] -- Cannot find a library with BLAS API. Not using BLAS. -- LAPACK requires BLAS -- Cannot find a library with LAPACK API. Not using LAPACK. -- MIOpen not found. Compiling without MIOpen support -- {fmt} version: 11.2.0 -- Build type: Release -- Using CPU-only version of Kineto -- Configuring Kineto dependency: -- KINETO_SOURCE_DIR = E:/PyTorch_Build/pytorch/third_party/kineto/libkineto -- KINETO_BUILD_TESTS = OFF -- KINETO_LIBRARY_TYPE = static -- Found PythonInterp: E:/Python310/python.exe (found version "3.10.10") -- CUDA_SOURCE_DIR = -- ROCM_SOURCE_DIR = -- CUPTI unavailable or disabled - not building GPU profilers -- Kineto: FMT_SOURCE_DIR = E:/PyTorch_Build/pytorch/third_party/fmt -- Kineto: FMT_INCLUDE_DIR = E:/PyTorch_Build/pytorch/third_party/fmt/include -- CUPTI_INCLUDE_DIR = /extras/CUPTI/include -- ROCTRACER_INCLUDE_DIR = /include/roctracer -- DYNOLOG_INCLUDE_DIR = E:/PyTorch_Build/pytorch/third_party/kineto/libkineto/third_party/dynolog/ -- IPCFABRIC_INCLUDE_DIR = E:/PyTorch_Build/pytorch/third_party/kineto/libkineto/third_party/dynolog//dynolog/src/ipcfabric/ -- Configured Kineto (CPU) -- Performing Test HAS/WD4624 -- Performing Test HAS/WD4624 - Success -- Performing Test HAS/WD4068 -- Performing Test HAS/WD4068 - Success -- Performing Test HAS/WD4067 -- Performing Test HAS/WD4067 - Success -- Performing Test HAS/WD4267 -- Performing Test HAS/WD4267 - Success -- Performing Test HAS/WD4661 -- Performing Test HAS/WD4661 - Success -- Performing Test HAS/WD4717 -- Performing Test HAS/WD4717 - Success -- Performing Test HAS/WD4244 -- Performing Test HAS/WD4244 - Success -- Performing Test HAS/WD4804 -- Performing Test HAS/WD4804 - Success -- Performing Test HAS/WD4273 -- Performing Test HAS/WD4273 - Success -- Performing Test HAS_WNO_STRINGOP_OVERFLOW -- Performing Test HAS_WNO_STRINGOP_OVERFLOW - Failed -- Found Git: E:/Program Files/Git/cmd/git.exe (found version "2.51.0.windows.1") -- -- Note: when building with Visual Studio the build type is specified when building. -- For example: 'cmake --build . --config=Release -- Architecture: -- Use the C++ compiler to compile (MI_USE_CXX=ON) -- -- Library name : mimalloc -- Version : 2.2.4 -- Build type : release -- C++ Compiler : C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe -- Compiler flags : /Zc:__cplusplus -- Compiler defines : MI_CMAKE_BUILD_TYPE=release;MI_BUILD_RELEASE -- Link libraries : psapi;shell32;user32;advapi32;bcrypt -- Build targets : static -- -- don't use NUMA -- Looking for backtrace -- Looking for backtrace - not found -- Could NOT find Backtrace (missing: Backtrace_LIBRARY Backtrace_INCLUDE_DIR) -- headers outputs: torch\csrc\inductor\aoti_torch\generated\c_shim_cuda.h not found torch\csrc\inductor\aoti_torch\generated\c_shim_cpu.h not found torch\csrc\inductor\aoti_torch\generated\c_shim_aten.h not found -- sources outputs: -- declarations_yaml outputs: -- Performing Test COMPILER_SUPPORTS_NO_AVX256_SPLIT -- Performing Test COMPILER_SUPPORTS_NO_AVX256_SPLIT - Failed -- Using ATen parallel backend: OMP -- Check size of long double -- Check size of long double - done -- Performing Test COMPILER_SUPPORTS_FLOAT128 -- Performing Test COMPILER_SUPPORTS_FLOAT128 - Failed -- Found OpenMP_C: -openmp:experimental (found version "2.0") -- Found OpenMP_CXX: -openmp:experimental (found version "2.0") -- Found OpenMP: TRUE (found version "2.0") -- Performing Test COMPILER_SUPPORTS_OPENMP -- Performing Test COMPILER_SUPPORTS_OPENMP - Success -- Performing Test COMPILER_SUPPORTS_OMP_SIMD -- Performing Test COMPILER_SUPPORTS_OMP_SIMD - Failed -- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES -- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES - Failed -- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH -- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH - Failed -- Performing Test COMPILER_SUPPORTS_SYS_GETRANDOM -- Performing Test COMPILER_SUPPORTS_SYS_GETRANDOM - Failed -- Configuring build for SLEEF-v3.8.0 -- Using option `/D_CRT_SECURE_NO_WARNINGS /D_CRT_NONSTDC_NO_DEPRECATE ` to compile libsleef -- Building shared libs : OFF -- Building static test bins: OFF -- MPFR : LIB_MPFR-NOTFOUND -- GMP : LIBGMP-NOTFOUND -- RT : -- FFTW3 : LIBFFTW3-NOTFOUND -- OPENSSL : -- SDE : SDE_COMMAND-NOTFOUND -- COMPILER_SUPPORTS_OPENMP : FALSE -- -- ******** Summary ******** -- General: -- CMake version : 4.1.0 -- CMake command : E:/Python310/Lib/site-packages/cmake/data/bin/cmake.exe -- System : Windows -- C++ compiler : C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe -- C++ compiler id : MSVC -- C++ compiler version : 19.44.35215.0 -- Using ccache if found : OFF -- CXX flags : /DWIN32 /D_WINDOWS /EHsc /Zc:__cplusplus /MP /bigobj /FS /utf-8 -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273 -- Shared LD flags : /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 -- Static LD flags : /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 -- Module LD flags : /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 -- Build type : Release -- Compile definitions : ONNX_ML=1;ONNXIFI_ENABLE_EXT=1;ONNX_NAMESPACE=onnx_torch;_CRT_SECURE_NO_DEPRECATE=1;USE_EXTERNAL_MZCRC;MINIZ_DISABLE_ZIP_READER_CRC32_CHECKS;EXPORT_AOTI_FUNCTIONS;WIN32_LEAN_AND_MEAN;_UCRT_LEGACY_INFINITY;NOMINMAX;USE_MIMALLOC -- CMAKE_PREFIX_PATH : -- CMAKE_INSTALL_PREFIX : C:/Program Files (x86)/Torch -- USE_GOLD_LINKER : OFF -- -- TORCH_VERSION : 2.9.0 -- BUILD_STATIC_RUNTIME_BENCHMARK: OFF -- BUILD_BINARY : OFF -- BUILD_CUSTOM_PROTOBUF : ON -- Link local protobuf : ON -- BUILD_PYTHON : ON -- Python version : 3.10.10 -- Python executable : E:/Python310/python.exe -- Python library : E:/Python310/libs/python310.lib -- Python includes : E:/Python310/include -- Python site-package : E:\Python310\Lib\site-packages -- BUILD_SHARED_LIBS : ON -- CAFFE2_USE_MSVC_STATIC_RUNTIME : OFF -- BUILD_TEST : OFF -- BUILD_JNI : OFF -- BUILD_MOBILE_AUTOGRAD : OFF -- BUILD_LITE_INTERPRETER: OFF -- INTERN_BUILD_MOBILE : -- TRACING_BASED : OFF -- USE_BLAS : 0 -- USE_LAPACK : 0 -- USE_ASAN : OFF -- USE_TSAN : OFF -- USE_CPP_CODE_COVERAGE : OFF -- USE_CUDA : OFF -- USE_XPU : OFF -- USE_ROCM : OFF -- BUILD_NVFUSER : -- USE_EIGEN_FOR_BLAS : ON -- USE_EIGEN_FOR_SPARSE : OFF -- USE_FBGEMM : OFF -- USE_KINETO : ON -- USE_GFLAGS : OFF -- USE_GLOG : OFF -- USE_LITE_PROTO : OFF -- USE_PYTORCH_METAL : OFF -- USE_PYTORCH_METAL_EXPORT : OFF -- USE_MPS : OFF -- CAN_COMPILE_METAL : -- USE_MKL : OFF -- USE_MKLDNN : OFF -- USE_UCC : OFF -- USE_ITT : OFF -- USE_XCCL : OFF -- USE_NCCL : OFF -- Found NVSHMEM : -- USE_NNPACK : OFF -- USE_NUMPY : ON -- USE_OBSERVERS : ON -- USE_OPENCL : OFF -- USE_OPENMP : ON -- USE_MIMALLOC : ON -- USE_MIMALLOC_ON_MKL : OFF -- USE_VULKAN : OFF -- USE_PROF : OFF -- USE_PYTORCH_QNNPACK : OFF -- USE_XNNPACK : OFF -- USE_DISTRIBUTED : OFF -- Public Dependencies : -- Private Dependencies : Threads::Threads;cpuinfo;fp16;caffe2::openmp;fmt::fmt-header-only;kineto -- Public CUDA Deps. : -- Private CUDA Deps. : -- USE_COREML_DELEGATE : OFF -- BUILD_LAZY_TS_BACKEND : ON -- USE_ROCM_KERNEL_ASSERT : OFF -- Performing Test HAS_WMISSING_PROTOTYPES -- Performing Test HAS_WMISSING_PROTOTYPES - Failed -- Performing Test HAS_WERROR_MISSING_PROTOTYPES -- Performing Test HAS_WERROR_MISSING_PROTOTYPES - Failed -- Configuring incomplete, errors occurred! (.venv) PS E:\PyTorch_Build\pytorch\build> (.venv) PS E:\PyTorch_Build\pytorch\build> # 5. 编译和安装 (.venv) PS E:\PyTorch_Build\pytorch\build> cmake --build . --config Release --target install -j 8 2>&1 | Tee-Object -FilePath "$buildDir\cmake_build.log" 閫傜敤浜?.NET Framework MSBuild 鐗堟湰 17.14.19+164abd434 MSBUILD : error MSB1009: 椤圭洰鏂囦欢涓嶅瓨鍦ㄣ€? 寮€鍏?install.vcxproj (.venv) PS E:\PyTorch_Build\pytorch\build> (.venv) PS E:\PyTorch_Build\pytorch\build> # 6. 安装 Python 绑定 (.venv) PS E:\PyTorch_Build\pytorch\build> Set-Location $sourceDir (.venv) PS E:\PyTorch_Build\pytorch> & "E:\Python310\python.exe" setup.py develop --cmake --no-deps Building wheel torch-2.9.0a0+git2d31c3d -- Building version 2.9.0a0+git2d31c3d -- Checkout nccl release tag: v2.27.5-1 cmake -GVisual Studio 17 2022 -Ax64 -Thost=x64 -DBUILD_ONLY_CORE=1 -DBUILD_PYTHON=True -DBUILD_SHARED_LIBS=OFF -DBUILD_TEST=True -DBUILD_TORCH=1 -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXE_LINKER_FLAGS=-static -DCMAKE_GENERATOR=Visual Studio 17 2022 -DCMAKE_INSTALL_PREFIX=E:\PyTorch_Build\pytorch\torch -DCMAKE_PREFIX_PATH=E:\Python310\Lib\site-packages -DCMAKE_TOOLCHAIN_FILE=E:\PyTorch_Build\pytorch\cmake\modules\win_toolchain.cmake -DPython_EXECUTABLE=E:\Python310\python.exe -DPython_NumPy_INCLUDE_DIR=E:\Python310\lib\site-packages\numpy\core\include -DTORCH_BUILD_VERSION=2.9.0a0+git2d31c3d -DUSE_NINJA=0 -DUSE_NUMPY=True -DUSE_STATIC_CUDNN=ON -DUSE_STATIC_NCCL=ON -DUSE_STATIC_OPENMP=ON E:\PyTorch_Build\pytorch CMake Deprecation Warning at CMakeLists.txt:18 (cmake_policy): The OLD behavior for policy CMP0126 will be removed from a future version of CMake. The cmake-policies(7) manual explains that the OLD behaviors of all policies are deprecated and that a policy should be set to OLD only under specific short-term circumstances. Projects should be ported to the NEW behavior and not rely on setting a policy to OLD. -- Selecting Windows SDK version 10.0.26100.0 to target Windows . -- The CXX compiler identification is MSVC 19.44.35215.0 -- The C compiler identification is MSVC 19.44.35215.0 -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe - skipped -- Detecting C compile features -- Detecting C compile features - done -- Not forcing any particular BLAS to be found CMake Warning at CMakeLists.txt:425 (message): TensorPipe cannot be used on Windows. Set it to OFF CMake Warning at CMakeLists.txt:427 (message): KleidiAI cannot be used on Windows. Set it to OFF CMake Warning at CMakeLists.txt:439 (message): Libuv is not installed in current conda env. Set USE_DISTRIBUTED to OFF. Please run command 'conda install -c conda-forge libuv=1.39' to install libuv. -- Performing Test C_HAS_AVX_1 -- Performing Test C_HAS_AVX_1 - Success -- Performing Test C_HAS_AVX2_1 -- Performing Test C_HAS_AVX2_1 - Success -- Performing Test C_HAS_AVX512_1 -- Performing Test C_HAS_AVX512_1 - Success -- Performing Test CXX_HAS_AVX_1 -- Performing Test CXX_HAS_AVX_1 - Success -- Performing Test CXX_HAS_AVX2_1 -- Performing Test CXX_HAS_AVX2_1 - Success -- Performing Test CXX_HAS_AVX512_1 -- Performing Test CXX_HAS_AVX512_1 - Success -- Current compiler supports avx2 extension. Will build perfkernels. -- Performing Test COMPILER_SUPPORTS_HIDDEN_VISIBILITY -- Performing Test COMPILER_SUPPORTS_HIDDEN_VISIBILITY - Failed -- Performing Test COMPILER_SUPPORTS_HIDDEN_INLINE_VISIBILITY -- Performing Test COMPILER_SUPPORTS_HIDDEN_INLINE_VISIBILITY - Failed -- Could not find hardware support for NEON on this machine. -- No OMAP3 processor on this machine. -- No OMAP4 processor on this machine. -- Compiler does not support SVE extension. Will not build perfkernels. CMake Warning at CMakeLists.txt:845 (message): x64 operating system is required for FBGEMM. Not compiling with FBGEMM. Turn this warning off by USE_FBGEMM=OFF. -- Performing Test HAS/UTF_8 -- Performing Test HAS/UTF_8 - Success -- Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR) (found version "13.0") CMake Warning at cmake/public/cuda.cmake:31 (message): PyTorch: CUDA cannot be found. Depending on whether you are building PyTorch or a PyTorch dependent library, the next warning / error will give you more info. Call Stack (most recent call first): cmake/Dependencies.cmake:44 (include) CMakeLists.txt:873 (include) CMake Warning at cmake/Dependencies.cmake:76 (message): Not compiling with CUDA. Suppress this warning with -DUSE_CUDA=OFF. Call Stack (most recent call first): CMakeLists.txt:873 (include) CMake Warning at cmake/Dependencies.cmake:95 (message): Not compiling with XPU. Could NOT find SYCL. Suppress this warning with -DUSE_XPU=OFF. Call Stack (most recent call first): CMakeLists.txt:873 (include) -- Building using own protobuf under third_party per request. -- Use custom protobuf build. CMake Warning at cmake/ProtoBuf.cmake:37 (message): Ancient protobuf forces CMake compatibility Call Stack (most recent call first): cmake/ProtoBuf.cmake:87 (custom_protobuf_find) cmake/Dependencies.cmake:107 (include) CMakeLists.txt:873 (include) CMake Deprecation Warning at third_party/protobuf/cmake/CMakeLists.txt:2 (cmake_minimum_required): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. -- -- 3.13.0.0 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed -- Looking for pthread_create in pthreads -- Looking for pthread_create in pthreads - not found -- Looking for pthread_create in pthread -- Looking for pthread_create in pthread - not found -- Found Threads: TRUE -- Caffe2 protobuf include directory: $<BUILD_INTERFACE:E:/PyTorch_Build/pytorch/third_party/protobuf/src>$<INSTALL_INTERFACE:include> -- Trying to find preferred BLAS backend of choice: MKL -- MKL_THREADING = OMP -- Looking for sys/types.h -- Looking for sys/types.h - found -- Looking for stdint.h -- Looking for stdint.h - found -- Looking for stddef.h -- Looking for stddef.h - found -- Check size of void* -- Check size of void* - done -- MKL_THREADING = OMP CMake Warning at cmake/Dependencies.cmake:213 (message): MKL could not be found. Defaulting to Eigen Call Stack (most recent call first): CMakeLists.txt:873 (include) CMake Warning at cmake/Dependencies.cmake:279 (message): Preferred BLAS (MKL) cannot be found, now searching for a general BLAS library Call Stack (most recent call first): CMakeLists.txt:873 (include) -- MKL_THREADING = OMP -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_sequential - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_core - libiomp5md - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - libiomp5md - pthread] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_core - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - pthread] -- Library mkl_intel: not found -- Checking for [mkl - guide - pthread - m] -- Library mkl: not found -- MKL library not found -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Checking for [Accelerate] -- Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND -- Checking for [vecLib] -- Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND -- Checking for [flexiblas] -- Library flexiblas: BLAS_flexiblas_LIBRARY-NOTFOUND -- Checking for [openblas] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m - gomp] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [libopenblas] -- Library libopenblas: BLAS_libopenblas_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran - pthread] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [acml - gfortran] -- Library acml: BLAS_acml_LIBRARY-NOTFOUND -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Could NOT find Atlas (missing: Atlas_CBLAS_INCLUDE_DIR Atlas_CLAPACK_INCLUDE_DIR Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY Atlas_LAPACK_LIBRARY) -- Checking for [ptf77blas - atlas - gfortran] -- Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND -- Checking for [] -- Looking for sgemm_ -- Looking for sgemm_ - not found -- Cannot find a library with BLAS API. Not using BLAS. -- Using pocketfft in directory: E:/PyTorch_Build/pytorch/third_party/pocketfft/ CMake Warning at cmake/Dependencies.cmake:351 (message): Target architecture "" is not supported in {Q/X}NNPACK. Supported architectures are x86, x86-64, ARM, and ARM64. Turn this warning off by USE_{Q/X}NNPACK=OFF. Call Stack (most recent call first): CMakeLists.txt:873 (include) CMake Deprecation Warning at third_party/cpuinfo/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. CMake Warning at third_party/cpuinfo/CMakeLists.txt:93 (MESSAGE): Target processor architecture is not specified. cpuinfo will compile, but cpuinfo_initialize() will always fail. -- Found Git: E:/Program Files/Git/cmd/git.exe (found version "2.51.0.windows.1") -- Google Benchmark version: v1.9.3, normalized to 1.9.3 -- Looking for shm_open in rt -- Looking for shm_open in rt - not found -- Performing Test HAVE_CXX_FLAG_WX -- Performing Test HAVE_CXX_FLAG_WX - Success -- Cross-compiling to test HAVE_STD_REGEX CMake Warning at third_party/benchmark/cmake/CXXFeatureCheck.cmake:49 (message): If you see build failures due to cross compilation, try setting HAVE_STD_REGEX to 0 Call Stack (most recent call first): third_party/benchmark/CMakeLists.txt:311 (cxx_feature_check) -- Performing Test HAVE_STD_REGEX -- success -- Cross-compiling to test HAVE_GNU_POSIX_REGEX -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile -- Cross-compiling to test HAVE_POSIX_REGEX -- Performing Test HAVE_POSIX_REGEX -- failed to compile -- Cross-compiling to test HAVE_STEADY_CLOCK CMake Warning at third_party/benchmark/cmake/CXXFeatureCheck.cmake:49 (message): If you see build failures due to cross compilation, try setting HAVE_STEADY_CLOCK to 0 Call Stack (most recent call first): third_party/benchmark/CMakeLists.txt:322 (cxx_feature_check) -- Performing Test HAVE_STEADY_CLOCK -- success -- Cross-compiling to test HAVE_PTHREAD_AFFINITY -- Performing Test HAVE_PTHREAD_AFFINITY -- failed to compile CMake Warning at cmake/Dependencies.cmake:749 (message): FP16 is only cmake-2.8 compatible Call Stack (most recent call first): CMakeLists.txt:873 (include) CMake Deprecation Warning at third_party/FP16/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. CMake Deprecation Warning at third_party/psimd/CMakeLists.txt:1 (CMAKE_MINIMUM_REQUIRED): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. -- Using third party subdirectory Eigen. -- Found Python: E:\Python310\python.exe (found version "3.10.10") found components: Interpreter Development.Module NumPy -- Using third_party/pybind11. -- pybind11 include dirs: E:/PyTorch_Build/pytorch/cmake/../third_party/pybind11/include -- Could NOT find OpenTelemetryApi (missing: OpenTelemetryApi_INCLUDE_DIRS) -- Using third_party/opentelemetry-cpp. -- opentelemetry api include dirs: E:/PyTorch_Build/pytorch/cmake/../third_party/opentelemetry-cpp/api/include -- Could NOT find MPI_C (missing: MPI_C_LIB_NAMES MPI_C_HEADER_DIR MPI_C_WORKS) -- Could NOT find MPI_CXX (missing: MPI_CXX_LIB_NAMES MPI_CXX_HEADER_DIR MPI_CXX_WORKS) -- Could NOT find MPI (missing: MPI_C_FOUND MPI_CXX_FOUND) CMake Warning at cmake/Dependencies.cmake:894 (message): Not compiling with MPI. Suppress this warning with -DUSE_MPI=OFF Call Stack (most recent call first): CMakeLists.txt:873 (include) -- MKL_THREADING = OMP -- Check OMP with lib C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/lib/x64/libomp.lib and flags -openmp:experimental -- MKL_THREADING = OMP -- Check OMP with lib C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/lib/x64/libomp.lib and flags -openmp:experimental -- Found OpenMP_C: -openmp:experimental -- Found OpenMP_CXX: -openmp:experimental -- Found OpenMP: TRUE -- Adding OpenMP CXX_FLAGS: -openmp:experimental -- Will link against OpenMP libraries: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/lib/x64/libomp.lib -- Found nvtx3: E:/PyTorch_Build/pytorch/third_party/NVTX/c/include -- ROCM_PATH environment variable is not set and C:/opt/rocm does not exist. Building without ROCm support. -- Found Python3: E:\Python310\python.exe (found version "3.10.10") found components: Interpreter -- ONNX_PROTOC_EXECUTABLE: $<TARGET_FILE:protobuf::protoc> -- Protobuf_VERSION: Protobuf_VERSION_NOTFOUND Generated: E:/PyTorch_Build/pytorch/build/third_party/onnx/onnx/onnx_onnx_torch-ml.proto Generated: E:/PyTorch_Build/pytorch/build/third_party/onnx/onnx/onnx-operators_onnx_torch-ml.proto Generated: E:/PyTorch_Build/pytorch/build/third_party/onnx/onnx/onnx-data_onnx_torch.proto -- -- ******** Summary ******** -- CMake version : 4.1.0 -- CMake command : E:/Python310/Lib/site-packages/cmake/data/bin/cmake.exe -- System : Windows -- C++ compiler : C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe -- C++ compiler version : 19.44.35215.0 -- CXX flags : /DWIN32 /D_WINDOWS /EHsc /Zc:__cplusplus /MP /bigobj /FS /utf-8 /EHsc /wd26812 -- Build type : Release -- Compile definitions : ONNX_ML=1;ONNXIFI_ENABLE_EXT=1 -- CMAKE_PREFIX_PATH : E:\Python310\Lib\site-packages -- CMAKE_INSTALL_PREFIX : E:/PyTorch_Build/pytorch/torch -- CMAKE_MODULE_PATH : E:/PyTorch_Build/pytorch/cmake/Modules;E:/PyTorch_Build/pytorch/cmake/public/../Modules_CUDA_fix -- -- ONNX version : 1.18.0 -- ONNX NAMESPACE : onnx_torch -- ONNX_USE_LITE_PROTO : OFF -- USE_PROTOBUF_SHARED_LIBS : OFF -- ONNX_DISABLE_EXCEPTIONS : OFF -- ONNX_DISABLE_STATIC_REGISTRATION : OFF -- ONNX_WERROR : OFF -- ONNX_BUILD_TESTS : OFF -- BUILD_SHARED_LIBS : OFF -- -- Protobuf compiler : $<TARGET_FILE:protobuf::protoc> -- Protobuf includes : -- Protobuf libraries : -- ONNX_BUILD_PYTHON : OFF -- Found CUDA with FP16 support, compiling with torch.cuda.HalfTensor -- Adding -DNDEBUG to compile flags CMake Warning at cmake/Dependencies.cmake:1418 (message): Not compiling with MAGMA. Suppress this warning with -DUSE_MAGMA=OFF. Call Stack (most recent call first): CMakeLists.txt:873 (include) -- Could not find hardware support for NEON on this machine. -- No OMAP3 processor on this machine. -- No OMAP4 processor on this machine. -- MKL_THREADING = OMP -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core - libiomp5md] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_sequential - mkl_core] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_core - libiomp5md - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - libiomp5md - pthread] -- Library mkl_intel: not found -- Checking for [mkl_intel_lp64 - mkl_core - pthread] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - pthread] -- Library mkl_intel: not found -- Checking for [mkl - guide - pthread - m] -- Library mkl: not found -- MKL library not found -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Checking for [Accelerate] -- Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND -- Checking for [vecLib] -- Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND -- Checking for [flexiblas] -- Library flexiblas: BLAS_flexiblas_LIBRARY-NOTFOUND -- Checking for [openblas] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread - m - gomp] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [libopenblas] -- Library libopenblas: BLAS_libopenblas_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran - pthread] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [acml - gfortran] -- Library acml: BLAS_acml_LIBRARY-NOTFOUND -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Could NOT find Atlas (missing: Atlas_CBLAS_INCLUDE_DIR Atlas_CLAPACK_INCLUDE_DIR Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY Atlas_LAPACK_LIBRARY) -- Checking for [ptf77blas - atlas - gfortran] -- Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND -- Checking for [] -- Cannot find a library with BLAS API. Not using BLAS. -- LAPACK requires BLAS -- Cannot find a library with LAPACK API. Not using LAPACK. disabling CUDA because NOT USE_CUDA is set disabling ROCM because NOT USE_ROCM is set -- MIOpen not found. Compiling without MIOpen support disabling MKLDNN because USE_MKLDNN is not set -- {fmt} version: 11.2.0 -- Build type: Release -- Using CPU-only version of Kineto -- Configuring Kineto dependency: -- KINETO_SOURCE_DIR = E:/PyTorch_Build/pytorch/third_party/kineto/libkineto -- KINETO_BUILD_TESTS = OFF -- KINETO_LIBRARY_TYPE = static CMake Deprecation Warning at third_party/kineto/libkineto/CMakeLists.txt:7 (cmake_minimum_required): Compatibility with CMake < 3.10 will be removed from a future version of CMake. Update the VERSION argument <min> value. Or, use the <min>...<max> syntax to tell CMake that the project requires at least <min> but has been updated to work with policies introduced by <max> or earlier. CMake Warning (dev) at third_party/kineto/libkineto/CMakeLists.txt:15 (find_package): Policy CMP0148 is not set: The FindPythonInterp and FindPythonLibs modules are removed. Run "cmake --help-policy CMP0148" for policy details. Use the cmake_policy command to set the policy and suppress this warning. This warning is for project developers. Use -Wno-dev to suppress it. -- Found PythonInterp: E:/Python310/python.exe (found version "3.10.10") -- CUDA_SOURCE_DIR = -- ROCM_SOURCE_DIR = -- CUPTI unavailable or disabled - not building GPU profilers -- Kineto: FMT_SOURCE_DIR = E:/PyTorch_Build/pytorch/third_party/fmt -- Kineto: FMT_INCLUDE_DIR = E:/PyTorch_Build/pytorch/third_party/fmt/include -- CUPTI_INCLUDE_DIR = /extras/CUPTI/include -- ROCTRACER_INCLUDE_DIR = /include/roctracer -- DYNOLOG_INCLUDE_DIR = E:/PyTorch_Build/pytorch/third_party/kineto/libkineto/third_party/dynolog/ -- IPCFABRIC_INCLUDE_DIR = E:/PyTorch_Build/pytorch/third_party/kineto/libkineto/third_party/dynolog//dynolog/src/ipcfabric/ -- Configured Kineto (CPU) -- Performing Test HAS/WD4624 -- Performing Test HAS/WD4624 - Success -- Performing Test HAS/WD4068 -- Performing Test HAS/WD4068 - Success -- Performing Test HAS/WD4067 -- Performing Test HAS/WD4067 - Success -- Performing Test HAS/WD4267 -- Performing Test HAS/WD4267 - Success -- Performing Test HAS/WD4661 -- Performing Test HAS/WD4661 - Success -- Performing Test HAS/WD4717 -- Performing Test HAS/WD4717 - Success -- Performing Test HAS/WD4244 -- Performing Test HAS/WD4244 - Success -- Performing Test HAS/WD4804 -- Performing Test HAS/WD4804 - Success -- Performing Test HAS/WD4273 -- Performing Test HAS/WD4273 - Success -- Performing Test HAS_WNO_STRINGOP_OVERFLOW -- Performing Test HAS_WNO_STRINGOP_OVERFLOW - Failed -- -- Note: when building with Visual Studio the build type is specified when building. -- For example: 'cmake --build . --config=Release -- Architecture: x64 -- Use the C++ compiler to compile (MI_USE_CXX=ON) -- -- Library name : mimalloc -- Version : 2.2.4 -- Build type : release -- C++ Compiler : C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe -- Compiler flags : /Zc:__cplusplus -- Compiler defines : MI_CMAKE_BUILD_TYPE=release;MI_BUILD_RELEASE -- Link libraries : psapi;shell32;user32;advapi32;bcrypt -- Build targets : static -- CMake Error at CMakeLists.txt:1264 (add_subdirectory): add_subdirectory given source "torch/headeronly" which is not an existing directory. -- don't use NUMA -- Looking for backtrace -- Looking for backtrace - not found -- Could NOT find Backtrace (missing: Backtrace_LIBRARY Backtrace_INCLUDE_DIR) -- headers outputs: torch\csrc\inductor\aoti_torch\generated\c_shim_cuda.h not found torch\csrc\inductor\aoti_torch\generated\c_shim_cpu.h not found torch\csrc\inductor\aoti_torch\generated\c_shim_aten.h not found -- sources outputs: -- declarations_yaml outputs: -- Performing Test COMPILER_SUPPORTS_NO_AVX256_SPLIT -- Performing Test COMPILER_SUPPORTS_NO_AVX256_SPLIT - Failed -- Using ATen parallel backend: OMP disabling CUDA because USE_CUDA is set false -- Check size of long double -- Check size of long double - done -- Performing Test COMPILER_SUPPORTS_FLOAT128 -- Performing Test COMPILER_SUPPORTS_FLOAT128 - Failed -- Found OpenMP_C: -openmp:experimental (found version "2.0") -- Found OpenMP_CXX: -openmp:experimental (found version "2.0") -- Found OpenMP: TRUE (found version "2.0") -- Performing Test COMPILER_SUPPORTS_OPENMP -- Performing Test COMPILER_SUPPORTS_OPENMP - Success -- Performing Test COMPILER_SUPPORTS_OMP_SIMD -- Performing Test COMPILER_SUPPORTS_OMP_SIMD - Failed -- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES -- Performing Test COMPILER_SUPPORTS_WEAK_ALIASES - Failed -- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH -- Performing Test COMPILER_SUPPORTS_BUILTIN_MATH - Failed -- Performing Test COMPILER_SUPPORTS_SYS_GETRANDOM -- Performing Test COMPILER_SUPPORTS_SYS_GETRANDOM - Failed -- Configuring build for SLEEF-v3.8.0 Target system: Windows Target processor: Host system: Windows-10.0.26100 Host processor: AMD64 Detected C compiler: MSVC @ C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe CMake: 4.1.0 Make program: C:/Program Files/Microsoft Visual Studio/2022/Community/MSBuild/Current/Bin/amd64/MSBuild.exe Crosscompiling SLEEF. Native build dir: -- Using option `/D_CRT_SECURE_NO_WARNINGS /D_CRT_NONSTDC_NO_DEPRECATE ` to compile libsleef -- Building shared libs : OFF -- Building static test bins: OFF -- MPFR : LIB_MPFR-NOTFOUND -- GMP : LIBGMP-NOTFOUND -- RT : -- FFTW3 : LIBFFTW3-NOTFOUND -- OPENSSL : -- SDE : SDE_COMMAND-NOTFOUND -- COMPILER_SUPPORTS_OPENMP : FALSE AT_INSTALL_INCLUDE_DIR include/ATen/core core header install: E:/PyTorch_Build/pytorch/build/aten/src/ATen/core/aten_interned_strings.h core header install: E:/PyTorch_Build/pytorch/build/aten/src/ATen/core/enum_tag.h core header install: E:/PyTorch_Build/pytorch/build/aten/src/ATen/core/TensorBody.h CMake Error: File E:/PyTorch_Build/pytorch/torch/_utils_internal.py does not exist. CMake Error at caffe2/CMakeLists.txt:241 (configure_file): configure_file Problem configuring file CMake Error: File E:/PyTorch_Build/pytorch/torch/csrc/api/include/torch/version.h.in does not exist. CMake Error at caffe2/CMakeLists.txt:246 (configure_file): configure_file Problem configuring file CMake Error at caffe2/CMakeLists.txt:1398 (add_subdirectory): The source directory E:/PyTorch_Build/pytorch/torch does not contain a CMakeLists.txt file. CMake Warning at CMakeLists.txt:1285 (message): Generated cmake files are only fully tested if one builds with system glog, gflags, and protobuf. Other settings may generate files that are not well tested. CMake Warning at CMakeLists.txt:1349 (message): Generated cmake files are only available when building shared libs. -- -- ******** Summary ******** -- General: -- CMake version : 4.1.0 -- CMake command : E:/Python310/Lib/site-packages/cmake/data/bin/cmake.exe -- System : Windows -- C++ compiler : C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.44.35207/bin/Hostx64/x64/cl.exe -- C++ compiler id : MSVC -- C++ compiler version : 19.44.35215.0 -- Using ccache if found : OFF -- CXX flags : /DWIN32 /D_WINDOWS /EHsc /Zc:__cplusplus /MP /bigobj /FS /utf-8 -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273 -- Shared LD flags : /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 -- Static LD flags : /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 -- Module LD flags : /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 -- Build type : Release -- Compile definitions : ONNX_ML=1;ONNXIFI_ENABLE_EXT=1;ONNX_NAMESPACE=onnx_torch;_CRT_SECURE_NO_DEPRECATE=1;USE_EXTERNAL_MZCRC;MINIZ_DISABLE_ZIP_READER_CRC32_CHECKS;EXPORT_AOTI_FUNCTIONS;WIN32_LEAN_AND_MEAN;_UCRT_LEGACY_INFINITY;NOMINMAX;USE_MIMALLOC -- CMAKE_PREFIX_PATH : E:\Python310\Lib\site-packages -- CMAKE_INSTALL_PREFIX : E:/PyTorch_Build/pytorch/torch -- USE_GOLD_LINKER : OFF -- -- TORCH_VERSION : 2.9.0 -- BUILD_STATIC_RUNTIME_BENCHMARK: OFF -- BUILD_BINARY : OFF -- BUILD_CUSTOM_PROTOBUF : ON -- Protobuf compiler : -- Protobuf includes : -- Protobuf libraries : -- BUILD_PYTHON : True -- Python version : 3.10.10 -- Python executable : E:\Python310\python.exe -- Python library : E:/Python310/libs/python310.lib -- Python includes : E:/Python310/include -- Python site-package : E:\Python310\Lib\site-packages -- BUILD_SHARED_LIBS : OFF -- CAFFE2_USE_MSVC_STATIC_RUNTIME : ON -- BUILD_TEST : True -- BUILD_JNI : OFF -- BUILD_MOBILE_AUTOGRAD : OFF -- BUILD_LITE_INTERPRETER: OFF -- INTERN_BUILD_MOBILE : -- TRACING_BASED : OFF -- USE_BLAS : 0 -- USE_LAPACK : 0 -- USE_ASAN : OFF -- USE_TSAN : OFF -- USE_CPP_CODE_COVERAGE : OFF -- USE_CUDA : OFF -- USE_XPU : OFF -- USE_ROCM : OFF -- BUILD_NVFUSER : -- USE_EIGEN_FOR_BLAS : ON -- USE_EIGEN_FOR_SPARSE : OFF -- USE_FBGEMM : OFF -- USE_KINETO : ON -- USE_GFLAGS : OFF -- USE_GLOG : OFF -- USE_LITE_PROTO : OFF -- USE_PYTORCH_METAL : OFF -- USE_PYTORCH_METAL_EXPORT : OFF -- USE_MPS : OFF -- CAN_COMPILE_METAL : -- USE_MKL : OFF -- USE_MKLDNN : OFF -- USE_UCC : OFF -- USE_ITT : OFF -- USE_XCCL : OFF -- USE_NCCL : OFF -- Found NVSHMEM : -- USE_NNPACK : OFF -- USE_NUMPY : ON -- USE_OBSERVERS : ON -- USE_OPENCL : OFF -- USE_OPENMP : ON -- USE_MIMALLOC : ON -- USE_MIMALLOC_ON_MKL : OFF -- USE_VULKAN : OFF -- USE_PROF : OFF -- USE_PYTORCH_QNNPACK : OFF -- USE_XNNPACK : OFF -- USE_DISTRIBUTED : OFF -- Public Dependencies : -- Private Dependencies : Threads::Threads;cpuinfo;fp16;caffe2::openmp;fmt::fmt-header-only;kineto -- Public CUDA Deps. : -- Private CUDA Deps. : -- USE_COREML_DELEGATE : OFF -- BUILD_LAZY_TS_BACKEND : ON -- USE_ROCM_KERNEL_ASSERT : OFF -- Performing Test HAS_WMISSING_PROTOTYPES -- Performing Test HAS_WMISSING_PROTOTYPES - Failed -- Performing Test HAS_WERROR_MISSING_PROTOTYPES -- Performing Test HAS_WERROR_MISSING_PROTOTYPES - Failed -- Configuring incomplete, errors occurred! (.venv) PS E:\PyTorch_Build\pytorch> (.venv) PS E:\PyTorch_Build\pytorch> # 7. 验证安装 (.venv) PS E:\PyTorch_Build\pytorch> & "E:\Python310\python.exe" -c @" >> import torch >> print(f'PyTorch版本: {torch.__version__}') >> print(f'CUDA可用: {torch.cuda.is_available()}') >> print(f'CUDA版本: {torch.version.cuda}') >> print(f'cuDNN版本: {torch.backends.cudnn.version()}') >> "@ Traceback (most recent call last): File "<string>", line 1, in <module> File "E:\Python310\lib\site-packages\torch-2.9.0a0+git2d31c3d-py3.10-win-amd64.egg\torch\__init__.py", line 281, in <module> _load_dll_libraries() File "E:\Python310\lib\site-packages\torch-2.9.0a0+git2d31c3d-py3.10-win-amd64.egg\torch\__init__.py", line 277, in _load_dll_libraries raise err OSError: [WinError 126] 找不到指定的模块。 Error loading "E:\Python310\lib\site-packages\torch-2.9.0a0+git2d31c3d-py3.10-win-amd64.egg\torch\lib\aoti_custom_ops.dll" or one of its dependencies. (.venv) PS E:\PyTorch_Build\pytorch> Set-Content -Path manual_build.ps1 -Value (Get-Content manual_build.ps1) Get-Content: Cannot find path 'E:\PyTorch_Build\pytorch\manual_build.ps1' because it does not exist. (.venv) PS E:\PyTorch_Build\pytorch> .\manual_build.ps1 (.venv) PS E:\PyTorch_Build\pytorch> # 安装 Docker Desktop (.venv) PS E:\PyTorch_Build\pytorch> Invoke-WebRequest -Uri "https://desktop.docker.com/win/main/amd64/Docker%20Desktop%20Installer.exe" -OutFile "$env:TEMP\docker_install.exe" (.venv) PS E:\PyTorch_Build\pytorch> Start-Process -FilePath "$env:TEMP\docker_install.exe" -ArgumentList "install --quiet" -Wait “
09-02
“tmpxft_000061d4_00000000-7_cutlassB_f32_aligned_k65536_dropout.cudafe1.cpp [4835/7459] Building CXX object caffe2\CMakeFiles\torch_cuda.dir\__\torch\csrc\jit\tensorexpr\cuda_codegen.cpp.obj E:\PyTorch_Build\pytorch\torch/csrc/jit/tensorexpr/ir.h(395): warning C4805: “==”: 在操作中将类型“c10::impl::ScalarTypeToCPPType<c10::ScalarType::Bool>::type”与类型“T”混合不安全 with [ T=int ] E:\PyTorch_Build\pytorch\torch/csrc/jit/tensorexpr/ir.h(395): note: 模板实例化上下文(最早的实例化上下文)为 E:\PyTorch_Build\pytorch\torch\csrc\jit\tensorexpr\cuda_codegen.cpp(147): note: 查看对正在编译的函数 模板 实例化“bool torch::jit::tensorexpr::immediateEquals<int>(const torch::jit::tensorexpr::ExprPtr &,T)”的引用 with [ T=int ] [4844/7459] Building CUDA object caffe2\CMakeFiles\torch_c...aten\src\ATen\native\quantized\cuda\FakeQuantizeCore.cu.ob FakeQuantizeCore.cu tmpxft_000011c8_00000000-7_FakeQuantizeCore.cudafe1.cpp [4850/7459] Building CUDA object caffe2\CMakeFiles\torch_c...aten\src\ATen\native\sparse\cuda\SparseCsrTensorMath.cu.ob SparseCsrTensorMath.cu tmpxft_00000e1c_00000000-7_SparseCsrTensorMath.cudafe1.cpp [4854/7459] Building CXX object caffe2\torch\CMakeFiles\torch_python.dir\csrc\StorageMethods.cpp.obj FAILED: [code=2] caffe2/torch/CMakeFiles/torch_python.dir/csrc/StorageMethods.cpp.obj C:\PROGRA~2\MICROS~2\2022\BUILDT~1\VC\Tools\MSVC\1444~1.352\bin\Hostx64\x64\cl.exe /nologo /TP -DAT_PER_OPERATOR_HEADERS -DBUILDING_TESTS -DEXPORT_AOTI_FUNCTIONS -DFMT_HEADER_ONLY=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNOMINMAX -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DPy_NO_LINK_LIB -DTHP_BUILD_MAIN_LIB -DTORCH_CUDA_USE_NVTX3 -DUSE_CUDA -DUSE_EXTERNAL_MZCRC -DUSE_ITT -DUSE_MIMALLOC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_UCRT_LEGACY_INFINITY -Dtorch_python_EXPORTS -IE:\PyTorch_Build\pytorch\build\aten\src -IE:\PyTorch_Build\pytorch\aten\src -IE:\PyTorch_Build\pytorch\build -IE:\PyTorch_Build\pytorch -IE:\PyTorch_Build\pytorch\nlohmann -IE:\PyTorch_Build\pytorch\moodycamel -IE:\PyTorch_Build\pytorch\third_party\mimalloc\include -IE:\PyTorch_Build\pytorch\torch\.. -IE:\PyTorch_Build\pytorch\torch\..\aten\src -IE:\PyTorch_Build\pytorch\torch\..\aten\src\TH -IE:\PyTorch_Build\pytorch\build\caffe2\aten\src -IE:\PyTorch_Build\pytorch\build\third_party -IE:\PyTorch_Build\pytorch\build\third_party\onnx -IE:\PyTorch_Build\pytorch\torch\..\third_party\valgrind-headers -IE:\PyTorch_Build\pytorch\torch\..\third_party\gloo -IE:\PyTorch_Build\pytorch\torch\..\third_party\onnx -IE:\PyTorch_Build\pytorch\torch\..\third_party\flatbuffers\include -IE:\PyTorch_Build\pytorch\torch\..\third_party\kineto\libkineto\include -IE:\PyTorch_Build\pytorch\torch\..\third_party\cpp-httplib -IE:\PyTorch_Build\pytorch\torch\..\third_party\nlohmann\include -IE:\PyTorch_Build\pytorch\torch\csrc -IE:\PyTorch_Build\pytorch\torch\csrc\api\include -IE:\PyTorch_Build\pytorch\torch\lib -IE:\PyTorch_Build\pytorch\torch\standalone -IE:\PyTorch_Build\pytorch\torch\lib\libshm_windows -IE:\PyTorch_Build\pytorch\torch\csrc\api -IE:\PyTorch_Build\pytorch\c10\.. -IE:\PyTorch_Build\pytorch\c10\cuda\..\.. -IE:\PyTorch_Build\pytorch\third_party\fmt\include -IE:\PyTorch_Build\pytorch\third_party\onnx -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\googletest\googlemock\include -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\googletest\googletest\include -external:IE:\PyTorch_Build\pytorch\third_party\protobuf\src -external:IE:\PyTorch_Build\pytorch\third_party\XNNPACK\include -external:IE:\PyTorch_Build\pytorch\third_party\ittapi\include -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\eigen -external:IE:\PyTorch_Build\pytorch\third_party\ideep\mkl-dnn\include\oneapi\dnnl -external:IE:\PyTorch_Build\pytorch\third_party\ideep\include -external:IE:\PyTorch_Build\pytorch\INTERFACE -external:IE:\PyTorch_Build\pytorch\third_party\nlohmann\include -external:IE:\PyTorch_Build\pytorch\third_party\concurrentqueue -external:IE:\PyTorch_Build\pytorch\rtx5070_env\lib\site-packages\numpy\_core\include -external:IE:\Python310\Include -external:I"E:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\pybind11\include -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\opentelemetry-cpp\api\include -external:IE:\PyTorch_Build\pytorch\third_party\cpp-httplib -external:IE:\PyTorch_Build\pytorch\third_party\NVTX\c\include -external:W0 /DWIN32 /D_WINDOWS /EHsc /Zc:__cplusplus /bigobj /FS /utf-8 -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273 -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /Zc:preprocessor /Zc:preprocessor /Zc:preprocessor /O2 /Ob2 /DNDEBUG /bigobj -DNDEBUG -std:c++17 -MD /permissive- /EHsc /bigobj -O2 /utf-8 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\csrc\StorageMethods.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c E:\PyTorch_Build\pytorch\torch\csrc\StorageMethods.cpp E:\PyTorch_Build\pytorch\torch\csrc\StorageMethods.cpp(9): fatal error C1083: 无法打开包括文件: “libshm.h”: No such file or directory [4855/7459] Building CXX object caffe2\torch\CMakeFiles\torch_python.dir\csrc\Storage.cpp.obj FAILED: [code=2] caffe2/torch/CMakeFiles/torch_python.dir/csrc/Storage.cpp.obj C:\PROGRA~2\MICROS~2\2022\BUILDT~1\VC\Tools\MSVC\1444~1.352\bin\Hostx64\x64\cl.exe /nologo /TP -DAT_PER_OPERATOR_HEADERS -DBUILDING_TESTS -DEXPORT_AOTI_FUNCTIONS -DFMT_HEADER_ONLY=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNOMINMAX -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DPy_NO_LINK_LIB -DTHP_BUILD_MAIN_LIB -DTORCH_CUDA_USE_NVTX3 -DUSE_CUDA -DUSE_EXTERNAL_MZCRC -DUSE_ITT -DUSE_MIMALLOC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_UCRT_LEGACY_INFINITY -Dtorch_python_EXPORTS -IE:\PyTorch_Build\pytorch\build\aten\src -IE:\PyTorch_Build\pytorch\aten\src -IE:\PyTorch_Build\pytorch\build -IE:\PyTorch_Build\pytorch -IE:\PyTorch_Build\pytorch\nlohmann -IE:\PyTorch_Build\pytorch\moodycamel -IE:\PyTorch_Build\pytorch\third_party\mimalloc\include -IE:\PyTorch_Build\pytorch\torch\.. -IE:\PyTorch_Build\pytorch\torch\..\aten\src -IE:\PyTorch_Build\pytorch\torch\..\aten\src\TH -IE:\PyTorch_Build\pytorch\build\caffe2\aten\src -IE:\PyTorch_Build\pytorch\build\third_party -IE:\PyTorch_Build\pytorch\build\third_party\onnx -IE:\PyTorch_Build\pytorch\torch\..\third_party\valgrind-headers -IE:\PyTorch_Build\pytorch\torch\..\third_party\gloo -IE:\PyTorch_Build\pytorch\torch\..\third_party\onnx -IE:\PyTorch_Build\pytorch\torch\..\third_party\flatbuffers\include -IE:\PyTorch_Build\pytorch\torch\..\third_party\kineto\libkineto\include -IE:\PyTorch_Build\pytorch\torch\..\third_party\cpp-httplib -IE:\PyTorch_Build\pytorch\torch\..\third_party\nlohmann\include -IE:\PyTorch_Build\pytorch\torch\csrc -IE:\PyTorch_Build\pytorch\torch\csrc\api\include -IE:\PyTorch_Build\pytorch\torch\lib -IE:\PyTorch_Build\pytorch\torch\standalone -IE:\PyTorch_Build\pytorch\torch\lib\libshm_windows -IE:\PyTorch_Build\pytorch\torch\csrc\api -IE:\PyTorch_Build\pytorch\c10\.. -IE:\PyTorch_Build\pytorch\c10\cuda\..\.. -IE:\PyTorch_Build\pytorch\third_party\fmt\include -IE:\PyTorch_Build\pytorch\third_party\onnx -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\googletest\googlemock\include -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\googletest\googletest\include -external:IE:\PyTorch_Build\pytorch\third_party\protobuf\src -external:IE:\PyTorch_Build\pytorch\third_party\XNNPACK\include -external:IE:\PyTorch_Build\pytorch\third_party\ittapi\include -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\eigen -external:IE:\PyTorch_Build\pytorch\third_party\ideep\mkl-dnn\include\oneapi\dnnl -external:IE:\PyTorch_Build\pytorch\third_party\ideep\include -external:IE:\PyTorch_Build\pytorch\INTERFACE -external:IE:\PyTorch_Build\pytorch\third_party\nlohmann\include -external:IE:\PyTorch_Build\pytorch\third_party\concurrentqueue -external:IE:\PyTorch_Build\pytorch\rtx5070_env\lib\site-packages\numpy\_core\include -external:IE:\Python310\Include -external:I"E:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\pybind11\include -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\opentelemetry-cpp\api\include -external:IE:\PyTorch_Build\pytorch\third_party\cpp-httplib -external:IE:\PyTorch_Build\pytorch\third_party\NVTX\c\include -external:W0 /DWIN32 /D_WINDOWS /EHsc /Zc:__cplusplus /bigobj /FS /utf-8 -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273 -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /Zc:preprocessor /Zc:preprocessor /Zc:preprocessor /O2 /Ob2 /DNDEBUG /bigobj -DNDEBUG -std:c++17 -MD /permissive- /EHsc /bigobj -O2 /utf-8 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\csrc\Storage.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c E:\PyTorch_Build\pytorch\torch\csrc\Storage.cpp E:\PyTorch_Build\pytorch\torch\csrc\Storage.cpp(10): fatal error C1083: 无法打开包括文件: “libshm.h”: No such file or directory [4856/7459] Building CXX object caffe2\torch\CMakeFiles\torch_python.dir\csrc\Module.cpp.obj FAILED: [code=2] caffe2/torch/CMakeFiles/torch_python.dir/csrc/Module.cpp.obj C:\PROGRA~2\MICROS~2\2022\BUILDT~1\VC\Tools\MSVC\1444~1.352\bin\Hostx64\x64\cl.exe /nologo /TP -DAT_PER_OPERATOR_HEADERS -DBUILDING_TESTS -DEXPORT_AOTI_FUNCTIONS -DFMT_HEADER_ONLY=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DNOMINMAX -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DPy_NO_LINK_LIB -DTHP_BUILD_MAIN_LIB -DTORCH_CUDA_USE_NVTX3 -DUSE_CUDA -DUSE_EXTERNAL_MZCRC -DUSE_ITT -DUSE_MIMALLOC -DUSE_NUMPY -DWIN32_LEAN_AND_MEAN -D_CRT_SECURE_NO_DEPRECATE=1 -D_UCRT_LEGACY_INFINITY -Dtorch_python_EXPORTS -IE:\PyTorch_Build\pytorch\build\aten\src -IE:\PyTorch_Build\pytorch\aten\src -IE:\PyTorch_Build\pytorch\build -IE:\PyTorch_Build\pytorch -IE:\PyTorch_Build\pytorch\nlohmann -IE:\PyTorch_Build\pytorch\moodycamel -IE:\PyTorch_Build\pytorch\third_party\mimalloc\include -IE:\PyTorch_Build\pytorch\torch\.. -IE:\PyTorch_Build\pytorch\torch\..\aten\src -IE:\PyTorch_Build\pytorch\torch\..\aten\src\TH -IE:\PyTorch_Build\pytorch\build\caffe2\aten\src -IE:\PyTorch_Build\pytorch\build\third_party -IE:\PyTorch_Build\pytorch\build\third_party\onnx -IE:\PyTorch_Build\pytorch\torch\..\third_party\valgrind-headers -IE:\PyTorch_Build\pytorch\torch\..\third_party\gloo -IE:\PyTorch_Build\pytorch\torch\..\third_party\onnx -IE:\PyTorch_Build\pytorch\torch\..\third_party\flatbuffers\include -IE:\PyTorch_Build\pytorch\torch\..\third_party\kineto\libkineto\include -IE:\PyTorch_Build\pytorch\torch\..\third_party\cpp-httplib -IE:\PyTorch_Build\pytorch\torch\..\third_party\nlohmann\include -IE:\PyTorch_Build\pytorch\torch\csrc -IE:\PyTorch_Build\pytorch\torch\csrc\api\include -IE:\PyTorch_Build\pytorch\torch\lib -IE:\PyTorch_Build\pytorch\torch\standalone -IE:\PyTorch_Build\pytorch\torch\lib\libshm_windows -IE:\PyTorch_Build\pytorch\torch\csrc\api -IE:\PyTorch_Build\pytorch\c10\.. -IE:\PyTorch_Build\pytorch\c10\cuda\..\.. -IE:\PyTorch_Build\pytorch\third_party\fmt\include -IE:\PyTorch_Build\pytorch\third_party\onnx -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\googletest\googlemock\include -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\googletest\googletest\include -external:IE:\PyTorch_Build\pytorch\third_party\protobuf\src -external:IE:\PyTorch_Build\pytorch\third_party\XNNPACK\include -external:IE:\PyTorch_Build\pytorch\third_party\ittapi\include -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\eigen -external:IE:\PyTorch_Build\pytorch\third_party\ideep\mkl-dnn\include\oneapi\dnnl -external:IE:\PyTorch_Build\pytorch\third_party\ideep\include -external:IE:\PyTorch_Build\pytorch\INTERFACE -external:IE:\PyTorch_Build\pytorch\third_party\nlohmann\include -external:IE:\PyTorch_Build\pytorch\third_party\concurrentqueue -external:IE:\PyTorch_Build\pytorch\rtx5070_env\lib\site-packages\numpy\_core\include -external:IE:\Python310\Include -external:I"E:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\pybind11\include -external:IE:\PyTorch_Build\pytorch\cmake\..\third_party\opentelemetry-cpp\api\include -external:IE:\PyTorch_Build\pytorch\third_party\cpp-httplib -external:IE:\PyTorch_Build\pytorch\third_party\NVTX\c\include -external:W0 /DWIN32 /D_WINDOWS /EHsc /Zc:__cplusplus /bigobj /FS /utf-8 -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DLIBKINETO_NOXPUPTI=ON -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE /wd4624 /wd4068 /wd4067 /wd4267 /wd4661 /wd4717 /wd4244 /wd4804 /wd4273 -DHAVE_AVX512_CPU_DEFINITION -DHAVE_AVX2_CPU_DEFINITION /Zc:preprocessor /Zc:preprocessor /Zc:preprocessor /O2 /Ob2 /DNDEBUG /bigobj -DNDEBUG -std:c++17 -MD /permissive- /EHsc /bigobj -O2 /utf-8 /showIncludes /Focaffe2\torch\CMakeFiles\torch_python.dir\csrc\Module.cpp.obj /Fdcaffe2\torch\CMakeFiles\torch_python.dir\ /FS -c E:\PyTorch_Build\pytorch\torch\csrc\Module.cpp E:\PyTorch_Build\pytorch\torch\csrc\Module.cpp(34): fatal error C1083: 无法打开包括文件: “libshm.h”: No such file or directory [4872/7459] Building CUDA object caffe2\CMakeFiles\torch_cuda.dir\__\aten\src\ATen\native\cuda\group_norm_kernel.cu.obj group_norm_kernel.cu tmpxft_00005adc_00000000-7_group_norm_kernel.cudafe1.cpp [4873/7459] Building CUDA object caffe2\CMakeFiles\torch_cuda.dir\__\aten\src\ATen\UfuncCUDA_add.cu.obj UfuncCUDA_add.cu tmpxft_00006bdc_00000000-7_UfuncCUDA_add.cudafe1.cpp [4875/7459] Building CUDA object caffe2\CMakeFiles\torch_cuda.dir\__\aten\src\ATen\native\cuda\Unique.cu.obj Unique.cu tmpxft_00000658_00000000-7_Unique.cudafe1.cpp ninja: build stopped: subcommand failed. (rtx5070_env) PS E:\PyTorch_Build\pytorch> (rtx5070_env) PS E:\PyTorch_Build\pytorch>” “PowerShell 7 环境已加载 (版本: 7.5.2) PS C:\Users\Administrator\Desktop> cd E:\PyTorch_Build\pytorch PS E:\PyTorch_Build\pytorch> python -m venv rtx5070_env Error: [Errno 13] Permission denied: 'E:\\PyTorch_Build\\pytorch\\rtx5070_env\\Scripts\\python.exe' PS E:\PyTorch_Build\pytorch> .\rtx5070_env\Scripts\activate (rtx5070_env) PS E:\PyTorch_Build\pytorch> conda install -c conda-forge openblas 3 channel Terms of Service accepted Retrieving notices: done Channels: - conda-forge - defaults - nvidia - pytorch-nightly Platform: win-64 Collecting package metadata (repodata.json): done Solving environment: done ## Package Plan ## environment location: C:\Miniconda3 added / updated specs: - openblas The following packages will be downloaded: package | build ---------------------------|----------------- libopenblas-0.3.30 |pthreads_ha4fe6b2_2 3.8 MB conda-forge openblas-0.3.30 |pthreads_h4a7f399_2 262 KB conda-forge ------------------------------------------------------------ Total: 4.0 MB The following NEW packages will be INSTALLED: libopenblas conda-forge/win-64::libopenblas-0.3.30-pthreads_ha4fe6b2_2 openblas conda-forge/win-64::openblas-0.3.30-pthreads_h4a7f399_2 Proceed ([y]/n)? y Downloading and Extracting Packages: Preparing transaction: done Verifying transaction: done Executing transaction: done (rtx5070_env) PS E:\PyTorch_Build\pytorch> # 设置cuDNN路径 (rtx5070_env) PS E:\PyTorch_Build\pytorch> @" >> set(CUDNN_INCLUDE_DIR "E:/Program Files/NVIDIA/CUNND/v9.12/include/13.0") >> set(CUDNN_LIBRARY "E:/Program Files/NVIDIA/CUNND/v9.12/lib/13.0/x64/cudnn64_9.lib") >> message(STATUS "Applied custom cuDNN settings") >> "@ | Set-Content set_cudnn.cmake (rtx5070_env) PS E:\PyTorch_Build\pytorch> (rtx5070_env) PS E:\PyTorch_Build\pytorch> # 在CMakeLists.txt第一行插入引用 (rtx5070_env) PS E:\PyTorch_Build\pytorch> $cmakeFile = "CMakeLists.txt" (rtx5070_env) PS E:\PyTorch_Build\pytorch> $content = Get-Content $cmakeFile (rtx5070_env) PS E:\PyTorch_Build\pytorch> $newContent = @() (rtx5070_env) PS E:\PyTorch_Build\pytorch> $newContent += "include(set_cudnn.cmake)" (rtx5070_env) PS E:\PyTorch_Build\pytorch> $newContent += $content (rtx5070_env) PS E:\PyTorch_Build\pytorch> $newContent | Set-Content $cmakeFile (rtx5070_env) PS E:\PyTorch_Build\pytorch> (rtx5070_env) PS E:\PyTorch_Build\pytorch> # 安装OpenBLAS (rtx5070_env) PS E:\PyTorch_Build\pytorch> conda install -c conda-forge openblas -y 3 channel Terms of Service accepted Channels: - conda-forge - defaults - nvidia - pytorch-nightly Platform: win-64 Collecting package metadata (repodata.json): done Solving environment: done # All requested packages already installed. (rtx5070_env) PS E:\PyTorch_Build\pytorch> (rtx5070_env) PS E:\PyTorch_Build\pytorch> Write-Host "✅ 构建环境优化完成" -ForegroundColor Green ✅ 构建环境优化完成 (rtx5070_env) PS E:\PyTorch_Build\pytorch> (rtx5070_env) PS E:\PyTorch_Build\pytorch> # 检查GPU支持状态 (rtx5070_env) PS E:\PyTorch_Build\pytorch> python -c "import torch; print(f'CUDA可用: {torch.cuda.is_available()}')" Traceback (most recent call last): File "<string>", line 1, in <module> File "E:\PyTorch_Build\pytorch\torch\__init__.py", line 415, in <module> from torch._C import * # noqa: F403 ImportError: DLL load failed while importing _C: 找不到指定的模块。 (rtx5070_env) PS E:\PyTorch_Build\pytorch> (rtx5070_env) PS E:\PyTorch_Build\pytorch> # 查看已完成任务比例 (rtx5070_env) PS E:\PyTorch_Build\pytorch> $total = 7459 (rtx5070_env) PS E:\PyTorch_Build\pytorch> $completed = (Get-ChildItem build -Recurse -File | Measure-Object).Count (rtx5070_env) PS E:\PyTorch_Build\pytorch> $percent = [math]::Round(($completed/$total)*100, 2) (rtx5070_env) PS E:\PyTorch_Build\pytorch> Write-Host "构建进度: $completed/$total ($percent%)" 构建进度: 12552/7459 (168.28%) (rtx5070_env) PS E:\PyTorch_Build\pytorch>”
最新发布
09-04
评论 4
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值