Adding a Custom Activation Layer to Caffe (C++ Implementation)

This post walks through adding a custom layer to the Caffe framework, using a concrete example: an activation layer that raises its input to a power and adds a bias term. It covers the key steps of defining the new layer, registering it, compiling, and testing.

1. Introduction

    This post is adapted from a video tutorial and shows how to add your own layer to Caffe.

2. Goal

    Define a custom computation layer implementing y = x^power + bur. The new layer is effectively an activation function layer.

3. Approach

    (1) Any existing layer can be subclassed and its methods overridden.

    (2) First check whether the functionality really requires custom code; prefer an existing layer when possible. Every layer is documented in detail in the headers under caffe/include/caffe/layers.

4. Overall Steps

1. Create the header include/caffe/layers/my_neuron_layer.hpp.
    Override the layer type name: virtual inline const char* type() const { return "MyNeuron"; }
    If only a CPU implementation is needed, the Forward_gpu/Backward_gpu declarations can be omitted.
2. Create the corresponding source file src/caffe/layers/my_neuron_layer.cpp.
    Override LayerSetUp so the layer can read its parameters from the prototxt; if the layer takes no new prototxt parameters, this is unnecessary.
    Override Reshape only if the shape logic differs from the base class.
    Override Forward_cpu.
    Override Backward_cpu (optional).
    For GPU support, also create src/caffe/layers/my_neuron_layer.cu and override Forward_gpu and, optionally, Backward_gpu.
3. Register the new layer's parameters in src/caffe/proto/caffe.proto.
4. Add the registration macros in my_neuron_layer.cpp:
    INSTANTIATE_CLASS(MyNeuronLayer);
    REGISTER_LAYER_CLASS(MyNeuron);
    If there is a my_neuron_layer.cu file, also add
    INSTANTIATE_LAYER_GPU_FUNCS(MyNeuronLayer);
5. Re-run make and make install.

5. Implementation (CPU only)

5.1 Describing the new layer in a prototxt file

    Since the layer must read the parameters power and bur from the prototxt, and it behaves as an activation function, the sigmoid layer is a good reference. Sigmoid is described in a prototxt file like this:

 

layer {
  name: "sigmoid"
  type: "Sigmoid"
  bottom: "fc1"
  top: "fc1"
}

 

     With the new parameters power and bur, the prototxt description of the new layer looks like this:

layer {
  name: "myneuron"
  type: "MyNeuron"  #name和type不一样
  bottom: "fc1"
  top: "fc1"
  my_neuron_param {
    power: 3
    bur: 1
  }
}
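For intuition: with power: 3 and bur: 1, the layer maps each input element x to x^3 + 1. This can be sketched in NumPy (an illustration of the math only, not part of the Caffe code):

```python
import numpy as np

def my_neuron_forward(x, power=3.0, bur=1.0):
    """Element-wise y = x^power + bur, mirroring the layer's forward pass."""
    return np.power(x, power) + bur

x = np.array([0.0, 1.0, 2.0])
print(my_neuron_forward(x))  # [1. 2. 9.]
```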

5.2 Adding myNeuron_layer.hpp and myNeuron_layer.cpp

    The easiest route is to copy a similar layer. Here sigmoid_layer.hpp and sigmoid_layer.cpp are renamed accordingly and saved to include/caffe/layers/ and src/caffe/layers/ respectively, then modified as follows:

//myNeuron_layer.hpp
#ifndef CAFFE_MY_NEURON_LAYER_HPP_
#define CAFFE_MY_NEURON_LAYER_HPP_

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

#include "caffe/layers/neuron_layer.hpp"

namespace caffe {

template <typename Dtype>
class MyNeuronLayer : public NeuronLayer<Dtype> {
 public:
  
  explicit MyNeuronLayer(const LayerParameter& param)
      : NeuronLayer<Dtype>(param) {}
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  virtual inline const char* type() const { return "MyNeuron"; }

 protected:
  
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,const vector<Blob<Dtype>*>& top);

  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);

  Dtype power_;
  Dtype bur_;
};

}  // namespace caffe

#endif  // CAFFE_MY_NEURON_LAYER_HPP_
//myNeuron_layer.cpp
#include <vector>

#include "caffe/layers/myNeuron_layer.hpp"
#include "caffe/util/math_functions.hpp"

namespace caffe {

template <typename Dtype>
void MyNeuronLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,const vector<Blob<Dtype>*>& top){
  
  NeuronLayer<Dtype>::LayerSetUp(bottom,top);
  power_ = this->layer_param_.my_neuron_param().power();
  bur_ = this->layer_param_.my_neuron_param().bur();
}

// Compute y = x^power + bur
template <typename Dtype>
void MyNeuronLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,const vector<Blob<Dtype>*>& top){

  Dtype* top_data = top[0]->mutable_cpu_data();
  const int count = bottom[0]->count();
  caffe_powx(count, bottom[0]->cpu_data(), Dtype(power_), top_data);
  caffe_add_scalar(count, Dtype(bur_), top_data);
}

// Gradient: dy/dx = power * x^(power - 1), so bottom_diff = top_diff * power * x^(power - 1)
template <typename Dtype>
void MyNeuronLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,const vector<bool>& propagate_down,const vector<Blob<Dtype>*>& bottom){
  const int count = top[0]->count();
  const Dtype* top_diff = top[0]->cpu_diff();
  if(propagate_down[0]){
    const Dtype* bottom_data = bottom[0]->cpu_data();
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    caffe_powx(count, bottom_data, Dtype(power_ - 1), bottom_diff);
    caffe_scal(count, Dtype(power_), bottom_diff);
    caffe_mul(count, bottom_diff, top_diff, bottom_diff);
  }

}

#ifdef CPU_ONLY
STUB_GPU(MyNeuronLayer);
#endif

INSTANTIATE_CLASS(MyNeuronLayer);  
REGISTER_LAYER_CLASS(MyNeuron);

}// namespace caffe
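Backward_cpu above implements the analytic gradient dy/dx = power * x^(power - 1). A quick finite-difference check (a NumPy illustration, not part of Caffe) confirms the formula matches the forward pass:

```python
import numpy as np

def forward(x, power=3.0, bur=1.0):
    # Element-wise y = x^power + bur, as in Forward_cpu
    return np.power(x, power) + bur

def backward(x, top_diff, power=3.0):
    # As in Backward_cpu: bottom_diff = top_diff * power * x^(power-1)
    return top_diff * power * np.power(x, power - 1.0)

x = np.array([0.5, 1.5, 2.0])
top_diff = np.ones_like(x)
eps = 1e-6
numeric = (forward(x + eps) - forward(x - eps)) / (2 * eps)
analytic = backward(x, top_diff)
print(np.allclose(numeric, analytic, atol=1e-4))  # True
```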

5.3 Registration

    Two registrations are needed.

   (1) In src/caffe/proto/caffe.proto, register the my_neuron_param message used by the prototxt:
message LayerParameter {
...
optional MyNeuronParameter my_neuron_param = 150;
...
}
...
message MyNeuronParameter{
  optional float power = 1 [default = 2];
  optional float bur = 2 [default = 1];
}
...
message V1LayerParameter {
...
MYNEURON = 40;  // all caps
...
}

 

        The V1LayerParameter enum uses the type name in all caps; this is the legacy (pre-V2) layer definition, kept around for upgrading old prototxt files.

   (2) At the end of the cpp file, register the newly created class together with its prototxt type name:
INSTANTIATE_CLASS(MyNeuronLayer);  
REGISTER_LAYER_CLASS(MyNeuron); // note: keep this consistent with V1LayerParameter

5.4 Testing

     After rebuilding, create deploy.prototxt and test_my_neuron.py as follows:

# deploy.prototxt

name: "CaffeNet"
input: "data"
input_shape {
  dim: 1 # batch size
  dim: 1 # number of channels (grayscale)
  dim: 28 # height
  dim: 28 # width
}
layer {
  name: "myneuron"
  type: "MyNeuron"
  bottom: "data"
  top: "data_out"
  my_neuron_param {
    power : 2
    bur : 1
  }
}
# -*- coding: utf-8 -*-
# test_my_neuron.py

import numpy as np
import matplotlib.pyplot as plt
import os
import sys

deploy_file = "./deploy.prototxt"
test_data   = "./5.jpg"

if __name__ == '__main__':
  sys.path.append("/home/zjy/caffe/python")
  import caffe

  net = caffe.Net(deploy_file,caffe.TEST)

  transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

  transformer.set_transpose('data', (2, 0, 1))

  img = caffe.io.load_image(test_data,color=False)

  net.blobs['data'].data[...] = transformer.preprocess('data', img)

  print(net.blobs['data'].data[0][0][14])

  out = net.forward()

  print(out['data_out'][0][0][14])
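With power: 2 and bur: 1 in deploy.prototxt, each element of the printed output row should equal the corresponding input element squared plus one. The expected relationship can be sketched with a simulated row (NumPy illustration; in a real run the row comes from net.blobs['data'].data[0][0][14]):

```python
import numpy as np

# Simulated input row; a real run would read net.blobs['data'].data[0][0][14]
row_in = np.linspace(0.0, 1.0, 28)
row_out = np.power(row_in, 2.0) + 1.0  # what MyNeuron (power=2, bur=1) should produce

print(row_out[0], row_out[-1])  # 1.0 2.0
```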

5.5 Results

6. Implementation (GPU)

6.1 Describing the new layer in a prototxt file

    Unchanged.

6.2 Adding myNeuron_layer.hpp and myNeuron_layer.cu

      myNeuron_layer.hpp stays the same; add myNeuron_layer.cu under src/caffe/layers/:

#include <vector>

#include "caffe/layers/myNeuron_layer.hpp"
#include "caffe/util/math_functions.hpp"
#include <iostream>
using namespace std;

namespace caffe {

template <typename Dtype>
void MyNeuronLayer<Dtype>::Forward_gpu(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  const int count = top[0]->count();
  Dtype* top_data = top[0]->mutable_gpu_data();
  caffe_gpu_powx(count, bottom[0]->gpu_data(), Dtype(power_), top_data);
  caffe_gpu_add_scalar(count, Dtype(bur_), top_data);  // y = x^power + bur
}

template <typename Dtype>
void MyNeuronLayer<Dtype>::Backward_gpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  const int count = top[0]->count();
  const Dtype* top_diff = top[0]->gpu_diff();
  if (propagate_down[0]) {
    const Dtype* bottom_data = bottom[0]->gpu_data();
    Dtype* bottom_diff = bottom[0]->mutable_gpu_diff();
    // Debug: reading back through cpu_data()/cpu_diff() forces a device-to-host sync
    const Dtype* bottom_data_w = bottom[0]->cpu_data();
    const Dtype* bottom_diff_w = bottom[0]->cpu_diff();

    cout << "bottom_data[0]: " << bottom_data_w[0] << endl;
    cout << "bottom_diff[0]: " << bottom_diff_w[0] << endl;

    caffe_gpu_powx(count, bottom_data, Dtype(power_ - 1), bottom_diff);

    bottom_diff = bottom[0]->mutable_gpu_diff();
    bottom_data_w = bottom[0]->cpu_data();
    bottom_diff_w = bottom[0]->cpu_diff();
    cout << "bottom_data[0]: " << bottom_data_w[0] << endl;
    cout << "bottom_diff[0]: " << bottom_diff_w[0] << endl;

    caffe_gpu_scal(count, Dtype(power_), bottom_diff);

    bottom_diff = bottom[0]->mutable_gpu_diff();
    bottom_data_w = bottom[0]->cpu_data();
    bottom_diff_w = bottom[0]->cpu_diff();
    cout << "bottom_data[0]: " << bottom_data_w[0] << endl;
    cout << "bottom_diff[0]: " << bottom_diff_w[0] << endl;

    caffe_gpu_mul(count, bottom_diff, top_diff, bottom_diff);

    bottom_diff = bottom[0]->mutable_gpu_diff();
    bottom_data_w = bottom[0]->cpu_data();
    bottom_diff_w = bottom[0]->cpu_diff();
    cout << "bottom_data[0]: " << bottom_data_w[0] << endl;
    cout << "bottom_diff[0]: " << bottom_diff_w[0] << endl;
  }
}

INSTANTIATE_LAYER_GPU_FUNCS(MyNeuronLayer);

}  // namespace caffe

6.3 Registration

(1) Unchanged.

(2) At the end of the .cu file, add:

INSTANTIATE_LAYER_GPU_FUNCS(MyNeuronLayer);

6.4 Testing

     After rebuilding, create deploy.prototxt and test_my_neuron_gpu.py. deploy.prototxt is unchanged; test_my_neuron_gpu.py is as follows:

# -*- coding: utf-8 -*-
# test_my_neuron_gpu.py

import numpy as np
import matplotlib.pyplot as plt
import os
import sys

deploy_file = "./deploy.prototxt"
test_data   = "./5.jpg"

if __name__ == '__main__':
  sys.path.append("/home/zjy/caffe/python")
  import caffe
  caffe.set_mode_gpu()

  net = caffe.Net(deploy_file,caffe.TEST)

  transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

  transformer.set_transpose('data', (2, 0, 1))

  img = caffe.io.load_image(test_data,color=False)

  net.blobs['data'].data[...] = transformer.preprocess('data', img)

  print(net.blobs['data'].data[0][0][14])

  out = net.forward()

  print(out['data_out'][0][0][14])

 
