Automatic fix script for cudart link errors

Problem

When installing a CUDA environment with conda (via mamba here):

mamba install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia -y
mamba install nvidia/label/cuda-12.1.0::cuda

the resulting envs/xxx/lib/libcudart.so symlink can end up pointing at the wrong library.

Fixing it by hand every time is tedious, so I wrote a script that repairs the link automatically.
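To check whether an environment is affected, you can inspect where the link actually points before running the script (a minimal check, run inside the activated environment; $CONDA_PREFIX is set by conda on activation):

ls -l "$CONDA_PREFIX/lib"/libcudart.so*
# A broken setup shows libcudart.so pointing at a missing or stale target,
# while the real library (e.g. libcudart.so.12.1.105) sits right next to it.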

Script

#!/bin/bash

# Check that an environment name was passed as an argument
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <environment_name>"
    exit 1
fi

env_name=$1
env_path="/mnt/data/wangziyi/miniconda3/envs/$env_name/lib"

# Check that the environment path exists
if [ ! -d "$env_path" ]; then
    echo "Error: The environment path $env_path does not exist."
    exit 1
fi

# Use find with an extended regex to locate the highest-versioned libcudart.so file
cudart_so_file=$(find "$env_path" -type f -regextype posix-extended -regex "$env_path/libcudart\.so\.[0-9]+\.[0-9]+\.[0-9]+" -printf '%f\n' | sort -rV | head -n 1)

# Check whether a matching file was found
if [ -z "$cudart_so_file" ]; then
    echo "Error: No file matching libcudart.so.<version> was found in the environment."
    exit 1
fi

# Create or update the libcudart.so symlink
ln -sf "$env_path/$cudart_so_file" "$env_path/libcudart.so"

echo "Symbol link for libcudart.so has been updated to $cudart_so_file."


New version

function fixcuda() {
    # Check whether an environment name was passed as an argument
    if [ "$#" -ne 1 ]; then
        # No argument given: fall back to the active environment (CONDA_DEFAULT_ENV)
        env_name=${CONDA_DEFAULT_ENV:-"base"}
    else
        env_name=$1
    fi

    env_path="$HOME/miniconda3/envs/$env_name/lib"
    echo "Environment path: $env_path"

    # Check that the environment path exists
    # (return, not exit: this runs as a function in the current shell)
    if [ ! -d "$env_path" ]; then
        echo "Error: The environment path $env_path does not exist."
        return 1
    fi

    # Show the current cuda-related entries for reference
    ls -l "$env_path" | grep cuda

    # Use find with an extended regex to locate the highest-versioned libcudart.so file
    cudart_so_file=$(find "$env_path" -type f -regextype posix-extended -regex "$env_path/libcudart\.so\.[0-9]+\.[0-9]+\.[0-9]+" -printf '%f\n' | sort -rV | head -n 1)

    # Check whether a matching file was found
    if [ -z "$cudart_so_file" ]; then
        echo "Error: No file matching libcudart.so.<version> was found in the environment."
        return 1
    fi

    # Create or update the libcudart.so symlink
    ln -sf "$env_path/$cudart_so_file" "$env_path/libcudart.so"

    echo "Symbol link for libcudart.so has been updated to $cudart_so_file."
}
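Since this version is a function, it can be dropped into ~/.bashrc and run with no arguments inside the activated environment; the torch one-liner afterwards is just one way to verify the fix, assuming PyTorch is installed:

source ~/.bashrc
conda activate myenv
fixcuda              # defaults to $CONDA_DEFAULT_ENV
fixcuda otherenv     # or name an environment explicitly
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"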
