Super-Resolution Reconstruction with DAT: Setup, Run, and Deployment Notes (in progress)

风起舞斜阳

Last modified: 2025-02-08 11:22:45

Tags: super-resolution reconstruction, artificial intelligence, image processing

First published: 2024-07-02 16:48:51

Article link: https://blog.youkuaiyun.com/m0_52249955/article/details/140126485

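Introduction: these notes record how to use a deep-learning method to run super-resolution reconstruction on blurry, degraded images so that they become sharp.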

The code comes from this GitHub repository: zhengchen1999/DAT: PyTorch code for our ICCV 2023 paper "Dual Aggregation Transformer for Image Super-Resolution"

Paper: https://arxiv.org/abs/2308.03364

This post walks through my own installation, deployment, and run, mainly as a debugging log. I hope it helps others too.

1. Getting the code

https://github.com/zhengchen1999/DAT.git

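2. Environment setup

Use conda to create the Python environment. Note: switch to the code directory before running the pip and setup.py commands.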

conda create -n DAT python=3.8
conda activate DAT
pip install -r requirements.txt
python setup.py develop

The PyTorch and CUDA part of the installation may cause problems. The wheel that matches this environment is:

torch-1.8.0+cu111-cp38-cp38-win_amd64.whl

I recommend downloading it from the official index: https://download.pytorch.org/whl/torch_stable.html

After installation you may see "ImportError: cannot import name 'packaging' from 'pkg_resources'".

Downgrading setuptools fixes it:

pip install setuptools==69.5.1
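
Before downgrading, it can help to check which setuptools version is actually installed. The snippet below is a minimal sketch using only the standard library. The helper names `version_tuple` and `needs_downgrade` are made up for this example, and the assumption that the import error starts with setuptools 70.0.0 is mine, not something stated in the DAT repository.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn a version string such as '69.5.1' into (69, 5, 1)."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def needs_downgrade(installed, first_broken="70.0.0"):
    """Assumption: releases from `first_broken` on no longer expose pkg_resources.packaging."""
    return version_tuple(installed) >= version_tuple(first_broken)


try:
    installed = version("setuptools")
    print(installed, "-> downgrade" if needs_downgrade(installed) else "-> ok")
except PackageNotFoundError:
    print("setuptools is not installed in this environment")
```

If the check reports that a downgrade is needed, the pip command above pins the version used in this post.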

Re-run the project setup to check the environment:

python setup.py develop

Then test inside the environment whether CUDA is available:

import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

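If this prints cuda, the GPU was detected successfully.

Download the corresponding dataset and the pretrained models to check the reconstruction quality. They are shared via Baidu Netdisk (an extraction code is required).

Put the pretrained models under experiments/pretrained_models.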

3. Testing

Put the images you want to reconstruct in datasets/single.

Run super-resolution reconstruction:

python basicsr/test.py -opt options/Test/test_single_x4.yml

What test_single_x4.yml contains:

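# general settings
name: test_single_x4   # run name; results are written under results/<name>
model_type: DATModel   # model class to use
scale: 4               # upscaling factor
num_gpu: 1             # number of GPUs
manual_seed: 10        # random seed; the same seed gives the same result, useful for reproducing runs

datasets:              # data to super-resolve
  test_1:
    name: Single
    type: SingleImageDataset       # low-quality images only, no paired ground truth
    dataroot_lq: datasets/single   # folder with the low-resolution test images
    io_backend:
      type: disk                   # I/O backend; images are read from disk

# network structures: settings used to build the model
network_g:
  type: DAT                 # network type
  upscale: 4                # upsampling factor
  in_chans: 3               # number of input channels
  img_size: 64              # image size
  img_range: 1.             # image value range
  split_size: [8,32]        # split size used by the attention mechanism
  depth: [6,6,6,6,6,6]      # network depth per group
  embed_dim: 180            # embedding dimension
  num_heads: [6,6,6,6,6,6]  # number of attention heads per group
  expansion_factor: 4       # expansion factor
  resi_connection: '1conv'  # type of residual connection

# path
path:
  pretrain_network_g: experiments/pretrained_models/DAT/DAT_x4.pth   # path to the pretrained model
  strict_load_g: True               # whether the checkpoint keys must match the network exactly

# validation settings
val:
  save_img: True                    # whether to save the output images
  suffix: 'x4'  # suffix appended to the saved image names
  use_chop: False  # whether to split the input into patches to save memory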

The output is written to results/test_single_x4/visualization/Single.
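
To find a particular result afterwards, the output path can be derived from the config values. This is a rough sketch that assumes the usual BasicSR layout, results/<name>/visualization/<dataset name>/<image>_<suffix>.png. The function `expected_output` and the file name `photo01.jpg` are hypothetical, so check the path against what your run actually writes.

```python
from pathlib import Path


def expected_output(image_path, run_name="test_single_x4",
                    dataset_name="Single", suffix="x4"):
    """Guess where the reconstructed image is saved (assumed BasicSR naming)."""
    stem = Path(image_path).stem  # file name without folder and extension
    return Path("results") / run_name / "visualization" / dataset_name / f"{stem}_{suffix}.png"


print(expected_output("datasets/single/photo01.jpg").as_posix())
# results/test_single_x4/visualization/Single/photo01_x4.png
```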

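4. Building the dataset

Choose the yml file in options/Train that matches the pretrained model you use.

Building the dataset involves: collecting high-resolution images, cleaning them (removing abnormal images), and generating the low-resolution images.

Collecting high-resolution images: pick a dataset that suits your task.

Cleaning the high-resolution images (removing the black images produced by slicing)

For grayscale images: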

from PIL import Image
import numpy as np
import os

def load_grayscale_image(image_path):
    """Load an image as a grayscale NumPy array."""
    with Image.open(image_path) as img:
        if img.mode != 'L':  # make sure the image is in grayscale mode
            img = img.convert('L')
        return np.array(img)

def is_black_dominant_gray(image_array, threshold=0.5, black_threshold=10):
    """Return True if the image has a large black area."""
    # Count the pixels darker than black_threshold
    black_pixels = np.sum(image_array < black_threshold)
    total_pixels = image_array.size
    return black_pixels / total_pixels > threshold

def filter_images(folder_path, threshold=0.5, black_threshold=10):
    """Walk a folder and delete grayscale images with a large black area."""
    for filename in os.listdir(folder_path):
        if filename.lower().endswith(('.png', '.jpg', '.jpeg', '.bmp', '.pgm', '.pbm')):
            image_path = os.path.join(folder_path, filename)
            image_array = load_grayscale_image(image_path)
            if is_black_dominant_gray(image_array, threshold, black_threshold):
                print(f"Removing {filename} due to black dominance.")
                os.remove(image_path)  # deletes the file permanently; comment this line out for a dry run

# Example usage
folder_path = r'G:\1_tmp\img'
filter_images(folder_path)

For color images:

from PIL import Image
import numpy as np
import os

def load_color_image(image_path):
    """Load an image as an RGB NumPy array of shape (H, W, 3)."""
    with Image.open(image_path) as img:
        return np.array(img.convert('RGB'))

def is_black_dominant_color(image_array, threshold=0.5, black_threshold=10):
    """Return True if the color image has a large black area."""
    # The V channel of HSV is the maximum of R, G and B, so a pixel counts as
    # black when its brightest channel is below black_threshold.
    value = image_array.max(axis=-1)

    # Count the pixels darker than black_threshold
    black_pixels = np.sum(value < black_threshold)
    total_pixels = value.size  # number of pixels (H * W), not counting channels
    return black_pixels / total_pixels > threshold

def filter_images(folder_path, threshold=0.5, black_threshold=10):
    """Walk a folder and delete color images with a large black area."""
    for filename in os.listdir(folder_path):
        if filename.lower().endswith(('.png', '.jpg', '.jpeg', '.bmp')):
            image_path = os.path.join(folder_path, filename)
            image_array = load_color_image(image_path)
            if is_black_dominant_color(image_array, threshold, black_threshold):
                print(f"Removing {filename} due to black dominance.")
                os.remove(image_path)  # deletes the file permanently; comment this line out for a dry run

# Example usage
folder_path = r'G:\1_tmp\img'
filter_images(folder_path)
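
To get a feel for how threshold and black_threshold interact before deleting anything, the rule can be tried on a synthetic tile. This is only an illustration of the black-ratio test used in the two scripts above; the helper `black_ratio` and the synthetic tile are made up for the example.

```python
import numpy as np


def black_ratio(gray, black_threshold=10):
    """Fraction of pixels darker than black_threshold in a 2-D uint8 array."""
    return float(np.mean(gray < black_threshold))


# Synthetic 100x100 tile: the left 60 columns are black, the rest is mid-gray.
tile = np.full((100, 100), 128, dtype=np.uint8)
tile[:, :60] = 0

ratio = black_ratio(tile)
print(ratio)        # 0.6
print(ratio > 0.5)  # True: this tile would be removed with threshold=0.5
print(ratio > 0.7)  # False: it would be kept with threshold=0.7
```

A higher threshold keeps more partially black slices, so it is worth testing a few values on a copy of the data first.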

Generating the low-resolution images:

Blurring:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import uuid
from ffmpy import FFmpeg


def blur_image(image_path: str, output_dir: str, level=10):
    """Apply a box blur to one image with ffmpeg and return the path of the blurred copy."""
    ext = os.path.basename(image_path).strip().split('.')[-1]
    if ext.lower() not in ['png', 'jpg', 'jpeg']:
        raise ValueError('format error')
    # Build the output path once so the returned path is the file ffmpeg wrote.
    # Note: the random uuid name does not match the source file name.
    output_path = os.path.join(output_dir, '{}.{}'.format(uuid.uuid4(), ext))
    # Change `executable` to the location of ffmpeg on your machine.
    ff = FFmpeg(executable=r'G:ffmpeg-6.1-win-64\ffmpeg.exe',
        inputs={image_path: None},
        outputs={output_path: '-filter_complex "boxblur={}:1:cr=0:ar=0"'.format(level)})
    print(ff.cmd)
    ff.run()
    return output_path

def blur_images_in_folder(folder_path: str, output_dir: str, level=10):
    # Make sure the output directory exists
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)

    # Walk through every file in the folder
    for filename in os.listdir(folder_path):
        image_path = os.path.join(folder_path, filename)
        # Only process image files
        if os.path.isfile(image_path) and filename.lower().endswith(('png', 'jpg', 'jpeg')):
            # Blur the image and save it to the output directory
            blurred_image_path = blur_image(image_path, output_dir, level)
            print(f"Blurred image saved as: {blurred_image_path}")


# Example usage
# path_to_images is the folder that contains the source images
# path_to_output is the folder where the blurred images are saved
path_to_images = r'G:\1_tmp\img'
path_to_output = r'G:\1_tmp\img_LR'
blur_images_in_folder(path_to_images, path_to_output)

Note: change the executable argument to the location of ffmpeg on your machine, for example:

ff = FFmpeg(executable=r'G:python_ku\ffmpeg-6.1-win-64\ffmpeg.exe'

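Download ffmpeg from FFbinaries: Download binaries for ffmpeg, ffprobe, ffserver and ffplay (cross-platform: Windows, Mac, Linux)

Use bicubic interpolation to downscale the 800*800 blurred images to 200*200; these serve as the low-resolution blurred images.

(MATLAB code; port it to Python if you like)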

function Prepare_TrainData_HR_LR_BI()
%% settings
path_save = './img_LR_200';   % output folder
path_src = './img_LR';   % source folder
ext               =  {'*.jpg','*.png','*.bmp'};
filepaths           =  [];
for i = 1 : length(ext)
    filepaths = cat(1,filepaths, dir(fullfile(path_src, ext{i})));
end
nb_im = length(filepaths);
DIV2K_HR = [];

for idx_im = 1:nb_im
    fprintf('Read HR :%d\n', idx_im);
    ImHR = imread(fullfile(path_src, filepaths(idx_im).name));
    DIV2K_HR{idx_im} = ImHR;
end

%% generate and save LR via imresize() with Bicubic

for IdxIm = 1:nb_im
    fprintf('IdxIm=%d\n', IdxIm);
    ImHR = DIV2K_HR{IdxIm};
    
    ImLRx2 = imresize(ImHR, 1/2, 'bicubic');
    ImLRx3 = imresize(ImHR, 1/3, 'bicubic');
    ImLRx4 = imresize(ImHR, 1/4, 'bicubic');
    
    % Use the original filename and append _x* suffix
    origFileName = filepaths(IdxIm).name;
    fileName_x2 = [origFileName(1:end-4), '.jpg']; % output extension; if you change it, update the training and testing yml files to match
    fileName_x3 = [origFileName(1:end-4), '.jpg'];
    fileName_x4 = [origFileName(1:end-4), '.jpg'];

    FolderLRx2 = fullfile(path_save, 'DIV2K_LR_bicubic', 'X2');   % output subfolder name
    FolderLRx3 = fullfile(path_save, 'DIV2K_LR_bicubic', 'X3');
    FolderLRx4 = fullfile(path_save, 'DIV2K_LR_bicubic', 'X4');

    if ~exist(FolderLRx2, 'dir')
        mkdir(FolderLRx2);
    end
    if ~exist(FolderLRx3, 'dir')
        mkdir(FolderLRx3);
    end
    if ~exist(FolderLRx4, 'dir')
        mkdir(FolderLRx4);
    end

    NameLRx2 = fullfile(FolderLRx2, fileName_x2);
    NameLRx3 = fullfile(FolderLRx3, fileName_x3);
    NameLRx4 = fullfile(FolderLRx4, fileName_x4);
    
    % save image
    imwrite(ImLRx2, NameLRx2);
    imwrite(ImLRx3, NameLRx3);
    imwrite(ImLRx4, NameLRx4);
end
end
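
For readers who prefer Python, the MATLAB script above could be ported roughly as follows. This is a sketch with Pillow, not a drop-in replacement: the function name `make_lr_images` is hypothetical, it writes a single scale instead of X2/X3/X4, and Pillow's bicubic resize is not guaranteed to match MATLAB's imresize pixel for pixel.

```python
import os

from PIL import Image


def make_lr_images(src_dir, dst_dir, scale=4, exts=(".jpg", ".png", ".bmp")):
    """Downscale every image in src_dir by `scale` with bicubic interpolation."""
    os.makedirs(dst_dir, exist_ok=True)
    written = []
    for name in sorted(os.listdir(src_dir)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in exts:
            continue  # skip files that are not images
        with Image.open(os.path.join(src_dir, name)) as img:
            rgb = img.convert("RGB")
            width, height = rgb.size
            lr = rgb.resize((width // scale, height // scale), Image.BICUBIC)
        # Keep the original file name so each LR image can be paired with its source.
        out_path = os.path.join(dst_dir, stem + ".jpg")
        lr.save(out_path, quality=95)
        written.append(out_path)
    return written
```

Calling `make_lr_images('./img_LR', './img_LR_200/DIV2K_LR_bicubic/X4', scale=4)` would mirror the X4 branch of the MATLAB script, turning 800*800 inputs into 200*200 outputs.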

5. Training

python basicsr/train.py -opt options/Train/train_DAT_light_x4.yml

Write or adapt the yml file to your needs; the main things to set are how long to train and the training strategy.
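
One common training strategy in BasicSR-style configs is a multi-step learning-rate schedule, where the rate is multiplied by a factor at fixed iteration milestones. The sketch below shows the arithmetic only. The base rate, milestones, and gamma are illustrative assumptions, not values taken from this post, so read the real ones from your own yml file.

```python
from bisect import bisect_right


def multistep_lr(iteration, base_lr=2e-4,
                 milestones=(250000, 400000, 450000, 475000), gamma=0.5):
    """Learning rate after `iteration` steps under a multi-step decay schedule."""
    # Each milestone that has been reached multiplies the rate by gamma once.
    passed = bisect_right(milestones, iteration)
    return base_lr * gamma ** passed


for it in (0, 250000, 480000):
    print(it, multistep_lr(it))
```

With these assumed numbers the rate is halved at each milestone, so late checkpoints are trained with a much smaller step than early ones.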

6. Testing

python basicsr/test.py -opt options/Test/test_single_x4.yml

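You can also build a test set to evaluate the model.

7. Converting the .pth model to ONNX

Convert the model to ONNX. Note that the conversion needs a lot of memory: an RTX 4090 with 24 GB could only export at 400*400, and 800*800 failed. This matters little for a super-resolution model, because a large image can be split into tiles and stitched back together afterwards.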

from basicsr.archs.dat_arch import DAT
import torch

# Report whether CUDA is available, then run the export on the CPU.
print(torch.device('cuda' if torch.cuda.is_available() else 'cpu'))
device = 'cpu'

# Build the network with the same settings that were used for training.
model = DAT(
    upscale=4,
    in_chans=3,
    img_size=64,
    img_range=1.,
    depth=[18],
    embed_dim=60,
    num_heads=[6],
    expansion_factor=2,
    resi_connection='3conv',
    split_size=[8,32],
    upsampler='pixelshuffledirect'
).to(device)

# Load the trained weights
state_dict = torch.load(r'G:\1_tmp_1\DAT_light_1600_g_480000.pth', map_location=device)['params']
model.load_state_dict(state_dict, strict=False)  # strict=False ignores keys that do not match

# Make sure the model is in evaluation mode
model.eval()

# Run one forward pass to check the output shape
dummy_input = torch.randn(1,3,200,200).to(device)
with torch.no_grad():
    x = model(dummy_input)
print(x.shape)

# Export the model to ONNX
torch.onnx.export(model,               # model
                  dummy_input,         # example input tensor
                  'DAT_model12.onnx',   # output file path
                  export_params=True,        # store the trained parameters in the ONNX file
                  opset_version=11,          # ONNX opset version
                  do_constant_folding=True,  # apply constant-folding optimization
                  input_names=['input'],     # name of the input tensor
                  output_names=['output'],   # name of the output tensor
                  dynamic_axes={'input': {0: 'batch_size'},  # axes with variable size (batch only)
                                'output': {0: 'batch_size'}})
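
Because the export fixes the spatial size of the input, larger images have to be processed tile by tile, as noted at the start of this section. The following is a minimal NumPy sketch of that split-and-stitch idea under stated assumptions: `upscale_tiled` is a hypothetical helper, `run_model` stands in for the real ONNX call, and `nearest_upscale` is only a placeholder model so the example runs on its own.

```python
import numpy as np


def upscale_tiled(img, run_model, tile=200, scale=4):
    """Upscale a (C, H, W) array tile by tile and stitch the results.

    run_model maps a (1, C, tile, tile) batch to (1, C, tile*scale, tile*scale).
    """
    c, h, w = img.shape
    # Pad on the bottom and right so both sides are multiples of the tile size.
    pad_h, pad_w = (-h) % tile, (-w) % tile
    padded = np.pad(img, ((0, 0), (0, pad_h), (0, pad_w)), mode="edge")
    out = np.zeros((c, padded.shape[1] * scale, padded.shape[2] * scale), dtype=img.dtype)
    for y in range(0, padded.shape[1], tile):
        for x in range(0, padded.shape[2], tile):
            patch = padded[None, :, y:y + tile, x:x + tile]
            out[:, y * scale:(y + tile) * scale, x * scale:(x + tile) * scale] = run_model(patch)[0]
    # Crop away the area that came from the padding.
    return out[:, :h * scale, :w * scale]


def nearest_upscale(batch, scale=4):
    """Placeholder for the real model: nearest-neighbour upscaling."""
    return batch.repeat(scale, axis=2).repeat(scale, axis=3)


image = np.random.rand(3, 450, 330).astype(np.float32)
result = upscale_tiled(image, nearest_upscale, tile=200, scale=4)
print(result.shape)  # (3, 1800, 1320)
```

With a real network, hard tile borders can show up as visible seams; overlapping the tiles and blending the overlap is a common way to reduce them, at the cost of extra inference time.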

8. Using the ONNX model

import os

import numpy as np
import onnxruntime as ort
import torch
import cv2
import math
from typing import Optional, List, Union
from PIL import Image
import time
print(torch.cuda.is_available())
print(ort.__version__)
print(ort.get_device())
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
def make_grid(
    tensor: Union[torch.Tensor, List[torch.Tensor]],
    nrow: int = 8,
    padding: int = 0,
    normalize: bool = False,
    value_range: Optional[tuple] = None,
    scale_each: bool = False,
    pad_value: int = 0
) -> torch.Tensor:
    """Make a grid of images.

    Args:
        tensor (Tensor or list): 4D mini-batch Tensor of shape (B x C x H x W)
            or a list of images all of the same size.
        nrow (int, optional): Number of images displayed in each row of the grid.
            The final grid size is ``(B / nrow, nrow)``. Default: ``8``.
        padding (int, optional): amount of padding. Default: ``0``.
        normalize (bool, optional): If True, shift the image to the range (0, 1),
            by the min and max values specified by :attr:`value_range`. Default: ``False``.
        value_range (tuple, optional): tuple (min, max) where min and max are numbers,
            then these numbers are used to normalize the image. By default, min and max
            are computed from the tensor.
        scale_each (bool, optional): If ``True``, scale each image in the batch of
            images separately rather than the (min, max) over all images. Default: ``False``.
        pad_value (int, optional): Value for the padded pixels. Default: ``0``.

    Returns:
        Tensor: Grid of images.
    """
    if not (torch.is_tensor(tensor) or
            (isinstance(tensor, list) and all(torch.is_tensor(t) for t in tensor))):
        raise TypeError(f'tensor or list of tensors expected, got {type(tensor)}')

    # if list of tensors, convert to a 4D mini-batch Tensor
    if isinstance(tensor, list):
        tensor = torch.stack(tensor, dim=0)

    if tensor.dim() == 2:  # single image H x W
        tensor = tensor.unsqueeze(0)
    if tensor.dim() == 3:  # single image
        if tensor.size(0) == 1:  # if single-channel, convert to 3-channel
            tensor = torch.cat((tensor, tensor, tensor), 0)
        tensor = tensor.unsqueeze(0)

    if tensor.dim() == 4 and tensor.size(1) == 1:  # single-channel images
        tensor = torch.cat((tensor, tensor, tensor), 1)

    if normalize is True:
        tensor = tensor.clone()  # avoid modifying tensor in-place
        if value_range is not None:
            assert isinstance(value_range, tuple), \
                "value_range has to be a tuple (min, max) if specified. min and max are numbers"

        def norm_ip(img, low, high):
            img.clamp_(min=low, max=high)
            img.sub_(low).div_(max(high - low, 1e-5))

        def norm_range(t, value_range):
            if value_range is not None:
                norm_ip(t, value_range[0], value_range[1])
            else:
                norm_ip(t, float(t.min()), float(t.max()))

        if scale_each is True:
            for t in tensor:  # loop over mini-batch dimension
                norm_range(t, value_range)
        else:
            norm_range(tensor, value_range)

    if tensor.size(0) == 1:
        return tensor.squeeze(0)

    # make the mini-batch of images into a grid
    nmaps = tensor.size(0)
    xmaps = min(nrow, nmaps)
    ymaps = int(math.ceil(float(nmaps) / xmaps))
    height, width = int(tensor.size(2) + padding), int(tensor.size(3) + padding)
    num_channels = tensor.size(1)
    grid = tensor.new_full((num_channels, height * ymaps + padding, width * xmaps + padding), pad_value)
    k = 0
    for y in range(ymaps):
        for x in range(xmaps):
            if k >= nmaps:
                break
            grid.narrow(1, y * height + padding, height - padding).narrow(  # type: ignore[attr-defined]
                2, x * width + padding, width - padding
            ).copy_(tensor[k])
            k = k + 1
    return grid
def tensor2img_run(tensor, rgb2bgr=True, out_type=np.uint8, min_max=(0, 1)):
    """Convert torch Tensors into image numpy arrays.

    After clamping to [min, max], values will be normalized to [0, 1].

    Args:
        tensor (Tensor or list[Tensor]): Accept shapes:
            1) 4D mini-batch Tensor of shape (B x 3/1 x H x W);
            2) 3D Tensor of shape (3/1 x H x W);
            3) 2D Tensor of shape (H x W).
            Tensor channel should be in RGB order.
        rgb2bgr (bool): Whether to change rgb to bgr.
        out_type (numpy type): output types. If ``np.uint8``, transform outputs
            to uint8 type with range [0, 255]; otherwise, float type with
            range [0, 1]. Default: ``np.uint8``.
        min_max (tuple[int]): min and max values for clamp.

    Returns:
        (Tensor or list): 3D ndarray of shape (H x W x C) OR 2D ndarray of
        shape (H x W). The channel order is BGR.
    """
    if not (torch.is_tensor(tensor) or (isinstance(tensor, list) and all(torch.is_tensor(t) for t in tensor))):
        raise TypeError(f'tensor or list of tensors expected, got {type(tensor)}')

    if torch.is_tensor(tensor):
        tensor = [tensor]
    result = []
    for _tensor in tensor:
        _tensor = _tensor.squeeze(0).float().detach().cpu().clamp_(*min_max)
        _tensor = (_tensor - min_max[0]) / (min_max[1] - min_max[0])

        n_dim = _tensor.dim()
        if n_dim == 4:
            img_np = make_grid(_tensor, nrow=int(math.sqrt(_tensor.size(0))), normalize=False).numpy()
            img_np = img_np.transpose(1, 2, 0)
            if rgb2bgr:
                img_np = cv2.cvtColor(img_np, cv2.COLOR_RGB2BGR)
        elif n_dim == 3:
            img_np = _tensor.numpy()
            img_np = img_np.transpose(1, 2, 0)
            if img_np.shape[2] == 1:  # gray image
                img_np = np.squeeze(img_np, axis=2)
            else:
                if rgb2bgr:
                    img_np = cv2.cvtColor(img_np, cv2.COLOR_RGB2BGR)
        elif n_dim == 2:
            img_np = _tensor.numpy()
        else:
            raise TypeError(f'Only support 4D, 3D or 2D tensor. But received with dimension: {n_dim}')
        if out_type == np.uint8:
            # Unlike MATLAB, numpy.uint8() WILL NOT round by default.
            img_np = (img_np * 255.0).round()
        img_np = img_np.astype(out_type)
        result.append(img_np)
    if len(result) == 1:
        result = result[0]
    return result
# def load_onnx_model(model_path):
#     """Load the ONNX model."""
#     session = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
#     return session
def load_onnx_model(model: str) -> ort.InferenceSession:
    providers = ['CPUExecutionProvider']
    if torch.cuda.is_available():
        providers.insert(0, 'CUDAExecutionProvider')
    session = ort.InferenceSession(model, providers=providers)
    return session
def infer(session, input_tensor):
    """Run inference with the ONNX model."""

    input_name = session.get_inputs()[0].name
    output = session.run(None, {input_name: input_tensor})  # inference output
    return output

def get_image_bytes(filepath):
    """Read the raw bytes of an image file."""
    with open(filepath, 'rb') as f:
        image_bytes = f.read()
    return image_bytes

def imfrombytes(content, flag='color', float32=False):
    """Decode an image from raw bytes."""
    img_np = np.frombuffer(content, np.uint8)
    img = cv2.imdecode(img_np, cv2.IMREAD_COLOR)
    if float32:
        img = img.astype(np.float32) / 255.
    return img

def img2tensor(img, bgr2rgb=True, float32=True):
    """Convert an HWC NumPy image to CHW layout (returns a NumPy array, not a torch tensor)."""
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) if bgr2rgb else img
    # Change the layout from HWC to CHW; the dtype stays float32
    img = img.transpose(2, 0, 1)
    return img
def preprocess_image(image_path, block_size=64):
    """Preprocess an image into the input format the model expects."""
    image_bytes = get_image_bytes(image_path)
    img_cv = imfrombytes(image_bytes, flag='color', float32=True)
    img_numpy = img2tensor(img_cv)
    # axis=0 adds a leading batch dimension
    img_numpy = np.expand_dims(img_numpy, axis=0)
    return img_numpy, img_cv.shape[0], img_cv.shape[1]
def main(session, input_image_path, output_image_path, sc):

    # Preprocess the image
    input_img_numpy, height, width = preprocess_image(input_image_path)
    output_numpy = infer(session, input_img_numpy)
    output_tensors = torch.tensor(output_numpy[0], dtype=torch.get_default_dtype())
    output_np = tensor2img_run(output_tensors, rgb2bgr=True, out_type=np.uint8, min_max=(0, 1))
    img_rgb = cv2.cvtColor(output_np, cv2.COLOR_BGR2RGB)
    img_pil = Image.fromarray(img_rgb, 'RGB')
  #  img_pil.show()
    sc = sc+1
    print(f"Image {sc}: output_image_path is {output_image_path}")
    img_pil.save(output_image_path)
    return sc


if __name__ == "__main__":
    model_path = r'G:\1_超分辨数据集\onnx模型\DAT_light_HighPrecisiondata_400to1600_x4_CPU.onnx'
    # Replace with your input image folder
    input_images_folder = r'G:\1_tmp_1\test_data_SR_400x400\test_onnx_infer_input_400x400_DAT_light_HighPrecisiondata_800to3200_x4_400to1600_GPU'
    # Replace with your output image folder
    output_images_folder = r'G:\1_tmp_1\test_data_SR_400x400\test_onnx_infer_output_1600x1600_DAT_light_HighPrecisiondata_400to1600_x4_CPU'

    # Make sure the output folder exists
    os.makedirs(output_images_folder, exist_ok=True)

    # Record the start time
    a = time.time()
    # Load the model
    session = load_onnx_model(model_path)
    sc = 0
    # Walk through every file in the input folder
    for image_filename in os.listdir(input_images_folder):
        # Only process files with an image extension
        if image_filename.lower().endswith(('.png', '.jpg', '.jpeg', '.bmp', '.gif')):
            # Build the full input and output paths
            input_image_path = os.path.join(input_images_folder, image_filename)
            output_image_path = os.path.join(output_images_folder, image_filename)
            # try:
                # Run the model on this image
            sc = main(session, input_image_path, output_image_path,sc)
            # except Exception as e:
            #     print(f"Error while processing the image: {e}")

    # Record the end time
    b = time.time()

    # Print the total processing time in seconds
    print('Total time:', int(b - a))

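This code uses the onnxruntime-gpu library for inference: load_onnx_model() loads the model on the GPU when one is available, and the input is prepared with NumPy with shape 1*3*n*n.

Required packages (Windows, conda environment DAT)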
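
The 1*3*n*n layout can be seen in isolation with a few lines of NumPy. This is a simplified sketch of the pre- and post-processing around the ONNX call, assuming a uint8 BGR image as returned by OpenCV. The helper names `to_model_input` and `to_image` are made up for the example and are not part of the script above.

```python
import numpy as np


def to_model_input(img_bgr):
    """uint8 (H, W, 3) BGR image -> float32 (1, 3, H, W) RGB batch in [0, 1]."""
    rgb = img_bgr[..., ::-1].astype(np.float32) / 255.0
    return np.ascontiguousarray(rgb.transpose(2, 0, 1)[None])


def to_image(output):
    """float (1, 3, H, W) RGB batch -> uint8 (H, W, 3) BGR image."""
    chw = np.clip(output[0], 0.0, 1.0)
    hwc = (chw.transpose(1, 2, 0) * 255.0).round().astype(np.uint8)
    return np.ascontiguousarray(hwc[..., ::-1])


fake_image = np.random.randint(0, 256, size=(400, 400, 3), dtype=np.uint8)
batch = to_model_input(fake_image)
print(batch.shape, batch.dtype)  # (1, 3, 400, 400) float32
```

The spatial size must match the size used at export time, since only the batch axis was declared dynamic.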

Package                 Version            Editable project location
----------------------- ------------------ -------------------------
absl-py                 2.1.0                                       
addict                  2.4.0                                       
astunparse              1.6.3                                       
basicsr                 1.3.5                         
cachetools              5.3.3                                       
certifi                 2024.6.2                                    
chardet                 5.2.0                                       
charset-normalizer      3.3.2                                       
colorama                0.4.6                                       
coloredlogs             15.0.1
einops                  0.8.0
filelock                3.15.4
flatbuffers             24.3.25
fsspec                  2024.6.1
future                  1.0.0
fvcore                  0.1.5.post20221221
google-auth             2.31.0
google-auth-oauthlib    1.0.0
grpcio                  1.64.1
h5py                    3.11.0
huggingface-hub         0.23.4
humanfriendly           10.0
idna                    3.7
imageio                 2.34.2
importlib_metadata      8.0.0
iopath                  0.1.10
lazy_loader             0.4
lmdb                    1.5.1
Markdown                3.6
MarkupSafe              2.1.5
mpmath                  1.3.0
networkx                3.1
numpy                   1.24.4
oauthlib                3.2.2
onnx                    1.16.1
onnxruntime             1.18.1
onnxruntime-gpu         1.18.1
opencv-python           4.10.0.84
packaging               24.1
pillow                  10.4.0
pip                     24.0
platformdirs            4.2.2
portalocker             2.10.1
protobuf                5.27.2
pyasn1                  0.6.0
pyasn1_modules          0.4.0
scipy                   1.10.1
setuptools              69.5.1
six                     1.16.0
sympy                   1.13.0
tabulate                0.9.0
tb-nightly              2.14.0a20230808
tensorboard-data-server 0.7.2
termcolor               2.4.0
tifffile                2023.7.10
timm                    1.0.7
tomli                   2.0.1
torch                   1.8.0+cu111
torchvision             0.9.0
tqdm                    4.66.4
typing_extensions       4.12.2
urllib3                 2.2.2
Werkzeug                3.0.3
wheel                   0.43.0
yacs                    0.1.8
yapf                    0.40.2
zipp                    3.19.2

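Note: install CUDA and PyTorch versions that match your own GPU.

CUDA builds of torch and torchvision can be downloaded from download.pytorch.org/whl/torch_stable.html

Download the matching versions.

10. Deploying on an NVIDIA Orin NX developer board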

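Install jtop to check your NVIDIA board and software versions. jtop ships with the jetson-stats package, which is installed with pip:

sudo pip3 install -U jetson-stats

Run jtop to see your board model.

Install nvidia-jetpack, which sets up CUDA, TensorRT, and cuDNN.

sudo vi /etc/apt/sources.list opens the file that lists the package sources. It is usually this file, although the default can differ on some systems. Any editor works (gedit is fine too). Add the following two lines: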

deb https://repo.download.nvidia.com/jetson/common r35.3 main
deb https://repo.download.nvidia.com/jetson/t210 r35.3 main

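Add both lines. The main things to change are the number after t, which depends on your board (t234 for Orin modules; t210 is the Nano/TX1 family), and the r35.3 at the end, which is your L4T release. Only the first two components are needed (for example, 32.7.3 becomes r32.7); with the full patch version the repository is not found.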

sudo apt update
sudo apt install nvidia-jetpack
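
The two source lines follow a fixed pattern, so they can be generated from the L4T version and the SoC name. This is a small illustrative helper, not an NVIDIA tool: the function name `jetson_apt_sources` is hypothetical, and it only formats strings without touching /etc/apt.

```python
def jetson_apt_sources(l4t_version, soc):
    """Build the two apt source lines for a given L4T version and SoC (e.g. 't234')."""
    # Keep only major.minor: '35.3.1' becomes '35.3'.
    major_minor = ".".join(l4t_version.split(".")[:2])
    base = "https://repo.download.nvidia.com/jetson"
    return [
        f"deb {base}/common r{major_minor} main",
        f"deb {base}/{soc} r{major_minor} main",
    ]


for line in jetson_apt_sources("35.3.1", "t234"):
    print(line)
```

For an Orin NX on L4T 35.3.1 this prints the common line and the t234 line with r35.3, which are the two entries to add before running apt update.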

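Installing PyTorch

Install the PyTorch build that matches your CUDA version.

Resources: https://pan.baidu.com/s/1rb9m11lq_nQqNJ2_ToX5OA?pwd=50vd
Extraction code: 50vd
Note: PyTorch for ARM is a special case. It cannot be installed online, so download the offline package from the resources above.

torchvision: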

git clone --branch v0.13.0 https://gitee.com/xing-min/vision torchvision

Reference: "The most complete guide to installing PyTorch, torchvision, and TensorRT on Jetson AGX Orin" (优快云 blog)