Flownet2-PyTorch Installation and Usage Guide

入星

Last modified 2025-01-30 17:07:26

Tags: pytorch, artificial intelligence, python

First published 2025-01-30 16:45:16

Article link: https://blog.youkuaiyun.com/nichengshenhe/article/details/145399304

Code: https://github.com/NVIDIA/flownet2-pytorch

Environment:

Ubuntu 18.04
Miniconda3
Python 3.7
CUDA 10.0
PyTorch 1.0.0 (in my testing some versions raise errors; for versions known to work, see https://github.com/NVIDIA/flownet2-pytorch/issues/156)
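
To check an existing environment against the versions listed above, a small script along these lines may help. It is only a sketch: `version_matches` and `describe_environment` are illustrative helper names, not part of flownet2-pytorch, and the expected version strings are the ones from this post.

```python
import importlib
import sys


def version_matches(installed, expected_prefix):
    """Return True if an installed version string belongs to the expected release.

    Local build tags such as '+cu100' are ignored.
    """
    base = installed.split("+")[0]
    return base == expected_prefix or base.startswith(expected_prefix + ".")


def describe_environment():
    """Collect Python, torch and torchvision versions without failing if one is missing."""
    info = {"python": "%d.%d" % sys.version_info[:2]}
    for name in ("torch", "torchvision"):
        try:
            module = importlib.import_module(name)
            info[name] = getattr(module, "__version__", "unknown")
        except Exception:
            info[name] = None  # not installed, or failed to import
    return info


if __name__ == "__main__":
    env = describe_environment()
    print(env)
    # Expected values are the ones used in this post.
    if env["python"] != "3.7":
        print("Python 3.7 expected, found", env["python"])
    if env["torch"] is None or not version_matches(env["torch"], "1.0.0"):
        print("torch 1.0.0 not found; install.sh may fail to compile")
```

Run it with the flownet2 environment active so that it reports on the interpreter that will actually be used.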

Installation

One option is to rent an AutoDL cloud server (I used one with a single RTX 2080 Ti).

Set up the code environment:
1. Create a new conda environment by running the following in a terminal:
```
conda create -n flownet2 python=3.7  # create the flownet2 environment
conda activate flownet2  # activate the environment
```
  If conda activate reports that conda init is required, run the command below so that bash loads conda automatically, then reopen the terminal and activate the flownet2 environment again.
```
conda init bash
```
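2. Install PyTorch 1.0.0
  
  Find the torch build that matches your environment at https://download.pytorch.org/whl/cu100/torch/, then right-click it and copy the download link.
  
  With the flownet2 environment active, run the following in the terminal: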
```
pip install https:xxx  # replace https:xxx with the copied download link
```
  Next install torchvision. Find torchvision 0.2.0 at https://download.pytorch.org/whl/cu100/torchvision, then right-click it and copy the download link.
  
  With the flownet2 environment active, run the following in the terminal:
```
pip install https:xxx  # replace https:xxx with the copied download link
```
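3. Download the source code and install its dependencies
  
  Download the source with git clone; otherwise running main.py fails with a "git repository not found" error.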
```
git clone https://github.com/NVIDIA/flownet2-pytorch.git
```
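  The clone may time out; if it does, use a proxy or VPN as needed.
  
  In the terminal, cd into the code directory and install the code environment with the provided shell script: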
```
cd flownet2-pytorch
./install.sh
```
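4. Run the code
  
  First prepare a dataset. I used my own data, so I did not try the dataset handling in the official code. You can use the MPI Sintel dataset to run the code; download link: http://sintel.is.tue.mpg.de/downloads.
  
  Before running main.py, adjust the parameters in that file as needed. If you have a GPU, change the GPU setting there; I have one GPU.
  
  Then run the code in the terminal: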
```
python main.py --optimizer_lr=1e-4
```
  Note the optimizer_lr argument set here. Without it, you may get this error:
```
raise ValueError('The histogram is empty, please file a bug report.')
ValueError: The histogram is empty, please file a bug report.
```
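  This happens because, with an unsuitable learning rate, the initial EPE and L1 losses become very large and overflow; the error output then also contains a "NaN or Inf found in input tensor…" message. Tune the learning rate to your own training run; for reference, see https://github.com/NVIDIA/flownet2-pytorch/issues/53#issuecomment-384725298.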
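
One way to make this failure easier to diagnose is to check the loss values for NaN or Inf before they reach the logger. The sketch below is illustrative only: `first_non_finite` is a hypothetical helper, not part of flownet2-pytorch, and it works on plain Python floats (a tensor loss would first need converting, for instance with `loss.item()`).

```python
import math


def first_non_finite(losses):
    """Return the name of the first loss that is NaN or Inf, or None if all are finite."""
    for name, value in losses.items():
        if not math.isfinite(value):
            return name
    return None


# Example with values like those in the failing run (L1 and EPE overflowed to inf).
losses = {"L1": float("inf"), "EPE": float("inf")}
bad = first_non_finite(losses)
if bad is not None:
    print("loss %r is not finite; try a smaller --optimizer_lr" % bad)
```

Stopping at the first non-finite value gives a clearer message than the later "histogram is empty" error, which only appears once the logger receives the overflowed values.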

Errors encountered:

Error when running install.sh:
```
channelnorm_kernel.cu:3:35: fatal error: ATen/cuda/CUDAContext.h: No such file or directory
compilation terminated.
error: command '/usr/local/cuda/bin/nvcc' failed with exit code 1
```
This is caused by an unsuitable PyTorch version. Switching PyTorch from 0.4.1 to 1.0.0 fixed it. For versions known to work, see https://github.com/NVIDIA/flownet2-pytorch/issues/156

Error when running main.py:
```
File "/xxx/flownet2-pytorch/utils/frame_utils.py", line 3, in <module>
    from scipy.misc import imread
ImportError: cannot import name 'imread' from 'scipy.misc' 
```
This happens because a plain pip install scipy installs a version that is too new, in which 'scipy.misc' no longer contains 'imread'. Reinstall scipy with the command below:
```
pip install scipy==1.2.1
```
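
To see whether the installed SciPy still provides `imread`, a quick check like the following can be run inside the flownet2 environment. This is a sketch; `module_has` is a hypothetical helper name. It only reports whether the attribute is importable and changes nothing.

```python
import importlib


def module_has(module_name, attribute):
    """Return True if the module can be imported and exposes the given attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # the module itself is not installed
    return hasattr(module, attribute)


if __name__ == "__main__":
    if module_has("scipy.misc", "imread"):
        print("scipy.misc.imread is available")
    else:
        print("scipy.misc.imread is missing; reinstall scipy as described above")
```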

Error when running main.py:

```
Traceback (most recent call last):
  File "main.py", line 110, in <module>
    args.current_hash = subprocess.check_output(["git", "rev-parse", "HEAD"]).rstrip()
 ......
subprocess.CalledProcessError: Command '['git', 'rev-parse', 'HEAD']' returned non-zero exit status 128.
```

This happens because no git repository is found. Downloading the source with git clone avoids it.
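
If re-cloning is not practical (for example, the code was copied from an archive), one alternative is to make the hash lookup tolerate a missing repository. The sketch below is a local workaround under that assumption, not the upstream code; `current_git_hash` and its `default` value are hypothetical names.

```python
import subprocess


def current_git_hash(cwd=None, default="unknown"):
    """Return the current commit hash, or `default` when git or the repository is unavailable."""
    try:
        output = subprocess.check_output(
            ["git", "rev-parse", "HEAD"],
            cwd=cwd,
            stderr=subprocess.DEVNULL,  # hide git's own error message
        )
    except (subprocess.CalledProcessError, OSError):
        # CalledProcessError: git exited with a non-zero status (e.g. 128, no repository).
        # OSError: the git executable is missing or cwd does not exist.
        return default
    return output.decode("ascii").strip()


print(current_git_hash())
```

In main.py this would stand in for the right-hand side of the `args.current_hash = ...` line shown in the traceback. Whether other parts of the code rely on a real commit hash was not checked here.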

Error when running main.py:

```
NaN or Inf found in input tensor..85s/it]
Training Epoch 1 L1: inf, EPE: inf, lr: 0.001, load: 5.5e-05:  33%|█████▉            | 36/110.0 [00:16<00:34,  2.17it/s]
Overall Progress:   0%|                                                   | 0/10000 [00:29<?, ?it/s]16<00:33,  2.23it/s]
Traceback (most recent call last):
  File "main.py", line 439, in <module>
    train_loss, iterations = train(args=args, epoch=epoch, start_iteration=global_iteration, data_loader=train_loader, model=model_and_loss, optimizer=optimizer, logger=train_logger, offset=offset)
  File "main.py", line 327, in train
    logger.add_histogram(str(key), all_losses[:, i], global_iteration)
  ......
    raise ValueError('The histogram is empty, please file a bug report.')
ValueError: The histogram is empty, please file a bug report.
```

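This is caused by the L1 and EPE loss values growing so large that they overflow. Setting a lower optimizer learning rate fixes it, for example: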