Step-by-Step Installation of 3DGS and TalkingGaussian on Linux

烧技湾

Last modified: 2024-09-05 21:17:48

First published: 2024-09-04 17:02:27

Article link: https://blog.youkuaiyun.com/wqthaha/article/details/141883838

Preface

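This tutorial records, step by step, how to set up a 3DGS environment on Linux. The road had its ups and downs; the pitfalls encountered along the way are documented here, together with a route that is known to work, for reference.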

1. Install CUDA


The official documentation says "we used 11.8, known issues with 11.6", so make sure the CUDA version is 11.8 or later.

CUDA Toolkit archive: https://developer.nvidia.com/cuda-toolkit-archive

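# Download the runfile installer
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run

# Make the downloaded file executable
sudo chmod +x cuda_11.8.0_520.61.05_linux.run

# Run the CUDA installer
sudo sh cuda_11.8.0_520.61.05_linux.run

In the installer menu, press Space to deselect the bundled driver so that the GPU driver already installed on the system is kept, then select Install.

Note that the installation directory is "/usr/local/cuda-11.8/".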

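After the installation finishes, configure and reload the environment variables:

# Open the shell configuration file for editing
gedit ~/.bashrc

# Add the paths (the default path is shown here; adjust it to your own install location)
# cuda11.8
export PATH=/usr/local/cuda-11.8/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda-11.8

# Reload the environment variables
source ~/.bashrc

# Check the CUDA version with:
nvcc -V

If the output reports 11.8, the switch was successful.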
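The version check can also be scripted. The following is a minimal sketch, assuming nvcc prints a line containing "release X.Y" (as it does in typical nvcc -V output); the function names parse_nvcc_release and installed_cuda_release are hypothetical helpers, not part of any toolkit.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Return the 'X.Y' from a 'release X.Y' line in `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_cuda_release() -> Optional[str]:
    """Run `nvcc -V` if nvcc is on PATH and return its release version."""
    if shutil.which("nvcc") is None:
        return None  # nvcc not found: PATH in ~/.bashrc is probably not set yet
    result = subprocess.run(["nvcc", "-V"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    print("nvcc release:", release)
    print("matches 11.8:", release == "11.8")
```

If installed_cuda_release() returns None, the usual cause is that the PATH export was not picked up by the current shell.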

2. Install cuDNN

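Download link: https://developer.nvidia.com/rdp/cudnn-archive
Recommended Chinese mirror: https://developer.nvidia.cn/rdp/cudnn-archive
Here I chose the tar version of cuDNN 8.9.7 for CUDA 11.x.

Downloading requires logging in to an NVIDIA account. After the download, extract the archive and copy the libraries and header files into the CUDA 11.8 directory. The main thing to watch is that the paths point to the CUDA 11.8 installation.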

tar -xvf cudnn-linux-x86_64-8.x.x.x_cudaX.Y-archive.tar.xz
sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda-11.8/include 
sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda-11.8/lib64 
sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h /usr/local/cuda-11.8/lib64/libcudnn*

Verify the cuDNN installation:

cat /usr/local/cuda-11.8/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
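As an alternative to reading the grep output by eye, the header can be parsed directly. This is a minimal sketch, assuming cudnn_version.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integer macros; parse_cudnn_version is a hypothetical helper name.

```python
import re
from pathlib import Path
from typing import Optional


def parse_cudnn_version(header_text: str) -> Optional[str]:
    """Extract 'MAJOR.MINOR.PATCHLEVEL' from the text of cudnn_version.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None  # macro missing: wrong file or unexpected header layout
        parts.append(match.group(1))
    return ".".join(parts)


if __name__ == "__main__":
    # Path used in this walkthrough; adjust it if CUDA is installed elsewhere.
    header = Path("/usr/local/cuda-11.8/include/cudnn_version.h")
    if header.exists():
        print("cuDNN version:", parse_cudnn_version(header.read_text()))
    else:
        print("Header not found:", header)
```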


Install the TalkingGaussian project

github: https://github.com/Fictionarry/TalkingGaussian

Environment: Tested on Ubuntu 20.04, CUDA 11.8 (installed via the runfile), PyTorch 1.12.1

git clone git@github.com:Fictionarry/TalkingGaussian.git --recursive
cd TalkingGaussian

conda env create --file environment.yml
conda activate talking_gaussian

Install the 3DGS components as follows:

pip install ./submodules/diff-gaussian-rasterization

pip install ./submodules/simple-knn

pip install ./gridencoder

pip install "git+https://github.com/facebookresearch/pytorch3d.git"

pip install tensorflow-gpu==2.8.0

Prepare the required files

Prepare face-parsing model and the 3DMM model for head pose estimation.

bash scripts/prepare.sh

Download 3DMM model from Basel Face Model 2009:

# 1. copy 01_MorphableModel.mat to data_utils/face_tracking/3DMM/
# 2. run the following
cd data_utils/face_tracking
python convert_BFM.py
cd ../..  # return to the repository root for the later steps

01_MorphableModel.mat can be downloaded from:

https://github.com/jadewu/3D-Human-Face-Reconstruction-with-3DMM-face-model-from-RGB-image/blob/main/BFM/01_MorphableModel.mat

Prepare the environment for EasyPortrait:

# prepare mmcv
conda activate talking_gaussian
pip install -U openmim
mim install mmcv-full==1.7.1

# download model weight
cd data_utils/easyportrait
wget "https://n-ws-620xz-pd11.s3pd11.sbercloud.ru/b-ws-620xz-pd11-jux/easyportrait/experiments/models/fpn-fp-512.pth"


Test TalkingGaussian

Video Dataset

Here we provide two video clips used in our experiments, which are captured from YouTube. Please respect the original content creators’ rights and comply with YouTube’s copyright policies in the usage.

Video link: https://drive.google.com/drive/folders/1E_8W805lioIznqbkvTQHWWi5IFXUG7Er

Other used videos can be found from GeneFace and AD-NeRF.

Pre-processing Training Video

Put training video under data/may/may.mp4.

The video must be 25FPS, with all frames containing the talking person. The resolution should be about 512x512, and duration about 1-5 min.
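Before running the preprocessing script, it can help to confirm that the clip meets these requirements. The following is a minimal sketch, assuming ffprobe (from FFmpeg) is installed; check_video_metadata and probe are hypothetical helper names, and the 60 to 300 second window is simply one reading of "about 1-5 min". Resolution is reported by ffprobe but not enforced here, since "about 512x512" is not a hard limit.

```python
import json
import subprocess
from fractions import Fraction
from typing import List


def check_video_metadata(ffprobe_json: str) -> List[str]:
    """Return a list of problems found in ffprobe's JSON output (empty if none)."""
    info = json.loads(ffprobe_json)
    stream = info["streams"][0]
    problems = []
    fps = Fraction(stream["r_frame_rate"])  # ffprobe reports e.g. "25/1"
    if fps != 25:
        problems.append(f"frame rate is {float(fps):.2f} FPS, expected 25")
    duration = float(info["format"]["duration"])
    if not 60 <= duration <= 300:
        problems.append(f"duration is {duration:.0f} s, expected about 1-5 min")
    return problems


def probe(path: str) -> str:
    """Ask ffprobe for the first video stream's size, frame rate, and duration."""
    cmd = ["ffprobe", "-v", "error", "-select_streams", "v:0",
           "-show_entries", "stream=width,height,r_frame_rate:format=duration",
           "-of", "json", path]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

For example, check_video_metadata(probe("data/may/may.mp4")) returns an empty list when the clip is 25 FPS and between one and five minutes long.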

Run script to process the video.

python data_utils/process.py data/may/may.mp4

The script downloads some files while it runs, so it takes a fairly long time to finish.

Obtain Action Units

Run FeatureExtraction in OpenFace, rename and move the output CSV file to data/<ID>/au.csv.
On Windows this step is very fast using the prebuilt exe: only a few model files need to be downloaded before it runs successfully.

Generate tooth masks

export PYTHONPATH=./data_utils/easyportrait 
python ./data_utils/easyportrait/create_teeth_mask.py ./data/may

This generates the teeth masks, using the model weights downloaded above.

Audio Pre-process

In our paper, we use DeepSpeech features for evaluation.

DeepSpeech

python data_utils/deepspeech_features/extract_ds_features.py --input data/may/aud.wav 
# saved to data/<name>.npy

HuBERT

Similar to ER-NeRF, HuBERT is also available. Recommended for situations if the audio is not in English.

Specify --audio_extractor hubert when training and testing.

python data_utils/hubert.py --wav data/may/aud.wav
# saved to data/<name>_hu.npy

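Quite a few bugs came up along the way; they are recorded here:
BUG1: ModuleNotFoundError: No module named 'importlib.metadata'
Fix: upgrade Python to 3.8.11 (importlib.metadata was added to the standard library in Python 3.8).

BUG2: ImportError: cannot import name 'get_full_repo_name' from 'huggingface_hub' #2055
Fix: pip install --upgrade huggingface_hub

BUG3:
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like facebook/hubert-large-ls960-ft is not the path to a directory containing a file named preprocessor_config.json.
Fix: download the model in advance and load it offline.

BUG4: Torch not compiled with CUDA enabled
So close, yet it all fell apart at the last step!

Everything above had to be redone: the cudatoolkit in the environment was 11.3, which is incompatible with the system CUDA version 11.8. Truly exasperating!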

Starting over:
The environment was changed: Python was switched to 3.8 and cudatoolkit to 11.6.

This time the installation succeeded on the first try.
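To catch the "Torch not compiled with CUDA enabled" problem before running the whole pipeline, the PyTorch build can be inspected right after the environment is created. This is a minimal sketch: torch.version.cuda and torch.cuda.is_available() are real PyTorch attributes, while diagnose_torch_cuda is a hypothetical helper and its messages are only suggestions.

```python
from typing import Optional


def diagnose_torch_cuda(built_with: Optional[str], cuda_available: bool) -> str:
    """Explain the state of a PyTorch install.

    built_with     -- value of torch.version.cuda (None for CPU-only builds)
    cuda_available -- value of torch.cuda.is_available()
    """
    if built_with is None:
        return "CPU-only PyTorch build: reinstall a CUDA-enabled build"
    if not cuda_available:
        return f"built for CUDA {built_with}, but no usable GPU/driver was found"
    return f"OK: built for CUDA {built_with} and a GPU is available"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print(diagnose_torch_cuda(torch.version.cuda, torch.cuda.is_available()))
```

Run it inside the activated talking_gaussian environment; a CPU-only result means the environment needs to be rebuilt before any training is attempted.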

BUGs:
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.

Fix:
pip install --upgrade "protobuf<=3.20.1"

It finally worked, after being stuck on this step for a long time!

Train

If resources are sufficient, partially parallel is available to speed up the training. See the script.

bash scripts/train_xx.sh data/may output/may_talkingface 0

Bugs encountered:

Bug1: FileNotFoundError: [Errno 2] No such file or directory: '/home/wqt/TFG/TalkingGaussian/data/may/aud_ds.npy'
Fix: copy aud.npy to aud_ds.npy.
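The copy can be done by hand or with a small script. The following is a minimal sketch of the workaround described above; ensure_ds_features is a hypothetical helper name, and it only copies the file when aud_ds.npy does not exist yet.

```python
import shutil
from pathlib import Path


def ensure_ds_features(data_dir: str) -> Path:
    """Copy <data_dir>/aud.npy to <data_dir>/aud_ds.npy if the latter is missing."""
    src = Path(data_dir) / "aud.npy"
    dst = Path(data_dir) / "aud_ds.npy"
    if not dst.exists():
        shutil.copyfile(src, dst)  # raises FileNotFoundError if aud.npy is absent
    return dst


if __name__ == "__main__":
    data_dir = Path("data/may")  # dataset directory used in this walkthrough
    if (data_dir / "aud.npy").exists():
        print("features at:", ensure_ds_features(str(data_dir)))
    else:
        print("aud.npy not found in", data_dir)
```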

Bug2: AttributeError: module 'PIL.Image' has no attribute 'ANTIALIAS'
Fix: Image.ANTIALIAS was removed in Pillow 10; replace it with Image.Resampling.LANCZOS.

python train_mouth.py finished training; it was fairly quick, about 10 minutes.

python train_face.py hit the following bug:

Bug3: Training Error: TypeError: No loop matching the specified signature and casting was found for ufunc greater

Fix: downgrade numpy to version 1.23.4:
pip install numpy==1.23.4

Training finally ran to completion!
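Two of the fixes in this walkthrough are version pins (numpy==1.23.4 and protobuf<=3.20.1), so it can be useful to confirm them before starting a long training run. This is a minimal sketch using importlib.metadata from the standard library (Python 3.8+); version_tuple, installed and pins_satisfied are hypothetical helper names.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn a version string such as '1.23.4' into (1, 23, 4)."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def installed(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pins_satisfied(numpy_version: str, protobuf_version: str) -> bool:
    """Check the two pins from this walkthrough: numpy==1.23.4, protobuf<=3.20.1."""
    return (version_tuple(numpy_version) == (1, 23, 4)
            and version_tuple(protobuf_version) <= (3, 20, 1))


if __name__ == "__main__":
    np_ver, pb_ver = installed("numpy"), installed("protobuf")
    print("numpy:", np_ver, "protobuf:", pb_ver)
    if np_ver and pb_ver:
        print("pins satisfied:", pins_satisfied(np_ver, pb_ver))
```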

Test

# saved to output/<project_name>/test/ours_None/renders
python synthesize_fuse.py -S data/<ID> -M output/<project_name> --eval  
python synthesize_fuse.py -S data/may -M output/may_talkingface --eval


Inference with target audio

python synthesize_fuse.py -S data/<ID> -M output/<project_name> --use_train --audio <preprocessed_audio_feature>.npy

Test 1:
python synthesize_fuse.py -S data/marcon -M output/marcon_talkingface --use_train --audio data/may/aud.npy
This failed with the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'output/marcon_talkingface/cfg_args'

Test 2:
python synthesize_fuse.py -S data/may -M output/may_talkingface --use_train --audio data/may/aud.npy

The same arguments in list form:

      "-S", "data/may",   // load the various splatting transforms
      "-M", "output/may_talkingface",  // load the trained model
      "--use_train",
      "--audio", "data/may/aud.npy"  // preprocessed audio features as input