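Preface
This tutorial records, step by step, how to set up a 3DGS (3D Gaussian Splatting) environment on Linux. The road had its ups and downs; I have written down the pitfalls I hit and provide one route that is known to work, for reference.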
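1. Install CUDA
The official instructions say "we used 11.8, known issues with 11.6", so make sure your CUDA version is 11.8 or newer (this tutorial uses 11.8).
CUDA official archive: https://developer.nvidia.com/cuda-toolkit-archive
# Download the run file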
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
# Make the downloaded file executable
sudo chmod +x cuda_11.8.0_520.61.05_linux.run
# Run the CUDA installer
sudo sh cuda_11.8.0_520.61.05_linux.run
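In the installer's component list, press Space to deselect the driver bundled with this package, so that your already-installed GPU driver is kept. Then choose Install and wait for the installation summary.
Note that the installation directory here is "/usr/local/cuda-11.8/".
After installation, configure and reload the environment variables:
# Open the shell configuration for editing
gedit ~/.bashrc
# Add the paths (the defaults are shown here; adjust them to your own install location)
# CUDA 11.8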
export PATH=/usr/local/cuda-11.8/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda-11.8
# Reload the environment variables
source ~/.bashrc
# Check your CUDA version with:
nvcc -V
If the output reports release 11.8, the switch succeeded.
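The version check above can also be scripted. The following is a minimal sketch, assuming `nvcc -V` prints a line containing "release X.Y" (as CUDA 11.8 does); the helper names `parse_nvcc_release` and `cuda_is_new_enough` are hypothetical and not part of any CUDA tooling.

```python
import re
import shutil
import subprocess

MIN_CUDA = (11, 8)  # minimum version this tutorial targets


def parse_nvcc_release(text):
    """Return (major, minor) parsed from `nvcc -V` output, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_is_new_enough(version, minimum=MIN_CUDA):
    """True when a parsed (major, minor) version is at least `minimum`."""
    return version is not None and version >= minimum


if __name__ == "__main__":
    if shutil.which("nvcc") is None:
        print("nvcc not found on PATH; re-check the exports in ~/.bashrc")
    else:
        result = subprocess.run(["nvcc", "-V"], capture_output=True, text=True)
        version = parse_nvcc_release(result.stdout)
        print("detected:", version, "ok:", cuda_is_new_enough(version))
```

If `nvcc` is not found even though the toolkit is installed, the PATH export in ~/.bashrc is the first thing to re-check.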
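2. Install cuDNN
Download link: https://developer.nvidia.com/rdp/cudnn-archive
Recommended mirror for users in China: https://developer.nvidia.cn/rdp/cudnn-archive
Here I chose the tar build of cuDNN 8.9.7 for CUDA 11.x.
You need to log in to an NVIDIA account to download it. After downloading, extract the archive and copy the libraries and header files into the CUDA 11.8 directory; the main thing to watch is that the paths point at your CUDA 11.8 installation.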
tar -xvf cudnn-linux-x86_64-8.x.x.x_cudaX.Y-archive.tar.xz
sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda-11.8/include
sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda-11.8/lib64
sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h /usr/local/cuda-11.8/lib64/libcudnn*
Check the cuDNN version:
cat /usr/local/cuda-11.8/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
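The same check can be done programmatically. This is a small sketch, assuming the header defines `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` as plain integer macros (which is what the grep command above relies on); `parse_cudnn_version` is a hypothetical helper name.

```python
import re
from pathlib import Path


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn_version.h contents, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Default install path used in this tutorial; adjust if yours differs.
    header = Path("/usr/local/cuda-11.8/include/cudnn_version.h")
    if header.exists():
        print("cuDNN version:", parse_cudnn_version(header.read_text()))
    else:
        print("header not found:", header)
```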
Install the TalkingGaussian project
GitHub: https://github.com/Fictionarry/TalkingGaussian
Environment: tested on Ubuntu 20.04, CUDA 11.8 (installed from the run file), PyTorch 1.12.1
git clone git@github.com:Fictionarry/TalkingGaussian.git --recursive
cd TalkingGaussian
conda env create --file environment.yml
conda activate talking_gaussian
Install the 3DGS components as follows:
pip install ./submodules/diff-gaussian-rasterization
pip install ./submodules/simple-knn
pip install ./gridencoder
pip install "git+https://github.com/facebookresearch/pytorch3d.git"
pip install tensorflow-gpu==2.8.0
Prepare the required files
Prepare face-parsing model and the 3DMM model for head pose estimation.
bash scripts/prepare.sh
Download 3DMM model from Basel Face Model 2009:
# 1. copy 01_MorphableModel.mat to data_utils/face_tracking/3DMM/
# 2. run following
cd data_utils/face_tracking
python convert_BFM.py
01_MorphableModel.mat can be downloaded from:
https://github.com/jadewu/3D-Human-Face-Reconstruction-with-3DMM-face-model-from-RGB-image/blob/main/BFM/01_MorphableModel.mat
Prepare the environment for EasyPortrait:
# prepare mmcv
conda activate talking_gaussian
pip install -U openmim
mim install mmcv-full==1.7.1
# download model weight
cd data_utils/easyportrait
wget "https://n-ws-620xz-pd11.s3pd11.sbercloud.ru/b-ws-620xz-pd11-jux/easyportrait/experiments/models/fpn-fp-512.pth"
Testing TalkingGaussian
Video Dataset
Here we provide two video clips used in our experiments, which are captured from YouTube. Please respect the original content creators’ rights and comply with YouTube’s copyright policies in the usage.
Video link: https://drive.google.com/drive/folders/1E_8W805lioIznqbkvTQHWWi5IFXUG7Er
Other used videos can be found from GeneFace and AD-NeRF.
Pre-processing Training Video
- Put training video under data/may/may.mp4.
The video must be 25FPS, with all frames containing the talking person. The resolution should be about 512x512, and duration about 1-5 min.
- Run script to process the video.
python data_utils/process.py data/may/may.mp4
The script downloads some files as it runs, so it takes a fairly long time to finish.
- Obtain Action Units
Run FeatureExtraction in OpenFace, rename and move the output CSV file to data/<ID>/au.csv.
This step is very quick on Windows using the prebuilt exe: only a few model files need to be downloaded for it to run successfully.
- Generate tooth masks
export PYTHONPATH=./data_utils/easyportrait
python ./data_utils/easyportrait/create_teeth_mask.py ./data/may
This generates the tooth masks, using the model weights downloaded above.
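Before running the pre-processing steps above, the stated video requirements (25 FPS, about 512x512, about 1-5 min) can be checked up front. The sketch below only validates metadata values you have already obtained (for example from ffprobe or OpenCV); the function name `check_training_video` and the 10% resolution tolerance are assumptions, not part of TalkingGaussian.

```python
def check_training_video(fps, width, height, duration_s):
    """Return a list of warnings; an empty list means the clip fits the
    requirements stated for the training video."""
    problems = []
    if abs(fps - 25.0) > 0.01:
        problems.append(f"fps is {fps}, expected 25")
    # "About 512x512": the 10% tolerance (51 px) is an assumption.
    for name, value in (("width", width), ("height", height)):
        if abs(value - 512) > 51:
            problems.append(f"{name} is {value}, expected about 512")
    # "About 1-5 min" interpreted as 60 to 300 seconds.
    if not 60 <= duration_s <= 300:
        problems.append(f"duration is {duration_s:.0f}s, expected about 1-5 min")
    return problems


if __name__ == "__main__":
    # Example metadata for a hypothetical clip.
    for warning in check_training_video(fps=30, width=1920, height=1080, duration_s=45):
        print("warning:", warning)
```

A clip that fails these checks can be re-encoded first, which is cheaper than discovering the problem after a long pre-processing run.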
Audio Pre-process
In our paper, we use DeepSpeech features for evaluation.
- DeepSpeech
python data_utils/deepspeech_features/extract_ds_features.py --input data/may/aud.wav
# saved to data/<name>.npy
- HuBERT
Similar to ER-NeRF, HuBERT is also available. Recommended for situations if the audio is not in English.
Specify --audio_extractor hubert when training and testing.
python data_utils/hubert.py --wav data/may/aud.wav
# save to data/<name>_hu.npy
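I ran into quite a few bugs here; they are recorded below:
BUG1: ModuleNotFoundError: No module named 'importlib.metadata'
Solution: upgrade Python to 3.8.11 (importlib.metadata is only available from Python 3.8 onward).
BUG2: ImportError: cannot import name 'get_full_repo_name' from 'huggingface_hub'
Solution: pip install --upgrade huggingface_hub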
BUG3:
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like facebook/hubert-large-ls960-ft is not the path to a directory containing a file named preprocessor_config.json.
Solution: download the model in advance and load it offline.
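BUG4: Torch not compiled with CUDA enabled
So close, and yet everything above had to be redone: the environment's cudatoolkit was 11.3, which is not compatible with the system CUDA version 11.8. Very frustrating!
Second attempt:
The environment was changed as follows: Python switched to 3.8 and cudatoolkit=11.6.
This time the installation succeeded on the first try.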
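To catch BUG4 early, the PyTorch build can be inspected right after the environment is created. A minimal sketch: `torch.version.cuda` is `None` for CPU-only builds and `torch.cuda.is_available()` reports whether a GPU is usable; the helper `diagnose_torch_cuda` and its messages are hypothetical.

```python
def diagnose_torch_cuda(build_cuda, cuda_available):
    """Describe a PyTorch install, given torch.version.cuda and
    torch.cuda.is_available()."""
    if build_cuda is None:
        return "CPU-only build: reinstall PyTorch with a CUDA-enabled package"
    if not cuda_available:
        return f"built for CUDA {build_cuda}, but no usable GPU/driver was found"
    return f"OK: built for CUDA {build_cuda} and a GPU is visible"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print(diagnose_torch_cuda(torch.version.cuda, torch.cuda.is_available()))
```

Running this inside the activated conda environment, before building the submodules, avoids discovering the mismatch only at the HuBERT step.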
One more bug:
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
Solution:
pip install --upgrade "protobuf<=3.20.1"
Success at last; I was stuck on this step for a long time!
Train
If resources are sufficient, partially parallel is available to speed up the training. See the script.
bash scripts/train_xx.sh data/may output/may_talkingface 0
Bugs encountered:
Bug1: FileNotFoundError: [Errno 2] No such file or directory: '/home/wqt/TFG/TalkingGaussian/data/may/aud_ds.npy'
Solution: copy aud.npy to aud_ds.npy in the same directory.
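The copy for Bug1 can be wrapped in a small helper so that it is safe to run repeatedly. This is only a sketch of the workaround described above; `ensure_ds_features` is a hypothetical name and is not part of the TalkingGaussian code base.

```python
import shutil
from pathlib import Path


def ensure_ds_features(data_dir):
    """Copy aud.npy to aud_ds.npy if the latter is missing.

    Returns the path of aud_ds.npy; raises FileNotFoundError when the
    source feature file does not exist either."""
    data_dir = Path(data_dir)
    src = data_dir / "aud.npy"
    dst = data_dir / "aud_ds.npy"
    if dst.exists():
        return dst  # nothing to do, keep the existing file
    if not src.exists():
        raise FileNotFoundError(f"{src} not found; extract the audio features first")
    shutil.copyfile(src, dst)
    return dst


if __name__ == "__main__":
    try:
        print("features at:", ensure_ds_features("data/may"))
    except FileNotFoundError as err:
        print(err)
```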
Bug2: AttributeError: module 'PIL.Image' has no attribute 'ANTIALIAS'
Solution: replace Image.ANTIALIAS with Image.Resampling.LANCZOS (ANTIALIAS was removed in newer Pillow releases).
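If the code has to run under both old and new Pillow versions, the constant can be resolved at runtime instead of being hard-coded. A minimal sketch: `lanczos_filter` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins for the `PIL.Image` module so the example runs without Pillow installed.

```python
from types import SimpleNamespace


def lanczos_filter(image_module):
    """Return the LANCZOS resampling constant for either Pillow layout."""
    resampling = getattr(image_module, "Resampling", None)
    if resampling is not None:
        return resampling.LANCZOS  # newer Pillow: Image.Resampling.LANCZOS
    return image_module.ANTIALIAS  # older Pillow: Image.ANTIALIAS


# Stand-ins mimicking the two module layouts (the value 1 is illustrative).
new_style = SimpleNamespace(Resampling=SimpleNamespace(LANCZOS=1))
old_style = SimpleNamespace(ANTIALIAS=1)

if __name__ == "__main__":
    print(lanczos_filter(new_style), lanczos_filter(old_style))
```

With Pillow installed, the intended usage would be `from PIL import Image` followed by `img.resize(size, lanczos_filter(Image))`.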
python train_mouth.py finished training; it is fairly quick, about 10 minutes.
python train_face.py hit the following bug:
Bug3: Training Error: TypeError: No loop matching the specified signature and casting was found for ufunc greater
Solution: downgrade NumPy to 1.23.4.
pip install numpy==1.23.4
Training finally ran to completion!
Test
# saved to output/<project_name>/test/ours_None/renders
python synthesize_fuse.py -S data/<ID> -M output/<project_name> --eval
python synthesize_fuse.py -S data/may -M output/may_talkingface --eval
Inference with target audio
python synthesize_fuse.py -S data/<ID> -M output/<project_name> --use_train --audio <preprocessed_audio_feature>.npy
Test 1:
python synthesize_fuse.py -S data/marcon -M output/marcon_talkingface --use_train --audio data/may/aud.npy
This fails with:
FileNotFoundError: [Errno 2] No such file or directory: 'output/marcon_talkingface/cfg_args'
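The error above indicates that the directory passed to -M does not contain a trained model (only data/may was trained in this walkthrough). A quick pre-flight check can report this before synthesis starts. This is a sketch; `missing_model_files` is a hypothetical helper, and treating cfg_args as the only required file is an assumption based solely on the error message.

```python
from pathlib import Path


def missing_model_files(model_dir, required=("cfg_args",)):
    """Return the names from `required` that are absent in `model_dir`."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]


if __name__ == "__main__":
    for model_dir in ("output/may_talkingface", "output/marcon_talkingface"):
        missing = missing_model_files(model_dir)
        status = "ready" if not missing else f"missing {missing}"
        print(f"{model_dir}: {status}")
```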
Test 2:
python synthesize_fuse.py -S data/may -M output/may_talkingface --use_train --audio data/may/aud.npy
"-S", "data/may", // loads the various Splatting transforms
"-M", "output/may_talkingface", // loads the trained model
"--use_train",
"--audio", "data/may/aud.npy" // the pre-processed audio features to drive the result
[Figure: the generated result]
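Summary
The main contributions of this article are:
1. A hands-on guide to installing 3DGS
2. An installation guide for https://github.com/Fictionarry/TalkingGaussian
I have laid out the pitfalls I ran into so that you can work more efficiently.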