Debugging the LivePortrait Code: Animating a Still Image with Dynamic Expressions

Article link: https://blog.youkuaiyun.com/qq_50465499/article/details/143384534

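Contents

- 1. Clone the code and prepare the environment
  - 1.1 Clone the code
  - 1.2 Prepare the environment
  - 1.3 Install CUDA
- 2. Inference
  - 2.1 Quick inference
  - 2.2 Use -s and -d to change the source and driving inputs
  - 2.3 Animate animals too
- 3. Use your own driving video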

1. Clone the code and prepare the environment

1.1 Clone the code

git clone https://github.com/KwaiVGI/LivePortrait	# clone the repository
cd LivePortrait	# enter the project directory

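If git clone fails with an error like the one in the screenshot below, the cause is probably the network; retrying a few times usually works.
[Screenshot: git clone error]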
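Retrying by hand works, but the retry can also be scripted. The sketch below is not part of LivePortrait: the `retry` helper and the `RETRY_DELAY` variable are names invented for this illustration. It reruns a command until it succeeds or the attempt limit is reached.

```shell
# Seconds to wait between attempts (hypothetical knob, defaults to 5).
RETRY_DELAY=${RETRY_DELAY:-5}

# retry N CMD...: run CMD until it succeeds, at most N times.
# Returns 0 on the first success and 1 if every attempt fails.
retry() {
    attempts=$1
    shift
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$RETRY_DELAY"
    done
    return 0
}

# Usage (substitute the repository URL from the clone step):
#   retry 5 git clone <repository-url>
```

If every attempt fails, the problem is probably not a transient network glitch, and a proxy or mirror may be needed instead.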

1.2 Prepare the environment

# create env using conda
conda create -n LivePortrait python=3.10	# create the environment
conda activate LivePortrait	# activate the environment

1.3 Install CUDA

(1) Check the CUDA version

nvidia-smi

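The "CUDA Version" field in the top-right corner of the output shows the newest CUDA version that the server's driver supports, here 11.4.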
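This check can be scripted. The following is a minimal sketch, not part of LivePortrait: the helper names `cuda_version_from_smi` and `pick_wheel_tag` are made up for illustration, and the candidate tags (cu113, cu118, cu121) are simply the ones that appear in this article. It reads the "CUDA Version" field from the nvidia-smi header and conservatively picks the newest listed tag that does not exceed it.

```shell
# Extract the "CUDA Version: X.Y" field from nvidia-smi header text on stdin.
cuda_version_from_smi() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Pick the newest wheel tag (from the ones used in this article) that does
# not exceed the given version. Expects the form X.Y with a single-digit
# minor version, e.g. 11.4 -> cu113.
pick_wheel_tag() {
    major=${1%%.*}
    minor=${1#*.}
    have=$((major * 10 + minor))
    best=""
    for tag in 113 118 121; do
        if [ "$tag" -le "$have" ]; then
            best="cu$tag"
        fi
    done
    echo "$best"
}

# Only query the driver when nvidia-smi is actually available.
if command -v nvidia-smi >/dev/null 2>&1; then
    version=$(nvidia-smi | cuda_version_from_smi)
    if [ -n "$version" ]; then
        echo "driver supports CUDA $version -> $(pick_wheel_tag "$version")"
    fi
fi
```

An empty result from `pick_wheel_tag` means the driver is older than every tag in the list, in which case the driver would have to be upgraded or an older PyTorch build chosen by hand.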
(2) Install the PyTorch build that matches your CUDA version (I first tried the CUDA 11.1 build, which failed)

# for CUDA 11.1
pip install torch==1.10.1+cu111 torchvision==0.11.2 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
# for CUDA 11.8
pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu118
# for CUDA 12.1
pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121

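This fails, most likely because no torch==1.10.1+cu111 wheel is published for Python 3.10:
[Screenshot: pip error for torch==1.10.1+cu111]
Fix: pick the closest available version from the official "Previous PyTorch Versions" page: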

# CUDA 11.3
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

If the download is slow, add the Tsinghua mirror https://pypi.tuna.tsinghua.edu.cn/simple:

# CUDA 11.3
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113 -i https://pypi.tuna.tsinghua.edu.cn/simple
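After the install finishes, it is worth confirming that the CUDA build was actually picked up. The sketch below is one way to do that, under stated assumptions: `check_torch` and the `PYTHON` override are hypothetical names introduced here, while `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()` are standard PyTorch attributes.

```shell
# Interpreter to use; PYTHON is a hypothetical override, defaulting to the
# "python" of the active conda environment.
PYTHON=${PYTHON:-python}

# Print the installed torch version, the CUDA version it was built for,
# and whether a GPU is usable. The heredoc body is a small Python script.
check_torch() {
    "$PYTHON" - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
else:
    print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
    print("CUDA available:", torch.cuda.is_available())
EOF
}

# Run inside the activated environment:
#   check_torch
```

For the cu113 install one would expect a version string ending in +cu113 and a CUDA field of 11.3. If "CUDA available" prints False, the wheel and the driver probably do not match.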

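(3) Install the dependencies

pip install -r requirements.txt

This fails because onnxruntime-gpu 1.18.0, the version pinned in requirements.txt, cannot be installed in this environment:
[Screenshot: pip error for onnxruntime-gpu==1.18.0]
Fix: change the pin in requirements.txt to onnxruntime-gpu==1.18.1:

-r requirements_base.txt

onnxruntime-gpu==1.18.1
transformers==4.38.0

(4) Download the weights

# pip install -U "huggingface_hub[cli]"	# run this first if huggingface-cli is not installed
huggingface-cli download KwaiVGI/LivePortrait --local-dir pretrained_weights --exclude "*.git*" "README.md" "docs"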
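A large download over a flaky connection can end up incomplete, so a quick look at what arrived is useful. The sketch below assumes, without checking the repository layout, that the checkpoints are stored as .pth and .onnx files somewhere under pretrained_weights; `list_weights` is a hypothetical helper name.

```shell
# List checkpoint files (*.pth, *.onnx) under a directory, sorted by path.
list_weights() {
    find "$1" -type f \( -name '*.pth' -o -name '*.onnx' \) | sort
}

weights_dir=pretrained_weights
if [ -d "$weights_dir" ]; then
    count=$(list_weights "$weights_dir" | wc -l | tr -d ' ')
    echo "found $count checkpoint files under $weights_dir"
    list_weights "$weights_dir"
else
    echo "$weights_dir does not exist; run the download command first"
fi
```

A count of zero, or files that are only a few bytes long, suggests rerunning the download command.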

2. Inference

2.1 Quick inference

python inference.py

Result:

[Image: inference result]

2.2 Use -s and -d to change the source and driving inputs

# source input is an image
python inference.py -s assets/examples/source/s9.jpg -d assets/examples/driving/d0.mp4

# source input is a video ✨
python inference.py -s assets/examples/source/s13.mp4 -d assets/examples/driving/d0.mp4

# more options to see
python inference.py -h
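To animate several sources with the same driving video, the -s/-d pattern can be wrapped in a loop. This is a small sketch rather than anything shipped with LivePortrait: `print_inference_cmds` is a hypothetical helper that only prints the commands, so they can be reviewed before running. It uses nothing beyond the -s and -d options shown above and assumes paths without spaces.

```shell
# Print (not run) one inference command per source file, all sharing the
# same driving input. Usage: print_inference_cmds DRIVING SOURCE...
print_inference_cmds() {
    driving=$1
    shift
    for src in "$@"; do
        echo "python inference.py -s $src -d $driving"
    done
}

print_inference_cmds assets/examples/driving/d0.mp4 \
    assets/examples/source/s9.jpg \
    assets/examples/source/s13.mp4
```

Piping the output to `sh` from the repository root would execute the printed commands one after another.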

2.3 Animate animals too

First, build MultiScaleDeformableAttention:

cd src/utils/dependencies/XPose/models/UniPose/ops
python setup.py build install
cd - # equal to cd ../../../../../../../

Then run inference:

python inference_animals.py -s assets/examples/source/s39.jpg -d assets/examples/driving/wink.pkl --driving_multiplier 1.75 --no_flag_stitching

Result: a winking cat.

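3. Use your own driving video

Notes
To use your own driving video, we recommend:
(1) Crop it to a 1:1 aspect ratio (e.g., 512x512 or 256x256 pixels), or enable auto-cropping with --flag_crop_driving_video.
(2) Focus on the head area, similar to the example videos.
(3) Minimize shoulder movement.
(4) Make sure the first frame of the driving video is a frontal face with a neutral expression.

The following command uses auto-cropping via --flag_crop_driving_video: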

python inference.py -s assets/examples/source/s9.jpg -d assets/examples/driving/d13.mp4 --flag_crop_driving_video
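The other route in recommendation (1), cropping to 1:1 by hand, comes down to a little arithmetic. The sketch below computes a centered square crop and formats it as an ffmpeg filter string; `square_crop_filter` is a hypothetical helper, and a centered crop is only appropriate when the head is already near the middle of the frame (the auto-crop flag does not share that limitation).

```shell
# Build an ffmpeg filter that center-crops a W x H frame to a square and
# scales it to SIZE x SIZE (512 by default).
# Usage: square_crop_filter WIDTH HEIGHT [SIZE]
square_crop_filter() {
    w=$1
    h=$2
    size=${3:-512}
    if [ "$w" -lt "$h" ]; then side=$w; else side=$h; fi
    x=$(( (w - side) / 2 ))
    y=$(( (h - side) / 2 ))
    echo "crop=$side:$side:$x:$y,scale=$size:$size"
}

square_crop_filter 1920 1080   # prints crop=1080:1080:420:0,scale=512:512

# Example use (my_video.mp4 is a placeholder file name):
#   ffmpeg -i my_video.mp4 -vf "$(square_crop_filter 1920 1080)" my_video_512.mp4
```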

Gradio interface
Just run:

python app.py

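Once it is running, copy http://127.0.0.1:8890 and open it in your local browser. If the app runs on a remote server, the page may fail to load:
[Screenshot: browser cannot reach the page]
In that case, open an SSH tunnel from your Windows machine to the remote server (replace the username, server IP address, and server SSH port with your own):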

ssh -o ServerAliveInterval=60 -CNg -L 8890:127.0.0.1:8890 <username>@<server-ip> -p <ssh-port>
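For readers unfamiliar with the flags in the tunnel command, the sketch below spells them out and assembles the command from its parts. `tunnel_cmd` is a hypothetical helper, and the login values in the example call are placeholders.

```shell
# Print the tunnel command for a given login. Flags, as documented in ssh(1):
#   -o ServerAliveInterval=60  send a keepalive after 60 s of silence
#   -C  compress traffic
#   -N  do not run a remote command, only forward ports
#   -g  let other hosts connect to the forwarded local port
#   -L  local_port:target_host:target_port (target resolved on the server)
# Usage: tunnel_cmd USER HOST SSH_PORT [APP_PORT]
tunnel_cmd() {
    user=$1
    host=$2
    ssh_port=$3
    app_port=${4:-8890}
    echo "ssh -o ServerAliveInterval=60 -CNg -L $app_port:127.0.0.1:$app_port $user@$host -p $ssh_port"
}

# alice, 203.0.113.5 and 22 are placeholders, not real credentials.
tunnel_cmd alice 203.0.113.5 22
```

The -g flag is only needed when other machines should reach the forwarded port; without it the port stays bound to the local loopback address, which is enough for opening the page in a browser on the same machine.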

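Refresh http://127.0.0.1:8890 in the browser and the interface appears:
[Screenshot: Gradio interface]
Try changing the source image:
[Image: result with a different source image]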

The Gradio interface for animals mode works the same way; try it with your own cat:

python app_animals.py # animals mode 🐱🐶

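Acceleration option: --flag_do_torch_compile. The first inference triggers an optimization pass (about one minute), after which subsequent inferences run 20-30% faster. The speedup may vary with the CUDA version.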

python app.py --flag_do_torch_compile
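Whether the one-off compile cost pays for itself depends on how many inferences follow. The sketch below works through that arithmetic under two loudly stated assumptions: "20-30% faster" is read as a throughput gain (a run that took t seconds takes t / (1 + p) afterwards), and the 10-second uncompiled run time is a made-up figure, not a measurement. `break_even_runs` is a hypothetical helper.

```shell
# break_even_runs OVERHEAD_S PER_RUN_S SPEEDUP_PCT
# Each compiled run saves t * p / (1 + p) seconds, so the one-off overhead C
# is repaid after n = C * (1 + p) / (t * p) runs. With p = SPEEDUP_PCT / 100
# this is n = C * (100 + pct) / (t * pct), rounded up. Integer inputs only.
break_even_runs() {
    overhead=$1
    per_run=$2
    pct=$3
    num=$((overhead * (100 + pct)))
    den=$((per_run * pct))
    echo $(( (num + den - 1) / den ))
}

# 60 s overhead (from the text) and a hypothetical 10 s per uncompiled run.
break_even_runs 60 10 20   # prints 36
break_even_runs 60 10 30   # prints 26
```

Under these assumptions the flag is worthwhile for long sessions in the Gradio app, and probably not for a handful of one-off runs.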