MiDaS Visualization Tools: Comparing Depth Maps with the Original Image

MiDaS project repository: https://gitcode.com/gh_mirrors/mid/MiDaS

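1. Pain Points and Solution

Have you run into these problems when using MiDaS (a monocular depth estimation model)? The generated depth map is hard to compare against the original image, and there is no quick way to judge the quality of the estimate. This article shows how to use the visualization built into MiDaS to place the depth map next to the original image, so that you can analyze and tune depth estimation results more directly.

After reading this article, you will be able to:

Understand how the MiDaS visualization works
Generate comparison images with command-line flags
Customize the color map and display of the depth map
Process images in batches and save the comparison results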

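2. How Depth Visualization Works

2.1 Depth Map Generation Pipeline

The MiDaS depth visualization pipeline consists of the following steps: read the input image, apply the model's preprocessing transform, run inference, resize the prediction to the original image size, normalize the depth values to 0-255, apply a color map, and concatenate the result with the original image.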

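2.2 Depth Normalization

The key step in visualizing a depth map is converting the raw depth values into a displayable image range. MiDaS normalizes with the following formula:

$$\text{normalized\_depth} = 255 \times \frac{\text{depth} - \text{depth\_min}}{\text{depth\_max} - \text{depth\_min}}$$

where:

depth is the raw value predicted by the model (MiDaS predicts relative inverse depth, so larger values are closer to the camera)
depth_min and depth_max are the minimum and maximum values in the depth map
normalized_depth is the normalized value in the range 0-255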
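
The formula above can be sketched in NumPy as follows. This is a minimal illustration, not the MiDaS source: the function name normalize_depth and the guard for a constant depth map (where the formula would divide by zero) are assumptions added here.

```python
import numpy as np


def normalize_depth(depth: np.ndarray) -> np.ndarray:
    """Min-max scale a raw depth map to the 0-255 range."""
    depth = depth.astype(np.float64)
    depth_min = depth.min()
    depth_max = depth.max()
    span = depth_max - depth_min
    if span == 0:
        # Assumption: a constant map has no contrast, so return all zeros
        # instead of dividing by zero.
        return np.zeros_like(depth)
    return 255.0 * (depth - depth_min) / span


raw = np.array([[2.0, 4.0], [6.0, 10.0]])
normalized = normalize_depth(raw)
print(normalized)
```

The smallest raw value maps to 0 and the largest to 255, so the result depends only on the relative ordering and spacing of values within one image, not on any absolute scale.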

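2.3 Color Mapping

MiDaS offers two color mappings:

Inferno color map: uses OpenCV's COLORMAP_INFERNO; near regions appear yellow and far regions appear dark
Grayscale: maps the normalized depth value directly to a gray level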
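
Conceptually, a color map is a 256-entry lookup table indexed by the normalized depth value, while the grayscale mode copies the same value into all three channels. The sketch below illustrates both with NumPy only. The table is a hypothetical black-to-yellow ramp chosen for illustration; it is not the real Inferno table, and apply_lut and to_gray_bgr are names introduced here.

```python
import numpy as np

# Hypothetical lookup table in BGR order: index 0 is black, index 255 is yellow.
# The real COLORMAP_INFERNO table in OpenCV is different.
ramp = np.arange(256, dtype=np.uint8)
lut = np.stack([np.zeros(256, dtype=np.uint8), ramp, ramp], axis=1)


def apply_lut(normalized_depth: np.ndarray, table: np.ndarray) -> np.ndarray:
    """Map each 0-255 depth value to a BGR color by table lookup."""
    indices = np.clip(normalized_depth, 0, 255).astype(np.uint8)
    return table[indices]


def to_gray_bgr(normalized_depth: np.ndarray) -> np.ndarray:
    """Copy the depth value into three identical channels."""
    gray = np.clip(normalized_depth, 0, 255).astype(np.uint8)
    return np.repeat(gray[:, :, None], 3, axis=2)


depth = np.array([[0.0, 127.5], [200.0, 255.0]])
colored = apply_lut(depth, lut)
gray = to_gray_bgr(depth)
print(colored.shape, gray.shape)
```

Both outputs have shape (height, width, 3), which is what allows them to be concatenated with a color image later.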

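3. Quick Start: Basic Side-by-Side Visualization

3.1 Environment Setup

First clone the MiDaS repository and install the dependencies. The upstream repository ships a conda environment file (environment.yaml) rather than a requirements.txt, and the model weights must be downloaded separately into the weights/ folder:

git clone https://gitcode.com/gh_mirrors/mid/MiDaS
cd MiDaS
conda env create -f environment.yaml
conda activate midas-py310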

3.2 Basic Command

Run the following command to generate the original image and the depth map side by side:

python run.py --input_path input/ --output_path output/ --model_type dpt_beit_large_512 --side

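Parameters:

--input_path: folder containing the input images
--output_path: folder for the output results
--model_type: which model to use
--side: enable side-by-side mode

3.3 What the Command Does

run.py loads the selected model, runs inference on every image in the input folder, and for each one writes a PNG to the output folder with the original image on the left and the colorized depth map on the right.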

4. Advanced Visualization Options

4.1 Grayscale Depth Map

Add the --grayscale flag to produce a grayscale depth map:

python run.py --input_path input/ --output_path output/ --model_type midas_v21_small_256 --side --grayscale

4.2 Comparing Models

Models differ in depth estimation quality. Run the following commands to compare them:

# Large model
python run.py --input_path input/ --output_path output/large/ --model_type dpt_beit_large_512 --side

# Small model
python run.py --input_path input/ --output_path output/small/ --model_type midas_v21_small_256 --side

4.3 Comparing Color Maps

Generate results with each color map for comparison:

# Inferno color map (default)
python run.py --input_path input/ --output_path output/inferno/ --side

# Grayscale
python run.py --input_path input/ --output_path output/grayscale/ --side --grayscale

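5. Customizing the Visualization

5.1 Changing the Depth Color Map

To use a different color map, edit the color-mapping line in the create_side_by_side function:

# In run.py, find the create_side_by_side function
if not grayscale:
    # Replace the default COLORMAP_INFERNO with another color map
    right_side = cv2.applyColorMap(np.uint8(right_side), cv2.COLORMAP_JET)  # switched to JET

Some of the color maps OpenCV supports:

COLORMAP_JET: blue -> cyan -> green -> yellow -> red
COLORMAP_HOT: black -> red -> yellow -> white
COLORMAP_COOL: cyan -> magenta
COLORMAP_SPRING: magenta -> yellow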
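5.2 Adjusting Depth Map Contrast

You can change how contrast is distributed by modifying the normalization:

# In create_side_by_side, modify the normalization code
# Original code
normalized_depth = 255 * (depth - depth_min) / (depth_max - depth_min)

# Modified: an exponent below 1 brightens mid-range values and reveals more detail in dark (far) regions
normalized_depth = 255 * np.power((depth - depth_min) / (depth_max - depth_min), 0.7)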
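
To experiment with this adjustment outside run.py, the power-law step can be wrapped in a small helper. The sketch below is an illustration under stated assumptions: the function name, the gamma parameter, and the optional percentile clipping (low_pct, high_pct) are not part of MiDaS. Percentile clipping is a different technique from the one above; it is included because a few extreme pixels can otherwise dominate the min-max range.

```python
import numpy as np


def normalize_with_gamma(depth, gamma=0.7, low_pct=0.0, high_pct=100.0):
    """Normalize to 0-255 with optional percentile clipping and a power curve."""
    depth = depth.astype(np.float64)
    # With the default 0 and 100 percentiles this is plain min-max scaling.
    lo, hi = np.percentile(depth, [low_pct, high_pct])
    if hi <= lo:
        # Assumption: a map without any spread is rendered as all zeros.
        return np.zeros_like(depth)
    unit = np.clip((depth - lo) / (hi - lo), 0.0, 1.0)
    return 255.0 * np.power(unit, gamma)


depth = np.linspace(0.0, 1.0, 5).reshape(1, 5)
plain = normalize_with_gamma(depth, gamma=1.0)
boosted = normalize_with_gamma(depth, gamma=0.7)
print(plain)
print(boosted)
```

With gamma below 1 the end points stay at 0 and 255 while every value in between is raised, so dark regions gain visible detail at the cost of compressing the bright end.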

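5.3 Custom Layout

The default layout is left-right. To stack the images vertically, change the concatenation axis:

# Switch from horizontal to vertical concatenation
# return np.concatenate((image, right_side), axis=1)  # side by side
return np.concatenate((image, right_side), axis=0)  # stacked top to bottom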

6. Batch Processing and Advanced Use

6.1 Processing Every Image in a Folder

MiDaS automatically processes all images in the input folder:

python run.py --input_path input/ --output_path output/batch/ --side --model_type dpt_swin2_large_384

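6.2 Live Camera Comparison

Omit --input_path to generate the comparison live from a camera:

python run.py --output_path output/camera/ --side --model_type midas_v21_small_256

In this mode the program opens the camera and displays the original frame next to the depth map in real time.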

6.3 Comparing Results Across Models

Create a script to compare the depth estimates of different models:

#!/bin/bash
# compare_models.sh

# Models to compare
models=("dpt_beit_large_512" "dpt_swin2_large_384" "midas_v21_small_256")

# Generate results for each model
for model in "${models[@]}"; do
    echo "Processing with $model..."
    python run.py --input_path input/ --output_path "output/${model}" --model_type "$model" --side
done

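7. Interpreting the Results

7.1 Assessing Depth Map Quality

The side-by-side view lets you judge depth estimation quality along these dimensions:

Criterion	Good result	Problematic result
Edge sharpness	Object edges are crisp and distinct	Edges are blurred or misaligned
Depth continuity	Depth is consistent across a single plane	Abnormal fluctuations on flat surfaces
Detail preservation	Small objects and textures are clear	Small objects are lost or blurred
Dynamic range	Both near and far details are visible	Near regions saturate or far details are lost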
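
The criteria above are visual, but a couple of rough numbers can support the inspection. The sketch below computes three heuristics on a normalized depth map. They are assumptions made for illustration, not metrics defined by MiDaS: the mean gradient magnitude (high on a surface that should be flat suggests noise), the fraction of pixels stuck at 0 or 255, and the used value range.

```python
import numpy as np


def depth_quality_hints(normalized_depth: np.ndarray) -> dict:
    """Return rough, heuristic indicators for a 0-255 depth map."""
    d = normalized_depth.astype(np.float64)
    # np.gradient returns the derivative along rows first, then columns.
    grad_y, grad_x = np.gradient(d)
    gradient_magnitude = np.hypot(grad_x, grad_y)
    return {
        "mean_gradient": float(gradient_magnitude.mean()),
        "saturated_fraction": float(np.mean((d <= 0) | (d >= 255))),
        "used_range": float(d.max() - d.min()),
    }


flat = np.full((8, 8), 100.0)                           # perfectly flat plane
ramp = np.tile(np.linspace(0.0, 255.0, 8), (8, 1))      # smooth left-to-right slope
flat_hints = depth_quality_hints(flat)
ramp_hints = depth_quality_hints(ramp)
print(flat_hints)
print(ramp_hints)
```

These numbers only make sense when compared between runs on the same image, for example between two models or two normalization settings.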

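7.2 Common Problems and Fixes

Problem	Cause	Fix
Depth map looks dark overall	A wide depth range compresses most values	Adjust the normalization (section 5.2) or try --grayscale
Abnormal depth along image edges	The model handles edges poorly	Try a larger model (e.g. dpt_beit_large_512)
Far details are lost	Limited model receptive field	Raise the input resolution with --height or try a larger model
Slow inference	The model is too large	Switch to a lightweight model (e.g. midas_v21_small_256)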

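8. API Details: Integrating the Visualization

8.1 The create_side_by_side Function

create_side_by_side is the core function behind the comparison view. It is defined as follows:

def create_side_by_side(image, depth, grayscale):
    """
    Place an image and its depth map side by side.

    Args:
        image: color image in BGR channel order, values 0-255 (or None to return only the depth map)
        depth: depth map data
        grayscale: use grayscale instead of the color map

    Returns:
        The combined side-by-side image
    """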
    depth_min = depth.min()
    depth_max = depth.max()
    normalized_depth = 255 * (depth - depth_min) / (depth_max - depth_min)
    normalized_depth *= 3
    
    right_side = np.repeat(np.expand_dims(normalized_depth, 2), 3, axis=2) / 3
    if not grayscale:
        right_side = cv2.applyColorMap(np.uint8(right_side), cv2.COLORMAP_INFERNO)
        
    if image is None:
        return right_side
    else:
        return np.concatenate((image, right_side), axis=1)
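
To see the shapes and data types involved without OpenCV or a trained model, the grayscale branch can be restated with NumPy alone. This is a simplified sketch, not the function from run.py: side_by_side_gray is a name introduced here, the cast to uint8 and the zero-span guard are assumptions, and the color-map branch is left out.

```python
import numpy as np


def side_by_side_gray(image_bgr, depth):
    """Simplified NumPy-only version of the grayscale path."""
    depth = depth.astype(np.float64)
    span = depth.max() - depth.min()
    if span == 0:
        normalized = np.zeros_like(depth)
    else:
        normalized = 255.0 * (depth - depth.min()) / span
    # Expand (H, W) to (H, W, 3) so the depth map matches a color image.
    right_side = np.repeat(normalized[:, :, None], 3, axis=2).astype(np.uint8)
    if image_bgr is None:
        return right_side
    # Both inputs must share the same height for horizontal concatenation.
    return np.concatenate((image_bgr, right_side), axis=1)


image = np.zeros((2, 3, 3), dtype=np.uint8)             # tiny black test image
depth = np.array([[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]])
combined = side_by_side_gray(image, depth)
print(combined.shape, combined.dtype)
```

The output is twice as wide as the input: the left half holds the image and the right half holds the depth map, which is why the depth map must first be resized to the original image resolution.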

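8.2 Integrating into Your Own Project

You can call the MiDaS visualization directly from your own code. The script below assumes it runs from the root of the MiDaS repository and that the model weights have already been downloaded into the weights/ folder:

import cv2
import numpy as np
import torch

import utils
from midas.model_loader import default_models, load_model
from run import create_side_by_side

# Load the model; default_models maps each model type to its weights path
model_type = "dpt_beit_large_512"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model, transform, net_w, net_h = load_model(device, default_models[model_type], model_type, False)

# Read the image (RGB, float values in 0-1) and preprocess it
image = utils.read_image("input/image.jpg")
# The transform returns a NumPy array, so convert it to a tensor first
input_batch = torch.from_numpy(transform({"image": image})["image"]).unsqueeze(0).to(device)

# Inference
with torch.no_grad():
    prediction = model.forward(input_batch)
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=image.shape[:2],  # (height, width) of the original image
        mode="bicubic",
        align_corners=False,
    ).squeeze()

# Convert to a NumPy array
depth_map = prediction.cpu().numpy()

# Build the comparison: flip RGB to BGR and scale 0-1 to 0-255
original_image_bgr = np.flip(image, 2) * 255
comparison_image = create_side_by_side(original_image_bgr.astype(np.uint8), depth_map, grayscale=False)

# Save the result
cv2.imwrite("comparison_result.png", comparison_image)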

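9. Performance and Best Practices

9.1 Choosing a Model

Pick a model that fits your needs:

Model type	Speed	Accuracy	Recommended for
midas_v21_small_256	Fastest	Lower	Real-time applications, mobile devices
dpt_hybrid_384	Medium	Medium	Balancing speed and accuracy
dpt_beit_large_512	Slowest	Highest	Static scenes that need high accuracy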

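9.2 Tips for Better Visualizations

Adjust the input resolution: use --height to set the height of the image fed to the encoder
Use a square input: add --square to force a square input resolution
Post-process the output: apply a Gaussian blur to the depth map to reduce noise

# Set the resolution and shape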
python run.py --input_path input/ --output_path output/ --model_type dpt_swin2_large_384 --side --height 512 --square
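
The Gaussian blur tip is not covered by any run.py flag, so it has to be applied to the depth map in your own code. In practice cv2.GaussianBlur is the usual choice; the sketch below does the same kind of smoothing with NumPy only so that it runs without OpenCV. The function names and the choice of a radius of three sigma are assumptions.

```python
import numpy as np


def gaussian_kernel(sigma: float, radius: int) -> np.ndarray:
    """Return a 1D Gaussian kernel that sums to 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(depth: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """Smooth a 2D depth map with a separable Gaussian filter."""
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    # Repeat border values so the output keeps the input size.
    padded = np.pad(depth.astype(np.float64), radius, mode="edge")
    # The Gaussian is separable: filter along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


rng = np.random.default_rng(0)
noisy = 100.0 + rng.normal(0.0, 5.0, size=(32, 32))    # flat plane plus noise
smooth = gaussian_blur(noisy, sigma=1.5)
print(noisy.std(), smooth.std())
```

Blurring trades noise for edge sharpness, so a larger sigma gives smoother planes but softer object boundaries.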

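9.3 Processing Large Datasets

For large datasets, consider the following strategies:

# Run several processes in parallel, one per subfolder
# (part1/ and part2/ are placeholder names; split the dataset yourself)
python run.py --input_path large_dataset/part1/ --output_path output/part1/ --model_type midas_v21_small_256 --side &
python run.py --input_path large_dataset/part2/ --output_path output/part2/ --model_type midas_v21_small_256 --side &
wait

# Or simply use a lightweight model to speed things up
python run.py --input_path large_dataset/ --output_path output/ --model_type midas_v21_small_256 --side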
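
One way to spread a large dataset over several processes is to split the file list into shards and start one run.py process per shard. The helper below is a sketch under that assumption. It is not part of MiDaS, and shard_files and the suffix list are names chosen here.

```python
from pathlib import Path

# Assumption: only these extensions count as input images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def shard_files(paths, num_shards):
    """Split image paths into num_shards round-robin groups."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    images = sorted(p for p in paths if Path(p).suffix.lower() in IMAGE_SUFFIXES)
    # Taking every num_shards-th item keeps the groups balanced in size.
    return [images[i::num_shards] for i in range(num_shards)]


names = ["b.png", "a.jpg", "notes.txt", "c.JPG", "d.jpeg"]
shards = shard_files(names, 2)
print(shards)
```

Each shard can then be copied or symlinked into its own folder and passed to a separate run.py process through --input_path.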

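10. Summary and Outlook

The side-by-side view built into MiDaS makes depth estimation results much easier to analyze. This article covered techniques from basic use to advanced customization, including:

The core principles and pipeline of depth visualization
Generating comparison results quickly with the --side flag
Customizing the color map and layout
Batch processing and live camera use
Integrating the API into your own project

MiDaS visualization may gain more advanced features in the future, such as 3D point cloud views or interactive depth adjustment. You can also extend it by contributing code.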

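Appendix: Full Parameter List

Parameter	Type	Description
--input_path	string	Folder containing the input images
--output_path	string	Folder for the output results
--model_weights	string	Path to the model weights file
--model_type	string	Model type
--side	flag	Show the RGB image and depth map side by side
--optimize	flag	Use half precision to speed up inference
--height	integer	Height of the image fed to the encoder
--square	flag	Force a square input resolution
--grayscale	flag	Use a grayscale color map

Supported model types: dpt_beit_large_512, dpt_beit_large_384, dpt_beit_base_384, dpt_swin2_large_384, dpt_swin2_base_384, dpt_swin2_tiny_256, dpt_swin_large_384, dpt_next_vit_large_384, dpt_levit_224, dpt_large_384, dpt_hybrid_384, midas_v21_384, midas_v21_small_256, openvino_midas_v21_small_256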
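
If you wrap MiDaS in your own script, the flags in the table can be mirrored with argparse. The parser below is a sketch that follows the table, not a copy of the run.py source. The default values shown are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags listed in the parameter table."""
    parser = argparse.ArgumentParser(description="Depth visualization options")
    parser.add_argument("--input_path", default=None, help="folder with input images")
    parser.add_argument("--output_path", default=None, help="folder for output results")
    parser.add_argument("--model_weights", default=None, help="path to the model weights")
    # Assumption: the default model type is chosen here for illustration.
    parser.add_argument("--model_type", default="dpt_beit_large_512", help="model type")
    parser.add_argument("--side", action="store_true", help="RGB and depth side by side")
    parser.add_argument("--optimize", action="store_true", help="use half precision")
    parser.add_argument("--height", type=int, default=None, help="encoder input height")
    parser.add_argument("--square", action="store_true", help="force square input")
    parser.add_argument("--grayscale", action="store_true", help="grayscale depth map")
    return parser


args = build_parser().parse_args(["--input_path", "input/", "--side", "--height", "512"])
print(args)
```

Flags declared with store_true default to False, so --side and --grayscale only take effect when they appear on the command line.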


Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.