[Deep Learning MVS Paper Series] MVSNet: Depth Inference for Unstructured Multi-view Stereo

Original

First published 2022-01-20 21:28:03 · Last modified 2022-01-20 21:35:36

License: CC 4.0 BY-SA

Tags: #DeepLearning #ArtificialIntelligence

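MVSNet is a depth inference network for unstructured multi-view stereo. It combines 2D feature extraction with 3D cost volume regularization to recover 3D structure from multiple views. The method builds the 3D cost volume with differentiable homography warping and regularizes it with a 3D convolutional network to obtain the initial depth map.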

Core idea

extract deep visual image features
build the 3D cost volume upon the reference camera frustum via differentiable homography warping
apply 3D convolutions to regularize the cost volume and regress the initial depth map
refine the initial depth map with the reference image
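
The "regress the initial depth map" step can be illustrated with a soft-argmin: a softmax along the depth axis turns the regularized cost volume into a per-pixel probability distribution, and the depth is the expectation over the depth hypotheses. The sketch below is a minimal NumPy illustration, not the network code. The function name `regress_depth`, the array layout `(D, H, W)`, and the convention that a lower cost means a better match are assumptions made for this example.

```python
import numpy as np

def regress_depth(cost_volume, depth_values):
    """Soft-argmin depth regression (illustrative sketch).

    cost_volume:  (D, H, W) regularized matching costs; lower = better match
                  (sign convention assumed for this example).
    depth_values: (D,) depth hypothesis of each sweeping plane.
    Returns an (H, W) depth map.
    """
    # Softmax of the negated cost along the depth axis gives a
    # per-pixel probability distribution over the depth hypotheses.
    logits = -np.asarray(cost_volume, dtype=np.float64)
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    # Expected depth: sum over d of d * P(d), computed for every pixel.
    return np.tensordot(np.asarray(depth_values, dtype=np.float64), prob, axes=(0, 0))
```

Because the output is an expectation rather than a hard argmin, it is differentiable and can take sub-interval depth values between two hypotheses.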

input: one reference image + several source images

output: depth for the reference image

key insight: the differentiable homography warping operation encodes camera geometries in the network to build the 3D cost volume from 2D image features, which enables end-to-end training
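
To make the warping concrete, here is a minimal sketch of the standard plane-induced homography for a fronto-parallel plane at a given depth in the reference camera frame. It is written with a relative pose (`X_src = R @ X_ref + t`), which is a parameterization chosen for this example and not necessarily the exact form printed in the paper. The names `plane_homography` and `warp_pixel` are hypothetical. In the network the warp is applied to whole feature maps with differentiable bilinear sampling, which is omitted here.

```python
import numpy as np

def plane_homography(K_ref, K_src, R, t, depth):
    """Homography taking reference pixels to source pixels for the
    fronto-parallel plane z = depth in the reference camera frame.

    K_ref, K_src: (3, 3) camera intrinsics.
    R, t:         relative pose such that X_src = R @ X_ref + t (assumed convention).
    """
    n = np.array([0.0, 0.0, 1.0])  # plane normal = reference principal axis
    return K_src @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_ref)

def warp_pixel(H, u, v):
    """Apply a homography to one pixel and dehomogenize the result."""
    p = H @ np.array([u, v, 1.0])
    return p[:2] / p[2]
```

Sweeping `depth` over the hypotheses gives one homography per plane. Warping the source features with each of them stacks into a feature volume aligned with the reference camera frustum, and every step is a matrix product or an interpolation, so gradients can flow through it.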

contribution:

encode the camera parameters as a differentiable homography to build the 3D cost volume upon the camera frustum
bridge the 2D feature extraction and 3D cost regularization networks
decouple MVS reconstruction into smaller problems of per-view depth map estimation
a variance-based metric maps multiple features into one cost feature, so the network adapts to an arbitrary number of views
the 3D cost volume is built upon the camera frustum instead of the regular Euclidean space
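
The variance-based metric from the contribution list can be sketched in a few lines: the per-view feature volumes are averaged, and the cost is the mean squared deviation from that average, so the output shape does not depend on the number of views. This is a minimal NumPy sketch. The function name `variance_cost` and the array layout `(N, C, D, H, W)` are assumptions made for this example.

```python
import numpy as np

def variance_cost(feature_volumes):
    """Aggregate N warped feature volumes into one cost volume.

    feature_volumes: (N, C, D, H, W) array, one feature volume per view
                     (layout assumed for this example).
    Returns a (C, D, H, W) cost volume: the element-wise variance across views.
    """
    volumes = np.asarray(feature_volumes, dtype=np.float64)
    mean = volumes.mean(axis=0)                    # average volume over the N views
    return ((volumes - mean) ** 2).mean(axis=0)    # mean squared deviation from it
```

Where all views agree, the variance is close to zero, which marks a likely depth. Because the reduction runs over the view axis, the same code accepts 2, 3, or more views without any change.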