Research Principles of Computer Vision - Richard Szeliski

3D-Vision

Richard Szeliski - Research Principles of Computer Vision

In formulating and solving computer vision problems, I have often found it useful to draw inspiration from three high-level approaches:

• Scientific: build detailed models of the image formation process and develop mathematical techniques to invert these in order to recover the quantities of interest (where necessary, making simplifying assumptions to make the mathematics more tractable).
• Statistical: use probabilistic models to quantify the prior likelihood of your unknowns and the noisy measurement processes that produce the input images, then infer the best possible estimates of your desired quantities and analyze their resulting uncertainties. The inference algorithms used are often closely related to the optimization techniques used to invert the (scientific) image formation processes.
• Engineering: develop techniques that are simple to describe and implement but that are also known to work well in practice. Test these techniques to understand their limitations and failure modes, as well as their expected computational costs (run-time performance).

These three approaches build on each other and are used throughout the book.
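As a rough illustration of how the statistical and scientific approaches connect, consider a sketch under assumed notation that does not appear in the quoted text: $f$ is a forward image formation model, $x$ the unknown quantities, $y$ the observed image, and the noise is taken to be additive, zero-mean Gaussian with variance $\sigma^2$. Under those assumptions, the maximum a posteriori estimate reduces to a regularized least-squares inversion of the forward model:

```latex
\begin{aligned}
y &= f(x) + n, \qquad n \sim \mathcal{N}(0, \sigma^2 I) \\
\hat{x} &= \arg\max_x \; p(x \mid y)
         = \arg\max_x \; p(y \mid x)\, p(x) \\
        &= \arg\min_x \; \frac{1}{2\sigma^2}\,\lVert y - f(x) \rVert^2 \;-\; \log p(x)
\end{aligned}
```

The first term is the data-fitting cost that a purely "scientific" inversion would minimize, and the second is the contribution of the prior. Other noise models would lead to a different data term.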

My personal research and development philosophy (and hence the exercises in the book) has a strong emphasis on testing algorithms. It’s too easy in computer vision to develop an algorithm that does something plausible on a few images rather than something correct. The best way to validate your algorithms is to use a three-part strategy.

First, test your algorithm on clean synthetic data, for which the exact results are known. Second, add noise to the data and evaluate how the performance degrades as a function of noise level. Finally, test the algorithm on real-world data, preferably drawn from a wide variety of sources, such as photos found on the Web. Only then can you truly know if your algorithm can deal with real-world complexity, i.e., images that do not fit some simplified model or assumptions.
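The first two steps of this strategy can be sketched on a toy problem. The example below is only an illustration, not something taken from the book: it assumes the "algorithm" under test is an ordinary least-squares line fit, and the function names (`fit_line`, `make_data`, `slope_rmse`) and parameter values are hypothetical. The third step, testing on real-world data, needs an actual dataset and is not covered here.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def make_data(a, b, n, sigma, rng):
    """Synthetic samples of y = a*x + b on [0, 1] with Gaussian noise of std sigma."""
    xs = [i / (n - 1) for i in range(n)]
    ys = [a * x + b + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def slope_rmse(sigma, true_a=2.0, true_b=-1.0, n=50, trials=200, seed=0):
    """RMS error of the estimated slope over repeated noisy trials."""
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    squared_errors = []
    for _ in range(trials):
        xs, ys = make_data(true_a, true_b, n, sigma, rng)
        a, _ = fit_line(xs, ys)
        squared_errors.append((a - true_a) ** 2)
    return statistics.fmean(squared_errors) ** 0.5


if __name__ == "__main__":
    # Step 1: clean synthetic data, where the exact answer is known.
    print("clean-data slope error:", slope_rmse(0.0, trials=1))

    # Step 2: add noise and watch how the error grows with the noise level.
    for sigma in (0.01, 0.05, 0.1, 0.5):
        print(f"sigma={sigma:<5} slope RMSE={slope_rmse(sigma):.4f}")
```

On clean data the slope error should be at the level of floating-point round-off, and the error should grow with the noise level. A failure of the clean-data check points to a bug in the algorithm itself rather than to sensitivity to noise.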