OpenCV in Practice (6): The Discrete Fourier Transform

tupelo-shen

License: CC 4.0 BY-SA

Categories: OpenCV, image processing. Tags: opencv, DFT, discrete Fourier transform

Article link: https://blog.youkuaiyun.com/shenwanjiang111/article/details/54836285

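This article shows how to compute the Fourier transform of an image with OpenCV and explains the code in detail, covering image padding, allocating storage, and running the transform. It also shows how to compute and display the magnitude image of the result.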

1 Goals

(1) What is a Fourier transform, and what is it used for?
(2) How is it done in OpenCV?
(3) How to use the OpenCV functions copyMakeBorder(), merge(), dft(), getOptimalDFTSize(), log() and normalize().

2 Source code

The source can be found in the OpenCV source tree at samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp
Its contents are as follows:

#include "opencv2/core/core.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"

#include <iostream>

using namespace cv;
using namespace std;

static void help(char* progName)
{
    cout << endl
        <<  "This program demonstrates the use of the discrete Fourier transform (DFT). " << endl
        <<  "The DFT of an image is taken and its power spectrum is displayed."           << endl
        <<  "Usage:"                                                                      << endl
        << progName << " [image_name -- default lena.jpg] "                       << endl << endl;
}

int main(int argc, char ** argv)
{
    help(argv[0]);

    const char* filename = argc >=2 ? argv[1] : "lena.jpg";

    Mat I = imread(filename, IMREAD_GRAYSCALE); // C++ name for the legacy CV_LOAD_IMAGE_GRAYSCALE flag
    if( I.empty())
        return -1;

    Mat padded;                            //expand input image to optimal size
    int m = getOptimalDFTSize( I.rows );
    int n = getOptimalDFTSize( I.cols ); // on the border add zero values
    copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));

    Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
    Mat complexI;
    merge(planes, 2, complexI);         // Add to the expanded another plane with zeros

    dft(complexI, complexI);            // this way the result may fit in the source matrix

    // compute the magnitude and switch to logarithmic scale
    // => log(1 + sqrt(Re(DFT(I))^2 + Im(DFT(I))^2))
    split(complexI, planes);                   // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
    magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
    Mat magI = planes[0];

    magI += Scalar::all(1);                    // switch to logarithmic scale
    log(magI, magI);

    // crop the spectrum, if it has an odd number of rows or columns
    magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));

    // rearrange the quadrants of Fourier image  so that the origin is at the image center
    int cx = magI.cols/2;
    int cy = magI.rows/2;

    Mat q0(magI, Rect(0, 0, cx, cy));   // Top-Left - Create a ROI per quadrant
    Mat q1(magI, Rect(cx, 0, cx, cy));  // Top-Right
    Mat q2(magI, Rect(0, cy, cx, cy));  // Bottom-Left
    Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right

    Mat tmp;                           // swap quadrants (Top-Left with Bottom-Right)
    q0.copyTo(tmp);
    q3.copyTo(q0);
    tmp.copyTo(q3);

    q1.copyTo(tmp);                    // swap quadrant (Top-Right with Bottom-Left)
    q2.copyTo(q1);
    tmp.copyTo(q2);

    normalize(magI, magI, 0, 1, NORM_MINMAX); // Transform the matrix with float values into a
                                            // viewable image form (float between values 0 and 1).

    imshow("Input Image", I);    // Show the result
    imshow("spectrum magnitude", magI);
    waitKey();

    return 0;
}

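3 Code explanation

The Fourier transform decomposes an image into its sine and cosine components. In other words, it takes the image from the spatial domain to the frequency domain. The underlying idea is that any function can be represented as a sum of infinitely many sine and cosine functions, and the Fourier transform is the concrete way of doing so. For a two-dimensional N × N image the transform is:

$$F(k,l) = \sum_{i=0}^{N-1}\sum_{j=0}^{N-1} f(i,j)\, e^{-\mathrm{i}\,2\pi\left(\frac{ki}{N} + \frac{lj}{N}\right)}, \qquad e^{\mathrm{i}x} = \cos x + \mathrm{i}\sin x$$

Here f is the image value in the spatial domain, F is its value in the frequency domain, and \mathrm{i} is the imaginary unit (not to be confused with the row index i). The result of the transform is complex-valued. It can be displayed either as a real image and an imaginary image, or as a magnitude image and a phase image. In image processing algorithms it is mostly the magnitude image that is of interest, because it contains the information we need about the geometric structure of the image. If you intend to modify the image in the frequency domain and then transform it back, however, you need to keep both parts.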
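
To make the formula concrete, here is a minimal sketch that evaluates it directly in plain C++ (no OpenCV). The name naiveDft2D and the row-major std::vector layout are assumptions made for this illustration; this is not how cv::dft is implemented, and the O(N^4) loop is only practical for tiny inputs.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Hypothetical helper: direct evaluation of the 2D DFT sum for a square
// N x N image stored row-major in `f`. Returns F in the same layout.
std::vector<Complex> naiveDft2D(const std::vector<double>& f, std::size_t N)
{
    const double pi = std::acos(-1.0);
    std::vector<Complex> F(N * N);
    for (std::size_t k = 0; k < N; ++k) {
        for (std::size_t l = 0; l < N; ++l) {
            Complex sum(0.0, 0.0);
            for (std::size_t i = 0; i < N; ++i) {
                for (std::size_t j = 0; j < N; ++j) {
                    // Exponent of e^{-i 2 pi (k i / N + l j / N)}
                    const double angle =
                        -2.0 * pi * (double(k * i) + double(l * j)) / double(N);
                    // Euler's formula: e^{i x} = cos x + i sin x
                    sum += f[i * N + j] * Complex(std::cos(angle), std::sin(angle));
                }
            }
            F[k * N + l] = sum;
        }
    }
    return F;
}
```

For a constant image, all of the energy ends up in F(0,0), the zero-frequency term, and every other coefficient is zero up to rounding error.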
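This demo shows how to compute and display the magnitude image of a Fourier transform. Digital images are discrete: each pixel takes a value from a finite set, for example 0 to 255 in a basic grayscale image. The transform therefore has to be discrete as well, which gives the discrete Fourier transform (DFT). Use it whenever you need to determine the structure of an image from a geometric point of view. The steps are as follows (this example uses a grayscale image as input):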
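(1) Expand the image to an optimal size.
The performance of the DFT depends on the image size. It tends to be fastest when the size is a product of powers of 2, 3 and 5. To get the best performance it is therefore a good idea to pad the image to such a size. getOptimalDFTSize() returns this optimal size, and copyMakeBorder() can be used to expand the image borders: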

//expand input image to optimal size
Mat padded;                            
int m = getOptimalDFTSize( I.rows );
int n = getOptimalDFTSize( I.cols ); 
// on the border add zero pixels
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));

The added pixels are all initialized to 0.
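
As an illustration of what "optimal size" means here, the sketch below searches upward for the smallest size whose only prime factors are 2, 3 and 5. The function names are hypothetical, and the brute-force search is an assumption chosen for clarity; it is not a description of how getOptimalDFTSize() computes its result internally.

```cpp
// Returns true when n has no prime factors other than 2, 3 and 5.
bool isFiveSmooth(int n)
{
    if (n < 1) return false;
    const int primes[] = {2, 3, 5};
    for (int p : primes) {
        while (n % p == 0) n /= p;   // strip every factor of p
    }
    return n == 1;                   // nothing else may remain
}

// Hypothetical helper: smallest size >= n of the form 2^a * 3^b * 5^c.
int nextFiveSmooth(int n)
{
    if (n < 1) n = 1;
    while (!isFiveSmooth(n)) ++n;
    return n;
}
```

With this sketch an image 101 pixels wide would be padded to 108 (= 2^2 * 3^3), while a width of 512 is already suitable and needs no padding.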
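(2) Make room for both the real and the imaginary parts.
The result of a Fourier transform is complex, so for each image value the result has two values. Moreover, the range of values in the frequency domain is much larger than in the spatial domain, so they need to be stored at least as floating-point numbers. We therefore convert the input image to float and expand it with a second channel that holds the imaginary part: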

Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
// Add to the expanded another plane with zeros
merge(planes, 2, complexI);
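
The idea behind the two planes can be sketched without OpenCV: each 8-bit pixel becomes a complex number whose real part is the pixel value and whose imaginary part is zero. The helper name toComplex and the flat std::vector layout are assumptions for this illustration only.

```cpp
#include <complex>
#include <cstdint>
#include <vector>

// Hypothetical helper: promote 8-bit pixels to complex floats.
// Real part = pixel value, imaginary part = 0, which mirrors merging
// a float copy of the image with an all-zero second plane.
std::vector<std::complex<float>> toComplex(const std::vector<std::uint8_t>& pixels)
{
    std::vector<std::complex<float>> out;
    out.reserve(pixels.size());
    for (std::uint8_t p : pixels) {
        out.emplace_back(static_cast<float>(p), 0.0f);
    }
    return out;
}
```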

(3) Perform the discrete Fourier transform.
The calculation can be done in place, with the same matrix as input and output:

dft(complexI, complexI); // the result is stored back in the source matrix

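(4) Convert the real and imaginary values to magnitude.
The magnitude of the complex DFT result is:
$$M = \sqrt{\operatorname{Re}(DFT(I))^2 + \operatorname{Im}(DFT(I))^2}$$
In OpenCV code: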

// planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
split(complexI, planes); 
// planes[0] = magnitude                 
magnitude(planes[0], planes[1], planes[0]);
Mat magI = planes[0];

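(5) Switch to a logarithmic scale.
The dynamic range of the Fourier coefficients turns out to be too large to display on screen. Small and large values differ so much that they cannot be observed together: the high values show up as white points and the low ones as black. To make the values visible in grayscale, we convert the linear scale to a logarithmic one:
$$M_1 = \log(1 + M)$$
In OpenCV code: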

// switch to logarithmic scale
magI += Scalar::all(1);                    
log(magI, magI);
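
Steps (4) and (5) together amount to one small per-coefficient computation. The sketch below shows it in plain C++ on a flat vector of complex coefficients; logMagnitude is a hypothetical name, and the standard-library calls std::abs and std::log1p stand in for magnitude() and log() purely for illustration.

```cpp
#include <cmath>
#include <complex>
#include <vector>

// Hypothetical helper: log-scaled magnitude of each complex coefficient.
std::vector<float> logMagnitude(const std::vector<std::complex<float>>& coeffs)
{
    std::vector<float> out;
    out.reserve(coeffs.size());
    for (const std::complex<float>& c : coeffs) {
        const float m = std::abs(c);     // M = sqrt(Re^2 + Im^2)
        out.push_back(std::log1p(m));    // M1 = log(1 + M)
    }
    return out;
}
```

Adding 1 before taking the logarithm keeps the result at 0 for a zero coefficient rather than letting it go to negative infinity.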

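(6) Crop and rearrange.
Recall that the image was padded in step 1, so the spectrum has the padded size. The code below trims it to an even number of rows and columns (the expression `& -2` clears the lowest bit), so that the four quadrants are equal in size. For visualization we then rearrange the quadrants so that the origin (zero frequency) sits at the center of the image: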

magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
int cx = magI.cols/2;
int cy = magI.rows/2;

// Create a ROI per quadrant
Mat q0(magI, Rect(0, 0, cx, cy));   // Top-Left
Mat q1(magI, Rect(cx, 0, cx, cy));  // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy));  // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right

Mat tmp;
// swap quadrants(Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);

// swap quadrant(Top-Right with Bottom-Left)
q1.copyTo(tmp);
q2.copyTo(q1);
tmp.copyTo(q2);
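
The same diagonal quadrant exchange can be sketched on a plain row-major buffer, which may help when checking what the ROI copies do. The function name swapQuadrants and the std::vector layout are assumptions for this illustration, and it assumes both dimensions are even, as they are after the crop.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: swap diagonally opposite quadrants of a row-major
// rows x cols matrix in place. Assumes rows and cols are both even.
void swapQuadrants(std::vector<float>& m, std::size_t rows, std::size_t cols)
{
    const std::size_t cy = rows / 2;
    const std::size_t cx = cols / 2;
    for (std::size_t y = 0; y < cy; ++y) {
        for (std::size_t x = 0; x < cx; ++x) {
            // Top-left <-> bottom-right
            std::swap(m[y * cols + x], m[(y + cy) * cols + (x + cx)]);
            // Top-right <-> bottom-left
            std::swap(m[y * cols + (x + cx)], m[(y + cy) * cols + x]);
        }
    }
}
```

After the swap, the element that was at index (0, 0), the zero-frequency term, sits at (rows/2, cols/2).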

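(7) Normalize.
This step is again for visualization. We now have the magnitudes, but they still lie outside the 0 to 1 range used to display floating-point images. normalize() scales them into that range: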

normalize(magI, magI, 0, 1, NORM_MINMAX); 
// Transform the matrix with float values into a
// viewable image form(float between values 0 and 1).
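
Min-max scaling to the 0 to 1 range can be written out by hand as follows. This is a sketch of the arithmetic only; normalizeMinMax is a hypothetical name, and mapping a constant input to all zeros is a choice made for this example rather than a statement about what normalize() does in that case.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: linearly rescale values so the minimum maps to 0
// and the maximum maps to 1. A constant input is mapped to all zeros
// (an assumption of this sketch).
void normalizeMinMax(std::vector<float>& v)
{
    if (v.empty()) return;
    const auto mm = std::minmax_element(v.begin(), v.end());
    const float lo = *mm.first;      // copy before the values are modified
    const float hi = *mm.second;
    const float range = hi - lo;
    for (float& x : v) {
        x = (range > 0.0f) ? (x - lo) / range : 0.0f;
    }
}
```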

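4 Results

One application is to determine the geometric orientation present in an image. For example, is a piece of text horizontal or not? Lines of text form roughly horizontal structures and the letter strokes form roughly vertical ones, and both show up in the Fourier transform.
We demonstrate this with two images of text, one horizontal and one rotated.
Horizontal text:
[Figure: result for the horizontal text]
Rotated text:

From these results you can see that the most influential components of the frequency domain (the brightest points in the magnitude image) rotate along with the geometric rotation of the objects in the image. From this we can calculate the rotation offset and then rotate the image to correct any misalignment.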