OpenCV Source Code Analysis: imread and JPEG

License: CC 4.0 BY-SA

Tags:

#OpenCV source code #imread #JPEG

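This article walks through how OpenCV decodes JPEG images, covering the key steps: selecting the matching decoder, reading the image header, and reading the pixel data.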

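After two and a half years at my first job, I have been lucky enough to return to image processing. Back then I could only read OpenCV source code and blogs and follow OpenCV news in my spare time, in fits and starts; now I can devote myself to it fully. Learning image processing inevitably involves OpenCV: its fully open source code, large user base, and community resources make it the obvious choice, and the source itself is well worth studying for anyone working in the field. Starting with this post I will gradually analyze parts of its implementation, partly as a record of my daily work and study. As the saying goes, the palest ink beats the best memory, and being able to read something, explain it, and write it up are quite different things. This post covers how OpenCV decodes JPEG images.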

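The first step in processing an image with OpenCV is loading the image data, which means decoding the image file. This post uses JPEG as the example. Reading an image goes through the imread interface, whose prototype is: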

Mat imread(const String& filename, int flags)

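In the OpenCV source, imread takes a file name and flags as parameters. filename is simply the image to read; flags controls how the image is read. The flags are defined in imgcodecs.hpp as shown below, and we can pick the read mode that suits our processing needs.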

//! Imread flags
enum ImreadModes {
       IMREAD_UNCHANGED            = -1, //!< If set, return the loaded image as is (with alpha channel, otherwise it gets cropped).
       IMREAD_GRAYSCALE            = 0,  //!< If set, always convert image to the single channel grayscale image.
       IMREAD_COLOR                = 1,  //!< If set, always convert image to the 3 channel BGR color image.
       IMREAD_ANYDEPTH             = 2,  //!< If set, return 16-bit/32-bit image when the input has the corresponding depth, otherwise convert it to 8-bit.
       IMREAD_ANYCOLOR             = 4,  //!< If set, the image is read in any possible color format.
       IMREAD_LOAD_GDAL            = 8,  //!< If set, use the gdal driver for loading the image.
       IMREAD_REDUCED_GRAYSCALE_2  = 16, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/2.
       IMREAD_REDUCED_COLOR_2      = 17, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/2.
       IMREAD_REDUCED_GRAYSCALE_4  = 32, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/4.
       IMREAD_REDUCED_COLOR_4      = 33, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/4.
       IMREAD_REDUCED_GRAYSCALE_8  = 64, //!< If set, always convert image to the single channel grayscale image and the image size reduced 1/8.
       IMREAD_REDUCED_COLOR_8      = 65, //!< If set, always convert image to the 3 channel BGR color image and the image size reduced 1/8.
       IMREAD_IGNORE_ORIENTATION   = 128 //!< If set, do not rotate the image according to EXIF's orientation flag.
};
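The numeric values in the enum are worth a second look: each REDUCED_* value is a base mode plus one extra bit (17 = 16 | 1, 33 = 32 | 1, and so on), so the flags can be taken apart with bit tests. The block below is only a minimal sketch of that idea, not OpenCV's actual code. The SK_-prefixed constants are local copies of the enum values so the example builds without OpenCV, and scaleDenomFromFlags and forcesColor are hypothetical helper names. Treating negative values as "no reduction" is my assumption, made because IMREAD_UNCHANGED (-1) has every bit set.

```cpp
// Local copies of the ImreadModes values listed above (assumption: kept
// separate from OpenCV so this sketch compiles on its own).
enum SketchImreadModes {
    SK_IMREAD_UNCHANGED           = -1,
    SK_IMREAD_GRAYSCALE           = 0,
    SK_IMREAD_COLOR               = 1,
    SK_IMREAD_REDUCED_GRAYSCALE_2 = 16,
    SK_IMREAD_REDUCED_GRAYSCALE_4 = 32,
    SK_IMREAD_REDUCED_GRAYSCALE_8 = 64
};

// Hypothetical helper: returns the downscale denominator (1, 2, 4 or 8)
// implied by the flags.
int scaleDenomFromFlags(int flags) {
    if (flags < 0)
        return 1;  // -1 has all bits set, so bit tests would be meaningless
    if (flags & SK_IMREAD_REDUCED_GRAYSCALE_2) return 2;
    if (flags & SK_IMREAD_REDUCED_GRAYSCALE_4) return 4;
    if (flags & SK_IMREAD_REDUCED_GRAYSCALE_8) return 8;
    return 1;
}

// Hypothetical helper: true when the caller explicitly asked for a
// 3-channel BGR result (the bit with value 1).
bool forcesColor(int flags) {
    return flags >= 0 && (flags & SK_IMREAD_COLOR) != 0;
}
```

With these helpers, a flags value of 33 (IMREAD_REDUCED_COLOR_4) splits into a denominator of 4 plus the color bit, while 32 (IMREAD_REDUCED_GRAYSCALE_4) gives the same denominator without it.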

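imread then calls imread_(filename, flags, LOAD_MAT, &img);. The comment above that function says it loads the data; in other words, it decodes the image and returns the raw pixel data (let's call it that for now). Let's go one step further and look inside, because this is where the real work of imread happens.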
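Before going through the details, it may help to see the overall shape of the flow this article follows: pick a decoder, read the header, allocate the image, then read the data. The block below is a rough sketch of that sequence under my own assumptions. SketchDecoder, SketchImage, loadWith, and SolidGrayDecoder are all hypothetical names invented for illustration; they are not OpenCV classes, and the real imread_ does more than this.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical image container standing in for cv::Mat.
struct SketchImage {
    int width = 0, height = 0, channels = 0;
    std::vector<unsigned char> pixels;
    bool empty() const { return pixels.empty(); }
};

// Hypothetical decoder interface with the two steps discussed in this article.
struct SketchDecoder {
    int width = 0, height = 0, channels = 0;
    virtual ~SketchDecoder() {}
    virtual bool readHeader() = 0;               // fills width/height/channels
    virtual bool readData(SketchImage& img) = 0; // fills img.pixels
};

// Sketch of the load sequence: header first, then allocation, then data.
// Any failure yields an empty image.
SketchImage loadWith(SketchDecoder* decoder) {
    SketchImage img;
    if (!decoder || !decoder->readHeader())
        return img;  // no matching decoder, or the header could not be read
    img.width = decoder->width;
    img.height = decoder->height;
    img.channels = decoder->channels;
    img.pixels.assign(static_cast<std::size_t>(img.width) * img.height * img.channels, 0);
    if (!decoder->readData(img))
        return SketchImage();  // decoding failed: hand back an empty image
    return img;
}

// Toy decoder used only to exercise the sketch: a 4x2 gray image of value 128.
struct SolidGrayDecoder : SketchDecoder {
    bool readHeader() override { width = 4; height = 2; channels = 1; return true; }
    bool readData(SketchImage& img) override {
        img.pixels.assign(img.pixels.size(), 128);
        return true;
    }
};
```

Returning an empty image on failure mirrors the way callers of imread check the result with empty() instead of catching an error.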

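The first thing that stands out in imread_ is ImageDecoder decoder;. As the name suggests, this is an image decoder. Since we do not know up front which decoder can handle the input image, the code has to find the decoder that matches the image format. There are two ways to get one: GdalDecoder().newDecoder() and findDecoder. I have not used the first; from what I have read it targets remote-sensing imagery, so I will not cover it here and will come back to it when I need it. findDecoder takes only a filename, which is the name of the image file we are reading. It first iterates over all registered decoders to find the largest signatureLength. That method lives in grfmt_base.cpp and is implemented as follows: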

size_t BaseImageDecoder::signatureLength() const
{
    return m_signature.size();
}

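The code shows that this is a method of the BaseImageDecoder class that returns the size of m_signature. For JPEG the signature is set with m_signature = "\xFF\xD8\xFF";, which can be found in grfmt_jpeg.cpp. Once the maximum length is known, findDecoder opens the file and reads that many bytes into signature, then iterates over the registered decoders again to find the one whose signature matches the start of the input file.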

bool BaseImageDecoder::checkSignature( const String& signature ) const
{
    size_t len = signatureLength();
    return signature.size() >= len && memcmp( signature.c_str(), m_signature.c_str(), len ) == 0;
}
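The two passes described above (find the longest signature, then compare the file's leading bytes against each decoder) can be sketched without OpenCV. This is only an illustration under stated assumptions: SketchDecoderInfo, registeredDecoders, maxSignatureLength, and findDecoderName are hypothetical names, the decoder list holds just two entries, and the input is an in-memory byte string instead of a file. The JPEG signature is the one quoted above; the 8-byte PNG signature is added only to give the search a second candidate.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a registered decoder: a name plus its magic bytes.
struct SketchDecoderInfo {
    std::string name;
    std::string signature;
};

// A tiny illustrative registry; the real list in OpenCV is longer.
std::vector<SketchDecoderInfo> registeredDecoders() {
    std::vector<SketchDecoderInfo> list;
    list.push_back({"JPEG", std::string("\xFF\xD8\xFF", 3)});
    list.push_back({"PNG", std::string("\x89PNG\r\n\x1a\n", 8)});
    return list;
}

// First pass: the longest signature tells us how many bytes to read.
std::size_t maxSignatureLength(const std::vector<SketchDecoderInfo>& decoders) {
    std::size_t maxlen = 0;
    for (const auto& d : decoders)
        maxlen = std::max(maxlen, d.signature.size());
    return maxlen;
}

// Same test as checkSignature above: prefix comparison with memcmp.
bool checkSignature(const SketchDecoderInfo& d, const std::string& header) {
    return header.size() >= d.signature.size() &&
           std::memcmp(header.data(), d.signature.data(), d.signature.size()) == 0;
}

// Second pass: return the first decoder whose signature matches, or "" if none.
std::string findDecoderName(const std::string& fileBytes) {
    const std::vector<SketchDecoderInfo> decoders = registeredDecoders();
    const std::string header = fileBytes.substr(0, maxSignatureLength(decoders));
    for (const auto& d : decoders)
        if (checkSignature(d, header))
            return d.name;
    return "";
}
```

Because the match is a prefix comparison, a file is identified by its leading bytes and not by its extension, so a JPEG saved with the wrong extension would still be matched in this sketch.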

Assuming the matching decoder has been found, the next step is decoder->readHeader(). The JPEG implementation is:

bool  JpegDecoder::readHeader()
{
    volatile bool result = false;
    close();

    JpegState* state = new JpegState;
    m_state = state;
    state->cinfo.err = jpeg_std_error(&state->jerr.pub);
    state->jerr.pub.error_exit = error_exit;
    if( setjmp( state->jerr.setjmp_buffer ) == 0 )
    {
        jpeg_create_decompress( &state->cinfo );

        if( !m_buf.empty() )
        {
            jpeg_buffer_src(&state->cinfo, &state->source);
            state->source.pub.next_input_byte = m_buf.ptr();
            state->source.pub.bytes_in_buffer = m_buf.cols*m_buf.rows*m_buf.elemSize();
        }
        else
        {
            m_f = fopen( m_filename.c_str(), "rb" );
            if( m_f )
                jpeg_stdio_src( &state->cinfo, m_f );
        }

        if (state->cinfo.src != 0)
        {
            jpeg_read_header( &state->cinfo, TRUE );

            state->cinfo.scale_num=1;
            state->cinfo.scale_denom = m_scale_denom;
            m_scale_denom=1; // trick! to know which decoder used scale_denom see imread_
            jpeg_calc_output_dimensions(&state->cinfo);
            m_width = state->cinfo.output_width;
            m_height = state->cinfo.output_height;
            m_type = state->cinfo.num_components > 1 ? CV_8UC3 : CV_8UC1;
            result = true;
        }
    }

    if( !result )
        close();

    return result;
}

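This is essentially a series of calls into libjpeg: create the JPEG decompressor, set the input source (the in-memory buffer if m_buf is non-empty, otherwise the file), and read the image width, height, and channel information. If this part is unclear, download the libjpeg source; its example.c and libjpeg.txt are very helpful for understanding it.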
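One detail in readHeader that is easy to skim past is the setjmp call combined with the replaced error_exit callback: when the library hits a fatal error, the callback jumps back to the setjmp site instead of terminating the process, and the function then returns false. The block below is a minimal sketch of that control flow using only the standard library. SketchErrorMgr, sketchErrorExit, sketchLibraryCall, and guardedDecode are hypothetical names, and the "library" is simulated by a function that fails on request.

```cpp
#include <csetjmp>

// Hypothetical error manager: just the jump buffer, like the setjmp_buffer
// field used in the listing above.
struct SketchErrorMgr {
    std::jmp_buf setjmp_buffer;
};

// Stand-in for the error callback: it never returns to its caller and
// instead jumps back to wherever setjmp was called.
void sketchErrorExit(SketchErrorMgr* err) {
    std::longjmp(err->setjmp_buffer, 1);
}

// Stand-in for a C library routine that reports fatal errors via the callback.
void sketchLibraryCall(SketchErrorMgr* err, bool corruptInput) {
    if (corruptInput)
        sketchErrorExit(err);
}

// Same shape as readHeader: result stays false unless the guarded block
// runs to completion.
bool guardedDecode(bool corruptInput) {
    SketchErrorMgr err;
    volatile bool result = false;  // volatile, as in the listing, so its value is reliable after a longjmp
    if (setjmp(err.setjmp_buffer) == 0) {
        sketchLibraryCall(&err, corruptInput);  // may jump back to the setjmp above
        result = true;                          // reached only when no error occurred
    }
    return result;
}
```

One caveat with this pattern in C++: jumping over objects with non-trivial destructors is undefined behavior, so the guarded block in this sketch only touches plain data.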

Next, imread_ creates the OpenCV image container (the Mat) and then reads the image data. As before, here is the JPEG implementation:

bool  JpegDecoder::readData( Mat& img )
{
    volatile bool result = false;
    size_t step = img.step;
    bool color = img.channels() > 1;

    if( m_state && m_width && m_height )
    {
        jpeg_decompress_struct* cinfo = &((JpegState*)m_state)->cinfo;
        JpegErrorMgr* jerr = &((JpegState*)m_state)->jerr;
        JSAMPARRAY buffer = 0;
        if( setjmp( jerr->setjmp_buffer ) == 0 )
        {
            /* check if this is a mjpeg image format */
            if ( cinfo->ac_huff_tbl_ptrs[0] == NULL &&
                cinfo->ac_huff_tbl_ptrs[1] == NULL &&
                cinfo->dc_huff_tbl_ptrs[0] == NULL &&
                cinfo->dc_huff_tbl_ptrs[1] == NULL )
            {
                /* yes, this is a mjpeg image format, so load the correct
                huffman table */
                my_jpeg_load_dht( cinfo,
                    my_jpeg_odml_dht,
                    cinfo->ac_huff_tbl_ptrs,
                    cinfo->dc_huff_tbl_ptrs );
            }

            if( color )
            {
                if( cinfo->num_components != 4 )
                {
                    cinfo->out_color_space = JCS_RGB;
                    cinfo->out_color_components = 3;
                }
                else
                {
                    cinfo->out_color_space = JCS_CMYK;
                    cinfo->out_color_components = 4;
                }
            }
            else
            {
                if( cinfo->num_components != 4 )
                {
                    cinfo->out_color_space = JCS_GRAYSCALE;
                    cinfo->out_color_components = 1;
                }
                else
                {
                    cinfo->out_color_space = JCS_CMYK;
                    cinfo->out_color_components = 4;
                }
            }

            jpeg_start_decompress( cinfo );
            buffer = (*cinfo->mem->alloc_sarray)((j_common_ptr)cinfo,
                                              JPOOL_IMAGE, m_width*4, 1 );
            uchar* data = img.ptr();
            for( ; m_height--; data += step )
            {
                jpeg_read_scanlines( cinfo, buffer, 1 );
                if( color )
                {
                    if( cinfo->out_color_components == 3 )
                        icvCvt_RGB2BGR_8u_C3R( buffer[0], 0, data, 0, cvSize(m_width,1) );
                    else
                        icvCvt_CMYK2BGR_8u_C4C3R( buffer[0], 0, data, 0, cvSize(m_width,1) );
                }
                else
                {
                    if( cinfo->out_color_components == 1 )
                        memcpy( data, buffer[0], m_width );
                    else
                        icvCvt_CMYK2Gray_8u_C4C1R( buffer[0], 0, data, 0, cvSize(m_width,1) );
                }
            }

            result = true;
            jpeg_finish_decompress( cinfo );
        }
    }

    close();
    return result;
}
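The inner loop of readData reads one scanline, converts it, and advances the destination pointer by step bytes. For the common 3-component case the conversion is a per-pixel swap of the first and third bytes, since libjpeg delivers RGB and OpenCV stores BGR. The block below is a sketch of that row-wise pattern, not a copy of icvCvt_RGB2BGR_8u_C3R. rgbToBgrRow and convertRows are hypothetical names, and the source is assumed to be a tightly packed RGB buffer instead of a live libjpeg stream.

```cpp
#include <cstddef>
#include <vector>

// Swap the R and B bytes of each packed 3-byte pixel in one row.
void rgbToBgrRow(const unsigned char* src, unsigned char* dst, int width) {
    for (int x = 0; x < width; ++x, src += 3, dst += 3) {
        dst[0] = src[2];  // B
        dst[1] = src[1];  // G
        dst[2] = src[0];  // R
    }
}

// Convert a packed RGB image into a BGR buffer row by row. `step` is the
// destination row stride in bytes and may exceed width * 3 (row padding).
std::vector<unsigned char> convertRows(const std::vector<unsigned char>& rgb,
                                       int width, int height, std::size_t step) {
    std::vector<unsigned char> bgr(step * static_cast<std::size_t>(height), 0);
    unsigned char* data = bgr.data();
    for (int y = 0; y < height; ++y, data += step) {
        const unsigned char* row = rgb.data() + static_cast<std::size_t>(y) * width * 3;
        rgbToBgrRow(row, data, width);
    }
    return bgr;
}
```

Advancing by step instead of width * 3 is what lets the destination rows carry padding, which is why readData takes step from img.step.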

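readData decodes the image one scanline at a time and copies each decoded row into the Mat's data buffer. That completes the walkthrough of OpenCV's imread implementation for JPEG. Working through the whole process taught me quite a bit. This is my own understanding of this part of the code, and comments and discussion are welcome. Here is a good libjpeg tutorial:

Click to open the link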