Some aspects of image recognition and image processing

Latest recommended article published 2018-11-30 01:13:34

License: CC 4.0 BY-SA

Tags: #image recognition #image processing

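This article explores frontier techniques in image recognition and processing, from the principles of camera imaging to the use of image processing tools such as OpenCV, and discusses their applications in face recognition, intelligent object recognition, and the structuring of image data for big data. It also introduces key image processing operations, including color conversion, histogram statistics, image enhancement, and filtering.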

Foreword
Camera imaging
Image Processing
Current situation and personal outlook

Foreword

With advances in technology and growing consumer demand, image recognition and processing are being used more and more widely across industries. The applications closest to daily life are face recognition and scene recognition; structured light, for example, has been used in several recent iPhone and iPad models. In addition, Huawei's latest series of mobile phones uses intelligent object recognition, and BAT (Baidu, Alibaba and Tencent) have added image search to their websites and apps. Image recognition is also widely used in criminal investigation (such as pursuing fugitives) and even in enforcing public order (such as detecting vehicles that run red lights). The combination of big data with the structuring of image data is another frontier research direction. This article explores some aspects of image recognition and image processing.

Camera imaging

When it comes to images, we first have to mention how they are acquired. Typically, an image is taken by a digital camera, which projects the light from a scene through a lens onto an image sensor.

Here d₁ is the distance from the lens to the image plane, d₂ is the distance from the lens to the object, and f is the focal length of the lens.
They are related by the thin-lens equation: 1/f = 1/d₁ + 1/d₂
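As a small worked example of the thin-lens equation, the sketch below solves it for d₁ given f and d₂. The function name imageDistance is made up for this illustration and is not part of OpenCV; it assumes both lengths use the same unit and that the object lies beyond the focal length.

```cpp
#include <cassert>

// Thin-lens equation: 1/f = 1/d1 + 1/d2.
// Given the focal length f and the lens-to-object distance d2,
// return the lens-to-image-plane distance d1.
// Hypothetical helper for illustration only.
double imageDistance(double f, double d2) {
    assert(f > 0.0 && d2 > f);  // a real image needs the object beyond f
    return 1.0 / (1.0 / f - 1.0 / d2);
}
```

For a 50 mm lens and an object 2000 mm away, imageDistance(50.0, 2000.0) is about 51.28 mm. As d₂ grows, d₁ approaches f, which is why a simplified camera model can treat the image plane as sitting at the focal length.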
Because camera parameters differ from one device to another, image processing usually simplifies the camera for the sake of clear explanation and applies the pinhole camera model taught in middle school.
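The basic projection equation relating an object to its image is:
h₁ = (f·h₂)/d₂
where h₁ is the height of the image, h₂ is the height of the object, and d₂ is the distance from the lens to the object.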
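To make the pinhole projection concrete, here is a minimal sketch that assumes the usual form of the relation: image height equals focal length times object height divided by object distance. The function name projectedHeight is hypothetical, and all three arguments are assumed to share one unit.

```cpp
#include <cassert>

// Pinhole projection: imageHeight = f * objectHeight / objectDistance.
// Hypothetical helper for illustration only.
double projectedHeight(double f, double objectHeight, double objectDistance) {
    assert(objectDistance > 0.0);  // the object must be in front of the camera
    return f * objectHeight / objectDistance;
}
```

With f = 0.05 m, a 1.8 m tall person standing 10 m away projects to an image about 0.009 m (9 mm) high; doubling the distance halves the image height.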

When the image sensor receives the light from the object, it records the result as a value at each pixel. This gives us the digital image that we process in the next step.

Image Processing

For image processing, this section introduces a tool that the blogger uses in a trace identification module currently under development.
An image is essentially a matrix of numerical values. Taking OpenCV, the most widely used library in computer vision, as an example, we use the cv::Mat structure to manipulate images. Each element in the matrix represents one pixel. For black and white images (grayscale images), the pixels are 8-bit unsigned values ranging from 0 to 255. For color images, three primary colors are needed to construct the different visible colors. The commonly used primaries are red, green, and blue, which form the RGB channels (OpenCV stores them in B, G, R order by default).
We can use OpenCV to perform various operations on images. The most basic one is to load, display and store images.

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>

By making good use of these two header files and the APIs they provide, we can inspect and transform the various features of an image. For example, cv::Size gives us the height and width of the matrix. The vector type cv::Vec3b holds the three 8-bit channel values of a color pixel, and a channel index selects a specific one of the three. The image itself is manipulated by scanning the image matrix, using either pointers or iterators.
In addition, more advanced operations on images require a good grasp of the mathematics behind the algorithms. We can design our own functions, or we can use the existing functions in OpenCV (which are really easy to use), such as cv::cvtColor (declared in <opencv2/imgproc/imgproc.hpp>) to convert colors.
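The idea of scanning an image matrix with row pointers can be sketched without OpenCV. The Image struct below is a hypothetical stand-in, not an OpenCV type: it assumes 8-bit pixels with three interleaved channels stored row by row, which mirrors how a continuous 3-channel matrix is commonly laid out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 3-channel 8-bit image (not an OpenCV type):
// pixels are stored row by row, each pixel as three consecutive bytes.
struct Image {
    int rows;
    int cols;
    std::vector<std::uint8_t> data;  // size = rows * cols * 3

    Image(int r, int c, std::uint8_t fill = 0)
        : rows(r), cols(c), data(static_cast<std::size_t>(r) * c * 3, fill) {}

    // Pointer to the first byte of row y.
    std::uint8_t* rowPtr(int y) {
        return data.data() + static_cast<std::size_t>(y) * cols * 3;
    }
};

// Scan every row with a pointer and invert each channel value in place.
void invert(Image& img) {
    for (int y = 0; y < img.rows; ++y) {
        std::uint8_t* p = img.rowPtr(y);
        for (int i = 0; i < img.cols * 3; ++i) {
            p[i] = static_cast<std::uint8_t>(255 - p[i]);
        }
    }
}
```

Fetching the row pointer once per row keeps the inner loop to simple pointer arithmetic, which is the main reason pointer scanning is usually preferred over per-pixel accessor calls.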

cv::cvtColor(srcImage, grayImage, cv::COLOR_BGR2GRAY);
// convert the BGR picture into a grayscale image
cv::cvtColor(srcImage, HSVImage, cv::COLOR_BGR2HSV);
// convert the BGR picture into an HSV image
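To show what a grayscale conversion computes for each pixel, here is a minimal sketch using the weighting Y = 0.299 R + 0.587 G + 0.114 B, which is the formula the OpenCV documentation gives for RGB to gray. The function bgrToGray is hypothetical, and its integer rounding is an assumption that may differ from the library's own result by one gray level.

```cpp
#include <cstdint>

// Gray value of one pixel from its blue, green and red values, using
// Y = 0.299 R + 0.587 G + 0.114 B in integer arithmetic with rounding.
// Hypothetical helper for illustration only.
std::uint8_t bgrToGray(std::uint8_t b, std::uint8_t g, std::uint8_t r) {
    const int y = (114 * b + 587 * g + 299 * r + 500) / 1000;
    return static_cast<std::uint8_t>(y);
}
```

Because the three weights sum to 1, pure white stays at 255 and pure black stays at 0; green contributes the most because the eye is most sensitive to it.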

An image is composed of pixels of different colors, and the distribution of pixel values is an important attribute of the image. A histogram captures this distribution by counting the number of pixels that have each value. For multi-channel images we usually call the cv::calcHist function.

cv::calcHist(&image,
   1,          // histogram of a single image
   channels,   // the channels used
   cv::Mat(),  // no mask is used
   hist,       // the resulting histogram
   3,          // it is a three-dimensional histogram
   histSize,   // number of bins in each dimension
   ranges      // the range of pixel values
);
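For the single-channel case, the counting behind a histogram can be sketched in a few lines of plain C++. The function grayHistogram is a hypothetical helper, not an OpenCV call; it assumes 8-bit gray values and one bin per value.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Count how many pixels take each of the 256 possible gray levels.
// Hypothetical helper for illustration only.
std::array<int, 256> grayHistogram(const std::vector<std::uint8_t>& pixels) {
    std::array<int, 256> hist{};  // value-initialised: all bins start at 0
    for (std::uint8_t v : pixels) {
        ++hist[v];
    }
    return hist;
}
```

A three-dimensional color histogram follows the same idea, except that each pixel increments the bin addressed by its three channel values together.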

We can also modify an image and improve its quality by using the distribution of pixel values, for example with a lookup table or mapping function applied by cv::LUT. In addition, the integral image computed by cv::integral can be used to count pixels over image regions.

void integral(
    InputArray src,     // input image, W x H, 8-bit or floating-point
    OutputArray sum,    // integral image, (W+1) x (H+1), 32-bit integer or floating-point
    OutputArray sqsum,  // integral image of squared pixel values, (W+1) x (H+1), double-precision floating-point
    OutputArray tilted, // integral of the image rotated by 45 degrees
    int sdepth = -1,    // desired depth of the integral and tilted integral images
    int sqdepth = -1    // desired depth of the squared integral image
);
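The extra row and column in the integral image are easier to see in a small hand-written version. The sketch below is an assumption-laden illustration in plain C++, not the library implementation: integralImage and regionSum are hypothetical names, and the input is assumed to be an 8-bit grayscale image stored row by row.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Integral image of a w x h grayscale image stored row by row.
// The result has (w + 1) x (h + 1) entries; entry (x, y) holds the sum of all
// pixels above and to the left of (x, y), so row 0 and column 0 are zero.
std::vector<long long> integralImage(const std::vector<std::uint8_t>& src,
                                     int w, int h) {
    const int W = w + 1;
    std::vector<long long> sum(static_cast<std::size_t>(W) * (h + 1), 0);
    for (int y = 1; y <= h; ++y) {
        for (int x = 1; x <= w; ++x) {
            sum[y * W + x] = src[(y - 1) * w + (x - 1)]
                           + sum[(y - 1) * W + x]
                           + sum[y * W + (x - 1)]
                           - sum[(y - 1) * W + (x - 1)];
        }
    }
    return sum;
}

// Sum of the pixels in the rectangle [x0, x1) x [y0, y1) using four lookups.
long long regionSum(const std::vector<long long>& sum, int w,
                    int x0, int y0, int x1, int y1) {
    const int W = w + 1;
    return sum[y1 * W + x1] - sum[y0 * W + x1]
         - sum[y1 * W + x0] + sum[y0 * W + x0];
}
```

Once the integral image is built, the sum over any rectangle costs four lookups regardless of the rectangle's size, which is what makes it useful for counting pixels over many regions.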

Filtering is another important operation in image processing: it removes noise from the image, extracts useful visual features, and supports resampling. Edges are detected with the Sobel operator, which estimates first derivatives, while the curvature of the image function is measured with the Laplacian operator, which sums the second derivatives.
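As an illustration of derivative filtering, the sketch below applies the standard 3 x 3 horizontal Sobel kernel by hand. The function sobelX is a hypothetical helper written for this example; it assumes an 8-bit grayscale image stored row by row and simply leaves the border pixels at zero, which is a simplification compared with a library's border handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Horizontal Sobel response of a w x h grayscale image stored row by row.
// Kernel rows: [-1 0 1], [-2 0 2], [-1 0 1].
// Border pixels are left at 0 for simplicity.
std::vector<int> sobelX(const std::vector<std::uint8_t>& src, int w, int h) {
    std::vector<int> out(static_cast<std::size_t>(w) * h, 0);
    auto at = [&](int x, int y) { return static_cast<int>(src[y * w + x]); };
    for (int y = 1; y < h - 1; ++y) {
        for (int x = 1; x < w - 1; ++x) {
            out[y * w + x] =
                -at(x - 1, y - 1) + at(x + 1, y - 1)
                - 2 * at(x - 1, y) + 2 * at(x + 1, y)
                - at(x - 1, y + 1) + at(x + 1, y + 1);
        }
    }
    return out;
}
```

On a flat region the response is zero, and across a vertical step from 0 to 100 it reaches 400, so large absolute values mark vertical edges.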
In addition, the Canny operator cv::Canny is used to detect the contours of the image and store them in cv::Mat contours:

cv::Canny(image,    // grayscale input image
          contours, // output: the contours of the image
          100,      // low threshold
          300);     // high threshold

The Hough transform can then be used to detect lines, and cv::findContours can be used to extract the contours of the connected regions in the image.

OpenCV is a fundamental and very useful open source library (as its full name, Open Source Computer Vision Library, suggests). After years of analysis and optimization, it has become a major development tool in the field of computer vision. Thanks to its efficiency and open source nature, it is likely to hold a leading position in both academic and commercial use in the coming years. Matlab is also a very efficient image processing tool.

Having seen what OpenCV can do with images, the blogger deeply feels that software from big vendors, such as Photoshop, is really impressive: it greatly helps multimedia professionals and meets ordinary consumers' need to edit pictures.

Current situation and personal outlook

In recent years, the craze for photo processing (such as Mito in China, which adds filters to photos) has shifted to face recognition. On deeper analysis, it is the spread and progress of mobile devices (face unlocking, face payment in mobile payment software, password locks), together with the improvement of public facilities (such as the public security "eye" system and "face swiping" as a way of entering public places), that has gradually shortened the distance between people and image recognition, which is also a sign of technology illuminating life. However, this popularity brings some unavoidable problems, such as the exposure of personal privacy to electronic devices. Beyond requiring the relevant departments and companies to strengthen the protection of information, what we can do is take part in developing the related technologies: encrypting, compressing and storing image data, and improving the methods of image transmission, so that the public can enjoy the convenience while personal privacy is not leaked to or exploited by criminals. This is the direction we will study and apply in the future.

In the future, with the development of science and technology and the growing needs of the people, image recognition will be applied not only to face recognition, safe driving, motion tracking and motion recognition, but also to machine vision, image structuring and analysis, and human-computer interaction (yes, that is TNT, a product of the future). Each step requires continuous research, revision and optimization by researchers and practitioners. Reading several related papers in IEEE publications shows that current research on image processing has already gone deep, in areas such as compression, encoding, filtering, and encryption.

(I highly recommend UESTC professor Zeng Bing, who has published many image processing papers in IEEE journals. Articles such as Directional Discrete Cosine Transforms for Image Coding and Optimal median-type filtering under structural constraints are excellent research results.)

If I want to contribute to this area, it will take a great deal of effort.

Writing this article today gives me only a preliminary understanding of this subject. It is also a vision for this subject and, what is more, an encouragement to myself. I hope that, with the support of these powerful tools and in the light of the excellent and brilliant examples set by this discipline, I can study, apply and practice.

Reference material:

1. Robert Laganiere, 2015, OpenCV Computer Vision Application Programming Cookbook
2. Bing Zeng, 1995, Optimal median-type filtering under structural constraints
3. Bing Zeng, 2008, Directional Discrete Cosine Transforms: A New Framework for Image Coding