Object Detection Study Notes: Feature Extraction with HOG

License: CC 4.0 BY-SA

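This article describes the HOG (Histogram of Oriented Gradients) feature extraction method from computer vision in detail, covering image preprocessing, gradient computation, cell definition, histogram construction, block sliding and normalization, and its application to pedestrian detection. It aims to give a deeper understanding of the role HOG features play in object detection.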

===================================================================================================================

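I am a beginner in both computer vision research and programming. Last semester I read "Histograms of Oriented Gradients for Human Detection". The paper is not long and I seemed to follow it at the time, but looking back, my understanding was quite shallow. I am starting this blog to record what I learn and to document my time as a graduate student. There are no new insights here; this is mostly my own organized summary of the material.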

==========================================================================================================================================================

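Overview: The Histogram of Oriented Gradients (HOG) feature is a feature descriptor used for object detection in computer vision and image processing. It builds features by computing and accumulating histograms of gradient orientations over local regions of an image. HOG features combined with an SVM classifier have been widely used in image recognition and have been especially successful in pedestrian detection. Note that the HOG+SVM approach to pedestrian detection was proposed by Dalal and Triggs (INRIA, France) at CVPR 2005. Many pedestrian detection algorithms have been proposed since, but many of them still build on the HOG+SVM idea.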

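Rough procedure:

HOG feature extraction takes an image (the target you want to detect, or a scanning window) and:

1) Converts it to grayscale (treating the image as a three-dimensional surface over x and y, with z as the gray level);

2) Normalizes the color space of the input image with gamma correction. The aim is to adjust the contrast, reduce the effect of local shadows and illumination changes, and suppress noise;

3) Computes the gradient (magnitude and orientation) at every pixel, mainly to capture contour information while further weakening the influence of illumination;

4) Divides the image into small cells (for example 6*6 pixels per cell);

5) Builds a histogram of gradient orientations for each cell (how much gradient falls into each orientation bin, typically weighted by gradient magnitude), which forms that cell's descriptor;

6) Groups several cells into a block (for example 3*3 cells per block). Concatenating the descriptors of all cells in a block and normalizing the result gives that block's HOG descriptor;

7) Concatenates the HOG descriptors of all blocks in the image to obtain the HOG descriptor of the image (the target you want to detect). This is the final feature vector passed to the classifier.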
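The steps above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the paper's or any library's implementation: the function names are hypothetical, the image is a list of rows of gray values in [0, 255], gamma is taken as 0.5 (square-root compression), border pixels are replicated when differencing, votes go to a single bin (no interpolation), and blocks are L2-normalized. It uses 8*8 cells and 9 unsigned bins rather than the 6*6 example in the list.

```python
import math

def gamma_correct(gray, gamma=0.5):
    """Power-law (gamma) compression of gray values in [0, 255] (assumed gamma=0.5)."""
    return [[255.0 * (p / 255.0) ** gamma for p in row] for row in gray]

def gradients(img):
    """Centered [-1, 0, 1] differences; returns magnitude and unsigned angle in degrees."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Replicate border pixels so every pixel has two neighbours (an assumption).
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang

def cell_histogram(mag, ang, top, left, cell=8, nbins=9):
    """Magnitude-weighted orientation histogram of one cell (hard binning)."""
    hist = [0.0] * nbins
    bin_width = 180.0 / nbins
    for y in range(top, top + cell):
        for x in range(left, left + cell):
            b = int(ang[y][x] // bin_width) % nbins
            hist[b] += mag[y][x]
    return hist

def block_descriptor(cell_hists, eps=1e-6):
    """Concatenate the cell histograms of one block and L2-normalize the result."""
    v = [value for hist in cell_hists for value in hist]
    norm = math.sqrt(sum(value * value for value in v) + eps * eps)
    return [value / norm for value in v]

# Usage: a 16*16 image with a vertical step edge, treated as one 2*2-cell block.
image = [[0.0] * 8 + [255.0] * 8 for _ in range(16)]
mag, ang = gradients(gamma_correct(image))
hists = [cell_histogram(mag, ang, top, left) for top in (0, 8) for left in (0, 8)]
desc = block_descriptor(hists)
print(len(desc))  # 36 values: 4 cells * 9 bins
```

For a full detection window, the same block computation would be repeated at every block position and the results concatenated, as in step 7.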

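My understanding: after the image is converted to grayscale, it is divided into small connected regions called cells, and a histogram of gradient (or edge) orientations is collected over the pixels of each cell. In the paper, the detection window is 128*64 pixels (height * width), each block is 16*16 pixels and is divided into 2*2 cells, each cell is 8*8 pixels, and the block slides with a stride of 8 pixels. A detection window therefore contains [(128-16)/8+1]*[(64-16)/8+1] = 15*7 = 105 blocks. Each block has 4 cells and each cell has 9 bins, so the HOG descriptor has 105*4*9 = 3780 dimensions.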
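The arithmetic above can be checked with a few lines of Python. The function name and parameter names are hypothetical; the default values are the ones quoted from the paper.

```python
def hog_descriptor_length(win_h=128, win_w=64, block=16, stride=8, cell=8, nbins=9):
    """Return (blocks along height, blocks along width, descriptor length)."""
    blocks_y = (win_h - block) // stride + 1
    blocks_x = (win_w - block) // stride + 1
    cells_per_block = (block // cell) ** 2
    return blocks_y, blocks_x, blocks_y * blocks_x * cells_per_block * nbins

print(hog_descriptor_length())  # (15, 7, 3780)
```

Changing the stride or block size shows how quickly the descriptor length grows when blocks overlap more heavily.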

Figure: a block divided into cells

Figure: a cell divided into 9 bins by orientation
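As a rough illustration of how one pixel could vote into the 9 orientation bins, the sketch below splits a pixel's gradient magnitude linearly between the two nearest bin centers. It covers only the orientation axis and is one common way to soften bin boundaries, assumed here for illustration; the function name is hypothetical, and the bins are assumed to span 0 to 180 degrees with centers at 10, 30, ..., 170.

```python
import math

def vote_orientation(angle_deg, magnitude, nbins=9, span=180.0):
    """Split one pixel's magnitude between the two nearest orientation bins."""
    bin_width = span / nbins
    # Bin centers sit at (i + 0.5) * bin_width, so shift by half a bin first.
    pos = (angle_deg % span) / bin_width - 0.5
    lower = math.floor(pos)
    frac = pos - lower
    hist = [0.0] * nbins
    hist[lower % nbins] += magnitude * (1.0 - frac)
    hist[(lower + 1) % nbins] += magnitude * frac  # wraps around for unsigned angles
    return hist

# An angle of 20 degrees lies halfway between the centers of bins 0 and 1.
print(vote_orientation(20.0, 2.0)[:2])  # [1.0, 1.0]
```

An angle exactly at a bin center (for example 10 degrees) puts the whole magnitude into that bin, while 0 degrees is shared between the last bin and the first.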

Points I have not yet understood:

Trilinear interpolation?

MATLAB source: see timeHandle's blog

Reposted from: http://www.zhizhihu.com/html/y2010/1690.html

Computing Histograms of Oriented Gradients (HOG) features in MATLAB

   
 function F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,...  
 nthet, overlap, isglobalinterpolate, issigned, normmethod)  
 % HOGCALCULATOR calculate R-HOG feature vector of an input image using the  
 % procedure presented in Dalal and Triggs's paper in CVPR 2005.  
 %  
 % Author: timeHandle  
 % Time: March 24, 2010  
 % May 12, 2010 update.
 %  
 % this copy of code is written for my personal interest, which is an   
 % original and inornate realization of [Dalal CVPR2005]'s algorithm  
 % without any optimization. I just want to check whether I understand  
 % the algorithm really or not, and also do some practices for knowing  
 % matlab programming more well because I could be called as 'novice'.   
 % OpenCV 2.0 has realized Dalal's HOG algorithm which runs faster  
 % than mine without any doubt, ╮(╯▽╰)╭ . Ronan pointed out an error in
 % the code, thanks for his correction. Note that at the end of this
 % code, there is some demonstration code; please remove it in your work.
   
 %   
 % F = hogcalculator(img, cellpw, cellph, nblockw, nblockh,  
 % nthet, overlap, isglobalinterpolate, issigned, normmethod)  
 %  
 % IMG:  
 % IMG is the input image.  
 %  
 % CELLPW, CELLPH:  
 % CELLPW and CELLPH are cell's pixel width and height respectively.  
 %  
 % NBLOCKW, NBLOCKH:
 % NBLOCKW and NBLOCKH are block size counted by cells number in x and
 % y directions respectively.  
 %  
 % NTHET, ISSIGNED:  
 % NTHET is the number of the bins of the histogram of oriented  
 % gradient. The histogram of oriented gradient ranges from 0 to pi in  
 % 'unsigned' condition while to 2*pi in 'signed' condition, which can  
 % be specified through setting the value of the variable ISSIGNED by  
 % the string 'unsigned' or 'signed'.  
 %  
 % OVERLAP:  
 % OVERLAP is the overlap proportion of two neighboring block.  
 %  
 % ISGLOBALINTERPOLATE:  
 % ISGLOBALINTERPOLATE specifies whether the trilinear interpolation  
 % is done in a single global 3d histogram of the whole detecting  
 % window by the string 'globalinterpolate' or in each local 3d  
 % histogram corresponding to respective blocks by the string  
 % 'localinterpolate' which is in strict accordance with the procedure  
 % proposed in Dalal's paper. Interpolating in the whole detecting  
 % window requires the block's sliding step to be an integral multiple  
