Object Detection
Output:
- category label from fixed, known set of categories
- bounding box (x, y, width, height)
If only one object needs to be detected -> add an FC layer that outputs the 4 box coordinates to a network pretrained on ImageNet
Sliding Window
apply a CNN to many different crops of the image; the CNN classifies each crop as object / background
but there are far too many possible windows, and the same object may be detected repeatedly
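A rough count makes the "too many windows" problem concrete. The sketch below is only illustrative: the function names `count_windows` and `count_all_windows` and the `stride` parameter are assumptions, not part of the original notes.

```python
def count_windows(img_h, img_w, win_h, win_w, stride=1):
    """Number of positions a win_h x win_w window can take inside an img_h x img_w image."""
    if win_h > img_h or win_w > img_w:
        return 0
    rows = (img_h - win_h) // stride + 1
    cols = (img_w - win_w) // stride + 1
    return rows * cols


def count_all_windows(img_h, img_w, stride=1):
    """Total number of windows over every possible window size (brute-force count)."""
    total = 0
    for win_h in range(1, img_h + 1):
        for win_w in range(1, img_w + 1):
            total += count_windows(img_h, img_w, win_h, win_w, stride)
    return total
```

With stride 1 the total grows as H(H+1)/2 * W(W+1)/2, so running a CNN on every window of a realistic image is infeasible.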
we need region proposals to find a small set of boxes that are likely to cover all the objects
“Selective Search” generates about 2000 region proposals quickly
R-CNN : Region-Based CNN
- Region proposals
- warp each proposed region to a fixed size of 224x224
- forward each region through ConvNet independently
- output a classification score and also a Bbox of 4 numbers, using the following algorithm
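The crop-and-warp step of the pipeline above can be sketched with NumPy alone. This is a simplified illustration, not the R-CNN implementation: it assumes boxes are integer `(x, y, width, height)` tuples that lie fully inside the image, and it uses nearest-neighbour sampling where a real pipeline would normally use a proper image-resize routine. The names `warp_region` and `warp_proposals` are hypothetical.

```python
import numpy as np


def warp_region(image, box, out_size=224):
    """Crop box = (x, y, width, height) from an (H, W, C) image and
    resize it to out_size x out_size with nearest-neighbour sampling."""
    x, y, w, h = box
    crop = image[y:y + h, x:x + w]
    # Map each output row/column back to a source row/column in the crop.
    rows = np.arange(out_size) * h // out_size
    cols = np.arange(out_size) * w // out_size
    return crop[rows][:, cols]


def warp_proposals(image, boxes, out_size=224):
    """Stack all warped regions into one batch of shape (N, out_size, out_size, C)."""
    return np.stack([warp_region(image, b, out_size) for b in boxes])
```

Each slice of the resulting batch would then be forwarded through the ConvNet independently, which is why this approach is slow when there are about 2000 proposals per image.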
Measurement of boxes (IoU)
$$IoU = \frac{\text{Area of Intersection}}{\text{Area of Union}}$$
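The ratio of intersection area to union area can be computed directly for axis-aligned boxes. A minimal sketch, assuming boxes are given as `(x, y, width, height)` with `(x, y)` the top-left corner, as in the output format listed at the top of these notes; the function name `iou` is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle, clamped at 0 when the boxes are disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, so the IoU is 1 / (4 + 4 - 1) = 1/7. Identical boxes give 1 and disjoint boxes give 0.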