Paper Notes (9): Utilizing the Instability in Weakly Supervised Object Detection

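This paper analyzes the instability of MIL-based detectors in weakly supervised object detection and proposes fusing the results of differently initialized detectors to improve detection performance. It introduces a multi-branch framework, an online fusion strategy (SCS), and an orthogonal initialization method, aiming to exploit the instability for an effect similar to ensemble learning and thereby raise detection accuracy.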

---------------------------------divider--------------------------------
I'm back
---------------------------------divider--------------------------------
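This paper is from CVPR 2019 and is yet another weakly supervised detection paper. It comes from the same author as C-MIDN at ICCV (I will add a blog post on that one later), and the ideas are similar: both fuse multiple instance detectors to get better results. Compared with C-MIDN, this paper is simpler to implement (it does not use semantic segmentation information), and its principle is easier to understand.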

The Instability of MIDN

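[Figure]
In experiments, the authors found that initializing the same MIDN model differently changes its detection results considerably. As the figure above shows, even for an image on which the current MIDN is stuck in a local optimum, a different initialization can detect the complete object correctly. The authors call this the instability of MIDN. To analyze it quantitatively, they define a metric, the inconsistent detection rate (IDR), which measures how inconsistent the results of two detectors are.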
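Inconsistent detection rate (IDR): for an image, if the IoU between the two detectors' top-scoring boxes for a given class is below 0.5, the two results are considered inconsistent for that class on that image. The IDR is the fraction of such inconsistent results for that class over the test set: $$IDR_c=\frac{|\{I_k^c \text{ where } \mathrm{IoU}(b^c_{1,k},b^c_{2,k})<0.5\}|}{|\{I_k^c\}|}$$ Here $b^c_{1,k}$ and $b^c_{2,k}$ are the top-scoring boxes for class $c$ that the two detectors produce on image $I_k^c$.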
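To make the IDR definition concrete, here is a minimal sketch of how it could be computed for a single class. This is not the authors' code: the function names `iou` and `inconsistent_detection_rate`, the `(x1, y1, x2, y2)` box format, and the dictionary layout (image id mapped to that detector's top-scoring box for the class) are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def inconsistent_detection_rate(
    top1: Dict[str, Box], top2: Dict[str, Box], thresh: float = 0.5
) -> float:
    """IDR for one class.

    top1 / top2 map an image id to the top-scoring box of that class
    from detector 1 / detector 2 (hypothetical layout). Only images
    present in both dictionaries are counted.
    """
    image_ids = top1.keys() & top2.keys()
    if not image_ids:
        return 0.0
    # An image is inconsistent when the two top boxes overlap by IoU < thresh.
    inconsistent = sum(1 for k in image_ids if iou(top1[k], top2[k]) < thresh)
    return inconsistent / len(image_ids)


# Toy example: the detectors agree on img_a and disagree on img_b.
det1 = {"img_a": (0, 0, 10, 10), "img_b": (0, 0, 10, 10)}
det2 = {"img_a": (0, 0, 10, 10), "img_b": (5, 5, 15, 15)}
idr = inconsistent_detection_rate(det1, det2)
print(idr)
```

In the toy example the boxes on `img_b` have an IoU of 25/175, which is below 0.5, so one of the two images is inconsistent. A higher IDR between two differently initialized detectors means they disagree more often about where the object is.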
