Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather (Translation)

This paper proposes a deep multimodal fusion method for object detection in adverse weather, built on an adaptive single-shot deep fusion architecture in which fusion is steered by measurement entropy. The paper introduces a new multimodal dataset covering a variety of weather conditions, including fog, snow, and rain, for training and validating the model. Existing fusion methods perform poorly in adverse weather, whereas the proposed model generalizes to unseen sensor distortions without requiring large amounts of labeled data. The method outperforms existing fusion and single-sensor approaches on real adverse-weather data, demonstrating its potential for real-time autonomous driving.

Title: Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather


Abstract:

The fusion of multimodal sensor streams, such as camera, lidar, and radar measurements, plays a critical role in object detection for autonomous vehicles, which base their decision making on these inputs. While existing methods exploit redundant information in good environmental conditions, they fail in adverse weather where the sensory streams can be asymmetrically distorted. These rare "edge-case" scenarios are not represented in available datasets, and existing fusion architectures are not designed to handle them. To address this challenge we present a novel multimodal dataset acquired in over 10,000 km of driving in northern Europe. Although this dataset is the first large multimodal dataset in adverse weather, with 100k labels for lidar, camera, radar, and gated NIR sensors, it does not facilitate training as extreme weather is rare. To this end, we present a deep fusion network for robust fusion without a large corpus of labeled training data covering all asymmetric distortions. Departing from proposal-level fusion, we propose a single-shot model that adaptively fuses features, driven by measurement entropy. We validate the proposed method, trained on clean data, on our extensive validation dataset. Code and data are available here: https://github.com/princeton-computational-imaging/SeeingThroughFog.


1. Introduction

Object detection is a fundamental computer vision problem in autonomous robots, including self-driving vehicles and autonomous drones. Such applications require 2D or 3D bounding boxes of scene objects in challenging real-world scenarios, including complex cluttered scenes, highly varying illumination, and adverse weather conditions. The most promising autonomous vehicle systems rely on redundant inputs from multiple sensor modalities [58, 6, 73], including camera, lidar, radar, and emerging sensors such as FIR [29]. A growing body of work on object detection using convolutional neural networks has enabled accurate 2D and 3D box estimation from such multimodal data, typically relying on camera and lidar data [64, 11, 56, 71, 66, 42, 35]. While these existing methods, and the autonomous systems that perform decision making on their outputs, perform well under normal imaging conditions, they fail in adverse weather and imaging conditions. This is because existing training datasets are biased towards clear weather conditions, and detector architectures are designed to rely only on the redundant information in the undistorted sensory streams. However, they are not designed for harsh scenarios that distort the sensor streams asymmetrically, see Figure 1. Extreme weather conditions are statistically rare. For example, thick fog is observable only during 0.01% of typical driving in North America, and even in foggy regions, dense fog with visibility below 50 m occurs only up to 15 times a year [61]. Figure 2 shows the distribution of real driving data acquired over four weeks in Sweden covering 10,000 km driven in winter conditions. The naturally biased distribution validates that harsh weather scenarios are only rarely, or not at all, represented in available datasets [65, 19, 58]. Unfortunately, domain adaptation methods [44, 28, 41] also do not offer an ad-hoc solution, as they require target samples, and adverse weather-distorted data are underrepresented in general. Moreover, existing methods are limited to image data and do not extend to multisensor data, e.g. lidar point-cloud data.



Existing fusion methods have been proposed mostly for lidar-camera setups [64, 11, 42, 35, 12], as a result of the limited sensor inputs in existing training datasets [65, 19, 58]. These methods struggle with sensor distortions in adverse weather not only because of the bias of the training data: either they perform late fusion through filtering after independently processing the individual sensor streams [12], or they fuse proposals [35] or high-level feature vectors [64]. The network architecture of these approaches is designed with the assumption that the data streams are consistent and redundant, i.e. an object appearing in one sensory stream also appears in the other. However, in harsh weather conditions, such as fog, rain, snow, or extreme lighting conditions, including low-light or low-reflectance objects, multimodal sensor configurations can fail asymmetrically. For example, conventional RGB cameras provide unreliable noisy measurements in low-light scene areas, while scanning lidar sensors provide reliable depth using active illumination. In rain and snow, small particles affect the color image and lidar depth estimates equally through backscatter. Conversely, in foggy or snowy conditions, state-of-the-art pulsed lidar systems are restricted to less than 20 m range due to backscatter, see Figure 3. While relying on lidar measurements might be a solution for night driving, it is not for adverse weather conditions.
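
To make the notion of measurement entropy used later in this paper concrete, the following is a minimal sketch (an illustration added here, not code from the paper's repository; the function name measurement_entropy and the example values are hypothetical). A fog-washed camera frame or a backscatter-truncated lidar depth map concentrates its values in a few histogram bins, so its Shannon entropy drops relative to a clean measurement and can serve as a per-sensor reliability cue.

```python
import numpy as np

def measurement_entropy(measurement: np.ndarray, num_bins: int = 256) -> float:
    """Shannon entropy (in bits) of an 8-bit sensor measurement.

    A heavily fog-washed camera frame or a backscatter-truncated lidar
    depth map concentrates its values in a few bins, so its entropy drops
    relative to a clean measurement.
    """
    hist, _ = np.histogram(measurement, bins=num_bins, range=(0, 255))
    p = hist.astype(np.float64) / max(hist.sum(), 1)
    p = p[p > 0]  # ignore empty bins; 0 * log2(0) is taken as 0
    return float(-(p * np.log2(p)).sum())

# Hypothetical usage: a clean frame vs. a fog-washed frame whose values are
# compressed toward white, which lowers contrast and therefore entropy.
clean = np.random.randint(0, 256, size=(512, 1024), dtype=np.uint8)
foggy = np.clip(clean * 0.2 + 200, 0, 255).astype(np.uint8)
print(measurement_entropy(clean), measurement_entropy(foggy))  # foggy value is lower
```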


In this work, we propose a multimodal fusion method for object detection in adverse weather, including fog, snow, and harsh rain, without having large annotated training datasets available for these scenarios. Specifically, we handle asymmetric measurement corruptions in camera, lidar, radar, and gated NIR sensor streams by departing from existing proposal-level fusion methods: we propose an adaptive single-shot deep fusion architecture which exchanges features in intertwined feature extractor blocks. This deep early fusion is steered by measured entropy. The proposed adaptive fusion allows us to learn models that generalize across scenarios. To validate our approach, we address the bias in existing datasets by introducing a novel multimodal dataset acquired during three months of acquisition in northern Europe. This dataset is the first large multimodal driving dataset in adverse weather, with 100k labels for lidar, camera, radar, gated NIR sensor, and FIR sensor. Although the weather bias still prohibits training, this data allows us to validate that the proposed method, while being trained on clean data, generalizes robustly to unseen weather conditions with asymmetric sensor corruptions. Specifically, we make the following contributions:
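
As a rough illustration of the entropy-steered adaptive fusion idea described above, the following PyTorch sketch (hypothetical module and layer names, not the authors' released implementation) scales each sensor branch's feature maps by a weight derived from its measurement entropy before mixing the branches, so that a distorted, low-entropy stream contributes less to the fused representation.

```python
import torch
import torch.nn as nn

class EntropySteeredFusion(nn.Module):
    """Toy entropy-steered feature fusion for two sensor branches.

    A simplified sketch of weighting per-sensor feature maps by a scalar
    derived from measurement entropy; the actual architecture exchanges
    features repeatedly across several intertwined extractor blocks.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.cam_block = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.lidar_block = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, cam_feat, lidar_feat, cam_entropy, lidar_entropy):
        # Normalize the two per-sample entropy scalars so they sum to one.
        w = torch.softmax(torch.stack([cam_entropy, lidar_entropy], dim=1), dim=1)
        cam = self.cam_block(cam_feat) * w[:, 0].view(-1, 1, 1, 1)
        lid = self.lidar_block(lidar_feat) * w[:, 1].view(-1, 1, 1, 1)
        # Concatenate the entropy-weighted branches and mix them.
        return self.fuse(torch.cat([cam, lid], dim=1))

# Hypothetical usage with batch size 2 and 64 feature channels.
layer = EntropySteeredFusion(channels=64)
cam_feat = torch.randn(2, 64, 32, 64)
lidar_feat = torch.randn(2, 64, 32, 64)
cam_entropy = torch.tensor([7.5, 2.1])    # e.g. clean vs. fog-washed camera frame
lidar_entropy = torch.tensor([6.8, 6.5])
fused = layer(cam_feat, lidar_feat, cam_entropy, lidar_entropy)
print(fused.shape)  # torch.Size([2, 64, 32, 64])
```

In the paper's architecture, such an exchange is applied across several intertwined feature extractor blocks rather than in a single layer; the sketch above only conveys the weighting idea.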

