Today 自动驾驶之心 brings you a roundup of ICCV 2023 papers and the conference program. To everyone already in Paris: do your best and make it count! We wonder who will take home this year's best paper award.
Editor | 自动驾驶之心
ICCV 2023 accepted roughly 2,100 papers, and the in-person conference will be held October 2-6 at the Paris Convention Centre in Paris, France. Many of you have already arrived in Paris, and along the way it has been great to see so many orals and posters. Who will win this year's best paper? Today we take stock of the work at ICCV 2023.
First, let's look at the orals:
(1) Multi-view 3D Vision
Mainly NeRF, SLAM, and reconstruction. NeRF is advancing rapidly, and the autonomous driving industry is starting to adopt it broadly.



(2) Vision && Computer Graphics && Robotics
Graphics and Robotics go without saying; there are also many more papers on privacy, security, interpretability, and datasets.


(3) Recognition && Segmentation && Generative Models
Segment Anything made an appearance, and it looks great. Open-vocabulary segmentation, video instance segmentation, and the like continue to see steady submissions, while generative AI diffusion models occupy quite a few slots, a clear area of momentum.


(4) 3D Modeling && Autonomous Driving
Mainly image-based depth estimation and 3D reconstruction, plus some 3D lane detection, end-to-end autonomous driving, and low-level vision. I remain bullish on end-to-end!

(5) Few-shot && Transfer Learning && Semi-supervised && Continual Learning
A long-running track and a genuine necessity for industry, since sample annotation is so expensive!

There is also a batch of vision-language work; with large models taking off, this year is full of related papers.

(6) Machine Learning and Datasets

There are many more posters, which we will not list one by one. For the complete information, reply "ICCV2023" in the backend of the WeChat account 【自动驾驶Daily】 to get links to the ICCV 2023 papers and orals, the poster schedule, and the workshop and tutorial links!
自动驾驶之心 has also been continuously compiling excellent ICCV 2023 papers; for details, see:
https://github.com/autodriving-heart/ICCV2023-Papers-autonomous-driving
Here is another summary, covering 3D object detection, BEV, collaborative perception, semantic segmentation, point clouds, SLAM, large models, NeRF, end-to-end driving, multi-modal fusion, and more.
1) Occupancy Perception
SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving
OccNet: Scene as Occupancy
OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction
OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception
2) End-to-end Autonomous Driving
VAD: Vectorized Scene Representation for Efficient Autonomous Driving
3) Collaborative Perception
Among Us: Adversarially Robust Collaborative Perception by Consensus
HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative Perception with Vision Transformer
Optimizing the Placement of Roadside LiDARs for Autonomous Driving
UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework
4) 3D Object Detection
PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images
StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
Cross Modal Transformer: Towards Fast and Robust 3D Object Detection
DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection
SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection
MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation
Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction
Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling
Learning from Noisy Data for Semi-Supervised 3D Object Detection
SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection
5) Semantic Segmentation
Rethinking Range View Representation for LiDAR Segmentation
UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
Segment Anything
MARS: Model-agnostic Biased Object Removal without Additional Supervision for Weakly-Supervised Semantic Segmentation
Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation
Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation
DVIS: Decoupled Video Instance Segmentation Framework
Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation
6) Point Cloud Perception
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
Implicit Autoencoder for Point Cloud Self-supervised Representation Learning
P2C: Self-Supervised Point Cloud Completion from Single Partial Clouds
CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training
SVDFormer: Complementing Point Cloud via Self-view Augmentation and Self-structure Dual-generator
AdaptPoint: Sample-adaptive Augmentation for Point Cloud Recognition Against Real-world Corruptions
RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration
CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation
Density-invariant Features for Distant Point Cloud Registration
AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud Registration
SVQNet: Sparse Voxel-Adjacent Query Network for 4D Spatio-Temporal LiDAR Semantic Segmentation
DELFlow: Dense Efficient Learning of Scene Flow for Large-Scale Point Clouds
7) Object Tracking
PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework
Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers
ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking
Multiple Planar Object Tracking
3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking
MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors
Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking
Robust Object Modeling for Visual Tracking
8) 2D Object Detection
AlignDet: Aligning Pre-training and Fine-tuning in Object Detection
Cascade-DETR: Delving into High-Quality Universal Object Detection
9) Trajectory Prediction
EigenTrajectory: Low-Rank Descriptors for Multi-Modal Trajectory Forecasting
Fast Inference and Update of Probabilistic Density Estimation on Trajectory Prediction
10)NeRF
IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis
SceneRF: Self-Supervised Monocular 3D Scene Reconstruction with Radiance Fields
CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields
Tri-MipRF: Tri-Mip Representation for Efficient Anti-Aliasing Neural Radiance Fields
Lighting up NeRF via Unsupervised Decomposition and Enhancement
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
Seal-3D: Interactive Pixel-Level Editing for Neural Radiance Fields
11) Depth Estimation
Constraining Depth Map Geometry for Multi-View Stereo: A Dual-Depth Approach with Saddle-shaped Depth Cells
Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering
AGG-Net: Attention Guided Gated-convolutional Network for Depth Image Completion
Self-Supervised Monocular Depth Estimation by Direction-aware Cumulative Convolution Network
12) Local HD Maps
PivotNet: Vectorized Pivot Learning for End-to-end HD Map Construction
13) Lane Detection
LATR: 3D Lane Detection from Monocular Images with Transformer
14) Vision-Language Models
Distribution-Aware Prompt Tuning for Vision-Language Models
15) Others
Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation
Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with Masked Autoencoders
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation
① Exclusive Video Courses
Learning videos covering BEV perception, millimeter-wave radar and vision fusion, multi-sensor calibration, multi-sensor fusion, multi-modal 3D object detection, point cloud 3D object detection, object tracking, Occupancy, CUDA and TensorRT model deployment, collaborative perception, semantic segmentation, autonomous driving simulation, sensor deployment, planning and decision-making, trajectory prediction, and more (scan the QR code to start learning).

② China's First Autonomous Driving Learning Community
A community of nearly 2,000 members covering 30+ autonomous driving learning roadmaps. To learn more about autonomous driving perception (2D detection, segmentation, 2D/3D lane lines, BEV perception, 3D object detection, Occupancy, multi-sensor fusion, multi-sensor calibration, object tracking, optical flow estimation), localization and mapping (SLAM, HD maps, local online maps), planning and control / trajectory prediction, hands-on AI model deployment, industry news, and job postings, scan the QR code below to join the 自动驾驶之心 Knowledge Planet. It is a place with real substance, where you can discuss getting started, learning, work, and job-hopping with experts in the field; papers, code, and videos are shared daily. We look forward to talking with you!

③【自动驾驶之心】Technical Discussion Groups
自动驾驶之心 is the first autonomous driving developer community, focusing on object detection, semantic segmentation, panoptic segmentation, instance segmentation, keypoint detection, lane lines, object tracking, 3D object detection, BEV perception, multi-modal perception, Occupancy, multi-sensor fusion, transformers, large models, point cloud processing, end-to-end autonomous driving, SLAM, optical flow estimation, depth estimation, trajectory prediction, HD maps, NeRF, planning and control, model deployment, autonomous driving simulation and testing, product management, hardware configuration, AI job hunting, and more. Scan the code to add the assistant on WeChat for a group invitation; include the note: school/company + research direction + nickname (the fastest way to join).
④ The【自动驾驶之心】platform matrix: feel free to contact us!