Today 自动驾驶之心 brings you a roundup of ICCV 2023 papers and the conference program. To everyone already in Paris: do your best and make it count! We wonder who will take home this year's best paper award.
Editor | 自动驾驶之心
ICCV 2023 accepted roughly 2,100 papers, and the in-person conference will be held October 2-6 at the Paris Convention Centre in Paris, France. Many of you have already arrived in Paris, and along the way it has been great to see so many orals and posters. Who will win this year's best paper? Today we take stock of the work at ICCV 2023.
First, let's look at the orals:
(1) Multi-view 3D Vision
Mainly NeRF, SLAM, and reconstruction. NeRF is advancing rapidly, and the autonomous driving industry is starting to adopt it broadly.



(2) Vision && Computer Graphics && Robotics
Graphics and Robotics go without saying; there are also many more papers on privacy, security, interpretability, and datasets.


(3) Recognition && Segmentation && Generative Models
Segment Anything made an appearance, and it looks great. Open-vocabulary segmentation, video instance segmentation, and the like continue to see steady submissions, while generative AI diffusion models occupy quite a few slots, a clear area of momentum.


(4) 3D Modeling && Autonomous Driving
Mainly image-based depth estimation and 3D reconstruction, plus some 3D lane detection, end-to-end autonomous driving, and low-level vision. I remain bullish on end-to-end!

(5) Few-shot && Transfer Learning && Semi-supervised && Continual Learning
A long-running track and a genuine necessity for industry, since sample annotation is so expensive!

There is also a batch of vision-language work; with large models taking off, this year is full of related papers.

(6) Machine Learning and Datasets

There are many more posters, which we will not list one by one. For the complete information, reply "ICCV2023" in the backend of the WeChat account 【自动驾驶Daily】 to get links to the ICCV 2023 papers and orals, the poster schedule, and the workshop and tutorial links!
自动驾驶之心 has also been continuously compiling excellent ICCV 2023 papers; for details, see:
https://github.com/autodriving-heart/ICCV2023-Papers-autonomous-driving
Here is another summary, covering 3D object detection, BEV, collaborative perception, semantic segmentation, point clouds, SLAM, large models, NeRF, end-to-end driving, multi-modal fusion, and more.
1) Occupancy Perception
SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving
OccNet: Scene as Occupancy
OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction
OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception
2) End-to-end Autonomous Driving
VAD: Vectorized Scene Representation for Efficient Autonomous Driving
3) Collaborative Perception
Among Us: Adversarially Robust Collaborative Perception by Consensus
HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative Perception with Vision Transformer
Optimizing the Placement of Roadside LiDARs for Autonomous Driving
UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework
4) 3D Object Detection
PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images
StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
Cross Modal Transformer: Towards Fast and Robust 3D Object Detection
DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection
SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection
MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation
Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction
Revisiting Domain-Adaptive 3D Object Detection by Reliable, Diverse and Class-balanced Pseudo-Labeling
Learning from Noisy Data for Semi-Supervised 3D Object Detection
SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection
5) Semantic Segmentation
Rethinking Range View Representation for LiDAR Segmentation
UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
Segment Anything
MARS: Model-agnostic Biased Object Removal without Additional Supervision for Weakly-Supervised Semantic Segmentation
Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation
Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation
DVIS: Decoupled Video Instance Segmentation Framework
Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation
6) Point Cloud Perception
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
Implicit Autoencoder for Point Cloud Self-supervised Representation Learning
P2C: Self-Supervised Point Cloud Completion from Single Partial Clouds
CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training
SVDFormer: Complementing Point Cloud via Self-view Augmentation and Self-structure Dual-generator
AdaptPoint: Sample-adaptive Augmentation for Point Cloud Recognition Against Real-world Corruptions
RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration
CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation
Density-invariant Features for Distant Point Cloud Registration
AutoSynth: Learning to Generate 3D Training Data for Object Point Cloud Registration
SVQNet: Sparse Voxel-Adjacent Query Network for 4D Spatio-Temporal LiDAR Semantic Segmentation
DELFlow: Dense Efficient Learning of Scene Flow for Large-Scale Point Clouds
7) Object Tracking
PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework
Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers
ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking
Multiple Planar Object Tracking
3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking
MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors
Exploring Lightweight Hierarchical Vision Transformers for Efficient Visual Tracking
Robust Object Modeling for Visual Tracking
8) 2D Object Detection
AlignDet: Aligning Pre-training and Fine-tuning in Object Detection
Cascade-DETR: Delving into High-Quality Universal Object Detection
9) Trajectory Prediction
EigenTrajectory: Low-Rank Descriptors for Multi-Modal Trajectory Forecasting
Fast Inference and Update of Probabilistic Density Estimation on Trajectory Prediction
10)NeRF
IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis
SceneRF: Self-Supervised Monocular 3D Scene Reconstruction with Radiance Fields
CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields
Tri-MipRF: Tri-Mip Representation for Efficient Anti-Aliasing Neural Radiance Fields
Lighting up NeRF via Unsupervised Decomposition and Enhancement
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
Seal-3D: Interactive Pixel-Level Editing for Neural Radiance Fields
11) Depth Estimation
Constraining Depth Map Geometry for Multi-View Stereo: A Dual-Depth Approach with Saddle-shaped Depth Cells
Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering
AGG-Net: Attention Guided Gated-convolutional Network for Depth Image Completion
Self-Supervised Monocular Depth Estimation by Direction-aware Cumulative Convolution Network
12) Local HD Maps
PivotNet: Vectorized Pivot Learning for End-to-end HD Map Construction
13) Lane Detection
LATR: 3D Lane Detection from Monocular Images with Transformer
14) Vision-Language Models
Distribution-Aware Prompt Tuning for Vision-Language Models
15) Others
Bridging Cross-task Protocol Inconsistency for Distillation in Dense Object Detection
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation
Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with Masked Autoencoders
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation
① Exclusive Video Courses
Learning videos covering BEV perception, millimeter-wave radar and vision fusion, multi-sensor calibration, multi-sensor fusion, multi-modal 3D object detection, point cloud 3D object detection, object tracking, Occupancy, CUDA and TensorRT model deployment, collaborative perception, semantic segmentation, autonomous driving simulation, sensor deployment, planning and decision-making, trajectory prediction, and more (scan the QR code to start learning).

② China's First Autonomous Driving Learning Community
A community of nearly 2,000 members covering 30+ autonomous driving learning roadmaps. To learn more about autonomous driving perception (2D detection, segmentation, 2D/3D lane lines, BEV perception, 3D object detection, Occupancy, multi-sensor fusion, multi-sensor calibration, object tracking, optical flow estimation), localization and mapping (SLAM, HD maps, local online maps), planning and control / trajectory prediction, hands-on AI model deployment, industry news, and job postings, scan the QR code below to join the 自动驾驶之心 Knowledge Planet. It is a place with real substance, where you can discuss getting started, learning, work, and job-hopping with experts in the field; papers, code, and videos are shared daily. We look forward to talking with you!

③【自动驾驶之心】Technical Discussion Groups
自动驾驶之心 is the first autonomous driving developer community, focusing on object detection, semantic segmentation, panoptic segmentation, instance segmentation, keypoint detection, lane lines, object tracking, 3D object detection, BEV perception, multi-modal perception, Occupancy, multi-sensor fusion, transformers, large models, point cloud processing, end-to-end autonomous driving, SLAM, optical flow estimation, depth estimation, trajectory prediction, HD maps, NeRF, planning and control, model deployment, autonomous driving simulation and testing, product management, hardware configuration, AI job hunting, and more. Scan the code to add the assistant on WeChat for a group invitation; include the note: school/company + research direction + nickname (the fastest way to join).
④ The【自动驾驶之心】platform matrix: feel free to contact us!