LLM在自动驾驶中的Survey

最新推荐文章于 2025-12-02 15:02:30 发布

转载最新推荐文章于 2025-12-02 15:02:30 发布 · 342 阅读

CC 4.0 BY-SA版权

原文链接：https://mp.weixin.qq.com/s?__biz=Mzg2NzUxNTU1OA==&mid=2247617668&idx=4&sn=da7f8d0b0e52f2fbc7fd5f73a79101f7&chksm=cf669e970ef60833ee0f0d9a901f0a36912b2c51f748a3f0e81d38d2cfdbee0369c2b801b4ca&scene=126&sessionid=0

文章标签：

#自动驾驶 #人工智能 #机器学习

作者 | erkang 编辑 | 自动驾驶之心

原文链接：https://zhuanlan.zhihu.com/p/707464289

点击下方卡片，关注“自动驾驶之心”公众号

戳我-> 领取自动驾驶近15个方向学习路线

>>点击进入→自动驾驶之心『大语言模型』技术交流群

本文只做学术分享，如有侵权，联系删文

做了粗略的LLM在自动驾驶中的survey

汇总了相关的paper list
方便快速了解信息，摘取了abstract，图-问题描述，图-方法overview
TODO：归纳分类 & 极简总结

整体感觉，目前的LLM for Autonomous Driving虽然很火，在三维世界理解下的reasoning还是大有可为，尤其是为端到端服务。

Sec-1: Paper list

[1] LMDrive: Closed-loop End-to-End Driving with Large language Models
[2] Driving Everywhere with Large Language Model Policy Adaptation
[3] ADAPT: Action-aware Driving Caption Transformer
[4] Prompting Multi-Modal Tokens to Enhance End-to-End Autonomous Driving Imitation Learning with LLMs
[5] DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experience
[6] Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Model
[7] DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model
[8] ADriver-I: A General World Model for Autonomous Driving
[9] DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
[10] Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving
[11] Embodied Understanding of Driving Scenarios
[12] Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models
[13] OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning
[14] Dolphins: Multimodal Language Model for Driving
[15] HiLM-D: Towards High-Resolution Understanding in Multimodal Large Language Models for Autonomous Driving
[16] Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving
[17] RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model
[18 ]AD-H: Autonomous Driving with Hierarchical Agents
[19] Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving
[20] LingoQA: Video Question Answering for Autonomous Driving
[21] 3D Dense Captioning beyond Nouns: A Middleware for Autonomous Driving
[22] Probing Multimodal LLMs as World Models for Driving
[23] LeGo-Drive: Language-enhanced Goal-oriented Closed-Loop End-to-End Autonomous Driving
[24] LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding｜
[25] DriveCoT: Integrating Chain-of-Thought Reasoning with End-to-End Driving

Sec-2: Paper汇总

包含Title, Venue, Year, Institute, Abstract, Fig-Problem-define, and Fig-Method-Overview

投稿作者为『自动驾驶之心知识星球』特邀嘉宾，欢迎加入交流！

① 全网独家视频课程

BEV感知、BEV模型部署、BEV目标跟踪、毫米波雷达视觉融合、多传感器标定、多传感器融合、多模态3D目标检测、车道线检测、轨迹预测、在线高精地图、世界模型、点云3D目标检测、目标跟踪、Occupancy、cuda与TensorRT模型部署、大模型与自动驾驶、Nerf、语义分割、自动驾驶仿真、传感器部署、决策规划、轨迹预测等多个方向学习视频（扫码即可学习）

网页端官网：www.zdjszx.com

② 国内首个自动驾驶学习社区

国内最大最专业，近3000人的交流社区，已得到大多数自动驾驶公司的认可！涉及30+自动驾驶技术栈学习路线，从0到一带你入门自动驾驶感知（2D/3D检测、语义分割、车道线、BEV感知、Occupancy、多传感器融合、多传感器标定、目标跟踪）、自动驾驶定位建图（SLAM、高精地图、局部在线地图）、自动驾驶规划控制/轨迹预测等领域技术方案、大模型、端到端等，更有行业动态和岗位发布！欢迎扫描下方二维码，加入自动驾驶之心知识星球，这是一个真正有干货的地方，与领域大佬交流入门、学习、工作、跳槽上的各类难题，日常分享论文+代码+视频