Chapter 5: Grasping with IL and RL

Latest recommended article published on 2025-06-11 19:37:00

明码

Category: Robotic Grasping and Manipulation. Tags: robotics, machine learning, python

Article link: https://blog.youkuaiyun.com/qq_37087723/article/details/145120536

Finally updated; I have been really busy.

Imitation Learning

5.1 Recap

So far we have covered:

Traditional robotics methods
- Kinematics, planning, control

Pose estimation and camera geometry
- 6DoF pose estimation for instance-level, category-level, and unknown objects
- Camera calibration, 2D-to-3D projection
- Obtaining geometric information about the object and environment

Grasping
- Traditional grasping pipeline
- Learning-based grasping
  - 2D planar, 6DoF, multi-finger grasping

5.2 Intro to Imitation Learning

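In robotics, learning from demonstration (LfD), programming by demonstration (PbD), and imitation learning (IL) are closely related paradigms. All of them involve enabling a robot to learn a task, or to derive a controller, from human demonstrations. The terms are often used interchangeably, but depending on context they can differ subtly in emphasis or scope.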

Well-known works include ACT (Action Chunking with Transformers) and DP (Diffusion Policy).

• Given: expert demonstration data, or demonstrator
• Goal: learn a policy that mimics the expert

Learn a control policy from a dataset
Autonomous driving:
• Input: vector space, image, or BEV (bird's-eye view)
• Output: trajectory or control output
Robotic manipulation (ACT):
• Input: image, joint state
• Output: sequence of joint commands
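
The manipulation input/output contract above can be sketched as a small interface. This is only an illustration under stated assumptions: `Observation`, `HoldPositionPolicy`, and `chunk_size` are hypothetical names, the image is a plain nested list, and the "policy" is a stub that repeats the current joint state rather than a trained network.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    image: List[List[float]]   # H x W pixel grid (hypothetical format)
    joint_state: List[float]   # current joint positions, one value per joint


class HoldPositionPolicy:
    """Stub standing in for a learned network: observation -> chunk of joint commands."""

    def __init__(self, chunk_size: int):
        self.chunk_size = chunk_size

    def act(self, obs: Observation) -> List[List[float]]:
        # A trained model would use obs.image here. This stub ignores it and
        # repeats the current joint state for every step of the chunk.
        return [list(obs.joint_state) for _ in range(self.chunk_size)]


obs = Observation(image=[[0.0] * 4 for _ in range(3)], joint_state=[0.1, -0.2, 0.3])
policy = HoldPositionPolicy(chunk_size=5)
chunk = policy.act(obs)
print(len(chunk), len(chunk[0]))  # number of commands, joints per command
```

The point of the sketch is the shape of the output: a single policy call returns several future joint commands, not just one.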

ChauffeurNet

Learning Objective:

Behavior Cloning
• Simple supervised learning formulation
• Requires expert data: (s, a) state-action pairs
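
As a minimal sketch of behavior cloning as supervised learning, the following fits a linear policy to (s, a) pairs by gradient descent on the mean squared error. Everything here is an assumption for illustration: the 1-D state, the toy expert `expert_action` (a = -2 * s), the linear policy class, and the hyperparameters.

```python
# Hypothetical expert that pushes a 1-D state toward zero.
def expert_action(s: float) -> float:
    return -2.0 * s


# Expert demonstrations: a list of (state, action) pairs.
states = [-1.0, -0.5, 0.0, 0.5, 1.0]
dataset = [(s, expert_action(s)) for s in states]


def train_bc(dataset, lr=0.1, epochs=500):
    """Fit a linear policy a = w * s + b by gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(dataset)
    for _ in range(epochs):
        # Gradients of mean((w * s + b - a)^2) with respect to w and b.
        grad_w = sum(2.0 * (w * s + b - a) * s for s, a in dataset) / n
        grad_b = sum(2.0 * (w * s + b - a) for s, a in dataset) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


w, b = train_bc(dataset)
print(f"learned policy: a = {w:.3f} * s + {b:.3f}")
```

Training only ever sees states from the expert's data. Nothing in the loss says what to do in states the expert never visited, which is what the interactive methods try to address.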
Interactive Imitation Learning
• The expert interactively helps generate the policy and the data
• Requires an expert policy that can be queried (in addition to the data)
• Example: DAgger (Dataset Aggregation)
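
The DAgger loop can be sketched roughly as follows: roll out the current learner, ask the expert to label the states the learner visited, aggregate, and refit. This is a simplified sketch, not the full algorithm: it omits the expert/learner mixing schedule of the original method, and the dynamics `step`, the expert `expert_action`, and the linear policy a = w * s are toy assumptions. In this toy problem the expert is exactly representable, so plain behavior cloning would already recover it; the code is only meant to show the structure of the loop.

```python
def expert_action(s: float) -> float:
    """Hypothetical expert policy that can be queried in any state."""
    return -2.0 * s


def step(s: float, a: float, dt: float = 0.1) -> float:
    """Toy 1-D dynamics (an assumption for illustration): s' = s + dt * a."""
    return s + dt * a


def fit(dataset):
    """Least-squares fit of a linear policy a = w * s on the aggregated data."""
    num = sum(s * a for s, a in dataset)
    den = sum(s * s for s, _ in dataset)
    return num / den


def rollout(policy, s0: float, horizon: int):
    """Run a policy from s0 and return the states it visits."""
    states, s = [], s0
    for _ in range(horizon):
        states.append(s)
        s = step(s, policy(s))
    return states


def dagger(n_iters: int = 3, s0: float = 1.0, horizon: int = 10):
    # Iteration 0: collect data by rolling out the expert itself.
    dataset = [(s, expert_action(s)) for s in rollout(expert_action, s0, horizon)]
    w = fit(dataset)
    for _ in range(n_iters):
        # Visit states under the learner's current policy.
        visited = rollout(lambda s: w * s, s0, horizon)
        # Ask the expert what it would have done there, then aggregate and refit.
        dataset += [(s, expert_action(s)) for s in visited]
        w = fit(dataset)
    return w, dataset


w, dataset = dagger()
print(f"w = {w:.3f}, dataset size = {len(dataset)}")
```

The difference from behavior cloning is where the states come from: the labels are still the expert's, but the states are drawn from the learner's own rollouts.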
Inverse Reinforcement Learning
• RL formulation
• Reward learning
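
One common way to state the reward-learning problem, given here as a sketch and not necessarily the formulation used in the lecture, is to look for a reward under which the expert policy does at least as well as any other policy. The symbols \pi_E (expert policy), \gamma (discount factor), and r (reward function) are notation introduced for this illustration.

```latex
\text{find } r \ \text{ such that } \quad
\mathbb{E}_{\pi_E}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Big]
\;\ge\;
\mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\Big]
\quad \text{for all policies } \pi
```

The condition alone is ill-posed, since r = 0 satisfies it trivially, so practical methods add further criteria such as a margin or an entropy term. A policy is then obtained by running RL on the learned reward.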