Overview
Introduction
In the introduction, the authors point out problems with the current two-stage solution (estimated visual articulation models + robotic manipulation planners and controllers). They argue that the current handshaking point, namely the canonical visual articulation model representation (kinematic structure + part poses + joint parameters), is not the optimal choice, because under this representation the object's geometric features (e.g., edges, holes, bars) and semantic features (e.g., doors, handles) are ignored.
This motivates the paper's proposed "actionable visual representations": a more geometry-aware, interaction-aware, and task-aware perception-interaction handshaking point. Concretely, these take the form of Action Affordance and Trajectory Proposals, predicting per-point action possibilities and visual action trajectory proposals. Unlike Where2Act's task-agnostic, short-term predictions, this paper further augments the per-point action predictions with task-aware distributions of trajectory proposals.
The paper then describes several concrete uses of these actionable visual priors. The term "priors" is aptly chosen here.
Next, it introduces VAT-MART, the interaction-for-perception learning framework designed to acquire these actionable visual priors: a reinforcement learning policy collects successful interaction trajectories, while a perception network is trained simultaneously to generalize the learned knowledge across shapes.
Contributions: 1) defines a novel form of "actionable visual priors"; 2) designs the interaction-for-perception framework VAT-MART; 3) experimentally validates its effectiveness.
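A minimal, hypothetical sketch of this interaction-for-perception loop follows. All names (env, rl_policy, perception_net and their methods) are placeholders for illustration, not the repository's actual API:

```python
# Hypothetical sketch of the interaction-for-perception loop; all names
# are placeholders, not the repository's actual API.

def interaction_for_perception(env, rl_policy, perception_net, n_rounds=1000):
    """Alternate RL interaction with perception-network training."""
    buffer = []
    for _ in range(n_rounds):
        # 1) The RL policy (TD3 in the paper) interacts with the simulated
        #    articulated object, producing an action trajectory for a
        #    sampled task (e.g., "open this door by 30 degrees").
        obs, task = env.reset()                # partial point cloud + task spec
        traj = rl_policy.rollout(obs, task)    # sequence of gripper waypoints
        success = env.task_achieved(task)      # did the part move as required?

        # 2) Both successful and failed trajectories supervise the
        #    perception networks, which generalize across object shapes.
        buffer.append((obs, task, traj, success))
        perception_net.train_step(buffer)

        # 3) The paper additionally feeds a curiosity signal from the
        #    perception side back into RL exploration, closing the loop.
        rl_policy.add_exploration_bonus(perception_net.curiosity(obs, task))
    return perception_net
```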
Related Work
Perceiving and Manipulating 3D Articulated Objects: mainly covers perception of the canonical visual articulation model (kinematic structure + part poses + joint parameters).
Learning Actionable Visual Representations: surveys mainstream representations such as grasp/manipulation affordances, keypoints, and contact points; this paper explores dense affordance maps.
Learning Perception from Interaction
Problem Definition: Actionable Visual Priors (Action Affordance and Trajectory Proposals)
For each articulated object, the learned object-centric actionable visual priors comprise three parts: 1) an actionability map over articulated parts indicating where to interact; 2) per-point distributions of visual action trajectory proposals suggesting how to interact; and 3) estimated success likelihood scores rating the outcomes of the interaction. All predictions are interaction-conditioned (e.g., pushing, pulling) and task-aware (e.g., open a door by 30°, close a drawer by 0.1 unit length).
Mathematical formulation:
The setting is restricted to 1-DOF part articulation; a sketch of the formulation is given below.
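The following LaTeX sketch uses symbols assumed from the prose above (the paper's exact notation may differ): given a partial point cloud observation $O$, an interaction type, and a task variable $\theta$, the networks predict, for every point $p \in O$:

```latex
% Symbols are assumed from the description above; the paper's notation may differ.
\begin{align*}
  a_{p\mid\theta} \in [0,1]
    &\quad \text{actionability: where to interact,}\\
  \tau \sim \mathcal{P}(\tau \mid p, \theta)
    &\quad \text{visual action trajectory proposals: how to interact,}\\
  s_{\tau\mid p,\theta} \in [0,1]
    &\quad \text{success likelihood of executing } \tau \text{ at } p.
\end{align*}
% With 1-DOF part articulation, the task variable \theta reduces to a scalar:
% a rotation angle for revolute joints or a translation distance for
% prismatic joints.
```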
Method Architecture
First Pass over the Code
Repository address
Dataset: where2act_original_sapien_dataset.zip
Dependencies: sapien, PointNet++; the rest are listed in requirements.txt
Steps (Training Pipeline for the VAT-Mart Framework):
# script                                          # entry point (step purposes below are inferred from the file names)
sh scripts/run_train_RL_PushDoor.sh               # RL_train_push_door_mp.py: train the TD3 RL interaction policy
sh scripts/run_collect_PushDoor.sh                # td3_collect_push_door.py: collect interaction trajectories with the policy
sh scripts/run_train_critic_PushDoor_before.sh    # train_3d_task_critic.py: pre-train the trajectory scoring (critic) module
sh scripts/run_train_Curiosity_RL_PushDoor.sh     # td3_train_push_pull_door_curiosityDriven_mp.py: curiosity-driven RL training
sh scripts/run_train_critic_PushDoor.sh           # train_3d_task_critic.py: retrain the critic on the curiosity-driven data
sh scripts/run_train_actor_PushDoor.sh            # train_3d_task_actor.py: train the trajectory proposal (actor) module
sh scripts/run_train_score_PushDoor.sh            # train_3d_task_score.py: train the actionability scoring module
sh scripts/run_eval_sampleSucc_PushDoor.sh        # eval_sampleSucc.py: evaluate sample success rate
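After training, the three perception modules together produce the actionable visual priors described above. A hypothetical inference-time composition (module and method names are assumptions, not the repository's API):

```python
# Hypothetical inference-time composition of the three trained modules;
# names are assumptions and do not match the repository's actual API.
import torch

def predict_priors(affordance_net, actor_net, critic_net, pcd, task, k=100):
    """Given a partial point cloud `pcd` (N x 3) and a task spec `task`
    (interaction type + target part motion), return the three priors."""
    with torch.no_grad():
        # 1) per-point actionability map: where to interact
        actionability = affordance_net(pcd, task)          # (N,)

        # 2) trajectory proposals at the most actionable point: how to interact
        p = pcd[actionability.argmax()]
        proposals = actor_net.sample(pcd, p, task, n=k)    # (k, waypoints, 6)

        # 3) success likelihood score for each proposal
        scores = critic_net(pcd, p, task, proposals)       # (k,)
    return actionability, proposals, scores
```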