Overview
Introduction
In the introduction, the authors point out problems with the current two-stage solution (estimated visual articulation models + robotic manipulation planners and controllers). They argue that the current handshaking point, namely the canonical visual articulation model representation (kinematic structure + part poses + joint parameters), is not the optimal choice, because under this representation the object's geometric features (e.g., edges, holes, bars) and semantic features (e.g., doors, handles) are ignored.
This motivates the paper's proposed "actionable visual representations": a more geometry-aware, interaction-aware, and task-aware perception-interaction handshaking point. Concretely, these take the form of Action Affordance and Trajectory Proposals, predicting per-point action possibilities and visual action trajectory proposals. Unlike Where2Act's task-agnostic, short-term predictions, this paper further augments the per-point action predictions with task-aware distributions of trajectory proposals.
The paper then describes several concrete uses of these actionable visual priors. The term "priors" is aptly chosen here.
Next, it introduces VAT-MART, the interaction-for-perception learning framework designed to acquire these actionable visual priors: a reinforcement learning policy collects successful interaction trajectories, while a perception network is trained simultaneously to generalize the learned knowledge across shapes.
Contributions: 1) defines a novel form of "actionable visual priors"; 2) designs the interaction-for-perception framework VAT-MART; 3) experimentally validates its effectiveness.
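A minimal, hypothetical sketch of this interaction-for-perception loop follows. All names (env, rl_policy, perception_net and their methods) are placeholders for illustration, not the repository's actual API:

```python
# Hypothetical sketch of the interaction-for-perception loop; all names
# are placeholders, not the repository's actual API.

def interaction_for_perception(env, rl_policy, perception_net, n_rounds=1000):
    """Alternate RL interaction with perception-network training."""
    buffer = []
    for _ in range(n_rounds):
        # 1) The RL policy (TD3 in the paper) interacts with the simulated
        #    articulated object, producing an action trajectory for a
        #    sampled task (e.g., "open this door by 30 degrees").
        obs, task = env.reset()                # partial point cloud + task spec
        traj = rl_policy.rollout(obs, task)    # sequence of gripper waypoints
        success = env.task_achieved(task)      # did the part move as required?

        # 2) Both successful and failed trajectories supervise the
        #    perception networks, which generalize across object shapes.
        buffer.append((obs, task, traj, success))
        perception_net.train_step(buffer)

        # 3) The paper additionally feeds a curiosity signal from the
        #    perception side back into RL exploration, closing the loop.
        rl_policy.add_exploration_bonus(perception_net.curiosity(obs, task))
    return perception_net
```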
Related Work
Perceiving and Manipulating 3D Articulated Objects: mainly covers perception of the canonical visual articulation model (kinematic structure + part poses + joint parameters).
Learning Actionable Visual Representations: surveys mainstream representations such as grasp/manipulation affordances, keypoints, and contact points; this paper explores dense affordance maps.
Learning Perception from Interaction
Problem Definition: Actionable Visual Priors (Action Affordance and Trajectory Proposals)
For each articulated object, the learned object-centric actionable visual priors comprise three parts: 1) an actionability map over articulated parts indicating where to interact; 2) per-point distributions of visual action trajectory proposals suggesting how to interact; and 3) estimated success likelihood scores rating the outcomes of the interaction. All predictions are interaction-conditioned (e.g., pushing, pulling) and task-aware (e.g., open a door by 30°, close a drawer by 0.1 unit length).
Mathematical formulation:
The setting is restricted to 1-DOF part articulation; a sketch of the formulation is given below.
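The following LaTeX sketch uses symbols assumed from the prose above (the paper's exact notation may differ): given a partial point cloud observation $O$, an interaction type, and a task variable $\theta$, the networks predict, for every point $p \in O$:

```latex
% Symbols are assumed from the description above; the paper's notation may differ.
\begin{align*}
  a_{p\mid\theta} \in [0,1]
    &\quad \text{actionability: where to interact,}\\
  \tau \sim \mathcal{P}(\tau \mid p, \theta)
    &\quad \text{visual action trajectory proposals: how to interact,}\\
  s_{\tau\mid p,\theta} \in [0,1]
    &\quad \text{success likelihood of executing } \tau \text{ at } p.
\end{align*}
% With 1-DOF part articulation, the task variable \theta reduces to a scalar:
% a rotation angle for revolute joints or a translation distance for
% prismatic joints.
```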
Method Architecture
First Pass over the Code
Repository address
Dataset: where2act_original_sapien_dataset.zip
Dependencies: sapien, PointNet++; the rest are listed in requirements.txt
Steps (Training Pipeline for the VAT-Mart Framework):
# script                                          # entry point (step purposes below are inferred from the file names)
sh scripts/run_train_RL_PushDoor.sh               # RL_train_push_door_mp.py: train the TD3 RL interaction policy
sh scripts/run_collect_PushDoor.sh                # td3_collect_push_door.py: collect interaction trajectories with the policy
sh scripts/run_train_critic_PushDoor_before.sh    # train_3d_task_critic.py: pre-train the trajectory scoring (critic) module
sh scripts/run_train_Curiosity_RL_PushDoor.sh     # td3_train_push_pull_door_curiosityDriven_mp.py: curiosity-driven RL training
sh scripts/run_train_critic_PushDoor.sh           # train_3d_task_critic.py: retrain the critic on the curiosity-driven data
sh scripts/run_train_actor_PushDoor.sh            # train_3d_task_actor.py: train the trajectory proposal (actor) module
sh scripts/run_train_score_PushDoor.sh            # train_3d_task_score.py: train the actionability scoring module
sh scripts/run_eval_sampleSucc_PushDoor.sh        # eval_sampleSucc.py: evaluate sample success rate
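After training, the three perception modules together produce the actionable visual priors described above. A hypothetical inference-time composition (module and method names are assumptions, not the repository's API):

```python
# Hypothetical inference-time composition of the three trained modules;
# names are assumptions and do not match the repository's actual API.
import torch

def predict_priors(affordance_net, actor_net, critic_net, pcd, task, k=100):
    """Given a partial point cloud `pcd` (N x 3) and a task spec `task`
    (interaction type + target part motion), return the three priors."""
    with torch.no_grad():
        # 1) per-point actionability map: where to interact
        actionability = affordance_net(pcd, task)          # (N,)

        # 2) trajectory proposals at the most actionable point: how to interact
        p = pcd[actionability.argmax()]
        proposals = actor_net.sample(pcd, p, task, n=k)    # (k, waypoints, 6)

        # 3) success likelihood score for each proposal
        scores = critic_net(pcd, p, task, proposals)       # (k,)
    return actionability, proposals, scores
```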