📚 LIBERO 完整文档集索引
终身机器人学习基准平台 - 从入门到精通的完整指南
🎯 文档概览
您现在拥有6份完整文档,涵盖LIBERO的方方面面:
| 文档 | 大小 | 行数 | 用途 | 难度 |
|---|---|---|---|---|
| 📘 Notebooks教程讲解 | 41KB | 1424 | Jupyter教程详解 | ⭐ |
| 📗 Notebooks快速参考 | 12KB | 473 | 快速查询 | ⭐ |
| 📙 Scripts工具详解 | 49KB | 1702 | 10个脚本深度剖析 | ⭐⭐ |
| 📕 Scripts快速参考 | 11KB | 339 | 命令速查 | ⭐ |
| 📔 代码结构详解 | 64KB | 2237 | libero/核心代码 | ⭐⭐⭐ |
| 📓 代码快速参考 | 12KB | 503 | 类与接口速查 | ⭐⭐ |
📖 各文档详细说明
1️⃣ Notebooks教程讲解
内容概要
- 📒 playground.ipynb: 入门交互式教程
- 🎯 procedural_generation.ipynb: 任务生成流程
- 🧪 benchmark_example.ipynb: 基准评估示例
适用人群
- 🆕 LIBERO新手
- 🎓 想快速上手的研究者
- 👨‍🏫 教学演示
核心内容
- 环境创建与交互
- 任务检索与执行
- BDDL文件理解
- 程序化任务生成
- 完整评估流程
2️⃣ Notebooks快速参考
内容概要
3个notebook的核心代码片段速查表
适用场景
- ⚡ 需要快速查找代码示例
- 🔍 忘记某个API怎么用
- 📝 复制粘贴代码模板
查询内容
- 环境初始化代码
- 任务加载示例
- 数据集操作
- 评估循环模板
3️⃣ Scripts工具详解
内容概要
10个工具脚本的完整讲解:
- init_path.py - 路径初始化
- check_dataset_integrity.py - 数据集验证
- get_dataset_info.py - 数据集分析
- get_affordance_info.py - 对象能力查询
- config_copy.py - 配置文件复制
- create_libero_task_example.py - 任务创建示例
- create_template.py - 模板生成工具
- collect_demonstration.py - 人类演示收集
- libero_100_collect_demonstrations.py - 批量收集
- create_dataset.py - 数据集转换
适用人群
- 🛠️ 需要收集数据的研究者
- 🎯 想创建自定义任务
- 📊 需要分析数据集
每个脚本包含
- 功能说明
- 代码详解
- 参数说明
- 使用示例
- 应用场景
- 注意事项
4️⃣ Scripts快速参考
内容概要
脚本工具的速查卡片
包含内容
- 一行命令速查
- 参数速记表
- 使用场景速览
- SpaceMouse控制说明
- HDF5结构对照
- 常见问题一行解决
适用场景
- ⚡ 忘记命令参数
- 🔍 快速查找用法
- 💡 解决常见错误
5️⃣ 代码结构详解
内容概要
深度剖析 libero/ 文件夹的核心代码
Part 1: libero/libero/ - 环境系统
- 📁 assets/ - 3D资产库
- 📄 bddl_files/ - 任务定义
- 🎯 benchmark/ - 基准套件
- 🤖 envs/ - 环境实现
- 💾 init_files/ - 固定初始化
- 🔧 utils/ - 工具函数
Part 2: libero/lifelong/ - 学习系统
- 🎯 main.py - 主训练脚本
- 🧠 algos/ - 5种终身学习算法
  - Sequential (顺序微调)
  - ER (经验回放)
  - EWC (弹性权重固化)
  - PackNet (参数打包)
  - Multitask (多任务基线)
- 🏗️ models/policy/ - 3种策略架构
  - ResNet-RNN
  - ResNet-Transformer
  - ViT-Transformer
- 📊 datasets/ - 数据加载
- 📈 metrics/ - 评估指标
适用人群
- 🔬 深度研究LIBERO的学者
- 💻 需要修改核心代码
- 🏗️ 想实现自定义算法
- 🎓 博士生/高年级研究生
每个模块包含
- 架构设计说明
- 完整代码示例
- 类与方法详解
- 设计模式讲解
- 扩展指南
6️⃣ 代码快速参考
内容概要
代码结构的一页纸速查表
包含内容
- 目录结构速览
- 关键类与接口
- 5大任务套件对比
- 5种算法对比
- 3种架构对比
- 核心工作流
- 数据流转图
- 配置文件速查
- 常见使用场景
- 观察/动作空间格式
- 调试技巧
适用场景
- 🔍 快速查找类名
- 📝 复制接口示例
- 💡 回忆API用法
- ⚡ 不想翻大文档
🎓 推荐阅读路径
🌟 路径1:绝对新手(2-3小时)
1️⃣ Notebooks快速参考 (15分钟)
↓ 了解基本概念
2️⃣ Notebooks教程讲解 (60分钟)
↓ 跑通所有notebook
3️⃣ Scripts快速参考 (15分钟)
↓ 熟悉工具脚本
4️⃣ 代码快速参考 (30分钟)
↓ 了解代码组织
5️⃣ 开始实验!
🚀 路径2:有经验用户(1-2小时)
1️⃣ 代码快速参考 (20分钟)
↓ 快速了解全局
2️⃣ Scripts快速参考 (10分钟)
↓ 熟悉工具
3️⃣ 代码结构详解 - 按需阅读 (30-60分钟)
↓ 深入感兴趣的模块
4️⃣ 直接开始研究!
🔬 路径3:深度研究者(4-6小时)
1️⃣ 代码结构详解 - 完整阅读 (2小时)
↓ 理解所有核心代码
2️⃣ Scripts工具详解 (1小时)
↓ 掌握数据收集流程
3️⃣ Notebooks教程讲解 (1小时)
↓ 学习最佳实践
4️⃣ 开始魔改代码!
🎯 路径4:特定任务导向
要收集数据?
→ Scripts工具详解 (第8-10节)
要创建新任务?
→ Scripts工具详解 (第6-7节)
→ 代码结构详解 (Part 1)
要实现新算法?
→ 代码结构详解 (Part 2.2)
要设计新架构?
→ 代码结构详解 (Part 2.3)
要理解数据格式?
→ Scripts工具详解 (第10节)
→ 代码结构详解 (Part 2.4)
📊 文档内容对照表
| 想了解… | 详细文档 | 快速参考 |
|---|---|---|
| Notebook教程 | libero_notebooks_guide.md | libero_quick_reference.md |
| 工具脚本 | libero_scripts_guide.md | libero_scripts_cheatsheet.md |
| 核心代码 | libero_code_structure_guide.md | libero_code_quick_reference.md |
💡 使用技巧
📱 移动设备
所有文档都是Markdown格式,在手机上也可以流畅阅读!
🔖 浏览器书签
建议将以下文档加入浏览器书签:
- 快速参考类(常用)
- 自己当前研究方向的详细文档
🔍 搜索功能
使用 Ctrl+F (或 Cmd+F) 快速搜索关键词
📝 做笔记
建议在阅读时:
- 标记重点部分
- 记录自己的理解
- 添加代码注释
🎯 关键概念速查
LIBERO核心概念
- 终身学习: 连续学习多个任务而不遗忘
- 知识迁移: 从旧任务迁移知识到新任务
- BDDL: 任务定义语言
- 任务套件: 相关任务的集合
4类知识类型(对应5大任务套件)
- 空间关系 (Spatial) - LIBERO-Spatial
- 对象概念 (Object) - LIBERO-Object
- 任务目标 (Goal) - LIBERO-Goal
- 混合知识 (Mixed) - LIBERO-90 / LIBERO-10
3种策略架构
- ResNet-RNN: 简单快速
- ResNet-Transformer: 长期依赖
- ViT-Transformer: 视觉-语言融合
5种学习算法
- Sequential: 顺序微调(baseline)
- ER: 经验回放
- EWC: 弹性权重固化
- PackNet: 参数打包
- Multitask: 多任务学习(oracle)
🚦 代码层次结构
┌─────────────────────────────────────┐
│ 应用层 (Application) │
│ benchmark_scripts/, notebooks/ │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ 算法层 (Algorithm) │
│ libero/lifelong/algos/ │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ 模型层 (Model) │
│ libero/lifelong/models/ │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ 环境层 (Environment) │
│ libero/libero/envs/ │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ 任务层 (Task) │
│ libero/libero/bddl_files/ │
└──────────────┬──────────────────────┘
│
┌──────────────▼──────────────────────┐
│ 资产层 (Assets) │
│ libero/libero/assets/ │
└─────────────────────────────────────┘
🎓 学习检查清单
✅ 入门阶段
- 成功运行 playground.ipynb
- 理解BDDL文件格式
- 完成一次数据收集
- 运行一个基准实验
✅ 进阶阶段
- 创建自定义任务
- 实现简单的算法变体
- 理解所有5种算法
- 对比不同策略架构
✅ 高级阶段
- 实现全新的终身学习算法
- 设计新的策略架构
- 扩展BDDL语言
- 贡献代码到主仓库
🎯 常见研究方向
基于这些文档,你可以探索:
- 新算法开发
  - 防遗忘机制
  - 知识蒸馏
  - 元学习方法
- 架构创新
  - 多模态融合
  - 注意力机制
  - 高效推理
- 任务设计
  - 长期任务
  - 协作任务
  - 多机器人任务
- 基准扩展
  - 新任务套件
  - 新评估指标
  - 真实机器人迁移
LIBERO 代码库完全讲解
LIBERO: Lifelong Learning Benchmark on Robot Manipulation Tasks
用于终身机器人学习中知识迁移的基准测试平台
📂 目录结构总览
LIBERO/
├── libero/ # 核心代码库 ⭐
│ ├── libero/ # 环境与任务定义
│ │ ├── assets/ # 对象3D模型和纹理
│ │ ├── bddl_files/ # 任务定义文件(BDDL格式)
│ │ ├── benchmark/ # 基准测试套件
│ │ ├── envs/ # 环境实现
│ │ ├── init_files/ # 初始状态配置
│ │ └── utils/ # 工具函数
│ └── lifelong/ # 终身学习算法与模型
│ ├── main.py # 主训练脚本
│ ├── algos/ # 算法实现
│ ├── models/ # 策略网络
│ ├── datasets/ # 数据加载
│ └── metrics/ # 评估指标
├── scripts/ # 实用工具脚本
├── benchmark_scripts/ # 基准测试脚本
├── configs/ # 配置文件
└── notebooks/ # Jupyter教程
🎯 Part 1: libero/libero/ - 环境与任务定义
这是LIBERO的核心,定义了所有任务、环境和对象。
1.1 📁 assets/ - 3D资产文件
存储所有机器人操作任务所需的3D模型、纹理和材质。
assets/
├── arenas/ # 工作空间场景
│ ├── kitchen_table_arena.xml # 厨房桌子场景
│ ├── study_table_arena.xml # 书房桌子场景
│ └── living_room_arena.xml # 客厅场景
├── objects/ # 可操作对象模型
│ ├── bowl/
│ ├── plate/
│ ├── mug/
│ ├── cabinet/
│ └── ...(50+种对象)
├── fixtures/ # 固定装置模型
│ ├── kitchen_table/
│ ├── wooden_cabinet/
│ └── microwave/
└── textures/ # 纹理贴图
├── floor/
├── wall/
└── table/
关键文件格式:
- XML文件: MuJoCo格式的对象定义
- STL/OBJ文件: 3D几何模型
- PNG文件: 纹理贴图
示例:碗的定义 (bowl.xml)
<mujoco>
<asset>
<mesh file="bowl_mesh.stl" scale="0.001 0.001 0.001"/>
<texture file="bowl_texture.png" type="2d"/>
</asset>
<body name="bowl">
<inertial pos="0 0 0.03" mass="0.15"/>
<geom type="mesh" mesh="bowl_mesh" rgba="0.2 0.2 0.2 1"/>
</body>
</mujoco>
1.2 📄 bddl_files/ - 任务定义文件
使用BDDL(Behavior Domain Definition Language)定义所有任务的初始状态和目标条件。
bddl_files/
├── libero_spatial/ # 10个空间推理任务
│ ├── KITCHEN_SCENE1_put_the_black_bowl_on_the_plate.bddl
│ ├── KITCHEN_SCENE1_put_the_black_bowl_to_the_left_of_the_plate.bddl
│ └── ...
├── libero_object/ # 10个对象泛化任务
│ ├── KITCHEN_SCENE2_pick_up_the_alphabet_soup_and_place_it_in_the_basket.bddl
│ └── ...
├── libero_goal/ # 10个目标泛化任务
│ ├── KITCHEN_SCENE3_turn_on_the_stove_and_put_the_moka_pot_on_it.bddl
│ └── ...
├── libero_90/ # 90个预训练任务
│ ├── KITCHEN_SCENE5_task_0.bddl
│ ├── KITCHEN_SCENE5_task_1.bddl
│ └── ...
└── libero_10/ # 10个评估任务(长期任务)
├── LIVING_ROOM_SCENE1_task_0.bddl
└── ...
BDDL文件示例:
(define (problem KITCHEN_SCENE1_put_the_black_bowl_on_the_plate)
(:domain libero)
; 任务的自然语言描述
(:language "put the black bowl on the plate")
; 场景中的对象
(:objects
kitchen_table - kitchen_table
akita_black_bowl_1 - akita_black_bowl
plate_1 - plate
wooden_cabinet_1 - wooden_cabinet
)
; 初始状态约束
(:init
(On akita_black_bowl_1 kitchen_table_akita_black_bowl_init_region)
(On plate_1 kitchen_table_plate_init_region)
(On wooden_cabinet_1 kitchen_table_wooden_cabinet_init_region)
)
; 目标条件(必须同时满足)
(:goal
(And
(On akita_black_bowl_1 plate_1)
)
)
)
BDDL谓词说明:
| 谓词 | 参数 | 含义 | 示例 |
|---|---|---|---|
On | (obj, region) | 对象在某区域上 | (On bowl table) |
In | (obj, container) | 对象在容器内 | (In apple basket) |
Open | (obj) | 对象处于打开状态 | (Open cabinet_door) |
Closed | (obj) | 对象处于关闭状态 | (Closed microwave) |
TurnedOn | (device) | 设备已开启 | (TurnedOn stove) |
TurnedOff | (device) | 设备已关闭 | (TurnedOff faucet) |
Touching | (obj1, obj2) | 两对象接触 | (Touching gripper bowl) |
1.3 🎯 benchmark/ - 基准测试套件
定义任务顺序和评估协议。
# benchmark/__init__.py
from .libero_benchmark import *
TASK_SUITES = {
"libero_spatial": LIBERO_SPATIAL,
"libero_object": LIBERO_OBJECT,
"libero_goal": LIBERO_GOAL,
"libero_10": LIBERO_10,
"libero_90": LIBERO_90,
}
def get_benchmark_dict():
"""获取所有基准测试套件"""
return TASK_SUITES
核心文件:libero_benchmark.py
class TaskSuite:
"""任务套件基类"""
def __init__(self):
self.tasks = [] # 任务列表
self.name = "" # 套件名称
def get_task(self, task_id):
"""
获取特定任务
Args:
task_id: 任务索引 (0 到 n-1)
Returns:
Task对象
"""
return self.tasks[task_id]
def get_num_tasks(self):
"""返回套件中的任务数量"""
return len(self.tasks)
def get_task_init_states(self, task_id):
"""
获取任务的固定初始状态(用于基准测试)
Args:
task_id: 任务索引
Returns:
初始状态列表(每个任务10个固定初始状态)
"""
init_file = os.path.join(
get_libero_path("init_files"),
f"{self.name}_task{task_id}_init_states.npy"
)
return np.load(init_file, allow_pickle=True)
class LIBERO_SPATIAL(TaskSuite):
"""
LIBERO-Spatial: 空间推理任务套件
所有任务使用相同的对象,但要求不同的空间关系:
- "put A on B"
- "put A to the left of B"
- "put A to the right of B"
- "put A in front of B"
等等
知识迁移类型: 空间关系知识
"""
def __init__(self):
super().__init__()
self.name = "libero_spatial"
# 定义10个任务
self.tasks = [
Task(
name="KITCHEN_SCENE1_put_the_black_bowl_on_the_plate",
language="put the black bowl on the plate",
problem_folder="libero_spatial",
bddl_file="KITCHEN_SCENE1_put_the_black_bowl_on_the_plate.bddl"
),
Task(
name="KITCHEN_SCENE1_put_the_black_bowl_to_the_left_of_the_plate",
language="put the black bowl to the left of the plate",
problem_folder="libero_spatial",
bddl_file="KITCHEN_SCENE1_put_the_black_bowl_to_the_left_of_the_plate.bddl"
),
# ... 其余8个任务
]
class LIBERO_OBJECT(TaskSuite):
"""
LIBERO-Object: 对象泛化任务套件
相似的任务结构,但使用不同的对象:
- "pick up the alphabet soup"
- "pick up the cream cheese box"
- "pick up the tomato sauce"
等等
知识迁移类型: 对象概念知识
"""
def __init__(self):
super().__init__()
self.name = "libero_object"
# ... 类似定义
class LIBERO_GOAL(TaskSuite):
"""
LIBERO-Goal: 目标泛化任务套件
使用相同场景但不同的目标条件:
- "turn on the stove and put the pot on it"
- "open the cabinet and put the bowl in it"
- "close the microwave"
等等
知识迁移类型: 任务目标知识
"""
pass
class LIBERO_90(TaskSuite):
"""
LIBERO-90: 预训练任务套件
90个不同的操作任务,用于预训练策略
知识迁移类型: 混合(空间、对象、目标)
"""
pass
class LIBERO_10(TaskSuite):
"""
LIBERO-10: 评估任务套件
10个长期复杂任务,用于测试终身学习性能
这些任务需要更长的操作序列
知识迁移类型: 混合
"""
pass
class Task:
"""单个任务的数据结构"""
def __init__(self, name, language, problem_folder, bddl_file):
self.name = name # 任务名称
self.language = language # 自然语言描述
self.problem_folder = problem_folder # BDDL文件所在文件夹
self.bddl_file = bddl_file # BDDL文件名
使用示例:
from libero.libero import benchmark
# 获取所有基准测试套件
benchmark_dict = benchmark.get_benchmark_dict()
# 选择一个套件
task_suite = benchmark_dict["libero_spatial"]()
# 获取任务信息
task = task_suite.get_task(0)
print(f"Task: {task.name}")
print(f"Instruction: {task.language}")
print(f"BDDL: {task.bddl_file}")
# 获取固定初始状态
init_states = task_suite.get_task_init_states(0)
print(f"Number of init states: {len(init_states)}") # 10个
1.4 🤖 envs/ - 环境实现
实现所有任务的MuJoCo仿真环境。
envs/
├── __init__.py # 环境注册
├── base_env.py # 基础环境类
├── bddl_utils.py # BDDL解析工具
├── objects/ # 对象类定义
│ ├── __init__.py
│ ├── object_base.py # 对象基类
│ └── specific_objects.py # 具体对象实现
├── problems/ # 特定问题环境
│ ├── KITCHEN_SCENE1_*.py # 各个任务的环境类
│ └── ...
└── wrappers.py # 环境包装器
核心类:LiberoBaseEnv
# base_env.py
from robosuite.environments.manipulation.single_arm_env import SingleArmEnv
from robosuite.models import MujocoWorldBase
class LiberoBaseEnv(SingleArmEnv):
"""
LIBERO所有任务的基础环境类
继承自Robosuite的SingleArmEnv,添加了:
- BDDL文件解析
- 目标条件检查
- 稀疏奖励计算
- 固定初始化
"""
def __init__(
self,
bddl_file_name,
robots="Panda",
camera_heights=128,
camera_widths=128,
camera_names=["agentview", "robot0_eye_in_hand"],
**kwargs
):
"""
Args:
bddl_file_name: BDDL文件路径
robots: 机器人类型(默认Panda)
camera_heights: 相机高度(像素)
camera_widths: 相机宽度(像素)
camera_names: 相机列表
"""
# 解析BDDL文件
self.bddl_file = bddl_file_name
self.problem_info = bddl_utils.get_problem_info(bddl_file_name)
self.language_instruction = self.problem_info["language_instruction"]
# 调用父类初始化
super().__init__(
robots=robots,
camera_heights=camera_heights,
camera_widths=camera_widths,
camera_names=camera_names,
**kwargs
)
# 初始化BDDL评估器
self.bddl_evaluator = bddl_utils.BDDLEvaluator(
bddl_file_name,
self.sim
)
def _load_model(self):
"""加载场景模型"""
super()._load_model()
# 添加BDDL文件中定义的对象
mujoco_objects = []
for obj_name, obj_type in self.problem_info["objects"].items():
obj = OBJECTS_DICT[obj_type]()
mujoco_objects.append(obj)
# 添加到场景
self.model = MujocoWorldBase()
self.model.merge_assets(self.robot_model)
for obj in mujoco_objects:
self.model.merge_assets(obj)
def _check_success(self):
"""
检查是否完成任务
Returns:
bool: 任务是否成功
"""
return self.bddl_evaluator.check_goal_satisfied()
def reward(self, action):
"""
计算奖励(稀疏)
Args:
action: 机器人动作
Returns:
float: 奖励值(成功=1.0,失败=0.0)
"""
return 1.0 if self._check_success() else 0.0
def set_init_state(self, init_state):
"""
设置固定初始状态(用于基准测试)
Args:
init_state: MuJoCo状态向量
"""
self.sim.set_state_from_flattened(init_state)
self.sim.forward()
def step(self, action):
"""
执行一步仿真
Args:
action: 7维动作 [dx, dy, dz, droll, dpitch, dyaw, gripper]
Returns:
obs: 观察字典
reward: 奖励
done: 是否结束
info: 额外信息
"""
obs, reward, done, info = super().step(action)
# 添加成功标志
info['success'] = self._check_success()
return obs, reward, done, info
def _get_observation(self):
"""
获取观察
Returns:
OrderedDict: 包含以下键值:
- agentview_image: (H, W, 3) RGB图像
- robot0_eye_in_hand_image: (H, W, 3) 手眼相机图像
- robot0_eef_pos: (3,) 末端执行器位置
- robot0_eef_quat: (4,) 末端执行器四元数
- robot0_gripper_qpos: (2,) 夹爪位置
- robot0_joint_pos: (7,) 关节角度
"""
di = super()._get_observation()
# 可以添加额外的观察项
# di["language_instruction"] = self.language_instruction
return di
环境包装器
# wrappers.py
import gym
class OffScreenRenderEnv(gym.Wrapper):
"""
离屏渲染包装器
用于无GUI环境下获取图像观察
"""
def __init__(self, env):
super().__init__(env)
self.env.has_offscreen_renderer = True
self.env.has_renderer = False
class LanguageConditionedEnv(gym.Wrapper):
"""
语言条件包装器
将语言指令添加到观察中
"""
def __init__(self, env, use_language_embedding=False):
super().__init__(env)
self.use_language_embedding = use_language_embedding
if use_language_embedding:
            # 使用预训练语言模型(如CLIP)编码;CLIPTextEncoder 为示意占位,需自行实现或替换
            self.language_encoder = CLIPTextEncoder()
def reset(self):
obs = self.env.reset()
obs["language"] = self._get_language_obs()
return obs
def _get_language_obs(self):
if self.use_language_embedding:
# 返回语言嵌入向量
text = self.env.language_instruction
return self.language_encoder.encode(text)
else:
# 返回原始文本
return self.env.language_instruction
使用示例:
from libero.libero.envs import TASK_MAPPING
from libero.libero import get_libero_path
# 获取BDDL文件路径
bddl_file = os.path.join(
get_libero_path("bddl_files"),
"libero_spatial",
"KITCHEN_SCENE1_put_the_black_bowl_on_the_plate.bddl"
)
# 从任务名称获取环境类
problem_name = "KITCHEN_SCENE1_put_the_black_bowl_on_the_plate"
env_class = TASK_MAPPING[problem_name]
# 创建环境
env = env_class(
bddl_file_name=bddl_file,
robots="Panda",
has_renderer=False,
has_offscreen_renderer=True,
use_camera_obs=True,
camera_heights=128,
camera_widths=128,
control_freq=20,
)
# 重置环境
obs = env.reset()
# 执行动作
action = [0.0] * 7 # [dx, dy, dz, droll, dpitch, dyaw, gripper]
obs, reward, done, info = env.step(action)
print(f"Success: {info['success']}")
print(f"Reward: {reward}")
1.5 💾 init_files/ - 固定初始状态
为基准测试提供固定的初始状态。
# 每个任务有10个固定的初始状态
init_files/
├── libero_spatial_task0_init_states.npy # Task 0的10个初始状态
├── libero_spatial_task1_init_states.npy
├── ...
├── libero_object_task0_init_states.npy
└── ...
为什么需要固定初始状态?
- 确保评估的公平性和可重复性
- 不同算法在相同初始条件下比较
- 减少随机性对结果的影响
使用方式:
# 加载固定初始状态
task_suite = benchmark_dict["libero_spatial"]()
init_states = task_suite.get_task_init_states(task_id=0)
# 对每个初始状态评估
for init_state in init_states:
env.reset()
env.set_init_state(init_state)
# 运行策略...
1.6 🔧 utils/ - 工具函数
各种辅助功能。
# utils/utils.py
def get_libero_path(subpath):
"""
获取LIBERO数据路径
Args:
subpath: 子路径名(如 "datasets", "bddl_files")
Returns:
完整路径
"""
libero_root = os.path.expanduser("~/.libero")
return os.path.join(libero_root, subpath)
def update_env_kwargs(env_kwargs, **new_kwargs):
"""
更新环境参数
Args:
env_kwargs: 原环境参数字典
**new_kwargs: 新参数
Returns:
更新后的环境参数
"""
env_kwargs.update(new_kwargs)
return env_kwargs
# utils/bddl_generation_utils.py
def generate_bddl_from_template(
task_name,
objects,
init_conditions,
goal_conditions,
output_dir
):
"""
从模板生成BDDL文件
Args:
task_name: 任务名称
objects: 对象列表
init_conditions: 初始条件列表
goal_conditions: 目标条件列表
output_dir: 输出目录
"""
bddl_content = f"""
(define (problem {task_name})
(:domain libero)
(:objects
{' '.join(f'{obj.name} - {obj.type}' for obj in objects)}
)
(:init
{' '.join(f'({cond})' for cond in init_conditions)}
)
(:goal
(And
{' '.join(f'({cond})' for cond in goal_conditions)}
)
)
)
"""
output_path = os.path.join(output_dir, f"{task_name}.bddl")
with open(output_path, 'w') as f:
f.write(bddl_content)
# utils/object_utils.py
def get_affordance_regions(objects_dict):
"""
获取所有对象的可交互区域
Args:
objects_dict: 对象字典
Returns:
可交互区域字典
"""
affordances = {}
for obj_name, obj_class in objects_dict.items():
obj_instance = obj_class()
affordances[obj_name] = {
'regions': obj_instance.get_affordance_regions(),
'actions': obj_instance.get_available_actions()
}
return affordances
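下面给出一个调用 generate_bddl_from_template 的最小示例(示意:对象以简单的具名元组表示,谓词写法参照上文的BDDL示例,输出目录为假设路径):
import os
from collections import namedtuple

Obj = namedtuple("Obj", ["name", "type"])

objects = [
    Obj("akita_black_bowl_1", "akita_black_bowl"),
    Obj("plate_1", "plate"),
]
output_dir = "./custom_bddl"  # 假设的输出目录
os.makedirs(output_dir, exist_ok=True)
generate_bddl_from_template(
    task_name="KITCHEN_SCENE1_demo_task",
    objects=objects,
    init_conditions=[
        "On akita_black_bowl_1 kitchen_table_akita_black_bowl_init_region",
        "On plate_1 kitchen_table_plate_init_region",
    ],
    goal_conditions=["On akita_black_bowl_1 plate_1"],
    output_dir=output_dir,
)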
🧠 Part 2: libero/lifelong/ - 终身学习算法与模型
这部分实现了训练、评估和各种终身学习算法。
2.1 🎯 main.py - 主训练脚本
终身学习实验的入口点。
# lifelong/main.py
import hydra
from omegaconf import DictConfig, OmegaConf
import numpy as np
import torch
from libero.libero.benchmark import get_benchmark_dict
from libero.lifelong.algos import get_algo_class
from libero.lifelong.datasets import get_dataset
from libero.lifelong.models import get_policy_class
@hydra.main(config_path="../configs", config_name="config")
def main(cfg: DictConfig):
"""
主训练函数
配置通过Hydra从configs/目录加载
使用方式:
python main.py benchmark_name=libero_spatial policy=bc_transformer_policy lifelong=er
"""
# 设置随机种子
torch.manual_seed(cfg.seed)
np.random.seed(cfg.seed)
# ============= 1. 加载数据集 =============
print(f"Loading benchmark: {cfg.benchmark_name}")
# 获取任务套件
benchmark = get_benchmark_dict()[cfg.benchmark_name]()
n_tasks = benchmark.get_num_tasks()
# 加载所有任务的数据集
datasets = []
for task_id in range(n_tasks):
task = benchmark.get_task(task_id)
dataset = get_dataset(
dataset_path=cfg.dataset_path,
task_name=task.name,
obs_modality=cfg.data.obs.modality, # "rgb", "rgb_proprio", etc.
initialize_obs_utils=True,
seq_len=cfg.data.seq_len,
)
datasets.append(dataset)
print(f"Loaded {n_tasks} tasks")
# ============= 2. 创建策略网络 =============
print(f"Creating policy: {cfg.policy.policy_type}")
policy_class = get_policy_class(cfg.policy.policy_type)
policy = policy_class(
obs_shapes=datasets[0].obs_shapes,
action_dim=datasets[0].action_dim,
**cfg.policy,
)
# ============= 3. 创建算法 =============
print(f"Creating algorithm: {cfg.lifelong.algo_name}")
algo_class = get_algo_class(cfg.lifelong.algo_name)
algo = algo_class(
n_tasks=n_tasks,
policy=policy,
datasets=datasets,
**cfg.lifelong,
)
# ============= 4. 终身学习训练循环 =============
print("Starting lifelong learning...")
for task_id in range(n_tasks):
print(f"\n{'='*50}")
print(f"Learning Task {task_id}: {benchmark.get_task(task_id).language}")
print(f"{'='*50}")
# 训练当前任务
algo.learn_task(
task_id=task_id,
epochs=cfg.train.epochs,
batch_size=cfg.train.batch_size,
)
# 评估所有已学任务(测量遗忘)
if (task_id + 1) % cfg.eval.eval_freq == 0:
print(f"\nEvaluating after task {task_id}...")
eval_results = {}
for eval_task_id in range(task_id + 1):
success_rate = evaluate_policy(
policy=policy,
benchmark=benchmark,
task_id=eval_task_id,
n_eval_episodes=cfg.eval.n_eval_episodes,
device=cfg.device,
)
eval_results[eval_task_id] = success_rate
print(f"Task {eval_task_id}: {success_rate:.2%}")
# 保存结果
save_results(eval_results, task_id, cfg.output_dir)
print("\nLifelong learning completed!")
def evaluate_policy(policy, benchmark, task_id, n_eval_episodes, device):
"""
评估策略在特定任务上的表现
Args:
policy: 策略网络
benchmark: 任务套件
task_id: 任务ID
n_eval_episodes: 评估回合数
device: 设备
Returns:
success_rate: 成功率
"""
task = benchmark.get_task(task_id)
# 创建环境
env = create_env(task)
# 加载固定初始状态
init_states = benchmark.get_task_init_states(task_id)
policy.eval()
successes = []
with torch.no_grad():
for init_state in init_states[:n_eval_episodes]:
env.reset()
env.set_init_state(init_state)
done = False
obs_history = []
while not done:
# 获取观察
obs = env._get_observation()
obs_history.append(obs)
# 策略推理
obs_tensor = prepare_obs(obs_history, device)
action = policy(obs_tensor)
# 执行动作
obs, reward, done, info = env.step(action.cpu().numpy())
if info['success']:
successes.append(1)
break
if len(obs_history) > 500: # 最大步数
successes.append(0)
break
env.close()
return np.mean(successes)
if __name__ == "__main__":
main()
2.2 🤖 algos/ - 终身学习算法
实现各种防止遗忘的算法。
algos/
├── __init__.py
├── base.py # Sequential基类(顺序微调)
├── er.py # Experience Replay(经验回放)
├── ewc.py # Elastic Weight Consolidation
├── packnet.py # PackNet
├── multitask.py # 多任务学习(baseline)
└── single_task.py # 单任务学习(oracle)
2.2.1 base.py - Sequential基类
# algos/base.py
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
class Sequential:
"""
顺序微调基类
最简单的终身学习方法:
- 按顺序学习每个任务
- 没有防遗忘机制
- 会发生灾难性遗忘
"""
def __init__(self, n_tasks, policy, datasets, lr=1e-4, **kwargs):
"""
Args:
n_tasks: 任务数量
policy: 策略网络
datasets: 数据集列表
lr: 学习率
"""
self.n_tasks = n_tasks
self.policy = policy
self.datasets = datasets
self.optimizer = torch.optim.Adam(
self.policy.parameters(),
lr=lr
)
self.current_task_id = -1
def learn_task(self, task_id, epochs, batch_size):
"""
学习一个新任务
Args:
task_id: 任务ID
epochs: 训练轮数
batch_size: 批大小
"""
self.current_task_id = task_id
print(f"Training on task {task_id} for {epochs} epochs")
# 获取当前任务的数据集
dataset = self.datasets[task_id]
dataloader = DataLoader(
dataset,
batch_size=batch_size,
shuffle=True,
num_workers=4,
)
self.policy.train()
for epoch in range(epochs):
epoch_loss = 0
for batch in dataloader:
# 解包批次数据
obs = batch['obs'] # 观察
actions = batch['actions'] # 动作标签
# 前向传播
pred_actions = self.policy(obs)
# 计算损失(行为克隆)
loss = self.compute_loss(pred_actions, actions)
# 反向传播
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
epoch_loss += loss.item()
avg_loss = epoch_loss / len(dataloader)
if (epoch + 1) % 10 == 0:
print(f"Epoch {epoch+1}/{epochs}, Loss: {avg_loss:.4f}")
def compute_loss(self, pred_actions, target_actions):
"""
计算行为克隆损失
Args:
pred_actions: 预测动作
target_actions: 真实动作
Returns:
loss: 损失值
"""
return nn.MSELoss()(pred_actions, target_actions)
def save(self, save_path):
"""保存模型"""
torch.save({
'policy_state_dict': self.policy.state_dict(),
'optimizer_state_dict': self.optimizer.state_dict(),
'current_task_id': self.current_task_id,
}, save_path)
def load(self, load_path):
"""加载模型"""
checkpoint = torch.load(load_path)
self.policy.load_state_dict(checkpoint['policy_state_dict'])
self.optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
self.current_task_id = checkpoint['current_task_id']
2.2.2 er.py - Experience Replay
# algos/er.py
import numpy as np
from torch.utils.data import ConcatDataset, DataLoader
from .base import Sequential
class ER(Sequential):
"""
Experience Replay(经验回放)
方法:
- 保存之前任务的少量数据
- 新任务训练时混合旧数据
- 防止遗忘旧任务
超参数:
- memory_size: 每个任务保存多少数据
"""
def __init__(self, n_tasks, policy, datasets, memory_size=1000, **kwargs):
"""
Args:
memory_size: 每个任务的记忆缓冲区大小
"""
super().__init__(n_tasks, policy, datasets, **kwargs)
self.memory_size = memory_size
self.memory_datasets = [] # 存储之前任务的数据子集
def learn_task(self, task_id, epochs, batch_size):
"""
使用经验回放学习新任务
"""
self.current_task_id = task_id
# 当前任务的数据集
current_dataset = self.datasets[task_id]
# 如果有记忆数据,混合使用
if len(self.memory_datasets) > 0:
# 合并当前数据和记忆数据
combined_dataset = ConcatDataset(
[current_dataset] + self.memory_datasets
)
else:
combined_dataset = current_dataset
# 创建数据加载器
dataloader = DataLoader(
combined_dataset,
batch_size=batch_size,
shuffle=True,
num_workers=4,
)
# 训练(与Sequential相同)
self.policy.train()
for epoch in range(epochs):
for batch in dataloader:
obs = batch['obs']
actions = batch['actions']
pred_actions = self.policy(obs)
loss = self.compute_loss(pred_actions, actions)
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
# 保存当前任务的部分数据到记忆
memory_subset = self._sample_memory(current_dataset, self.memory_size)
self.memory_datasets.append(memory_subset)
print(f"Added {len(memory_subset)} samples to memory")
def _sample_memory(self, dataset, n_samples):
"""
从数据集中采样数据存入记忆
Args:
dataset: 源数据集
n_samples: 采样数量
Returns:
memory_dataset: 记忆数据集子集
"""
# 随机采样索引
indices = np.random.choice(
len(dataset),
size=min(n_samples, len(dataset)),
replace=False
)
# 创建子集
from torch.utils.data import Subset
return Subset(dataset, indices)
2.2.3 ewc.py - Elastic Weight Consolidation
# algos/ewc.py
import torch
from torch.utils.data import DataLoader
from .base import Sequential
class EWC(Sequential):
"""
Elastic Weight Consolidation(弹性权重固化)
方法:
- 计算重要参数的Fisher信息矩阵
- 添加正则化项惩罚重要参数的改变
- 保护对旧任务重要的参数
论文: "Overcoming catastrophic forgetting in neural networks" (Kirkpatrick et al., 2017)
超参数:
- ewc_lambda: 正则化强度
"""
def __init__(self, n_tasks, policy, datasets, ewc_lambda=5000, **kwargs):
"""
Args:
ewc_lambda: EWC正则化系数
"""
super().__init__(n_tasks, policy, datasets, **kwargs)
self.ewc_lambda = ewc_lambda
# 存储每个任务的Fisher信息和参数
self.fisher_dict = {} # {task_id: fisher_matrix}
self.optpar_dict = {} # {task_id: optimal_parameters}
def learn_task(self, task_id, epochs, batch_size):
"""
使用EWC学习新任务
"""
self.current_task_id = task_id
dataset = self.datasets[task_id]
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
self.policy.train()
for epoch in range(epochs):
for batch in dataloader:
obs = batch['obs']
actions = batch['actions']
pred_actions = self.policy(obs)
# 计算任务损失
task_loss = self.compute_loss(pred_actions, actions)
# 计算EWC损失(正则化项)
ewc_loss = self.compute_ewc_loss()
# 总损失
loss = task_loss + self.ewc_lambda * ewc_loss
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
# 任务完成后,计算并保存Fisher信息
print("Computing Fisher information...")
fisher = self.compute_fisher(dataloader)
self.fisher_dict[task_id] = fisher
# 保存当前参数作为"最优参数"
self.optpar_dict[task_id] = {
name: param.clone().detach()
for name, param in self.policy.named_parameters()
}
def compute_fisher(self, dataloader):
"""
计算Fisher信息矩阵
Fisher信息衡量参数对任务的重要性
Args:
dataloader: 数据加载器
Returns:
fisher: Fisher信息字典 {param_name: fisher_value}
"""
fisher = {}
# 初始化
for name, param in self.policy.named_parameters():
fisher[name] = torch.zeros_like(param)
self.policy.eval()
# 对数据集采样计算
n_samples = 0
for batch in dataloader:
obs = batch['obs']
actions = batch['actions']
pred_actions = self.policy(obs)
loss = self.compute_loss(pred_actions, actions)
self.optimizer.zero_grad()
loss.backward()
# 累积梯度平方(Fisher近似)
for name, param in self.policy.named_parameters():
if param.grad is not None:
fisher[name] += param.grad.pow(2)
n_samples += len(obs)
if n_samples > 1000: # 只用部分数据计算
break
# 归一化
for name in fisher:
fisher[name] /= n_samples
return fisher
def compute_ewc_loss(self):
"""
计算EWC正则化损失
损失 = sum_i (fisher_i * (theta_i - theta*_i)^2)
Returns:
ewc_loss: 正则化损失
"""
if len(self.fisher_dict) == 0:
return 0.0
ewc_loss = 0.0
for task_id in self.fisher_dict:
fisher = self.fisher_dict[task_id]
optpar = self.optpar_dict[task_id]
for name, param in self.policy.named_parameters():
# 当前参数与旧任务最优参数的差异
param_diff = param - optpar[name]
# 加权损失(重要参数权重大)
ewc_loss += (fisher[name] * param_diff.pow(2)).sum()
return ewc_loss
2.2.4 packnet.py - PackNet
# algos/packnet.py
import torch
from torch.utils.data import DataLoader
from .base import Sequential
class PackNet(Sequential):
"""
PackNet: 参数打包网络
方法:
- 为每个任务分配网络的一部分参数
- 学习新任务时冻结旧任务的参数
- 通过剪枝实现参数分配
论文: "PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning" (Mallya & Lazebnik, 2018)
超参数:
- prune_ratio: 每个任务的剪枝比例
"""
def __init__(self, n_tasks, policy, datasets, prune_ratio=0.5, **kwargs):
"""
Args:
prune_ratio: 每个任务后剪枝的参数比例
"""
super().__init__(n_tasks, policy, datasets, **kwargs)
self.prune_ratio = prune_ratio
self.task_masks = {} # {task_id: mask_dict}
self.free_params = None # 当前可用参数的掩码
def learn_task(self, task_id, epochs, batch_size):
"""
使用PackNet学习新任务
"""
self.current_task_id = task_id
# 第一个任务:所有参数都可用
if task_id == 0:
self.free_params = {
name: torch.ones_like(param)
for name, param in self.policy.named_parameters()
}
# 只训练可用参数
dataset = self.datasets[task_id]
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
self.policy.train()
for epoch in range(epochs):
for batch in dataloader:
obs = batch['obs']
actions = batch['actions']
pred_actions = self.policy(obs)
loss = self.compute_loss(pred_actions, actions)
self.optimizer.zero_grad()
loss.backward()
# 只更新可用参数
self.apply_mask_to_gradients()
self.optimizer.step()
# 训练完成后,进行剪枝
print(f"Pruning {self.prune_ratio*100:.1f}% of parameters for task {task_id}")
task_mask = self.prune_network(self.prune_ratio)
self.task_masks[task_id] = task_mask
# 更新可用参数
for name in self.free_params:
self.free_params[name] *= (1 - task_mask[name])
def apply_mask_to_gradients(self):
"""
将掩码应用到梯度,冻结已分配的参数
"""
for name, param in self.policy.named_parameters():
if param.grad is not None:
param.grad *= self.free_params[name]
def prune_network(self, prune_ratio):
"""
剪枝网络,为当前任务分配参数
策略:保留权重绝对值最大的参数
Args:
prune_ratio: 剪枝比例
Returns:
task_mask: 当前任务使用的参数掩码
"""
task_mask = {}
for name, param in self.policy.named_parameters():
# 只考虑可用参数
available = self.free_params[name]
available_params = param * available
# 计算阈值
abs_params = torch.abs(available_params)
n_params = available.sum().item()
            n_keep = int(n_params * (1 - prune_ratio))  # 剪掉 prune_ratio 比例的参数,其余保留给当前任务
if n_keep > 0:
threshold = torch.topk(
abs_params.view(-1),
k=int(n_keep),
largest=True
).values[-1]
# 创建掩码:保留大于阈值的参数
mask = (abs_params >= threshold).float()
else:
mask = torch.zeros_like(param)
task_mask[name] = mask
return task_mask
def forward_with_task_mask(self, task_id):
"""
使用特定任务的参数掩码进行推理
Args:
task_id: 任务ID
"""
# 应用任务掩码
original_params = {}
for name, param in self.policy.named_parameters():
original_params[name] = param.clone()
param.data *= self.task_masks[task_id][name]
# 推理...
# 恢复参数
for name, param in self.policy.named_parameters():
param.data = original_params[name]
2.2.5 multitask.py - 多任务学习基线
# algos/multitask.py
from torch.utils.data import ConcatDataset, DataLoader
from .base import Sequential
class Multitask(Sequential):
"""
多任务学习基线(Oracle)
方法:
- 使用所有任务的数据一起训练
- 不是真正的终身学习(需要所有数据)
- 作为性能上界
"""
def __init__(self, n_tasks, policy, datasets, **kwargs):
super().__init__(n_tasks, policy, datasets, **kwargs)
# 合并所有数据集
self.combined_dataset = ConcatDataset(datasets)
def learn_task(self, task_id, epochs, batch_size):
"""
多任务训练(忽略task_id,总是用全部数据)
"""
if task_id == 0: # 只训练一次
dataloader = DataLoader(
self.combined_dataset,
batch_size=batch_size,
shuffle=True,
)
self.policy.train()
for epoch in range(epochs):
for batch in dataloader:
obs = batch['obs']
actions = batch['actions']
pred_actions = self.policy(obs)
loss = self.compute_loss(pred_actions, actions)
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
2.3 🏗️ models/policy/ - 策略网络架构
实现三种视觉运动策略架构。
models/
├── __init__.py
└── policy/
├── bc_rnn_policy.py # ResNet + RNN
├── bc_transformer_policy.py # ResNet + Transformer
└── bc_vilt_policy.py # Vision-Language Transformer
2.3.1 bc_rnn_policy.py - ResNet-RNN
# models/policy/bc_rnn_policy.py
import torch
import torch.nn as nn
from torchvision.models import resnet18
class BCRNNPolicy(nn.Module):
"""
ResNet + RNN 策略
架构:
1. 图像编码器: ResNet18
2. 序列模型: GRU
3. 动作解码器: MLP
适用于:需要时序信息的任务
"""
def __init__(
self,
obs_shapes,
action_dim=7,
hidden_dim=512,
rnn_hidden_dim=256,
rnn_num_layers=2,
**kwargs
):
"""
Args:
obs_shapes: 观察形状字典 {'agentview_image': (3,128,128), ...}
action_dim: 动作维度(Panda机器人=7)
hidden_dim: 编码器输出维度
rnn_hidden_dim: RNN隐藏层维度
rnn_num_layers: RNN层数
"""
super().__init__()
self.action_dim = action_dim
self.rnn_hidden_dim = rnn_hidden_dim
self.rnn_num_layers = rnn_num_layers
# ============= 1. 图像编码器 =============
# 使用预训练ResNet18
self.image_encoder = resnet18(pretrained=True)
# 移除最后的全连接层
self.image_encoder.fc = nn.Identity()
image_feature_dim = 512 # ResNet18的特征维度
# 处理多个相机视角
self.camera_names = ['agentview_image', 'robot0_eye_in_hand_image']
n_cameras = len(self.camera_names)
# 融合多视角特征
self.feature_fusion = nn.Linear(
image_feature_dim * n_cameras,
hidden_dim
)
# ============= 2. 本体感觉编码器 =============
# 处理机器人状态(关节角度、夹爪等)
proprio_dim = 0
if 'robot0_joint_pos' in obs_shapes:
proprio_dim += obs_shapes['robot0_joint_pos'][0] # 7
if 'robot0_gripper_qpos' in obs_shapes:
proprio_dim += obs_shapes['robot0_gripper_qpos'][0] # 2
if proprio_dim > 0:
self.proprio_encoder = nn.Sequential(
nn.Linear(proprio_dim, 64),
nn.ReLU(),
nn.Linear(64, 64),
)
rnn_input_dim = hidden_dim + 64
else:
self.proprio_encoder = None
rnn_input_dim = hidden_dim
# ============= 3. RNN序列模型 =============
self.rnn = nn.GRU(
input_size=rnn_input_dim,
hidden_size=rnn_hidden_dim,
num_layers=rnn_num_layers,
batch_first=True, # (batch, seq, feature)
)
# ============= 4. 动作解码器 =============
self.action_decoder = nn.Sequential(
nn.Linear(rnn_hidden_dim, 256),
nn.ReLU(),
nn.Linear(256, 128),
nn.ReLU(),
nn.Linear(128, action_dim),
)
def forward(self, obs_dict, hidden_state=None):
"""
前向传播
Args:
obs_dict: 观察字典,包含:
- agentview_image: (B, T, 3, H, W)
- robot0_eye_in_hand_image: (B, T, 3, H, W)
- robot0_joint_pos: (B, T, 7) [可选]
- robot0_gripper_qpos: (B, T, 2) [可选]
hidden_state: RNN隐藏状态 (可选)
Returns:
actions: (B, T, action_dim)
hidden_state: 新的隐藏状态
"""
batch_size = obs_dict['agentview_image'].shape[0]
seq_len = obs_dict['agentview_image'].shape[1]
# ============= 1. 编码图像 =============
image_features = []
for cam_name in self.camera_names:
images = obs_dict[cam_name] # (B, T, 3, H, W)
# 展平批次和时间维度
images = images.reshape(-1, *images.shape[2:]) # (B*T, 3, H, W)
# 编码
features = self.image_encoder(images) # (B*T, 512)
# 恢复形状
features = features.reshape(batch_size, seq_len, -1) # (B, T, 512)
image_features.append(features)
# 拼接多视角特征
image_features = torch.cat(image_features, dim=-1) # (B, T, 512*n_cameras)
# 融合特征
visual_features = self.feature_fusion(image_features) # (B, T, hidden_dim)
# ============= 2. 编码本体感觉 =============
if self.proprio_encoder is not None:
proprio_list = []
if 'robot0_joint_pos' in obs_dict:
proprio_list.append(obs_dict['robot0_joint_pos'])
if 'robot0_gripper_qpos' in obs_dict:
proprio_list.append(obs_dict['robot0_gripper_qpos'])
proprio = torch.cat(proprio_list, dim=-1) # (B, T, proprio_dim)
# 展平并编码
proprio = proprio.reshape(-1, proprio.shape[-1])
proprio_features = self.proprio_encoder(proprio)
proprio_features = proprio_features.reshape(batch_size, seq_len, -1)
# 拼接视觉和本体感觉特征
features = torch.cat([visual_features, proprio_features], dim=-1)
else:
features = visual_features
# ============= 3. RNN处理序列 =============
if hidden_state is None:
hidden_state = torch.zeros(
self.rnn_num_layers,
batch_size,
self.rnn_hidden_dim,
device=features.device
)
rnn_output, hidden_state = self.rnn(features, hidden_state)
# rnn_output: (B, T, rnn_hidden_dim)
# ============= 4. 解码动作 =============
# 展平
rnn_output = rnn_output.reshape(-1, rnn_output.shape[-1])
actions = self.action_decoder(rnn_output) # (B*T, action_dim)
# 恢复形状
actions = actions.reshape(batch_size, seq_len, self.action_dim)
return actions, hidden_state
def predict_action(self, obs_dict, hidden_state=None):
"""
预测单步动作(用于推理)
Args:
obs_dict: 单步观察
hidden_state: RNN隐藏状态
Returns:
action: (action_dim,)
hidden_state: 新的隐藏状态
"""
# 添加批次和时间维度
for key in obs_dict:
obs_dict[key] = obs_dict[key].unsqueeze(0).unsqueeze(0)
actions, hidden_state = self.forward(obs_dict, hidden_state)
# 移除批次和时间维度
action = actions[0, 0]
return action, hidden_state
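可以用随机张量对该策略做一次形状自检(示意,仅用于验证前向传播的输入输出维度):
import torch

policy = BCRNNPolicy(
    obs_shapes={
        "agentview_image": (3, 128, 128),
        "robot0_eye_in_hand_image": (3, 128, 128),
        "robot0_joint_pos": (7,),
        "robot0_gripper_qpos": (2,),
    },
    action_dim=7,
)
obs = {
    "agentview_image": torch.rand(2, 5, 3, 128, 128),          # (B=2, T=5, C, H, W)
    "robot0_eye_in_hand_image": torch.rand(2, 5, 3, 128, 128),
    "robot0_joint_pos": torch.rand(2, 5, 7),
    "robot0_gripper_qpos": torch.rand(2, 5, 2),
}
actions, hidden = policy(obs)
print(actions.shape)  # torch.Size([2, 5, 7])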
2.3.2 bc_transformer_policy.py - ResNet-Transformer
# models/policy/bc_transformer_policy.py
import torch
import torch.nn as nn
from torchvision.models import resnet18
class BCTransformerPolicy(nn.Module):
"""
ResNet + Transformer 策略
架构:
1. 图像编码器: ResNet18
2. 序列模型: Transformer Encoder
3. 动作解码器: MLP
优势:
- 更好的长期依赖建模
- 并行化训练更快
- 注意力机制可视化
"""
def __init__(
self,
obs_shapes,
action_dim=7,
hidden_dim=512,
transformer_num_layers=4,
transformer_num_heads=8,
transformer_max_seq_len=10,
**kwargs
):
"""
Args:
transformer_num_layers: Transformer层数
transformer_num_heads: 注意力头数
transformer_max_seq_len: 最大序列长度
"""
super().__init__()
self.action_dim = action_dim
self.max_seq_len = transformer_max_seq_len
self.hidden_dim = hidden_dim
# ============= 1. 图像编码器(与RNN相同)=============
self.image_encoder = resnet18(pretrained=True)
self.image_encoder.fc = nn.Identity()
self.camera_names = ['agentview_image', 'robot0_eye_in_hand_image']
n_cameras = len(self.camera_names)
self.feature_fusion = nn.Linear(512 * n_cameras, hidden_dim)
# ============= 2. 位置编码 =============
# Transformer需要位置信息
self.positional_encoding = nn.Parameter(
torch.randn(1, transformer_max_seq_len, hidden_dim)
)
# ============= 3. Transformer编码器 =============
encoder_layer = nn.TransformerEncoderLayer(
d_model=hidden_dim,
nhead=transformer_num_heads,
dim_feedforward=hidden_dim * 4,
dropout=0.1,
activation='relu',
batch_first=True,
)
self.transformer = nn.TransformerEncoder(
encoder_layer,
num_layers=transformer_num_layers,
)
# ============= 4. 动作解码器 =============
self.action_decoder = nn.Sequential(
nn.Linear(hidden_dim, 256),
nn.ReLU(),
nn.Dropout(0.1),
nn.Linear(256, 128),
nn.ReLU(),
nn.Linear(128, action_dim),
)
def forward(self, obs_dict):
"""
前向传播
Args:
obs_dict: 观察字典
Returns:
actions: (B, T, action_dim)
"""
batch_size = obs_dict['agentview_image'].shape[0]
seq_len = obs_dict['agentview_image'].shape[1]
# ============= 1. 编码图像 =============
image_features = []
for cam_name in self.camera_names:
images = obs_dict[cam_name]
images = images.reshape(-1, *images.shape[2:])
features = self.image_encoder(images)
features = features.reshape(batch_size, seq_len, -1)
image_features.append(features)
image_features = torch.cat(image_features, dim=-1)
visual_features = self.feature_fusion(image_features)
# ============= 2. 添加位置编码 =============
# 截取或填充位置编码
pos_enc = self.positional_encoding[:, :seq_len, :]
features = visual_features + pos_enc
# ============= 3. Transformer处理 =============
# 创建因果掩码(防止看到未来)
causal_mask = self.generate_square_subsequent_mask(seq_len)
causal_mask = causal_mask.to(features.device)
transformer_output = self.transformer(
features,
mask=causal_mask
) # (B, T, hidden_dim)
# ============= 4. 解码动作 =============
transformer_output = transformer_output.reshape(-1, self.hidden_dim)
actions = self.action_decoder(transformer_output)
actions = actions.reshape(batch_size, seq_len, self.action_dim)
return actions
@staticmethod
def generate_square_subsequent_mask(sz):
"""
生成因果掩码(上三角矩阵)
Args:
sz: 序列长度
Returns:
mask: (sz, sz) 掩码矩阵
"""
mask = torch.triu(torch.ones(sz, sz), diagonal=1)
mask = mask.masked_fill(mask == 1, float('-inf'))
return mask
def predict_action(self, obs_history):
"""
基于观察历史预测动作
Args:
obs_history: 观察历史列表
Returns:
action: (action_dim,)
"""
# 只保留最近的max_seq_len个观察
if len(obs_history) > self.max_seq_len:
obs_history = obs_history[-self.max_seq_len:]
# 构建批次
obs_dict = {}
for key in obs_history[0]:
obs_dict[key] = torch.stack([obs[key] for obs in obs_history])
obs_dict[key] = obs_dict[key].unsqueeze(0) # 添加批次维度
actions = self.forward(obs_dict)
# 返回最后一个时间步的动作
action = actions[0, -1]
return action
2.3.3 bc_vilt_policy.py - Vision-Language Transformer
# models/policy/bc_vilt_policy.py
import torch
import torch.nn as nn
from transformers import ViTModel, BertModel
class BCViLTPolicy(nn.Module):
"""
Vision-Language Transformer策略
架构:
1. 图像编码器: Vision Transformer (ViT)
2. 语言编码器: BERT
3. 多模态融合: Cross-attention Transformer
4. 动作解码器: MLP
特点:
- 直接处理语言指令
- 视觉-语言联合表示
- 适合语言条件任务
"""
def __init__(
self,
obs_shapes,
action_dim=7,
hidden_dim=768,
use_language=True,
**kwargs
):
"""
Args:
use_language: 是否使用语言条件
"""
super().__init__()
self.action_dim = action_dim
self.hidden_dim = hidden_dim
self.use_language = use_language
# ============= 1. Vision Transformer =============
self.vit = ViTModel.from_pretrained('google/vit-base-patch16-224')
# 处理多个相机
self.camera_names = ['agentview_image', 'robot0_eye_in_hand_image']
n_cameras = len(self.camera_names)
# 融合多视角ViT特征
self.visual_fusion = nn.Sequential(
nn.Linear(hidden_dim * n_cameras, hidden_dim),
nn.LayerNorm(hidden_dim),
)
# ============= 2. 语言编码器 =============
if self.use_language:
self.bert = BertModel.from_pretrained('bert-base-uncased')
# 冻结BERT(可选)
for param in self.bert.parameters():
param.requires_grad = False
# ============= 3. 多模态融合 =============
if self.use_language:
# Cross-attention: 视觉特征attend to语言特征
self.cross_attention = nn.MultiheadAttention(
embed_dim=hidden_dim,
num_heads=8,
batch_first=True,
)
self.fusion_norm = nn.LayerNorm(hidden_dim)
# ============= 4. 时序建模 =============
self.temporal_transformer = nn.TransformerEncoder(
nn.TransformerEncoderLayer(
d_model=hidden_dim,
nhead=8,
dim_feedforward=hidden_dim * 4,
batch_first=True,
),
num_layers=2,
)
# ============= 5. 动作解码器 =============
self.action_decoder = nn.Sequential(
nn.Linear(hidden_dim, 512),
nn.ReLU(),
nn.Dropout(0.1),
nn.Linear(512, 256),
nn.ReLU(),
nn.Linear(256, action_dim),
)
def forward(self, obs_dict, language_tokens=None):
"""
前向传播
Args:
obs_dict: 观察字典
language_tokens: 语言token (B, seq_len) [可选]
Returns:
actions: (B, T, action_dim)
"""
batch_size = obs_dict['agentview_image'].shape[0]
seq_len = obs_dict['agentview_image'].shape[1]
# ============= 1. 编码视觉 =============
visual_features = []
for cam_name in self.camera_names:
images = obs_dict[cam_name] # (B, T, 3, H, W)
# 展平
images = images.reshape(-1, *images.shape[2:]) # (B*T, 3, H, W)
# ViT编码
outputs = self.vit(pixel_values=images)
features = outputs.last_hidden_state[:, 0] # [CLS] token
# 恢复形状
features = features.reshape(batch_size, seq_len, -1)
visual_features.append(features)
# 融合多视角
visual_features = torch.cat(visual_features, dim=-1)
visual_features = self.visual_fusion(visual_features)
# (B, T, hidden_dim)
# ============= 2. 编码语言(如果使用)=============
if self.use_language and language_tokens is not None:
# BERT编码
lang_outputs = self.bert(input_ids=language_tokens)
lang_features = lang_outputs.last_hidden_state # (B, L, hidden_dim)
            # Cross-attention融合:visual attend to language
            # 将语言特征在批次维上重复 seq_len 次,与展平后的视觉query对齐
            lang_kv = lang_features.repeat_interleave(seq_len, dim=0)  # (B*T, L, hidden_dim)
            fused_features, _ = self.cross_attention(
                query=visual_features.reshape(-1, 1, self.hidden_dim),
                key=lang_kv,
                value=lang_kv,
            )
fused_features = fused_features.reshape(batch_size, seq_len, self.hidden_dim)
# 残差连接
features = self.fusion_norm(visual_features + fused_features)
else:
features = visual_features
# ============= 3. 时序建模 =============
temporal_features = self.temporal_transformer(features)
# ============= 4. 解码动作 =============
temporal_features = temporal_features.reshape(-1, self.hidden_dim)
actions = self.action_decoder(temporal_features)
actions = actions.reshape(batch_size, seq_len, self.action_dim)
return actions
2.4 📊 datasets/ - 数据加载
实现高效的数据加载管道。
# datasets/libero_dataset.py
import torch
from torch.utils.data import Dataset
import h5py
import numpy as np
class LiberoDataset(Dataset):
"""
LIBERO任务数据集
从HDF5文件加载演示数据
"""
def __init__(
self,
dataset_path,
obs_modality="rgb", # "rgb", "rgb_proprio", "depth"
seq_len=10,
frame_stack=1,
**kwargs
):
"""
Args:
dataset_path: HDF5数据集路径
obs_modality: 观察模态
seq_len: 序列长度
frame_stack: 帧堆叠数量
"""
self.dataset_path = dataset_path
self.obs_modality = obs_modality
self.seq_len = seq_len
self.frame_stack = frame_stack
# 加载数据集
self.hdf5_file = h5py.File(dataset_path, 'r')
self.demos = list(self.hdf5_file['data'].keys())
# 构建索引
self.index = self._build_index()
print(f"Loaded {len(self.demos)} demos, {len(self.index)} samples")
def _build_index(self):
"""
构建数据索引
Returns:
index: [(demo_id, start_idx, end_idx), ...]
"""
index = []
for demo_id in self.demos:
demo = self.hdf5_file['data'][demo_id]
n_steps = demo.attrs['num_samples']
# 滑动窗口采样
for i in range(n_steps - self.seq_len + 1):
index.append((demo_id, i, i + self.seq_len))
return index
def __len__(self):
return len(self.index)
def __getitem__(self, idx):
"""
获取一个样本
Returns:
sample: {
'obs': 观察字典,
'actions': (seq_len, action_dim),
'demo_id': 演示ID,
}
"""
demo_id, start_idx, end_idx = self.index[idx]
demo = self.hdf5_file['data'][demo_id]
# 加载观察
obs = {}
if 'rgb' in self.obs_modality:
# 加载RGB图像
obs['agentview_image'] = torch.from_numpy(
demo['obs/agentview_rgb'][start_idx:end_idx]
).float() / 255.0 # 归一化到[0,1]
obs['robot0_eye_in_hand_image'] = torch.from_numpy(
demo['obs/eye_in_hand_rgb'][start_idx:end_idx]
).float() / 255.0
# 转换为(T, C, H, W)
obs['agentview_image'] = obs['agentview_image'].permute(0, 3, 1, 2)
obs['robot0_eye_in_hand_image'] = obs['robot0_eye_in_hand_image'].permute(0, 3, 1, 2)
if 'proprio' in self.obs_modality:
# 加载本体感觉
obs['robot0_joint_pos'] = torch.from_numpy(
demo['obs/joint_states'][start_idx:end_idx]
).float()
obs['robot0_gripper_qpos'] = torch.from_numpy(
demo['obs/gripper_states'][start_idx:end_idx]
).float()
if 'depth' in self.obs_modality:
# 加载深度图
obs['agentview_depth'] = torch.from_numpy(
demo['obs/agentview_depth'][start_idx:end_idx]
).float()
# 加载动作
actions = torch.from_numpy(
demo['actions'][start_idx:end_idx]
).float()
sample = {
'obs': obs,
'actions': actions,
'demo_id': demo_id,
}
return sample
@property
def obs_shapes(self):
"""返回观察形状"""
sample = self[0]
return {k: v.shape[1:] for k, v in sample['obs'].items()}
@property
def action_dim(self):
"""返回动作维度"""
return self[0]['actions'].shape[-1]
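配合 PyTorch DataLoader 的典型用法(示意;注意数据集在 __init__ 中打开了HDF5文件句柄,使用多进程加载时可能需要在worker中重新打开文件):
from torch.utils.data import DataLoader

dataset = LiberoDataset(
    dataset_path="path/to/demo.hdf5",  # 假设的数据集路径
    obs_modality="rgb_proprio",
    seq_len=10,
)
loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=0)
batch = next(iter(loader))
print(batch["actions"].shape)  # torch.Size([64, 10, 7])
print({k: v.shape for k, v in batch["obs"].items()})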
🎯 扩展点
想扩展LIBERO?可以:
- 添加新任务: 编写新的BDDL文件
- 实现新算法: 继承 Sequential 基类(见下方示意)
- 设计新架构: 实现新的Policy类
- 创建新对象: 添加XML模型到assets/
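例如,一个最小的自定义算法骨架(示意:继承上文的 Sequential 基类,在行为克隆损失上加一个简单的L2正则项;类名与超参数均为假设):
from libero.lifelong.algos.base import Sequential

class L2Regularized(Sequential):
    """示意:惩罚参数偏离初始值,作为自定义算法的起点"""
    def __init__(self, n_tasks, policy, datasets, reg_lambda=1e-3, **kwargs):
        super().__init__(n_tasks, policy, datasets, **kwargs)
        self.reg_lambda = reg_lambda
        # 记录初始参数作为参照
        self.ref_params = {
            name: p.clone().detach() for name, p in policy.named_parameters()
        }

    def compute_loss(self, pred_actions, target_actions):
        loss = super().compute_loss(pred_actions, target_actions)
        reg = sum(
            (p - self.ref_params[name]).pow(2).sum()
            for name, p in self.policy.named_parameters()
        )
        return loss + self.reg_lambda * reg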
📚 进一步学习
- LIBERO官方文档: https://lifelong-robot-learning.github.io/LIBERO/
- 论文: “LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning”
- 代码仓库: https://github.com/Lifelong-Robot-Learning/LIBERO
LIBERO 代码结构快速参考
一页纸搞懂LIBERO代码组织
📂 目录树(两层)
LIBERO/
├── libero/
│ ├── libero/ ⭐ 环境与任务定义
│ └── lifelong/ ⭐ 终身学习算法
├── scripts/ 🔧 实用工具脚本
├── configs/ ⚙️ 配置文件
└── notebooks/ 📓 教程示例
🎯 核心模块速览
1️⃣ libero/libero/ - 环境系统
| 模块 | 路径 | 功能 | 文件数 |
|---|---|---|---|
| 资产库 | assets/ | 3D模型、纹理、场景 | 100+ |
| 任务定义 | bddl_files/ | 130个BDDL任务文件 | 130 |
| 基准套件 | benchmark/ | 5个任务套件类 | 5 |
| 环境实现 | envs/ | MuJoCo环境 | 50+ |
| 初始状态 | init_files/ | 固定初始化 | 130 |
| 工具函数 | utils/ | 辅助功能 | 10+ |
2️⃣ libero/lifelong/ - 学习系统
| 模块 | 路径 | 功能 | 实现数 |
|---|---|---|---|
| 主程序 | main.py | 训练入口 | 1 |
| 算法 | algos/ | 5种终身学习算法 | 5 |
| 策略网络 | models/policy/ | 3种架构 | 3 |
| 数据加载 | datasets/ | HDF5数据集 | 2 |
| 评估指标 | metrics/ | 性能度量 | 5+ |
🔑 关键类与接口
环境相关
# 1. 任务套件
from libero.libero import benchmark
benchmark_dict = benchmark.get_benchmark_dict()
task_suite = benchmark_dict["libero_spatial"]()
task = task_suite.get_task(task_id=0)
# 属性
task.name # 任务名称
task.language # 自然语言描述
task.bddl_file # BDDL文件名
task.problem_folder # 所在文件夹
# 方法
task_suite.get_num_tasks() # 任务数量
task_suite.get_task_init_states(task_id) # 固定初始状态
# 2. 环境创建
from libero.libero.envs import TASK_MAPPING
env_class = TASK_MAPPING[task.name]
env = env_class(
bddl_file_name=bddl_path,
robots="Panda",
has_renderer=False,
has_offscreen_renderer=True,
use_camera_obs=True,
camera_heights=128,
camera_widths=128,
)
# 环境方法
obs = env.reset()
obs, reward, done, info = env.step(action)
env.set_init_state(init_state)
success = env._check_success()
算法相关
# 3. 算法基类
from libero.lifelong.algos import Sequential
algo = Sequential(
n_tasks=10,
policy=policy,
datasets=datasets,
lr=1e-4,
)
# 方法
algo.learn_task(task_id, epochs, batch_size)
algo.save(save_path)
algo.load(load_path)
# 4. 策略网络
from libero.lifelong.models.policy import BCTransformerPolicy
policy = BCTransformerPolicy(
obs_shapes={'agentview_image': (3, 128, 128), ...},
action_dim=7,
hidden_dim=512,
transformer_num_layers=4,
)
# 方法
actions = policy(obs_dict) # 训练
action = policy.predict_action(obs_history) # 推理
# 5. 数据集
from libero.lifelong.datasets import LiberoDataset
dataset = LiberoDataset(
dataset_path="path/to/demo.hdf5",
obs_modality="rgb_proprio",
seq_len=10,
)
# 属性
dataset.obs_shapes # 观察形状字典
dataset.action_dim # 动作维度
len(dataset) # 样本数量
# 获取样本
sample = dataset[0]
# sample = {
# 'obs': {...},
# 'actions': (seq_len, 7),
# 'demo_id': 'demo_0'
# }
📋 五大任务套件
| 套件名 | 任务数 | 知识类型 | 用途 |
|---|---|---|---|
| LIBERO-Spatial | 10 | 空间关系 | 测试空间推理迁移 |
| LIBERO-Object | 10 | 对象概念 | 测试对象泛化 |
| LIBERO-Goal | 10 | 任务目标 | 测试目标泛化 |
| LIBERO-90 | 90 | 混合知识 | 预训练数据 |
| LIBERO-10 | 10 | 混合知识 | 长期评估 |
🧠 五种算法
| 算法 | 类名 | 防遗忘策略 | 额外内存 |
|---|---|---|---|
| Sequential | Sequential | 无(baseline) | 0 |
| Experience Replay | ER | 重放旧数据 | 中等 |
| EWC | EWC | 参数正则化 | 少量 |
| PackNet | PackNet | 参数分配 | 无 |
| Multitask | Multitask | 联合训练(oracle) | 全部 |
🏗️ 三种策略架构
| 架构 | 类名 | 编码器 | 序列模型 | 特点 |
|---|---|---|---|---|
| ResNet-RNN | BCRNNPolicy | ResNet18 | GRU | 简单快速 |
| ResNet-T | BCTransformerPolicy | ResNet18 | Transformer | 长期依赖 |
| ViT-T | BCViLTPolicy | ViT + BERT | Cross-attention | 视觉-语言融合 |
🎯 核心工作流
训练流程
1. 加载基准套件
benchmark = get_benchmark_dict()["libero_spatial"]()
2. 加载数据集
datasets = [load_dataset(task_id) for task_id in range(10)]
3. 创建策略
policy = BCTransformerPolicy(...)
4. 创建算法
algo = ER(n_tasks=10, policy=policy, datasets=datasets)
5. 终身学习循环
for task_id in range(10):
algo.learn_task(task_id, epochs=100, batch_size=64)
evaluate_all_tasks()
评估流程
1. 加载环境和初始状态
env = create_env(task)
init_states = benchmark.get_task_init_states(task_id)
2. 对每个初始状态评估
for init_state in init_states:
env.reset()
env.set_init_state(init_state)
# 执行策略直到成功或失败
while not done:
action = policy.predict_action(obs)
obs, reward, done, info = env.step(action)
3. 计算成功率
success_rate = sum(successes) / n_episodes
📊 数据流转图
┌──────────────┐
│ BDDL文件 │ 任务定义
└──────┬───────┘
│
▼
┌──────────────┐
│ Environment │ MuJoCo仿真
└──────┬───────┘
│ 人类演示
▼
┌──────────────┐
│ demo.hdf5 │ 原始数据(states+actions)
└──────┬───────┘
│ create_dataset.py
▼
┌──────────────┐
│ *_demo.hdf5 │ 训练数据(images+states+actions)
└──────┬───────┘
│
▼
┌──────────────┐
│ Dataset │ PyTorch数据集
└──────┬───────┘
│
▼
┌──────────────┐
│ Algorithm │ 终身学习
└──────┬───────┘
│
▼
┌──────────────┐
│ Policy │ 训练好的策略
└──────────────┘
🔧 配置文件速查
configs/config.yaml (主配置)
defaults:
- data: default
- policy: bc_transformer_policy
- lifelong: er
- train: default
seed: 42
benchmark_name: libero_spatial
configs/policy/bc_transformer_policy.yaml
policy_type: BCTransformerPolicy
hidden_dim: 512
transformer_num_layers: 4
transformer_num_heads: 8
transformer_max_seq_len: 10
configs/lifelong/er.yaml
algo_name: ER
memory_size: 1000 # 每任务保存样本数
lr: 1e-4
weight_decay: 1e-5
💡 常见使用场景
场景1: 运行基准实验
python libero/lifelong/main.py \
benchmark_name=libero_spatial \
policy=bc_transformer_policy \
lifelong=er \
seed=0
场景2: 评估单个任务
python libero/lifelong/evaluate.py \
--benchmark libero_spatial \
--task_id 0 \
--policy bc_transformer_policy \
--algo er \
--seed 0 \
--load_task 9
场景3: 创建自定义环境
from libero.libero.envs.base_env import LiberoBaseEnv
class MyCustomEnv(LiberoBaseEnv):
def _check_success(self):
# 自定义成功条件
return ...
def reward(self, action):
# 自定义奖励
return ...
场景4: 实现自定义算法
from libero.lifelong.algos.base import Sequential
class MyAlgorithm(Sequential):
def learn_task(self, task_id, epochs, batch_size):
# 自定义学习过程
...
📐 观察空间格式
RGB观察
{
'agentview_image': (3, 128, 128), # 第三人称视角
'robot0_eye_in_hand_image': (3, 128, 128), # 手眼相机
}
本体感觉观察
{
'robot0_joint_pos': (7,), # 关节角度
'robot0_gripper_qpos': (2,), # 夹爪位置
'robot0_eef_pos': (3,), # 末端执行器位置
'robot0_eef_quat': (4,), # 末端执行器姿态
}
动作空间
action = [dx, dy, dz, droll, dpitch, dyaw, gripper]
# dx, dy, dz: 位置增量 (范围: -1 到 1)
# droll, dpitch, dyaw: 姿态增量 (范围: -1 到 1)
# gripper: 夹爪动作 (-1=打开, 1=关闭)
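执行动作前,通常应确保数值落在合法范围内,例如(示意,policy_output 为假设的策略输出):
import numpy as np

action = np.asarray(policy_output)  # 假设为形状 (7,) 的数组
action = np.clip(action, -1.0, 1.0)  # 全部维度限制在 [-1, 1]
obs, reward, done, info = env.step(action)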
🎓 关键设计模式
1. 策略-算法分离
Policy(如何决策) + Algorithm(如何学习) = 完整系统
2. 模块化组合
Environment + Dataset + Policy + Algorithm → 实验
3. 配置驱动
YAML配置 → Hydra → 动态组合模块
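一个最小的Hydra入口示意(假设 configs/ 已复制到当前目录,脚本名 inspect_config.py 为假设):
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="./configs", config_name="config")
def main(cfg: DictConfig):
    # 打印最终组合出的配置,确认 defaults 的覆盖是否符合预期
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    main()
# 命令行覆盖示例: python inspect_config.py lifelong=ewc policy=bc_rnn_policy seed=1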
📚 扩展LIBERO的4种方式
| 扩展类型 | 文件位置 | 示例 |
|---|---|---|
| 新任务 | bddl_files/custom/ | 编写BDDL文件 |
| 新算法 | lifelong/algos/ | 继承Sequential类 |
| 新架构 | models/policy/ | 实现新Policy类 |
| 新对象 | assets/objects/ | 添加XML模型 |
🔍 调试技巧
查看环境
env.render() # 可视化
env.sim.data.qpos # 查看关节位置
env.sim.data.qvel # 查看关节速度
检查数据集
import h5py
f = h5py.File('dataset.hdf5', 'r')
print(list(f['data'].keys())) # 演示列表
print(f['data/demo_0'].keys()) # 数据项
验证策略
policy.eval()
with torch.no_grad():
action = policy(obs)
print(action.shape) # (batch, seq_len, 7)
⚠️ 常见陷阱
| 问题 | 原因 | 解决 |
|---|---|---|
| GPU内存溢出 | 批大小太大 | 减小batch_size |
| 训练不收敛 | 学习率不当 | 调整lr |
| 遗忘严重 | 算法选择 | 尝试ER或EWC |
| 加载速度慢 | HDF5读取 | 增加num_workers |
| 动作不连续 | 序列长度太短 | 增加seq_len |
📊 性能基准(参考)
Forward Transfer (前向迁移)
- Sequential: ~60%
- ER: ~65%
- EWC: ~62%
- PackNet: ~58%
- Multitask: ~75% (oracle)
Backward Transfer (遗忘)
- Sequential: -30%
- ER: -15%
- EWC: -20%
- PackNet: -10%
- Multitask: 0% (oracle)
实际结果依赖于具体配置和任务套件
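作为示意,给定成功率矩阵 R(R[i][j] 为学完第 i 个任务后在第 j 个任务上的成功率),可以按常见定义计算平均遗忘程度(具体指标定义请以论文为准):
import numpy as np

def backward_transfer(R: np.ndarray) -> float:
    """BWT = mean_i( R[T-1, i] - R[i, i] ),i < T-1;负值表示遗忘"""
    T = R.shape[0]
    return float(np.mean([R[T - 1, i] - R[i, i] for i in range(T - 1)]))

# 示例:3个任务,学完全部任务后在旧任务上成功率下降
R = np.array([
    [0.8, 0.0, 0.0],
    [0.6, 0.7, 0.0],
    [0.5, 0.6, 0.9],
])
print(f"BWT = {backward_transfer(R):.2f}")  # (0.5-0.8 + 0.6-0.7)/2 = -0.20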
LIBERO 工具脚本完全讲解
这是对LIBERO项目中10个核心工具脚本的详细讲解文档。
📋 脚本总览
| 脚本名称 | 功能 | 使用场景 | 难度 |
|---|---|---|---|
init_path.py | 路径初始化 | 所有脚本的前置依赖 | ⭐ |
check_dataset_integrity.py | 数据集完整性检查 | 验证数据集质量 | ⭐ |
get_dataset_info.py | 数据集信息查看 | 分析数据集统计 | ⭐ |
get_affordance_info.py | 可交互区域信息 | 查看对象交互能力 | ⭐ |
config_copy.py | 配置文件复制 | 初始化项目配置 | ⭐ |
create_libero_task_example.py | 任务创建示例 | 学习任务创建 | ⭐⭐ |
create_template.py | 模板生成工具 | 快速创建新组件 | ⭐⭐ |
collect_demonstration.py | 人类演示收集 | 收集单个任务数据 | ⭐⭐⭐ |
libero_100_collect_demonstrations.py | LIBERO-100数据收集 | 批量收集数据 | ⭐⭐⭐ |
create_dataset.py | 数据集生成 | 转换演示为训练数据 | ⭐⭐⭐ |
1️⃣ init_path.py - 路径初始化
📖 功能说明
这是一个简单但关键的初始化脚本,用于将LIBERO包路径添加到Python搜索路径中。
💻 完整代码
import sys
import os
path = os.path.dirname(os.path.realpath(__file__))
sys.path.insert(0, os.path.join(path, "../"))
🔑 核心逻辑
- 获取当前脚本所在目录的绝对路径
- 将上级目录添加到 sys.path 最前面
- 确保可以导入LIBERO包
💡 使用场景
# 在其他脚本开头导入
import init_path # 必须在导入libero之前
from libero.libero import benchmark
from libero.libero.envs import *
⚠️ 注意事项
- 必须在所有LIBERO导入之前执行
- 适用于在 scripts/ 目录下运行脚本的情况
- 如果已经正确安装LIBERO包,可以不需要这个脚本
2️⃣ check_dataset_integrity.py - 数据集完整性检查
📖 功能说明
自动扫描并验证LIBERO数据集的完整性,检查每个数据集是否包含正确数量的演示轨迹。
🔑 核心功能
检查项目:
- ✅ 每个数据集是否有50个演示轨迹
- ✅ 轨迹长度统计(均值和标准差)
- ✅ 动作范围检查
- ✅ 数据集版本标签验证
💻 核心代码逻辑
from pathlib import Path
import h5py
import numpy as np
from libero.libero import get_libero_path
error_datasets = []
# 递归查找所有HDF5文件
for demo_file_name in Path(get_libero_path("datasets")).rglob("*hdf5"):
    demo_file = h5py.File(demo_file_name, "r")
# 统计演示数量
count = 0
for key in demo_file["data"].keys():
if "demo" in key:
count += 1
if count == 50: # LIBERO标准:每任务50个演示
# 统计轨迹长度
traj_lengths = []
for demo_name in demo_file["data"].keys():
traj_lengths.append(
demo_file["data/{}/actions".format(demo_name)].shape[0]
)
traj_lengths = np.array(traj_lengths)
print(f"✔ dataset {demo_file_name} is intact")
print(f"Mean length: {np.mean(traj_lengths)} ± {np.std(traj_lengths)}")
# 检查版本
if demo_file["data"].attrs["tag"] == "libero-v1":
print("Version correct")
else:
print(f"❌ Error: {demo_file_name} has {count} demos (expected 50)")
error_datasets.append(demo_file_name)
# 报告错误
if len(error_datasets) > 0:
print("\n[error] The following datasets are corrupted:")
for dataset in error_datasets:
print(dataset)
📊 输出示例
[info] dataset libero_spatial/demo_0.hdf5 is intact, test passed ✔
124.5 +- 15.3
Version correct
=========================================
[info] dataset libero_object/demo_1.hdf5 is intact, test passed ✔
156.2 +- 22.7
Version correct
=========================================
💡 使用方法
# 检查所有数据集
python check_dataset_integrity.py
# 自动扫描 ~/.libero/datasets/ 目录下的所有HDF5文件
🎯 应用场景
- 下载数据集后验证完整性
- 数据收集后的质量检查
- 定期验证数据集状态
- 诊断数据集问题
3️⃣ get_dataset_info.py - 数据集信息查看
📖 功能说明
详细报告HDF5数据集的统计信息、元数据和结构,是数据集分析的利器。
🔑 主要功能
报告内容:
- 📊 轨迹统计(总数、长度分布)
- 🎯 动作范围(最大/最小值)
- 🗣️ 语言指令
- 🔖 过滤键(Filter Keys)
- 🌍 环境元数据
- 📦 数据结构详情
💻 核心代码解析
import h5py
import json
import argparse
import numpy as np
parser = argparse.ArgumentParser()
parser.add_argument("--dataset", type=str, help="path to hdf5 dataset")
parser.add_argument("--filter_key", type=str, default=None)
parser.add_argument("--verbose", action="store_true")
args = parser.parse_args()
f = h5py.File(args.dataset, "r")
# 获取演示列表
if args.filter_key is not None:
demos = sorted([elem.decode("utf-8")
for elem in np.array(f["mask/{}".format(args.filter_key)])])
else:
demos = sorted(list(f["data"].keys()))
# 统计轨迹长度和动作范围
traj_lengths = []
action_min = np.inf
action_max = -np.inf
for ep in demos:
traj_lengths.append(f["data/{}/actions".format(ep)].shape[0])
action_min = min(action_min, np.min(f["data/{}/actions".format(ep)][()]))
action_max = max(action_max, np.max(f["data/{}/actions".format(ep)][()]))
traj_lengths = np.array(traj_lengths)
# 报告统计信息
print("")
print(f"total transitions: {np.sum(traj_lengths)}")
print(f"total trajectories: {traj_lengths.shape[0]}")
print(f"traj length mean: {np.mean(traj_lengths)}")
print(f"traj length std: {np.std(traj_lengths)}")
print(f"traj length min: {np.min(traj_lengths)}")
print(f"traj length max: {np.max(traj_lengths)}")
print(f"action min: {action_min}")
print(f"action max: {action_max}")
# 获取语言指令
problem_info = json.loads(f["data"].attrs["problem_info"])
language_instruction = problem_info["language_instruction"].strip('"')
print(f"language instruction: {language_instruction}")
# 报告数据结构
print("\n==== Dataset Structure ====")
for ep in demos:
print(f"episode {ep} with {f['data/{}'.format(ep)].attrs['num_samples']} transitions")
for k in f["data/{}".format(ep)]:
if k in ["obs", "next_obs"]:
print(f" key: {k}")
for obs_k in f["data/{}/{}".format(ep, k)]:
shape = f["data/{}/{}/{}".format(ep, k, obs_k)].shape
print(f" observation key {obs_k} with shape {shape}")
elif isinstance(f["data/{}/{}".format(ep, k)], h5py.Dataset):
key_shape = f["data/{}/{}".format(ep, k)].shape
print(f" key: {k} with shape {key_shape}")
if not args.verbose:
break # 只显示第一个演示的结构
f.close()
# 验证动作范围
if (action_min < -1.0) or (action_max > 1.0):
raise Exception(f"Actions should be in [-1., 1.] but got [{action_min}, {action_max}]")
📊 输出示例
total transitions: 6247
total trajectories: 50
traj length mean: 124.94
traj length std: 15.32
traj length min: 95
traj length max: 168
action min: -0.9876
action max: 0.9912
language instruction: put the black bowl on the plate
==== Filter Keys ====
filter key train with 45 demos
filter key valid with 5 demos
==== Env Meta ====
{
"type": 1,
"env_name": "KITCHEN_SCENE1_put_the_black_bowl_on_the_plate",
"problem_name": "KITCHEN_SCENE1_put_the_black_bowl_on_the_plate",
"bddl_file": "libero/bddl_files/kitchen_scene1/...",
"env_kwargs": {...}
}
==== Dataset Structure ====
episode demo_0 with 124 transitions
key: obs
observation key agentview_rgb with shape (124, 128, 128, 3)
observation key eye_in_hand_rgb with shape (124, 128, 128, 3)
observation key gripper_states with shape (124, 2)
observation key joint_states with shape (124, 7)
key: actions with shape (124, 7)
key: rewards with shape (124,)
key: dones with shape (124,)
💡 使用方法
# 查看基本信息
python get_dataset_info.py --dataset path/to/demo.hdf5
# 查看训练集子集
python get_dataset_info.py --dataset demo.hdf5 --filter_key train
# 详细模式(显示所有演示结构)
python get_dataset_info.py --dataset demo.hdf5 --verbose
🎯 应用场景
- 分析数据集特性
- 验证数据格式
- 调试数据加载问题
- 评估数据质量
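上面的输出示例包含 "==== Filter Keys ====" 一节,但核心代码解析中省略了这部分。下面是一个推测性的最小实现(假设过滤键按 robomimic 的约定存放在 mask/ 组下):
import h5py

with h5py.File("demo.hdf5", "r") as f:  # 文件名为假设值
    if "mask" in f:
        print("==== Filter Keys ====")
        for key in f["mask"]:
            print(f"filter key {key} with {len(f['mask'][key])} demos")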
4️⃣ get_affordance_info.py - 可交互区域信息
📖 功能说明
提取所有对象的可交互区域(affordance regions)信息,显示对象支持哪些交互操作。
💻 完整代码
import init_path
from libero.libero.envs.objects import OBJECTS_DICT
from libero.libero.utils.object_utils import get_affordance_regions
# 获取所有对象的可交互区域
affordances = get_affordance_regions(OBJECTS_DICT)
print(affordances)
📊 输出示例
{
'microwave': {
'regions': ['inside', 'top'],
'actions': ['open', 'close', 'put_in']
},
'wooden_cabinet': {
'regions': ['top_region', 'bottom_region', 'door'],
'actions': ['open', 'close', 'put_on', 'put_in']
},
'plate': {
'regions': ['surface'],
'actions': ['put_on']
},
'basket': {
'regions': ['inside'],
'actions': ['put_in']
},
'flat_stove': {
'regions': ['burner_1', 'burner_2', 'burner_3', 'burner_4'],
'actions': ['turnon', 'turnoff', 'put_on']
}
}
🔑 关键概念
**Affordance(可供性/可交互性)**指对象支持的交互能力:
- 容器类(microwave, basket):可以放入物体(`put_in`)
- 表面类(plate, table):可以放置物体(`put_on`)
- 可开关类(cabinet, fridge):可以打开/关闭(`open`/`close`)
- 可控制类(stove, faucet):可以开启/关闭(`turnon`/`turnoff`)
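在代码里也可以按交互类型反向检索对象。最小示意(假设 affordances 的结构与上面的输出示例一致,函数名为自拟):
def objects_supporting(affordances, action):
    """返回支持指定交互(如 'put_in')的对象名称列表(示意实现)"""
    return [name for name, info in affordances.items()
            if action in info.get("actions", [])]

print(objects_supporting(affordances, "put_in"))
# 预期输出类似: ['microwave', 'wooden_cabinet', 'basket']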
💡 使用方法
# 直接运行
python get_affordance_info.py
# 输出会显示所有对象及其可交互区域
🎯 应用场景
- 任务设计:了解哪些对象支持哪些操作
- BDDL文件编写:确定可用的谓词和区域
- 调试任务:验证交互操作的可行性
- 文档编写:生成对象能力清单
5️⃣ config_copy.py - 配置文件复制
📖 功能说明
将LIBERO的配置文件复制到当前项目目录,方便自定义和修改配置。
💻 完整代码
import os
import shutil
from libero.libero import get_libero_path
def main():
target_path = os.path.abspath(os.path.join("./", "configs"))
print(f"Copying configs to {target_path}")
# 检查目标目录是否已存在
if os.path.exists(target_path):
response = input("The target directory already exists. Overwrite it? (y/n) ")
if response.lower() != "y":
return
shutil.rmtree(target_path)
# 复制配置文件
shutil.copytree(
os.path.join(get_libero_path("benchmark_root"), "../configs"),
target_path
)
print("✔ Configs copied successfully!")
if __name__ == "__main__":
main()
📁 复制的配置结构
configs/
├── config.yaml # 主配置文件
├── data/
│ └── default.yaml # 数据配置
├── eval/
│ └── default.yaml # 评估配置
├── lifelong/
│ ├── base.yaml # 基础算法配置
│ ├── er.yaml # 经验回放配置
│ ├── ewc.yaml # EWC配置
│ └── packnet.yaml # PackNet配置
├── policy/
│ ├── bc_rnn_policy.yaml # RNN策略配置
│ ├── bc_transformer_policy.yaml # Transformer策略配置
│ └── bc_vilt_policy.yaml # ViLT策略配置
└── train/
└── default.yaml # 训练配置
🔑 配置文件说明
主配置文件 (config.yaml)
defaults:
- data: default
- eval: default
- policy: bc_transformer_policy
- lifelong: base
- train: default
seed: 42
benchmark_name: libero_spatial
folder: ${libero.datasets}
策略配置示例 (bc_transformer_policy.yaml)
policy_type: BCTransformerPolicy
transformer_num_layers: 4
transformer_num_heads: 6
transformer_max_seq_len: 10
image_encoder:
network: ResnetEncoder
network_kwargs:
language_fusion: film
freeze: false
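复制出来的配置是普通YAML,可以直接读取检查。最小示意(假设已运行 config_copy.py 且安装了 PyYAML):
import yaml

with open("configs/config.yaml", "r") as f:
    cfg = yaml.safe_load(f)

print(cfg["seed"])            # 42
print(cfg["benchmark_name"])  # libero_spatial
# 形如 ${libero.datasets} 的插值由 Hydra/OmegaConf 在运行时解析,这里读到的是原样字符串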
💡 使用方法
# 复制配置文件
python config_copy.py
# 之后可以修改 ./configs/ 目录下的配置
🎯 应用场景
- 初始化新项目
- 自定义实验配置
- 创建不同的配置变体
- 版本控制配置文件
6️⃣ create_libero_task_example.py - 任务创建示例
📖 功能说明
演示如何通过代码创建LIBERO任务,生成BDDL文件。这是学习任务创建的最佳起点。
💻 完整代码详解
import numpy as np
from libero.libero.utils.bddl_generation_utils import (
get_xy_region_kwargs_list_from_regions_info,
)
from libero.libero.utils.mu_utils import register_mu, InitialSceneTemplates
from libero.libero.utils.task_generation_utils import (
register_task_info,
generate_bddl_from_task_info,
)
# ============= 第1步: 定义场景 =============
@register_mu(scene_type="kitchen")
class KitchenScene1(InitialSceneTemplates):
"""
定义一个厨房场景,包含:
- 1个厨房桌子(工作空间)
- 1个木制橱柜
- 1个黑碗
- 1个盘子
"""
def __init__(self):
# 定义固定装置(fixtures)
fixture_num_info = {
"kitchen_table": 1, # 桌子
"wooden_cabinet": 1, # 橱柜
}
# 定义可操作对象(objects)
object_num_info = {
"akita_black_bowl": 1, # 黑碗
"plate": 1, # 盘子
}
super().__init__(
workspace_name="kitchen_table", # 工作空间名称
fixture_num_info=fixture_num_info,
object_num_info=object_num_info,
)
def define_regions(self):
"""定义对象的初始放置区域"""
# 橱柜的放置区域(桌子后方)
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, -0.30], # 中心坐标 (x, y)
region_name="wooden_cabinet_init_region",
target_name=self.workspace_name,
region_half_len=0.01, # 区域半径(米)
yaw_rotation=(np.pi, np.pi), # 旋转角度
)
)
# 黑碗的放置区域(桌子中央)
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, 0.0],
region_name="akita_black_bowl_init_region",
target_name=self.workspace_name,
region_half_len=0.025,
)
)
# 盘子的放置区域(桌子前方)
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, 0.25],
region_name="plate_init_region",
target_name=self.workspace_name,
region_half_len=0.025,
)
)
# 生成区域参数列表
self.xy_region_kwargs_list = get_xy_region_kwargs_list_from_regions_info(
self.regions
)
@property
def init_states(self):
"""定义初始状态约束"""
states = [
("On", "akita_black_bowl_1", "kitchen_table_akita_black_bowl_init_region"),
("On", "plate_1", "kitchen_table_plate_init_region"),
("On", "wooden_cabinet_1", "kitchen_table_wooden_cabinet_init_region"),
]
return states
# ============= 第2步: 定义任务 =============
def main():
# 任务1: 打开橱柜顶层并把碗放进去
scene_name = "kitchen_scene1"
language = "open the top cabinet and put the black bowl in it"
register_task_info(
language,
scene_name=scene_name,
objects_of_interest=["wooden_cabinet_1", "akita_black_bowl_1"],
goal_states=[
("Open", "wooden_cabinet_1_top_region"), # 打开顶层
("In", "akita_black_bowl_1", "wooden_cabinet_1_top_region"), # 碗在里面
],
)
# 任务2: 打开橱柜底层并把碗放进去
language = "open the bottom cabinet and put the black bowl in it"
register_task_info(
language,
scene_name=scene_name,
objects_of_interest=["wooden_cabinet_1", "akita_black_bowl_1"],
goal_states=[
("Open", "wooden_cabinet_1_top_region"), # 打开顶层(需要先开)
("In", "akita_black_bowl_1", "wooden_cabinet_1_bottom_region"), # 碗在底层
],
)
# ============= 第3步: 生成BDDL文件 =============
bddl_file_names, failures = generate_bddl_from_task_info()
print("✔ Successfully generated BDDL files:")
for file_name in bddl_file_names:
print(f" - {file_name}")
if failures:
print("\n❌ Failed to generate:")
for failure in failures:
print(f" - {failure}")
if __name__ == "__main__":
main()
🔑 关键步骤详解
Step 1: 场景定义
@register_mu(scene_type="kitchen") # 注册为厨房场景
class KitchenScene1(InitialSceneTemplates):
# 继承自InitialSceneTemplates基类
包含三个方法:
- `__init__`: 声明场景中的对象
- `define_regions`: 定义对象放置区域
- `init_states`: 定义初始状态约束
Step 2: 任务注册
register_task_info(
language, # 自然语言指令
scene_name=scene_name, # 使用的场景
objects_of_interest=[...], # 关键对象
goal_states=[...], # 目标状态
)
Step 3: BDDL生成
bddl_file_names, failures = generate_bddl_from_task_info()
📊 生成的BDDL文件示例
(define (problem KITCHEN_SCENE1_open_the_top_cabinet_and_put_the_black_bowl_in_it)
(:domain libero)
(:language "open the top cabinet and put the black bowl in it")
(:objects
kitchen_table - kitchen_table
wooden_cabinet_1 - wooden_cabinet
akita_black_bowl_1 - akita_black_bowl
plate_1 - plate
)
(:init
(On akita_black_bowl_1 kitchen_table_akita_black_bowl_init_region)
(On plate_1 kitchen_table_plate_init_region)
(On wooden_cabinet_1 kitchen_table_wooden_cabinet_init_region)
)
(:goal
(And
(Open wooden_cabinet_1_top_region)
(In akita_black_bowl_1 wooden_cabinet_1_top_region)
)
)
)
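生成之后可以读取BDDL文件的基本信息进行核对。最小示意(文件名为假设值;BDDLUtils 的导入路径以实际仓库为准,与下文 collect_demonstration.py 中的用法一致):
import libero.libero.utils.bddl_utils as BDDLUtils

problem_info = BDDLUtils.get_problem_info("path/to/generated_task.bddl")
print(problem_info["problem_name"])
print(problem_info["language_instruction"])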
💡 使用方法
# 运行示例
python create_libero_task_example.py
# BDDL文件会生成在默认位置
# 可以在脚本中指定输出目录
🎯 学习要点
- 场景定义: 装置vs对象的区别
- 区域设置: 坐标系统和尺寸
- 状态约束: 谓词的使用方法
- 任务目标: 组合多个条件
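一个常见的后续步骤是把生成的BDDL文件实例化成环境,验证任务能正常加载。下面是一个最小示意(假设 OffScreenRenderEnv 可从 libero.libero.envs 导入,bddl_file_names 来自上面第3步的生成调用):
from libero.libero.envs import OffScreenRenderEnv

env = OffScreenRenderEnv(
    bddl_file_name=bddl_file_names[0],
    camera_heights=128,
    camera_widths=128,
)
obs = env.reset()
print(obs["agentview_image"].shape)  # 预期 (128, 128, 3)
env.close()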
7️⃣ create_template.py - 模板生成工具
📖 功能说明
交互式工具,用于快速生成LIBERO扩展组件的模板文件,大幅减少重复代码编写。
💻 核心功能
import os
import xml.etree.ElementTree as ET
from libero.libero import get_libero_path
from libero.libero.envs.textures import get_texture_file_list
def create_problem_class_from_file(class_name):
"""从模板创建问题类文件"""
template_source_file = os.path.join(
get_libero_path("benchmark_root"),
"../../templates/problem_class_template.py"
)
# 读取模板
with open(template_source_file, "r") as f:
lines = f.readlines()
# 替换占位符
new_lines = []
for line in lines:
if "YOUR_CLASS_NAME" in line:
line = line.replace("YOUR_CLASS_NAME", class_name)
new_lines.append(line)
# 保存新文件
output_file = f"{class_name.lower()}.py"
with open(output_file, "w") as f:
f.writelines(new_lines)
print(f"✔ Created class {class_name} at: {output_file}")
def create_scene_xml_file(scene_name):
"""交互式创建场景XML文件"""
template_source_file = os.path.join(
get_libero_path("benchmark_root"),
"../../templates/scene_template.xml"
)
# 解析XML模板
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
tree = ET.parse(template_source_file, parser)
root = tree.getroot()
# 定义需要选择纹理的元素
basic_elements = [
("Floor", "texplane"),
("Table", "tex-table"),
("Table legs", "tex-table-legs"),
("Walls", "tex-wall"),
]
# 为每个元素选择纹理
for (element_name, texture_name) in basic_elements:
element = root.findall('.//texture[@name="{}"]'.format(texture_name))[0]
        # 确定纹理类型(避免用内置名 type 作为变量名)
        texture_type = None
        if "floor" in element_name.lower():
            texture_type = "floor"
        elif "table" in element_name.lower():
            texture_type = "table"
        elif "wall" in element_name.lower():
            texture_type = "wall"
        # 获取可用纹理列表
        texture_list = get_texture_file_list(type=texture_type, texture_path="../")
        # 显示选项(循环变量重命名,避免遮蔽外层的 texture_name)
        for i, (tex_name, tex_file_path) in enumerate(texture_list):
            print(f"[{i}]: {tex_name}")
# 用户选择
choice = int(input(f"Select texture for {element_name}: "))
element.set("file", texture_list[choice][1])
# 保存XML文件
output_file = f"{scene_name}.xml"
tree.write(output_file, encoding="utf-8")
print(f"✔ Created scene {scene_name} at: {output_file}")
print("\n[Notice] Texture paths assume the XML will be in libero/libero/assets/scenes/")
def main():
# 显示选项
choices = [
"problem_class", # 问题类Python文件
"scene", # 场景XML文件
"object", # 对象定义(保留)
"arena", # 竞技场定义(保留)
]
print("=== LIBERO Template Generator ===")
for i, choice in enumerate(choices):
print(f"[{i}]: {choice}")
choice = int(input("Select which file to create: "))
if choices[choice] == "problem_class":
class_name = input("Specify the class name: ")
assert " " not in class_name, "Space not allowed in naming"
# 标准化类名(首字母大写)
parts = class_name.split("_")
class_name = "_".join([part.lower().capitalize() for part in parts])
create_problem_class_from_file(class_name)
elif choices[choice] == "scene":
scene_name = input("Specify the scene name: ")
scene_name = scene_name.lower()
assert " " not in scene_name, "Space not allowed in naming"
create_scene_xml_file(scene_name)
🎬 使用示例
创建问题类
$ python create_template.py
=== LIBERO Template Generator ===
[0]: problem_class
[1]: scene
[2]: object
[3]: arena
Select which file to create: 0
Specify the class name: my_custom_task
✔ Created class My_Custom_Task at: my_custom_task.py
生成的模板内容:
from libero.libero.envs.base_env import LiberoBaseEnv
import numpy as np
class My_Custom_Task(LiberoBaseEnv):
"""
YOUR TASK DESCRIPTION HERE
"""
def __init__(self, bddl_file_name, **kwargs):
super().__init__(bddl_file_name, **kwargs)
def _check_success(self):
"""
Check if the task is successfully completed.
Returns:
bool: True if task is complete
"""
# TODO: Implement success condition
return self._check_goal_satisfied()
# TODO: Add custom methods if needed
创建场景XML
$ python create_template.py
Select which file to create: 1
Specify the scene name: my_kitchen
Select texture for Floor:
[0]: light_wood
[1]: dark_wood
[2]: tile_gray
[3]: tile_white
Select: 2
Select texture for Table:
[0]: wood_light
[1]: wood_dark
[2]: metal
Select: 0
Select texture for Walls:
[0]: white
[1]: beige
[2]: gray
Select: 1
✔ Created scene my_kitchen at: my_kitchen.xml
📋 模板类型说明
| 模板类型 | 生成内容 | 用途 |
|---|---|---|
| problem_class | Python类文件 | 自定义任务逻辑 |
| scene | XML场景文件 | 定义3D环境 |
| object | 对象定义 | 新对象类型 |
| arena | 竞技场 | 自定义工作空间 |
🎯 应用场景
- 快速原型开发
- 减少样板代码
- 标准化组件结构
- 新手学习模板
8️⃣ collect_demonstration.py - 人类演示收集
📖 功能说明
使用SpaceMouse或键盘收集人类操作演示数据,是创建高质量训练数据的核心工具。
🔑 核心流程
初始化环境 → 人类操控 → 记录轨迹 → 检查成功 → 保存HDF5
💻 核心代码详解
1. 轨迹收集函数
def collect_human_trajectory(
env, device, arm, env_configuration, problem_info, remove_directory=[]
):
"""
使用输入设备收集演示轨迹
Args:
env: MuJoCo环境
device: 输入设备(SpaceMouse或键盘)
arm: 控制的机械臂('right' 或 'left')
env_configuration: 环境配置
problem_info: 任务信息
remove_directory: 要移除的目录列表
Returns:
saving: 是否保存该轨迹
"""
# 重置环境
reset_success = False
while not reset_success:
try:
env.reset()
reset_success = True
except:
continue
env.render()
# 任务完成计数器
task_completion_hold_count = -1
device.start_control()
saving = True
count = 0
while True:
count += 1
        # 获取动作(bimanual配置用第0个机器人;否则按 arm 选择:left→索引1,right→索引0)
        active_robot = env.robots[0] if env_configuration == "bimanual" else env.robots[int(arm == "left")]
action, grasp = input2action(
device=device,
robot=active_robot,
active_arm=arm,
env_configuration=env_configuration,
)
# 按ESC退出不保存
if action is None:
print("Break - Not saving")
saving = False
break
# 执行动作
env.step(action)
env.render()
# 检查任务完成
if task_completion_hold_count == 0:
break
# 状态机:检查连续10步成功
if env._check_success():
if task_completion_hold_count > 0:
task_completion_hold_count -= 1
else:
task_completion_hold_count = 10 # 首次成功
else:
task_completion_hold_count = -1
print(f"Episode length: {count}")
if not saving:
remove_directory.append(env.ep_directory.split("/")[-1])
env.close()
return saving
2. HDF5数据保存函数
def gather_demonstrations_as_hdf5(
directory, out_dir, env_info, args, remove_directory=[]
):
"""
将演示数据收集为HDF5格式
HDF5结构:
data/
├── [attributes] 元数据
├── demo_1/
│ ├── [attribute] model_file
│ ├── states (dataset)
│ └── actions (dataset)
├── demo_2/
└── ...
"""
hdf5_path = os.path.join(out_dir, "demo.hdf5")
f = h5py.File(hdf5_path, "w")
grp = f.create_group("data")
num_eps = 0
env_name = None
# 遍历所有轨迹文件
for ep_directory in os.listdir(directory):
if ep_directory in remove_directory:
continue
state_paths = os.path.join(directory, ep_directory, "state_*.npz")
states = []
actions = []
# 加载状态和动作
for state_file in sorted(glob(state_paths)):
dic = np.load(state_file, allow_pickle=True)
env_name = str(dic["env"])
states.extend(dic["states"])
for ai in dic["action_infos"]:
actions.append(ai["actions"])
if len(states) == 0:
continue
        # 删除最后一个状态,使状态与动作一一对齐(数据收集机制使状态比动作多一个)
        del states[-1]
assert len(states) == len(actions)
num_eps += 1
ep_data_grp = grp.create_group(f"demo_{num_eps}")
        # 保存模型XML(注意不要复用变量名 f,避免遮蔽外层的HDF5文件句柄)
        xml_path = os.path.join(directory, ep_directory, "model.xml")
        with open(xml_path, "r") as xml_file:
            xml_str = xml_file.read()
        ep_data_grp.attrs["model_file"] = xml_str
# 保存数据集
ep_data_grp.create_dataset("states", data=np.array(states))
ep_data_grp.create_dataset("actions", data=np.array(actions))
# 保存元数据
now = datetime.datetime.now()
grp.attrs["date"] = f"{now.month}-{now.day}-{now.year}"
grp.attrs["time"] = f"{now.hour}:{now.minute}:{now.second}"
grp.attrs["repository_version"] = suite.__version__
grp.attrs["env"] = env_name
grp.attrs["env_info"] = env_info
grp.attrs["problem_info"] = json.dumps(problem_info)
grp.attrs["bddl_file_name"] = args.bddl_file
grp.attrs["bddl_file_content"] = str(open(args.bddl_file, "r", encoding="utf-8").read())
f.close()
3. 主程序
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--directory", type=str, default="demonstration_data")
parser.add_argument("--robots", nargs="+", type=str, default="Panda")
parser.add_argument("--controller", type=str, default="OSC_POSE")
parser.add_argument("--device", type=str, default="spacemouse")
parser.add_argument("--pos-sensitivity", type=float, default=1.5)
parser.add_argument("--rot-sensitivity", type=float, default=1.0)
parser.add_argument("--num-demonstration", type=int, default=50)
parser.add_argument("--bddl-file", type=str, required=True)
parser.add_argument("--vendor-id", type=int, default=9583)
parser.add_argument("--product-id", type=int, default=50734)
args = parser.parse_args()
# 加载控制器配置
controller_config = load_controller_config(default_controller=args.controller)
# 获取任务信息
problem_info = BDDLUtils.get_problem_info(args.bddl_file)
problem_name = problem_info["problem_name"]
language_instruction = problem_info["language_instruction"]
# 创建环境
env = TASK_MAPPING[problem_name](
bddl_file_name=args.bddl_file,
robots=args.robots,
controller_configs=controller_config,
has_renderer=True,
has_offscreen_renderer=False,
ignore_done=True,
use_camera_obs=False,
control_freq=20,
    )
    # 序列化环境配置,供 gather_demonstrations_as_hdf5 写入元数据(示意:实际脚本保存完整的环境kwargs)
    env_info = json.dumps({"robots": args.robots, "controller": args.controller})
# 包装环境
env = VisualizationWrapper(env)
tmp_directory = f"demonstration_data/tmp/{problem_name}/{time.time()}"
env = DataCollectionWrapper(env, tmp_directory)
# 初始化设备
if args.device == "spacemouse":
device = SpaceMouse(
args.vendor_id,
args.product_id,
pos_sensitivity=args.pos_sensitivity,
rot_sensitivity=args.rot_sensitivity,
)
elif args.device == "keyboard":
device = Keyboard(
pos_sensitivity=args.pos_sensitivity,
rot_sensitivity=args.rot_sensitivity
)
# 创建输出目录
new_dir = os.path.join(args.directory, f"{problem_name}_{time.time()}")
os.makedirs(new_dir)
# 收集演示
remove_directory = []
i = 0
while i < args.num_demonstration:
print(f"Collecting demonstration {i+1}/{args.num_demonstration}")
saving = collect_human_trajectory(
env, device, args.arm, args.config, problem_info, remove_directory
)
if saving:
gather_demonstrations_as_hdf5(
tmp_directory, new_dir, env_info, args, remove_directory
)
i += 1
🎮 设备控制说明
SpaceMouse 3D鼠标
移动鼠标:控制末端执行器位置 (x, y, z)
旋转鼠标:控制末端执行器姿态 (roll, pitch, yaw)
按钮:开关夹爪
按ESC:取消当前演示(不保存)
键盘控制
W/S: 前进/后退 (x方向)
A/D: 左移/右移 (y方向)
Q/E: 上升/下降 (z方向)
J/L: 旋转roll
I/K: 旋转pitch
U/O: 旋转yaw
Space: 切换夹爪
ESC: 取消当前演示
💡 使用方法
使用SpaceMouse收集
python collect_demonstration.py \
--bddl-file path/to/task.bddl \
--device spacemouse \
--num-demonstration 50 \
--pos-sensitivity 1.5 \
--rot-sensitivity 1.0 \
--vendor-id 9583 \
--product-id 50734
使用键盘收集
python collect_demonstration.py \
--bddl-file path/to/task.bddl \
--device keyboard \
--num-demonstration 50 \
--pos-sensitivity 2.0 \
--rot-sensitivity 1.5
📊 收集流程
1. 加载任务BDDL文件
2. 创建环境和输入设备
3. 循环收集N个演示:
a. 重置环境
b. 人类操控完成任务
c. 检查任务成功(连续10步)
d. 保存轨迹到HDF5
4. 完成后关闭环境
⚠️ 注意事项
- 任务成功判定:需要连续10步保持成功状态
- 取消演示:按ESC取消,该轨迹不会保存
- 设备连接:确保SpaceMouse正确连接和配置
- 存储空间:50个演示约需要500MB-1GB空间
- 手腕疲劳:收集50个演示需要30-60分钟
🎯 最佳实践
- 先用几个演示熟悉操作
- 保持操作平滑自然
- 任务完成后保持姿势1-2秒
- 失败的尝试及时按ESC取消
- 定期休息避免疲劳
9️⃣ libero_100_collect_demonstrations.py - LIBERO-100批量收集
📖 功能说明
专门用于LIBERO-100数据集收集的脚本,带有彩色提示和批量处理功能。
🔑 与collect_demonstration.py的区别
| 特性 | collect_demonstration.py | libero_100_collect_demonstrations.py |
|---|---|---|
| 用途 | 单任务数据收集 | 批量任务数据收集 |
| 界面 | 基础 | 彩色终端提示 |
| 任务标识 | 无 | 支持task-id参数 |
| 交互性 | 直接开始 | 等待用户确认 |
| 灵敏度默认值 | 1.5/1.0 | 1.5/1.5 |
💻 关键改进
from termcolor import colored
# 彩色任务提示
text = colored(language_instruction, "red", attrs=["bold"])
print("Goal of the following task: ", text)
instruction = colored(
"Hit any key to proceed to data collection ...",
"green",
attrs=["reverse", "blink"]
)
print(instruction)
input() # 等待用户准备
# 支持任务ID
parser.add_argument("--task-id", type=int)
💡 使用方法
批量收集LIBERO-100
# 创建批处理脚本
for task_id in {0..99}; do
python libero_100_collect_demonstrations.py \
--bddl-file libero/bddl_files/libero_90/task_${task_id}.bddl \
--task-id $task_id \
--device spacemouse \
--num-demonstration 50
done
单个任务收集
python libero_100_collect_demonstrations.py \
--bddl-file libero/bddl_files/libero_10/task_0.bddl \
--task-id 0 \
--device spacemouse \
--num-demonstration 50 \
--pos-sensitivity 1.5 \
--rot-sensitivity 1.5
🎯 应用场景
- LIBERO-100数据集创建
- 批量任务数据收集
- 需要任务标识的项目
- 团队协作数据收集
🔟 create_dataset.py - 数据集生成
📖 功能说明
将收集的原始演示数据(demo.hdf5)转换为训练就绪的数据集,包括提取图像观察、状态信息和奖励信号。
🔑 核心功能
数据转换流程:
原始演示(demo.hdf5) → 回放验证 → 提取观察 → 添加奖励 → 训练数据集(_demo.hdf5)
💻 核心代码详解
import argparse
import os
import json
import h5py
import numpy as np
from pathlib import Path
import robosuite.utils.transform_utils as T
from robosuite import macros  # IMAGE_CONVENTION 所在模块
from libero.libero.envs import *
from libero.libero import get_libero_path
from libero.libero.utils import utils as libero_utils  # update_env_kwargs(导入路径以实际仓库为准)
def main():
parser = argparse.ArgumentParser()
parser.add_argument("--demo-file", default="demo.hdf5")
parser.add_argument("--use-actions", action="store_true")
parser.add_argument("--use-camera-obs", action="store_true")
parser.add_argument("--dataset-path", type=str, default="datasets/")
parser.add_argument("--dataset-name", type=str, default="training_set")
parser.add_argument("--no-proprio", action="store_true")
parser.add_argument("--use-depth", action="store_true")
args = parser.parse_args()
# ============= 第1步: 读取原始演示文件 =============
f = h5py.File(args.demo_file, "r")
env_name = f["data"].attrs["env"]
env_kwargs = json.loads(f["data"].attrs["env_info"])
problem_info = json.loads(f["data"].attrs["problem_info"])
problem_name = problem_info["problem_name"]
demos = list(f["data"].keys())
bddl_file_name = f["data"].attrs["bddl_file_name"]
# ============= 第2步: 确定输出路径 =============
    # 将 bddl_files/ 之后的相对路径映射到 datasets/ 下,并把扩展名换成 _demo.hdf5
    hdf5_path = os.path.join(
        get_libero_path("datasets"),
        bddl_file_name.split("bddl_files/")[-1].replace(".bddl", "_demo.hdf5"),
    )
output_parent_dir = Path(hdf5_path).parent
output_parent_dir.mkdir(parents=True, exist_ok=True)
# ============= 第3步: 创建输出HDF5文件 =============
h5py_f = h5py.File(hdf5_path, "w")
grp = h5py_f.create_group("data")
# 保存元数据
grp.attrs["env_name"] = env_name
grp.attrs["problem_info"] = f["data"].attrs["problem_info"]
grp.attrs["macros_image_convention"] = macros.IMAGE_CONVENTION
# 更新环境配置
libero_utils.update_env_kwargs(
env_kwargs,
bddl_file_name=bddl_file_name,
has_renderer=not args.use_camera_obs,
has_offscreen_renderer=args.use_camera_obs,
ignore_done=True,
use_camera_obs=args.use_camera_obs,
camera_depths=args.use_depth,
camera_names=["robot0_eye_in_hand", "agentview"],
camera_heights=128,
camera_widths=128,
control_freq=20,
)
grp.attrs["bddl_file_name"] = bddl_file_name
grp.attrs["bddl_file_content"] = open(bddl_file_name, "r").read()
# ============= 第4步: 创建环境 =============
env = TASK_MAPPING[problem_name](**env_kwargs)
total_len = 0
cap_index = 5 # 跳过前5帧(力传感器不稳定)
# ============= 第5步: 处理每个演示 =============
for (i, ep) in enumerate(demos):
print(f"Processing episode {i+1}/{len(demos)}...")
# 读取模型XML和状态
model_xml = f["data/{}".format(ep)].attrs["model_file"]
states = f["data/{}/states".format(ep)][()]
actions = np.array(f["data/{}/actions".format(ep)][()])
# 重置环境
reset_success = False
while not reset_success:
try:
env.reset()
reset_success = True
except:
continue
# 从XML和状态恢复环境
env.reset_from_xml_string(model_xml)
env.sim.reset()
env.sim.set_state_from_flattened(states[0])
env.sim.forward()
# 初始化数据容器
ee_states = []
gripper_states = []
joint_states = []
robot_states = []
agentview_images = []
eye_in_hand_images = []
agentview_depths = []
eye_in_hand_depths = []
valid_index = []
# ============= 第6步: 回放并记录 =============
for j, action in enumerate(actions):
obs, reward, done, info = env.step(action)
# 验证回放准确性
if j < len(actions) - 1:
state_playback = env.sim.get_state().flatten()
err = np.linalg.norm(states[j + 1] - state_playback)
if err > 0.01:
print(f"[warning] playback diverged by {err:.2f}")
# 跳过前几帧(力传感器稳定期)
if j < cap_index:
continue
valid_index.append(j)
# 记录本体感觉信息
if not args.no_proprio:
if "robot0_gripper_qpos" in obs:
gripper_states.append(obs["robot0_gripper_qpos"])
joint_states.append(obs["robot0_joint_pos"])
ee_states.append(
np.hstack((
obs["robot0_eef_pos"],
T.quat2axisangle(obs["robot0_eef_quat"]),
))
)
robot_states.append(env.get_robot_state_vector(obs))
# 记录视觉观察
if args.use_camera_obs:
agentview_images.append(obs["agentview_image"])
eye_in_hand_images.append(obs["robot0_eye_in_hand_image"])
if args.use_depth:
agentview_depths.append(obs["agentview_depth"])
eye_in_hand_depths.append(obs["robot0_eye_in_hand_depth"])
else:
env.render()
# ============= 第7步: 保存处理后的数据 =============
states = states[valid_index]
actions = actions[valid_index]
# 创建奖励和完成标志
dones = np.zeros(len(actions)).astype(np.uint8)
dones[-1] = 1 # 最后一步标记为完成
rewards = np.zeros(len(actions)).astype(np.uint8)
rewards[-1] = 1 # 最后一步给予奖励
# 创建演示组
ep_data_grp = grp.create_group(f"demo_{i}")
obs_grp = ep_data_grp.create_group("obs")
# 保存本体感觉
if not args.no_proprio:
obs_grp.create_dataset("gripper_states", data=np.stack(gripper_states, axis=0))
obs_grp.create_dataset("joint_states", data=np.stack(joint_states, axis=0))
obs_grp.create_dataset("ee_states", data=np.stack(ee_states, axis=0))
obs_grp.create_dataset("ee_pos", data=np.stack(ee_states, axis=0)[:, :3])
obs_grp.create_dataset("ee_ori", data=np.stack(ee_states, axis=0)[:, 3:])
# 保存图像
obs_grp.create_dataset("agentview_rgb", data=np.stack(agentview_images, axis=0))
obs_grp.create_dataset("eye_in_hand_rgb", data=np.stack(eye_in_hand_images, axis=0))
if args.use_depth:
obs_grp.create_dataset("agentview_depth", data=np.stack(agentview_depths, axis=0))
obs_grp.create_dataset("eye_in_hand_depth", data=np.stack(eye_in_hand_depths, axis=0))
# 保存其他数据
ep_data_grp.create_dataset("actions", data=actions)
ep_data_grp.create_dataset("states", data=states)
ep_data_grp.create_dataset("robot_states", data=np.stack(robot_states, axis=0))
ep_data_grp.create_dataset("rewards", data=rewards)
ep_data_grp.create_dataset("dones", data=dones)
ep_data_grp.attrs["num_samples"] = len(agentview_images)
ep_data_grp.attrs["model_file"] = model_xml
ep_data_grp.attrs["init_state"] = states[0]
total_len += len(agentview_images)
# ============= 第8步: 保存全局属性 =============
grp.attrs["num_demos"] = len(demos)
grp.attrs["total"] = total_len
env.close()
h5py_f.close()
f.close()
print("\n✔ Dataset created successfully!")
print(f"Saved to: {hdf5_path}")
if __name__ == "__main__":
main()
📊 数据集结构对比
输入:原始演示 (demo.hdf5)
data/
├── demo_1/
│ ├── states (array) # MuJoCo状态
│ └── actions (array) # 动作序列
└── demo_2/
└── ...
输出:训练数据集 (*_demo.hdf5)
data/
├── demo_0/
│ ├── obs/
│ │ ├── agentview_rgb (128, 128, 3) # 第三人称视角
│ │ ├── eye_in_hand_rgb (128, 128, 3) # 手眼相机
│ │ ├── gripper_states (2,) # 夹爪状态
│ │ ├── joint_states (7,) # 关节角度
│ │ ├── ee_pos (3,) # 末端执行器位置
│ │ └── ee_ori (3,) # 末端执行器姿态
│ ├── actions (7,) # 动作
│ ├── rewards (1,) # 奖励(稀疏)
│ ├── dones (1,) # 完成标志
│ └── [attributes]
│ ├── num_samples
│ ├── model_file
│ └── init_state
└── demo_1/
└── ...
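转换完成后,可以按上述结构直接用 h5py 抽查数据。最小示意(文件路径为假设值):
import h5py

with h5py.File("datasets/libero_spatial/some_task_demo.hdf5", "r") as f:
    demo = f["data/demo_0"]
    print(demo["obs/agentview_rgb"].shape)  # (T, 128, 128, 3)
    print(demo["actions"].shape)            # (T, 7)
    print(demo["rewards"][-1])              # 1(稀疏奖励,仅最后一步为1)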
💡 使用方法
基础使用(不带视觉)
python create_dataset.py \
--demo-file path/to/demo.hdf5
包含视觉观察
python create_dataset.py \
--demo-file path/to/demo.hdf5 \
--use-camera-obs
包含深度图
python create_dataset.py \
--demo-file path/to/demo.hdf5 \
--use-camera-obs \
--use-depth
无本体感觉(仅视觉)
python create_dataset.py \
--demo-file path/to/demo.hdf5 \
--use-camera-obs \
--no-proprio
🔑 关键特性
1. 回放验证
# 确保回放准确性
state_playback = env.sim.get_state().flatten()
err = np.linalg.norm(states[j + 1] - state_playback)
if err > 0.01:
print(f"[warning] playback diverged by {err:.2f}")
2. 跳过不稳定帧
cap_index = 5 # 跳过前5帧
# 力传感器在开始时不稳定
if j < cap_index:
continue
3. 稀疏奖励
# LIBERO使用稀疏奖励
rewards = np.zeros(len(actions))
rewards[-1] = 1 # 只有完成时给奖励
4. 图像分辨率
camera_heights=128,
camera_widths=128,
# 默认128x128,可以修改
⚠️ 注意事项
- 内存使用:处理图像时需要大量内存
- 处理时间:50个演示约需5-10分钟
- 磁盘空间:
- 无图像:~100MB
- 有图像:~1-2GB
- 有深度:~2-3GB
- 回放误差:少量误差(<0.01)是正常的
🎯 典型工作流
# 1. 收集演示
python collect_demonstration.py --bddl-file task.bddl --num-demonstration 50
# 2. 转换为训练数据
python create_dataset.py --demo-file demo.hdf5 --use-camera-obs
# 3. 验证数据集
python check_dataset_integrity.py
# 4. 查看信息
python get_dataset_info.py --dataset path/to/task_demo.hdf5
# 5. 开始训练
python libero/lifelong/main.py ...
🎓 总结与最佳实践
📋 脚本使用建议
| 阶段 | 推荐脚本 | 目的 |
|---|---|---|
| 初始化 | config_copy.py, init_path.py | 设置项目环境 |
| 探索 | get_affordance_info.py | 了解对象能力 |
| 设计 | create_libero_task_example.py, create_template.py | 创建新任务 |
| 数据收集 | collect_demonstration.py | 收集训练数据 |
| 数据处理 | create_dataset.py | 生成训练集 |
| 验证 | check_dataset_integrity.py, get_dataset_info.py | 质量检查 |
🔧 常见工作流
工作流1:快速原型
# 1. 复制配置
python config_copy.py
# 2. 创建任务
python create_libero_task_example.py
# 3. 收集少量演示测试
python collect_demonstration.py --num-demonstration 5
工作流2:完整数据集创建
# 1. 收集50个演示
python collect_demonstration.py --num-demonstration 50
# 2. 转换为训练数据
python create_dataset.py --use-camera-obs
# 3. 检查完整性
python check_dataset_integrity.py
# 4. 查看统计
python get_dataset_info.py --dataset path/to/dataset.hdf5
工作流3:LIBERO-100扩展
# 批量创建100个任务的数据集
for i in {0..99}; do
python libero_100_collect_demonstrations.py \
--task-id $i \
--bddl-file libero_90/task_$i.bddl \
--num-demonstration 50
done
⚠️ 常见问题解决
| 问题 | 可能原因 | 解决方案 |
|---|---|---|
| 导入错误 | 路径问题 | 使用init_path.py |
| 回放误差大 | 随机性/物理不确定 | 检查随机种子 |
| 图像倒置 | 图像约定 | 检查macros.IMAGE_CONVENTION |
| 内存溢出 | 处理大数据集 | 分批处理或减少演示数 |
| 设备连接失败 | SpaceMouse配置 | 检查vendor-id和product-id |
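针对上表中"图像倒置"一类问题,可以先确认 robosuite 的图像约定;create_dataset.py 已把该值写入数据集的 macros_image_convention 属性。最小示意(数据集路径为假设值):
from robosuite import macros
import h5py

print(macros.IMAGE_CONVENTION)  # "opengl" 或 "opencv",决定渲染图像是否上下翻转

with h5py.File("datasets/libero_spatial/some_task_demo.hdf5", "r") as f:
    print(f["data"].attrs["macros_image_convention"])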
🚀 性能优化建议
- 并行处理:多个任务可以并行收集
- 批处理:使用脚本批量处理多个数据集
- 预检查:收集前验证BDDL文件
- 增量保存:大数据集分批保存
- 定期备份:避免数据丢失
LIBERO 工具脚本完全讲解
这是对LIBERO项目中10个核心工具脚本的详细讲解文档。
📋 脚本总览
| 脚本名称 | 功能 | 使用场景 | 难度 |
|---|---|---|---|
init_path.py | 路径初始化 | 所有脚本的前置依赖 | ⭐ |
check_dataset_integrity.py | 数据集完整性检查 | 验证数据集质量 | ⭐ |
get_dataset_info.py | 数据集信息查看 | 分析数据集统计 | ⭐ |
get_affordance_info.py | 可交互区域信息 | 查看对象交互能力 | ⭐ |
config_copy.py | 配置文件复制 | 初始化项目配置 | ⭐ |
create_libero_task_example.py | 任务创建示例 | 学习任务创建 | ⭐⭐ |
create_template.py | 模板生成工具 | 快速创建新组件 | ⭐⭐ |
collect_demonstration.py | 人类演示收集 | 收集单个任务数据 | ⭐⭐⭐ |
libero_100_collect_demonstrations.py | LIBERO-100数据收集 | 批量收集数据 | ⭐⭐⭐ |
create_dataset.py | 数据集生成 | 转换演示为训练数据 | ⭐⭐⭐ |
1️⃣ init_path.py - 路径初始化
📖 功能说明
这是一个简单但关键的初始化脚本,用于将LIBERO包路径添加到Python搜索路径中。
💻 完整代码
import sys
import os
path = os.path.dirname(os.path.realpath(__file__))
sys.path.insert(0, os.path.join(path, "../"))
🔑 核心逻辑
- 获取当前脚本所在目录的绝对路径
- 将上级目录添加到
sys.path最前面 - 确保可以导入LIBERO包
💡 使用场景
# 在其他脚本开头导入
import init_path # 必须在导入libero之前
from libero.libero import benchmark
from libero.libero.envs import *
⚠️ 注意事项
- 必须在所有LIBERO导入之前执行
- 适用于在
scripts/目录下运行脚本的情况 - 如果已经正确安装LIBERO包,可以不需要这个脚本
2️⃣ check_dataset_integrity.py - 数据集完整性检查
📖 功能说明
自动扫描并验证LIBERO数据集的完整性,检查每个数据集是否包含正确数量的演示轨迹。
🔑 核心功能
检查项目:
- ✅ 每个数据集是否有50个演示轨迹
- ✅ 轨迹长度统计(均值和标准差)
- ✅ 动作范围检查
- ✅ 数据集版本标签验证
💻 核心代码逻辑
from pathlib import Path
import h5py
import numpy as np
from libero.libero import get_libero_path
error_datasets = []
# 递归查找所有HDF5文件
for demo_file_name in Path(get_libero_path("datasets")).rglob("*hdf5"):
demo_file = h5py.File(demo_file_name)
# 统计演示数量
count = 0
for key in demo_file["data"].keys():
if "demo" in key:
count += 1
if count == 50: # LIBERO标准:每任务50个演示
# 统计轨迹长度
traj_lengths = []
for demo_name in demo_file["data"].keys():
traj_lengths.append(
demo_file["data/{}/actions".format(demo_name)].shape[0]
)
traj_lengths = np.array(traj_lengths)
print(f"✔ dataset {demo_file_name} is intact")
print(f"Mean length: {np.mean(traj_lengths)} ± {np.std(traj_lengths)}")
# 检查版本
if demo_file["data"].attrs["tag"] == "libero-v1":
print("Version correct")
else:
print(f"❌ Error: {demo_file_name} has {count} demos (expected 50)")
error_datasets.append(demo_file_name)
# 报告错误
if len(error_datasets) > 0:
print("\n[error] The following datasets are corrupted:")
for dataset in error_datasets:
print(dataset)
📊 输出示例
[info] dataset libero_spatial/demo_0.hdf5 is intact, test passed ✔
124.5 +- 15.3
Version correct
=========================================
[info] dataset libero_object/demo_1.hdf5 is intact, test passed ✔
156.2 +- 22.7
Version correct
=========================================
💡 使用方法
# 检查所有数据集
python check_dataset_integrity.py
# 自动扫描 ~/.libero/datasets/ 目录下的所有HDF5文件
🎯 应用场景
- 下载数据集后验证完整性
- 数据收集后的质量检查
- 定期验证数据集状态
- 诊断数据集问题
3️⃣ get_dataset_info.py - 数据集信息查看
📖 功能说明
详细报告HDF5数据集的统计信息、元数据和结构,是数据集分析的利器。
🔑 主要功能
报告内容:
- 📊 轨迹统计(总数、长度分布)
- 🎯 动作范围(最大/最小值)
- 🗣️ 语言指令
- 🔖 过滤键(Filter Keys)
- 🌍 环境元数据
- 📦 数据结构详情
💻 核心代码解析
import h5py
import json
import argparse
import numpy as np
parser = argparse.ArgumentParser()
parser.add_argument("--dataset", type=str, help="path to hdf5 dataset")
parser.add_argument("--filter_key", type=str, default=None)
parser.add_argument("--verbose", action="store_true")
args = parser.parse_args()
f = h5py.File(args.dataset, "r")
# 获取演示列表
if args.filter_key is not None:
demos = sorted([elem.decode("utf-8")
for elem in np.array(f["mask/{}".format(args.filter_key)])])
else:
demos = sorted(list(f["data"].keys()))
# 统计轨迹长度和动作范围
traj_lengths = []
action_min = np.inf
action_max = -np.inf
for ep in demos:
traj_lengths.append(f["data/{}/actions".format(ep)].shape[0])
action_min = min(action_min, np.min(f["data/{}/actions".format(ep)][()]))
action_max = max(action_max, np.max(f["data/{}/actions".format(ep)][()]))
traj_lengths = np.array(traj_lengths)
# 报告统计信息
print("")
print(f"total transitions: {np.sum(traj_lengths)}")
print(f"total trajectories: {traj_lengths.shape[0]}")
print(f"traj length mean: {np.mean(traj_lengths)}")
print(f"traj length std: {np.std(traj_lengths)}")
print(f"traj length min: {np.min(traj_lengths)}")
print(f"traj length max: {np.max(traj_lengths)}")
print(f"action min: {action_min}")
print(f"action max: {action_max}")
# 获取语言指令
problem_info = json.loads(f["data"].attrs["problem_info"])
language_instruction = problem_info["language_instruction"]
print(f"language instruction: {language_instruction.strip('\"')}")
# 报告数据结构
print("\n==== Dataset Structure ====")
for ep in demos:
print(f"episode {ep} with {f['data/{}'.format(ep)].attrs['num_samples']} transitions")
for k in f["data/{}".format(ep)]:
if k in ["obs", "next_obs"]:
print(f" key: {k}")
for obs_k in f["data/{}/{}".format(ep, k)]:
shape = f["data/{}/{}/{}".format(ep, k, obs_k)].shape
print(f" observation key {obs_k} with shape {shape}")
elif isinstance(f["data/{}/{}".format(ep, k)], h5py.Dataset):
key_shape = f["data/{}/{}".format(ep, k)].shape
print(f" key: {k} with shape {key_shape}")
if not args.verbose:
break # 只显示第一个演示的结构
f.close()
# 验证动作范围
if (action_min < -1.0) or (action_max > 1.0):
raise Exception(f"Actions should be in [-1., 1.] but got [{action_min}, {action_max}]")
📊 输出示例
total transitions: 6247
total trajectories: 50
traj length mean: 124.94
traj length std: 15.32
traj length min: 95
traj length max: 168
action min: -0.9876
action max: 0.9912
language instruction: put the black bowl on the plate
==== Filter Keys ====
filter key train with 45 demos
filter key valid with 5 demos
==== Env Meta ====
{
"type": 1,
"env_name": "KITCHEN_SCENE1_put_the_black_bowl_on_the_plate",
"problem_name": "KITCHEN_SCENE1_put_the_black_bowl_on_the_plate",
"bddl_file": "libero/bddl_files/kitchen_scene1/...",
"env_kwargs": {...}
}
==== Dataset Structure ====
episode demo_0 with 124 transitions
key: obs
observation key agentview_rgb with shape (124, 128, 128, 3)
observation key eye_in_hand_rgb with shape (124, 128, 128, 3)
observation key gripper_states with shape (124, 2)
observation key joint_states with shape (124, 7)
key: actions with shape (124, 7)
key: rewards with shape (124,)
key: dones with shape (124,)
💡 使用方法
# 查看基本信息
python get_dataset_info.py --dataset path/to/demo.hdf5
# 查看训练集子集
python get_dataset_info.py --dataset demo.hdf5 --filter_key train
# 详细模式(显示所有演示结构)
python get_dataset_info.py --dataset demo.hdf5 --verbose
🎯 应用场景
- 分析数据集特性
- 验证数据格式
- 调试数据加载问题
- 评估数据质量
4️⃣ get_affordance_info.py - 可交互区域信息
📖 功能说明
提取所有对象的可交互区域(affordance regions)信息,显示对象支持哪些交互操作。
💻 完整代码
import init_path
from libero.libero.envs.objects import OBJECTS_DICT
from libero.libero.utils.object_utils import get_affordance_regions
# 获取所有对象的可交互区域
affordances = get_affordance_regions(OBJECTS_DICT)
print(affordances)
📊 输出示例
{
'microwave': {
'regions': ['inside', 'top'],
'actions': ['open', 'close', 'put_in']
},
'wooden_cabinet': {
'regions': ['top_region', 'bottom_region', 'door'],
'actions': ['open', 'close', 'put_on', 'put_in']
},
'plate': {
'regions': ['surface'],
'actions': ['put_on']
},
'basket': {
'regions': ['inside'],
'actions': ['put_in']
},
'flat_stove': {
'regions': ['burner_1', 'burner_2', 'burner_3', 'burner_4'],
'actions': ['turnon', 'turnoff', 'put_on']
}
}
🔑 关键概念
**Affordance(可供性/可交互性)**指对象支持的交互能力:
- 容器类(microwave, basket):可以放入物体(
put_in) - 表面类(plate, table):可以放置物体(
put_on) - 可开关类(cabinet, fridge):可以打开/关闭(
open/close) - 可控制类(stove, faucet):可以开启/关闭(
turnon/turnoff)
💡 使用方法
# 直接运行
python get_affordance_info.py
# 输出会显示所有对象及其可交互区域
🎯 应用场景
- 任务设计:了解哪些对象支持哪些操作
- BDDL文件编写:确定可用的谓词和区域
- 调试任务:验证交互操作的可行性
- 文档编写:生成对象能力清单
5️⃣ config_copy.py - 配置文件复制
📖 功能说明
将LIBERO的配置文件复制到当前项目目录,方便自定义和修改配置。
💻 完整代码
import os
import shutil
from libero.libero import get_libero_path
def main():
target_path = os.path.abspath(os.path.join("./", "configs"))
print(f"Copying configs to {target_path}")
# 检查目标目录是否已存在
if os.path.exists(target_path):
response = input("The target directory already exists. Overwrite it? (y/n) ")
if response.lower() != "y":
return
shutil.rmtree(target_path)
# 复制配置文件
shutil.copytree(
os.path.join(get_libero_path("benchmark_root"), "../configs"),
target_path
)
print("✔ Configs copied successfully!")
if __name__ == "__main__":
main()
📁 复制的配置结构
configs/
├── config.yaml # 主配置文件
├── data/
│ └── default.yaml # 数据配置
├── eval/
│ └── default.yaml # 评估配置
├── lifelong/
│ ├── base.yaml # 基础算法配置
│ ├── er.yaml # 经验回放配置
│ ├── ewc.yaml # EWC配置
│ └── packnet.yaml # PackNet配置
├── policy/
│ ├── bc_rnn_policy.yaml # RNN策略配置
│ ├── bc_transformer_policy.yaml # Transformer策略配置
│ └── bc_vilt_policy.yaml # ViLT策略配置
└── train/
└── default.yaml # 训练配置
🔑 配置文件说明
主配置文件 (config.yaml)
defaults:
- data: default
- eval: default
- policy: bc_transformer_policy
- lifelong: base
- train: default
seed: 42
benchmark_name: libero_spatial
folder: ${libero.datasets}
策略配置示例 (bc_transformer_policy.yaml)
policy_type: BCTransformerPolicy
transformer_num_layers: 4
transformer_num_heads: 6
transformer_max_seq_len: 10
image_encoder:
network: ResnetEncoder
network_kwargs:
language_fusion: film
freeze: false
💡 使用方法
# 复制配置文件
python config_copy.py
# 之后可以修改 ./configs/ 目录下的配置
🎯 应用场景
- 初始化新项目
- 自定义实验配置
- 创建不同的配置变体
- 版本控制配置文件
6️⃣ create_libero_task_example.py - 任务创建示例
📖 功能说明
演示如何通过代码创建LIBERO任务,生成BDDL文件。这是学习任务创建的最佳起点。
💻 完整代码详解
import numpy as np
from libero.libero.utils.bddl_generation_utils import (
get_xy_region_kwargs_list_from_regions_info,
)
from libero.libero.utils.mu_utils import register_mu, InitialSceneTemplates
from libero.libero.utils.task_generation_utils import (
register_task_info,
generate_bddl_from_task_info,
)
# ============= 第1步: 定义场景 =============
@register_mu(scene_type="kitchen")
class KitchenScene1(InitialSceneTemplates):
"""
定义一个厨房场景,包含:
- 1个厨房桌子(工作空间)
- 1个木制橱柜
- 1个黑碗
- 1个盘子
"""
def __init__(self):
# 定义固定装置(fixtures)
fixture_num_info = {
"kitchen_table": 1, # 桌子
"wooden_cabinet": 1, # 橱柜
}
# 定义可操作对象(objects)
object_num_info = {
"akita_black_bowl": 1, # 黑碗
"plate": 1, # 盘子
}
super().__init__(
workspace_name="kitchen_table", # 工作空间名称
fixture_num_info=fixture_num_info,
object_num_info=object_num_info,
)
def define_regions(self):
"""定义对象的初始放置区域"""
# 橱柜的放置区域(桌子后方)
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, -0.30], # 中心坐标 (x, y)
region_name="wooden_cabinet_init_region",
target_name=self.workspace_name,
region_half_len=0.01, # 区域半径(米)
yaw_rotation=(np.pi, np.pi), # 旋转角度
)
)
# 黑碗的放置区域(桌子中央)
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, 0.0],
region_name="akita_black_bowl_init_region",
target_name=self.workspace_name,
region_half_len=0.025,
)
)
# 盘子的放置区域(桌子前方)
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, 0.25],
region_name="plate_init_region",
target_name=self.workspace_name,
region_half_len=0.025,
)
)
# 生成区域参数列表
self.xy_region_kwargs_list = get_xy_region_kwargs_list_from_regions_info(
self.regions
)
@property
def init_states(self):
"""定义初始状态约束"""
states = [
("On", "akita_black_bowl_1", "kitchen_table_akita_black_bowl_init_region"),
("On", "plate_1", "kitchen_table_plate_init_region"),
("On", "wooden_cabinet_1", "kitchen_table_wooden_cabinet_init_region"),
]
return states
# ============= 第2步: 定义任务 =============
def main():
# 任务1: 打开橱柜顶层并把碗放进去
scene_name = "kitchen_scene1"
language = "open the top cabinet and put the black bowl in it"
register_task_info(
language,
scene_name=scene_name,
objects_of_interest=["wooden_cabinet_1", "akita_black_bowl_1"],
goal_states=[
("Open", "wooden_cabinet_1_top_region"), # 打开顶层
("In", "akita_black_bowl_1", "wooden_cabinet_1_top_region"), # 碗在里面
],
)
# 任务2: 打开橱柜底层并把碗放进去
language = "open the bottom cabinet and put the black bowl in it"
register_task_info(
language,
scene_name=scene_name,
objects_of_interest=["wooden_cabinet_1", "akita_black_bowl_1"],
goal_states=[
("Open", "wooden_cabinet_1_top_region"), # 打开顶层(需要先开)
("In", "akita_black_bowl_1", "wooden_cabinet_1_bottom_region"), # 碗在底层
],
)
# ============= 第3步: 生成BDDL文件 =============
bddl_file_names, failures = generate_bddl_from_task_info()
print("✔ Successfully generated BDDL files:")
for file_name in bddl_file_names:
print(f" - {file_name}")
if failures:
print("\n❌ Failed to generate:")
for failure in failures:
print(f" - {failure}")
if __name__ == "__main__":
main()
🔑 关键步骤详解
Step 1: 场景定义
@register_mu(scene_type="kitchen") # 注册为厨房场景
class KitchenScene1(InitialSceneTemplates):
# 继承自InitialSceneTemplates基类
包含三个方法:
__init__: 声明场景中的对象define_regions: 定义对象放置区域init_states: 定义初始状态约束
Step 2: 任务注册
register_task_info(
language, # 自然语言指令
scene_name=scene_name, # 使用的场景
objects_of_interest=[...], # 关键对象
goal_states=[...], # 目标状态
)
Step 3: BDDL生成
bddl_file_names, failures = generate_bddl_from_task_info()
📊 生成的BDDL文件示例
(define (problem KITCHEN_SCENE1_open_the_top_cabinet_and_put_the_black_bowl_in_it)
(:domain libero)
(:language "open the top cabinet and put the black bowl in it")
(:objects
kitchen_table - kitchen_table
wooden_cabinet_1 - wooden_cabinet
akita_black_bowl_1 - akita_black_bowl
plate_1 - plate
)
(:init
(On akita_black_bowl_1 kitchen_table_akita_black_bowl_init_region)
(On plate_1 kitchen_table_plate_init_region)
(On wooden_cabinet_1 kitchen_table_wooden_cabinet_init_region)
)
(:goal
(And
(Open wooden_cabinet_1_top_region)
(In akita_black_bowl_1 wooden_cabinet_1_top_region)
)
)
)
💡 使用方法
# 运行示例
python create_libero_task_example.py
# BDDL文件会生成在默认位置
# 可以在脚本中指定输出目录
🎯 学习要点
- 场景定义: 装置vs对象的区别
- 区域设置: 坐标系统和尺寸
- 状态约束: 谓词的使用方法
- 任务目标: 组合多个条件
7️⃣ create_template.py - 模板生成工具
📖 功能说明
交互式工具,用于快速生成LIBERO扩展组件的模板文件,大幅减少重复代码编写。
💻 核心功能
import os
import xml.etree.ElementTree as ET
from libero.libero import get_libero_path
from libero.libero.envs.textures import get_texture_file_list
def create_problem_class_from_file(class_name):
"""从模板创建问题类文件"""
template_source_file = os.path.join(
get_libero_path("benchmark_root"),
"../../templates/problem_class_template.py"
)
# 读取模板
with open(template_source_file, "r") as f:
lines = f.readlines()
# 替换占位符
new_lines = []
for line in lines:
if "YOUR_CLASS_NAME" in line:
line = line.replace("YOUR_CLASS_NAME", class_name)
new_lines.append(line)
# 保存新文件
output_file = f"{class_name.lower()}.py"
with open(output_file, "w") as f:
f.writelines(new_lines)
print(f"✔ Created class {class_name} at: {output_file}")
def create_scene_xml_file(scene_name):
"""交互式创建场景XML文件"""
template_source_file = os.path.join(
get_libero_path("benchmark_root"),
"../../templates/scene_template.xml"
)
# 解析XML模板
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
tree = ET.parse(template_source_file, parser)
root = tree.getroot()
# 定义需要选择纹理的元素
basic_elements = [
("Floor", "texplane"),
("Table", "tex-table"),
("Table legs", "tex-table-legs"),
("Walls", "tex-wall"),
]
# 为每个元素选择纹理
for (element_name, texture_name) in basic_elements:
element = root.findall('.//texture[@name="{}"]'.format(texture_name))[0]
# 确定纹理类型
type = None
if "floor" in element_name.lower():
type = "floor"
elif "table" in element_name.lower():
type = "table"
elif "wall" in element_name.lower():
type = "wall"
# 获取可用纹理列表
texture_list = get_texture_file_list(type=type, texture_path="../")
# 显示选项
for i, (texture_name, texture_file_path) in enumerate(texture_list):
print(f"[{i}]: {texture_name}")
# 用户选择
choice = int(input(f"Select texture for {element_name}: "))
element.set("file", texture_list[choice][1])
# 保存XML文件
output_file = f"{scene_name}.xml"
tree.write(output_file, encoding="utf-8")
print(f"✔ Created scene {scene_name} at: {output_file}")
print("\n[Notice] Texture paths assume the XML will be in libero/libero/assets/scenes/")
def main():
# 显示选项
choices = [
"problem_class", # 问题类Python文件
"scene", # 场景XML文件
"object", # 对象定义(保留)
"arena", # 竞技场定义(保留)
]
print("=== LIBERO Template Generator ===")
for i, choice in enumerate(choices):
print(f"[{i}]: {choice}")
choice = int(input("Select which file to create: "))
if choices[choice] == "problem_class":
class_name = input("Specify the class name: ")
assert " " not in class_name, "Space not allowed in naming"
# 标准化类名(首字母大写)
parts = class_name.split("_")
class_name = "_".join([part.lower().capitalize() for part in parts])
create_problem_class_from_file(class_name)
elif choices[choice] == "scene":
scene_name = input("Specify the scene name: ")
scene_name = scene_name.lower()
assert " " not in scene_name, "Space not allowed in naming"
create_scene_xml_file(scene_name)
🎬 使用示例
创建问题类
$ python create_template.py
=== LIBERO Template Generator ===
[0]: problem_class
[1]: scene
[2]: object
[3]: arena
Select which file to create: 0
Specify the class name: my_custom_task
✔ Created class My_Custom_Task at: my_custom_task.py
生成的模板内容:
from libero.libero.envs.base_env import LiberoBaseEnv
import numpy as np
class My_Custom_Task(LiberoBaseEnv):
"""
YOUR TASK DESCRIPTION HERE
"""
def __init__(self, bddl_file_name, **kwargs):
super().__init__(bddl_file_name, **kwargs)
def _check_success(self):
"""
Check if the task is successfully completed.
Returns:
bool: True if task is complete
"""
# TODO: Implement success condition
return self._check_goal_satisfied()
# TODO: Add custom methods if needed
创建场景XML
$ python create_template.py
Select which file to create: 1
Specify the scene name: my_kitchen
Select texture for Floor:
[0]: light_wood
[1]: dark_wood
[2]: tile_gray
[3]: tile_white
Select: 2
Select texture for Table:
[0]: wood_light
[1]: wood_dark
[2]: metal
Select: 0
Select texture for Walls:
[0]: white
[1]: beige
[2]: gray
Select: 1
✔ Created scene my_kitchen at: my_kitchen.xml
📋 模板类型说明
| 模板类型 | 生成内容 | 用途 |
|---|---|---|
| problem_class | Python类文件 | 自定义任务逻辑 |
| scene | XML场景文件 | 定义3D环境 |
| object | 对象定义 | 新对象类型 |
| arena | 竞技场 | 自定义工作空间 |
🎯 应用场景
- 快速原型开发
- 减少样板代码
- 标准化组件结构
- 新手学习模板
8️⃣ collect_demonstration.py - 人类演示收集
📖 功能说明
使用SpaceMouse或键盘收集人类操作演示数据,是创建高质量训练数据的核心工具。
🔑 核心流程
初始化环境 → 人类操控 → 记录轨迹 → 检查成功 → 保存HDF5
💻 核心代码详解
1. 轨迹收集函数
def collect_human_trajectory(
env, device, arm, env_configuration, problem_info, remove_directory=[]
):
"""
使用输入设备收集演示轨迹
Args:
env: MuJoCo环境
device: 输入设备(SpaceMouse或键盘)
arm: 控制的机械臂('right' 或 'left')
env_configuration: 环境配置
problem_info: 任务信息
remove_directory: 要移除的目录列表
Returns:
saving: 是否保存该轨迹
"""
# 重置环境
reset_success = False
while not reset_success:
try:
env.reset()
reset_success = True
except:
continue
env.render()
# 任务完成计数器
task_completion_hold_count = -1
device.start_control()
saving = True
count = 0
while True:
count += 1
# 获取动作
active_robot = env.robots[0] if env_configuration == "bimanual" else env.robots[arm == "left"]
action, grasp = input2action(
device=device,
robot=active_robot,
active_arm=arm,
env_configuration=env_configuration,
)
# 按ESC退出不保存
if action is None:
print("Break - Not saving")
saving = False
break
# 执行动作
env.step(action)
env.render()
# 检查任务完成
if task_completion_hold_count == 0:
break
# 状态机:检查连续10步成功
if env._check_success():
if task_completion_hold_count > 0:
task_completion_hold_count -= 1
else:
task_completion_hold_count = 10 # 首次成功
else:
task_completion_hold_count = -1
print(f"Episode length: {count}")
if not saving:
remove_directory.append(env.ep_directory.split("/")[-1])
env.close()
return saving
2. HDF5数据保存函数
def gather_demonstrations_as_hdf5(
directory, out_dir, env_info, args, remove_directory=[]
):
"""
将演示数据收集为HDF5格式
HDF5结构:
data/
├── [attributes] 元数据
├── demo_1/
│ ├── [attribute] model_file
│ ├── states (dataset)
│ └── actions (dataset)
├── demo_2/
└── ...
"""
hdf5_path = os.path.join(out_dir, "demo.hdf5")
f = h5py.File(hdf5_path, "w")
grp = f.create_group("data")
num_eps = 0
env_name = None
# 遍历所有轨迹文件
for ep_directory in os.listdir(directory):
if ep_directory in remove_directory:
continue
state_paths = os.path.join(directory, ep_directory, "state_*.npz")
states = []
actions = []
# 加载状态和动作
for state_file in sorted(glob(state_paths)):
dic = np.load(state_file, allow_pickle=True)
env_name = str(dic["env"])
states.extend(dic["states"])
for ai in dic["action_infos"]:
actions.append(ai["actions"])
if len(states) == 0:
continue
# 删除第一个动作和最后一个状态(数据收集机制导致)
del states[-1]
assert len(states) == len(actions)
num_eps += 1
ep_data_grp = grp.create_group(f"demo_{num_eps}")
# 保存模型XML
xml_path = os.path.join(directory, ep_directory, "model.xml")
with open(xml_path, "r") as f:
xml_str = f.read()
ep_data_grp.attrs["model_file"] = xml_str
# 保存数据集
ep_data_grp.create_dataset("states", data=np.array(states))
ep_data_grp.create_dataset("actions", data=np.array(actions))
# 保存元数据
now = datetime.datetime.now()
grp.attrs["date"] = f"{now.month}-{now.day}-{now.year}"
grp.attrs["time"] = f"{now.hour}:{now.minute}:{now.second}"
grp.attrs["repository_version"] = suite.__version__
grp.attrs["env"] = env_name
grp.attrs["env_info"] = env_info
grp.attrs["problem_info"] = json.dumps(problem_info)
grp.attrs["bddl_file_name"] = args.bddl_file
grp.attrs["bddl_file_content"] = str(open(args.bddl_file, "r", encoding="utf-8").read())
f.close()
3. 主程序
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--directory", type=str, default="demonstration_data")
parser.add_argument("--robots", nargs="+", type=str, default="Panda")
parser.add_argument("--controller", type=str, default="OSC_POSE")
parser.add_argument("--device", type=str, default="spacemouse")
parser.add_argument("--pos-sensitivity", type=float, default=1.5)
parser.add_argument("--rot-sensitivity", type=float, default=1.0)
parser.add_argument("--num-demonstration", type=int, default=50)
parser.add_argument("--bddl-file", type=str, required=True)
parser.add_argument("--vendor-id", type=int, default=9583)
parser.add_argument("--product-id", type=int, default=50734)
args = parser.parse_args()
# 加载控制器配置
controller_config = load_controller_config(default_controller=args.controller)
# 获取任务信息
problem_info = BDDLUtils.get_problem_info(args.bddl_file)
problem_name = problem_info["problem_name"]
language_instruction = problem_info["language_instruction"]
# 创建环境
env = TASK_MAPPING[problem_name](
bddl_file_name=args.bddl_file,
robots=args.robots,
controller_configs=controller_config,
has_renderer=True,
has_offscreen_renderer=False,
ignore_done=True,
use_camera_obs=False,
control_freq=20,
)
# 包装环境
env = VisualizationWrapper(env)
tmp_directory = f"demonstration_data/tmp/{problem_name}/{time.time()}"
env = DataCollectionWrapper(env, tmp_directory)
# 初始化设备
if args.device == "spacemouse":
device = SpaceMouse(
args.vendor_id,
args.product_id,
pos_sensitivity=args.pos_sensitivity,
rot_sensitivity=args.rot_sensitivity,
)
elif args.device == "keyboard":
device = Keyboard(
pos_sensitivity=args.pos_sensitivity,
rot_sensitivity=args.rot_sensitivity
)
# 创建输出目录
new_dir = os.path.join(args.directory, f"{problem_name}_{time.time()}")
os.makedirs(new_dir)
# 收集演示
remove_directory = []
i = 0
while i < args.num_demonstration:
print(f"Collecting demonstration {i+1}/{args.num_demonstration}")
saving = collect_human_trajectory(
env, device, args.arm, args.config, problem_info, remove_directory
)
if saving:
gather_demonstrations_as_hdf5(
tmp_directory, new_dir, env_info, args, remove_directory
)
i += 1
🎮 设备控制说明
SpaceMouse 3D鼠标
移动鼠标:控制末端执行器位置 (x, y, z)
旋转鼠标:控制末端执行器姿态 (roll, pitch, yaw)
按钮:开关夹爪
按ESC:取消当前演示(不保存)
键盘控制
W/S: 前进/后退 (x方向)
A/D: 左移/右移 (y方向)
Q/E: 上升/下降 (z方向)
J/L: 旋转roll
I/K: 旋转pitch
U/O: 旋转yaw
Space: 切换夹爪
ESC: 取消当前演示
💡 使用方法
使用SpaceMouse收集
python collect_demonstration.py \
--bddl-file path/to/task.bddl \
--device spacemouse \
--num-demonstration 50 \
--pos-sensitivity 1.5 \
--rot-sensitivity 1.0 \
--vendor-id 9583 \
--product-id 50734
使用键盘收集
python collect_demonstration.py \
--bddl-file path/to/task.bddl \
--device keyboard \
--num-demonstration 50 \
--pos-sensitivity 2.0 \
--rot-sensitivity 1.5
📊 收集流程
1. 加载任务BDDL文件
2. 创建环境和输入设备
3. 循环收集N个演示:
a. 重置环境
b. 人类操控完成任务
c. 检查任务成功(连续10步)
d. 保存轨迹到HDF5
4. 完成后关闭环境
⚠️ 注意事项
- 任务成功判定:需要连续10步保持成功状态
- 取消演示:按ESC取消,该轨迹不会保存
- 设备连接:确保SpaceMouse正确连接和配置
- 存储空间:50个演示约需要500MB-1GB空间
- 手腕疲劳:收集50个演示需要30-60分钟
🎯 最佳实践
- 先用几个演示熟悉操作
- 保持操作平滑自然
- 任务完成后保持姿势1-2秒
- 失败的尝试及时按ESC取消
- 定期休息避免疲劳
9️⃣ libero_100_collect_demonstrations.py - LIBERO-100批量收集
📖 功能说明
专门用于LIBERO-100数据集收集的脚本,带有彩色提示和批量处理功能。
🔑 与collect_demonstration.py的区别
| 特性 | collect_demonstration.py | libero_100_collect_demonstrations.py |
|---|---|---|
| 用途 | 单任务数据收集 | 批量任务数据收集 |
| 界面 | 基础 | 彩色终端提示 |
| 任务标识 | 无 | 支持task-id参数 |
| 交互性 | 直接开始 | 等待用户确认 |
| 灵敏度默认值 | 1.5/1.0 | 1.5/1.5 |
💻 关键改进
from termcolor import colored
# 彩色任务提示
text = colored(language_instruction, "red", attrs=["bold"])
print("Goal of the following task: ", text)
instruction = colored(
"Hit any key to proceed to data collection ...",
"green",
attrs=["reverse", "blink"]
)
print(instruction)
input() # 等待用户准备
# 支持任务ID
parser.add_argument("--task-id", type=int)
💡 使用方法
批量收集LIBERO-100
# 创建批处理脚本
for task_id in {0..99}; do
python libero_100_collect_demonstrations.py \
--bddl-file libero/bddl_files/libero_90/task_${task_id}.bddl \
--task-id $task_id \
--device spacemouse \
--num-demonstration 50
done
单个任务收集
python libero_100_collect_demonstrations.py \
--bddl-file libero/bddl_files/libero_10/task_0.bddl \
--task-id 0 \
--device spacemouse \
--num-demonstration 50 \
--pos-sensitivity 1.5 \
--rot-sensitivity 1.5
🎯 应用场景
- LIBERO-100数据集创建
- 批量任务数据收集
- 需要任务标识的项目
- 团队协作数据收集
🔟 create_dataset.py - 数据集生成
📖 功能说明
将收集的原始演示数据(demo.hdf5)转换为训练就绪的数据集,包括提取图像观察、状态信息和奖励信号。
🔑 核心功能
数据转换流程:
原始演示(demo.hdf5) → 回放验证 → 提取观察 → 添加奖励 → 训练数据集(_demo.hdf5)
💻 核心代码详解
import argparse
import h5py
import numpy as np
import json
from pathlib import Path
import robosuite.utils.transform_utils as T
from libero.libero.envs import *
from libero.libero import get_libero_path
def main():
parser = argparse.ArgumentParser()
parser.add_argument("--demo-file", default="demo.hdf5")
parser.add_argument("--use-actions", action="store_true")
parser.add_argument("--use-camera-obs", action="store_true")
parser.add_argument("--dataset-path", type=str, default="datasets/")
parser.add_argument("--dataset-name", type=str, default="training_set")
parser.add_argument("--no-proprio", action="store_true")
parser.add_argument("--use-depth", action="store_true")
args = parser.parse_args()
# ============= 第1步: 读取原始演示文件 =============
f = h5py.File(args.demo_file, "r")
env_name = f["data"].attrs["env"]
env_kwargs = json.loads(f["data"].attrs["env_info"])
problem_info = json.loads(f["data"].attrs["problem_info"])
problem_name = problem_info["problem_name"]
demos = list(f["data"].keys())
bddl_file_name = f["data"].attrs["bddl_file_name"]
# ============= 第2步: 确定输出路径 =============
bddl_file_dir = os.path.dirname(bddl_file_name)
hdf5_path = os.path.join(
get_libero_path("datasets"),
bddl_file_dir.split("bddl_files/")[-1].replace(".bddl", "_demo.hdf5")
)
output_parent_dir = Path(hdf5_path).parent
output_parent_dir.mkdir(parents=True, exist_ok=True)
# ============= 第3步: 创建输出HDF5文件 =============
h5py_f = h5py.File(hdf5_path, "w")
grp = h5py_f.create_group("data")
# 保存元数据
grp.attrs["env_name"] = env_name
grp.attrs["problem_info"] = f["data"].attrs["problem_info"]
grp.attrs["macros_image_convention"] = macros.IMAGE_CONVENTION
# 更新环境配置
libero_utils.update_env_kwargs(
env_kwargs,
bddl_file_name=bddl_file_name,
has_renderer=not args.use_camera_obs,
has_offscreen_renderer=args.use_camera_obs,
ignore_done=True,
use_camera_obs=args.use_camera_obs,
camera_depths=args.use_depth,
camera_names=["robot0_eye_in_hand", "agentview"],
camera_heights=128,
camera_widths=128,
control_freq=20,
)
grp.attrs["bddl_file_name"] = bddl_file_name
grp.attrs["bddl_file_content"] = open(bddl_file_name, "r").read()
# ============= 第4步: 创建环境 =============
env = TASK_MAPPING[problem_name](**env_kwargs)
total_len = 0
cap_index = 5 # 跳过前5帧(力传感器不稳定)
# ============= 第5步: 处理每个演示 =============
for (i, ep) in enumerate(demos):
print(f"Processing episode {i+1}/{len(demos)}...")
# 读取模型XML和状态
model_xml = f["data/{}".format(ep)].attrs["model_file"]
states = f["data/{}/states".format(ep)][()]
actions = np.array(f["data/{}/actions".format(ep)][()])
# 重置环境
reset_success = False
while not reset_success:
try:
env.reset()
reset_success = True
except:
continue
# 从XML和状态恢复环境
env.reset_from_xml_string(model_xml)
env.sim.reset()
env.sim.set_state_from_flattened(states[0])
env.sim.forward()
# 初始化数据容器
ee_states = []
gripper_states = []
joint_states = []
robot_states = []
agentview_images = []
eye_in_hand_images = []
agentview_depths = []
eye_in_hand_depths = []
valid_index = []
# ============= 第6步: 回放并记录 =============
for j, action in enumerate(actions):
obs, reward, done, info = env.step(action)
# 验证回放准确性
if j < len(actions) - 1:
state_playback = env.sim.get_state().flatten()
err = np.linalg.norm(states[j + 1] - state_playback)
if err > 0.01:
print(f"[warning] playback diverged by {err:.2f}")
# 跳过前几帧(力传感器稳定期)
if j < cap_index:
continue
valid_index.append(j)
# 记录本体感觉信息
if not args.no_proprio:
if "robot0_gripper_qpos" in obs:
gripper_states.append(obs["robot0_gripper_qpos"])
joint_states.append(obs["robot0_joint_pos"])
ee_states.append(
np.hstack((
obs["robot0_eef_pos"],
T.quat2axisangle(obs["robot0_eef_quat"]),
))
)
robot_states.append(env.get_robot_state_vector(obs))
# 记录视觉观察
if args.use_camera_obs:
agentview_images.append(obs["agentview_image"])
eye_in_hand_images.append(obs["robot0_eye_in_hand_image"])
if args.use_depth:
agentview_depths.append(obs["agentview_depth"])
eye_in_hand_depths.append(obs["robot0_eye_in_hand_depth"])
else:
env.render()
# ============= 第7步: 保存处理后的数据 =============
states = states[valid_index]
actions = actions[valid_index]
# 创建奖励和完成标志
dones = np.zeros(len(actions)).astype(np.uint8)
dones[-1] = 1 # 最后一步标记为完成
rewards = np.zeros(len(actions)).astype(np.uint8)
rewards[-1] = 1 # 最后一步给予奖励
# 创建演示组
ep_data_grp = grp.create_group(f"demo_{i}")
obs_grp = ep_data_grp.create_group("obs")
# 保存本体感觉
if not args.no_proprio:
obs_grp.create_dataset("gripper_states", data=np.stack(gripper_states, axis=0))
obs_grp.create_dataset("joint_states", data=np.stack(joint_states, axis=0))
obs_grp.create_dataset("ee_states", data=np.stack(ee_states, axis=0))
obs_grp.create_dataset("ee_pos", data=np.stack(ee_states, axis=0)[:, :3])
obs_grp.create_dataset("ee_ori", data=np.stack(ee_states, axis=0)[:, 3:])
# 保存图像(仅在启用相机观察时才有数据)
if args.use_camera_obs:
    obs_grp.create_dataset("agentview_rgb", data=np.stack(agentview_images, axis=0))
    obs_grp.create_dataset("eye_in_hand_rgb", data=np.stack(eye_in_hand_images, axis=0))
if args.use_depth:
obs_grp.create_dataset("agentview_depth", data=np.stack(agentview_depths, axis=0))
obs_grp.create_dataset("eye_in_hand_depth", data=np.stack(eye_in_hand_depths, axis=0))
# 保存其他数据
ep_data_grp.create_dataset("actions", data=actions)
ep_data_grp.create_dataset("states", data=states)
ep_data_grp.create_dataset("robot_states", data=np.stack(robot_states, axis=0))
ep_data_grp.create_dataset("rewards", data=rewards)
ep_data_grp.create_dataset("dones", data=dones)
ep_data_grp.attrs["num_samples"] = len(agentview_images)
ep_data_grp.attrs["model_file"] = model_xml
ep_data_grp.attrs["init_state"] = states[0]
total_len += len(actions)
# ============= 第8步: 保存全局属性 =============
grp.attrs["num_demos"] = len(demos)
grp.attrs["total"] = total_len
env.close()
h5py_f.close()
f.close()
print("\n✔ Dataset created successfully!")
print(f"Saved to: {hdf5_path}")
if __name__ == "__main__":
main()
📊 数据集结构对比
输入:原始演示 (demo.hdf5)
data/
├── demo_1/
│ ├── states (array) # MuJoCo状态
│ └── actions (array) # 动作序列
└── demo_2/
└── ...
输出:训练数据集 (*_demo.hdf5)
data/
├── demo_0/
│ ├── obs/
│ │ ├── agentview_rgb (T, 128, 128, 3) # 第三人称视角
│ │ ├── eye_in_hand_rgb (T, 128, 128, 3) # 手眼相机
│ │ ├── gripper_states (T, 2) # 夹爪状态
│ │ ├── joint_states (T, 7) # 关节角度
│ │ ├── ee_pos (T, 3) # 末端执行器位置
│ │ └── ee_ori (T, 3) # 末端执行器姿态
│ ├── actions (T, 7) # 动作
│ ├── rewards (T,) # 奖励(稀疏)
│ ├── dones (T,) # 完成标志
│ └── [attributes]
│ ├── num_samples
│ ├── model_file
│ └── init_state
└── demo_1/
└── ...
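转换完成后,可以用几行 h5py 代码核对上面的层级结构与元数据是否齐全(文件路径为示例,换成实际生成的 *_demo.hdf5 即可):
# 示例:遍历转换后的数据集,打印层级结构与各数组形状
import h5py

with h5py.File("path/to/task_demo.hdf5", "r") as f:
    f["data"].visititems(
        lambda name, obj: print(name, getattr(obj, "shape", ""))
    )
    print("num_demos:", f["data"].attrs["num_demos"])
    print("total samples:", f["data"].attrs["total"])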
💡 使用方法
基础使用(不带视觉)
python create_dataset.py \
--demo-file path/to/demo.hdf5
包含视觉观察
python create_dataset.py \
--demo-file path/to/demo.hdf5 \
--use-camera-obs
包含深度图
python create_dataset.py \
--demo-file path/to/demo.hdf5 \
--use-camera-obs \
--use-depth
无本体感觉(仅视觉)
python create_dataset.py \
--demo-file path/to/demo.hdf5 \
--use-camera-obs \
--no-proprio
🔑 关键特性
1. 回放验证
# 确保回放准确性
state_playback = env.sim.get_state().flatten()
err = np.linalg.norm(states[j + 1] - state_playback)
if err > 0.01:
print(f"[warning] playback diverged by {err:.2f}")
2. 跳过不稳定帧
cap_index = 5 # 跳过前5帧
# 力传感器在开始时不稳定
if j < cap_index:
continue
3. 稀疏奖励
# LIBERO使用稀疏奖励
rewards = np.zeros(len(actions))
rewards[-1] = 1 # 只有完成时给奖励
4. 图像分辨率
camera_heights=128,
camera_widths=128,
# 默认128x128,可以修改
⚠️ 注意事项
- 内存使用:处理图像时需要大量内存
- 处理时间:50个演示约需5-10分钟
- 磁盘空间:
- 无图像:~100MB
- 有图像:~1-2GB
- 有深度:~2-3GB
- 回放误差:少量误差(<0.01)是正常的
🎯 典型工作流
# 1. 收集演示
python collect_demonstration.py --bddl-file task.bddl --num-demonstration 50
# 2. 转换为训练数据
python create_dataset.py --demo-file demo.hdf5 --use-camera-obs
# 3. 验证数据集
python check_dataset_integrity.py
# 4. 查看信息
python get_dataset_info.py --dataset path/to/task_demo.hdf5
# 5. 开始训练
python libero/lifelong/main.py ...
🎓 总结与最佳实践
📋 脚本使用建议
| 阶段 | 推荐脚本 | 目的 |
|---|---|---|
| 初始化 | config_copy.py, init_path.py | 设置项目环境 |
| 探索 | get_affordance_info.py | 了解对象能力 |
| 设计 | create_libero_task_example.py, create_template.py | 创建新任务 |
| 数据收集 | collect_demonstration.py | 收集训练数据 |
| 数据处理 | create_dataset.py | 生成训练集 |
| 验证 | check_dataset_integrity.py, get_dataset_info.py | 质量检查 |
🔧 常见工作流
工作流1:快速原型
# 1. 复制配置
python config_copy.py
# 2. 创建任务
python create_libero_task_example.py
# 3. 收集少量演示测试
python collect_demonstration.py --num-demonstration 5
工作流2:完整数据集创建
# 1. 收集50个演示
python collect_demonstration.py --num-demonstration 50
# 2. 转换为训练数据
python create_dataset.py --use-camera-obs
# 3. 检查完整性
python check_dataset_integrity.py
# 4. 查看统计
python get_dataset_info.py --dataset path/to/dataset.hdf5
工作流3:LIBERO-100扩展
# 批量创建100个任务的数据集
for i in {0..99}; do
python libero_100_collect_demonstrations.py \
--task-id $i \
--bddl-file libero/bddl_files/libero_90/task_${i}.bddl \
--num-demonstration 50
done
⚠️ 常见问题解决
| 问题 | 可能原因 | 解决方案 |
|---|---|---|
| 导入错误 | 路径问题 | 使用init_path.py |
| 回放误差大 | 随机性/物理不确定 | 检查随机种子 |
| 图像倒置 | 图像约定 | 检查macros.IMAGE_CONVENTION |
| 内存溢出 | 处理大数据集 | 分批处理或减少演示数 |
| 设备连接失败 | SpaceMouse配置 | 检查vendor-id和product-id |
🚀 性能优化建议
- 并行处理:多个任务可以并行收集
- 批处理:使用脚本批量处理多个数据集(见下方 Python 草图)
- 预检查:收集前验证BDDL文件
- 增量保存:大数据集分批保存
- 定期备份:避免数据丢失
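针对"并行处理/批处理"两条,下面给出一个假设性草图:用多进程并行调用上文的 create_dataset.py(假定脚本在当前目录、演示文件位于 demonstration_data/*/demo.hdf5;图像处理非常耗内存,进程数宜保守):
# 假设性示例:多进程并行批量转换演示数据
import glob
import subprocess
from multiprocessing import Pool

def convert(demo_file):
    # 参数与上文 create_dataset.py 的说明一致
    subprocess.run(
        ["python", "create_dataset.py", "--demo-file", demo_file, "--use-camera-obs"],
        check=True,
    )

if __name__ == "__main__":
    demo_files = glob.glob("demonstration_data/*/demo.hdf5")
    with Pool(processes=2) as pool:  # 进程数按内存余量调整
        pool.map(convert, demo_files)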
LIBERO 脚本快速参考卡片
📋 一分钟速览
┌─────────────────────────────────────────────────────────────┐
│ LIBERO 10个工具脚本 - 从数据收集到训练的完整工具链 │
└─────────────────────────────────────────────────────────────┘
🎯 按功能分类
📦 基础设施 (2个)
| 脚本 | 命令 | 用途 |
|---|---|---|
| init_path.py | import init_path | 路径初始化(导入LIBERO前必须) |
| config_copy.py | python config_copy.py | 复制配置文件到项目目录 |
🔍 信息查询 (3个)
| 脚本 | 命令 | 用途 |
|---|---|---|
| get_dataset_info.py | python get_dataset_info.py --dataset xxx.hdf5 | 查看数据集统计信息 |
| check_dataset_integrity.py | python check_dataset_integrity.py | 检查数据集完整性(50个演示) |
| get_affordance_info.py | python get_affordance_info.py | 查看对象可交互区域 |
🛠️ 任务创建 (2个)
| 脚本 | 命令 | 用途 |
|---|---|---|
| create_libero_task_example.py | python create_libero_task_example.py | 学习任务创建(示例) |
| create_template.py | python create_template.py | 生成模板(场景/类) |
🎮 数据收集 (3个)
| 脚本 | 命令 | 用途 |
|---|---|---|
| collect_demonstration.py | --bddl-file task.bddl --num-demonstration 50 | 人类演示收集(单任务) |
| libero_100_collect_demonstrations.py | --task-id 0 --bddl-file task.bddl | LIBERO-100批量收集 |
| create_dataset.py | --demo-file demo.hdf5 --use-camera-obs | 转换为训练数据集 |
⚡ 5秒钟命令速查
# 初始化项目
python config_copy.py
# 查看对象能力
python get_affordance_info.py
# 创建任务示例
python create_libero_task_example.py
# 收集数据(SpaceMouse)
python collect_demonstration.py --bddl-file task.bddl --device spacemouse --num-demonstration 50
# 转换数据集
python create_dataset.py --demo-file demo.hdf5 --use-camera-obs
# 检查数据集
python check_dataset_integrity.py
# 查看数据集信息
python get_dataset_info.py --dataset path/to/task_demo.hdf5
🔧 典型使用场景
场景1️⃣: 我是新手,想了解LIBERO
# Step 1: 初始化
python config_copy.py
# Step 2: 查看有哪些对象和交互
python get_affordance_info.py
# Step 3: 看看任务创建示例
python create_libero_task_example.py
# Step 4: 查看现有数据集
python get_dataset_info.py --dataset ~/.libero/datasets/libero_spatial/<task_name>_demo.hdf5
场景2️⃣: 我要创建新任务
# Step 1: 生成任务模板
python create_template.py
# 选择: [0] problem_class
# Step 2: 创建场景XML(可选)
python create_template.py
# 选择: [1] scene
# Step 3: 参考示例编写任务
# 编辑: my_task.py (参考 create_libero_task_example.py)
# Step 4: 生成BDDL文件
python my_task.py # 调用 generate_bddl_from_task_info()
场景3️⃣: 我要收集训练数据
# Step 1: 准备BDDL文件
# 假设: libero/bddl_files/my_task/task.bddl
# Step 2: 收集演示(需要SpaceMouse)
python collect_demonstration.py \
--bddl-file libero/bddl_files/my_task/task.bddl \
--device spacemouse \
--num-demonstration 50 \
--pos-sensitivity 1.5 \
--rot-sensitivity 1.0
# Step 3: 转换为训练数据
python create_dataset.py \
--demo-file demonstration_data/.../demo.hdf5 \
--use-camera-obs
# Step 4: 验证数据集
python check_dataset_integrity.py
python get_dataset_info.py --dataset ~/.libero/datasets/my_task/task_demo.hdf5
场景4️⃣: 我要扩展LIBERO-100
# Step 1: 批量收集(创建bash脚本)
for i in {0..9}; do
python libero_100_collect_demonstrations.py \
--task-id $i \
--bddl-file libero/bddl_files/libero_10/task_${i}.bddl \
--device spacemouse \
--num-demonstration 50
done
# Step 2: 批量转换
for demo_file in demonstration_data/*/demo.hdf5; do
python create_dataset.py --demo-file $demo_file --use-camera-obs
done
# Step 3: 批量检查
python check_dataset_integrity.py
📊 数据流转图
┌──────────────────┐
│ BDDL定义文件 │ (create_libero_task_example.py)
└────────┬─────────┘
│
▼
┌──────────────────┐
│ 人类演示收集 │ (collect_demonstration.py)
│ demo.hdf5 │
└────────┬─────────┘
│ 原始状态+动作
▼
┌──────────────────┐
│ 数据集转换 │ (create_dataset.py)
│ *_demo.hdf5 │
└────────┬─────────┘
│ 图像+状态+动作+奖励
▼
┌──────────────────┐
│ 数据验证 │ (check_dataset_integrity.py)
└────────┬─────────┘
│
▼
┌──────────────────┐
│ 训练模型 │ (libero/lifelong/main.py)
└──────────────────┘
🎮 SpaceMouse控制速查
┌─────────────────────────────────────────────────────┐
│ SpaceMouse 3D鼠标控制 │
├─────────────────────────────────────────────────────┤
│ 移动鼠标 → 控制末端执行器位置 (x, y, z) │
│ 旋转鼠标 → 控制末端执行器姿态 (roll, pitch, yaw) │
│ 左键/右键 → 开关夹爪 │
│ 按 ESC → 取消当前演示(不保存) │
└─────────────────────────────────────────────────────┘
⚙️ 灵敏度建议:
- 位置: 1.0-2.0 (默认1.5)
- 旋转: 0.8-1.5 (默认1.0)
- 新手建议降低灵敏度: --pos-sensitivity 1.0 --rot-sensitivity 0.8
📦 HDF5文件结构速查
原始演示 (demo.hdf5)
data/
├── [attributes] metadata
├── demo_1/
│ ├── states (array)
│ └── actions (array)
└── demo_2/ ...
训练数据集 (*_demo.hdf5)
data/
├── demo_0/
│ ├── obs/
│ │ ├── agentview_rgb (T, 128, 128, 3)
│ │ ├── eye_in_hand_rgb (T, 128, 128, 3)
│ │ ├── gripper_states (T, 2)
│ │ ├── joint_states (T, 7)
│ │ ├── ee_pos (T, 3)
│ │ └── ee_ori (T, 3)
│ ├── actions (T, 7)
│ ├── rewards (T,) # 稀疏: 只有最后一步=1
│ ├── dones (T,) # 只有最后一步=1
│ └── states (T, 123)
└── demo_1/ ...
⚠️ 常见问题一行解决
| 问题 | 解决方案 |
|---|---|
| ImportError: No module named libero | 在脚本开头添加 import init_path |
| SpaceMouse连接失败 | 检查 --vendor-id 9583 --product-id 50734 |
| 回放误差过大 (>0.01) | 正常,少量误差可接受;如果>0.1检查物理参数 |
| 数据集不是50个演示 | 检查收集时是否有失败的演示(按ESC取消的) |
| 图像是黑屏 | 确保使用 --use-camera-obs 和 has_offscreen_renderer=True |
| 内存溢出 | 减少演示数量或分批处理 |
| 任务一直不成功 | 检查BDDL目标状态定义是否合理 |
🎯 记忆口诀
路径初始化: init_path
配置复制: config_copy
信息查询: get_*
任务创建: create_libero_task_example, create_template
数据收集: collect_demonstration (+ libero_100版本)
数据转换: create_dataset
质量检查: check_dataset_integrity
📝 参数速记表
collect_demonstration.py
--bddl-file # BDDL文件路径 [必需]
--device # spacemouse / keyboard
--num-demonstration # 演示数量 (默认50)
--pos-sensitivity # 位置灵敏度 (默认1.5)
--rot-sensitivity # 旋转灵敏度 (默认1.0)
--controller # OSC_POSE / IK_POSE
create_dataset.py
--demo-file # 输入demo.hdf5路径
--use-camera-obs # 包含图像观察
--use-depth # 包含深度图
--no-proprio # 不包含本体感觉
get_dataset_info.py
--dataset # 数据集HDF5路径 [必需]
--filter_key # train / valid (可选)
--verbose # 显示详细信息
🚀 快速上手3步走
# 1️⃣ 安装和初始化 (5分钟)
conda create -n libero python=3.8
conda activate libero
cd LIBERO
pip install -e .
python scripts/config_copy.py
# 2️⃣ 探索和学习 (10分钟)
python scripts/get_affordance_info.py
python scripts/create_libero_task_example.py
python scripts/get_dataset_info.py --dataset ~/.libero/datasets/libero_10/<task_name>_demo.hdf5
# 3️⃣ 收集数据 (30-60分钟)
python scripts/collect_demonstration.py \
--bddl-file libero/bddl_files/libero_10/KITCHEN_SCENE1_put_the_black_bowl_on_the_plate.bddl \
--device spacemouse \
--num-demonstration 50
💡 专业提示
- 数据收集: 成功判定需要目标状态连续满足约10个仿真步,完成动作后保持姿势1-2秒再结束
- 失败重来: 按ESC取消演示,不会计入演示数量
- 批量处理: 用bash循环脚本批量处理多个任务
- 定期休息: 收集50个演示约需45分钟,注意休息
- 版本控制: 将生成的BDDL文件加入Git管理
- 备份数据: 演示数据珍贵,及时备份
LIBERO Jupyter Notebooks 完整讲解
这是对LIBERO项目中4个核心Jupyter Notebook教程的详细讲解文档。
📚 Notebook 概览
| 文件名 | 用途 | 难度 |
|---|---|---|
| quick_walkthrough.ipynb | LIBERO基础入门 | ⭐ 入门 |
| procedural_creation_walkthrough.ipynb | 任务过程生成 | ⭐⭐ 进阶 |
| custom_object_example.ipynb | 自定义对象 | ⭐⭐⭐ 高级 |
| quick_guide_algo.ipynb | 算法和模型实现 | ⭐⭐⭐ 高级 |
1️⃣ quick_walkthrough.ipynb - 快速入门教程
📖 教程内容概览
这是LIBERO的基础入门教程,涵盖以下核心内容:
- LIBERO默认配置管理
- 可用基准测试的基本信息
- 检查基准测试完整性
- 初始化状态文件验证
- 可视化任务初始状态
- 下载数据集
🔑 核心代码段讲解
1.1 路径配置管理
import os
from libero.libero import benchmark, get_libero_path, set_libero_default_path
# 获取默认路径
benchmark_root_path = get_libero_path("benchmark_root")
init_states_default_path = get_libero_path("init_states")
datasets_default_path = get_libero_path("datasets")
bddl_files_default_path = get_libero_path("bddl_files")
功能说明:
- 所有路径从配置文件 ~/.libero/config.yaml 中检索
- 默认路径相对于LIBERO代码库设置
- 可以动态修改为自定义路径
路径类型:
- benchmark_root: 基准测试根目录
- init_states: 初始化状态文件目录
- datasets: 数据集存储目录
- bddl_files: BDDL任务定义文件目录
1.2 自定义路径设置
# 设置自定义路径
set_libero_default_path(os.path.join(os.path.expanduser("~"), "custom_project"))
# 恢复默认路径(不传参数)
set_libero_default_path()
应用场景:
- 在多个项目间切换
- 使用不同版本的数据集
- 自定义实验环境
1.3 获取可用基准测试
benchmark_dict = benchmark.get_benchmark_dict()
print(benchmark_dict)
输出结果:
{
'libero_spatial': <class 'LIBERO_SPATIAL'>,
'libero_object': <class 'LIBERO_OBJECT'>,
'libero_goal': <class 'LIBERO_GOAL'>,
'libero_90': <class 'LIBERO_90'>,
'libero_10': <class 'LIBERO_10'>,
'libero_100': <class 'LIBERO_100'>
}
基准测试说明:
- libero_spatial: 10个空间关系任务
- libero_object: 10个对象概念任务
- libero_goal: 10个任务目标任务
- libero_90: 90个预训练任务
- libero_10: 10个评估任务
- libero_100: 完整的100个任务套件
1.4 检查基准测试完整性
# 初始化基准测试实例
benchmark_instance = benchmark_dict["libero_10"]()
num_tasks = benchmark_instance.get_num_tasks()
print(f"{num_tasks} tasks in the benchmark")
# 获取所有任务名称
task_names = benchmark_instance.get_task_names()
# 检查每个任务的BDDL文件
for i in range(num_tasks):
task = benchmark_instance.get_task(i)
bddl_file = os.path.join(bddl_files_default_path,
task.problem_folder,
task.bddl_file)
if not os.path.exists(bddl_file):
print(f"[error] bddl file {bddl_file} cannot be found")
关键属性:
- task.name: 任务名称
- task.language: 语言指令描述
- task.problem_folder: 问题文件夹路径
- task.bddl_file: BDDL文件名
任务示例:
LIVING_ROOM_SCENE2_put_both_the_alphabet_soup_and_the_tomato_sauce_in_the_basket
KITCHEN_SCENE3_turn_on_the_stove_and_put_the_moka_pot_on_it
1.5 验证初始化状态文件
# 检查初始状态文件是否存在
for i in range(num_tasks):
task = benchmark_instance.get_task(i)
init_states_path = os.path.join(init_states_default_path,
task.problem_folder,
task.init_states_file)
if not os.path.exists(init_states_path):
print(f"[error] init states {init_states_path} not found")
# 加载初始状态
init_states = benchmark_instance.get_task_init_states(0)
print(init_states.shape) # 输出: (50, 123)
数据格式:
- 形状: (num_init_rollouts, num_simulation_states)
- 50: 每个任务有50个不同的初始化状态
- 123: MuJoCo仿真状态的维度
用途:
- 确保任务评估的可重复性
- 提供多样化的初始场景
- 标准化基准测试
1.6 可视化初始状态
from libero.libero.envs import OffScreenRenderEnv
# 获取任务信息
task = benchmark_instance.get_task(0)
task_bddl_file = os.path.join(bddl_files_default_path,
task.problem_folder,
task.bddl_file)
# 创建环境
env_args = {
"bddl_file_name": task_bddl_file,
"camera_heights": 128,
"camera_widths": 128
}
env = OffScreenRenderEnv(**env_args)
env.seed(0)
env.reset()
# 设置初始状态并可视化
init_states = benchmark_instance.get_task_init_states(0)
for init_state_id in range(min(5, len(init_states))):
env.set_init_state(init_states[init_state_id])
obs = env.get_observation()
# 可视化obs中的图像
env.close()
环境参数:
- bddl_file_name: 任务定义文件
- camera_heights/widths: 相机分辨率(默认128x128)
- 支持离屏渲染(OffScreen)
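上面代码注释里的"可视化obs中的图像"可以用 imageio 落地,例如把每个初始状态的 agentview 图像存成 PNG(假设已安装 imageio;代码应放在循环体内、env.close() 之前;是否需要上下翻转取决于图像约定):
# 示例:将循环中的可视化落地为保存PNG图像
import imageio

for init_state_id in range(min(5, len(init_states))):
    env.set_init_state(init_states[init_state_id])
    obs = env.get_observation()
    img = obs["agentview_image"]  # (H, W, 3) uint8
    imageio.imwrite(f"init_state_{init_state_id}.png", img[::-1])  # 上下翻转以符合常见图像约定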
💡 学习要点
- 配置管理: 理解LIBERO的路径管理系统
- 基准测试结构: 掌握6种基准测试的区别和用途
- 任务完整性: 学会验证BDDL文件和初始状态文件
- 环境交互: 了解如何创建和使用LIBERO环境
2️⃣ procedural_creation_walkthrough.ipynb - 过程生成教程
📖 教程内容概览
这个教程教你如何使用LIBERO的过程生成管道创建自定义任务:
- 检索可用对象和谓词
- 定义初始状态分布
- 定义任务目标
- 生成BDDL文件
🔑 核心代码段讲解
2.1 获取可用对象列表
from libero.libero.envs.objects import get_object_dict, get_object_fn
# 获取所有可用对象
object_dict = get_object_dict()
print(object_dict)
对象分类:
HOPE对象(日常用品):
- 食品: alphabet_soup, ketchup, mayo, milk, cookies
- 调味品: bbq_sauce, salad_dressing, tomato_sauce
- 奶制品: butter, cream_cheese, chocolate_pudding
Google扫描对象(容器和厨具):
- 容器: basket, white_bowl, akita_black_bowl, plate
- 厨具: chefmate_8_frypan, moka_pot, rack
可活动对象(家具和电器):
- 厨房电器: microwave, flat_stove, faucet
- 橱柜: slide_cabinet, wooden_cabinet, white_cabinet
- 其他: window, short_fridge
TurboSquid对象(装饰和功能):
- 书籍: black_book, yellow_book
- 杯具: red_coffee_mug, porcelain_mug, white_yellow_mug
- 家具: wooden_shelf, wine_rack, desk_caddy
2.2 检索特定对象类
category_name = "moka_pot"
object_cls = get_object_fn(category_name)
print(f"{category_name}: defined in the class {object_cls}")
# 输出: moka_pot: defined in the class <class 'MokaPot'>
2.3 获取可用谓词
from libero.libero.envs.predicates import get_predicate_fn_dict, get_predicate_fn
# 获取所有谓词
predicate_dict = get_predicate_fn_dict()
print(predicate_dict)
可用谓词:
- true/false: 布尔谓词
- in: 对象A在对象B内部
- on: 对象A在对象B上面
- up: 抬起对象
- open: 打开可活动对象
- close: 关闭可活动对象
- turnon: 打开开关
- turnoff: 关闭开关
- printjointstate: 调试用,打印关节状态
# 获取特定谓词
predicate_name = "on"
on_predicate = get_predicate_fn(predicate_name)
2.4 定义自定义初始场景
import numpy as np
from libero.libero.utils.mu_utils import register_mu, InitialSceneTemplates
from libero.libero.utils.bddl_generation_utils import get_xy_region_kwargs_list_from_regions_info
@register_mu(scene_type="kitchen")
class KitchenScene1(InitialSceneTemplates):
def __init__(self):
# 定义固定装置数量
fixture_num_info = {
"kitchen_table": 1, # 1个厨房桌子
"wooden_cabinet": 1, # 1个木制橱柜
}
# 定义对象数量
object_num_info = {
"akita_black_bowl": 1, # 1个黑碗
"plate": 1, # 1个盘子
}
super().__init__(
workspace_name="kitchen_table",
fixture_num_info=fixture_num_info,
object_num_info=object_num_info
)
def define_regions(self):
"""定义对象的放置区域"""
# 橱柜放置区域
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, -0.30], # 区域中心坐标
region_name="wooden_cabinet_init_region",
target_name=self.workspace_name,
region_half_len=0.01, # 区域半径
yaw_rotation=(np.pi, np.pi) # 旋转角度
)
)
# 黑碗放置区域
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, 0.0],
region_name="akita_black_bowl_init_region",
target_name=self.workspace_name,
region_half_len=0.025
)
)
# 盘子放置区域
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, 0.25],
region_name="plate_init_region",
target_name=self.workspace_name,
region_half_len=0.025
)
)
self.xy_region_kwargs_list = get_xy_region_kwargs_list_from_regions_info(self.regions)
@property
def init_states(self):
"""定义初始状态约束"""
states = [
("On", "akita_black_bowl_1", "kitchen_table_akita_black_bowl_init_region"),
("On", "plate_1", "kitchen_table_plate_init_region"),
("On", "wooden_cabinet_1", "kitchen_table_wooden_cabinet_init_region")
]
return states
关键参数说明:
- workspace_name: 工作空间名称(通常是桌子或台面)
- fixture_num_info: 固定装置(不可移动的大型物体)
- object_num_info: 可操作对象
- region_centroid_xy: 区域中心的XY坐标
- region_half_len: 区域半径(米)
- yaw_rotation: 物体的旋转角度范围
2.5 定义任务目标
from libero.libero.utils.task_generation_utils import register_task_info, generate_bddl_from_task_info
@register_task_info
class KitchenScene1PutBowlOnPlate:
"""任务: 把黑碗放在盘子上"""
def __init__(self):
self.task_name = "KITCHEN_SCENE1_put_the_black_bowl_on_the_plate"
self.scene_name = "KitchenScene1" # 对应上面定义的场景
self.task_description = "put the black bowl on the plate"
# 任务语言指令(用于语言条件策略)
self.language_instruction = "put the black bowl on the plate"
@property
def goal(self):
"""定义任务目标状态"""
return [
("On", "akita_black_bowl_1", "plate_1"), # 黑碗在盘子上
]
目标定义格式:
(谓词, 对象1, 对象2)
常见目标示例:
# 把物体放进容器
("In", "milk_1", "basket_1")
# 把物体放在表面上
("On", "mug_1", "plate_1")
# 打开橱柜
("Open", "microwave_1")
# 打开电器
("TurnOn", "flat_stove_1")
2.6 生成BDDL文件
from libero.libero.utils.task_generation_utils import generate_bddl_from_task_info
# 生成任务的BDDL文件
task_info = KitchenScene1PutBowlOnPlate()
bddl_file_path = generate_bddl_from_task_info(
task_info,
output_dir="./custom_pddl"
)
print(f"BDDL file generated at: {bddl_file_path}")
生成的BDDL文件内容示例:
(define (problem KITCHEN_SCENE1_put_the_black_bowl_on_the_plate)
(:domain libero)
(:language "put the black bowl on the plate")
(:objects
kitchen_table - kitchen_table
wooden_cabinet_1 - wooden_cabinet
akita_black_bowl_1 - akita_black_bowl
plate_1 - plate
)
(:init
(On akita_black_bowl_1 kitchen_table_akita_black_bowl_init_region)
(On plate_1 kitchen_table_plate_init_region)
(On wooden_cabinet_1 kitchen_table_wooden_cabinet_init_region)
)
(:goal
(And
(On akita_black_bowl_1 plate_1)
)
)
)
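生成 BDDL 后,建议立刻用 1.6 节介绍的 OffScreenRenderEnv 做一次冒烟测试,确认文件能被解析、环境能正常 reset(文件路径为示例):
# 示例:验证生成的BDDL文件可以正常创建环境
from libero.libero.envs import OffScreenRenderEnv

env = OffScreenRenderEnv(
    bddl_file_name="./custom_pddl/KITCHEN_SCENE1_put_the_black_bowl_on_the_plate.bddl",  # 示例路径
    camera_heights=128,
    camera_widths=128,
)
env.seed(0)
env.reset()
obs = env.get_observation()
print(obs.keys())
env.close()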
💡 学习要点
- 对象系统: 掌握50+个可用对象的分类和使用
- 谓词系统: 理解10个谓词的语义和应用
- 场景定义: 学会使用装饰器模式定义初始场景
- 任务目标: 使用谓词组合定义复杂任务目标
- BDDL生成: 自动化生成标准化任务定义
3️⃣ custom_object_example.ipynb - 自定义对象教程
📖 教程内容概览
这个高级教程教你如何在LIBERO中添加自定义对象:
- 定义自定义对象类
- 注册对象到系统
- 创建使用自定义对象的场景
- 生成包含自定义对象的任务
🔑 核心代码段讲解
3.1 定义自定义对象基类
import os
import re
import numpy as np
from robosuite.models.objects import MujocoXMLObject
from libero.libero.envs.base_object import register_object
class CustomObjects(MujocoXMLObject):
"""自定义对象基类"""
def __init__(self, custom_path, name, obj_name,
joints=[dict(type="free", damping="0.0005")]):
# 确保路径是绝对路径
assert os.path.isabs(custom_path), "Custom path must be an absolute path"
# 确保是XML文件
assert custom_path.endswith(".xml"), "Custom path must be an xml file"
super().__init__(
custom_path,
name=name,
joints=joints,
obj_type="all",
duplicate_collision_geoms=False,
)
# 设置类别名称
self.category_name = "_".join(
re.sub(r"([A-Z])", r" \\1", self.__class__.__name__).split()
).lower()
# 对象属性
self.object_properties = {"vis_site_names": {}}
关键参数:
- custom_path: XML文件的绝对路径
- joints: 关节定义(free表示6自由度浮动)
- obj_type: 对象类型
- duplicate_collision_geoms: 是否复制碰撞几何体
3.2 注册具体的自定义对象
@register_object
class LiberoMug(CustomObjects):
"""自定义杯子对象"""
def __init__(self, name="libero_mug", obj_name="libero_mug"):
super().__init__(
custom_path=os.path.abspath(os.path.join(
"./", "custom_assets", "libero_mug", "libero_mug.xml"
)),
name=name,
obj_name=obj_name,
)
# 定义对象的旋转约束
self.rotation = {
"x": (-np.pi/2, -np.pi/2), # X轴旋转范围
"y": (-np.pi, -np.pi), # Y轴旋转范围
"z": (np.pi, np.pi), # Z轴旋转范围
}
self.rotation_axis = None # 无特定旋转轴
@register_object
class LiberoMugYellow(CustomObjects):
"""黄色杯子变体"""
def __init__(self, name="libero_mug_yellow", obj_name="libero_mug_yellow"):
super().__init__(
custom_path=os.path.abspath(os.path.join(
"./", "custom_assets", "libero_mug_yellow", "libero_mug_yellow.xml"
)),
name=name,
obj_name=obj_name,
)
self.rotation = {
"x": (-np.pi/2, -np.pi/2),
"y": (-np.pi, -np.pi),
"z": (np.pi, np.pi),
}
self.rotation_axis = None
@register_object 装饰器的作用:
- 将对象注册到全局对象字典
- 使对象可以在BDDL文件中使用
- 支持通过名称检索对象类
3.3 在场景中使用自定义对象
from libero.libero.utils.mu_utils import register_mu, InitialSceneTemplates
@register_mu(scene_type="kitchen")
class KitchenDemoScene(InitialSceneTemplates):
def __init__(self):
fixture_num_info = {
"kitchen_table": 1,
"wooden_cabinet": 1,
}
# 使用自定义对象
object_num_info = {
"libero_mug": 1, # 使用自定义的LiberoMug
"libero_mug_yellow": 1, # 使用黄色变体
"plate": 1, # 使用系统内置对象
}
super().__init__(
workspace_name="kitchen_table",
fixture_num_info=fixture_num_info,
object_num_info=object_num_info
)
def define_regions(self):
"""定义自定义对象的放置区域"""
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, 0.0],
region_name="libero_mug_init_region",
target_name=self.workspace_name,
region_half_len=0.025
)
)
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.2, 0.0],
region_name="libero_mug_yellow_init_region",
target_name=self.workspace_name,
region_half_len=0.025
)
)
self.regions.update(
self.get_region_dict(
region_centroid_xy=[-0.2, 0.0],
region_name="plate_init_region",
target_name=self.workspace_name,
region_half_len=0.025
)
)
self.xy_region_kwargs_list = get_xy_region_kwargs_list_from_regions_info(self.regions)
@property
def init_states(self):
states = [
("On", "libero_mug_1", "kitchen_table_libero_mug_init_region"),
("On", "libero_mug_yellow_1", "kitchen_table_libero_mug_yellow_init_region"),
("On", "plate_1", "kitchen_table_plate_init_region"),
("On", "wooden_cabinet_1", "kitchen_table_wooden_cabinet_init_region")
]
return states
3.4 定义使用自定义对象的任务
from libero.libero.utils.task_generation_utils import register_task_info, generate_bddl_from_task_info
@register_task_info
class KitchenDemoScenePutMugsOnPlate:
"""任务: 把两个杯子都放在盘子上"""
def __init__(self):
self.task_name = "KITCHEN_DEMO_SCENE_put_both_mugs_on_the_plate"
self.scene_name = "KitchenDemoScene"
self.task_description = "put both mugs on the plate"
self.language_instruction = "put the libero mug and yellow mug on the plate"
@property
def goal(self):
return [
("On", "libero_mug_1", "plate_1"),
("On", "libero_mug_yellow_1", "plate_1"),
]
# 生成BDDL文件
task_info = KitchenDemoScenePutMugsOnPlate()
bddl_file = generate_bddl_from_task_info(
task_info,
output_dir="./custom_pddl"
)
3.5 准备自定义对象的XML文件
自定义对象需要MuJoCo XML格式的3D模型。文件结构示例:
custom_assets/
├── libero_mug/
│ ├── libero_mug.xml # MuJoCo模型定义
│ ├── libero_mug.obj # 网格文件(可选)
│ └── textures/ # 纹理文件(可选)
└── libero_mug_yellow/
├── libero_mug_yellow.xml
├── libero_mug_yellow.obj
└── textures/
libero_mug.xml示例:
<mujoco model="libero_mug">
<asset>
<mesh name="libero_mug_mesh" file="libero_mug.obj"/>
</asset>
<worldbody>
<body name="libero_mug">
<geom type="mesh" mesh="libero_mug_mesh"
rgba="0.8 0.8 0.8 1"
friction="0.95 0.3 0.1"/>
</body>
</worldbody>
</mujoco>
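把自定义 XML 接入 LIBERO 之前,可以先用 MuJoCo 官方 Python 绑定单独加载一次,尽早暴露网格路径或语法错误(假设环境中装有 mujoco 包;robosuite 内部的加载流程与此不同,这里只做快速体检):
# 假设性示例:快速检查自定义对象XML能否被MuJoCo解析
import mujoco

try:
    model = mujoco.MjModel.from_xml_path("custom_assets/libero_mug/libero_mug.xml")
    print(f"解析成功: {model.nbody} bodies, {model.ngeom} geoms")
except Exception as e:
    print(f"XML解析失败: {e}")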
💡 学习要点
- 对象定义: 学会创建符合MuJoCo标准的自定义对象
- 注册机制: 理解装饰器注册模式
- 旋转约束: 掌握如何定义对象的方向约束
- 场景集成: 将自定义对象无缝集成到场景中
- 资产管理: 组织和管理自定义3D资产
4️⃣ quick_guide_algo.ipynb - 算法实现指南
📖 教程内容概览
这个高级教程涵盖LIBERO实验的核心组件:
- 数据集准备
- 编写自定义算法
- 定义自定义策略架构
- 实现训练循环
- 可视化结果
🔑 核心代码段讲解
4.1 配置加载和初始化
from hydra import compose, initialize
from omegaconf import OmegaConf
import yaml
from easydict import EasyDict
import hydra
# 清除之前的Hydra实例
hydra.core.global_hydra.GlobalHydra.instance().clear()
# 加载默认配置
initialize(config_path="../libero/configs")
hydra_cfg = compose(config_name="config")
yaml_config = OmegaConf.to_yaml(hydra_cfg)
cfg = EasyDict(yaml.safe_load(yaml_config))
配置结构说明:
cfg = {
'policy': {
'policy_type': 'BCTransformerPolicy',
'image_encoder': {...},
'language_encoder': {...},
'policy_head': {...},
'transformer_num_layers': 4,
'transformer_num_heads': 6,
...
},
'data': {
'obs': {'modality': ['rgb', 'low_dim']},
'seq_len': 10,
...
},
'train': {
'n_epochs': 25,
'batch_size': 64,
...
},
'eval': {
'n_eval': 5,
'num_procs': 1,
...
}
}
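加载后的 cfg 是普通的 EasyDict,可以直接按属性覆盖字段来做快速实验,不必修改 yaml 文件(字段名以上面的结构为准):
# 示例:按属性覆盖配置字段(EasyDict支持点号访问)
cfg.train.batch_size = 32   # 显存不足时减小批大小
cfg.train.n_epochs = 5      # 冒烟测试只跑少量epoch
cfg.eval.n_eval = 5         # 每个任务评估5回合
print(cfg.train.batch_size, cfg.train.n_epochs)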
4.2 数据集准备
import os
from libero.lifelong.datasets import get_dataset, SequenceVLDataset
from libero.libero.benchmark import get_benchmark
from libero.lifelong.utils import get_task_embs
# 设置路径
cfg.folder = get_libero_path("datasets")
cfg.bddl_folder = get_libero_path("bddl_files")
cfg.init_states_folder = get_libero_path("init_states")
# 选择基准测试
cfg.benchmark_name = "libero_object"
benchmark = get_benchmark(cfg.benchmark_name)(task_order=0)
# 准备数据集
datasets = []
descriptions = []
shape_meta = None
for i in range(benchmark.n_tasks):
# 加载数据集
task_i_dataset, shape_meta = get_dataset(
dataset_path=os.path.join(cfg.folder, benchmark.get_task_demonstration(i)),
obs_modality=cfg.data.obs.modality,
initialize_obs_utils=(i==0), # 只在第一个任务初始化
seq_len=cfg.data.seq_len,
)
# 获取语言描述
descriptions.append(benchmark.get_task(i).language)
datasets.append(task_i_dataset)
# 获取任务嵌入
task_embs = get_task_embs(cfg, descriptions)
benchmark.set_task_embs(task_embs)
# 创建视觉-语言数据集
datasets = [SequenceVLDataset(ds, emb) for (ds, emb) in zip(datasets, task_embs)]
数据集特性:
- SequenceDataset: 处理序列数据(历史观察)
- SequenceVLDataset: 添加语言条件
- shape_meta: 包含观察和动作的形状信息
- task_embs: 任务的语言嵌入向量
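拿到数据集列表后,单任务训练可以直接套用 PyTorch 的 DataLoader;下面是一个假设性片段(batch 的具体键名与结构由 SequenceVLDataset 的实现决定,这里不展开):
# 假设性示例:为第0个任务创建DataLoader并取出一个batch
from torch.utils.data import DataLoader

loader = DataLoader(
    datasets[0],
    batch_size=cfg.train.batch_size,
    shuffle=True,
    num_workers=4,
)
batch = next(iter(loader))
print(type(batch))  # 键名/结构由SequenceVLDataset决定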
4.3 自定义策略架构
import torch
import torch.nn as nn
from libero.lifelong.models.policy_head import *
from libero.lifelong.models.policy import *
class CustomTransformerPolicy(nn.Module):
"""自定义Transformer策略"""
def __init__(self, cfg, shape_meta):
super().__init__()
self.cfg = cfg
# 图像编码器
self.image_encoder = ResnetEncoder(
input_shape=shape_meta['obs']['rgb_shape'],
language_dim=cfg.policy.language_encoder.network_kwargs.output_size,
freeze=False,
pretrained=False
)
# 语言编码器
self.language_encoder = MLPEncoder(
input_size=768, # BERT维度
hidden_size=128,
num_layers=1,
output_size=128
)
# Transformer时间编码器
self.transformer = nn.TransformerDecoder(
decoder_layer=nn.TransformerDecoderLayer(
d_model=cfg.policy.transformer_input_size,
nhead=cfg.policy.transformer_num_heads,
dim_feedforward=cfg.policy.transformer_mlp_hidden_size,
dropout=cfg.policy.transformer_dropout
),
num_layers=cfg.policy.transformer_num_layers
)
# 位置编码
self.pos_encoding = SinusoidalPositionEncoding(
input_size=cfg.policy.transformer_input_size
)
# 动作头(高斯混合模型)
self.policy_head = GMMHead(
input_size=cfg.policy.transformer_head_output_size,
output_size=shape_meta['ac_dim'],
hidden_size=1024,
num_modes=5, # 5个高斯分量
min_std=0.0001
)
def forward(self, obs_dict, task_emb):
"""
前向传播
Args:
obs_dict: 观察字典,包含图像和低维状态
task_emb: 任务嵌入向量
Returns:
action_dist: 动作分布
"""
batch_size, seq_len = obs_dict['rgb'].shape[:2]
# 编码语言
lang_feat = self.language_encoder(task_emb)
# 编码图像序列
rgb_seq = obs_dict['rgb'].reshape(-1, *obs_dict['rgb'].shape[2:])
img_feat = self.image_encoder(rgb_seq, lang_feat)
img_feat = img_feat.reshape(batch_size, seq_len, -1)
# 添加位置编码
img_feat = self.pos_encoding(img_feat)
# Transformer处理
# img_feat shape: (batch, seq, features)
# 转换为 (seq, batch, features) for transformer
img_feat = img_feat.transpose(0, 1)
context = self.transformer(img_feat, img_feat)
context = context.transpose(0, 1)
# 只取最后一个时间步
context = context[:, -1, :]
# 生成动作分布
action_dist = self.policy_head(context)
return action_dist
def get_action(self, obs_dict, task_emb):
"""获取动作(推理时使用)"""
with torch.no_grad():
action_dist = self.forward(obs_dict, task_emb)
# 从分布中采样或取均值
action = action_dist.sample() # 或 action_dist.mode()
return action
架构组件说明:
- ResnetEncoder: CNN视觉编码器
- MLPEncoder: 语言编码器
- TransformerDecoder: 时间序列处理
- SinusoidalPositionEncoding: 位置编码
- GMMHead: 高斯混合模型输出头
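策略写好后,可以先用随机张量做一次维度冒烟测试,再接入真实数据(下面的张量布局与维度均为假设,需与 shape_meta 的实际内容对齐):
# 假设性冒烟测试:检查自定义策略的前向维度是否自洽
import torch

B, T_len = 2, 10                                         # batch大小与序列长度
dummy_obs = {"rgb": torch.randn(B, T_len, 3, 128, 128)}  # 通道布局为假设
dummy_emb = torch.randn(B, 768)                          # BERT句向量维度
policy = CustomTransformerPolicy(cfg, shape_meta)
dist = policy(dummy_obs, dummy_emb)
print(dist.sample().shape)                               # 期望 (B, ac_dim)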
4.4 自定义终身学习算法
from libero.lifelong.algos import Sequential
import torch
import torch.optim as optim
import random
class CustomLifelongAlgorithm(Sequential):
"""自定义终身学习算法"""
def __init__(self, n_tasks, **kwargs):
super().__init__(n_tasks, **kwargs)
# 算法特定的超参数
self.memory_size = 1000 # 经验回放缓冲区大小
self.replay_buffer = []
self.ewc_lambda = 1000 # EWC正则化系数
self.fisher_information = {}
def learn_one_task(self, task_id, train_data, cfg):
"""
学习单个任务
Args:
task_id: 任务ID
train_data: 训练数据
cfg: 配置
"""
print(f"[Algorithm] Learning task {task_id}")
# 创建数据加载器
if task_id > 0 and len(self.replay_buffer) > 0:
# 混合当前任务和回放数据
combined_data = self._combine_with_replay(train_data)
dataloader = self._create_dataloader(combined_data, cfg)
else:
dataloader = self._create_dataloader(train_data, cfg)
# 训练循环
for epoch in range(cfg.train.n_epochs):
epoch_loss = 0
for batch_idx, batch in enumerate(dataloader):
# 前向传播
obs_dict = self._prepare_obs(batch)
task_emb = batch['task_emb']
actions = batch['actions']
action_dist = self.policy(obs_dict, task_emb)
# 计算损失
bc_loss = -action_dist.log_prob(actions).mean()
# 添加EWC正则化(如果不是第一个任务)
if task_id > 0:
ewc_loss = self._compute_ewc_loss()
total_loss = bc_loss + self.ewc_lambda * ewc_loss
else:
total_loss = bc_loss
# 反向传播
self.optimizer.zero_grad()
total_loss.backward()
self.optimizer.step()
epoch_loss += total_loss.item()
print(f"Epoch {epoch}: Loss = {epoch_loss / len(dataloader):.4f}")
# 任务学习后的处理
self._store_important_samples(train_data, task_id)
self._update_fisher_information(dataloader)
def _combine_with_replay(self, current_data):
"""混合当前数据和回放数据"""
# 从回放缓冲区采样
replay_samples = random.sample(self.replay_buffer,
min(len(self.replay_buffer),
len(current_data) // 2))
return current_data + replay_samples
def _store_important_samples(self, data, task_id):
"""存储重要样本到回放缓冲区"""
# 简单策略:随机存储
if len(self.replay_buffer) < self.memory_size:
samples_to_store = min(self.memory_size - len(self.replay_buffer),
len(data))
self.replay_buffer.extend(random.sample(data, samples_to_store))
else:
# 缓冲区已满,替换旧样本
samples_to_replace = min(len(data), self.memory_size // self.n_tasks)
indices = random.sample(range(len(self.replay_buffer)),
samples_to_replace)
new_samples = random.sample(data, samples_to_replace)
for idx, sample in zip(indices, new_samples):
self.replay_buffer[idx] = sample
def _compute_ewc_loss(self):
"""计算弹性权重巩固损失"""
ewc_loss = 0
for name, param in self.policy.named_parameters():
if name in self.fisher_information:
fisher = self.fisher_information[name]
old_param = self.old_params[name]
ewc_loss += (fisher * (param - old_param) ** 2).sum()
return ewc_loss
def _update_fisher_information(self, dataloader):
"""更新Fisher信息矩阵"""
self.fisher_information = {}
self.old_params = {}
# 保存当前参数
for name, param in self.policy.named_parameters():
self.old_params[name] = param.data.clone()
self.fisher_information[name] = torch.zeros_like(param)
# 计算Fisher信息
for batch in dataloader:
obs_dict = self._prepare_obs(batch)
task_emb = batch['task_emb']
action_dist = self.policy(obs_dict, task_emb)
loss = -action_dist.log_prob(batch['actions']).mean()
self.optimizer.zero_grad()
loss.backward()
for name, param in self.policy.named_parameters():
if param.grad is not None:
self.fisher_information[name] += param.grad.data ** 2 / len(dataloader)
算法关键特性:
- 经验回放: 存储和重放之前任务的样本
- EWC正则化: 保护重要参数不被覆盖
- Fisher信息: 衡量参数的重要性
- 混合训练: 结合当前任务和历史任务数据
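其中 EWC 正则项(对应 _compute_ewc_loss 的实现)的标准写法是:

$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{BC}} + \lambda \sum_i F_i \,(\theta_i - \theta_i^{*})^2$$

F_i 是参数 θ_i 的 Fisher 信息(代码中用梯度平方的平均值近似),θ_i^* 是学完上一个任务后保存的参数快照,λ 即代码里的 ewc_lambda。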
4.5 训练循环实现
from libero.lifelong.utils import create_experiment_dir
import os
def train_lifelong_learning(cfg, benchmark, datasets, policy, algorithm):
"""
终身学习训练主循环
Args:
cfg: 配置
benchmark: 基准测试
datasets: 数据集列表
policy: 策略网络
algorithm: 终身学习算法
"""
# 创建实验目录
exp_dir = create_experiment_dir(cfg)
# 训练每个任务
for task_id in range(benchmark.n_tasks):
print(f"\n{'='*50}")
print(f"Training on Task {task_id}: {benchmark.get_task_names()[task_id]}")
print(f"{'='*50}\n")
# 获取当前任务数据
train_data = datasets[task_id]
# 学习任务
algorithm.learn_one_task(task_id, train_data, cfg)
# 评估所有已学习任务(前向迁移)
if (task_id + 1) % cfg.eval.eval_freq == 0:
print(f"\nEvaluating after task {task_id}...")
results = evaluate_all_tasks(
cfg,
benchmark,
policy,
task_id + 1
)
# 保存结果
save_results(results, exp_dir, task_id)
# 打印结果
print_evaluation_results(results)
# 保存检查点
if (task_id + 1) % cfg.train.save_freq == 0:
checkpoint_path = os.path.join(exp_dir, f"checkpoint_task_{task_id}.pth")
torch.save({
'task_id': task_id,
'policy_state_dict': policy.state_dict(),
'algorithm_state': algorithm.get_state(),
'cfg': cfg
}, checkpoint_path)
print(f"Checkpoint saved to {checkpoint_path}")
def evaluate_all_tasks(cfg, benchmark, policy, n_learned_tasks):
"""
评估所有已学习的任务
Returns:
results: 字典,包含每个任务的成功率
"""
results = {}
for task_id in range(n_learned_tasks):
task = benchmark.get_task(task_id)
success_rate = evaluate_single_task(cfg, task, policy)
results[task_id] = {
'task_name': benchmark.get_task_names()[task_id],
'success_rate': success_rate
}
# 计算平均性能
avg_success = sum(r['success_rate'] for r in results.values()) / n_learned_tasks
results['average'] = avg_success
return results
def evaluate_single_task(cfg, task, policy):
"""
评估单个任务
Returns:
success_rate: 成功率
"""
from libero.libero.envs import OffScreenRenderEnv
# 创建环境
env = OffScreenRenderEnv(
bddl_file_name=task.bddl_file,
camera_heights=128,
camera_widths=128
)
success_count = 0
n_eval = cfg.eval.n_eval
for eval_idx in range(n_eval):
obs = env.reset()
done = False
step_count = 0
max_steps = cfg.eval.max_steps
while not done and step_count < max_steps:
# 获取动作
with torch.no_grad():
obs_dict = prepare_obs_for_policy(obs)
task_emb = task.task_emb
action = policy.get_action(obs_dict, task_emb)
# 执行动作
obs, reward, done, info = env.step(action.cpu().numpy())
step_count += 1
if info.get('success', False):
success_count += 1
env.close()
success_rate = success_count / n_eval
return success_rate
# 运行训练
train_lifelong_learning(cfg, benchmark, datasets, policy, algorithm)
4.6 结果可视化
import os
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
def visualize_results(results_dict, save_path=None):
"""
可视化终身学习结果
Args:
results_dict: 包含多个checkpoint的结果
save_path: 保存图表的路径
"""
# 准备数据
n_tasks = len(results_dict[0]) - 1 # 减去'average'键
checkpoints = sorted(results_dict.keys())
# 创建热力图数据
heatmap_data = np.zeros((n_tasks, len(checkpoints)))
for i, checkpoint in enumerate(checkpoints):
for task_id in range(n_tasks):
heatmap_data[task_id, i] = results_dict[checkpoint][task_id]['success_rate']
# 绘制热力图
plt.figure(figsize=(12, 8))
sns.heatmap(
heatmap_data,
annot=True,
fmt='.2f',
cmap='YlGnBu',
xticklabels=[f"After Task {cp}" for cp in checkpoints],
yticklabels=[f"Task {i}" for i in range(n_tasks)],
vmin=0,
vmax=1
)
plt.title('Task Success Rate Throughout Lifelong Learning')
plt.xlabel('Training Checkpoint')
plt.ylabel('Task ID')
plt.tight_layout()
if save_path:
plt.savefig(os.path.join(save_path, 'heatmap.png'), dpi=300)
plt.show()
# 绘制学习曲线
plt.figure(figsize=(10, 6))
average_performance = [results_dict[cp]['average'] for cp in checkpoints]
plt.plot(checkpoints, average_performance, marker='o', linewidth=2)
plt.xlabel('Number of Tasks Learned')
plt.ylabel('Average Success Rate')
plt.title('Lifelong Learning Performance')
plt.grid(True, alpha=0.3)
plt.ylim([0, 1])
if save_path:
plt.savefig(os.path.join(save_path, 'learning_curve.png'), dpi=300)
plt.show()
# 计算遗忘指标
forgetting = compute_forgetting(results_dict)
print(f"\nAverage Forgetting: {forgetting:.4f}")
# 计算前向迁移
forward_transfer = compute_forward_transfer(results_dict)
print(f"Forward Transfer: {forward_transfer:.4f}")
def compute_forgetting(results_dict):
"""计算灾难性遗忘指标"""
n_tasks = len(results_dict[0]) - 1
forgetting_sum = 0
for task_id in range(n_tasks - 1):
# 找到该任务学习后的最高性能
max_perf = max(
results_dict[cp][task_id]['success_rate']
for cp in range(task_id, n_tasks)
)
# 最终性能
final_perf = results_dict[n_tasks - 1][task_id]['success_rate']
# 遗忘 = 最高性能 - 最终性能
forgetting_sum += (max_perf - final_perf)
return forgetting_sum / (n_tasks - 1) if n_tasks > 1 else 0
def compute_forward_transfer(results_dict):
"""计算前向迁移"""
n_tasks = len(results_dict[0]) - 1
transfer_sum = 0
for task_id in range(1, n_tasks):
# 学习该任务后的性能
perf_after_learning = results_dict[task_id][task_id]['success_rate']
# 假设零样本性能为0(可以用单任务训练来估计)
zero_shot_perf = 0
transfer_sum += (perf_after_learning - zero_shot_perf)
return transfer_sum / (n_tasks - 1) if n_tasks > 1 else 0
💡 学习要点
- 配置管理: Hydra配置系统的使用
- 数据管道: 从HDF5文件到PyTorch Dataset
- 模型设计: Transformer策略架构的实现
- 算法实现: 终身学习算法的核心逻辑
- 评估流程: 多任务评估和性能指标
- 可视化: 结果分析和可视化技术
🎯 综合学习建议
学习路径
初学者路径:
- 从 quick_walkthrough.ipynb 开始,理解基本概念
- 尝试修改参数,观察变化
进阶路径:
4. 学习 procedural_creation_walkthrough.ipynb
5. 创建简单的自定义任务
6. 理解BDDL文件的结构
高级路径:
7. 掌握 custom_object_example.ipynb
8. 准备自己的3D资产
9. 集成到完整任务中
专家路径:
10. 深入 quick_guide_algo.ipynb
11. 实现自定义算法
12. 进行完整的实验
实践建议
- 动手实践: 每个notebook都运行一遍
- 修改代码: 尝试改变参数,理解影响
- 阅读论文: 参考LIBERO原始论文理解设计思想
- 社区交流: GitHub Issues上寻求帮助
- 迭代改进: 从简单任务开始,逐步增加复杂度
常见问题
Q: 如何调试BDDL文件错误?
A: 检查对象名称拼写、谓词语法、初始状态和目标状态的一致性
Q: 自定义对象加载失败?
A: 确保XML文件路径正确,网格文件存在,MuJoCo语法正确
Q: 训练不收敛?
A: 检查学习率、批大小、序列长度,查看损失曲线
Q: 评估成功率为0?
A: 检查任务定义、初始状态、最大步数限制
📚 相关资源
- 官方文档: https://lifelong-robot-learning.github.io/LIBERO/
- GitHub仓库: https://github.com/Lifelong-Robot-Learning/LIBERO
- 论文: LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning (NeurIPS 2023)
- 数据集: https://huggingface.co/datasets/yifengzhu-hf/LIBERO-datasets
🔧 技术栈总结
核心依赖:
- MuJoCo: 物理仿真引擎
- robosuite: 机器人操作任务框架
- PyTorch: 深度学习框架
- Hydra: 配置管理
- H5py: 数据集存储
关键模块:
- libero.libero: 基准测试和任务定义
- libero.lifelong: 终身学习算法
- libero.libero.envs: 环境和对象定义
- libero.libero.envs.predicates: 谓词系统
祝你在LIBERO的学习之旅中收获满满!🚀
LIBERO Notebooks 快速参考指南
📋 四个Notebook快速对照
| Notebook | 核心功能 | 主要代码示例 | 适用场景 |
|---|---|---|---|
| quick_walkthrough | 基础API使用 | get_benchmark_dict(), get_task() | 入门学习 |
| procedural_creation | 任务生成 | @register_mu, @register_task_info | 创建新任务 |
| custom_object | 对象扩展 | @register_object, XML模型 | 添加新物体 |
| quick_guide_algo | 算法实现 | 自定义Policy和Algorithm | 研究开发 |
1️⃣ quick_walkthrough.ipynb - 5分钟速览
核心功能
# 1. 路径管理
from libero.libero import get_libero_path, set_libero_default_path
datasets_path = get_libero_path("datasets")
# 2. 获取基准测试
from libero.libero import benchmark
benchmark_dict = benchmark.get_benchmark_dict()
# {'libero_spatial', 'libero_object', 'libero_goal', 'libero_90', 'libero_10', 'libero_100'}
# 3. 加载任务
bench = benchmark_dict["libero_10"]()
task = bench.get_task(0)
print(task.language) # 语言指令
print(task.bddl_file) # BDDL文件名(完整路径需拼接)
# 4. 创建环境
from libero.libero.envs import OffScreenRenderEnv
env = OffScreenRenderEnv(bddl_file_name=task_bddl_file)  # task_bddl_file为拼接好的完整路径
obs = env.reset()
关键概念
- 6个基准测试套件: spatial/object/goal各10任务, 90+10=100任务
- BDDL文件: 定义任务的语言描述文件
- 初始状态: 每任务50个固定初始状态,确保可重复性
- OffScreenRenderEnv: 无头渲染环境,适合服务器训练
2️⃣ procedural_creation_walkthrough.ipynb - 任务创建三步走
Step 1: 查看可用资源
from libero.libero.envs.objects import get_object_dict, get_object_fn
from libero.libero.envs.predicates import get_predicate_fn_dict
# 50+个对象:食品、容器、厨具、家具
objects = get_object_dict()
# 'moka_pot', 'plate', 'microwave', 'wooden_cabinet'...
# 10个谓词:空间关系和动作
predicates = get_predicate_fn_dict()
# 'on', 'in', 'open', 'close', 'turnon', 'turnoff'...
Step 2: 定义场景
from libero.libero.utils.mu_utils import register_mu, InitialSceneTemplates
@register_mu(scene_type="kitchen")
class MyKitchenScene(InitialSceneTemplates):
def __init__(self):
fixture_num_info = {"kitchen_table": 1, "wooden_cabinet": 1}
object_num_info = {"plate": 1, "moka_pot": 1}
super().__init__(
workspace_name="kitchen_table",
fixture_num_info=fixture_num_info,
object_num_info=object_num_info
)
def define_regions(self):
# 定义plate的放置区域
self.regions.update(
self.get_region_dict(
region_centroid_xy=[0.0, 0.2],
region_name="plate_init_region",
target_name=self.workspace_name,
region_half_len=0.025
)
)
# ... 其他区域
@property
def init_states(self):
return [
("On", "plate_1", "kitchen_table_plate_init_region"),
("On", "moka_pot_1", "kitchen_table_moka_pot_init_region"),
]
Step 3: 定义任务
from libero.libero.utils.task_generation_utils import register_task_info
@register_task_info
class PutMokaPotOnPlate:
def __init__(self):
self.task_name = "put_the_moka_pot_on_the_plate"
self.scene_name = "MyKitchenScene"
self.language_instruction = "put the moka pot on the plate"
@property
def goal(self):
return [("On", "moka_pot_1", "plate_1")]
# 生成BDDL
from libero.libero.utils.task_generation_utils import generate_bddl_from_task_info
generate_bddl_from_task_info(PutMokaPotOnPlate(), output_dir="./my_tasks")
设计模式
场景定义 (@register_mu)
↓
对象放置区域 (define_regions)
↓
初始状态约束 (init_states)
↓
任务目标 (@register_task_info)
↓
BDDL文件生成
3️⃣ custom_object_example.ipynb - 添加新物体
准备工作
custom_assets/
├── my_object/
│ ├── my_object.xml # MuJoCo模型
│ └── my_object.obj # 网格文件(可选)
对象定义
import os
import numpy as np
from robosuite.models.objects import MujocoXMLObject
from libero.libero.envs.base_object import register_object
@register_object
class MyCustomObject(MujocoXMLObject):
def __init__(self, name="my_object", obj_name="my_object"):
super().__init__(
    os.path.abspath("./custom_assets/my_object/my_object.xml"),  # MujocoXMLObject的第一个参数是fname
    name=name,
    joints=[dict(type="free", damping="0.0005")],
    obj_type="all",
    duplicate_collision_geoms=False,
)
# 旋转约束(弧度)
self.rotation = {
"x": (-np.pi/2, -np.pi/2),
"y": (0, 0),
"z": (0, 2*np.pi),
}
在场景中使用
@register_mu(scene_type="kitchen")
class SceneWithCustomObject(InitialSceneTemplates):
def __init__(self):
fixture_num_info = {"kitchen_table": 1}
object_num_info = {
"my_object": 1, # 使用自定义对象
"plate": 1,
}
super().__init__(
workspace_name="kitchen_table",
fixture_num_info=fixture_num_info,
object_num_info=object_num_info
)
XML模板
<mujoco model="my_object">
<asset>
<mesh name="my_object_mesh" file="my_object.obj"/>
</asset>
<worldbody>
<body name="my_object">
<geom type="mesh" mesh="my_object_mesh"
rgba="1 0 0 1"
friction="0.95 0.3 0.1"/>
</body>
</worldbody>
</mujoco>
4️⃣ quick_guide_algo.ipynb - 算法开发
完整工作流
配置加载 (Hydra)
→ 数据准备 (HDF5 → Dataset)
→ 模型定义 (Policy)
→ 算法实现 (Algorithm)
→ 训练循环 (Train)
→ 评估 (Evaluate)
→ 可视化 (Visualize)
1. 配置和数据
from hydra import compose, initialize
from libero.lifelong.datasets import get_dataset, SequenceVLDataset
# 加载配置
initialize(config_path="../libero/configs")
cfg = compose(config_name="config")
# 准备数据
for i in range(benchmark.n_tasks):
dataset, shape_meta = get_dataset(
dataset_path=demo_path,
obs_modality=['rgb', 'low_dim'],
seq_len=10,
)
datasets.append(dataset)
# 添加语言条件
task_embs = get_task_embs(cfg, descriptions)
datasets = [SequenceVLDataset(ds, emb) for ds, emb in zip(datasets, task_embs)]
2. 自定义策略
import torch.nn as nn
class MyPolicy(nn.Module):
def __init__(self, cfg, shape_meta):
super().__init__()
# 图像编码器
self.image_encoder = ResnetEncoder(...)
# 语言编码器
self.language_encoder = MLPEncoder(...)
# 时间编码器
self.transformer = nn.TransformerDecoder(...)
# 动作头
self.policy_head = GMMHead(...)
def forward(self, obs_dict, task_emb):
img_feat = self.image_encoder(obs_dict['rgb'])
lang_feat = self.language_encoder(task_emb)
context = self.transformer(img_feat, lang_feat)
action_dist = self.policy_head(context[:, -1, :])
return action_dist
3. 自定义算法
from libero.lifelong.algos import Sequential
class MyAlgorithm(Sequential):
def __init__(self, n_tasks, **kwargs):
super().__init__(n_tasks, **kwargs)
self.memory_buffer = [] # 经验回放
self.fisher_info = {} # EWC
def learn_one_task(self, task_id, train_data, cfg):
# 混合当前数据和回放数据
if task_id > 0:
train_data = self.mix_with_replay(train_data)
# 训练循环
for epoch in range(cfg.train.n_epochs):
for batch in dataloader:
loss = self.compute_loss(batch)
if task_id > 0:
loss += self.ewc_loss() # 添加EWC
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
# 更新记忆
self.update_memory(train_data)
self.update_fisher(dataloader)
4. 评估和可视化
def evaluate_all_tasks(cfg, benchmark, policy, n_learned):
results = {}
for task_id in range(n_learned):
env = create_env(benchmark.get_task(task_id))
success_rate = run_episodes(env, policy, n_eval=50)
results[task_id] = success_rate
return results
# 可视化
import matplotlib.pyplot as plt
import seaborn as sns
# 热力图:任务性能矩阵
sns.heatmap(performance_matrix, annot=True, cmap='YlGnBu')
# 学习曲线
plt.plot(checkpoints, avg_performance)
# 计算指标
forgetting = compute_forgetting(results)
forward_transfer = compute_forward_transfer(results)
🎓 学习检查清单
入门级 ✅
- 运行quick_walkthrough所有代码
- 理解6个基准测试的区别
- 能够加载和可视化任务
- 创建简单的环境实例
中级 ✅
- 使用现有对象创建新任务
- 理解BDDL文件结构
- 定义初始场景和目标状态
- 生成可运行的BDDL文件
高级 ✅
- 准备自定义3D模型
- 注册新对象类型
- 集成到完整任务
- 调试MuJoCo XML
专家级 ✅
- 实现自定义策略架构
- 开发新的终身学习算法
- 完整实验流程
- 论文级别的评估和分析
🔧 调试技巧
常见错误排查
1. 路径问题
# 检查路径
print(get_libero_path("datasets"))
print(get_libero_path("bddl_files"))
# 重置路径
set_libero_default_path() # 恢复默认
2. BDDL语法错误
# 检查对象名称拼写
# 检查谓词语法
# 验证初始状态和目标的一致性
3. 环境创建失败
# 设置渲染后端
export MUJOCO_GL=egl # 或 glfw, osmesa
export PYOPENGL_PLATFORM=egl
4. 数据加载错误
# 检查HDF5文件
import h5py
with h5py.File(demo_file, 'r') as f:
print(list(f.keys()))
📊 性能优化建议
训练加速
# 1. 数据加载
cfg.data.num_workers = 4 # 多进程加载
# 2. 批处理
cfg.train.batch_size = 64 # 根据GPU调整
# 3. 混合精度
from torch.cuda.amp import autocast, GradScaler
scaler = GradScaler()
内存优化
# 1. 梯度累积
accumulation_steps = 4
for i, batch in enumerate(dataloader):
loss = model(batch) / accumulation_steps
loss.backward()
if (i + 1) % accumulation_steps == 0:
optimizer.step()
optimizer.zero_grad()
# 2. 清理缓存
torch.cuda.empty_cache()
🚀 快速开始模板
最小可运行示例
#!/usr/bin/env python
"""LIBERO最小示例"""
import os
import numpy as np
from libero.libero import benchmark, get_libero_path
from libero.libero.envs import OffScreenRenderEnv
# 1. 加载基准测试
bench = benchmark.get_benchmark_dict()["libero_10"]()
# 2. 获取任务
task = bench.get_task(0)
print(f"Task: {task.language}")
# 3. 创建环境(task.bddl_file只是文件名,需与bddl_files根目录拼接)
env = OffScreenRenderEnv(
    bddl_file_name=os.path.join(
        get_libero_path("bddl_files"), task.problem_folder, task.bddl_file
    )
)
# 4. 运行随机策略
obs = env.reset()
for step in range(100):
action = np.random.randn(7) # 随机动作
obs, reward, done, info = env.step(action)
if done:
break
env.close()
print(f"Success: {info.get('success', False)}")