LIBERO Notebooks Quick Reference Guide
📋 The Four Notebooks at a Glance
| Notebook | Core function | Key code examples | When to use |
|---|---|---|---|
| quick_walkthrough | Basic API usage | get_benchmark_dict(), get_task() | Getting started |
| procedural_creation | Task generation | @register_mu, @register_task_info | Creating new tasks |
| custom_object | Object extension | @register_object, XML models | Adding new objects |
| quick_guide_algo | Algorithm implementation | Custom Policy and Algorithm | Research and development |
1️⃣ quick_walkthrough.ipynb - 5-Minute Overview
Core functionality
```python
# 1. Path management
from libero.libero import get_libero_path, set_libero_default_path
datasets_path = get_libero_path("datasets")
# 2. Get the benchmarks
from libero.libero import benchmark
benchmark_dict = benchmark.get_benchmark_dict()
# dict with keys: 'libero_spatial', 'libero_object', 'libero_goal', 'libero_90', 'libero_10', 'libero_100'
# 3. Load a task
bench = benchmark_dict["libero_10"]()
task = bench.get_task(0)
print(task.language)   # language instruction
print(task.bddl_file)  # BDDL file name (join with the bddl_files path for the full path)
# 4. Create an environment
import os
from libero.libero.envs import OffScreenRenderEnv
task_bddl_file = os.path.join(get_libero_path("bddl_files"), task.problem_folder, task.bddl_file)
env = OffScreenRenderEnv(bddl_file_name=task_bddl_file)
obs = env.reset()
```
Key concepts
- 6 benchmark suites: spatial/object/goal with 10 tasks each; 90 + 10 = 100 tasks
- BDDL files: task definition files, including the language instruction
- Initial states: 50 fixed initial states per task, for reproducibility
- OffScreenRenderEnv: headless rendering environment, suited to server-side training
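A quick way to verify the six suites is to iterate over the benchmark dictionary; a minimal sketch, assuming each suite class instantiates with no arguments, as the walkthrough does for libero_10:
```python
# Print every available suite and its task count.
from libero.libero import benchmark

for name, cls in benchmark.get_benchmark_dict().items():
    bench = cls()
    print(f"{name}: {bench.get_num_tasks()} tasks")
```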
2️⃣ procedural_creation_walkthrough.ipynb - Task Creation in Three Steps
Step 1: Inspect available resources
```python
from libero.libero.envs.objects import get_object_dict, get_object_fn
from libero.libero.envs.predicates import get_predicate_fn_dict
# 50+ objects: food, containers, kitchenware, furniture
objects = get_object_dict()
# 'moka_pot', 'plate', 'microwave', 'wooden_cabinet'...
# 10 predicates: spatial relations and actions
predicates = get_predicate_fn_dict()
# 'on', 'in', 'open', 'close', 'turnon', 'turnoff'...
```
Step 2: Define the scene
```python
from libero.libero.utils.mu_utils import register_mu, InitialSceneTemplates

@register_mu(scene_type="kitchen")
class MyKitchenScene(InitialSceneTemplates):
    def __init__(self):
        fixture_num_info = {"kitchen_table": 1, "wooden_cabinet": 1}
        object_num_info = {"plate": 1, "moka_pot": 1}
        super().__init__(
            workspace_name="kitchen_table",
            fixture_num_info=fixture_num_info,
            object_num_info=object_num_info
        )

    def define_regions(self):
        # Placement region for the plate
        self.regions.update(
            self.get_region_dict(
                region_centroid_xy=[0.0, 0.2],
                region_name="plate_init_region",
                target_name=self.workspace_name,
                region_half_len=0.025
            )
        )
        # ... other regions

    @property
    def init_states(self):
        return [
            ("On", "plate_1", "kitchen_table_plate_init_region"),
            ("On", "moka_pot_1", "kitchen_table_moka_pot_init_region"),
        ]
```
Step 3: Define the task
```python
from libero.libero.utils.task_generation_utils import register_task_info

@register_task_info
class PutMokaPotOnPlate:
    def __init__(self):
        self.task_name = "put_the_moka_pot_on_the_plate"
        self.scene_name = "MyKitchenScene"
        self.language_instruction = "put the moka pot on the plate"

    @property
    def goal(self):
        return [("On", "moka_pot_1", "plate_1")]

# Generate the BDDL file
from libero.libero.utils.task_generation_utils import generate_bddl_from_task_info
generate_bddl_from_task_info(PutMokaPotOnPlate(), output_dir="./my_tasks")
```
Design pattern
Scene definition (@register_mu)
↓
Object placement regions (define_regions)
↓
Initial-state constraints (init_states)
↓
Task goal (@register_task_info)
↓
BDDL file generation
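Once the file is generated, a cheap sanity check is to load it into an environment; a minimal sketch, assuming the generator writes `<task_name>.bddl` into the output directory (the file name below is an assumption about that naming):
```python
# Sanity-check the generated task by constructing an environment from it.
from libero.libero.envs import OffScreenRenderEnv

env = OffScreenRenderEnv(bddl_file_name="./my_tasks/put_the_moka_pot_on_the_plate.bddl")
obs = env.reset()   # if this succeeds, the BDDL is structurally sound
env.close()
```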
3️⃣ custom_object_example.ipynb - Adding New Objects
Preparation
```
custom_assets/
├── my_object/
│   ├── my_object.xml  # MuJoCo model
│   └── my_object.obj  # mesh file (optional)
```
Object definition
```python
import os
import numpy as np
from robosuite.models.objects import MujocoXMLObject
from libero.libero.envs.base_object import register_object

@register_object
class MyCustomObject(MujocoXMLObject):
    def __init__(self, name="my_object", obj_name="my_object"):
        super().__init__(
            custom_path=os.path.abspath("./custom_assets/my_object/my_object.xml"),
            name=name,
            joints=[dict(type="free", damping="0.0005")],
            obj_type="all",
            duplicate_collision_geoms=False,
        )
        # Rotation constraints (radians)
        self.rotation = {
            "x": (-np.pi/2, -np.pi/2),
            "y": (0, 0),
            "z": (0, 2*np.pi),
        }
```
Using it in a scene
```python
@register_mu(scene_type="kitchen")
class SceneWithCustomObject(InitialSceneTemplates):
    def __init__(self):
        fixture_num_info = {"kitchen_table": 1}
        object_num_info = {
            "my_object": 1,  # the custom object
            "plate": 1,
        }
        super().__init__(
            workspace_name="kitchen_table",
            fixture_num_info=fixture_num_info,
            object_num_info=object_num_info
        )
```
XML template
```xml
<mujoco model="my_object">
  <asset>
    <mesh name="my_object_mesh" file="my_object.obj"/>
  </asset>
  <worldbody>
    <body name="my_object">
      <geom type="mesh" mesh="my_object_mesh"
            rgba="1 0 0 1"
            friction="0.95 0.3 0.1"/>
    </body>
  </worldbody>
</mujoco>
```
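Before registering the object, it can help to confirm the XML even parses; a minimal sketch using the official `mujoco` Python bindings (mesh paths resolve relative to the XML file, so run this against a directory layout matching the tree above):
```python
# Quick parse check for a custom object's MuJoCo XML.
import mujoco

model = mujoco.MjModel.from_xml_path("custom_assets/my_object/my_object.xml")
print(model.ngeom, "geoms loaded")  # nonzero if the mesh geom was built
```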
4️⃣ quick_guide_algo.ipynb - Algorithm Development
Complete workflow
Config loading (Hydra)
→ Data preparation (HDF5 → Dataset)
→ Model definition (Policy)
→ Algorithm implementation (Algorithm)
→ Training loop (Train)
→ Evaluation (Evaluate)
→ Visualization (Visualize)
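As a map of what follows, here is the whole pipeline as one hedged orchestration sketch; `prepare_datasets` is a hypothetical helper, while MyPolicy, MyAlgorithm, and evaluate_all_tasks follow the examples developed in steps 1-4 below rather than the library verbatim:
```python
# Bird's-eye view of a LIBERO lifelong-learning experiment (sketch).
def run_experiment(cfg, benchmark):
    datasets, shape_meta = prepare_datasets(cfg, benchmark)  # HDF5 -> Dataset (hypothetical helper)
    policy = MyPolicy(cfg, shape_meta)                       # model definition
    algo = MyAlgorithm(benchmark.n_tasks)                    # lifelong-learning algorithm
    for task_id in range(benchmark.n_tasks):                 # training loop, one task at a time
        algo.learn_one_task(task_id, datasets[task_id], cfg)
        results = evaluate_all_tasks(cfg, benchmark, policy, task_id + 1)
    return results                                           # visualize afterwards
```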
1. Configuration and data
```python
from hydra import compose, initialize
from libero.lifelong.datasets import get_dataset, SequenceVLDataset

# Load the config
initialize(config_path="../libero/configs")
cfg = compose(config_name="config")

# Prepare the data
datasets = []
for i in range(benchmark.n_tasks):
    dataset, shape_meta = get_dataset(
        dataset_path=demo_path,  # path to the task's HDF5 demonstrations
        obs_modality=['rgb', 'low_dim'],
        seq_len=10,
    )
    datasets.append(dataset)

# Add language conditioning
task_embs = get_task_embs(cfg, descriptions)
datasets = [SequenceVLDataset(ds, emb) for ds, emb in zip(datasets, task_embs)]
```
2. Custom policy
```python
import torch.nn as nn

class MyPolicy(nn.Module):
    def __init__(self, cfg, shape_meta):
        super().__init__()
        self.image_encoder = ResnetEncoder(...)        # image encoder
        self.language_encoder = MLPEncoder(...)        # language encoder
        self.transformer = nn.TransformerDecoder(...)  # temporal encoder
        self.policy_head = GMMHead(...)                # action head

    def forward(self, obs_dict, task_emb):
        img_feat = self.image_encoder(obs_dict['rgb'])
        lang_feat = self.language_encoder(task_emb)
        context = self.transformer(img_feat, lang_feat)
        action_dist = self.policy_head(context[:, -1, :])
        return action_dist
```
3. Custom algorithm
```python
from libero.lifelong.algos import Sequential

class MyAlgorithm(Sequential):
    def __init__(self, n_tasks, **kwargs):
        super().__init__(n_tasks, **kwargs)
        self.memory_buffer = []  # experience replay
        self.fisher_info = {}    # EWC

    def learn_one_task(self, task_id, train_data, cfg):
        # Mix current data with replayed data
        if task_id > 0:
            train_data = self.mix_with_replay(train_data)
        # Training loop
        for epoch in range(cfg.train.n_epochs):
            for batch in dataloader:
                loss = self.compute_loss(batch)
                if task_id > 0:
                    loss += self.ewc_loss()  # add the EWC penalty
                self.optimizer.zero_grad()
                loss.backward()
                self.optimizer.step()
        # Update the memory and Fisher information
        self.update_memory(train_data)
        self.update_fisher(dataloader)
```
4. Evaluation and visualization
```python
def evaluate_all_tasks(cfg, benchmark, policy, n_learned):
    results = {}
    for task_id in range(n_learned):
        env = create_env(benchmark.get_task(task_id))
        success_rate = run_episodes(env, policy, n_eval=50)
        results[task_id] = success_rate
    return results

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Heatmap: task performance matrix
sns.heatmap(performance_matrix, annot=True, cmap='YlGnBu')
# Learning curve
plt.plot(checkpoints, avg_performance)
# Metrics
forgetting = compute_forgetting(results)
forward_transfer = compute_forward_transfer(results)
```
🎓 Learning Checklist
Beginner ✅
- Run all the code in quick_walkthrough
- Understand the differences between the 6 benchmarks
- Load and visualize tasks
- Create a simple environment instance
Intermediate ✅
- Create new tasks from existing objects
- Understand the BDDL file structure
- Define initial scenes and goal states
- Generate runnable BDDL files
Advanced ✅
- Prepare custom 3D models
- Register new object types
- Integrate them into complete tasks
- Debug MuJoCo XML
Expert ✅
- Implement custom policy architectures
- Develop new lifelong-learning algorithms
- Run the full experiment pipeline
- Paper-grade evaluation and analysis
🔧 Debugging Tips
Troubleshooting common errors
1. Path problems
```python
# Check the configured paths
print(get_libero_path("datasets"))
print(get_libero_path("bddl_files"))
# Reset paths
set_libero_default_path()  # restore the defaults
```
2. BDDL syntax errors
```python
# Check object name spelling
# Check predicate syntax
# Verify that initial states and goals are consistent
# (a programmatic check is sketched below)
```
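Since this workflow has no standalone BDDL linter, a practical check is to attempt environment construction and let the parser surface the error; a minimal sketch (the file path is a placeholder for the BDDL you are debugging):
```python
from libero.libero.envs import OffScreenRenderEnv

try:
    env = OffScreenRenderEnv(bddl_file_name="./my_tasks/my_task.bddl")
    env.reset()
    print("BDDL parsed and scene built successfully")
    env.close()
except Exception as e:
    print(f"BDDL problem: {e}")  # typically a misspelled object or predicate
```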
3. Environment creation failures
```bash
# Set the rendering backend
export MUJOCO_GL=egl  # or glfw, osmesa
export PYOPENGL_PLATFORM=egl
```
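The same can be done from Python, as long as the variables are set before any MuJoCo import, e.g. at the top of a training script:
```python
import os
os.environ["MUJOCO_GL"] = "egl"          # or "glfw", "osmesa"
os.environ["PYOPENGL_PLATFORM"] = "egl"
```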
4. Data loading errors
```python
# Inspect the HDF5 file
import h5py
with h5py.File(demo_file, 'r') as f:
    print(list(f.keys()))
```
📊 Performance Tuning Suggestions
Speeding up training
```python
# 1. Data loading
cfg.data.num_workers = 4   # multi-process loading
# 2. Batch size
cfg.train.batch_size = 64  # adjust to your GPU
# 3. Mixed precision
from torch.cuda.amp import autocast, GradScaler
scaler = GradScaler()
```
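Building on the two imports above, a minimal AMP training step might look like this sketch; it assumes, as in step 4 of this section, that the policy returns an action distribution and that the batch carries ground-truth actions (the batch keys are assumptions):
```python
# One mixed-precision training step (sketch).
optimizer.zero_grad()
with autocast():
    action_dist = policy(batch["obs"], batch["task_emb"])
    loss = -action_dist.log_prob(batch["actions"]).mean()
scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
scaler.step(optimizer)         # unscales gradients, then steps
scaler.update()
```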
Memory optimization
```python
# 1. Gradient accumulation (assumes the model returns a scalar loss)
accumulation_steps = 4
for i, batch in enumerate(dataloader):
    loss = model(batch) / accumulation_steps
    loss.backward()
    if (i + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
# 2. Free cached GPU memory
torch.cuda.empty_cache()
```
🚀 Quick-Start Template
Minimal runnable example
```python
#!/usr/bin/env python
"""Minimal LIBERO example."""
import os
import numpy as np
from libero.libero import benchmark, get_libero_path
from libero.libero.envs import OffScreenRenderEnv

# 1. Load a benchmark
bench = benchmark.get_benchmark_dict()["libero_10"]()
# 2. Get a task
task = bench.get_task(0)
print(f"Task: {task.language}")
# 3. Create the environment (task.bddl_file is a file name, so build the full path)
bddl_file = os.path.join(get_libero_path("bddl_files"), task.problem_folder, task.bddl_file)
env = OffScreenRenderEnv(bddl_file_name=bddl_file)
# 4. Run a random policy
obs = env.reset()
for step in range(100):
    action = np.random.randn(7)  # random 7-DoF action
    obs, reward, done, info = env.step(action)
    if done:
        break
env.close()
print(f"Success: {info.get('success', False)}")
```
📚 Further Resources
Code example locations
- Official examples: LIBERO/notebooks/
- Algorithm implementations: LIBERO/libero/lifelong/algos/
- Policy architectures: LIBERO/libero/lifelong/models/policy/
Reading material
Related work
- MetaWorld: multi-task meta-learning benchmark
- RLBench: vision-based robot learning benchmark
- BEHAVIOR: household activity simulation
- Robosuite: robot manipulation framework
LIBERO Jupyter Notebooks: Complete Walkthrough
This is a detailed walkthrough of the four core Jupyter Notebook tutorials in the LIBERO project.
📚 Notebook Overview
| File | Purpose | Difficulty |
|---|---|---|
| quick_walkthrough.ipynb | LIBERO basics | ⭐ Beginner |
| procedural_creation_walkthrough.ipynb | Procedural task generation | ⭐⭐ Intermediate |
| custom_object_example.ipynb | Custom objects | ⭐⭐⭐ Advanced |
| quick_guide_algo.ipynb | Algorithm and model implementation | ⭐⭐⭐ Advanced |
1️⃣ quick_walkthrough.ipynb - Quick-Start Tutorial
📖 Tutorial overview
This introductory tutorial covers the core basics of LIBERO:
- Managing LIBERO's default configuration
- Basic information about the available benchmarks
- Checking benchmark integrity
- Validating initial-state files
- Visualizing task initial states
- Downloading datasets
🔑 Key code walkthrough
1.1 Path configuration management
```python
from libero.libero import benchmark, get_libero_path, set_libero_default_path

# Get the default paths
benchmark_root_path = get_libero_path("benchmark_root")
init_states_default_path = get_libero_path("init_states")
datasets_default_path = get_libero_path("datasets")
bddl_files_default_path = get_libero_path("bddl_files")
```
What it does:
- All paths are read from the config file ~/.libero/config.yaml
- The defaults point into the LIBERO codebase
- They can be changed to custom paths at runtime

Path types:
- benchmark_root: benchmark root directory
- init_states: directory of initial-state files
- datasets: dataset storage directory
- bddl_files: directory of BDDL task definition files
1.2 Setting custom paths
```python
# Set a custom path
set_libero_default_path(os.path.join(os.path.expanduser("~"), "custom_project"))
# Restore the defaults (call with no argument)
set_libero_default_path()
```
Use cases:
- Switching between projects
- Using different dataset versions
- Custom experiment environments
1.3 Getting the available benchmarks
```python
benchmark_dict = benchmark.get_benchmark_dict()
print(benchmark_dict)
```
Output:
```python
{
    'libero_spatial': <class 'LIBERO_SPATIAL'>,
    'libero_object': <class 'LIBERO_OBJECT'>,
    'libero_goal': <class 'LIBERO_GOAL'>,
    'libero_90': <class 'LIBERO_90'>,
    'libero_10': <class 'LIBERO_10'>,
    'libero_100': <class 'LIBERO_100'>
}
```
Benchmark descriptions:
- libero_spatial: 10 spatial-relation tasks
- libero_object: 10 object-concept tasks
- libero_goal: 10 goal-conditioned tasks
- libero_90: 90 pretraining tasks
- libero_10: 10 evaluation tasks
- libero_100: the full 100-task suite
1.4 Checking benchmark integrity
```python
# Instantiate a benchmark
benchmark_instance = benchmark_dict["libero_10"]()
num_tasks = benchmark_instance.get_num_tasks()
print(f"{num_tasks} tasks in the benchmark")

# Get all task names
task_names = benchmark_instance.get_task_names()

# Check each task's BDDL file
for i in range(num_tasks):
    task = benchmark_instance.get_task(i)
    bddl_file = os.path.join(bddl_files_default_path,
                             task.problem_folder,
                             task.bddl_file)
    if not os.path.exists(bddl_file):
        print(f"[error] bddl file {bddl_file} cannot be found")
```
Key attributes:
- task.name: task name
- task.language: language instruction
- task.problem_folder: problem folder
- task.bddl_file: BDDL file name

Example tasks:
```
LIVING_ROOM_SCENE2_put_both_the_alphabet_soup_and_the_tomato_sauce_in_the_basket
KITCHEN_SCENE3_turn_on_the_stove_and_put_the_moka_pot_on_it
```
1.5 Validating initial-state files
```python
# Check that each task's initial-state file exists
for i in range(num_tasks):
    task = benchmark_instance.get_task(i)
    init_states_path = os.path.join(init_states_default_path,
                                    task.problem_folder,
                                    task.init_states_file)
    if not os.path.exists(init_states_path):
        print(f"[error] init states {init_states_path} not found")

# Load the initial states
init_states = benchmark_instance.get_task_init_states(0)
print(init_states.shape)  # output: (50, 123)
```
Data format:
- Shape: (num_init_rollouts, num_simulation_states)
- 50: each task ships with 50 distinct initial states
- 123: dimensionality of the MuJoCo simulation state

Purpose:
- Reproducible task evaluation
- Diverse initial scenes
- Standardized benchmarking
1.6 Visualizing initial states
```python
from libero.libero.envs import OffScreenRenderEnv

# Get the task info
task = benchmark_instance.get_task(0)
task_bddl_file = os.path.join(bddl_files_default_path,
                              task.problem_folder,
                              task.bddl_file)

# Create the environment
env_args = {
    "bddl_file_name": task_bddl_file,
    "camera_heights": 128,
    "camera_widths": 128
}
env = OffScreenRenderEnv(**env_args)
env.seed(0)
env.reset()

# Set each initial state and visualize it
init_states = benchmark_instance.get_task_init_states(0)
for init_state_id in range(min(5, len(init_states))):
    env.set_init_state(init_states[init_state_id])
    obs = env.get_observation()
    # visualize the image in obs (see the sketch below)
env.close()
```
Environment parameters:
- bddl_file_name: task definition file
- camera_heights/widths: camera resolution (default 128x128)
- Supports off-screen rendering
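The loop above leaves the actual plotting to the reader; a minimal sketch for displaying one camera image from the observation dict. The key name "agentview_image" is an assumption, so print `obs.keys()` to see the camera names in your setup:
```python
import matplotlib.pyplot as plt

plt.imshow(obs["agentview_image"][::-1])  # flip vertically if it renders upside down
plt.axis("off")
plt.show()
```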
💡 Key takeaways
- Configuration management: understand LIBERO's path management system
- Benchmark structure: know the differences and uses of the 6 benchmarks
- Task integrity: verify BDDL files and initial-state files
- Environment interaction: create and use LIBERO environments
2️⃣ procedural_creation_walkthrough.ipynb - Procedural Generation Tutorial
📖 Tutorial overview
This tutorial shows how to create custom tasks with LIBERO's procedural generation pipeline:
- Retrieving available objects and predicates
- Defining initial-state distributions
- Defining task goals
- Generating BDDL files
🔑 Key code walkthrough
2.1 Listing available objects
```python
from libero.libero.envs.objects import get_object_dict, get_object_fn

# Get all available objects
object_dict = get_object_dict()
print(object_dict)
```
Object categories:

HOPE objects (everyday groceries):
- Food: alphabet_soup, ketchup, mayo, milk, cookies
- Condiments: bbq_sauce, salad_dressing, tomato_sauce
- Dairy: butter, cream_cheese, chocolate_pudding

Google scanned objects (containers and kitchenware):
- Containers: basket, white_bowl, akita_black_bowl, plate
- Kitchenware: chefmate_8_frypan, moka_pot, rack

Articulated objects (furniture and appliances):
- Kitchen appliances: microwave, flat_stove, faucet
- Cabinets: slide_cabinet, wooden_cabinet, white_cabinet
- Other: window, short_fridge

TurboSquid objects (decorative and functional):
- Books: black_book, yellow_book
- Mugs: red_coffee_mug, porcelain_mug, white_yellow_mug
- Furniture: wooden_shelf, wine_rack, desk_caddy
2.2 Retrieving a specific object class
```python
category_name = "moka_pot"
object_cls = get_object_fn(category_name)
print(f"{category_name}: defined in the class {object_cls}")
# output: moka_pot: defined in the class <class 'MokaPot'>
```
2.3 Getting the available predicates
```python
from libero.libero.envs.predicates import get_predicate_fn_dict, get_predicate_fn

# Get all predicates
predicate_dict = get_predicate_fn_dict()
print(predicate_dict)
```
Available predicates:
- true/false: boolean predicates
- in: object A is inside object B
- on: object A is on top of object B
- up: object A is lifted up
- open: open an articulated object
- close: close an articulated object
- turnon: turn a switch on
- turnoff: turn a switch off
- printjointstate: debugging helper that prints joint states
```python
# Get a specific predicate
predicate_name = "on"
on_predicate = get_predicate_fn(predicate_name)
```
2.4 Defining a custom initial scene
```python
import numpy as np
from libero.libero.utils.mu_utils import register_mu, InitialSceneTemplates
from libero.libero.utils.bddl_generation_utils import get_xy_region_kwargs_list_from_regions_info

@register_mu(scene_type="kitchen")
class KitchenScene1(InitialSceneTemplates):
    def __init__(self):
        # Numbers of fixtures
        fixture_num_info = {
            "kitchen_table": 1,   # one kitchen table
            "wooden_cabinet": 1,  # one wooden cabinet
        }
        # Numbers of objects
        object_num_info = {
            "akita_black_bowl": 1,  # one black bowl
            "plate": 1,             # one plate
        }
        super().__init__(
            workspace_name="kitchen_table",
            fixture_num_info=fixture_num_info,
            object_num_info=object_num_info
        )

    def define_regions(self):
        """Define the placement regions for the objects."""
        # Cabinet placement region
        self.regions.update(
            self.get_region_dict(
                region_centroid_xy=[0.0, -0.30],  # region center coordinates
                region_name="wooden_cabinet_init_region",
                target_name=self.workspace_name,
                region_half_len=0.01,             # region half-extent
                yaw_rotation=(np.pi, np.pi)       # yaw rotation range
            )
        )
        # Black bowl placement region
        self.regions.update(
            self.get_region_dict(
                region_centroid_xy=[0.0, 0.0],
                region_name="akita_black_bowl_init_region",
                target_name=self.workspace_name,
                region_half_len=0.025
            )
        )
        # Plate placement region
        self.regions.update(
            self.get_region_dict(
                region_centroid_xy=[0.0, 0.25],
                region_name="plate_init_region",
                target_name=self.workspace_name,
                region_half_len=0.025
            )
        )
        self.xy_region_kwargs_list = get_xy_region_kwargs_list_from_regions_info(self.regions)

    @property
    def init_states(self):
        """Define the initial-state constraints."""
        states = [
            ("On", "akita_black_bowl_1", "kitchen_table_akita_black_bowl_init_region"),
            ("On", "plate_1", "kitchen_table_plate_init_region"),
            ("On", "wooden_cabinet_1", "kitchen_table_wooden_cabinet_init_region")
        ]
        return states
```
Key parameters:
- workspace_name: the workspace (usually a table or countertop)
- fixture_num_info: fixtures (large, immovable objects)
- object_num_info: manipulable objects
- region_centroid_xy: XY coordinates of the region center
- region_half_len: region half-extent (meters)
- yaw_rotation: range of the object's yaw rotation
2.5 Defining the task goal
```python
from libero.libero.utils.task_generation_utils import register_task_info, generate_bddl_from_task_info

@register_task_info
class KitchenScene1PutBowlOnPlate:
    """Task: put the black bowl on the plate."""
    def __init__(self):
        self.task_name = "KITCHEN_SCENE1_put_the_black_bowl_on_the_plate"
        self.scene_name = "KitchenScene1"  # matches the scene defined above
        self.task_description = "put the black bowl on the plate"
        # Language instruction (for language-conditioned policies)
        self.language_instruction = "put the black bowl on the plate"

    @property
    def goal(self):
        """Define the goal state."""
        return [
            ("On", "akita_black_bowl_1", "plate_1"),  # bowl on the plate
        ]
```
Goal tuple format:
```
(predicate, object_1, object_2)
```
Common goal examples:
```python
# Put an object into a container
("In", "milk_1", "basket_1")
# Put an object on a surface
("On", "mug_1", "plate_1")
# Open an articulated fixture
("Open", "microwave_1")
# Turn on an appliance
("TurnOn", "flat_stove_1")
```
2.6 Generating the BDDL file
```python
from libero.libero.utils.task_generation_utils import generate_bddl_from_task_info

# Generate the task's BDDL file
task_info = KitchenScene1PutBowlOnPlate()
bddl_file_path = generate_bddl_from_task_info(
    task_info,
    output_dir="./custom_pddl"
)
print(f"BDDL file generated at: {bddl_file_path}")
```
Example of the generated BDDL file:
```lisp
(define (problem KITCHEN_SCENE1_put_the_black_bowl_on_the_plate)
  (:domain libero)
  (:language "put the black bowl on the plate")
  (:objects
    kitchen_table - kitchen_table
    wooden_cabinet_1 - wooden_cabinet
    akita_black_bowl_1 - akita_black_bowl
    plate_1 - plate
  )
  (:init
    (On akita_black_bowl_1 kitchen_table_akita_black_bowl_init_region)
    (On plate_1 kitchen_table_plate_init_region)
    (On wooden_cabinet_1 kitchen_table_wooden_cabinet_init_region)
  )
  (:goal
    (And
      (On akita_black_bowl_1 plate_1)
    )
  )
)
```
💡 Key takeaways
- Object system: the 50+ available objects and their categories
- Predicate system: the semantics and use of the 10 predicates
- Scene definition: the decorator pattern for defining initial scenes
- Task goals: composing predicates into complex task goals
- BDDL generation: automated, standardized task definitions
3️⃣ custom_object_example.ipynb - Custom Object Tutorial
📖 Tutorial overview
This advanced tutorial shows how to add custom objects to LIBERO:
- Defining a custom object class
- Registering the object with the system
- Creating scenes that use the custom object
- Generating tasks that include it
🔑 Key code walkthrough
3.1 Defining a custom object base class
```python
import os
import re
import numpy as np
from robosuite.models.objects import MujocoXMLObject
from libero.libero.envs.base_object import register_object

class CustomObjects(MujocoXMLObject):
    """Base class for custom objects."""
    def __init__(self, custom_path, name, obj_name,
                 joints=[dict(type="free", damping="0.0005")]):
        # The path must be absolute and point to an XML file
        assert os.path.isabs(custom_path), "Custom path must be an absolute path"
        assert custom_path.endswith(".xml"), "Custom path must be an xml file"
        super().__init__(
            custom_path,
            name=name,
            joints=joints,
            obj_type="all",
            duplicate_collision_geoms=False,
        )
        # Derive the category name from the class name (CamelCase -> snake_case)
        self.category_name = "_".join(
            re.sub(r"([A-Z])", r" \1", self.__class__.__name__).split()
        ).lower()
        # Object properties
        self.object_properties = {"vis_site_names": {}}
```
Key parameters:
- custom_path: absolute path to the XML file
- joints: joint definition (free = a 6-DoF floating joint)
- obj_type: object type
- duplicate_collision_geoms: whether to duplicate collision geometry
3.2 Registering concrete custom objects
```python
@register_object
class LiberoMug(CustomObjects):
    """A custom mug object."""
    def __init__(self, name="libero_mug", obj_name="libero_mug"):
        super().__init__(
            custom_path=os.path.abspath(os.path.join(
                "./", "custom_assets", "libero_mug", "libero_mug.xml"
            )),
            name=name,
            obj_name=obj_name,
        )
        # Orientation constraints for the object
        self.rotation = {
            "x": (-np.pi/2, -np.pi/2),  # rotation range about the X axis
            "y": (-np.pi, -np.pi),      # rotation range about the Y axis
            "z": (np.pi, np.pi),        # rotation range about the Z axis
        }
        self.rotation_axis = None  # no specific rotation axis

@register_object
class LiberoMugYellow(CustomObjects):
    """Yellow mug variant."""
    def __init__(self, name="libero_mug_yellow", obj_name="libero_mug_yellow"):
        super().__init__(
            custom_path=os.path.abspath(os.path.join(
                "./", "custom_assets", "libero_mug_yellow", "libero_mug_yellow.xml"
            )),
            name=name,
            obj_name=obj_name,
        )
        self.rotation = {
            "x": (-np.pi/2, -np.pi/2),
            "y": (-np.pi, -np.pi),
            "z": (np.pi, np.pi),
        }
        self.rotation_axis = None
```
What the @register_object decorator does:
- Registers the object in the global object dictionary
- Makes the object usable in BDDL files
- Allows the object class to be retrieved by name
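A quick check of the registration, reusing get_object_fn from the procedural-generation tutorial; since the category name is derived from the class name, LiberoMug should be retrievable as "libero_mug":
```python
from libero.libero.envs.objects import get_object_fn

mug_cls = get_object_fn("libero_mug")
print(mug_cls)  # expected: <class 'LiberoMug'>
```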
3.3 Using custom objects in a scene
```python
from libero.libero.utils.mu_utils import register_mu, InitialSceneTemplates

@register_mu(scene_type="kitchen")
class KitchenDemoScene(InitialSceneTemplates):
    def __init__(self):
        fixture_num_info = {
            "kitchen_table": 1,
            "wooden_cabinet": 1,
        }
        # Custom objects alongside a built-in one
        object_num_info = {
            "libero_mug": 1,         # the custom LiberoMug
            "libero_mug_yellow": 1,  # the yellow variant
            "plate": 1,              # a built-in object
        }
        super().__init__(
            workspace_name="kitchen_table",
            fixture_num_info=fixture_num_info,
            object_num_info=object_num_info
        )

    def define_regions(self):
        """Define placement regions for the custom objects."""
        self.regions.update(
            self.get_region_dict(
                region_centroid_xy=[0.0, 0.0],
                region_name="libero_mug_init_region",
                target_name=self.workspace_name,
                region_half_len=0.025
            )
        )
        self.regions.update(
            self.get_region_dict(
                region_centroid_xy=[0.2, 0.0],
                region_name="libero_mug_yellow_init_region",
                target_name=self.workspace_name,
                region_half_len=0.025
            )
        )
        self.regions.update(
            self.get_region_dict(
                region_centroid_xy=[-0.2, 0.0],
                region_name="plate_init_region",
                target_name=self.workspace_name,
                region_half_len=0.025
            )
        )
        self.xy_region_kwargs_list = get_xy_region_kwargs_list_from_regions_info(self.regions)

    @property
    def init_states(self):
        states = [
            ("On", "libero_mug_1", "kitchen_table_libero_mug_init_region"),
            ("On", "libero_mug_yellow_1", "kitchen_table_libero_mug_yellow_init_region"),
            ("On", "plate_1", "kitchen_table_plate_init_region"),
            ("On", "wooden_cabinet_1", "kitchen_table_wooden_cabinet_init_region")
        ]
        return states
```
3.4 Defining a task that uses the custom objects
```python
from libero.libero.utils.task_generation_utils import register_task_info

@register_task_info
class KitchenDemoScenePutMugsOnPlate:
    """Task: put both mugs on the plate."""
    def __init__(self):
        self.task_name = "KITCHEN_DEMO_SCENE_put_both_mugs_on_the_plate"
        self.scene_name = "KitchenDemoScene"
        self.task_description = "put both mugs on the plate"
        self.language_instruction = "put the libero mug and yellow mug on the plate"

    @property
    def goal(self):
        return [
            ("On", "libero_mug_1", "plate_1"),
            ("On", "libero_mug_yellow_1", "plate_1"),
        ]

# Generate the BDDL file
task_info = KitchenDemoScenePutMugsOnPlate()
bddl_file = generate_bddl_from_task_info(
    task_info,
    output_dir="./custom_pddl"
)
```
3.5 Preparing the object's XML files
Custom objects need a 3D model in MuJoCo XML format. Example file layout:
```
custom_assets/
├── libero_mug/
│   ├── libero_mug.xml        # MuJoCo model definition
│   ├── libero_mug.obj        # mesh file (optional)
│   └── textures/             # texture files (optional)
└── libero_mug_yellow/
    ├── libero_mug_yellow.xml
    ├── libero_mug_yellow.obj
    └── textures/
```
Example libero_mug.xml:
```xml
<mujoco model="libero_mug">
  <asset>
    <mesh name="libero_mug_mesh" file="libero_mug.obj"/>
  </asset>
  <worldbody>
    <body name="libero_mug">
      <geom type="mesh" mesh="libero_mug_mesh"
            rgba="0.8 0.8 0.8 1"
            friction="0.95 0.3 0.1"/>
    </body>
  </worldbody>
</mujoco>
```
💡 Key takeaways
- Object definition: creating MuJoCo-compliant custom objects
- Registration mechanism: the decorator registration pattern
- Rotation constraints: how to constrain object orientation
- Scene integration: integrating custom objects seamlessly into scenes
- Asset management: organizing and managing custom 3D assets
4️⃣ quick_guide_algo.ipynb - Algorithm Implementation Guide
📖 Tutorial overview
This advanced tutorial covers the core components of a LIBERO experiment:
- Preparing datasets
- Writing a custom algorithm
- Defining a custom policy architecture
- Implementing the training loop
- Visualizing results
🔑 Key code walkthrough
4.1 Config loading and initialization
```python
from hydra import compose, initialize
from omegaconf import OmegaConf
import yaml
from easydict import EasyDict
import hydra

# Clear any previous Hydra instance
hydra.core.global_hydra.GlobalHydra.instance().clear()

# Load the default config
initialize(config_path="../libero/configs")
hydra_cfg = compose(config_name="config")
yaml_config = OmegaConf.to_yaml(hydra_cfg)
cfg = EasyDict(yaml.safe_load(yaml_config))
```
Config structure:
```python
cfg = {
    'policy': {
        'policy_type': 'BCTransformerPolicy',
        'image_encoder': {...},
        'language_encoder': {...},
        'policy_head': {...},
        'transformer_num_layers': 4,
        'transformer_num_heads': 6,
        ...
    },
    'data': {
        'obs': {'modality': ['rgb', 'low_dim']},
        'seq_len': 10,
        ...
    },
    'train': {
        'n_epochs': 25,
        'batch_size': 64,
        ...
    },
    'eval': {
        'n_eval': 5,
        'num_procs': 1,
        ...
    }
}
```
4.2 Dataset preparation
```python
from libero.lifelong.datasets import get_dataset, SequenceVLDataset
from libero.libero.benchmark import get_benchmark
from libero.lifelong.utils import get_task_embs

# Set paths
cfg.folder = get_libero_path("datasets")
cfg.bddl_folder = get_libero_path("bddl_files")
cfg.init_states_folder = get_libero_path("init_states")

# Choose a benchmark
cfg.benchmark_name = "libero_object"
benchmark = get_benchmark(cfg.benchmark_name)(task_order=0)

# Prepare the datasets
datasets = []
descriptions = []
shape_meta = None

for i in range(benchmark.n_tasks):
    # Load this task's dataset
    task_i_dataset, shape_meta = get_dataset(
        dataset_path=os.path.join(cfg.folder, benchmark.get_task_demonstration(i)),
        obs_modality=cfg.data.obs.modality,
        initialize_obs_utils=(i==0),  # initialize obs utils on the first task only
        seq_len=cfg.data.seq_len,
    )
    # Collect the language description
    descriptions.append(benchmark.get_task(i).language)
    datasets.append(task_i_dataset)

# Compute the task embeddings
task_embs = get_task_embs(cfg, descriptions)
benchmark.set_task_embs(task_embs)

# Wrap into vision-language datasets
datasets = [SequenceVLDataset(ds, emb) for (ds, emb) in zip(datasets, task_embs)]
```
Dataset notes:
- SequenceDataset: handles sequential data (observation histories)
- SequenceVLDataset: adds language conditioning
- shape_meta: shape information for observations and actions
- task_embs: language embeddings of the tasks
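A cheap way to confirm the pipeline is to pull one batch and inspect it; a minimal sketch, noting that the exact batch keys depend on the dataset implementation and are assumptions here:
```python
from torch.utils.data import DataLoader

loader = DataLoader(datasets[0], batch_size=4, shuffle=True)
batch = next(iter(loader))
print(batch.keys())  # e.g. observations, actions, task_emb (assumed key names)
print(shape_meta)    # observation / action shape info
```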
4.3 Custom policy architecture
```python
import torch
import torch.nn as nn
from libero.lifelong.models.policy_head import *
from libero.lifelong.models.policy import *

class CustomTransformerPolicy(nn.Module):
    """A custom Transformer policy."""
    def __init__(self, cfg, shape_meta):
        super().__init__()
        self.cfg = cfg
        # Image encoder
        self.image_encoder = ResnetEncoder(
            input_shape=shape_meta['obs']['rgb_shape'],
            language_dim=cfg.policy.language_encoder.network_kwargs.output_size,
            freeze=False,
            pretrained=False
        )
        # Language encoder
        self.language_encoder = MLPEncoder(
            input_size=768,  # BERT embedding size
            hidden_size=128,
            num_layers=1,
            output_size=128
        )
        # Transformer temporal encoder
        self.transformer = nn.TransformerDecoder(
            decoder_layer=nn.TransformerDecoderLayer(
                d_model=cfg.policy.transformer_input_size,
                nhead=cfg.policy.transformer_num_heads,
                dim_feedforward=cfg.policy.transformer_mlp_hidden_size,
                dropout=cfg.policy.transformer_dropout
            ),
            num_layers=cfg.policy.transformer_num_layers
        )
        # Positional encoding
        self.pos_encoding = SinusoidalPositionEncoding(
            input_size=cfg.policy.transformer_input_size
        )
        # Action head (Gaussian mixture model)
        self.policy_head = GMMHead(
            input_size=cfg.policy.transformer_head_output_size,
            output_size=shape_meta['ac_dim'],
            hidden_size=1024,
            num_modes=5,  # 5 Gaussian modes
            min_std=0.0001
        )

    def forward(self, obs_dict, task_emb):
        """
        Forward pass.
        Args:
            obs_dict: observation dict with images and low-dim states
            task_emb: task embedding vector
        Returns:
            action_dist: action distribution
        """
        batch_size, seq_len = obs_dict['rgb'].shape[:2]
        # Encode the language
        lang_feat = self.language_encoder(task_emb)
        # Encode the image sequence
        rgb_seq = obs_dict['rgb'].reshape(-1, *obs_dict['rgb'].shape[2:])
        img_feat = self.image_encoder(rgb_seq, lang_feat)
        img_feat = img_feat.reshape(batch_size, seq_len, -1)
        # Add positional encoding
        img_feat = self.pos_encoding(img_feat)
        # Transformer: (batch, seq, features) -> (seq, batch, features)
        img_feat = img_feat.transpose(0, 1)
        context = self.transformer(img_feat, img_feat)
        context = context.transpose(0, 1)
        # Keep only the last time step
        context = context[:, -1, :]
        # Produce the action distribution
        action_dist = self.policy_head(context)
        return action_dist

    def get_action(self, obs_dict, task_emb):
        """Get an action (used at inference time)."""
        with torch.no_grad():
            action_dist = self.forward(obs_dict, task_emb)
            # Sample from the distribution, or take the mode
            action = action_dist.sample()  # or action_dist.mode()
        return action
```
Architecture components:
- ResnetEncoder: CNN visual encoder
- MLPEncoder: language encoder
- TransformerDecoder: temporal sequence processing
- SinusoidalPositionEncoding: positional encoding
- GMMHead: Gaussian mixture output head
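A smoke test for the class above, as a sketch only: run a dummy forward pass with random tensors. The shapes here are assumptions; the real ones come from shape_meta and the config:
```python
import torch

B, T = 2, 10                                     # batch size, sequence length
obs_dict = {"rgb": torch.randn(B, T, 3, 128, 128)}
task_emb = torch.randn(B, 768)                   # BERT-sized embedding (assumed)
policy = CustomTransformerPolicy(cfg, shape_meta)
dist = policy(obs_dict, task_emb)
print(dist.sample().shape)                       # expected: (B, ac_dim)
```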
4.4 Custom lifelong-learning algorithm
```python
import random
import torch
import torch.optim as optim
from libero.lifelong.algos import Sequential

class CustomLifelongAlgorithm(Sequential):
    """A custom lifelong learning algorithm."""
    def __init__(self, n_tasks, **kwargs):
        super().__init__(n_tasks, **kwargs)
        # Algorithm-specific hyperparameters
        self.memory_size = 1000  # replay buffer size
        self.replay_buffer = []
        self.ewc_lambda = 1000   # EWC regularization coefficient
        self.fisher_information = {}

    def learn_one_task(self, task_id, train_data, cfg):
        """
        Learn a single task.
        Args:
            task_id: task ID
            train_data: training data
            cfg: config
        """
        print(f"[Algorithm] Learning task {task_id}")
        # Build the dataloader, mixing in replayed data after the first task
        if task_id > 0 and len(self.replay_buffer) > 0:
            combined_data = self._combine_with_replay(train_data)
            dataloader = self._create_dataloader(combined_data, cfg)
        else:
            dataloader = self._create_dataloader(train_data, cfg)
        # Training loop
        for epoch in range(cfg.train.n_epochs):
            epoch_loss = 0
            for batch_idx, batch in enumerate(dataloader):
                # Forward pass
                obs_dict = self._prepare_obs(batch)
                task_emb = batch['task_emb']
                actions = batch['actions']
                action_dist = self.policy(obs_dict, task_emb)
                # Behavior cloning loss
                bc_loss = -action_dist.log_prob(actions).mean()
                # Add the EWC penalty (for every task after the first)
                if task_id > 0:
                    ewc_loss = self._compute_ewc_loss()
                    total_loss = bc_loss + self.ewc_lambda * ewc_loss
                else:
                    total_loss = bc_loss
                # Backward pass
                self.optimizer.zero_grad()
                total_loss.backward()
                self.optimizer.step()
                epoch_loss += total_loss.item()
            print(f"Epoch {epoch}: Loss = {epoch_loss / len(dataloader):.4f}")
        # Post-task bookkeeping
        self._store_important_samples(train_data, task_id)
        self._update_fisher_information(dataloader)

    def _combine_with_replay(self, current_data):
        """Mix current data with replayed data."""
        # Sample from the replay buffer
        replay_samples = random.sample(self.replay_buffer,
                                       min(len(self.replay_buffer),
                                           len(current_data) // 2))
        return current_data + replay_samples

    def _store_important_samples(self, data, task_id):
        """Store important samples in the replay buffer."""
        # Simple strategy: store random samples
        if len(self.replay_buffer) < self.memory_size:
            samples_to_store = min(self.memory_size - len(self.replay_buffer),
                                   len(data))
            self.replay_buffer.extend(random.sample(data, samples_to_store))
        else:
            # Buffer is full: replace old samples
            samples_to_replace = min(len(data), self.memory_size // self.n_tasks)
            indices = random.sample(range(len(self.replay_buffer)),
                                    samples_to_replace)
            new_samples = random.sample(data, samples_to_replace)
            for idx, sample in zip(indices, new_samples):
                self.replay_buffer[idx] = sample

    def _compute_ewc_loss(self):
        """Compute the elastic weight consolidation loss."""
        ewc_loss = 0
        for name, param in self.policy.named_parameters():
            if name in self.fisher_information:
                fisher = self.fisher_information[name]
                old_param = self.old_params[name]
                ewc_loss += (fisher * (param - old_param) ** 2).sum()
        return ewc_loss

    def _update_fisher_information(self, dataloader):
        """Update the Fisher information matrix."""
        self.fisher_information = {}
        self.old_params = {}
        # Snapshot the current parameters
        for name, param in self.policy.named_parameters():
            self.old_params[name] = param.data.clone()
            self.fisher_information[name] = torch.zeros_like(param)
        # Accumulate the (diagonal) Fisher information
        for batch in dataloader:
            obs_dict = self._prepare_obs(batch)
            task_emb = batch['task_emb']
            action_dist = self.policy(obs_dict, task_emb)
            loss = -action_dist.log_prob(batch['actions']).mean()
            self.optimizer.zero_grad()
            loss.backward()
            for name, param in self.policy.named_parameters():
                if param.grad is not None:
                    self.fisher_information[name] += param.grad.data ** 2 / len(dataloader)
```
Key algorithm features:
- Experience replay: store and replay samples from previous tasks
- EWC regularization: protect parameters that were important for earlier tasks
- Fisher information: measures parameter importance
- Mixed training: combine current-task and past-task data
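For reference, the penalty implemented by `_compute_ewc_loss` and `_update_fisher_information` above is the standard EWC objective:

$$
\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{BC}}(\theta) + \lambda \sum_i F_i \,\bigl(\theta_i - \theta_i^{*}\bigr)^2,
\qquad
F_i \approx \frac{1}{N}\sum_{n=1}^{N} \left(\frac{\partial \log \pi_\theta(a_n \mid o_n)}{\partial \theta_i}\right)^{2}
$$

where $\theta^{*}$ are the parameters snapshotted after the previous task, $\lambda$ is `ewc_lambda`, and $F_i$ is the diagonal Fisher estimate accumulated over the dataloader, exactly as in the code.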
4.5 Training loop implementation
```python
import os
import torch
from libero.lifelong.utils import create_experiment_dir

def train_lifelong_learning(cfg, benchmark, datasets, policy, algorithm):
    """
    Main lifelong-learning training loop.
    Args:
        cfg: config
        benchmark: the benchmark
        datasets: list of datasets, one per task
        policy: policy network
        algorithm: lifelong learning algorithm
    """
    # Create the experiment directory
    exp_dir = create_experiment_dir(cfg)

    # Train on each task in sequence
    for task_id in range(benchmark.n_tasks):
        print(f"\n{'='*50}")
        print(f"Training on Task {task_id}: {benchmark.get_task_names()[task_id]}")
        print(f"{'='*50}\n")

        # Current task's data
        train_data = datasets[task_id]

        # Learn the task
        algorithm.learn_one_task(task_id, train_data, cfg)

        # Evaluate all tasks learned so far
        if (task_id + 1) % cfg.eval.eval_freq == 0:
            print(f"\nEvaluating after task {task_id}...")
            results = evaluate_all_tasks(
                cfg,
                benchmark,
                policy,
                task_id + 1
            )
            # Save and print the results
            save_results(results, exp_dir, task_id)
            print_evaluation_results(results)

        # Save a checkpoint
        if (task_id + 1) % cfg.train.save_freq == 0:
            checkpoint_path = os.path.join(exp_dir, f"checkpoint_task_{task_id}.pth")
            torch.save({
                'task_id': task_id,
                'policy_state_dict': policy.state_dict(),
                'algorithm_state': algorithm.get_state(),
                'cfg': cfg
            }, checkpoint_path)
            print(f"Checkpoint saved to {checkpoint_path}")

def evaluate_all_tasks(cfg, benchmark, policy, n_learned_tasks):
    """
    Evaluate all tasks learned so far.
    Returns:
        results: dict with each task's success rate
    """
    results = {}
    for task_id in range(n_learned_tasks):
        task = benchmark.get_task(task_id)
        success_rate = evaluate_single_task(cfg, task, policy)
        results[task_id] = {
            'task_name': benchmark.get_task_names()[task_id],
            'success_rate': success_rate
        }
    # Average performance
    avg_success = sum(r['success_rate'] for r in results.values()) / n_learned_tasks
    results['average'] = avg_success
    return results

def evaluate_single_task(cfg, task, policy):
    """
    Evaluate a single task.
    Returns:
        success_rate: fraction of successful episodes
    """
    from libero.libero.envs import OffScreenRenderEnv

    # Create the environment (task.bddl_file is a file name; build the full path)
    env = OffScreenRenderEnv(
        bddl_file_name=os.path.join(cfg.bddl_folder, task.problem_folder, task.bddl_file),
        camera_heights=128,
        camera_widths=128
    )
    success_count = 0
    n_eval = cfg.eval.n_eval

    for eval_idx in range(n_eval):
        obs = env.reset()
        done = False
        step_count = 0
        max_steps = cfg.eval.max_steps
        while not done and step_count < max_steps:
            # Get an action
            with torch.no_grad():
                obs_dict = prepare_obs_for_policy(obs)
                task_emb = task.task_emb
                action = policy.get_action(obs_dict, task_emb)
            # Step the environment
            obs, reward, done, info = env.step(action.cpu().numpy())
            step_count += 1
        if info.get('success', False):
            success_count += 1

    env.close()
    success_rate = success_count / n_eval
    return success_rate

# Run training
train_lifelong_learning(cfg, benchmark, datasets, policy, algorithm)
```
4.6 Visualizing results
```python
import os
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

def visualize_results(results_dict, save_path=None):
    """
    Visualize lifelong learning results.
    Args:
        results_dict: results at multiple checkpoints
        save_path: directory to save the figures in
    """
    # Prepare the data
    n_tasks = len(results_dict[0]) - 1  # minus the 'average' key
    checkpoints = sorted(results_dict.keys())

    # Build the heatmap matrix
    heatmap_data = np.zeros((n_tasks, len(checkpoints)))
    for i, checkpoint in enumerate(checkpoints):
        for task_id in range(n_tasks):
            heatmap_data[task_id, i] = results_dict[checkpoint][task_id]['success_rate']

    # Plot the heatmap
    plt.figure(figsize=(12, 8))
    sns.heatmap(
        heatmap_data,
        annot=True,
        fmt='.2f',
        cmap='YlGnBu',
        xticklabels=[f"After Task {cp}" for cp in checkpoints],
        yticklabels=[f"Task {i}" for i in range(n_tasks)],
        vmin=0,
        vmax=1
    )
    plt.title('Task Success Rate Throughout Lifelong Learning')
    plt.xlabel('Training Checkpoint')
    plt.ylabel('Task ID')
    plt.tight_layout()
    if save_path:
        plt.savefig(os.path.join(save_path, 'heatmap.png'), dpi=300)
    plt.show()

    # Plot the learning curve
    plt.figure(figsize=(10, 6))
    average_performance = [results_dict[cp]['average'] for cp in checkpoints]
    plt.plot(checkpoints, average_performance, marker='o', linewidth=2)
    plt.xlabel('Number of Tasks Learned')
    plt.ylabel('Average Success Rate')
    plt.title('Lifelong Learning Performance')
    plt.grid(True, alpha=0.3)
    plt.ylim([0, 1])
    if save_path:
        plt.savefig(os.path.join(save_path, 'learning_curve.png'), dpi=300)
    plt.show()

    # Forgetting metric
    forgetting = compute_forgetting(results_dict)
    print(f"\nAverage Forgetting: {forgetting:.4f}")
    # Forward transfer
    forward_transfer = compute_forward_transfer(results_dict)
    print(f"Forward Transfer: {forward_transfer:.4f}")

def compute_forgetting(results_dict):
    """Compute the catastrophic-forgetting metric."""
    n_tasks = len(results_dict[0]) - 1
    forgetting_sum = 0
    for task_id in range(n_tasks - 1):
        # Best performance after the task was learned
        max_perf = max(
            results_dict[cp][task_id]['success_rate']
            for cp in range(task_id, n_tasks)
        )
        # Final performance
        final_perf = results_dict[n_tasks - 1][task_id]['success_rate']
        # Forgetting = best performance - final performance
        forgetting_sum += (max_perf - final_perf)
    return forgetting_sum / (n_tasks - 1) if n_tasks > 1 else 0

def compute_forward_transfer(results_dict):
    """Compute forward transfer."""
    n_tasks = len(results_dict[0]) - 1
    transfer_sum = 0
    for task_id in range(1, n_tasks):
        # Performance right after learning the task
        perf_after_learning = results_dict[task_id][task_id]['success_rate']
        # Assume zero-shot performance is 0 (could be estimated via single-task training)
        zero_shot_perf = 0
        transfer_sum += (perf_after_learning - zero_shot_perf)
    return transfer_sum / (n_tasks - 1) if n_tasks > 1 else 0
```
💡 Key takeaways
- Configuration management: using the Hydra config system
- Data pipeline: from HDF5 files to PyTorch Datasets
- Model design: implementing a Transformer policy architecture
- Algorithm implementation: the core logic of lifelong-learning algorithms
- Evaluation workflow: multi-task evaluation and performance metrics
- Visualization: analyzing and plotting results
🎯 Overall Study Advice
Learning paths
Beginner path:
1. Start with quick_walkthrough.ipynb to understand the basic concepts
2. Run the example code and get familiar with the API
3. Tweak parameters and observe what changes
Intermediate path:
4. Study procedural_creation_walkthrough.ipynb
5. Create a simple custom task
6. Understand the structure of BDDL files
Advanced path:
7. Master custom_object_example.ipynb
8. Prepare your own 3D assets
9. Integrate them into complete tasks
Expert path:
10. Dive into quick_guide_algo.ipynb
11. Implement a custom algorithm
12. Run complete experiments
Practical tips
- Hands-on practice: run every notebook at least once
- Modify the code: change parameters and understand the effects
- Read the paper: consult the original LIBERO paper for the design rationale
- Community: ask for help in GitHub Issues
- Iterate: start with simple tasks and gradually add complexity
FAQ
Q: How do I debug BDDL file errors?
A: Check object name spelling, predicate syntax, and the consistency of initial states and goals.
Q: A custom object fails to load?
A: Make sure the XML path is correct, the mesh files exist, and the MuJoCo syntax is valid.
Q: Training does not converge?
A: Check the learning rate, batch size, and sequence length, and inspect the loss curves.
Q: Evaluation success rate is 0?
A: Check the task definition, the initial states, and the maximum step limit.
📚 Related Resources
- Official documentation: https://lifelong-robot-learning.github.io/LIBERO/
- GitHub repository: https://github.com/Lifelong-Robot-Learning/LIBERO
- Paper: LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning (NeurIPS 2023)
- Datasets: https://huggingface.co/datasets/yifengzhu-hf/LIBERO-datasets
🔧 Tech Stack Summary
Core dependencies:
- MuJoCo: physics simulation engine
- robosuite: robot manipulation framework
- PyTorch: deep learning framework
- Hydra: configuration management
- H5py: dataset storage
Key modules:
- libero.libero: benchmarks and task definitions
- libero.lifelong: lifelong-learning algorithms
- libero.envs: environments and object definitions
- libero.predicates: the predicate system
Happy learning on your LIBERO journey! 🚀