# How to Install Isaac Lab on Ubuntu 20.04
Follow the official documentation, section "Installation using Isaac Sim Binaries".
## How to run the checkpoint (the policy you trained)
### Train a new policy

```bash
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task=Isaac-Velocity-Rough-Anymal-C-v0 --headless
```
### Play the policy

```bash
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/play.py --task=Isaac-Velocity-Rough-Anymal-C-v0 --num_envs 32 --checkpoint logs/rsl_rl/anymal_c_rough/2025-02-12_10-49-33/model_1499.pt
```
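The checkpoint path contains a timestamped run folder. If you do not want to copy it by hand, a small helper like the sketch below can locate the newest checkpoint; the log directory and glob pattern are assumptions based on the default rsl_rl log layout shown above.

```python
# Hypothetical helper: find the newest model_*.pt under the default rsl_rl log
# directory used above. Adjust log_root if your experiment name or log root differs.
from pathlib import Path


def latest_checkpoint(log_root: str = "logs/rsl_rl/anymal_c_rough") -> Path:
    checkpoints = sorted(
        Path(log_root).glob("*/model_*.pt"),
        key=lambda p: p.stat().st_mtime,  # sort by modification time, newest last
    )
    if not checkpoints:
        raise FileNotFoundError(f"no checkpoints found under {log_root}")
    return checkpoints[-1]


if __name__ == "__main__":
    # Print the path to pass to play.py via --checkpoint
    print(latest_checkpoint())
```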
## Fixing a KeyError when training a robot
### Error description

The training command failed with output like this:

```
Error executing job with overrides: []
Traceback (most recent call last):
  File "/home/gao/IsaacLab/source/isaaclab_tasks/isaaclab_tasks/utils/hydra.py", line 101, in hydra_main
    func(env_cfg, agent_cfg, *args, **kwargs)
  File "/home/gao/IsaacLab/scripts/reinforcement_learning/rsl_rl/train.py", line 129, in main
    runner = OnPolicyRunner(env, agent_cfg.to_dict(), log_dir=log_dir, device=agent_cfg.device)
  File "/home/gao/miniforge3/envs/env_isaaclab/lib/python3.10/site-packages/rsl_rl/runners/on_policy_runner.py", line 44, in __init__
    if self.alg_cfg["rnd_cfg"] is not None:
KeyError: 'rnd_cfg'
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
2025-02-12 08:51:32 [20,276ms] [Warning] [omni.fabric.plugin] gFabricState->gUsdStageToSimStageWithHistoryMap had 1 outstanding SimStageWithHistory(s) at shutdown
2025-02-12 08:51:32 [20,847ms] [Warning] [carb] Recursive unloadAllPlugins() detected!
```
### How to solve it

The installed rsl_rl package looks up optional keys such as rnd_cfg (and, similarly, symmetry_cfg) in the algorithm config, but the Isaac Lab agent config does not define them, most likely because the two are at mismatched versions. Two files are involved in the fix.
File 1: /home/gao/miniforge3/envs/env_isaaclab/lib/python3.10/site-packages/rsl_rl/runners/on_policy_runner.py

You can confirm which copy of rsl_rl is actually imported:

```
(env_isaaclab) gao@rs-controlserver1:~/IsaacLab$ python -c "import rsl_rl.runners.on_policy_runner; print(rsl_rl.runners.on_policy_runner.__file__)"
/home/gao/miniforge3/envs/env_isaaclab/lib/python3.10/site-packages/rsl_rl/runners/on_policy_runner.py
```
Open the file for editing, for example with VS Code:

```bash
code /home/gao/miniforge3/envs/env_isaaclab/lib/python3.10/site-packages/rsl_rl/runners/on_policy_runner.py
```
Locate the problematic code: search (Ctrl+F in VS Code) for rnd_cfg and look for a line similar to:

```python
if self.alg_cfg["rnd_cfg"] is not None:
```
Modify the code: change that line to:

```python
if self.alg_cfg.get("rnd_cfg") is not None:
```

This uses the dictionary's get method, which returns None (or a default value, if you specify one) when the key is missing, thus avoiding the KeyError.
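To see the difference in isolation, here is a tiny standalone example (the dictionary contents are made up for illustration):

```python
alg_cfg = {"learning_rate": 1e-3}  # no "rnd_cfg" key, like the agent config here

# Plain indexing raises KeyError when the key is absent:
try:
    alg_cfg["rnd_cfg"]
except KeyError as err:
    print("indexing failed:", err)

# .get() returns None (or a chosen default) instead of raising:
print(alg_cfg.get("rnd_cfg"))              # -> None
print(alg_cfg.get("rnd_cfg", "disabled"))  # -> disabled
```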
Save the file and exit the editor.
File 2: ./source/isaaclab_tasks/isaaclab_tasks/manager_based/locomotion/velocity/config/anymal_c/agents/rsl_rl_ppo_cfg.py

Update the configuration schema. The same kind of lookup can fail for other optional keys such as symmetry_cfg, so a more permanent solution is to make sure the configuration schema defines them with a default. For example, if your configuration is defined in a dataclass (likely in a file like rsl_rl_ppo_cfg.py), add a default field for symmetry_cfg:
```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AnymalCRoughPPORunnerCfg(RslRlOnPolicyRunnerCfg):
    # ... other fields (the real rsl_rl_ppo_cfg.py already defines this class and
    # imports its base class) ...
    # None means "feature disabled": the runner enables the feature only when the
    # value is not None, so None behaves like the .get() patch above.
    symmetry_cfg: Optional[Dict] = None
    rnd_cfg: Optional[Dict] = None  # same idea for the key from the original error
```
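The choice of default matters here: the runner decides whether to enable a feature with an `is not None` check (see the traceback above), so a None default disables it cleanly, whereas an empty dict would send the runner down the feature-configuration path with no settings in it. A small standalone illustration (class and field names are made up):

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass
class DemoCfg:
    symmetry_cfg: Optional[Dict] = None          # missing feature -> treated as disabled
    rnd_cfg: Dict = field(default_factory=dict)  # empty dict is NOT None


cfg = asdict(DemoCfg())
print(cfg["symmetry_cfg"] is not None)  # False -> an `is not None` check skips the feature
print(cfg["rnd_cfg"] is not None)       # True  -> the feature branch would still run
```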
Make sure you modify the version of the file that is actually being used (in your case, the one in the manager-based configuration if that’s what your logs indicate).
Re-run the training command:

```bash
./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py --task=Isaac-Velocity-Rough-Anymal-C-v0 --headless
```
The output looks like this:
```
Actor MLP: Sequential(
(0): Linear(in_features=235, out_features=512, bias=True)
(1): ELU(alpha=1.0)
(2): Linear(in_features=512, out_features=256, bias=True)
(3): ELU(alpha=1.0)
(4): Linear(in_features=256, out_features=128, bias=True)
(5): ELU(alpha=1.0)
(6): Linear(in_features=128, out_features=12, bias=True)
)
Critic MLP: Sequential(
(0): Linear(in_features=235, out_features=512, bias=True)
(1): ELU(alpha=1.0)
(2): Linear(in_features=512, out_features=256, bias=True)
(3): ELU(alpha=1.0)
(4): Linear(in_features=256, out_features=128, bias=True)
(5): ELU(alpha=1.0)
(6): Linear(in_features=128, out_features=1, bias=True)
)
/home/gao/isaac-sim/isaac-sim-standalone@4.5.0/exts/omni.isaac.ml_archive/pip_prebundle/torch/nn/modules/module.py:1747: UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters(). (Triggered internally at ../aten/src/ATen/native/cudnn/RNN.cpp:1410.)
return forward_call(*args, **kwargs)
################################################################################
Learning iteration 0/1500
Computation: 17687 steps/s (collection: 5.312s, learning 0.246s)
Value function loss: 0.0365
Surrogate loss: 0.0010
Mean action noise std: 1.00
Mean total reward: -0.35
Mean episode length: 11.87
Episode_Reward/track_lin_vel_xy_exp: 0.0029
Episode_Reward/track_ang_vel_z_exp: 0.0020
Episode_Reward/lin_vel_z_l2: -0.0098
Episode_Reward/ang_vel_xy_l2: -0.0031
Episode_Reward/dof_torques_l2: -0.0013
Episode_Reward/dof_acc_l2: -0.0049
Episode_Reward/action_rate_l2: -0.0028
Episode_Reward/feet_air_time: -0.0003
Episode_Reward/undesired_contacts: -0.0009
Episode_Reward/flat_orientation_l2: 0.0000
Episode_Reward/dof_pos_limits: 0.0000
Curriculum/terrain_levels: 3.5276
Metrics/base_velocity/error_vel_xy: 0.0173
Metrics/base_velocity/error_vel_yaw: 0.0172
Episode_Termination/time_out: 3.5833
Episode_Termination/base_contact: 0.0000
--------------------------------------------------------------------------------
Total timesteps: 98304
Iteration time: 5.56s
Total time: 5.56s
ETA: 8336.5s
Could not find git repository in /home/gao/miniforge3/envs/env_isaaclab/lib/python3.10/site-packages/rsl_rl/__init__.py. Skipping.
Storing git diff for 'IsaacLab' in: /home/gao/IsaacLab/logs/rsl_rl/anymal_c_rough/2025-02-12_10-49-33/git/IsaacLab.diff
/home/gao/isaac-sim/isaac-sim-standalone@4.5.0/exts/omni.isaac.ml_archive/pip_prebundle/torch/nn/modules/module.py:1747: UserWarning: RNN module weights are not part of single contiguous chunk of memory. This means they need to be compacted at every call, possibly greatly increasing memory usage. To compact weights again call flatten_parameters(). (Triggered internally at ../aten/src/ATen/native/cudnn/RNN.cpp:1410.)
return forward_call(*args, **kwargs)
################################################################################
Learning iteration 1/1500
Computation: 25482 steps/s (collection: 3.705s, learning 0.152s)
Value function loss: 0.0194
Surrogate loss: -0.0062
Mean action noise std: 0.99
Mean total reward: -1.09
Mean episode length: 39.25
Episode_Reward/track_lin_vel_xy_exp: 0.0075
Episode_Reward/track_ang_vel_z_exp: 0.0052
Episode_Reward/lin_vel_z_l2: -0.0199
Episode_Reward/ang_vel_xy_l2: -0.0094
Episode_Reward/dof_torques_l2: -0.0043
Episode_Reward/dof_acc_l2: -0.0143
Episode_Reward/action_rate_l2: -0.0086
Episode_Reward/feet_air_time: -0.0010
Episode_Reward/undesired_contacts: -0.0051
Episode_Reward/flat_orientation_l2: 0.0000
Episode_Reward/dof_pos_limits: 0.0000
Curriculum/terrain_levels: 3.5047
Metrics/base_velocity/error_vel_xy: 0.0606
Metrics/base_velocity/error_vel_yaw: 0.0597
Episode_Termination/time_out: 4.1250
Episode_Termination/base_contact: 0.8750
--------------------------------------------------------------------------------
Total timesteps: 196608
Iteration time: 3.86s
Total time: 9.42s
ETA: 7056.9s
################################################################################
Learning iteration 2/1500
Computation: 23548 steps/s (collection: 4.017s, learning 0.158s)
Value function loss: 0.0166
Surrogate loss: -0.0090
Mean action noise std: 0.97
Mean total reward: -1.50
Mean episode length: 63.07
Episode_Reward/track_lin_vel_xy_exp: 0.0118
Episode_Reward/track_ang_vel_z_exp: 0.0095
Episode_Reward/lin_vel_z_l2: -0.0200
Episode_Reward/ang_vel_xy_l2: -0.0145
Episode_Reward/dof_torques_l2: -0.0075
Episode_Reward/dof_acc_l2: -0.0231
Episode_Reward/action_rate_l2: -0.0144
Episode_Reward/feet_air_time: -0.0019
Episode_Reward/undesired_contacts: -0.0138
Episode_Reward/flat_orientation_l2: 0.0000
Episode_Reward/dof_pos_limits: 0.0000
Curriculum/terrain_levels: 3.4744
Metrics/base_velocity/error_vel_xy: 0.0981
Metrics/base_velocity/error_vel_yaw: 0.0947
Episode_Termination/time_out: 4.1250
Episode_Termination/base_contact: 1.2500
--------------------------------------------------------------------------------
Total timesteps: 294912
Iteration time: 4.17s
Total time: 13.59s
ETA: 6785.9s
```