G1 Humanoid 6DoF Hands Locomotion (RL)

Overview

A reinforcement learning (RL) locomotion policy for the Unitree G1 humanoid equipped with 6DoF Inspire robotic hands.

The policy was trained with RSL-RL inside the MJLab framework, which is built on top of the MuJoCo physics engine.

Environment

  • Physics engine: MuJoCo
  • Framework: MJLab
  • RL library: RSL-RL
  • Task: Commanded velocity tracking

Framework Architecture

MJLab uses a modular manager-based RL environment:

  • Observation manager
  • Reward manager
  • Curriculum manager
  • Command manager
  • Termination manager

This design enables scalable task composition, rapid reward iteration, and clean sim-to-real transfer pipelines.
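To make the composition idea concrete, here is a minimal sketch of how a manager-based environment can be assembled. The class and method names are illustrative assumptions, not MJLab's actual API; the point is that each concern (rewards, terminations, commands, ...) lives in its own manager and the environment only orchestrates them.

```python
# Hypothetical manager-based composition (names are illustrative,
# not MJLab's real API). Each manager owns exactly one concern.
class RewardManager:
    def __init__(self, terms):
        # terms: name -> (weight, function of the sim state)
        self.terms = terms

    def compute(self, state):
        # Total reward is a weighted sum of independent terms,
        # which makes reward iteration a matter of editing a dict.
        return sum(w * fn(state) for w, fn in self.terms.values())


class ManagerBasedEnv:
    def __init__(self, reward_manager):
        self.reward_manager = reward_manager

    def step_reward(self, state):
        return self.reward_manager.compute(state)


# Usage: swap or re-weight reward terms without touching env code.
rm = RewardManager({
    "alive": (1.0, lambda s: 1.0),
    "track_vel": (0.5, lambda s: s["vel_error"]),
})
env = ManagerBasedEnv(rm)
```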

Algorithm

  • PPO
  • On-policy runner
  • Adaptive KL schedule

Observations

  • Base orientation
  • Base angular velocity
  • Joint positions
  • Joint velocities
  • Commanded velocity

Actions

  • Joint position targets
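Joint position targets are typically converted to motor torques by a low-level PD controller in simulation and on the robot. A minimal per-joint sketch (the gains here are placeholder assumptions, not the deployed values):

```python
def pd_torque(q_target, q, qd, kp=100.0, kd=2.0):
    """Map a policy's joint position target to a torque via PD
    control. kp/kd are illustrative gains, not the real ones."""
    return kp * (q_target - q) - kd * qd
```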

Network Architecture

Actor

  • [512, 256, 128]
  • ELU activation

Critic

  • [512, 256, 128]
  • ELU activation
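Both networks are plain MLPs with the hidden sizes listed above and ELU activations. A NumPy sketch of the forward pass, assuming (for illustration only) a 48-dimensional observation and a 12-dimensional action; the real dimensions depend on the robot's joint count and the observation terms:

```python
import numpy as np


def elu(x, alpha=1.0):
    # ELU: identity for x > 0, alpha*(exp(x) - 1) otherwise.
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))


def mlp_forward(obs, hidden=(512, 256, 128), act_dim=12, seed=0):
    """Forward pass of an actor-style MLP with the card's hidden
    sizes and ELU activations. Weights are random here; act_dim
    and the obs dimension are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    dims = [obs.shape[-1], *hidden, act_dim]
    x = obs
    for i, (d_in, d_out) in enumerate(zip(dims[:-1], dims[1:])):
        w = rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)
        x = x @ w
        if i < len(dims) - 2:   # ELU on hidden layers only
            x = elu(x)
    return x
```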

Training Details

  • Iterations: 4000
  • Steps per environment: 24
  • Gamma: 0.99
  • Lambda: 0.95
  • Learning rate: 1e-3
  • Mini-batches: 4
  • Epochs per update: 5
  • Seed: 42
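The gamma and lambda values above are the discount and GAE factors used to compute advantages from each 24-step rollout. A minimal sketch of Generalized Advantage Estimation with those hyperparameters:

```python
def gae_advantages(rewards, values, dones, gamma=0.99, lam=0.95):
    """GAE over a rollout of T steps. `values` has length T+1
    (the bootstrap value of the final state is appended)."""
    T = len(rewards)
    adv = [0.0] * T
    last = 0.0
    for t in reversed(range(T)):
        nonterminal = 1.0 - dones[t]
        # TD residual: r_t + gamma*V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] * nonterminal - values[t]
        # Exponentially weighted sum of residuals, cut at episode ends.
        last = delta + gamma * lam * nonterminal * last
        adv[t] = last
    return adv
```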

Files

  • checkpoints/model_4000.pt → trained policy
  • onnx/policy.onnx → deployment-ready model
  • configs/ → training configuration

Author

Josué Abad

Lab

NONHUMAN Robotics – Perú
