Day 39

Humanoid whole-body controllers: HumanPlus, OmniH2O, HOVER, ASAP, BeyondMimic

This is a valid v1.0 placeholder page for the later curriculum arc. Full interactive lab treatment ships after Week 1 dogfooding.

LECTURE & READING

Glossary primer (15 min)

  • WBC (whole-body controller) — Policy that controls all joints of a humanoid simultaneously: legs (locomotion) + torso (balance) + arms (manipulation) + head (gaze). 25–35 DoFs.
  • HumanPlus — Stanford 2024. Teleoperation-driven humanoid policies. Pose-tracking from human video.
  • OmniH2O — CMU 2024. Universal humanoid teleoperation: maps human motion to robot via shared latent.
  • HOVER — NVIDIA 2024. Distillation framework: train multiple specialist policies (locomotion, manipulation, dancing), distill into one generalist.
  • ASAP — UCB / CMU 2025. Aligning simulation and physical worlds via online residual learning. The method assumes the simulator may be wrong and learns a correction for the mismatch on the fly.
  • BeyondMimic — Berkeley / Stanford 2025. Latent diffusion over motion repertoires; classifier guidance for combining skills. Generalizes to unseen motion combinations.
  • Motion retargeting — Map human MoCap motion to robot joint trajectories (different geometry).
  • AMP (Adversarial Motion Priors) — Discriminator distinguishes "robot-like" from "human-like" motion; policy regularized to look human-like.
  • Reference motion — A target trajectory (from MoCap or animation) the policy is rewarded for matching.
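
The residual idea in the ASAP entry above can be illustrated with a toy. This is a sketch under stated assumptions, not ASAP's actual model: the one-joint dynamics, the made-up "real" actuator error, and the names `sim_step`, `real_step`, and `corrected_sim_step` are all hypothetical, and a linear least-squares fit stands in for a learned residual network.

```python
import numpy as np

rng = np.random.default_rng(0)
DT = 0.02  # control period in seconds (assumed value)

def sim_step(q, a):
    """Idealized simulator: joint position integrates the commanded velocity."""
    return q + a * DT

def real_step(q, a):
    """Stand-in for hardware: a weaker actuator plus a constant drift (made up)."""
    return q + 0.8 * a * DT - 0.003

# Log transitions from the "real" system under random commands.
q = rng.uniform(-1.0, 1.0, size=500)
a = rng.uniform(-2.0, 2.0, size=500)
delta = real_step(q, a) - sim_step(q, a)  # what the simulator gets wrong

# Fit delta ~= w[0] * q + w[1] * a + w[2] by least squares.
X = np.column_stack([q, a, np.ones_like(q)])
w, *_ = np.linalg.lstsq(X, delta, rcond=None)

def corrected_sim_step(q, a):
    """Simulator output plus the fitted residual."""
    return sim_step(q, a) + w[0] * q + w[1] * a + w[2]

q_test, a_test = 0.3, 1.5
err_before = abs(real_step(q_test, a_test) - sim_step(q_test, a_test))
err_after = abs(real_step(q_test, a_test) - corrected_sim_step(q_test, a_test))
print(f"sim error before: {err_before:.4f}, after residual: {err_after:.2e}")
```

Because the toy mismatch is exactly linear in the action, the fit recovers it almost perfectly; a real robot's mismatch is nonlinear and state-dependent, which is why the papers use learned models rather than a three-parameter fit.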

Real-world analogy

If quadruped policies are "motorcycles" (4 contacts, low CoM, simpler), humanoid WBC policies are "unicycles" (2 contacts, high CoM, vastly harder to balance). Progress across the five methods, 2024–2025: HumanPlus = first prototype; OmniH2O = better teleoperation; HOVER = generalist via distillation; ASAP = actively closes the sim-to-real gap; BeyondMimic = composes new motions.

Hour 1 — Reading (pick 3 of 5)

Read abstracts + figures of all 5; deep-dive any 3 (~50 min total):

Hour 2 — Comparison table

Create docs/day39_humanoid_wbc.md:

# Humanoid WBC comparison

| Method | Date | Org | Robot | Data source | Training approach | Key idea |
|---|---|---|---|---|---|---|
| HumanPlus | Jun 2024 | Stanford | H1 | Human video | PPO + AMP | Imitate human motion via 6D pose |
| OmniH2O | 2024 | CMU | H1 | Teleop + sim | RL | Universal "human → humanoid" map |
| HOVER | Late 2024 | NVIDIA | H1 / G1 | Many specialists | Distillation | Multi-task generalist via distillation |
| ASAP | Mar 2025 | UCB / CMU | H1 / G1 | Sim + real residual | RL + residual MLP | Learn the sim-to-real *delta* online |
| BeyondMimic | Aug 2025 | UCB / Stanford | H1 / G1 | MoCap library | Latent diffusion | Compose unseen motions via classifier guidance |

## What each adds over the previous

- HumanPlus → OmniH2O: better teleop topology, robot-agnostic
- OmniH2O → HOVER: multi-task generalization via distillation
- HOVER → ASAP: fix sim-to-real gap explicitly with residual model
- ASAP → BeyondMimic: compose multiple motion skills, not just retarget one

## Common ingredients (the recipe)

1. Reference motion (MoCap or human video, retargeted)
2. Reward = motion tracking + alive bonus + smoothness
3. Domain randomization (Day 25) — universal
4. Teacher-student (Day 27) — universal except BeyondMimic
5. Action filtering (low-pass on actions) for hardware

## When to use what

- I want a humanoid to dance: BeyondMimic (multi-skill latent diffusion)
- I want a humanoid that walks like a particular person: HumanPlus
- I want a generalist humanoid for many tasks: HOVER
- I deploy on hardware and have a sim2real gap: ASAP
- I do teleop demos: OmniH2O
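
Recipe items 2 (reward) and 5 (action filtering) from the list of common ingredients can be sketched in a few lines of NumPy. This is a minimal illustration, not the reward of any of the five papers: the exponential tracking kernel, the weights, `sigma`, and the filter coefficient `alpha` are assumed values, and the names `tracking_reward`, `step_reward`, and `LowPassFilter` are hypothetical.

```python
import numpy as np

def tracking_reward(q, q_ref, sigma=0.5):
    """Exponential tracking term: 1.0 when joints match the reference, decaying with error."""
    err = np.sum((q - q_ref) ** 2)
    return float(np.exp(-err / sigma ** 2))

def step_reward(q, q_ref, action, prev_action, fallen,
                w_track=1.0, w_alive=0.1, w_smooth=0.01):
    """Recipe item 2: motion tracking + alive bonus - action-rate (smoothness) penalty."""
    alive = 0.0 if fallen else 1.0
    smooth_penalty = float(np.sum((action - prev_action) ** 2))
    return w_track * tracking_reward(q, q_ref) + w_alive * alive - w_smooth * smooth_penalty

class LowPassFilter:
    """Recipe item 5: first-order (exponential) low-pass filter on actions."""

    def __init__(self, alpha, size):
        self.alpha = alpha  # 0 < alpha <= 1; smaller values smooth more
        self.state = np.zeros(size)

    def __call__(self, action):
        # Blend the new action with the previous filtered output.
        self.state = self.alpha * action + (1.0 - self.alpha) * self.state
        return self.state.copy()

# Usage: perfect tracking, no action change, robot still standing.
r = step_reward(np.zeros(3), np.zeros(3), np.zeros(2), np.zeros(2), fallen=False)
lpf = LowPassFilter(alpha=0.5, size=2)
first = lpf(np.ones(2))   # step input, first filtered output
second = lpf(np.ones(2))  # second output moves closer to the input
```

The filter trades responsiveness for smoothness: a step command reaches the motors gradually, which limits high-frequency torque chatter on hardware at the cost of some tracking lag.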

LAB

Hour 3 — Lab: replicate a BeyondMimic-style latent-diffusion classifier guidance toy (60 min)

What you're building. A minimal version of BeyondMimic's core trick: train a small VAE on a few motion clips, then combine two classifier guidances at sample time to produce hybrid motions (e.g. "walk + raise arm").

This is conceptual — not full BeyondMimic. The full method needs MoCap data and an Isaac Lab humanoid env.
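
The "combine two guidances at sample time" step can be sketched without any training at all. The block below is a stand-in under loud assumptions: a standard normal prior replaces the trained VAE latent and diffusion model, Langevin dynamics replaces the diffusion sampler, and the two linear "skill classifiers" (`w_walk`, `w_raise_arm`) are hand-set, hypothetical weights rather than learned ones. It shows only the mechanism: the gradients of two classifier log-probabilities are added to the prior's score.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 4

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def grad_log_classifier(z, w, b=0.0):
    """Gradient of log sigmoid(w.z + b) with respect to z, for a batch of latents."""
    p = sigmoid(z @ w + b)
    return (1.0 - p)[:, None] * w[None, :]

# Hypothetical linear "skill classifiers" over the latent space.
w_walk = np.array([3.0, 0.0, 0.0, 0.0])       # confident when z[0] > 0
w_raise_arm = np.array([0.0, 3.0, 0.0, 0.0])  # confident when z[1] > 0

def guided_sample(n, scale_walk, scale_arm, steps=200, eps=0.05):
    """Langevin sampling from N(0, I) with two added classifier-guidance terms."""
    z = rng.standard_normal((n, LATENT_DIM))
    for _ in range(steps):
        score = -z  # score of the N(0, I) prior
        score = score + scale_walk * grad_log_classifier(z, w_walk)
        score = score + scale_arm * grad_log_classifier(z, w_raise_arm)
        z = z + eps * score + np.sqrt(2.0 * eps) * rng.standard_normal(z.shape)
    return z

z_plain = guided_sample(2000, 0.0, 0.0)   # no guidance
z_hybrid = guided_sample(2000, 2.0, 2.0)  # both guidances on

frac_plain = np.mean((z_plain[:, 0] > 0) & (z_plain[:, 1] > 0))
frac_hybrid = np.mean((z_hybrid[:, 0] > 0) & (z_hybrid[:, 1] > 0))
print(f"latents satisfying both classifiers: plain={frac_plain:.2f}, guided={frac_hybrid:.2f}")
```

In the full lab, the guided latents would be passed through the VAE decoder to obtain joint trajectories; here the fraction of samples satisfying both classifiers is the only check that the guidance terms compose.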

Step 1 — Get a few MoCap clips (15 min)

Use the AMASS dataset or LAFAN1 (free). Or for a quick toy, use synthetic clips:
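
The curriculum's own clip generator is not reproduced on this page. As a stand-in, the following sketch builds two synthetic joint-angle clips with NumPy. Everything here is an assumption for illustration: the four-joint toy skeleton in `JOINTS`, the frame rate, the amplitudes, and the function names `walk_clip` and `raise_arm_clip`.

```python
import numpy as np

FPS = 30
# Hypothetical toy skeleton; a real humanoid has 25-35 joints.
JOINTS = ["left_hip", "right_hip", "left_shoulder", "right_shoulder"]

def walk_clip(seconds=2.0, stride_hz=1.0, amplitude=0.4):
    """Hips swing in antiphase; shoulders stay at zero. Angles in radians."""
    t = np.arange(int(seconds * FPS)) / FPS
    swing = amplitude * np.sin(2 * np.pi * stride_hz * t)
    clip = np.zeros((t.size, len(JOINTS)))
    clip[:, 0] = swing
    clip[:, 1] = -swing
    return clip

def raise_arm_clip(seconds=2.0, peak=1.5):
    """Right shoulder ramps smoothly from 0 to `peak`; legs stay at zero."""
    t = np.arange(int(seconds * FPS)) / FPS
    clip = np.zeros((t.size, len(JOINTS)))
    clip[:, 3] = peak * 0.5 * (1.0 - np.cos(np.pi * t / t[-1]))
    return clip

# Each clip is an array of shape (frames, joints).
clips = {"walk": walk_clip(), "raise_arm": raise_arm_clip()}
for name, clip in clips.items():
    print(name, clip.shape)
```

Because the two clips move disjoint joints, a correct "walk + raise arm" hybrid is easy to recognize by eye: the hips should oscillate while the right shoulder rises.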

Full source continues in the committed curriculum files. The v1.0 page exposes the day flow and lab surface without inventing content.


Papers you will re-read after this