DEPTH-ESTIMATION | CURRENT | 2026-02-17

Perceptive Humanoid Parkour: Chaining Dynamic Human Skills via Motion Matching

Zhen Wu, Xiaoyu Huang, Lujie Yang, Yuanhang Zhang, Koushil Sreenath, Xi Chen, Pieter Abbeel, Rocky Duan, Angjoo Kanazawa, Carmelo Sferrazza, Guanya Shi, C. Karen Liu

ARCHITECTURE
motion matching, reinforcement learning, behavior cloning with DAgger
ROBOT
Unitree G1 humanoid robot
KEY METRIC
96%
TASK
locomotion, parkour, obstacle traversal

Imagine a humanoid robot that doesn't just walk: it does parkour. The Unitree G1 robot running PHP climbs 1.25-meter walls (96% of its own height), vaults over obstacles, and chains together multiple acrobatic skills in real time using only an onboard depth camera. This matters because previous humanoid robots struggled with even basic obstacle traversal. PHP combines three key ideas: (1) capturing human parkour motion through motion matching, which treats movement as a nearest-neighbor search in motion space, (2) using reinforcement learning (RL) to make the robot actually execute these human-inspired trajectories, and (3) adding perception so the robot autonomously decides which skill to use based on what it sees. The result feels fundamentally different from prior work: this robot moves with the fluidity and adaptability of a human, not the rigid, pre-planned gait of a conventional robot locomotion controller.

THE PROBLEM

Before PHP, humanoid locomotion research achieved stable walking on varied terrains, but parkour (dynamic, adaptive, human-like movement) remained out of reach. Prior work fell into two camps: (1) end-to-end reinforcement learning agents trained in simulation that transfer poorly to real robots and struggle to compose multiple skills, and (2) hand-crafted motion controllers that work for specific tasks but lack expressiveness and don't adapt on the fly. The core limitation was that robots lacked both the motion expressiveness of humans and the perceptual awareness to decide in real time which skill to execute. A robot might nail climbing one obstacle, but it couldn't decide from depth sensor input whether to climb the next one or step over it. Existing motion-capture retargeting also ignored the long-horizon composition problem: you could animate one motion, but chaining several smoothly while preserving human fluidity was unsolved.

HOW IT WORKS

1

Motion Matching: Compose Atomic Human Skills into Fluid Trajectories

The team started by capturing human parkour motion from video and mocap data, then retargeted it to the robot's body proportions. Rather than training a policy from scratch, they formulated motion composition as a nearest-neighbor search: given the robot's current state, find the closest matching human motion in a feature space (capturing pose, velocity, and context). This preserves the elegance of human movement, because no interpolation (lerping) or blending step destroys the motion's natural rhythm. They built atomic skills (climbing, vaulting, stepping, rolling) and stitched them together seamlessly. The key insight: humans don't plan parkour as a sequence of joint angles; they transition fluidly between skills. Motion matching captures that fluidity by always finding the next frame that best matches the robot's current state, creating long-horizon trajectories that feel natural, not robotic.
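
The nearest-neighbor idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the feature layout (two pose dimensions plus one velocity dimension), the helper names `build_features` and `match_next_frame`, and the toy numbers are all assumptions made for the example.

```python
import numpy as np

def build_features(poses, velocities):
    """Stack per-frame pose and velocity into one normalized feature row per frame."""
    feats = np.concatenate([poses, velocities], axis=1)
    # Normalize each column so no single feature dominates the distance.
    mean = feats.mean(axis=0)
    std = feats.std(axis=0) + 1e-8
    return (feats - mean) / std, mean, std

def match_next_frame(query, database, mean, std):
    """Return the index of the database frame closest to the query features."""
    q = (query - mean) / std
    dists = np.linalg.norm(database - q, axis=1)
    return int(np.argmin(dists))

# Toy motion database: 5 frames, 2 pose dims, 1 velocity dim (made-up numbers).
poses = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [0.3, 0.3], [0.4, 0.6]])
vels = np.array([[0.0], [1.0], [1.0], [2.0], [3.0]])
database, mean, std = build_features(poses, vels)

# Hypothetical current robot state, expressed in the same feature layout.
query = np.array([0.21, 0.12, 1.1])
best = match_next_frame(query, database, mean, std)
print("closest database frame:", best)
```

A real system would search a much larger database and would likely add context features (for example, the commanded direction), but the core operation stays the same: normalize, measure distance, pick the closest frame.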

2

RL Expert Policies: Train Robots to Actually Track Human Motions

Motion matching gives you the target trajectory, but the robot must execute it with real physics, imperfect actuators, and ground contact forces. The team trained separate RL expert policies for each atomic skill, each one learning to track the kinematic trajectory from motion matching while staying robust to perturbations. These experts were powerful but skill-specific: each one mastered climbing, or vaulting, or rolling. The magic: they work in trajectory space, not raw joint torques, which makes the learning problem tractable. The robot learns "how hard do I push my legs to follow this climbing trajectory despite terrain variation?" Training a separate expert per skill is computationally expensive, but it is critical for real-world transfer.
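
To make "learning to track a kinematic trajectory" concrete, here is a minimal sketch of a tracking reward. The summary above does not describe the paper's actual reward terms, so the exponential form, the `sigma` scale, and the function name `tracking_reward` are assumptions; exponential tracking terms of this shape are simply a common choice in motion-tracking RL.

```python
import numpy as np

def tracking_reward(joint_pos, ref_joint_pos, sigma=0.5):
    """Reward in (0, 1]: 1.0 for perfect tracking, decaying with squared error.

    Assumed form, not taken from the paper: exp(-||q - q_ref||^2 / sigma^2).
    """
    err = np.sum((np.asarray(joint_pos) - np.asarray(ref_joint_pos)) ** 2)
    return float(np.exp(-err / sigma ** 2))

# Hypothetical reference joint angles (radians) for one frame of a skill.
ref = np.array([0.0, 0.5, -1.0])

perfect = tracking_reward(ref, ref)    # robot exactly on the reference
off = tracking_reward(ref + 0.2, ref)  # every joint off by 0.2 rad
print(perfect, off)
```

An RL expert would maximize the sum of such per-step rewards over an episode, typically alongside regularization terms, which pushes it to follow the reference motion even when terrain or actuator behavior varies.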

3

Policy Distillation with DAgger: Collapse Multiple Experts into One Depth-Based Policy

Here's the practical problem: deploying 10 separate expert policies on a real robot is cumbersome and slow. The team distilled all experts into a single student policy using DAgger (Dataset Aggregation), a technique that iteratively rolls out the student, has the experts label the states it visits, and trains the student with supervised learning to mimic those labels. Crucially, the student policy takes only onboard depth images as input, with no ground-truth state. During training, DAgger gathers data where the expert knows the true state and picks the best skill; the student sees only depth and learns to make the same decision. This closed-loop training is key: the student's early mistakes take it into states where it learns how to correct itself. The result: one lightweight policy that runs in real time on onboard compute, selecting and executing any of the parkour skills based on what the depth camera sees.
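
The DAgger loop itself is short. The sketch below uses a deliberately tiny 1D toy problem, not the paper's setup: the expert sees the true state `x`, the student sees only a transformed observation (a stand-in for depth input), and the student is a single linear weight fitted by least squares. All names and dynamics here are assumptions for illustration.

```python
import numpy as np

def expert_action(x):
    """Privileged expert: sees the true state x and steers it toward 0."""
    return -0.5 * x

def observe(x):
    """Student's sensor view of the state (a stand-in for depth input)."""
    return 2.0 * x

def rollout(w, x0, steps=5):
    """Run the student policy a = w * obs and return the states it visits."""
    xs, x = [], x0
    for _ in range(steps):
        xs.append(x)
        x = x + w * observe(x)
    return xs

def dagger(starts, iterations=3):
    w = 0.0                      # untrained student
    obs_data, act_data = [], []  # dataset aggregated across iterations
    for _ in range(iterations):
        for x0 in starts:
            for x in rollout(w, x0):
                # States come from the student's own rollout;
                # action labels come from the expert.
                obs_data.append(observe(x))
                act_data.append(expert_action(x))
        A = np.array(obs_data).reshape(-1, 1)
        b = np.array(act_data)
        # Supervised step: least-squares fit of the student on all data so far.
        w = float(np.linalg.lstsq(A, b, rcond=None)[0][0])
    return w

w = dagger(starts=np.linspace(-1.0, 1.0, 5))
print("learned student weight:", w)
```

The design point to notice is where the data comes from: states are visited by the student, but labeled by the expert. That is what distinguishes DAgger from plain behavior cloning on expert-only trajectories.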

4

Perception-Driven Decision-Making: Autonomous Skill Selection

The final piece is autonomous, context-aware behavior. The student policy doesn't just execute one fixed skill: it continuously perceives the obstacle landscape via depth sensing and decides whether to step over, climb, vault, or roll based on obstacle geometry and height. The operator provides only a discrete 2D velocity command (go forward, turn left/right, toggle speed). The policy handles the high-level decision-making. This is perception in the robotics sense: the robot uses sensor data to inform behavior selection in real time. Real-world adaptation is critical here. If an obstacle is displaced mid-run (the paper tests this), the policy revises its decision and adjusts the skill chain on the fly. This closes the loop between what the camera sees and what the motors do.

MORE DEMONSTRATIONS

[Demonstration videos: roll; obstacle displacement; step climb 3; multi good; continuous step]

KEY RESULTS

Maximum Obstacle Climbing Height: 1.25 meters

vs. 96% of the G1's 1.3m height; prior humanoid systems rarely exceeded 0.3-0.5m

This is the flashiest result: the robot climbs a wall almost as high as it is tall. For context, most humanoid robots from prior work could step over 0.2-0.3m obstacles; here, we're seeing roughly four to six times that. Climbing 1.25m requires explosive leg power, precise balance at the peak, and coordinated descent, all executed fluidly.

Long-Horizon Multi-Obstacle Traversal with Real-Time Adaptation: Successful navigation of obstacle courses with closed-loop obstacle displacement

vs. Prior motion-matching or RL work typically handled single-skill execution or pre-planned sequences, not adaptive multi-skill chains

The paper demonstrates the robot running a course with multiple obstacles, autonomously selecting skills and adapting when obstacles are moved in real time. This is harder than climbing one wall: the policy must compose skills, handle state transitions, and recover from perception errors. Real-time adaptation (rather than pre-planned re-optimization) suggests the system generalizes beyond its training data.

Skill Diversity: 10+ distinct parkour skills demonstrated (climbing, vaulting, rolling, crawling, stepping, sitting)

vs. Prior methods typically specialized in 1-2 skills per approach

The framework isn't a one-trick solution. It demonstrates cat vaults, speed vaults, platform climbs, rolling down from heights, crawling under obstacles, and more. This variety comes from the motion-matching + RL + DAgger pipeline scaling to multiple skills without multiplying the training complexity of the student policy.

Perception Latency and Computational Load: Real-time execution on onboard compute (depth-based, single student policy)

vs. No multi-expert switching or expensive state estimation; lighter than running separate RL policies

The distillation to a single depth-based policy is pragmatic. The robot doesn't need ground-truth state or external tracking, just an onboard depth sensor and one neural network. This is deployable on real hardware without a lab full of cameras.

WHY DEVELOPERS SHOULD CARE

For software developers building robotics systems, PHP demonstrates three critical lessons. First, motion matching is an underrated tool for robot control. Instead of training everything from raw pixels with reinforcement learning (which requires massive simulation and often fails in the real world), you can leverage human motion as a prior. Treat control as a search problem: find the best-matching human motion, then learn to execute it. This cuts training time and improves motion quality. Second, skill composition through modular RL experts that distill into a single student policy is a practical architecture. You don't need to train one monolithic policy; break it into pieces (one expert per skill), then compress that knowledge into a lightweight policy that runs on robots. DAgger is the glue: it lets you transfer expert knowledge to a policy that runs on different, limited sensors. Third, perception-driven behavior is non-negotiable for real-world robotics. The robot isn't executing a fixed plan; it's perceiving obstacles in real time and adapting its skill selection. This is what enables the closed-loop obstacle displacement demos, where the robot stays robust to perturbations. If you're building robot software, think about how to combine (1) pre-trained motion priors, (2) learnable skill experts, and (3) lightweight perception policies that make real-time decisions. PHP shows this scales to complex, dynamic tasks like parkour.

LIMITATIONS

PHP's limitations are real and worth acknowledging. First, motion matching requires high-quality human motion data: the system only captures skills present in the training dataset. If humans don't parkour in a certain way, the robot won't either. Second, the distillation pipeline (expert RL → DAgger → student policy) is complex and requires careful dataset collection; it's not as simple as end-to-end training. Third, the system relies on onboard depth sensing, which has limited range and can struggle with reflective surfaces or fast-moving obstacles. Fourth, the discrete velocity command interface is limiting: the operator must still actively command the robot, so it isn't making fully autonomous decisions about when to parkour. Fifth, evaluation is limited to one robot (Unitree G1) and relatively controlled obstacle courses; generalization to wildly different morphologies or unstructured outdoor terrain is unproven. Finally, the paper doesn't deeply analyze failure modes. When does the policy fail to climb or vault? What are the geometric or kinematic boundaries of the approach?

WHAT COMES NEXT

The obvious next frontier is full autonomy: instead of a human sending velocity commands, the robot plans where it wants to go, perceives the obstacle course, and self-navigates. This requires adding high-level planning (graph search over obstacle configurations) on top of the perception-driven skill selection. A second direction is sim-to-real generalization: can you train motion matching and RL policies in simulation (where data is effectively unlimited) and transfer to new robot hardware without retraining? The distillation pipeline hints at this, but it's not fully demonstrated. Third is generalization across humanoid morphologies: does PHP work on Boston Dynamics Atlas, Tesla Optimus, or other humanoids with different proportions and actuators? If yes, it becomes a general framework; if no, each robot needs its own tuning. Fourth is handling longer courses with hundreds of obstacles and more unpredictable geometry. Fifth is integrating higher-level reasoning: not just skill selection, but obstacle avoidance planning, energy-efficient route selection, and semantic understanding of the environment. The dream: a humanoid robot that explores unknown terrain, perceives obstacles, plans a parkour route, and executes it autonomously, all in real time. PHP is a big step toward that.

RELATED PAPERS