Week 5: VLA Architectures
Covers RT-1 through modern VLA systems: π0, GR00T, Helix, Gemini Robotics, RDT-1B, and CogACT.
RT-1, RT-2 history + survey
Glossary primer (10 min). RT-1 (Robotics Transformer 1): Google, 2022. First serious general-purpose robot transformer. 35M params, EfficientN...
π0 / π0.5 / π0.6 / π0.7 deep dive
Glossary primer (12 min). π0: Physical Intelligence's first production VLA (Oct 2024). 3.3B params, PaliGemma backbone, flow matching action...
GR00T N1.5 / N1.6 architecture
Glossary primer (10 min). GR00T: NVIDIA's humanoid foundation model project, launched at GTC 2024. GR00T N1: first public release (Mar 2025...
Helix (Figure)
Glossary primer (8 min). Helix: Figure AI's whole-upper-body humanoid VLA, announced 2025. A 7B-parameter VLM (System 2) paired with an 80M-parameter control policy (System 1). Optimized for high-frequency real tim...
Gemini Robotics + Robot Academy IBVS primer
Glossary primer (10 min). Gemini Robotics: Google DeepMind, 2025. Robotics adaptation of Gemini 2.0/2.5. Natively multimodal (image + video + a...
RDT-1B + CogACT + comparison reflection
Glossary primer (8 min). RDT-1B (Robotics Diffusion Transformer 1B): Tsinghua / Shanghai AI Lab, 2024. Bimanual specialist, 1B params, DiT st...
Week 5 retro + capstone Track A design
Hour 1: Capstone Track A pre-design (40 min), docs/day35 track a design.md. Hour 2: Fresh-clone test (45 min). For Week 5, the bulk is dow...
What you will know by the end of Week 5
- Read the week's source papers without drowning in undefined terms.
- Run the week's core software stack from a fresh clone.
- Explain the week's systems in terms of data, control, and learning loops.