Train baseline

This is a valid v1.0 placeholder page for the later curriculum arc. Full interactive lab treatment ships after Week 1 dogfooding.

LECTURE & READING

The point of today: get your "non-extension" version running end-to-end with at least one seed before you sleep. A working baseline at 5pm is worth more than a perfect one next week.

Hour 1 — Implement train.py for your track (45 min)

Track	What this is
A	LoRA fine-tune of π0.7 on your 30-episode
B	Train 100M-param JEPA predictor on your 50h
C	DR-only Go1 PPO (Day 25 reproduce)
D	ACT on your 50-episode bimanual

Each of these you've effectively done before. Today you wrap it in a proper script with seeds, wandb logging, and a saved checkpoint.
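As a rough starting point, the wrapper can look like the sketch below. It is not the course's reference train.py: `train_one_step`, the `--variant` and `--steps` flags, and the file names `train_log.jsonl` and `checkpoint.json` are all illustrative assumptions. The only thing taken from this page is the `runs/{variant}_s{seed}` directory layout that eval.py reads. The local JSONL log stands in for wandb so the sketch runs without an account.

```python
# src/train.py (sketch; helper names and file names are assumptions)
import argparse
import json
import random
from pathlib import Path


def set_seed(seed: int) -> None:
    """Seed the stdlib RNG. Seed numpy / torch here too if your track uses them."""
    random.seed(seed)


def train_one_step(step: int) -> float:
    """Hypothetical stand-in for one real optimizer step; returns a fake loss."""
    return 1.0 / (1 + step) + random.uniform(0.0, 0.01)


def train(variant: str, seed: int, steps: int, out_root: str = "runs") -> Path:
    set_seed(seed)
    run_dir = Path(out_root) / f"{variant}_s{seed}"  # layout eval.py expects
    run_dir.mkdir(parents=True, exist_ok=True)

    log_path = run_dir / "train_log.jsonl"
    with log_path.open("w") as log:
        for step in range(steps):
            loss = train_one_step(step)
            # Local log line; swap for wandb.log({...}) once wandb is set up.
            log.write(json.dumps({"step": step, "loss": loss}) + "\n")

    # Placeholder checkpoint: write your real model weights here instead.
    ckpt_path = run_dir / "checkpoint.json"
    ckpt_path.write_text(
        json.dumps({"variant": variant, "seed": seed, "steps": steps})
    )
    return ckpt_path


def main(argv=None) -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--variant", default="baseline")
    parser.add_argument("--seed", type=int, default=1)
    parser.add_argument("--steps", type=int, default=100)
    args = parser.parse_args(argv)
    print(train(args.variant, args.seed, args.steps))
```

To use it as a script, call `main()` under an `if __name__ == "__main__":` guard. Running the same seed twice should give an identical log; if it does not, some source of randomness is still unseeded.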

Hour 2 — Launch first seed (training in background; ~3-6 hours)

While it runs in tmux, work on Hour 3.

LAB

Hour 3 — Implement eval.py (60 min)

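# src/eval.py
"""Capstone eval. Multi-seed. Outputs metrics.csv rows + plot."""
import argparse
from pathlib import Path

import pandas as pd


def evaluate_policy(policy_path, n_episodes=20, seed=1):
    """Load the checkpoint, run n_episodes rollouts, return (mean, std) of success rate."""
    # ... load, run, return mean & std of success rate
    raise NotImplementedError("Fill in the rollout loop for your track.")


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--seeds", type=str, default="1,2,3",
                        help="comma-separated seeds, e.g. 1,2,3")
    args = parser.parse_args()
    seeds = [int(s) for s in args.seeds.split(",")]

    results = []
    for seed in seeds:
        for variant in ["baseline", "extension"]:
            ckpt = f"runs/{variant}_s{seed}"
            sr_mean, sr_std = evaluate_policy(ckpt, seed=seed)
            results.append({
                "seed": seed, "variant": variant,
                "success_rate": sr_mean, "success_std": sr_std,
            })
    df = pd.DataFrame(results)
    Path("runs").mkdir(exist_ok=True)  # make sure the output directory exists
    df.to_csv("runs/eval_summary.csv", index=False)
    # Make plot, log to metrics.csv


if __name__ == "__main__":
    main()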
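Once the per-seed rows exist, the comparison that matters is the mean and spread of success rate across seeds for each variant. The sketch below shows one way to compute it using only the standard library. `success_stats`, `summarize`, and `summarize_csv` are hypothetical helper names, and the code assumes the `variant` and `success_rate` columns written by the script above.

```python
import csv
import statistics
from collections import defaultdict


def success_stats(values):
    """Mean and sample standard deviation; std is 0.0 for a single value."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, std


def summarize(rows):
    """Group rows by variant and aggregate success_rate across seeds."""
    by_variant = defaultdict(list)
    for row in rows:
        by_variant[row["variant"]].append(float(row["success_rate"]))

    summary = {}
    for variant, rates in by_variant.items():
        mean, std = success_stats(rates)
        summary[variant] = {"n_seeds": len(rates), "mean": mean, "std": std}
    return summary


def summarize_csv(path="runs/eval_summary.csv"):
    """Read the CSV written by eval.py and summarize it per variant."""
    with open(path, newline="") as f:
        return summarize(list(csv.DictReader(f)))
```

With pandas already imported in eval.py, `df.groupby("variant")["success_rate"].agg(["mean", "std"])` gives the same two numbers. With only one seed the spread is not meaningful, which is one reason to run more seeds once the first one works.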

Make sure eval is fully scriptable (running make eval should work). This is rubric category 3.

Hour 4 — Late: status check

After ~4 hours, check on your run. If it's progressing (loss decreasing, eval improving), good. If not, stop, debug, restart. Don't go to bed with a broken run.
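"Loss decreasing" is easier to judge with a number than by eyeballing a noisy curve. One simple heuristic, sketched below, compares the mean loss over the most recent window of steps against the window before it. The function name `is_progressing`, the window size, and the 1% threshold are assumptions to tune for your track, not course requirements.

```python
import math
import statistics


def is_progressing(losses, window=50, min_rel_drop=0.01):
    """Compare the last `window` losses against the `window` before them.

    Returns None if there is not enough data yet, False if the recent
    window contains NaN/inf or has not dropped by at least min_rel_drop
    (relative to the previous window), and True otherwise.
    """
    if len(losses) < 2 * window:
        return None
    previous = losses[-2 * window:-window]
    recent = losses[-window:]
    if not all(math.isfinite(x) for x in recent):
        return False  # diverged run: stop and debug
    prev_mean = statistics.fmean(previous)
    recent_mean = statistics.fmean(recent)
    return (prev_mean - recent_mean) > min_rel_drop * abs(prev_mean)
```

Feed it the loss values from whatever log your training script writes. A False result is a prompt to look at the run, not a verdict: a loss that has plateaued late in training will also return False.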

Full source continues in the committed curriculum files. The v1.0 page exposes the day flow and lab surface without inventing content.
