Day 46

Eval + ablation + plots

This is a valid v1.0 placeholder page for the later curriculum arc. Full interactive lab treatment ships after Week 1 dogfooding.

LECTURE & READING

Hour 1 — Confirm all seeds done (30 min)

By morning of Day 46:

  • Baseline (a reference method used for comparison): 3 seeds ✓
  • Extension: ≥ 2 seeds ✓ (if seed 3 is still running, that is fine)

If a seed crashed overnight, decide whether to retry it or proceed with 2 seeds. Document the decision.

Hour 2 — Run eval across all seeds (60 min)

make eval

This produces runs/eval_summary.csv. Inspect manually for outliers.
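The manual inspection can be backed by a quick script. The sketch below is only an illustration: the column names `method`, `seed`, and `metric` are assumptions, since the actual schema of runs/eval_summary.csv is not shown on this page, and the demo rows are made up. With only 2 or 3 seeds per method a z-score is not meaningful, so the sketch flags any seed whose metric deviates from its method's median by more than a relative tolerance.

```python
import csv
import io
import statistics
from collections import defaultdict


def find_outliers(rows, rel_tol=0.2):
    """Return (method, seed, value) for rows far from their method's median.

    Assumes each row has 'method', 'seed', and 'metric' keys (hypothetical schema).
    """
    by_method = defaultdict(list)
    for row in rows:
        by_method[row["method"]].append((row["seed"], float(row["metric"])))

    flagged = []
    for method, results in by_method.items():
        median = statistics.median(value for _, value in results)
        # Use the median as the scale; fall back to 1.0 if the median is zero.
        scale = abs(median) if median != 0 else 1.0
        for seed, value in results:
            if abs(value - median) / scale > rel_tol:
                flagged.append((method, seed, value))
    return flagged


def load_rows(path):
    """Read the eval summary CSV into a list of dict rows."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


# Made-up demo data standing in for runs/eval_summary.csv.
demo_csv = """method,seed,metric
baseline,0,0.80
baseline,1,0.81
baseline,2,0.50
extension,0,0.85
extension,1,0.86
"""
demo_rows = list(csv.DictReader(io.StringIO(demo_csv)))
demo_flagged = find_outliers(demo_rows)
print(demo_flagged)
```

On the real file, `find_outliers(load_rows("runs/eval_summary.csv"))` would do the same check once the column names are adjusted to match. A flagged seed is a prompt to look at that run's logs, not an automatic reason to drop it.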

LAB

Hour 3 — Plots + ablation (90 min)

Plot 1: headline result (bar with error bars)

# src/plot_headline.py
# Two bars (baseline mean ± std, extension mean ± std)
# Title: "<your task>: baseline vs extension"

Save to figures/headline.png. This goes on slide 1 of your deck.
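A minimal sketch of what src/plot_headline.py could look like, assuming matplotlib is available and that the per-seed metric values have already been read from the eval summary. The function names `summarize` and `plot_headline`, the y-axis label, and the figure size are illustrative choices, not part of the committed curriculum source.

```python
import os
import statistics


def summarize(values):
    """Mean and sample standard deviation across seeds (std is 0.0 for one seed)."""
    mean = statistics.mean(values)
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, std


def plot_headline(baseline, extension, task_name, out_path="figures/headline.png"):
    """Draw two bars (mean ± std) and save the figure to out_path."""
    # Imported here so summarize() stays usable without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to file without needing a display
    import matplotlib.pyplot as plt

    labels = ["baseline", "extension"]
    stats = [summarize(baseline), summarize(extension)]
    means = [mean for mean, _ in stats]
    stds = [std for _, std in stats]

    fig, ax = plt.subplots(figsize=(4, 4))
    ax.bar(labels, means, yerr=stds, capsize=6)
    ax.set_ylabel("metric")  # assumption: replace with the task's real metric name
    ax.set_title(f"{task_name}: baseline vs extension")

    os.makedirs(os.path.dirname(out_path) or ".", exist_ok=True)
    fig.savefig(out_path, dpi=200, bbox_inches="tight")
    plt.close(fig)
```

Call `plot_headline(baseline_values, extension_values, "<your task>")` with one metric value per seed in each list. With only 2 or 3 seeds the error bars are rough, so it helps to state the seed count on the slide.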

Plot 2: ablation

Full source continues in the committed curriculum files. The v1.0 page exposes the day flow and lab surface without inventing content.
