1 result for tag "nanochat-llm-training"
Karpathy's minimal, hackable end-to-end harness for training LLMs on a single GPU node: tokenization, pretraining, SFT/RL finetuning, DCLM CORE evaluation, KV-cache inference, and a ChatGPT-like web UI. A single `--depth` dial auto-configures width, heads, LR, and training horizon, letting you reproduce GPT-2 for ~$48 on 8×H100 in roughly 2 hours.
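The "single dial" idea can be sketched as follows — one depth value deterministically fixes the rest of the config. This is an illustrative assumption, not nanochat's actual derivation rules: the multipliers (64-dim heads, inverse-sqrt LR scaling, ~20 tokens per parameter a la Chinchilla) are placeholder heuristics.

```python
# Hypothetical sketch of a depth-driven config, NOT nanochat's real formulas.
from dataclasses import dataclass

@dataclass
class Config:
    depth: int     # number of transformer layers (the one dial)
    width: int     # model/embedding dimension
    n_heads: int   # attention heads
    lr: float      # peak learning rate
    tokens: int    # training horizon in tokens

def config_from_depth(depth: int, head_dim: int = 64) -> Config:
    width = depth * head_dim                 # width grows linearly with depth
    n_heads = width // head_dim              # keep per-head dimension fixed
    lr = 0.02 / width ** 0.5                 # LR shrinks with width (heuristic)
    params = 12 * depth * width ** 2         # rough transformer param count
    tokens = 20 * params                     # ~Chinchilla-optimal horizon
    return Config(depth, width, n_heads, lr, tokens)

cfg = config_from_depth(12)  # a GPT-2-small-like shape: width 768, 12 heads
```

The appeal of this pattern is that scaling an experiment up or down is a one-flag change, with every coupled hyperparameter moving in lockstep.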