LOTION: Smoothing the Optimization Landscape for Quantized Training
October 15, 2025
How Does Critical Batch Size Scale in Pre-training? (Decoupling Data and Model Size)
November 22, 2024
Anything but SGD: Evaluating Optimizers for LLM Training
July 12, 2024
Where Do Features Come From?
November 15, 2023