The Shampoo Foundation commits 95% of $1 billion to advance research in deep learning, optimization, and AI safety—empowering the next generation of breakthroughs that benefit humanity.
The Shampoo Foundation was established with a singular vision: to remove barriers that prevent brilliant minds from pursuing transformative research in machine learning and optimization.
We believe the most important advances emerge when researchers have the freedom to explore ambitious ideas without constraint. Our grants provide that freedom—funding fundamental research, open-source tools, and educational initiatives that shape the future of AI.
Supporting long-term research that may not have immediate commercial applications but advances our understanding of intelligence.
Championing open-source tools, datasets, and publications that democratize access to cutting-edge ML research.
Creating pathways for underrepresented communities to participate in and lead AI research.
A second-order optimizer that achieves the benefits of curvature-aware optimization while remaining computationally tractable at scale.
Traditional optimizers like SGD and Adam rely only on first-order gradient information. SGD treats every parameter direction alike, and Adam adapts each coordinate only in isolation; neither sees the correlations between directions, which leads to slow convergence on ill-conditioned problems where the loss surface curves very differently along different directions.
Shampoo uses curvature information via efficient Kronecker-factored preconditioners. By accumulating gradient statistics in L and R matrices, it adapts to the geometry of the loss landscape—taking larger steps in flat directions and smaller steps in steep ones.
1. Calculate gradient G for the current mini-batch
2. Update L ← L + GGᵀ and R ← R + GᵀG
3. Efficiently compute L⁻¹ᐟ⁴ and R⁻¹ᐟ⁴ via Schur-Newton
4. Apply the update W ← W − η · L⁻¹ᐟ⁴ G R⁻¹ᐟ⁴
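To make the four steps concrete, here is a minimal NumPy sketch of one update for a single 2-D weight matrix. It is illustrative only: the function name, learning rate, and epsilon are placeholder choices, the statistics are kept as dense matrices, and the inverse fourth roots are taken with an eigendecomposition rather than the Schur-Newton iteration used in production implementations.

```python
import numpy as np

def shampoo_step(W, G, L, R, lr=1e-1, eps=1e-6):
    """One Shampoo update for a 2-D weight matrix W with mini-batch gradient G."""
    # Step 2: accumulate row statistics (L) and column statistics (R).
    L = L + G @ G.T
    R = R + G.T @ G

    def inv_fourth_root(M):
        # Step 3: M^(-1/4) via eigendecomposition; production code uses Schur-Newton.
        vals, vecs = np.linalg.eigh(M + eps * np.eye(M.shape[0]))
        return (vecs * vals ** -0.25) @ vecs.T

    # Step 4: precondition the gradient on both sides and take the step.
    W = W - lr * inv_fourth_root(L) @ G @ inv_fourth_root(R)
    return W, L, R
```

For an m × n weight matrix, L and R start as m × m and n × n zero matrices and are carried from step to step; the small epsilon keeps the matrix roots well defined.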
A street guide to second-order optimization
For 50 years, SGD has ruled machine learning like a corrupt regime. It only looks at the slope. It ignores the terrain. It takes the same tiny steps whether crossing a flat plain or climbing a cliff.
Adam tried to fix it. Added momentum. Added adaptive rates. But it's still first-order. Still blind to curvature. Still part of the system.
Shampoo sees what others don't. It reads the curvature of the loss landscape. It knows when to sprint and when to tiptoe.
The secret? Kronecker factorization. A full preconditioner for an m × n weight matrix would be an impossible mn × mn monster. Shampoo keeps two small ones instead: L at m × m, R at n × n. The establishment said it couldn't be done. They were wrong.
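A back-of-envelope look at what that factorization buys, assuming a hypothetical 4096 × 4096 weight matrix stored in float32 (the shape and dtype are illustrative choices, not figures from the text):

```python
m, n = 4096, 4096                      # hypothetical square weight matrix
bytes_full = (m * n) ** 2 * 4          # full (mn x mn) preconditioner, float32
bytes_kron = (m * m + n * n) * 4       # Kronecker factors: L (m x m) + R (n x n)
print(f"full preconditioner: {bytes_full / 1e12:,.0f} TB")  # about 1,126 TB
print(f"Kronecker factors:   {bytes_kron / 1e6:,.0f} MB")   # about 134 MB
```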
Second Order Showdown
AdamW: 2,512 steps to converge
Shampoo: 1,729 steps to converge
FLAWLESS VICTORY — 40% faster convergence, 2× fewer iterations. The preconditioner has spoken.
| Rank | Optimizer | Steps to 75.9% Accuracy | Special Ability |
|---|---|---|---|
| 1ST | μ MUON | ~1,200 | 2× COMPUTE EFFICIENCY |
| 2ND | 🫧 SOAP | ~1,500 | ADAM IN EIGENBASIS |
| 3RD | 🧴 SHAMPOO | 1,729 | KRONECKER FACTORIZATION |
| 4TH | ∑ CASPR | ~1,800 | TIGHTER BOUNDS |
| 5TH | 📊 ADAMW | 2,512 | WEIGHT DECAY |
| 6TH | 📉 SGD | 4,000+ | SIMPLICITY |
© 2018-2025 OPTIMIZATION LABS
Four paths through the landscape of loss, each with its own light and color, converging toward the same distant horizon.
The progenitor. Maintains Kronecker-factored preconditioners L and R, capturing row and column gradient statistics separately.
Shampoo's refined heir. Runs Adam in the eigenbasis of Shampoo's preconditioner, combining the best of both worlds.
The minimalist. Orthogonalizes momentum via Newton-Schulz iteration (see the sketch below). No second moments, half the memory of Adam.
The theoretician. Combines axis preconditioners via Kronecker-sum approximation. Shampoo is its special case.
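To ground the Muon description above, here is a minimal sketch of Newton-Schulz orthogonalization: a handful of matrix multiplies that push the momentum matrix toward its nearest orthogonal matrix. The coefficients are the quintic ones commonly cited for Muon; the step count, scaling, and function name are illustrative choices rather than a reference implementation.

```python
import numpy as np

def newton_schulz_orthogonalize(M, steps=5, eps=1e-7):
    """Approximately replace M = U S Vᵀ with U Vᵀ using only matrix multiplies."""
    a, b, c = 3.4445, -4.7750, 2.0315       # quintic coefficients commonly cited for Muon
    X = M / (np.linalg.norm(M) + eps)       # scale so all singular values are <= 1
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X  # pushes singular values toward 1
    return X
```

Muon then uses the orthogonalized momentum of each 2-D weight matrix as its update direction in place of the raw gradient.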
"As the impressionists captured light through countless small brushstrokes, so too do these optimizers approximate the curvature of loss through elegant factorizations — each revealing truth through its own particular lens."— On the Art of Optimization
We fund work across six interconnected domains, each critical to building beneficial AI systems.
Advancing second-order methods, adaptive algorithms, and theoretical foundations that make training more efficient and reliable.
Enabling efficient distributed training across thousands of accelerators with minimal overhead and maximum reproducibility.
Ensuring advanced AI systems remain beneficial, interpretable, and aligned with human values as capabilities scale.
Building and maintaining open-source frameworks, tools, and compute resources accessible to researchers worldwide.
Applying ML to accelerate discovery in biology, climate science, materials research, and other high-impact domains.
Fellowships, mentorship programs, and grants for early-career researchers pursuing ambitious, unconventional ideas.
Researcher & Philanthropist
Rohan Anil is a pioneering researcher whose work on the Shampoo optimizer transformed our understanding of practical second-order optimization in deep learning. His contributions—described as "a breakthrough in deep learning practical optimization at scale"—demonstrated that methods once considered computationally prohibitive could achieve state-of-the-art results.
Having witnessed firsthand how resource constraints limit scientific progress, Rohan established The Shampoo Foundation to ensure the next generation of researchers has the support to pursue transformative ideas without barriers.
The most profound advances in science come when brilliant people have the freedom to pursue ambitious ideas. Our role is simply to remove the obstacles.
— Rohan Anil, Founder
We welcome proposals from researchers, institutions, and organizations working on fundamental problems in machine learning, optimization, and AI safety.
Begin Application
Applications reviewed on a rolling basis. Typical response within 6-8 weeks.