2026.04.09 · 11 min read
LLM with the JAX ecosystem from scratch - Part 3
Part 3 of a series of posts sharing my experience of training LLMs with the JAX ecosystem from scratch. This part covers maximal update parameterization and scaling laws.
llm pre-training jax parallel training parameterization scaling law
2026.04.08 · 9 min read
LLM with the JAX ecosystem from scratch - Part 2
Part 2 of a series of posts sharing my experience of training LLMs with the JAX ecosystem from scratch. This part covers parallel training and sharding.
llm pre-training jax parallel training sharding fsdp tensor parallelism
2026.03.18 · 10 min read
LLM with the JAX ecosystem from scratch - Part 1
Part 1 of a series of posts sharing my experience of training LLMs with the JAX ecosystem from scratch. This part covers the basics of the various components.
llm pre-training jax
2026.01.25 · 16 min read
Beginner's tips for LLM post-training
What I have learned about LLM post-training, with tips for beginners like me.
tips llm post-training pytorch rlvr