Danjie Wenren
Home
Archive
About
Doing cool stuff
LinkedIn GitHub
Categories
Research 4
Tags
fsdp jax llm parallel training parameterization post-training pre-training pytorch rlvr scaling law sharding tensor parallelism tips
© 2026 Danjie Wenren. All Rights Reserved. / RSS / Sitemap
Powered by Astro
$ds^2 = g_{\mu\nu}\,dx^\mu\,dx^\nu$