In this video presentation, Aleksa Gordić explains what it takes to scale ML models up to trillions of parameters! He covers the fundamental ideas behind all of the recent big ML models like Meta’s OPT-175B, BigScience BLOOM 176B, EleutherAI’s GPT-NeoX-20B, GPT-J, …

Video Highlights: Ultimate Guide To Scaling ML Models – Megatron-LM | ZeRO | DeepSpeed | Mixed Precision