NVIDIA lately launched the most recent model of the NVIDIA NeMo Megatron framework, which is now in open beta. This framework can be utilized to construct and deploy giant language fashions (LLMs) with pure language understanding (NLU).
Combining NVIDIA NeMo Megatron with our Azure AI infrastructure gives a robust platform that anybody can spin up in minutes with out having to incur the prices and burden of managing their very own on-premises infrastructure. And naturally, we have now taken our benchmarking of the brand new framework to a brand new degree, to actually present the ability of the Azure infrastructure.
Reaching new milestones with 530B parameters
We used Azure NDm A100 v4-series digital machines to run the GPT-3 mannequin’s new NVIDIA NeMo Megatron framework and take a look at the bounds of this collection. NDm A100 v4 digital machines are Azure’s flagship GPU choices for AI and deep studying powered by NVIDIA A100 80GB Tensor Core GPUs. These cases have probably the most GPU reminiscence capability and bandwidth, backed by NVIDIA InfiniBand HDR connections to help scaling up and out. In the end, we ran a 530B-parameter benchmark on 175 digital machines, leading to a coaching time per step of as little as 55.7 seconds (figure1). This benchmark measures the compute effectivity and the way it scales by measuring the time taken per step to coach the mannequin after regular state is reached, with a mini-batch dimension of 1. Such excellent velocity wouldn’t have been attainable with out InfiniBand HDR offering glorious communication between nodes with out elevated latency.
These outcomes spotlight an nearly linear velocity improve, guaranteeing higher efficiency for the next variety of nodes—paramount for heavy or time-sensitive workloads. As proven by these runs with billions of parameters, clients can relaxation assured that Azure’s infrastructure can deal with even probably the most troublesome and complicated workloads, on demand.
“Velocity and scale are each key to creating giant language fashions, and the most recent launch of the NVIDIA NeMo Megatron framework introduces new strategies to ship 30 p.c sooner coaching for LLMs,” mentioned Paresh Kharya, senior director of accelerated computing at NVIDIA. “Microsoft’s testing with NeMo Megatron 530B additionally exhibits that Azure NDm A100 v4 cases powered by NVIDIA A100 Tensor Core GPUs and NVIDIA InfiniBand networking present a compelling possibility for reaching linear coaching speedups at large scale.”
Showcasing Azure AI capabilities—now and sooner or later
Azure’s dedication is to make AI and HPC accessible to everybody. It contains, however is just not restricted to, offering one of the best AI infrastructure that scales from the smallest use instances to the heaviest workloads. As we proceed to innovate to construct one of the best platform in your AI workloads, our promise to you is to make use of the most recent benchmarks to check our AI capabilities. These outcomes assist drive our personal innovation and showcase that there isn’t any restrict to what you are able to do. For all of your AI computing wants, Azure has you lined.
Be taught extra
To be taught extra in regards to the outcomes or methods to recreate them, please see the next hyperlinks.