Transformers are changing AI – and Nvidia’s new GPU is proof

Image by TrendDesign | Bigstockphoto

ITEM: Nvidia’s recently announced Hopper H100 processor is the strongest indicator yet that transformers are transforming AI as we know it.

No, not the robots that noisily disguise themselves as cars. As Nvidia describes it in this blog post, transformers are a type of neural network model “that learns context and thus meaning by tracking relationships in sequential data like the words in this sentence.”
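To make that "tracking relationships" idea concrete, here is a minimal single-head self-attention sketch in NumPy. It is purely illustrative – the function name, toy dimensions and random weights are assumptions for this sketch, not Nvidia's implementation or any production code – but it shows how each token's output becomes a weighted mix of every other token in the sequence.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over one sequence.

    x: (seq_len, d_model) -- one embedding vector per token.
    Each output row is a weighted mix of *all* positions; the weights
    ("attention") express how strongly tokens relate to one another.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # project to queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise token-to-token affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the sequence
    return weights @ v                               # context-aware representations

# Toy usage: 4 tokens, 8-dimensional embeddings (illustrative sizes only).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```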

That’s why transformers have been generating a lot of excitement in AI circles as a paradigm shift for language AI – language is essentially composed of sequential data that requires context to understand it. As an example, they already play a role in how search engines like Google and Bing work.

But the potential applications for transformers go well beyond language-related apps, from AI vision that looks for flaws in chip wafers on an assembly line to drug design and fraud prevention.

According to Dave Salvator, senior product manager for AI inference and cloud at Nvidia, a key attraction of transformers is the sheer number of parameters these models can handle, plus the fact that transformer models are growing much faster than other model types, reports IEEE Spectrum:

“We are trending very quickly toward trillion-parameter models,” says Salvator. Nvidia’s analysis shows the training needs of transformer models growing 275-fold every two years, while the trend for all other models is 8-fold growth every two years.

This requires a lot more computing resources, both for training AI models and for running them in real time. And that’s why Nvidia has stepped up with its H100 GPU, which – unlike previous Nvidia GPUs – features a “transformer engine” that can dynamically change the precision of the calculations in its cores to speed up transformer neural network training.

The term “dynamic” is key, Spectrum reports:

The transformer engine’s secret sauce is its ability to dynamically choose what precision is needed for each layer in the neural network at each step in training a neural network. The least-precise units, the 8-bit floating point, can speed through their computations, but then produce 16-bit or 32-bit sums for the next layer if that’s the precision needed there. The Hopper goes a step further, though. Its 8-bit floating-point units can do their matrix math with either of two forms of 8-bit numbers.
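The H100’s transformer engine and its FP8 formats are hardware features that this article only describes, but the general trade-off the quote points to can be sketched in NumPy. In the hedged example below, float16 stands in for 8-bit floating point (NumPy has no FP8 type): the matrix math is done in the cheaper format and the result is handed to the “next layer” in float32, and the measured error is the kind of signal a per-layer precision policy would weigh.

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(256, 256)).astype(np.float32)
b = rng.normal(size=(256, 256)).astype(np.float32)

# Full-precision reference (stands in for a 32-bit layer).
ref = a @ b

# Low-precision pass: round the inputs to float16 (a stand-in for FP8,
# which NumPy does not support), multiply, then hand the result to the
# "next layer" in float32 -- cheap compute, higher-precision hand-off.
low = (a.astype(np.float16) @ b.astype(np.float16)).astype(np.float32)

# The cheaper pass is less accurate; a per-layer policy would accept this
# error where it doesn't hurt and fall back to higher precision where it does.
rel_err = np.abs(low - ref).max() / np.abs(ref).max()
print(f"max relative error from the low-precision pass: {rel_err:.4f}")
```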

Nvidia isn’t the only company using transformers, of course. But given Nvidia’s ambition to be the “picks and shovels of the metaverse”, as RFM’s Richard Windsor puts it, it’s clear the company understands the computational demands, not least because of all the AI that will be required to bring the metaverse to life in any meaningful way.

More details on transformers at IEEE Spectrum.

See also this Nvidia blog outlining why it thinks transformers are an AI game-changer.
