Intel Unveils New AI Accelerator Chip: Gaudi 3



Intel introduced its latest AI accelerator chip, Gaudi 3, at its Vision 2024 event in Phoenix. The company positions the chip as an alternative to Nvidia’s H100, a sought-after data center GPU that has been in short supply.

Performance Claims and Comparison

According to Intel, Gaudi 3 holds significant performance advantages over Nvidia’s H100. The company claims 50% faster training times for models such as OpenAI’s GPT-3 175B LLM and Meta’s Llama 2, and 50% faster inference for models such as Llama 2 and Falcon 180B. These figures are Intel’s own and have not been independently verified.

The H100 dominates the data center GPU market, and Nvidia plans to follow it with more powerful AI accelerator chips, the H200 and the Blackwell B200, neither of which has yet been released to the public. Meanwhile, the ongoing supply constraints of the H100 have prompted tech giants to explore custom AI accelerator chip designs of their own.

Gaudi 3 Features and Specifications

Intel’s Gaudi 3 is an evolution of its predecessor, Gaudi 2. It is built from two identical silicon dies joined by a high-bandwidth connection. Each die contains a central cache memory of 48 megabytes, four matrix multiplication engines, and 32 programmable tensor processor cores, for a total of 64 tensor processor cores across the chip.
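The per-die figures above can be rolled up into chip-level totals with a few lines of Python. This is only an illustrative sketch: the DieSpec class and chip_totals function are hypothetical names introduced here, and the totals assume that the resources of the two dies simply add up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DieSpec:
    """Per-die resources as quoted in the article (names are illustrative)."""
    cache_mb: int
    matrix_engines: int
    tensor_cores: int


def chip_totals(die: DieSpec, num_dies: int = 2) -> dict:
    """Sum per-die resources, assuming identical dies whose resources add up."""
    return {
        "cache_mb": die.cache_mb * num_dies,
        "matrix_engines": die.matrix_engines * num_dies,
        "tensor_cores": die.tensor_cores * num_dies,
    }


# Figures from the article: 48 MB cache, 4 matrix engines, 32 tensor cores per die.
GAUDI3_DIE = DieSpec(cache_mb=48, matrix_engines=4, tensor_cores=32)
totals = chip_totals(GAUDI3_DIE)
print(totals)
```

Under that assumption, the two dies together provide 96 MB of cache, eight matrix multiplication engines, and the 64 tensor processor cores mentioned above.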

Intel says Gaudi 3 delivers double the AI compute of Gaudi 2 when using the 8-bit floating-point (FP8) format, and four times the compute when using the BFloat16 number format. Gaudi 3 is equipped with 128GB of HBM2e memory and has a memory bandwidth of 3.7TB per second.
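To put the memory figures in perspective, the short Python sketch below estimates how long one full pass over the chip's memory would take. It is back-of-the-envelope arithmetic, not an Intel figure: it assumes the bandwidth is 3.7 terabytes per second, uses decimal units (1 TB = 1000 GB), and treats the quoted peak bandwidth as sustained. The function name full_sweep_ms is hypothetical.

```python
# Figures quoted in the article; the per-second reading of the bandwidth is assumed.
HBM_CAPACITY_GB = 128
HBM_BANDWIDTH_TB_PER_S = 3.7


def full_sweep_ms(capacity_gb: float, bandwidth_tb_per_s: float) -> float:
    """Milliseconds needed to read the whole memory once at peak bandwidth.

    Assumes decimal units (1 TB = 1000 GB) and ideal, sustained peak bandwidth.
    """
    bandwidth_gb_per_s = bandwidth_tb_per_s * 1000
    seconds = capacity_gb / bandwidth_gb_per_s
    return seconds * 1000


sweep_ms = full_sweep_ms(HBM_CAPACITY_GB, HBM_BANDWIDTH_TB_PER_S)
print(f"One full pass over memory: about {sweep_ms:.1f} ms")
```

Under these assumptions a full pass takes roughly 35 milliseconds. Real workloads rarely sustain peak bandwidth, so this is best read as a lower bound.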

Efficiency and Energy Consumption

With power consumption a growing challenge in data centers, Intel emphasizes Gaudi 3’s power efficiency. The company claims 40% greater inference power efficiency than Nvidia’s H100 across models of various parameter counts, and attributes the gain to Gaudi’s large-matrix math engines, which require less memory bandwidth.

Comparison with Blackwell Architecture

Intel manufactures Gaudi 3 on TSMC’s N5 process technology, which narrows the technological gap with Nvidia, whose upcoming Blackwell architecture is built on a custom TSMC 4NP process. The decision to use HBM2e memory in Gaudi 3, rather than a newer and costlier generation of high-bandwidth memory, points to Intel’s focus on competitive pricing.

A direct performance comparison between Gaudi 3 and Nvidia’s B200 will have to wait for third-party benchmarks. In the meantime, Intel’s upcoming Falcon Shores chip and its planned use of nanosheet transistor technology remain topics of interest.


About Post Author

Chris Jones

Hey there! 👋 I'm Chris, 34 yo from Toronto (CA), I'm a journalist with a PhD in journalism and mass communication. For 5 years, I worked for some local publications as an envoy and reporter. Today, I work as 'content publisher' for InformOverload. 📰🌐 Passionate about global news, I cover a wide range of topics including technology, business, healthcare, sports, finance, and more. If you want to know more or interact with me, visit my social channels, or send me a message.