NVIDIA CEO Jensen Huang revealed during the NVIDIA GTC conference on March 18, 2024, in San Jose, California, that the company’s latest GPU platform is called Blackwell. Major companies such as AWS, Microsoft, and Google plan to adopt the platform for generative AI and other modern computing tasks. Blackwell-based products are expected to reach the market worldwide in late 2024 through NVIDIA’s partners.
The Blackwell architecture features two dies connected by a 10 terabytes-per-second chip-to-chip interconnect, allowing the pair to operate as a single unified GPU. With 208 billion transistors and built on a custom TSMC 4NP process, Blackwell offers 8 TB/s of memory bandwidth and 20 petaFLOPS of AI performance. The platform can handle training and inference for AI models scaling up to 10 trillion parameters, making it suitable for enterprise use.
Enhancements in Blackwell include a second-generation Transformer Engine alongside NVIDIA’s TensorRT-LLM and NeMo Megatron frameworks for greater compute and larger model sizes, confidential computing with encryption protocols, and a dedicated decompression engine to accelerate database queries. Reliability features include a dedicated engine that runs self-tests on the chip’s memory.
In addition to Blackwell, NVIDIA introduced the GB200 Grace Blackwell Superchip, which connects two NVIDIA B200 Tensor Core GPUs to the NVIDIA Grace CPU for LLM inference. The GB200 can be paired with the NVIDIA Quantum-X800 InfiniBand and Spectrum-X800 Ethernet platforms for high-speed networking. A new server design, the GB200 NVL72, combines 36 Grace CPUs and 72 Blackwell GPUs to deliver 1.8 exaFLOPS of AI performance.
NVIDIA also unveiled NVIDIA Inference Microservices (NIM): cloud-native microservices that package APIs, domain-specific code, optimized inference engines, and an enterprise runtime for generative AI. NIMs scale to the number of GPUs available and can run in the cloud or in a data center. Developers can experiment with NIMs for free starting March 18.
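To make the API aspect concrete, here is a minimal sketch of how a developer might query a NIM over its chat-completion HTTP interface. The endpoint URL, model name, and `NVIDIA_API_KEY` environment variable are illustrative assumptions, not details from the article; consult NVIDIA’s NIM documentation for the actual values.

```python
# Hypothetical sketch of calling a hosted NIM chat endpoint.
# URL, model name, and env var are assumptions for illustration only.
import json
import os
import urllib.request

NIM_URL = "https://integrate.api.nvidia.com/v1/chat/completions"  # assumed endpoint


def build_request(prompt: str, model: str = "meta/llama3-8b-instruct") -> dict:
    """Build an OpenAI-style chat-completion payload for a NIM."""
    return {
        "model": model,  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }


def call_nim(prompt: str) -> str:
    """Send the payload to the NIM and return the generated text."""
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the microservice exposes a standard HTTP API, the same client code works whether the NIM runs in the cloud or in an on-premises data center; only the URL changes.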
Furthermore, NVIDIA announced the release of AI Enterprise 5.0, which includes NIMs, CUDA-X microservices, AI Workbench, support for Red Hat OpenStack Platform, and expanded support for new NVIDIA GPUs. Other announcements included cuPQC for accelerating post-quantum cryptography and the X800 series of network switches to accelerate AI infrastructure.
Partnerships with companies like Oracle, AWS, Google Cloud, Microsoft, Dell, and SAP were also detailed during the keynote. These partnerships aim to incorporate NVIDIA’s latest technologies into various platforms and services. Huang emphasized that the industry is preparing for the adoption of Blackwell and highlighted the advancements in NVIDIA’s AI chips.