This paper focuses on the key improvements gained when upgrading an NVIDIA® GPU from Turing to Blackwell, looking specifically at architecture improvements across four high-end embedded GPUs: Turing TU104, Ampere GA104, Ada AD103, and Blackwell GB203. NVIDIA GPUs have always excelled at video and graphics processing and at supporting general-purpose data processing that benefits from massively parallel algorithms.
NVIDIA also became a leader in artificial intelligence (AI) processing with the inclusion of Tensor cores in its GPUs. NVIDIA introduced Tensor cores in 2017 as a key feature of the Volta data center GPUs, followed in 2018 by 2nd generation Tensor cores in the Turing architecture for desktop and other use cases. The Turing architecture also introduced Ray Tracing cores, which accelerate photorealistic rendering. With each new GPU generation, NVIDIA has updated the CUDA® core data paths, extended the Tensor cores to support additional data precisions, and enhanced Ray Tracing core capabilities. New manufacturing processes have enabled denser GPUs with more cores running at higher clock speeds. With each generation, GPUs have become more performant and more essential to modern data processing.
NVIDIA GPU ARCHITECTURE: FROM TURING TO BLACKWELL
