AI Hardware: NVIDIA GeForce and Quadro Graphic Card Spec Comparison

The graphics card in the computer (the "AI workstation") is the key to maximizing the computing performance of an AI model. AI workstations are available as DIY builds, as cloud instances, or as a mix of both. For organizations that have embraced AI and built a data science team, buying a workstation with the right graphics card is a no-brainer, at least for testing AI models. However, selecting the right card for a specific use case is challenging because there are dozens of cards available.

There are many different tasks involved in building, training, testing, and running AI models. One of the most compute-intensive tasks is training the model, especially if it's a deep learning neural network with many layers. Although the CPU plays an important part in data processing, the GPU is the workhorse that provides the parallel computing capability. The 800 lb gorilla of the GPU market is Nvidia.

Hitachi engineer Hubert Yoshida wrote that CPUs are designed for performing a single task at a time, like transaction processing, whereas a GPU is a massively parallel architecture designed for processing many functions at once.

A workstation built around the Intel Xeon E5-2690 supports 16 cores. That's a fairly decent core count, but it falls short for performing many compute-intensive AI tasks in a short period of time. For a deep learning model that needs to "calculate and update millions of parameters in run-time," 16 CPU cores aren't likely to cut it. However, for about $2,500, a GPU graphics card with 4,608 CUDA cores and 576 tensor cores can be added to the workstation. Tensor cores "accelerate large matrix operations" and "perform mixed-precision matrix multiply and accumulate calculations in a single operation."
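To make that last quote concrete, the NumPy sketch below mimics what one mixed-precision multiply-accumulate computes: D = A×B + C with half-precision (FP16) inputs and full-precision (FP32) accumulation. The `tensor_core_mma` helper is a name of our own for illustration; a real tensor core performs this on small matrix tiles in a single hardware instruction.

```python
import numpy as np

def tensor_core_mma(a, b, c):
    """Illustrative mixed-precision MMA: D = A*B + C.

    Inputs are rounded to FP16 (as tensor cores store them),
    but the multiply-accumulate runs at FP32 precision.
    """
    a16 = a.astype(np.float16)  # inputs stored at half precision
    b16 = b.astype(np.float16)
    # products and the running sum are kept at FP32 to preserve accuracy
    return a16.astype(np.float32) @ b16.astype(np.float32) + c.astype(np.float32)

# A 4x4 tile, roughly the granularity a tensor core operates on
a = np.random.rand(4, 4)
b = np.random.rand(4, 4)
c = np.zeros((4, 4))
d = tensor_core_mma(a, b, c)
print(d.dtype)  # float32
```

The point of the accumulate-at-FP32 step is that halving the input precision saves memory and bandwidth while the FP32 running sum keeps rounding error from compounding across the dot products.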

Turing is the latest and greatest GPU architecture from Nvidia. In fact, the company calls it the "greatest leap since the invention" of the CUDA GPU in 2006. One of the new features of this architecture is real-time ray tracing that brings 3D environments to life. That's great for gaming but not really for AI. However, the Titan RTX comes packed with 576 tensor cores, 24GB of GDDR6 memory, and 672 GB/s of memory bandwidth. If that isn't enough, NVLink technology makes it possible to connect two cards so they perform as a unit.

Here is a comparison of the GeForce and Quadro graphics cards.

| Specs | Titan RTX | GeForce RTX 2080 Ti | GeForce RTX 2080 Super | Quadro RTX 8000 | Quadro RTX 6000 | Quadro RTX 5000 | Quadro GV100 |
|---|---|---|---|---|---|---|---|
| CUDA Cores | 4608 | 4352 | 3072 | 4608 | 4608 | 3072 | 5120 |
| Tensor Cores | 576 | 544 | 384 | 576 | 576 | 384 | 640 |
| TFLOPS (Single Prec.) | 16.3 | 13.4 | 11.15 | 16.3 | 16.3 | 11.2 | 14.8 |
| Base Clock | 1350 MHz | 1350 MHz | 1650 MHz | 1395 MHz | 1440 MHz | 1620 MHz | 1132 MHz |
| Boost Clock | 1770 MHz | 1545 MHz | 1815 MHz | 1770 MHz | 1770 MHz | 1815 MHz | 1627 MHz |
| Memory Bandwidth | 672 GB/s | 616 GB/s | 496 GB/s | 672 GB/s | 672 GB/s | 448 GB/s | 868 GB/s |
  • TU = Turing architecture
  • Turing succeeds the Volta GPU architecture
  • TU102 has more features than TU104 and TU106
  • Prices will vary between online retailers