The graphics card in the computer (“AI workstation”) is the key to maximizing the computing performance of an AI model. An AI workstation can be built yourself (DIY), rented in the cloud, or both. For organizations that have embraced AI and built a data science team, buying a workstation with the right graphics card is a no-brainer, at least for testing AI models. However, selecting the right card for a specific use case is challenging because there are dozens of cards available.
There are many different tasks involved in building, training, testing, and running AI models. One of the most compute-intensive tasks is training the model, especially if it’s a deep learning neural network with many layers. Although the CPU plays an important part in data processing, the GPU is the workhorse that provides the parallel computing capabilities. The 800-lb gorilla of the GPU market is Nvidia.
Hitachi engineer Hubert Yoshida wrote that CPUs are designed for performing a single task like transaction processing, whereas a GPU is a massively parallel architecture designed for processing many functions at a time.
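To see that difference in practice, here is a rough sketch (using PyTorch and a CUDA-capable card, neither of which is specified above, and an arbitrary matrix size) that times the same large matrix multiply on the CPU and then on the GPU:

```python
import time
import torch

# Toy illustration of the CPU-vs-GPU gap Yoshida describes: one large
# matrix multiply on the CPU, then the same multiply on a CUDA GPU.
size = 8192
a = torch.randn(size, size)
b = torch.randn(size, size)

start = time.time()
a @ b                                    # runs on a handful of CPU cores
print(f"CPU: {time.time() - start:.2f}s")

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()             # wait for the host-to-device copies
    start = time.time()
    a_gpu @ b_gpu                        # spread across thousands of CUDA cores
    torch.cuda.synchronize()             # GPU work is asynchronous; wait before timing
    print(f"GPU: {time.time() - start:.2f}s")
```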
A Xeon E5-2690-based workstation offers 16 cores. That’s a fairly decent number of cores, but it will fall short when it comes to performing many compute-intensive AI tasks in a short period of time. For a deep learning model that needs to “calculate and update millions of parameters in run-time”, 16 cores aren’t likely to cut it. However, for about $2,500, a GPU graphics card with 4,608 CUDA cores and 576 tensor cores can be added to the workstation. Tensor cores “accelerate large matrix operations” and “perform mixed-precision matrix multiply and accumulate calculations in a single operation.”
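The snippet below sketches what that mixed-precision workload looks like from the framework side, using PyTorch’s automatic mixed precision (an assumption here, not something named above): the matrix multiplies run in FP16 on the tensor cores while accumulation stays in FP32. The model shape and batch size are placeholders.

```python
import torch
from torch import nn
from torch.cuda.amp import autocast, GradScaler

# Minimal mixed-precision training step -- the kind of matrix-heavy work
# that tensor cores are built to accelerate. Sizes are illustrative only.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10)).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = GradScaler()                    # rescales gradients so FP16 doesn't underflow
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 1024, device="cuda")
y = torch.randint(0, 10, (64,), device="cuda")

for _ in range(10):
    optimizer.zero_grad()
    with autocast():                     # matmuls run in FP16 on tensor cores,
        loss = loss_fn(model(x), y)      # accumulation stays in FP32
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```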
Turing is the latest and greatest GPU architecture from Nvidia. In fact, they call it the “greatest leap since the invention” of the CUDA GPU in 2006. One of the new features of this technology is real-time ray tracing that brings 3D environments to life. That’s great for gaming but not really for AI. However, the Titan RTX comes packed with 576 tensor cores, 24GB of GDDR6 memory, and a memory bandwidth of 672 GB/s. If that isn’t enough, NVLink technology makes it possible to link two or more cards together so they perform as a unit.
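Here is a rough sketch of what using more than one card looks like in code, assuming PyTorch and at least two installed GPUs; nn.DataParallel is the simplest way to split a batch across them (DistributedDataParallel is the heavier-duty option for real training runs). The model shape is again a placeholder.

```python
import torch
from torch import nn

# Spread one model's work across multiple cards. On NVLink-connected GPUs,
# `nvidia-smi nvlink --status` should report the bridge as active.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))

if torch.cuda.device_count() > 1:
    print(f"Using {torch.cuda.device_count()} GPUs")
    model = nn.DataParallel(model)       # splits each batch across the cards
model = model.cuda()

x = torch.randn(256, 1024, device="cuda")
out = model(x)                           # slices of the batch run on each GPU in parallel
print(out.shape)
```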
Here is a comparison of the GeForce and Quadro graphics cards.
Specs | Titan RTX | GeForce RTX 2080 Ti | GeForce RTX 2080 Super | Quadro RTX 8000 | Quadro RTX 6000 | Quadro RTX 5000 | Quadro GV100
---|---|---|---|---|---|---|---
GPU | TU102 | TU102 | TU104 | TU102 | TU102 | TU104 | Volta GV100
CUDA Cores | 4608 | 4352 | 3072 | 4608 | 4608 | 3072 | 5120
Tensor Cores | 576 | 544 | 384 | 576 | 576 | 384 | 640
Memory | 24GB | 11GB | 8GB | 48GB | 24GB | 16GB | 32GB
NVLink | yes | yes | yes | yes | yes | yes | yes
Single-Precision TFLOPS | 16.3 | 13.4 | 11.15 | 16.3 | 16.3 | 11.2 | 14.8
Base Clock | 1350 MHz | 1350 MHz | 1650 MHz | 1395 MHz | 1440 MHz | 1620 MHz | 1132 MHz
Boost Clock | 1770 MHz | 1545 MHz | 1815 MHz | 1770 MHz | 1770 MHz | 1815 MHz | 1627 MHz
Memory Bandwidth | 672 GB/s | 616 GB/s | 496 GB/s | 672 GB/s | 672 GB/s | 448 GB/s | 868 GB/s
Power | 280W | 260W | 250W | 295W | 295W | 265W | 250W
Price | $2,595.00 | $1,199.00 | $699.99 | $5,500.00 | $4,000.00 | $2,400.00 | $11,083.00
- TU = Turing architecture
- Turing succeeds the Volta GPU architecture
- TU102 has more features than TU104 and TU106
- Prices will vary between online retailers
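If one of these cards is already installed, a few lines of PyTorch (assumed here, not part of the comparison itself) will report how it lines up with the table:

```python
import torch

# Print name, memory, and SM count for each installed card, as reported
# by the CUDA driver, to compare against the spec table above.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"{props.name}: {props.total_memory / 1024**3:.0f} GB, "
          f"{props.multi_processor_count} SMs, "
          f"compute capability {props.major}.{props.minor}")
```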