Neural Magic’s AI Engine To Disrupt The GPU Industry


It was only a matter of time before an AI startup tackled the GPU problem head-on, the problem being the GPU itself and why it’s needed in the first place. That startup is Neural Magic, which has developed an AI engine that lets data scientists run deep learning models on commodity CPU-based hardware in the inference phase.

When it comes to running deep learning models, GPU cards are generally treated as a requirement. Depending on the workload, a job can need anywhere from one card to many. That’s a problem when an entry-level GPU card such as the RTX 2080 Super costs $700 and the more powerful RTX 8000 costs $5,500. Costs escalate quickly when several GPU cards are installed in one workstation to get more compute power and memory. In the end, the GPU cards may cost more than the server itself, depending on the configuration.
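To make the arithmetic concrete, here is a minimal sketch that totals GPU card costs for a few card counts and finds how many cards it takes to pass the price of the server. The two card prices are the ones quoted above; the server price (`ASSUMED_SERVER_USD`) and the function names are hypothetical placeholders for illustration, not figures or names from any vendor.

```python
# Card prices are the ones quoted in the article.
RTX_2080_SUPER_USD = 700
RTX_8000_USD = 5_500

# Hypothetical base server price, assumed only for this illustration.
ASSUMED_SERVER_USD = 10_000


def gpu_total(card_price_usd: int, num_cards: int) -> int:
    """Combined price of num_cards identical GPU cards."""
    return card_price_usd * num_cards


def cards_to_exceed(card_price_usd: int, server_price_usd: int) -> int:
    """Smallest number of cards whose combined price exceeds the server price."""
    return server_price_usd // card_price_usd + 1


if __name__ == "__main__":
    for name, price in (("RTX 2080 Super", RTX_2080_SUPER_USD),
                        ("RTX 8000", RTX_8000_USD)):
        for n in (1, 2, 4, 8):
            print(f"{n} x {name}: ${gpu_total(price, n):,}")
        needed = cards_to_exceed(price, ASSUMED_SERVER_USD)
        print(f"{needed} x {name} cost more than the assumed "
              f"${ASSUMED_SERVER_USD:,} server")
```

Under this assumed server price, two RTX 8000 cards already cost more than the server, which is the kind of configuration the paragraph above describes.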

The Neural Magic AI engine helps in the inference phase. Inference means “training is complete and it’s time for a model to do its job: make predictions based on the incoming data”. The inference phase is also “not as resource-heavy as training”. Yet the way it’s done today, the GPU is the workhorse for every phase, inference included. GPUs are not only costly; they also draw a lot of power and generate a lot of heat, which is a problem when servers are racked in a data center. Neural Magic recently raised $15M, so we’re certain to see more exciting features from the company in 2020.

Uses for an AI Inference Engine

  • Image Classification
  • Image Processing
  • Recommendation Engine
  • Object Identification
  • Medical Imagery


  • Company: Neural Magic
  • Founded: 2018
  • HQ: Somerville, MA
  • Raised: $20M
  • # of Employees: ~20
  • Founders: Nir Shavit (CEO) and Alex Matveev (Chief Architect)
  • Product: AI engine that allows data scientists to run deep learning models on commodity CPU-based hardware in the inference phase