Battle of AI Chips Shifts from the Cloud to the Edge and IoT Frontier

January 9, 2019

AI, the Edge and the IoT Frontier

IoT has shifted from connecting simple everyday devices such as cameras, thermostats and light bulbs to smart assistants, smart appliances and live-streaming baby monitors. The data gleaned from this new wave of smart devices can be used for continuous training and prediction with machine learning (ML) and artificial intelligence (AI), which in turn is driving customer demand. Increasingly, this work is taking place at the network edge rather than in the cloud or data center.

AI Neural Networks: Training & Inference

AI based on neural networks consists mainly of two phases: training and inferencing.

Training involves the construction of a deep neural network (DNN) model designed to solve a specific problem such as voice recognition or subject classification. The compute-intensive work of training the model takes place in the cloud or data center.

Inferencing comes into play once the model is reliable and ready for deployment on IoT devices, and involves the comparison of incoming data from smart devices against the trained DNN model in order to help the IoT device make “intelligent” decisions. Previously, inferencing always took place in the cloud because of the huge volume of data needed. However, as a result of improvements in software algorithms, better compute resources in hardware and more robust security, inferencing is increasingly taking place on devices at the network edge.
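To make the split between training and inferencing concrete, here is a minimal sketch of inference in plain Python. The weights, the layer sizes and the names `infer` and `decide` are hypothetical stand-ins for a model that would have been trained elsewhere; nothing here reflects a particular vendor's runtime.

```python
import math

# Hypothetical weights for a tiny network: 2 inputs, 2 hidden units, 1 output.
# In practice these values would come from training in the cloud or data center.
HIDDEN_WEIGHTS = [[0.5, -0.25], [0.75, 0.5]]
HIDDEN_BIASES = [0.0, -0.5]
OUTPUT_WEIGHTS = [1.0, -1.0]
OUTPUT_BIAS = 0.25


def relu(x):
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Squash a score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def infer(sample):
    """Run one forward pass. The weights are only read, never updated."""
    hidden = [
        relu(sum(w * x for w, x in zip(weights, sample)) + bias)
        for weights, bias in zip(HIDDEN_WEIGHTS, HIDDEN_BIASES)
    ]
    score = sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden)) + OUTPUT_BIAS
    return sigmoid(score)


def decide(sample, threshold=0.5):
    """Turn the model's score into the device's yes/no decision."""
    return infer(sample) >= threshold
```

The point of the sketch is that inference is only evaluation of fixed weights against new data, which is why it can run on far more modest hardware than training.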

Customer Demand for AI at the Edge

Customer demand for AI at the edge is growing. In 2017, over 300 million smartphones shipped with some kind of neural-networking capability; in 2018, over 800,000 AI accelerators shipped to data centers; and every day, more than 700 million people use some kind of smart personal assistant, whether Apple’s Siri or Amazon’s Alexa on an Echo device.

According to a report from CSG Systems International, young adult millennials consider AI “a major priority” for their smartphones, with over half saying they would pay more for a device with AI features and functionality. In two years, it’s estimated that there will be 6 billion smartphones in use.

Clearly, there is growing customer demand and a great deal of potential for business related to on-device AI.

The Benefits of AI at the Edge

Benefits of inferencing at the edge in comparison to the cloud include:

- Lower latency response times
- Reduced bandwidth costs
- Minimized storage costs
- Improved privacy and security

Challenges with AI at the Edge

As with any device at the edge, implementing AI involves a tradeoff between performance and power consumption. Frequently, a neural network model that performs well in the cloud will run more slowly and drain the battery faster once it is deployed on an edge device. Trying out different hardware options to find the right chip can help.

Changes in Hardware – the Computer Chip

Computer chips comprise the processing and memory power of the digital computer. Back in the early 1970s, there were only a few thousand transistors per chip, whereas in 2006 there were one billion. As of 2017, the largest transistor count in a single-chip processor was 19.2 billion (AMD’s Zen-based Epyc).

The rate at which transistor counts have multiplied tends to follow Moore’s law, which notes that the count of transistors doubles approximately every two years. Nanotechnology is expected to make the chips correspondingly more powerful as the transistors get even smaller.
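The doubling rule can be checked against the figures quoted above with a short calculation. This is a rough sketch: the function names are illustrative, and the starting point of 2,300 transistors in 1971 (the Intel 4004) is an assumption standing in for the article's "a few thousand" in the early 1970s.

```python
import math


def projected_count(start_count, start_year, target_year, doubling_years=2.0):
    """Transistor count projected by doubling every `doubling_years` years."""
    return start_count * 2 ** ((target_year - start_year) / doubling_years)


def implied_doubling_period(start_count, start_year, end_count, end_year):
    """Doubling period, in years, implied by two observed data points."""
    doublings = math.log2(end_count / start_count)
    return (end_year - start_year) / doublings


# Assumed starting point: 2,300 transistors in 1971 (not stated in the article).
period = implied_doubling_period(2300, 1971, 1e9, 2006)
```

Going from a few thousand transistors to one billion takes between 18 and 19 doublings, so with the assumed starting point the implied doubling period comes out a little under two years, in line with the rule of thumb.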

Market Solutions, “A Silicon Tsunami”

Most semiconductor companies are working on AI in one form or another, though primarily on inference processing at the network edge. Reportedly, as many as 50 companies are already selling or getting ready to sell some kind of silicon AI accelerator. Some of these are IP blocks for SoCs, others are chips and a handful are systems. EE Times dubbed it “a silicon tsunami” because deep learning has spawned so much activity.

The foundational technology is still evolving. Up to 50 technical papers on AI are published daily, “and it’s going up — it couldn’t be a more exciting field,” said David Patterson, the veteran co-developer of RISC who worked on Google’s Tensor Processing Unit.

Each new piece of software or hardware suggests another. Microprocessor analyst David Kanter of Real World Technologies has noted that machine learning inevitably involves new system architectures. He cites the way tomorrow’s surveillance camera may put deep-learning accelerators next to CMOS sensors, so that raw data can be processed initially in the analog domain before being sent to an image processor as digital bits.

Various companies already have chips that deliver AI, including big names like Qualcomm, Arm and Intel and smaller cool startups like Gyrfalcon Technology and Xnor. We’ll take a look at a handful of them now.

Qualcomm

SoC Chips for Running AI at the Edge

Qualcomm, the semiconductor and telco equipment company, offers several different cores on its SoCs for running AI workloads in IoT devices at the edge:

- Qualcomm Kryo™ CPU
- Qualcomm Adreno™ GPU
- Qualcomm Hexagon™ DSP, including recently released versions with HVX

These cores work in conjunction with the Snapdragon Neural Processing Engine (SNPE), which was built to accelerate neural network processing on Snapdragon devices. The developer can select whichever core from the list above best suits the user experience being targeted.

Selecting the right core matters because the Qualcomm CPU, GPU and DSP on the Snapdragon processor each handle workloads differently. Qualcomm recommends using the Hexagon DSP for a speech detection application “while an object detection or style transfer application might be better suited to the Adreno GPU”.

Jeff Bier, a DSP analyst and founder of the Embedded Vision Alliance, says that Qualcomm’s Snapdragon chips are capable of delivering “tremendous performance on AI jobs if you know what you are doing”.

The Snapdragon 845 Mobile AI Platform

Qualcomm has built a third-generation mobile AI platform designed to “significantly improve your processing speed and boost your mobile experiences related to camera, gaming, XR, voice recognition and more”.

The Snapdragon 845 Mobile Platform introduces the new Qualcomm Hexagon 685 Vector DSP Architecture, in addition to GPU and CPU optimizations, which in combination deliver “up to three times faster processing of neural networks running on-device compared to the previous generation SoC”. Qualcomm also promises that AI applications will run with speed, reliability and robust security to “deliver compelling user experiences”.

Snapdragon 845 supports a range of AI applications, including landmark detection, face detection and specialist visual effects, such as bokeh.

Snapdragon 820A, Machine Learning and Connected Cars

The Snapdragon 820A processor was announced at CES 2017 as part of a wider set of initiatives designed to take Qualcomm further into the connected car space. The Snapdragon 820A supports multiple operating systems and frameworks and is designed to let automotive companies integrate machine learning-based Informational ADAS and User Interface personalization into cars with the Qualcomm Snapdragon Neural Processing Engine. According to Qualcomm, the new set of initiatives, including the powerful processor, “opens the door to a slew of innovative informational ADAS use cases, from intelligent surround view monitoring with sensing and drivable space calculation to natural language processing/understanding, HMI customization, and personalized user profiles”.

Xnor

Xnor, a Seattle-based startup that advertises itself as offering “AI everywhere, on every device”, has a patented technology for rendering machine learning models as operations that can be performed quickly by many different types of processor. The company has generated real buzz because the potential savings in speed, power and memory are vast, enabling devices with low-end CPUs to perform substantive tasks such as real-time object recognition and tracking that would usually require far more processing power.
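The article does not describe Xnor's patented method, so the following is not that method. It is a sketch of one general idea from the research literature on binarized neural networks, which the company's name echoes: when weights and inputs are restricted to +1 and -1, a dot product reduces to a bitwise XNOR followed by a bit count, operations that even very simple processors handle quickly. The function names are hypothetical.

```python
def pack_signs(values):
    """Pack a list of +1/-1 values into an int, one bit per value (+1 -> 1)."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors using XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask  # XNOR: bit is 1 where the signs match
    matches = bin(agree).count("1")    # popcount of the agreeing positions
    # Each match contributes +1 and each mismatch -1: matches - (n - matches).
    return 2 * matches - n


def reference_dot(a, b):
    """Ordinary multiply-and-add dot product, for comparison."""
    return sum(x * y for x, y in zip(a, b))
```

Both functions give the same result on +1/-1 vectors, but the binary version replaces n multiplications with a single XOR, a NOT and a bit count per machine word, and stores each value in one bit instead of a 32-bit float.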

Gyrfalcon Technology

Last month, Gyrfalcon Technology announced its production-ready 22nm ASIC chip with embedded MRAM, the Lightspeeur 2802M. According to the press release, “The 2802M becomes the first AI Accelerator to deliver the many benefits of MRAM such as non-volatility and low power, as well as significant advancements specific to Edge AI. The 2802M includes 40MB of memory which can support large AI models or multiple AI models within a single chip.” Potential use cases the chip can support include voice identification, image classification, voice commands, pattern recognition, facial recognition, “and many others”.

Facebook

In the spring of 2018, Bloomberg reported on signs that Facebook was working on making its own chips, including job listings and discussions with “people familiar with the matter”. The social media giant is already working on its own smart speakers, which could be improved by custom chipsets. By using its own processors, the company would gain greater control over product development and would be better able to tune its software and hardware to work together. A job listing referred to “expertise to build custom solutions targeted at multiple verticals including AI/ML,” suggesting that the chip work would likely focus on a processor for AI tasks.

The news is not surprising given that Facebook’s machine learning guru is deep learning veteran Yann LeCun. In a 2015 interview with Barron’s, published in an article titled “Watch out Intel, Here Comes Facebook”, LeCun noted that Facebook was receptive to Intel or another such vendor constructing its own neural-network processor, but he also cautioned, “If they don’t, then we’ll have to go with an industry partner who will build hardware to specs, or we’ll build our own.”

Conclusion

For several years, there has been tension between the world’s biggest tech companies, including Facebook, Apple, Baidu, Alibaba, Amazon and Microsoft, and the chip companies they rely on, in particular Intel, Qualcomm and Nvidia. The giants buy massive amounts of the chip companies’ microprocessors and graphics chips to power their data centers; however, the two camps are on opposing sides in an arms race to deliver the best AI and ML capabilities.

There has always been a chance that the giants would decide to build their own custom chips so that they would need to rely less on third-party off-the-shelf parts. Last year’s rumor that Facebook was working on its own chips seems highly plausible, as Facebook could use them to power hardware devices, AI software and the servers in its own data centers. In 2010, Apple began to ship its own chips and now uses them across its major product lines. Alphabet’s Google has also developed its own AI chip. The primary reason for custom chips is that the tech giants can have more control over their own software and hardware, and as chips for machine learning depend heavily on algorithms and data (IP which is the stock-in-trade of companies like Facebook), it makes sense that they should also be the ones to build them.

Regardless of who builds what next, it is clear that AI at the edge is a booming market. It is also likely, however, that the edge will not completely dominate inference processing, which will continue to happen in the cloud and in data centers, as well as in some vertical-specific locations such as self-driving cars.