The “brain” of any modern smartphone is its SoC (system-on-chip), an incredibly complex micromachine. It packs a CPU and GPU, audio and video processing, wireless communication, and power management on a piece of silicon the size of a fingernail. SoCs keep getting more sophisticated, and one recent addition is the NPU, which accelerates AI functions.
Your phone likely has an NPU if it’s a recent model. Even the low-cost Samsung Galaxy A25, one of the best budget Android phones, includes one on its Exynos 1280 chip. But what does the NPU do, and what difference does it make? Let’s explore.
What is an NPU, and which phones have one?
NPUs have been around longer than you think
NPU stands for neural processing unit. It is a computing module in a smartphone’s SoC, alongside the CPU (central processing unit) and GPU (graphics processing unit). Recent Snapdragon, Exynos, Dimensity, and Apple A-series SoCs have NPUs, as do some desktop and laptop processors from Intel, AMD, and Apple.
Phone SoCs have had NPUs for a while. Qualcomm has offered its AI Engine, a combination of hardware and software for AI tasks, since the Snapdragon 820 in 2015. Apple introduced its Neural Engine NPU in 2017 with the A11 Bionic chip. However unfinished the features they power may feel, NPUs are more relevant today thanks to the hype surrounding AI and the features it brings.
Why do we need NPUs in phones?
The job of an NPU is to accelerate artificial intelligence and machine learning workloads. Examples include identifying people and objects in images, generating text and images, converting speech to text, translating in real time, and predicting the next word you may want to type.
You don’t need an NPU to perform these functions, but it makes them faster, more energy-efficient, and less reliant on cloud computing. The computations required by AI tasks are so specific that it makes sense to optimize a processing unit for them.
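To make that concrete, here is a minimal sketch of how an Android app can hand work to the NPU. It uses TensorFlow Lite with the NNAPI delegate, which asks Android to schedule supported operations on an NPU when one is available (falling back to the GPU or CPU otherwise). The model file and tensor shapes are placeholders, not a real model:

```kotlin
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.nnapi.NnApiDelegate
import java.io.File

// Runs one inference through TensorFlow Lite with the NNAPI delegate,
// which routes supported operations to the NPU if the device has one.
fun classify(modelFile: File, pixels: FloatArray): FloatArray {
    val npuDelegate = NnApiDelegate()
    val options = Interpreter.Options().addDelegate(npuDelegate)

    Interpreter(modelFile, options).use { interpreter ->
        // Hypothetical 1,000-class image classifier; shapes must match the model.
        val scores = Array(1) { FloatArray(1_000) }
        interpreter.run(arrayOf(pixels), scores)
        npuDelegate.close()
        return scores[0]
    }
}
```

The app's code doesn't change depending on where the math runs; the delegate decides at runtime which processor handles each operation.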
If you’re curious and love math, A.C.C. Coolen at King’s College London dives deep into the mathematics of neural networks. Michael Stevens of Vsauce fame also demonstrates a working neural network in a YouTube video. Notice how the individual operations are basic but numerous, and how they must execute simultaneously for the network to work.
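For a feel of what those "basic but numerous" operations look like, here is a toy sketch (our own illustration, not from either source above) of a single fully connected layer. Each output neuron is just a long run of multiply-accumulate steps plus a simple activation; an NPU's trick is running thousands of these in parallel:

```kotlin
// One fully connected layer: every output neuron is a chain of
// multiply-accumulate (MAC) operations followed by a ReLU activation.
// Sizes are arbitrary here; real models repeat this millions of times.
fun denseLayer(input: FloatArray, weights: Array<FloatArray>, biases: FloatArray): FloatArray =
    FloatArray(weights.size) { neuron ->
        var sum = biases[neuron]
        for (i in input.indices) {
            sum += weights[neuron][i] * input[i] // one MAC operation
        }
        maxOf(0f, sum) // ReLU: keep positive signals, zero out the rest
    }
```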
CPU vs. NPU: How are they different?
The CPU is a general-purpose unit that can do one or a few complex math operations fast and with high precision. AI jobs, however, require many simple calculations to run in parallel, and precision isn’t as important. A GPU is a better fit than a CPU thanks to its parallel design, but an NPU excels on efficiency, as IBM points out: it can deliver similar AI performance while using a fraction of the energy, which makes it ideal for battery-powered mobile devices.
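The precision point deserves a quick illustration. Many NPUs run models quantized to 8-bit integers instead of 32-bit floats, trading a little accuracy for far more math per watt. A rough sketch of the idea, with a made-up scale factor:

```kotlin
// Quantization in miniature: squeeze a 32-bit float into one byte by
// dividing by a scale factor. The scale (0.02) is an arbitrary example.
fun quantize(value: Float, scale: Float): Byte =
    (value / scale).toInt().coerceIn(-128, 127).toByte()

fun dequantize(q: Byte, scale: Float): Float = q * scale

fun main() {
    val scale = 0.02f
    val weight = 0.4137f
    val q = quantize(weight, scale)                 // stored as a single byte
    println("original=$weight approx=${dequantize(q, scale)}") // approx=0.4
}
```

The recovered value (0.40) isn't exact, but for a neural network's voting-among-thousands-of-neurons style of math, "close enough" is usually good enough.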
The benefits of on-device AI
Another advantage of having an NPU in the SoC is that it can perform some AI operations on the device instead of in the cloud, which can be slower. This makes sense for lighter loads like speech-to-text conversion. It’s also ideal when sensor input is involved and immediate results are expected, such as detecting objects in a scene in the camera app. In these cases, the AI model, the code that processes your input, is stored locally. Applications like the Google Pixel Studio image generator use a hybrid approach, leveraging local and cloud AI models, as sketched below.
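Here is a hand-wavy sketch of that hybrid pattern: the app tries the on-device model first and only reaches for the cloud when a request is too heavy for local hardware. The routing rule and both functions are hypothetical stand-ins, not how Pixel Studio actually decides:

```kotlin
// Hybrid AI routing in miniature: short prompts stay on the device
// (fast, private, works offline); long ones go to a hosted model.
// The 200-character threshold is an invented example rule.
fun generate(
    prompt: String,
    runOnDevice: (String) -> String, // backed by the NPU-accelerated local model
    runInCloud: (String) -> String,  // network call to a larger hosted model
): String =
    if (prompt.length < 200) runOnDevice(prompt) else runInCloud(prompt)
```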
On-device AI is also great for privacy. The personal data you provide (in speech, text, or video form) does not need to leave your phone, which removes the risk of bad actors accessing it through a server-side data breach.
How is a Google Pixel’s TPU different?
If you look at the specs page for the Google Pixel 9, a phone with heavily promoted AI features, you won’t find any mention of an NPU. That’s because it uses a TPU (tensor processing unit).
Like an NPU, a TPU accelerates AI calculations. The difference is that TPUs are custom-designed by Google, so you’ll find them only in Google hardware and the company’s data centers. Tensor processing units are optimized for TensorFlow, an open source software library developed by Google for machine learning and AI applications.
What are NPU TOPS?
Like horsepower, but for NPU chips
While most new phones have an NPU, some perform AI computations faster than others. TOPS (trillions of operations per second) is the common measurement of AI processor performance. Qualcomm explains that two factors determine an NPU’s TOPS: the frequency (clock speed) at which it runs and the number of MAC (multiply-accumulate) units at its disposal, which combine in the back-of-the-envelope math below.
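Following Qualcomm's description, the rule of thumb is TOPS = 2 × MAC units × clock speed, since one MAC counts as two operations (a multiply plus an add). The unit count and clock below are invented example figures, not any real chip:

```kotlin
// TOPS = 2 ops per MAC × number of MAC units × clock speed (Hz) ÷ 10^12.
// Both inputs below are made-up numbers for illustration only.
fun tops(macUnits: Long, clockHz: Double): Double = 2 * macUnits * clockHz / 1e12

fun main() {
    println(tops(16_384, 1.2e9)) // 16,384 MAC units at 1.2 GHz ≈ 39.3 TOPS
}
```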
The recently announced Snapdragon 8 Elite chip is touted as having 45% better AI performance than its predecessor, the Snapdragon 8 Gen 3. As for the latter, it tops out at 45 TOPS. That isn’t much next to the 1,300+ TOPS delivered by the high-end Nvidia RTX 4090 desktop graphics card. Then again, a phone doesn’t suck 450 watts like Nvidia’s beast.
It’s difficult to put these numbers into context since TOPS requirements for AI tasks are rarely published. For reference, Microsoft requires an NPU with a minimum of 40 TOPS for its Copilot+ PC experiences.
Are NPUs here to stay?
Given their highly specialized nature, neural processing units are not on track to replace CPUs or GPUs. Instead, they’re meant to improve the efficiency of mobile SoCs by taking on AI tasks while saving battery power. With AI becoming integrated into smartphones, we’ll hear more about NPUs in the future. For now, check out our favorite Samsung Galaxy AI features to explore what AI can do for you today.