
Analog Neuromorphic Processors for ASICs/SoCs Offer Microwatt Edge AI

We live in exciting times with respect to AI and the devices used to implement inferencing at the edge, where the “internet rubber” meets the “real-world road,” as it were.

It reminds me of those distant days in the 1970s when we were all “feeling our way” with 8-bit microprocessors. I’m thinking of devices like the Intel 8008 (1972), Intel 8080 (1974), Motorola 6800 (1974), MOS Technology 6502 (1975), Zilog Z80 (1976), and Motorola 6809 (1978).

If I might wax eloquent for a moment, life in those days was tremendously exciting for microprocessor mavens. Nothing was standardized, and every processor architect seemed determined to reinvent computing from first principles.

The first device I was exposed to relied on a single, lonely register called the accumulator (ACC) to do all the heavy lifting. This seemed to make perfect sense at the time, so you can only imagine my surprise when I discovered another device with two accumulators (ACCA and ACCB). And I was only just getting used to this concept when I was introduced to a device that spread the work across seven 8-bit registers called A, B, C, D, E, H, and L, with A doubling as the accumulator (plus the ability to pair BC, DE, and HL to act as 16-bit registers).

To make things even more intriguing, each microprocessor family came with its own instruction set architecture (ISA) and implementation philosophy that managed to mix quirks, oddities, and flashes of sublime brilliance in equal measure: the 8080’s straightforward one-, two-, and three-byte instruction formats, the 6502’s elegant addressing modes, the Z80’s baroque but immensely practical “extension set,” and the 6809’s almost shockingly sophisticated design. They all approached the same fundamental problems in utterly different ways.

As an aside, the reason I said “extension set” rather than “instruction set” in the context of the Z80 is that this device was designed to be binary-compatible with the Intel 8080, meaning it had to start with the entire 8080 instruction set unchanged. But the guys and gals at Zilog didn’t stop there. In a furor of enthusiasm, they added over 80 new instructions and several new registers. These additions weren’t part of the 8080 ISA; they were effectively extensions that were “bolted on top.”

The trigger for my meandering musings above is that I’m experiencing a sort of déjà vu regarding the current state of play with edge AI inference devices. Just as in the 1970s when every new microprocessor seemed to spring from an entirely different architectural philosophy, today’s edge-AI world is awash with wildly inventive approaches to doing the same fundamental job.

At a high level of abstraction, everything relies on the same underlying idea that’s been with us since the dawn of artificial neural networks (ANNs). We have layers of artificial “neurons,” each happily accepting inputs from multiple upstream sources (many-to-one) and cheerfully fanning their outputs out to multiple downstream consumers (one-to-many).

These ANNs may be implemented using digital or analog techniques. Digital ANNs represent values numerically (integers, floats, fixed-point), while analog ANNs represent values as physical quantities (voltages, currents, charges, resistances). This tells us how the math is done, but not what kind of neuron model is used.

When it comes to the computational model, we have non-spiking ANNs that use continuous-valued activations, or spiking ANNs that use discrete-time or continuous-time events (“spikes”). This tells us how the neurons behave, but not how the math is physically implemented.

To put this another way, analog vs. digital determines the medium of computation, while spiking vs. non-spiking determines the model of computation. These axes are independent, so any combination is possible, and all four exist in real hardware:

  • Digital, non-spiking (the mainstream)
  • Digital, spiking
  • Analog, non-spiking
  • Analog, spiking

Irrespective of how they are implemented, at the heart of each neuron is a multiply-accumulate (MAC) function that takes a bunch of weighted inputs, accumulates them, adds a bit of bias, and pushes the result through a non-linear “squiggle” (I hope I’m not getting too technical) we call an activation function.
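
To make that concrete, here is a minimal sketch in Python (plain NumPy, not tied to any particular vendor or framework) of what a single non-spiking neuron computes: multiply each input by its weight, accumulate, add a bias, and push the result through the activation function.

```python
import numpy as np

def neuron(inputs, weights, bias):
    """One artificial neuron: multiply-accumulate the weighted inputs,
    add a bias, then apply a nonlinear activation."""
    acc = np.dot(weights, inputs) + bias   # the MAC (plus bias)
    return np.tanh(acc)                    # the nonlinear "squiggle"

# A neuron with three weighted inputs (values chosen purely for illustration)
x = np.array([0.5, -1.2, 0.3])
w = np.array([0.8,  0.1, -0.4])
print(neuron(x, w, bias=0.2))              # a single activation value
```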

In the digital domain, MAC units are built from binary multipliers (arrays of adders, partial-product generators, and carry-propagate logic) followed by an adder or accumulator register. Even when cleverly optimized, these structures are large, clock-driven beasts, full of transistors switching on every tick of the silicon metronome.

By contrast, analog MACs are almost ridiculously elegant. Instead of vast armies of adders and clocked logic gates, analog designers let Mother Nature do the math for free. Weighted multiplication comes “for the cost of a resistor” (or a transistor’s transconductance), and accumulation happens automatically as currents or charges merge according to Ohm’s law and Kirchhoff’s rules. The result is tiny analog MACs that are astonishingly power-frugal (we’re talking about microwatts).
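
To see just how little machinery is involved, here’s a toy numerical model (my illustration, not POLYN’s circuit) that treats each weight as a conductance (1/R): multiplying an input voltage by a conductance yields a current, and when those currents meet at a shared node, Kirchhoff’s current law performs the accumulation for free.

```python
# Toy model of an analog MAC: weights are conductances (1/R), inputs are voltages,
# and the weighted sum appears as the total current flowing into a shared node.
# Component values are hypothetical and chosen purely for illustration.
resistances_ohms = [10e3, 22e3, 47e3]   # each resistor encodes one weight
voltages_v       = [0.5, -0.2, 0.8]     # each voltage encodes one input

node_current_a = sum(v / r for v, r in zip(voltages_v, resistances_ohms))
print(f"Accumulated node current: {node_current_a * 1e6:.2f} uA")
```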

The real reason I’m waffling on about all this is that I was just chatting with Aleksandr Timofeev, the CEO of POLYN Technology. Headquartered in Bristol, UK, POLYN also has teams located in France, Israel, and the USA.

Aleksandr was kind enough to bring me up to speed on POLYN’s new NASP technology, which was formally unveiled at CES Europe just a couple of weeks ago as of this writing. NASP, which stands for Neuromorphic Analog Signal Processing, is POLYN’s proprietary hardware and design tool ecosystem that enables a trained digital neural network model to be translated into an analog neuromorphic core implemented in CMOS.

STOP! Re-read the previous paragraph. In a crunchy nutshell, you can take an existing ANN defined in something like TensorFlow or PyTorch and, using POLYN’s tools, automatically convert it into an ultra-low-power analog realization. These tools automatically implement sparsity and pruning, removing any nodes that don’t significantly contribute to the result.
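
POLYN hasn’t published the internals of its flow, but the general flavor of magnitude-based pruning (zeroing out the weights that contribute least to the result) can be sketched in a few lines of PyTorch. This is a generic illustration, not POLYN’s actual toolchain, and the layer sizes and 50% pruning ratio are arbitrary choices of my own.

```python
import torch
import torch.nn.utils.prune as prune

# A stand-in network; the layer sizes and pruning ratio are arbitrary choices.
model = torch.nn.Sequential(
    torch.nn.Linear(200, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)

# Magnitude (L1) pruning: zero the smallest 50% of weights in each Linear layer.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")   # bake the zeros into the weight tensor

sparsity = (model[0].weight == 0).float().mean().item()
print(f"First-layer sparsity: {sparsity:.0%}")
```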

NASP’s MACs are implemented as op-amps, while weights are realized as resistors, all in standard CMOS using standard PDKs. Now, this is the important bit: most of the ANNs used to implement inference engines for edge AI—whether analog, digital, spiking, or non-spiking—are built from general-purpose arrays of neurons whose behavior can be changed by updating weights (and often biases), effectively “reprogramming” the network’s function.

By comparison, POLYN’s ANNs are physically synthesized into custom analog circuitry, where the topology and weights are baked directly into the silicon. Instead of loading new parameters into a general-purpose engine, you quite literally fabricate a bespoke, task-specific inference engine with minimal overhead and ultra-low power consumption.
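
Returning to the “MACs are op-amps, weights are resistors” point for a moment, the textbook embodiment of that idea is the inverting summing amplifier, in which each weight is set by a resistor ratio. The sketch below is an illustration of the principle, not necessarily POLYN’s exact topology, and the component values are hypothetical.

```python
# Textbook inverting summing amplifier: Vout = -Rf * (V1/R1 + V2/R2 + ...),
# so each input is weighted by -Rf/Ri. Values are hypothetical, for illustration only.
r_feedback_ohms = 100e3
r_input_ohms    = [100e3, 200e3, 50e3]   # these set weights of -1.0, -0.5, and -2.0
v_inputs_v      = [0.10, -0.05, 0.20]

v_out = -r_feedback_ohms * sum(v / r for v, r in zip(v_inputs_v, r_input_ohms))
print(f"Vout = {v_out:.3f} V")           # -(0.10*1.0 - 0.05*0.5 + 0.20*2.0) = -0.475 V
```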

While many of us have grown accustomed to the idea that everything must be endlessly reprogrammable, the truth is that a huge number of real-world applications don’t need to change from one day to the next. If all we want is to know whether a human is speaking (or not), for example, then a bespoke, ultra-efficient inference engine is a perfect fit.

It’s important to realize that we are talking about creating an ANN to be embedded in an ASIC or SoC design (except when we aren’t, as noted at the end of this column). According to Aleksandr, a neural net that can detect a human voice requires about 500 neurons and around a million parameters. If we decided to perform this task using a digital CPU, for example, then we would require an Arm Cortex-M3 or Cortex-M4-class machine. By comparison, a NASP implementation would require only a fraction of the silicon and consume only a fraction of the power.

One of the reasons for the NASP’s ultra-low power consumption—in addition to its analog nature—is that it’s actually powered off most of the time. In the case of detecting a human voice, for example, if we were using a NASP implementation, we might store the sound in a 10-millisecond “chunk” containing 200 samples. While we are collecting and storing these samples, our neural core is completely powered down. Once we have our “chunk,” we power up the core and stream all 200 samples through it. The propagation through the core, resulting in a “Yay” or “Nay” on the voice detection front, takes only around 10 microseconds, after which the core is powered down again. Meanwhile, the next “10 millisecond chunk” is being buffered.
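
To see why this duty cycling is such a big deal, here’s a back-of-the-envelope calculation. The 10-millisecond chunk period and 10-microsecond inference time come from Aleksandr’s voice-detection example; the active and sleep power values are placeholder assumptions of my own, not POLYN specifications.

```python
# Duty-cycle arithmetic for the voice-detection example.
# The power values below are illustrative assumptions, not POLYN figures.
chunk_period_s   = 10e-3    # a new 10 ms chunk of 200 audio samples
inference_time_s = 10e-6    # the core is powered up for ~10 us per chunk
p_active_w       = 100e-6   # assumed power while the core is crunching (100 uW)
p_sleep_w        = 1e-6     # assumed leakage while the core is powered down (1 uW)

duty_cycle = inference_time_s / chunk_period_s                       # 0.1%
p_average  = duty_cycle * p_active_w + (1 - duty_cycle) * p_sleep_w  # ~1.1 uW
print(f"Duty cycle: {duty_cycle:.2%}, average power: {p_average * 1e6:.2f} uW")
```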

A similar process can be applied to other tasks, such as processing video frames. Let’s assume an input rate of 30 frames per second. For each frame, we could power up the core, process the frame in 10 microseconds, and then power down the core again.

As I’ve mentioned on occasion, I’m a digital design engineer by trade, and I have an inherent distrust of the wobbly-wobbly nature of analog. Based on this, I had concerns regarding the accuracy of NASP-based inferencing engines. Aleksandr explained things as follows…

If someone trains a voice-detection neural net in the digital domain with a full-precision floating-point model, this model may achieve something like 96% accuracy. Aleksandr says that for most clients operating in a noisy environment, 96% accuracy is more than enough (many clients are happy with anything over 90%). When this model is subsequently transformed into a NASP core, there may be a 2% deviation. This doesn’t mean that the accuracy falls from 96% to 94%; we’re talking about a 2% deviation from the original 4% error, which is almost negligible, however you look at it.
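
Just to put numbers on that (this is my arithmetic based on Aleksandr’s description, not an official POLYN figure), reading the 2% deviation as a relative change in the error rate rather than in the accuracy gives the following:

```python
# Interpreting the 2% deviation as a relative change in the 4% error rate (my reading).
fp_accuracy = 0.96                    # full-precision digital model
fp_error    = 1 - fp_accuracy         # 4% error rate
nasp_error  = fp_error * 1.02         # a 2% deviation on the error, not on the accuracy
print(f"NASP accuracy is roughly {1 - nasp_error:.2%}")   # ~95.92%, nowhere near 94%
```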

As is usually the case, I fear I’ve only skimmed the surface of this amazing technology. Some additional points I feel should be at least mentioned are as follows:

  • As opposed to fixed neurons whose functions are frozen in silicon, NASP cores can also include configurable analog neurons. These are larger, but they allow updating small portions of the network, such as the final classification layer.
  • POLYN’s compiler includes error modeling and correction during synthesis that predicts the accuracy of the core before you ever tape out.
  • We talked about only a ~500-neuron core in this column, but NASP supports networks up to ~50k neurons / 10M parameters today.
  • NASP can be delivered as a hard macro (GDS), as an entire chip, or as a small coprocessor die (chiplet) for multi-die packaging.

If any of this has piqued your interest, then I heartily encourage you to wander over to POLYN’s website and take a deeper dive. Whether you’re wrestling with edge-AI power budgets, dreaming of sensor-side intelligence, or simply curious to see how far analog neuromorphic processing has come, POLYN’s NASP technology is well worth a look.

4 thoughts on “Analog Neuromorphic Processors for ASICs/SoCs Offer Microwatt Edge AI”

  1. Hi Max,

    Thanks for the article – consider my interest piqued!! I have taken a look at POLYN’s website as suggested. Interesting stuff. Machine condition monitoring is big business. I toyed with this some years ago using what mainline technology was available at the time – basically monitoring electrical machine supply lines and vibration sensors. Back then, we called it “sensor fusion”. Low-power ANN technology could be a game-changer.
    Thanks again.

    1. Hi RedBarnDesigner — thanks so much for your comment. I can see all sorts of applications for this technology — it’s amazing what you can do with even a single axis of input (not that they are limited to a single axis) — for example, I created my first AI/ML app using only the output from a current sensor that could accurately tell me when my Hoover’s bag needed emptying https://www.eejournal.com/article/i-just-created-my-first-ai-ml-app-part-2/

      1. Hi again Max,
        Thanks for reminding me of the URL to that project – I remember it being very interesting and a few weeks ago I wanted to just refresh my memory but couldn’t remember what the article was called. I’ll add it to my list of useful URLs!
        In fact I will take another look now.
        Thanks again for another really interesting article. We live in interesting times.

        1. “We live in interesting times.”

          Truer words… wait until you see my next column on Tuesday, December 2 (I’m taking tomorrow/Thursday off to celebrate Thanksgiving)… I think you will be surprised… 🙂
