
Neuromorphic Revolution

Will Neuromorphic Architectures Replace Moore’s Law?

According to tech folklore, Carver Mead actually coined the term “Moore’s Law” – some ten years or so after the publication of Gordon Moore’s landmark 1965 Electronics Magazine article “Cramming More Components Onto Integrated Circuits.” For the next five and a half decades, the world was reshaped by the self-fulfilling prophecy outlined in that article. Namely, that every two years or so, semiconductor companies would be able to double the number of transistors that could be fabricated on a single semiconductor chip. 

That biennial doubling of transistors led most noticeably to an even faster exponential increase in computation power. Besides getting more transistors from Moore’s Law, we got faster, cheaper, and more power-efficient ones. All of those factors together enabled us to build faster, more complex, higher-performance computing devices. In 1974, Robert Dennard observed that power efficiency in computing would increase even faster than transistor count, because of the triple-exponential improvement in density, speed, and energy efficiency as process geometry scaled downward. This trend, known as “Dennard Scaling,” stayed with us for around three decades, and compute performance (and more importantly, power, it turns out) rode an unprecedented exponential improvement rocket.
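To see why Dennard Scaling kept power density in check, here is a minimal sketch of the classic textbook scaling rules (my own illustration with made-up numbers, not figures from Dennard’s paper): when linear dimensions and supply voltage both shrink by a factor k, per-transistor power falls by roughly k², exactly offsetting the k² increase in transistor density.

```python
# Textbook Dennard-scaling sketch (illustrative numbers only).
# Dynamic power per transistor is roughly P = C * V^2 * f.

def scale_one_node(C, V, f, k=1.4):
    """Apply one classic Dennard scaling step: dimensions and voltage shrink by 1/k."""
    C_new, V_new, f_new = C / k, V / k, f * k
    P_old = C * V**2 * f
    P_new = C_new * V_new**2 * f_new        # per-transistor power drops by ~1/k^2
    density_gain = k**2                     # ~2x more transistors per unit area
    power_density_ratio = (P_new * density_gain) / P_old   # ~1.0, i.e. roughly constant
    return P_new / P_old, density_gain, power_density_ratio

print(scale_one_node(C=1.0, V=1.0, f=1.0))  # (~0.51, ~1.96, ~1.0)
```

When Dennard Scaling ended around 2005, the voltage term stopped shrinking, and that power-density ratio began climbing instead of holding steady.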

All of this compute power improvement was built on top of the Von Neumann processor architecture developed in 1945 by John Von Neumann (and others), documented in an unfinished report called “First Draft of a Report on the EDVAC.” So, ironically, the most impressive technological revolution in history was built on a half-century-old design from an unfinished paper. With all the remarkable advances in digital computation during the Moore’s Law era, the now 75-year-old fundamental compute architecture has remained largely unchanged.

Is the Von Neumann architecture simply the best possible way to do computation? Of course not. To mis-paraphrase Winston Churchill, Von Neumann is “the worst possible compute architecture – except for all the others.” On a more serious note, the good thing about Von Neumann is its flexibility and area efficiency. It can handle just about any arbitrarily complex application without requiring the processor to scale in transistor count with the size of the problem. 

In the old days, before we could cram so many components onto integrated circuits, that architectural efficiency of Von Neumann was a big deal. We could build a 4-, 8-, or 16-bit Von Neumann processor out of a very small number of transistors and run giant programs at acceptable speed. But now, in the wake of Moore’s Law, transistors are asymptotically approaching zero cost. So, with almost infinite numbers of free transistors available, the value of building processors with comparatively small numbers of transistors has dropped significantly.

At the same time, even with Moore’s Law going full steam, the value extracted from each subsequent process node has decreased. Dennard Scaling came to an end around 2005, forcing us to switch from building bigger/faster Von Neumann processors to building “more” Von Neumann processors. The race became cramming more cores onto integrated circuits, and scaling Von Neumann to multiple cores brought its own limitations.

Unfortunately, Moore’s Law didn’t continue going full steam. Each of the last several process nodes has cost exponentially more to realize and has yielded proportionally less in tangible benefits. The result is that, even though we should technically be able to build more dense chips for several more generations, the cost/benefit ratio of doing so makes it a less and less attractive enterprise. We now need a driver other than Moore’s Law to maintain the pace of technological progress.

Clearly, we are also reaching the end of the useful life of Von Neumann as the single, do-all compute architecture. The recent AI revolution has accelerated the development of alternatives to Von Neumann. AI, particularly when done with convolutional neural networks, is an unbelievably compute-intensive problem that is uniquely unsuited to Von Neumann. Already, an industry-wide trend is afoot, shifting us away from large arrays of homogeneous compute elements to complex configurations of heterogeneous elements, including Von Neumann and non-Von Neumann approaches.

One of the more promising non-Von Neumann approaches to AI is the Neuromorphic architecture. In the late 1980s, Carver Mead (yup, the same guy who supposedly coined the term “Moore’s Law”) observed that, on the current trajectory, Von Neumann processors would use millions of times more energy than the human brain uses for the same computation. He theorized that more efficient computational circuits could be built by emulating the neuron structure of the human brain. Mead drew an analogy between ion flow in neurons and current flow in transistors, and proposed what came to be known as Neuromorphic computing based on that idea.

At the time, Neuromorphic computing was visualized as an analog affair, with neurons triggering one another with continuously varying voltages or currents. But the world was firmly on the path of optimizing the binary universe of digital design. Analog circuits were not scaling at anything like the digital exponential, so the evolution of Neuromorphic computing remained outside the mainstream trajectory of Moore’s Law.

Now, however, things have changed. 

Over the longer term, we have seen most analog functions subsumed by digital approximations, and neuromorphic processors have come to be implemented with what are called “spiking neural networks” (SNNs), which rely on single-bit spikes from each neuron to activate neurons down the chain. These networks are completely asynchronous, and, rather than sending values, the activation depends on the timing of the spikes. This technique lets neuromorphic processors take advantage of current leading-edge bulk CMOS digital technology, which means neuromorphic architectures can finally reap the rewards of Moore’s Law. As a result, several practical neuromorphic processors have been built and tested, and the results are impressive and encouraging.
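To make the spiking idea concrete, here is a minimal sketch of a leaky integrate-and-fire neuron, the building block most SNNs are based on (a generic textbook model in Python, not modeled on any particular chip): incoming single-bit spikes accumulate on a membrane potential that slowly leaks away, and the neuron emits its own spike only when that potential crosses a threshold, so information is carried by when spikes happen rather than by numeric values.

```python
# Minimal leaky integrate-and-fire neuron: illustrative only, not any vendor's model.
class LIFNeuron:
    def __init__(self, threshold=1.0, leak=0.9):
        self.potential = 0.0
        self.threshold = threshold
        self.leak = leak  # fraction of membrane potential retained each timestep

    def step(self, spikes_in, weights):
        """spikes_in: list of 0/1 spikes; weights: synaptic weights."""
        self.potential *= self.leak
        self.potential += sum(s * w for s, w in zip(spikes_in, weights))
        if self.potential >= self.threshold:
            self.potential = 0.0   # reset after firing
            return 1               # emit a single-bit spike downstream
        return 0

# Example: a neuron with three inputs, driven for a few timesteps
neuron = LIFNeuron()
weights = [0.4, 0.3, 0.5]
for spikes in ([1, 0, 1], [0, 1, 0], [1, 1, 1]):
    print(neuron.step(spikes, weights))
```

Because a neuron only does work when a spike actually arrives, most of the network sits idle at any given moment, which is where much of the claimed energy efficiency comes from.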

One example we wrote about two years ago is Brainchip’s Akida neuromorphic processor, for which development boards became available in December 2020. Brainchip claims their devices use 90 to 99 percent less power than conventional CNN-based solutions. As far as we know, this is one of the first neuromorphic technologies to enter the broad commercial market, and the potential applications are vast. Brainchip provides both IP versions of their technology and SoCs with full implementations in silicon. Just about any system that can take advantage of “edge” AI could benefit from those kinds of power savings, and it will often make the difference between doing edge AI and not.

Also in December 2020, Intel gave an update on their neuromorphic research test chip, called Loihi, as well as their “Intel Neuromorphic Research Community (INRC),” both of which were also announced two years ago. Across a wide range of applications including voice-command recognition, gesture recognition, image retrieval, optimization and search, and robotics, Loihi has benchmarked as 30 to 1,000 times more energy efficient than CPUs and GPUs, and 100 times faster. Just as importantly, the architecture lends itself to rapid and ongoing learning, in sharp contrast to CNN-based systems, which tend to have an intense training phase that creates a static model for inference. Intel says they are seeking a 1,000-times improvement in energy efficiency, a 100-times improvement in performance, and “orders of magnitude” reductions in the amount of data needed for training.

Not all problems lend themselves to neuromorphic processing. Algorithms that are well-suited to today’s deep-learning technology are obvious wins. Intel is also evaluating algorithms “inspired by neuroscience” that emulate processes found in the brain. And, finally, they are looking at “mathematically formulated” problems.

In the first category, today’s deep neural networks (DNNs) can be converted to a form usable by a neuromorphic chip. Additionally, “directly-trained” networks can be created with the neuromorphic processor itself. Finally, “back propagation,” common in CNNs, can be emulated in neuromorphic processors, despite the fact that this requires global communication not inherent to the neuromorphic architecture.
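As a hedged illustration of that first path (a common technique from the literature, not Intel’s actual conversion flow), one simple approach is “rate coding”: a trained DNN’s ReLU activations are mapped onto spike rates, so a unit that would have output a large value instead fires frequently over a time window.

```python
# Illustrative rate-coding conversion: map ReLU activations to spike trains.
# A generic textbook approach, not any specific vendor's tool flow.
import random

def activation_to_spike_train(activation, max_activation, timesteps=100):
    """Convert a non-negative activation into a stochastic spike train whose
    firing rate is proportional to the activation's magnitude."""
    rate = min(activation / max_activation, 1.0)  # spike probability per timestep
    return [1 if random.random() < rate else 0 for _ in range(timesteps)]

def spike_train_to_activation(spikes, max_activation):
    """Recover an approximate activation from the observed firing rate."""
    return (sum(spikes) / len(spikes)) * max_activation

# Example: an activation of 2.0 (out of a max of 4.0) fires roughly half the time
train = activation_to_spike_train(2.0, max_activation=4.0)
print(sum(train), spike_train_to_activation(train, 4.0))
```

The longer the time window, the more closely the observed firing rate approximates the original activation, which is the basic accuracy-versus-latency trade-off in converted SNNs.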

Loihi is a research chip, not designed for production. It is a 2-billion-transistor chip, fabricated on Intel’s 14nm CMOS process. Loihi contains a fully asynchronous “neuromorphic many-core mesh that supports a wide range of sparse, hierarchical and recurrent neural network topologies with each neuron capable of communicating with thousands of other neurons.” Each of these cores includes a learning engine that adapts parameters during operation. The chip contains 130,000 neurons and 130 million synapses, divided into 128 neuromorphic cores, and includes a microcode learning engine for on-chip training of the SNN. Loihi chips have been integrated into boards and boxes containing as many as 100 million total neurons across 768 chips.

We are now at the confluence of a number of trends that could form the perfect storm of a revolution in processor architecture. First, neuromorphic processors are at the knee of commercial viability, and they bring something like the equivalent of 10 Moore’s Law nodes (20 years) of forward progress on certain classes of problems. Second, conventional DNNs are rapidly progressing and generating architectural innovations similar to those found in neuromorphic processors, suggesting a possible convergence on a future “best of both worlds” architecture that combines traits from both architectural domains. Third, Moore’s Law is coming to a close, and that puts more emphasis, talent, and money into the development of architectural approaches to driving future technological progress. And fourth, power consumption has emerged as probably the dominant driving factor in computation – which is the single metric where these new architectural approaches excel most.

It will be interesting to watch as the first of these neuromorphic processors gains commercial traction and creates the virtuous cycle of investment, development, refinement, and deployment. It is likely that, within a few years, neuromorphic architectures (or similar derivative technologies) will have taken on a substantial role in our computing infrastructure and catapulted to the forefront new applications that can only be imagined today.

One thought on “Neuromorphic Revolution”

  1. No More “Times Lower Than”
    Hey Marketers and PR folks – this one’s for you. In researching this article I repeatedly ran into official, presumably copy-edited materials that made claims that power was “75X lower” or “30 times less than” other solutions. This is absolutely wrong.

    Your audience for these materials is engineers. Engineers are very good at math. When you use incorrect terminology like this, you damage your credibility.

    There are two distinct mistakes we see repeatedly in this area. I’ll explain both:

    First is the “less than” or “lower than” mistake, which is the most egregious. Let’s agree that “1 times” means “100%.” If I say that we were able to “reduce power by 50%,” everyone would agree that means we now use half the power we did before. If I said we reduced power by 75%, we are using a quarter of the power, and so on.
    So a 1x reduction in power, or making power “1x lower,” would mean we have reduced it by 100%. We now use zero power, which would be the best we could possibly do, unless our device now actually creates power – at which point the grammar police would turn the case over to the thermodynamics police.
    In Intel’s materials (sorry for picking on Intel, they are definitely not the only offenders), they say “…its Loihi solutions … require 75 times lower power than conventional mobile GPU implementations.”

    They probably meant to say “1/75th the power of mobile GPU implementations.” 1/75th the power would equal 1.33% of the power, or 98.67% lower power. Obviously 98.67% is almost (but not quite) 1x. So, if they really want to use “times lower power” it should have read “require almost 1 times lower power than conventional…” If they truly used 75 times lower power, it would mean they are in the energy generation business, and one Loihi actually creates 74 times the power that a GPU burns. Hmmm…

    Notice I said “74 times” and not “75 times”? That brings us to the second common mistake.

    It is valid, when we are talking about things increasing, to use “times”. But there is an important difference between “times as much” and “times more”.

    Let’s start with the “one” example again. One is clearly “one times as much” as one. But, “one times more” than one would be two. Two times more than one would be three, and so on.

    You see where this is going?

    If your old system ran at 100 MHz, and your press release says your new system has “3x more performance,” do you mean it now runs at 400 MHz (correct), or are you saying it runs at 300 MHz (nope). Saying “three times the performance” would clearly mean 300 MHz, but when you switch the language to “more” you make it incorrect, or at best ambiguous.

    Thus ends my rant. Marketers, please never again use “times more” or “times lower” for any figures of merit in your materials. Train your copy editors to flag it if you do. Send me a snarky note if you find me using one (I’m certain I have). If you must use “times” do it only when the figure of merit is increasing. And, if you do, avoid “more than.” Say “three times the” or similar.
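
    P.S. To make the arithmetic concrete, here’s a quick sketch in Python (my own illustration with made-up numbers, not taken from anyone’s marketing materials) showing what each phrasing actually implies:

```python
# What do these marketing phrases actually imply? (Illustrative numbers only.)
base_mhz = 100.0  # the old system

three_times_the = 3 * base_mhz               # "three times the performance" -> 300 MHz
three_times_more = base_mhz + 3 * base_mhz   # "3x more performance" -> 400 MHz

gpu_power_w = 10.0  # hypothetical GPU power, just for illustration
literal_75x_lower = gpu_power_w - 75 * gpu_power_w  # "75 times lower" taken literally: -740 W
probably_meant = gpu_power_w / 75                   # 1/75th the power: ~0.133 W, ~98.67% lower

print(three_times_the, three_times_more, literal_75x_lower, probably_meant)
```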

