feature article
Subscribe Now

Machine Learning For a Few Dollars

Eta Compute’s ECM3532 Chip Brings Inference to the Edge

“We have to be very prissy about how we tell computers to do things.” – Richard P. Feynman

Alpha, beta, gamma, delta… something, something… lambda… uh, omega. That’s about all I remember of the Greek alphabet. College was a long time ago and I never joined a fraternity. 

The folks at Eta Compute stayed in school, though, and got smart. They know that eta is the seventh Greek letter (between zeta and theta, natch) and that machine learning (ML) is a big deal. Can you spell MCU? I knew you could. 

This week, Eta Compute rolls out its second ML-oriented MCU, the ECM3532. The new chip is an upgrade from the debut ECM3531 device, with more performance and even lower power consumption. 

Like its immediate predecessor, the ’32 is aimed at “ML at the edge,” meaning it’s a low-cost device intended to do inference locally, rather than by sending buckets of data to some cloud-based machine that does the deep thinking remotely. It’s a good idea for IoT gadgets that need to massage images, voices, gestures, or sensor data. The trick is to make the ML hardware cheap enough while keeping the power consumption low enough. Eta thinks it’s aced both criteria, with prices in the “low single digits” and power down in the milliwatt range. These guys put the µ in microamp. 

The block diagram doesn’t give away much of the magic. In fact, it looks pretty much like any average MCU, with an ARM Cortex-M3 processor core running alongside a CoolFlux DSP licensed from NXP. Those are complemented with 512KB of flash, 256KB of SRAM, 8KB of ROM, and the usual assortment of UARTs, clocks, ADCs, and general-purpose I/O pins. It could be the poster on any MCU designer’s wall. 

It’s what’s underneath that counts. Eta Compute specializes in nonstandard low-power circuit design, a technique it calls CVFS: continuous voltage and frequency scaling. It’s an upgrade from the company’s previous DIAL (delay-insensitive asynchronous logic) methodology, but with similar goals. 

Like DIAL, CVFS relies on circuit-design tricks, not exotic semiconductor fabrication technology, to achieve low power consumption. DIAL was asynchronous; that is, there was no systemwide clock forcing every gate and latch to run in lockstep. Instead, each stage of a logic chain is joined by an asynchronous handshake signal. When one latch or flip-flop does its thing, it signals completion to the next stage, and so on. Asynchronous logic has plenty of advantages over synchronous logic, but a few disadvantages, too. Overall, the Eta Compute team decided the latter outweighed the former, so they re-thought how they’d design the next-generation ECM3532. 

Both chips can run at very low threshold voltages – like 0.25V, for example – that would make a normal synchronous design very slow and hard to manage. CVFS does away with the fully asynchronous philosophy of DIAL and replaces it with a number of self-generated clocks. It’s not fully asynchronous anymore, but it’s not a traditional synchronous design, either. Eta Compute says the new technique supports higher frequencies than DIAL did, without compromising the low-frequency power savings. The chip generates its own internal voltages as well as its own clocks, so integration with outside logic isn’t a problem. The ECM3532 can optionally run in synchronous mode with an external crystal, too, if you really need it to. 

The payoff is in the power savings, and Eta Compute says the ECM3532 consumes less than 5 µA/MHz under moderate loading, or 13µA/MHz when it’s running the Coremark benchmark. With a 3.0V supply, you’re looking at under 1 mA for many edge-ML tasks, according to the company. 

So, where does the machine learning come in? Well, that’s what the DSP is for. Neither the ECM3531 nor the new ’32 have ML accelerators as such, but they do include a DSP that should ease the task. As we noted earlier, a lot of ML inference work looks a lot like DSP filtering. Both benefit from fast MAC (multiple-accumulate) hardware, loop-intensive coding, and access to lots of memory. That pretty much describes the ECM3532 in a nutshell. 

It’s not a high-end beast designed for TensorFlow coding; the ’32 is more of a flyweight ready for TinyML. Having said that, Eta Compute does offer a software translator that converts TensorFlow to C, and from there to ECM3532 binaries. That allows developers to prototype and test their ideas using TensorFlow, and then ratchet down and refine them for the MCU. 

There are plenty of MCUs with DSPs onboard, but few are aimed at the ML market. And even fewer boast such low power numbers or use Eta Compute’s patented design methodology to get there. If “ML at the edge” becomes a thing, we can all Greek out on a new chip design. 

Leave a Reply

featured blogs
Jan 15, 2021
It's Martin Luther King Day on Monday. Cadence is off. Breakfast Bytes will not appear. And, as is traditional, I go completely off-topic the day before a break. In the past, a lot of novelty in... [[ Click on the title to access the full blog on the Cadence Community s...
Jan 14, 2021
Learn how electronic design automation (EDA) tools & silicon-proven IP enable today's most influential smart tech, including ADAS, 5G, IoT, and Cloud services. The post 5 Key Innovations that Are Making Everything Smarter appeared first on From Silicon To Software....
Jan 13, 2021
Here are some genius solutions to everyday problems you probably didn'€™t even know existed, but after you'€™ve seen them you'€™ll say '€œWow!'€...
Jan 13, 2021
Testing is the final step of any manufacturing process, and arguably the most important, and yet it can often be overlooked.  Releasing a poorly tested product onto the market has destroyed more than one reputation for quality, and this is even more important in an age when ...

featured paper

Overcoming Signal Integrity Challenges of 112G Connections on PCB

Sponsored by Cadence Design Systems

One big challenge with 112G SerDes is handling signal integrity (SI) issues. By the time the signal winds its way from the transmitter on one chip to packages, across traces on PCBs, through connectors or cables, and arrives at the receiver, the signal is very distorted, making it a challenge to recover the clock and data-bits of the information being transferred. Learn how to handle SI issues and ensure that data is faithfully transmitted with a very low bit error rate (BER).

Click here to download the whitepaper

featured chalk talk

The Wireless Member of the DARWIN Family

Sponsored by Mouser Electronics and Maxim Integrated

MCUs continue to evolve based on increasing demands from designers. We expect our microcontrollers to do more than ever - better security, more performance, lower power consumption - and we want it all for less money, of course. In this episode of Chalk Talk, Amelia Dalton chats with Kris Ardis from Maxim Integrated about the new DARWIN line of low-power MCUs.

Click here for more information about Maxim Integrated MAX32665-MAX32668 UB Class Microcontroller