Machine Learning For a Few Dollars

Eta Compute’s ECM3532 Chip Brings Inference to the Edge

“We have to be very prissy about how we tell computers to do things.” – Richard P. Feynman

Alpha, beta, gamma, delta… something, something… lambda… uh, omega. That’s about all I remember of the Greek alphabet. College was a long time ago and I never joined a fraternity. 

The folks at Eta Compute stayed in school, though, and got smart. They know that eta is the seventh Greek letter (between zeta and theta, natch) and that machine learning (ML) is a big deal. Can you spell MCU? I knew you could. 

This week, Eta Compute rolls out its second ML-oriented MCU, the ECM3532. The new chip is an upgrade from the debut ECM3531 device, with more performance and even lower power consumption. 

Like its immediate predecessor, the ’32 is aimed at “ML at the edge,” meaning it’s a low-cost device intended to do inference locally, rather than by sending buckets of data to some cloud-based machine that does the deep thinking remotely. It’s a good idea for IoT gadgets that need to massage images, voices, gestures, or sensor data. The trick is to make the ML hardware cheap enough while keeping the power consumption low enough. Eta thinks it’s aced both criteria, with prices in the “low single digits” and power down in the milliwatt range. These guys put the µ in microamp. 

The block diagram doesn’t give away much of the magic. In fact, it looks pretty much like any average MCU, with an ARM Cortex-M3 processor core running alongside a CoolFlux DSP licensed from NXP. Those are complemented with 512KB of flash, 256KB of SRAM, 8KB of ROM, and the usual assortment of UARTs, clocks, ADCs, and general-purpose I/O pins. It could be the poster on any MCU designer’s wall. 

It’s what’s underneath that counts. Eta Compute specializes in nonstandard low-power circuit design, a technique it calls CVFS: continuous voltage and frequency scaling. It’s an upgrade from the company’s previous DIAL (delay-insensitive asynchronous logic) methodology, but with similar goals. 

Like DIAL, CVFS relies on circuit-design tricks, not exotic semiconductor fabrication technology, to achieve low power consumption. DIAL was asynchronous; that is, there was no systemwide clock forcing every gate and latch to run in lockstep. Instead, each stage of a logic chain is joined by an asynchronous handshake signal. When one latch or flip-flop does its thing, it signals completion to the next stage, and so on. Asynchronous logic has plenty of advantages over synchronous logic, but a few disadvantages, too. Overall, the Eta Compute team decided the latter outweighed the former, so they re-thought how they’d design the next-generation ECM3532. 

Both chips can run at very low threshold voltages – like 0.25V, for example – that would make a normal synchronous design very slow and hard to manage. CVFS does away with the fully asynchronous philosophy of DIAL and replaces it with a number of self-generated clocks. It’s not fully asynchronous anymore, but it’s not a traditional synchronous design, either. Eta Compute says the new technique supports higher frequencies than DIAL did, without compromising the low-frequency power savings. The chip generates its own internal voltages as well as its own clocks, so integration with outside logic isn’t a problem. The ECM3532 can optionally run in synchronous mode with an external crystal, too, if you really need it to. 

The payoff is in the power savings. Eta Compute says the ECM3532 consumes less than 5 µA/MHz under moderate loading, or 13 µA/MHz when it’s running the CoreMark benchmark. With a 3.0V supply, you’re looking at under 1 mA for many edge-ML tasks, according to the company. 
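To see how those per-megahertz figures turn into a current budget, here is a back-of-the-envelope sketch in C. The function names are made up for illustration, and any clock rate you plug in is your own assumption, since the clock speed isn't given here. It also ignores regulator losses, peripherals, and time spent asleep, so treat it as rough arithmetic rather than a datasheet calculation.

```c
#include <assert.h>

/* Hypothetical helper: average supply current in microamps,
 * computed as (uA per MHz) x (clock rate in MHz). */
double supply_current_ua(double ua_per_mhz, double clock_mhz)
{
    assert(ua_per_mhz >= 0.0 && clock_mhz >= 0.0);
    return ua_per_mhz * clock_mhz;
}

/* Hypothetical helper: power in microwatts,
 * computed as current (uA) x supply voltage (V). */
double supply_power_uw(double current_ua, double supply_v)
{
    assert(current_ua >= 0.0 && supply_v >= 0.0);
    return current_ua * supply_v;
}
```

At an assumed 50 MHz, for example, `supply_current_ua(13.0, 50.0)` gives 650 µA, which squares with the company's under-1 mA claim. Feeding that result and the 3.0V supply into `supply_power_uw` gives 1,950 µW, or a shade under 2 mW.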

So, where does the machine learning come in? Well, that’s what the DSP is for. Neither the ECM3531 nor the new ’32 has an ML accelerator as such, but they do include a DSP that should ease the task. As we noted earlier, a lot of ML inference work looks a lot like DSP filtering. Both benefit from fast MAC (multiply-accumulate) hardware, loop-intensive coding, and access to lots of memory. That pretty much describes the ECM3532 in a nutshell. 
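The family resemblance between filtering and inference is easy to see in code. The sketch below is generic C, not Eta Compute or CoolFlux library code, and every function name in it is hypothetical. It shows one FIR filter output and one neural-network neuron both boiling down to the same multiply-accumulate loop over 16-bit data with a 32-bit accumulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The shared inner loop: acc += a[i] * b[i].
 * Note: a 32-bit accumulator can overflow for long vectors of large
 * values; real DSP code typically uses a wider accumulator or saturation. */
int32_t mac_dot(const int16_t *a, const int16_t *b, size_t n)
{
    assert(n == 0 || (a != NULL && b != NULL));
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}

/* One FIR output sample: filter taps dotted with a window of input
 * samples (the caller supplies the window in tap order). */
int32_t fir_sample(const int16_t *taps, const int16_t *window, size_t n)
{
    return mac_dot(taps, window, n);
}

/* One neuron of a fully connected layer: weights dotted with inputs,
 * plus a bias, followed by a ReLU activation. */
int32_t dense_neuron_relu(const int16_t *weights, const int16_t *inputs,
                          size_t n, int32_t bias)
{
    int32_t acc = mac_dot(weights, inputs, n) + bias;
    return acc > 0 ? acc : 0;
}
```

Swap "taps" for "weights" and "samples" for "activations" and the two functions are nearly the same, which is why hardware that speeds up the loop in `mac_dot` helps both workloads.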

It’s not a high-end beast designed for TensorFlow coding; the ’32 is more of a flyweight ready for TinyML. Having said that, Eta Compute does offer a software translator that converts TensorFlow to C, and from there to ECM3532 binaries. That allows developers to prototype and test their ideas using TensorFlow, and then ratchet down and refine them for the MCU. 

There are plenty of MCUs with DSPs onboard, but few are aimed at the ML market. And even fewer boast such low power numbers or use Eta Compute’s patented design methodology to get there. If “ML at the edge” becomes a thing, we can all Greek out on a new chip design. 
