NXP MCU Hits 1 GHz

A Barrier of Sorts, and a Win for the Little Guy

“That which we call a rose by any other name would smell as sweet.” – Romeo and Juliet, II, 2

When is an MCU not an MCU? Does it have to include on-chip flash? Does it need integrated peripherals? Must it be inexpensive? Small? Slow? Boring? 

The beauty of an MCU is in the eye of the beholder, and NXP’s newest i.MX RT1170 device may challenge developers’ ideas about how to categorize it. 

Its creator calls it a “crossover MCU,” a term I dislike, whether it’s applied to vehicles (top-heavy Tonka truck wannabes), musical styles (country singers without the hat), or microcontrollers (MCUs with fast cores and no flash). Whatevs.  

First, the deets. The new RT1170 is part of the existing i.MX line, which NXP acquired along with Freescale (née Motorola) back in 2015. It’s a dual-core device, with an ARM Cortex-M7 running alongside a Cortex-M4. The two run independently, with separate power domains and separate clock frequencies. It’s the M7 that hits the headline clock speed of 1 GHz; the little M4 tops out at just 400 MHz. It’s NXP’s own interpretation of the big.LITTLE duumvirate, tailored for MCUs. 

The chip also comes with a lot of on-chip RAM: a hefty 2 MB of SRAM. The thinking is that a 1-GHz processor needs a lot of local RAM to perform at its peak; relying on off-chip memory would’ve throttled the Cortex-M7 and squandered all that luscious performance. 

What it doesn’t have is any flash memory, which is a bit strange. On-chip flash has always been one of the defining characteristics of an MCU; market-research companies use its presence or absence to draw the dividing line between MCUs and CPUs. Future sales of the RT1170 will likely be counted into the latter bucket. 

Why no flash? Blame physics. The 28-nm FD-SOI process that NXP uses for the RT1170 is advanced for such a low-cost part, and it’s what allows the chip to hit the 1-GHz high notes. But it’s also a poor fit for flash manufacturing. That’s not to say it’s impossible to build flash using that process, only that it’s economically unattractive. NXP opted for speed and low power consumption; jettisoning the flash was the price they paid for that decision. 

In addition to its dual processors and big SRAM, the RT1170 also has an optional 2D graphics accelerator, a pair of Gbit Ethernet controllers with PHY (plus a 10/100 Ethernet port), an external SDRAM controller, an encryption engine, a MIPI camera interface, and the usual assortment of UARTs, timers, serial interfaces, and analog stuff. In short, it looks like an MCU. Just a really fast one that needs off-chip ROM. 

NXP brands its crypto engine “EdgeLock,” and it serves two purposes. First, it’s a necessary addition to today’s IoT appliances that need to secure user data, access keys, payment information, and more. On-chip security hardware is a check-box item these days, and NXP has included various levels of security in its i.MX parts for the past few years. 

Second, it patches over the missing on-chip flash. With no significant internal NVM, the RT1170 is forced to store secure data off-chip in external DRAM or serial flash. Normally, that would leave a gaping security hole, like storing your spare house key under a big sign that says, “Key Here!” But, by encrypting/decrypting external data on the fly as it passes in/out of the chip, secure storage remains secure. NXP says the RT1170 can encrypt and decrypt any and all off-chip memory with no performance penalty. The crypto lag is hidden by the memory controller’s latency, making it essentially transparent to the programmer. 
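
To make the on-the-fly idea concrete, here is a toy sketch in Python. It is not NXP’s scheme, and nothing here comes from NXP documentation: the function names, the 32-byte block size, and the use of SHA-256 as a keystream generator are all assumptions chosen for illustration (real hardware would more plausibly use an AES-based mode). The point is only that, if the keystream depends on the key and the memory address, any chunk of external memory can be decrypted independently as it is fetched.

```python
import hashlib

BLOCK = 32  # bytes of keystream produced per SHA-256 call (illustrative choice)


def keystream(key: bytes, addr: int, length: int) -> bytes:
    """Hypothetical helper: keystream for the byte range [addr, addr + length)."""
    out = bytearray()
    first_block = addr // BLOCK
    last_block = (addr + length - 1) // BLOCK
    for blk in range(first_block, last_block + 1):
        # Each block's keystream depends only on the key and the block number.
        out += hashlib.sha256(key + blk.to_bytes(8, "little")).digest()
    offset = addr % BLOCK
    return bytes(out[offset:offset + length])


def crypt(key: bytes, addr: int, data: bytes) -> bytes:
    """XOR data with the address-dependent keystream.

    XOR is its own inverse, so the same call encrypts and decrypts.
    """
    ks = keystream(key, addr, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))


if __name__ == "__main__":
    key = b"on-chip secret key (toy value)"
    plaintext = b"payment token stored in external serial flash"
    ciphertext = crypt(key, 0x1000, plaintext)           # written off-chip
    assert crypt(key, 0x1000, ciphertext) == plaintext   # read back on-chip
```

Because the keystream is tied to the address, a read of bytes 5 through 14 of that region can be decrypted without touching the rest, which is what lets a memory controller do the work in passing. A production design would also need to handle rewrites to the same address and tamper detection, which this sketch ignores.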

The EdgeLock block contains both a true random-number generator (TRNG) and a pseudo-random-number generator (PRNG). Why have both? Because one’s quick and one’s good. I suspect that the TRNG hardware (which is more complex than a PRNG circuit) can’t keep up with the 1-GHz Cortex-M7, so NXP added the PRNG for everyday tasks that won’t delay the processor. Ideally, you’d use the TRNG to seed the PRNG, and then rely on the latter for most uses. 
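
The seed-then-generate pattern can be sketched in a few lines of Python. This is a generic illustration, not EdgeLock’s interface: `trng_read` is a hypothetical stand-in that borrows the operating system’s entropy pool, and the xorshift64 generator is simply a well-known small PRNG picked for brevity. It is fast but not cryptographically secure, so treat it as a picture of the division of labor rather than something to protect keys with.

```python
import os

MASK64 = (1 << 64) - 1


def trng_read(n_bytes: int) -> bytes:
    """Hypothetical stand-in for a slow hardware TRNG (here: the OS entropy pool)."""
    return os.urandom(n_bytes)


class XorShift64:
    """Marsaglia's xorshift64 PRNG: quick, deterministic, NOT cryptographically secure."""

    def __init__(self, seed: int) -> None:
        # The state must never be zero, or the generator would emit zeros forever.
        self.state = (seed & MASK64) or 0x9E3779B97F4A7C15

    def next_u64(self) -> int:
        x = self.state
        x ^= (x << 13) & MASK64
        x ^= x >> 7
        x ^= (x << 17) & MASK64
        self.state = x
        return x


def seeded_prng() -> XorShift64:
    """Pay the slow TRNG cost once, then draw cheap numbers from the PRNG."""
    seed = int.from_bytes(trng_read(8), "little")
    return XorShift64(seed)


if __name__ == "__main__":
    rng = seeded_prng()
    print([hex(rng.next_u64()) for _ in range(3)])
```

The expensive call happens once in `seeded_prng()`; every later `next_u64()` is a handful of shifts and XORs, which is the kind of work that will not stall a fast core.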

Both Cortex-M cores can access any of the on-chip peripherals. In fact, everything on the chip is shared, apart from the two cores’ private L1 caches. It’s up to you to partition the workload, though. Because the two cores are clocked and powered separately, you can operate them utterly independently of one another, or you can have them collaborate. The smaller M4 core can run as slowly as 24 MHz, and there’s no rule that says it needs to be synchronized with the M7. Go forth and multitask. 

Which raises the question… why use two Cortex-M cores at all, instead of the more capable Cortex-A series, especially when running at GHz speeds? The RT1170 is like a caffeinated hummingbird versus a more relaxed sparrow or robin. Who wants GHz performance but not Cortex-A capability? 

RTOS programmers, says NXP. Cortex-A parts have lots of advantages, including richer instruction sets, bigger register files, multithreading capability, better multicore support, better software support, and so on. But they also have slower interrupt-response times. ARM created the Cortex-M series (M for microcontroller) specifically for MCU applications that prioritize real-time response over highfalutin “computer” features. They’re not as fast – at least, not usually – but they’re simpler to program and debug. 

The other reason is familiarity. A lot of NXP’s customers are MCU users from way back, and they’re accustomed to the Cortex-M architecture and the single-chip ethos. They’ve already got the tools, the expertise, and the real-time experience and aren’t really interested in migrating to the more complicated A-series. Cranking up the clock frequency without altering the underlying architecture suits them just fine. 

Production quantities of the RT1170 are still a year away, but that hasn’t stopped NXP from guesstimating the CoreMark score for its newest offspring. With a projected score of 6468, the RT1170 should be about twice as fast as NXP’s i.MX RT1060, its next-fastest Cortex-M device. For comparison, that score is about what Intel’s quad-core Xeon L5408 produced ten years ago, or an AMD Phenom II X4 (circa 2009), or about equal to TI’s OMAP4460, a dual-core Cortex-A9 device. It’s amazing what a few years of semiconductor fabrication progress will buy you. 
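
For a rough sense of scale, the arithmetic behind those figures looks like this. Two assumptions are baked in and labeled in the code: that the projected score belongs to the Cortex-M7 alone at its 1-GHz clock (the projection as quoted here doesn’t say), and that “about twice as fast” means a factor of exactly two, which it almost certainly doesn’t.

```python
PROJECTED_COREMARK = 6468   # projected RT1170 score quoted above
M7_CLOCK_MHZ = 1000         # ASSUMPTION: score is for the M7 alone at 1 GHz
SPEEDUP_OVER_RT1060 = 2.0   # ASSUMPTION: "about twice" taken as exactly 2x


def coremark_per_mhz(score: float, clock_mhz: float) -> float:
    """Normalize a CoreMark score by clock frequency."""
    return score / clock_mhz


def implied_baseline(score: float, speedup: float) -> float:
    """Score the slower part would need for the quoted speedup to hold."""
    return score / speedup


if __name__ == "__main__":
    print(f"{coremark_per_mhz(PROJECTED_COREMARK, M7_CLOCK_MHZ):.3f} CoreMark/MHz")
    print(f"implied RT1060 score: about {implied_baseline(PROJECTED_COREMARK, SPEEDUP_OVER_RT1060):.0f}")
```

Under those assumptions, the projection works out to a little under 6.5 CoreMark per MHz, with an implied RT1060 score in the low 3000s.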

Especially for the price. The RT1170 will sell for around $5 when it hits mass production in 2H20. That’s a lot cheaper than a Xeon, Phenom, or OMAP even now, never mind when they were new. A 1-GHz processor (plus a 400-MHz sidekick) with 2 MB of RAM, multiple Ethernet channels, 2D acceleration, hardware security, and plenty more, all in a single package for five bucks. That’s a good deal, no matter what you call it.
