feature article
Subscribe Now

NXP MCU Hits 1 GHz

A Barrier of Sorts, and a Win for the Little Guy

“That which we call a rose by any other name would smell as sweet.” – Romeo and Juliet, II, 2

When is an MCU not an MCU? Does it have to include on-chip flash? Does it need integrated peripherals? Must it be inexpensive? Small? Slow? Boring? 

The beauty of an MCU is in the eye of the beholder, and NXP’s newest i.MX RT1170 device may challenge developers’ ideas about how to categorize it. 

Its creator calls it a “crossover MCU,” a term I dislike, whether it’s applied to vehicles (top-heavy Tonka truck wannabes), musical styles (country singers without the hat), or microcontrollers (MCUs with fast cores and no flash). Whatevs.  

First, the deets. The new RT1170 is part of NXP’s existing i.MX line, which was acquired, along with Freescale (née Motorola), back in 2015. It’s a dual-core device, with an ARM Cortex-M7 running alongside a Cortex-M4. The two run independently, with separate power domains and separate clock frequencies. It’s the M7 that hits the headline clock speed of 1 GHz; the little M4 tops out at just 400 MHz. It’s NXP’s own interpretation of the big.little duumvirate tailored for MCUs. 

The chip also comes with a lot of on-chip RAM: a hefty 2 MB of SRAM. The thinking is that a 1-GHz processor needs a lot of local RAM to perform at its peak; relying on off-chip memory would’ve throttled the Cortex-M7 and squandered all that luscious performance. 

What it doesn’t have is any flash memory, which is a bit strange. On-chip flash has always been one of the defining characteristics of an MCU; market-research companies use its presence or absence to draw the dividing line between MCUs and CPUs. Future sales of the RT1170 will likely be counted into the latter bucket. 

Why no flash? Blame physics. The 28nm FD-SOI process that NXP uses for the RT1170 is advanced for such a low-cost part, and it’s what allows the chip to hit the 1-GHz high notes. But it’s also incompatible with flash manufacturing. That’s not to say it’s impossible to build flash using that process, only that it’s economically unattractive. NXP opted for speed and low power consumption; jettisoning the flash was the price they paid for that decision. 

In addition to its dual processors and big SRAM, the RT1170 also has an optional 2D graphics accelerator, a pair of Gbit Ethernet controllers with PHY (plus a 10/100 Ethernet port), an external SDRAM controller, encryption engine, MIPI camera interface, and the usual assortment of UARTs, timers, serial interfaces, and analog stuff. In short, it looks like an MCU. Just a really fast one that needs off-chip ROM. 

NXP brands its crypto engine “EdgeLock,” and it serves two purposes. First, it’s a necessary addition to today’s IoT appliances that need to secure user data, access keys, payment information, and more. On-chip security hardware is a check-box item these days, and NXP has included various levels of security in its i.MX parts for the past few years. 

Second, it patches over the missing on-chip flash. With no significant internal NVM, the RT1170 is forced to store secure data off-chip in external DRAM or serial flash. Normally, that would leave a gaping security hole, like storing your spare housekey under a big sign that says, “Key Here!” But, by encrypting/decrypting external data on the fly as it passes in/out of the chip, secure storage remains secure. NXP says the RT1170 can encrypt and decrypt any and all off-chip memory with no performance penalty. The crypto lag is hidden by the memory controller’s latency, making it essentially transparent to the programmer. 

The EdgeLock block contains both a true random-number generator (TRNG) and a pseudo-random-number generator (PRNG). Why have both? Because one’s quick and one’s good. I suspect that the TRNG hardware (which is more complex than a PRNG circuit) can’t keep up with the 1-GHz Cortex-M7, so NXP added the PRNG for everyday tasks that won’t delay the processor. Ideally, you’d use the TRNG to seed the PRNG, and then rely on the latter for most uses. 

Both Cortex-M cores can access any of the on-chip peripherals. In fact, everything on the chip is shared, apart from the two cores’ private L1 caches. It’s up to you to partition the workload, though. Because the two cores are clocked and powered separately, you can operate them utterly independently of one another, or you can have them collaborate. The smaller M4 core can run as slowly as 24 MHz, and there’s no rule that says it needs to be synchronized with the M7. Go forth and multitask. 

Which raises the question… why use two Cortex-M cores at all, instead of the more capable Cortex-A series, especially when running at GHz speeds? The RT1170 is like a caffeinated hummingbird versus a more relaxed sparrow or robin. Who wants GHz performance but not Cortex-A capability? 

RTOS programmers, says NXP. Cortex-A parts have lots of advantages, including richer instruction sets, bigger register files, multithreading capability, better multicore support, better software support, and so on. But they also have slower interrupt-response times. ARM created the Cortex-M series (M for microcontroller) specifically for MCU applications that prioritize real-time response over highfalutin “computer” features. They’re not as fast – at least, not usually – but they’re simpler to program and debug. 

The other reason is familiarity. A lot of NXP’s customers are MCU users from way back, and they’re accustomed to the Cortex-M architecture and the single-chip ethos. They’ve already got the tools, the expertise, and the real-time experience and aren’t really interested in migrating to the more complicated A-series. Cranking up the clock frequency without altering the underlying architecture suits them just fine. 

Production quantities of the RT1170 are still a year away, but that hasn’t stopped NXP from guesstimating the CoreMark score for its newest offspring. With a projected score of 6468, the RT1170 should be about twice as fast as NXP’s i.MX RT1060, its next-fastest Cortex-M device. For comparison, that score is about what Intel’s quad-core Xeon L5408 produced ten years ago, or an AMD Phenom II X4 (circa 2009), or about equal to TI’s OMAP4460, a dual-core Cortex-A9 device. It’s amazing what a few years of semiconductor fabrication progress will buy you. 

Especially for the price. The RT1170 will sell for around $5 when it hits mass production in 2H20. That’s a lot cheaper than a Xeon, Phenom, or OMAP even now, never mind when they were new. A 1-GHz processor (plus a 400-MHz sidekick) with 2MB of RAM, multiple Ethernet channels, 2D acceleration, hardware security, and plenty more, all in a single package for five bucks. That’s a good deal, no matter what you call it.

Leave a Reply

featured blogs
Apr 9, 2021
You probably already know what ISO 26262 is. If you don't, then you can find out in several previous posts: "The Safest Train Is One that Never Leaves the Station" History of ISO 26262... [[ Click on the title to access the full blog on the Cadence Community s...
Apr 8, 2021
We all know the widespread havoc that Covid-19 wreaked in 2020. While the electronics industry in general, and connectors in particular, took an initial hit, the industry rebounded in the second half of 2020 and is rolling into 2021. Travel came to an almost stand-still in 20...
Apr 7, 2021
We explore how EDA tools enable hyper-convergent IC designs, supporting the PPA and yield targets required by advanced 3DICs and SoCs used in AI and HPC. The post Why Hyper-Convergent Chip Designs Call for a New Approach to Circuit Simulation appeared first on From Silicon T...
Apr 5, 2021
Back in November 2019, just a few short months before we all began an enforced… The post Collaboration and innovation thrive on diversity appeared first on Design with Calibre....

featured video

Meeting Cloud Data Bandwidth Requirements with HPC IP

Sponsored by Synopsys

As people continue to work remotely, demands on cloud data centers have never been higher. Chip designers for high-performance computing (HPC) SoCs are looking to new and innovative IP to meet their bandwidth, capacity, and security needs.

Click here for more information

featured paper

Understanding Functional Safety FIT Base Failure Rate Estimates per IEC 62380 and SN 29500

Sponsored by Texas Instruments

Functional safety standards such as IEC 61508 and ISO 26262 require semiconductor device manufacturers to address both systematic and random hardware failures. Base failure rates (BFR) quantify the intrinsic reliability of the semiconductor component while operating under normal environmental conditions. Download our white paper which focuses on two widely accepted techniques to estimate the BFR for semiconductor components; estimates per IEC Technical Report 62380 and SN 29500 respectively.

Click here to download the whitepaper

Featured Chalk Talk

TensorFlow to RTL with High-Level Synthesis

Sponsored by Cadence Design Systems

Bridging the gap from the AI and data science world to the RTL and hardware design world can be challenging. High-level synthesis (HLS) can provide a mechanism to get from AI frameworks like TensorFlow into synthesizable RTL, enabling the development of high-performance inference architectures. In this episode of Chalk Talk, Amelia Dalton chats with Dave Apte of Cadence Design Systems about doing AI design with HLS.

More information