
NXP MCU Hits 1 GHz

A Barrier of Sorts, and a Win for the Little Guy

“That which we call a rose by any other name would smell as sweet.” – Romeo and Juliet, II, 2

When is an MCU not an MCU? Does it have to include on-chip flash? Does it need integrated peripherals? Must it be inexpensive? Small? Slow? Boring? 

The beauty of an MCU is in the eye of the beholder, and NXP’s newest i.MX RT1170 device may challenge developers’ ideas about how to categorize it. 

Its creator calls it a “crossover MCU,” a term I dislike, whether it’s applied to vehicles (top-heavy Tonka truck wannabes), musical styles (country singers without the hat), or microcontrollers (MCUs with fast cores and no flash). Whatevs.  

First, the deets. The new RT1170 is part of NXP’s existing i.MX line, which was acquired, along with Freescale (née Motorola), back in 2015. It’s a dual-core device, with an ARM Cortex-M7 running alongside a Cortex-M4. The two run independently, with separate power domains and separate clock frequencies. It’s the M7 that hits the headline clock speed of 1 GHz; the little M4 tops out at just 400 MHz. It’s NXP’s own interpretation of the big.LITTLE duumvirate tailored for MCUs. 

The chip also comes with a lot of on-chip RAM: a hefty 2 MB of SRAM. The thinking is that a 1-GHz processor needs a lot of local RAM to perform at its peak; relying on off-chip memory would’ve throttled the Cortex-M7 and squandered all that luscious performance. 

What it doesn’t have is any flash memory, which is a bit strange. On-chip flash has always been one of the defining characteristics of an MCU; market-research companies use its presence or absence to draw the dividing line between MCUs and CPUs. Future sales of the RT1170 will likely be counted into the latter bucket. 

Why no flash? Blame physics. The 28nm FD-SOI process that NXP uses for the RT1170 is advanced for such a low-cost part, and it’s what allows the chip to hit the 1-GHz high notes. But it’s also incompatible with flash manufacturing. That’s not to say it’s impossible to build flash using that process, only that it’s economically unattractive. NXP opted for speed and low power consumption; jettisoning the flash was the price they paid for that decision. 

In addition to its dual processors and big SRAM, the RT1170 also has an optional 2D graphics accelerator, a pair of Gbit Ethernet controllers with PHY (plus a 10/100 Ethernet port), an external SDRAM controller, encryption engine, MIPI camera interface, and the usual assortment of UARTs, timers, serial interfaces, and analog stuff. In short, it looks like an MCU. Just a really fast one that needs off-chip ROM. 

NXP brands its crypto engine “EdgeLock,” and it serves two purposes. First, it’s a necessary addition to today’s IoT appliances that need to secure user data, access keys, payment information, and more. On-chip security hardware is a check-box item these days, and NXP has included various levels of security in its i.MX parts for the past few years. 

Second, it patches over the missing on-chip flash. With no significant internal NVM, the RT1170 is forced to store secure data off-chip in external DRAM or serial flash. Normally, that would leave a gaping security hole, like storing your spare house key under a big sign that says, “Key Here!” But, by encrypting/decrypting external data on the fly as it passes in/out of the chip, secure storage remains secure. NXP says the RT1170 can encrypt and decrypt any and all off-chip memory with no performance penalty. The crypto lag is hidden by the memory controller’s latency, making it essentially transparent to the programmer. 
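For the curious, here is a toy sketch of one common way to make off-chip encryption transparent: derive the keystream from the key and the memory address alone, so that any byte can be decrypted on its own and the keystream can be computed while the fetch is still in flight. This is an assumption about the general technique, not a description of NXP's hardware. SHA-256 stands in for a real block cipher, and `keystream` and `xor_at` are hypothetical names.

```python
import hashlib

BLOCK = 32  # bytes of keystream produced per SHA-256 call


def keystream(key: bytes, address: int, length: int) -> bytes:
    """Keystream for `length` bytes starting at `address`.

    It depends only on the key and the address, never on neighboring data,
    which is what allows random access to encrypted memory.
    """
    out = bytearray()
    first_block = address // BLOCK
    last_block = (address + length - 1) // BLOCK
    for block_index in range(first_block, last_block + 1):
        # Toy construction: a real design would use a block cipher here.
        out.extend(hashlib.sha256(key + block_index.to_bytes(8, "big")).digest())
    offset = address % BLOCK
    return bytes(out[offset:offset + length])


def xor_at(key: bytes, address: int, data: bytes) -> bytes:
    """Encrypt or decrypt `data` located at `address` (XOR is its own inverse)."""
    stream = keystream(key, address, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))


# Hypothetical usage: encrypt an image once, then read back an arbitrary slice.
key = b"k" * 16                      # made-up key for illustration
base = 0x1000                        # made-up address in external memory
plain = bytes(range(200))
cipher = xor_at(key, base, plain)
middle = xor_at(key, base + 70, cipher[70:90])  # decrypt 20 bytes in place
```

Because nothing chains from one block to the next, the slice in `middle` decrypts correctly without touching the rest of the image.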

The EdgeLock block contains both a true random-number generator (TRNG) and a pseudo-random-number generator (PRNG). Why have both? Because one’s quick and one’s good. I suspect that the TRNG hardware (which is more complex than a PRNG circuit) can’t keep up with the 1-GHz Cortex-M7, so NXP added the PRNG for everyday tasks that won’t delay the processor. Ideally, you’d use the TRNG to seed the PRNG, and then rely on the latter for most uses. 
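The seed-then-generate pattern looks roughly like the sketch below. Everything in it is an assumption for illustration: `os.urandom` stands in for the slow hardware TRNG, `SeededPrng` is a toy hash-counter generator rather than a vetted design, and none of the names come from NXP's software.

```python
import hashlib
import os


def read_trng(num_bytes: int = 32) -> bytes:
    """Stand-in for a slow hardware TRNG (os.urandom is an assumption here)."""
    return os.urandom(num_bytes)


class SeededPrng:
    """Toy generator: hashes seed + counter, so output is repeatable per seed."""

    def __init__(self, seed: bytes) -> None:
        if not seed:
            raise ValueError("seed must not be empty")
        self._seed = seed
        self._counter = 0

    def next_bytes(self, num_bytes: int) -> bytes:
        out = bytearray()
        while len(out) < num_bytes:
            counter_bytes = self._counter.to_bytes(8, "big")
            out.extend(hashlib.sha256(self._seed + counter_bytes).digest())
            self._counter += 1
        return bytes(out[:num_bytes])


# Pay the TRNG cost once at startup, then draw cheaply from the PRNG.
prng = SeededPrng(read_trng())
nonce = prng.next_bytes(16)
```

The point is the division of labor: one expensive trip to the true random source, then as many quick draws as the application needs.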

Both Cortex-M cores can access any of the on-chip peripherals. In fact, everything on the chip is shared, apart from the two cores’ private L1 caches. It’s up to you to partition the workload, though. Because the two cores are clocked and powered separately, you can operate them utterly independently of one another, or you can have them collaborate. The smaller M4 core can run as slowly as 24 MHz, and there’s no rule that says it needs to be synchronized with the M7. Go forth and multitask. 

Which raises the question… why use two Cortex-M cores at all, instead of the more capable Cortex-A series, especially when running at GHz speeds? The RT1170 is like a caffeinated hummingbird versus a more relaxed sparrow or robin. Who wants GHz performance but not Cortex-A capability? 

RTOS programmers, says NXP. Cortex-A parts have lots of advantages, including richer instruction sets, bigger register files, multithreading capability, better multicore support, better software support, and so on. But they also have slower interrupt-response times. ARM created the Cortex-M series (M for microcontroller) specifically for MCU applications that prioritize real-time response over highfalutin “computer” features. They’re not as fast – at least, not usually – but they’re simpler to program and debug. 

The other reason is familiarity. A lot of NXP’s customers are MCU users from way back, and they’re accustomed to the Cortex-M architecture and the single-chip ethos. They’ve already got the tools, the expertise, and the real-time experience and aren’t really interested in migrating to the more complicated A-series. Cranking up the clock frequency without altering the underlying architecture suits them just fine. 

Production quantities of the RT1170 are still a year away, but that hasn’t stopped NXP from guesstimating the CoreMark score for its newest offspring. With a projected score of 6468, the RT1170 should be about twice as fast as NXP’s i.MX RT1060, its next-fastest Cortex-M device. For comparison, that score is about what Intel’s quad-core Xeon L5408 produced ten years ago, about what an AMD Phenom II X4 (circa 2009) managed, and about equal to that of TI’s OMAP4460, a dual-core Cortex-A9 device. It’s amazing what a few years of semiconductor fabrication progress will buy you. 
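A quick back-of-the-envelope check puts that number in per-clock terms. It assumes the projected score belongs to the Cortex-M7 running at its headline 1 GHz, and it reads "about twice as fast" as exactly 2x, so treat the implied RT1060 figure as a ballpark rather than a published score.

```python
def coremark_per_mhz(score: float, clock_mhz: float) -> float:
    """Normalize a CoreMark score by clock frequency."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    return score / clock_mhz


RT1170_PROJECTED_SCORE = 6468   # from NXP's projection quoted above
RT1170_M7_CLOCK_MHZ = 1000      # assumption: score measured on the M7 at 1 GHz

per_mhz = coremark_per_mhz(RT1170_PROJECTED_SCORE, RT1170_M7_CLOCK_MHZ)

# "About twice as fast" taken literally; a rough estimate only.
implied_rt1060_score = RT1170_PROJECTED_SCORE / 2

print(f"{per_mhz:.3f} CoreMark/MHz, implied RT1060 score ~{implied_rt1060_score:.0f}")
```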

Especially for the price. The RT1170 will sell for around $5 when it hits mass production in 2H20. That’s a lot cheaper than a Xeon, Phenom, or OMAP even now, never mind when they were new. A 1-GHz processor (plus a 400-MHz sidekick) with 2 MB of RAM, multiple Ethernet channels, 2D acceleration, hardware security, and plenty more, all in a single package for five bucks. That’s a good deal, no matter what you call it.
