feature article

MIPS I7200 Breaks the Chain

New 32-bit CPU Design Isn’t Very RISC-like Anymore

“It is surely harmful to souls to make it a heresy to believe what is proved.” — Galileo Galilei

Heresy! Sacrilege! Apostasy! The RISC orthodoxy has been profaned! Get the pitchforks and assemble the townspeople while I look for my wooden stake.

The High Sparrow and Lord Protector of RISC canon, MIPS Technologies, has decided that the RISC code is more what you’d call guidelines than actual rules. Welcome aboard the MIPS I7200.

To hell with orthodoxy, say we. We just want our embedded microprocessors to work efficiently, quickly, and expeditiously. And if that means tossing aside decades of research, careers of intense evangelism, veritable mountains of scholarly texts, and more than a few doctoral theses, so be it. This be war, and all’s fair in war and microprocessor design.

Our rebellious RISC renegade is the latest 32-bit CPU core design to emerge from the scriptorium in Sunnyvale, birthplace of MIPS Computer Systems and all that is holy in modern CPU architecture. MIPS, ARM, SPARC, and virtually every other new CPU to see the light of day in the past 20-odd years have been based on RISC philosophy. “Less is more” is the guiding principle. Make the CPU hardware do less and, ipso facto, it will be faster. Shunt complexity to software instead because – hey, programmers are cheaper to hire than real electrical engineers, and software is easier to patch than hardware. Minimize the hardware and balloon the software. So it is written. So it shall be done.

Yeah, but. It’s a tradition more honored in the breach than the observance. No real CPUs adhere strictly to those early RISC principles. They’re simply too ascetic, too Spartan, and too damn hard to program. Almost from the beginning, ARM, MIPS, and all the other so-called RISC processors started contaminating their pure architectures with oddball instructions for shifting bits, calculating addresses, and handling floating-point numbers. Today, most RISC processors are reduced in name only.

But one tenet was cast in stone: Thou shalt have fixed-length instructions. Usually 32 bits wide, same-size instruction words are easy to decode, split, and execute. They neatly align on memory boundaries. They make it easy for compilers and linkers to figure jump addresses. Surely that’s one golden rule we can all agree on?

Eh, not so much.

The new MIPS I7200 marks the debut of a brand-new instruction set called nanoMIPS, and it’s – gasp! – variable-length. The final shoe has dropped and we’re not maintaining any pretense of RISC traditionalism anymore. The new nanoMIPS ISA isn’t even binary compatible with other MIPS processors (or anything else, naturally). It’s an entirely new ISA designed for small code size, never mind what it does to compatibility with the rest of the MIPS product line.
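To see what that heresy costs the decoder, here is a minimal Python sketch contrasting the two approaches. It is only an illustration: the length rule in `toy_length` (top two bits of the first byte select 2, 4, or 6 bytes) is invented for this example and is not the real nanoMIPS encoding, and the function names are hypothetical.

```python
from typing import List

FIXED_WIDTH = 4  # bytes per instruction in a classic fixed-length 32-bit ISA


def fixed_offset(index: int) -> int:
    """Byte offset of the index-th instruction: a single multiplication."""
    return index * FIXED_WIDTH


def toy_length(first_byte: int) -> int:
    """Instruction length in bytes under a made-up encoding (NOT real nanoMIPS):
    the top two bits of the first byte select 2, 4, or 6 bytes."""
    tag = first_byte >> 6
    if tag == 0b00:
        return 2
    if tag == 0b01:
        return 4
    if tag == 0b10:
        return 6
    raise ValueError(f"reserved length tag {tag:#04b}")


def variable_offsets(code: bytes) -> List[int]:
    """Walk the stream front to back, collecting each instruction's start offset."""
    offsets = []
    pos = 0
    while pos < len(code):
        length = toy_length(code[pos])
        if pos + length > len(code):
            raise ValueError("truncated instruction at end of stream")
        offsets.append(pos)
        pos += length
    return offsets


# Three instructions of 2, 6, and 4 bytes; payload bytes are zero filler.
stream = bytes([0x00, 0x00,
                0x80, 0, 0, 0, 0, 0,
                0x40, 0, 0, 0])
print(fixed_offset(2))           # 8
print(variable_offsets(stream))  # [0, 2, 8]
```

With fixed-width words, the address of any instruction is pure arithmetic. With variable-length words, finding the third instruction means decoding the first two, which is exactly the convenience the old commandment was protecting.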

Have the MIPS designers lost their minds? Are they under an evil spell? Or perhaps they’ve been possessed by evil spirits emanating from a certain neighboring CPU facility in Santa Clara? Heaven knows those people design hideously complex processors with no regard whatsoever for the elegant virtues of abstemious design. Perchance MIPS has been affected by the number of the beast: x86.

Or maybe they’re just good CPU designers being practical. The I7200 is a midrange processor in MIPS’s product catalog, somewhere above the existing interAptiv CPU but below its many 64-bit processors. That makes the I7200 the hottest 32-bit CPU in the MIPS catalog. It was designed, the company says, with the help and encouragement of a certain Tier-1 vendor in the LTE/5G space, probably MediaTek. Thus, it’s a good fit for upcoming 5G modems, where parallel processing and low power consumption will be vital.

And one of the best ways to shrink silicon size is to shrink code size. After all, most processor chips are mostly memory. Between the big L1 and L2 caches in your average 32-bit SoC implementation, it’s hard to even find the CPU core swimming amongst all that SRAM. If you can reduce your code size by an appreciable amount, you can cut the memory size as well, and the power consumption with it. It’s the exact opposite of RISC: lard up the CPU silicon so that you need less code. Heresy!

Like its more conventional brethren, the I7200 does multithreading, which has become a MIPS hallmark. The CPU core can handle up to nine threads and switch between threads with zero overhead. Snippets of code can also be preloaded and “parked,” ready for instant deployment in the case of an interrupt handler or a high-priority task. This feature, combined with new scratchpad RAMs that bypass the cache, is designed to make the I7200 more deterministic – another important feature for exotic 5G or LTE Advanced modems.

The processor’s MMU can be dumbed-down to perform faster under an RTOS that prioritizes fast access time over elaborate memory-management schemes. Or, you can enable the full-on MMU to run Linux.

The nine-stage pipeline copies its structure from other recent MIPS processors, and it accommodates up to three (instead of just two) user-defined coprocessors under the existing ASE interface specification. This allows creative designers to add in their own hardware accelerators without inventing a completely new CPU from scratch. In addition to the new scratchpad RAM, the I7200 also supports conventional L1 data and instruction caches. Internal bus interfaces are now AXI4, whereas previous processors used OCP with AXI wrappers. If your ambitions extend to multiprocessor SoC design, the I7200 works in four-processor (36-thread) clusters, with cache coherence throughout.

But does it work? Gracious, yes. The I7200 outperforms its oddly named interAptiv predecessor by somewhere between 35% and 65%, depending on which EEMBC benchmark you prefer. It’s also a wee bit faster than arch-rival Arm’s Cortex-R8, which is pitched at roughly the same types of real-time applications. If it matters, the I7200 is also about 20% faster than a Cortex-A53 running at the same clock speed.

The real raison d’être for the I7200, however, is its code density. MIPS already has a compressed/condensed instruction set in the form of MIPS16e. It’s been deployed for ages and implemented in uncountable millions of devices. So why reinvent that particular wheel?

Because it’s better. Code compiled for the new nanoMIPS ISA is about 12% smaller than the equivalent ARM Thumb-2 code, and a good 15% to 20% smaller than MIPS16e code, according to the company. Like MIPS16e, nanoMIPS is a standalone, self-contained instruction set. It’s not a mode or an extension; it really is the processor’s native ISA.
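For a feel of what those percentages mean in bytes, here is a back-of-the-envelope Python sketch. The 512-KiB baseline images are hypothetical numbers picked for illustration, not measurements, and `shrink` is just a helper name invented here.

```python
def shrink(baseline_bytes: int, percent_smaller: int) -> int:
    """Size after an image shrinks by percent_smaller, rounded down to whole bytes."""
    return baseline_bytes * (100 - percent_smaller) // 100


# Hypothetical baseline image sizes, chosen only for illustration.
thumb2_image = 512 * 1024    # 524,288 bytes
mips16e_image = 512 * 1024   # 524,288 bytes

print(shrink(thumb2_image, 12))    # 461373
print(shrink(mips16e_image, 15))   # 445644
print(shrink(mips16e_image, 20))   # 419430
```

Note that the two claims use different baselines, so the results are not directly comparable with each other. The same source code would not compile to the same size under Thumb-2 and MIPS16e in the first place.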

Unlike MIPS16e, however, nanoMIPS is not optional. Up until now, any MIPS licensee had the option of enabling MIPS16e or not. The “real” MIPS instruction set was always the default ISA and always required. That seemed only natural, and it meant that all MIPS processors were binary compatible with one another.

Not anymore. The nanoMIPS instruction set is the only one that the I7200 runs. There is no “standard” 32-bit MIPS option, which means that the I7200 is not binary compatible with any other MIPS processor (so far). It’s a clean break.

The company obviously feels that this was the right move, and it does make some sense, even if it’s a bit weird. Back when MIPS was still competing head-to-head with Arm and the other RISC pretenders, binary compatibility was important. (Look what it did for Intel.) But that ship has sailed. Few customers use their MIPS processors to run application code. There isn’t much of a third-party software library. It’s mostly deeply embedded real-time code that customers never see. So breaking compatibility isn’t a big deal, and it’s a fair trade for smaller code size. So, MIPS Technologies took a deep breath and took the plunge. As a result, we have an entirely new generation of MIPS processors that are unlike other MIPS processors.

The underlying architecture is still the same, and the programmer’s model is identical. If you didn’t look closely at the binaries excreted by your compiler, you wouldn’t know the difference. It’s all MIPS where it counts.

At 2-GHz clock rates without breathing hard, the I7200 is fast, efficient, reasonably small, and definitely scalable. MediaTek is already taping out its first I7200-based design, so it’s not even vaporware. The I7200 has achieved corporeal presence, with more on the way. Sing Hallelujah, O brothers and sisters, the new doctrine is here! Can I get an Amen?

6 thoughts on “MIPS I7200 Breaks the Chain”

  1. Finally some sanity!
    The mythical “execute all instructions in one cycle” BS was pure nonsense from day one. Loads and stores take one cycle PLUS THE MEMORY ACCESS TIME — meanwhile IF THERE ARE ANY IN THE PIPE they can execute.

    CPUs circa 1960 (almost 60 years ago) had interleaved memory, and guess what: the address of the next instruction was calculated while the memory was accessed. And sure enough, the add/subtract kind of instructions executed in one cycle, when the data arrived from memory.
    RISC was pure hype from the very beginning, is now and always will be.

  2. None of the processors have really been RISC for a long time, the ISAs are for RISC processors of long ago, and the internals of the CPUs are now largely independent of the instruction sets.

    MIPS guys told me a while back they were breaking with backward compatibility because they had lost control of their ISA, and wanted to disenfranchise people using it – seems unlikely they’ll get a following for their new stuff in the face of RISC-V.

    I was hoping they would buy into some tech of mine to get a performance advantage (since Tallwood seem to have money to burn)…

    http://parallel.cc/cgi-bin/bfx.cgi/WT-2018/WT-2018.html

    … but some organizations just have a death wish.
