“640KB ought to be enough for anybody.” – Bill Gates [disputed]
Back when the MIPS computer architecture was first created, MIPS meant one of three things. It was either “millions of instructions per second,” or, in the case of the CPU itself, “microprocessor without interlocked pipeline stages.” Cynical engineers also called it “meaningless indicator of performance for salesmen.”
These days, a processor that can process only a million instructions per second seems unutterably quaint, like a steam-powered sawmill. One MIPS is nothing. We want hundreds of MIPS – possibly thousands of MIPS (GOPS).
Wish granted. That same MIPS company has now produced the I6500, a massively parallel implementation of the MIPS CPU architecture that should satisfy our processing needs for another, oh, dozen years or so.
(Side note: kudos to the MIPS marketing crew for not giving the I6500 a lowercase i. There are way too many “iProducts” already. A grateful world thanks you.)
Let’s start with the headline number: 1,536 execution threads in a single chip. That’s the theoretical maximum number of threads you could have running at once if you crank up all of the I6500’s options and create a maximally provisioned device. It won’t be easy and it won’t be cheap, but it’s doable.
How is that possible? Each I6500 CPU core can handle four threads, and you can have up to six cores in each “cluster,” so there’s 24 execution threads right there. But, more importantly, the I6500 is designed to be ganged. Yup, you can create a massive mega-cluster of 64 clusters of 64-bit CPUs. In the immortal words of Jeremy Clarkson, “Powwwwer!”
But what if you don’t really want 384 identical MIPS processors running amok inside your new device? What if you’d prefer, say, a few graphics processors or DSPs or – gasp! – ARM processors in the mix?
Fear not, for Imagination Technology has prophesied this eventuality. The I6500 is all about heterogeneity, the au courant term for “plays well with others.” You see, Imagination has conceded the fact that it’s not the only CPU vendor in town, and the company knows that its CPU is often used alongside other CPUs, accelerators, and processing engines of every stripe. So it was prudent to make MIPS processors easy to integrate with non-MIPS processors. Hence, the I6500 and its scalable nature.
Naturally, the company’s own PowerVR graphics engine is the poster child for this hetero-tolerant architecture. Imagination would like nothing more than for you to license a few hundred MIPS cores along with a few dozen PowerVRs. (Heck, your regular royalty checks might even save the company.) But any engine that can be bolted to its ACE (AXI) coherent fabric is as good as any other.
That heterogeneity extends to the CPUs within each cluster. In a high-end configuration, all six would be 64-bit MIPS processors, but you could also stir in some 32-bit MIPS CPUs if you wish, or a DSP or two, or just about anything else. Let’s say you combine three 64-bit I6500 CPUs with a set of low-end MIPS M5100 cores. You could call it LARGE.tiny. Or something like that. You get the idea.
You can even mix CPUs (or other accelerators) that run at different voltages and frequencies. Some might be tuned for speed while others are optioned for low power consumption or low duty cycles. In which case, you’ll need level shifters, but that’s all handled by Imagination’s IP and (most likely) by your standard tools. You can even tweak voltage and/or frequency on the fly. Or, more accurately, the I6500 cluster architecture contains nothing that would prevent you from doing so.
Why six CPUs per cluster and not eight, or some other power of two? Ah, Grasshopper, that is because even the greatest of processors requires input and output, so two CPU slots in each cluster are reserved for an IOCU: an input/output controller unit. If you’re not familiar with the IOCU, it was introduced a few processors ago in other MIPS designs. It’s an automated I/O processor, designed to keep peripherals cache-coherent while offloading that chore from the main MIPS processor(s). Think of it as a cross between a programmable DMA controller and a cache manager. And you get at least two of them in each I6500 cluster; you can populate all eight CPU slots in a cluster with IOCUs if you like. It’s a good way to keep I/O traffic off of the main ACE fabric running throughout your system.
Internally, the I6500 isn’t much different from the existing I6400 CPU, which was introduced two years ago. Programmers and hardware designers alike won’t see much difference between the two. The I6500’s newness is all about its scalability, not any new microarchitectural tweaks.
Does that suggest that the I6400 is now obsolete, replaced by the much more scalable I6500? Eh, probably not. Old IP never dies, mostly because there’s no cost to keeping it around. It’s not as though it requires a lot of warehouse space. Most likely, Imagination will adjust the licensing and royalty rates to keep both cores attractive. New designers can use either core, depending on how much scalability they foresee needing. And as a famous programmer and business tycoon once pointed out, you can never predict how much an old design might have to scale up.