“Wagner’s music is better than it sounds.” – Mark Twain
They say seeing is believing. Speaking as a professional skeptic, I’ll withhold judgement until I’ve seen one of these things in real life. But I gotta tell ya, it’s looking pretty interesting so far.
The thing in question is the VISC processor from Soft Machines. You remember VISC from our earlier coverage in October, 2015; December, 2014; and November, 2014. It’s a high-performance processor that runs ARM binary code, but it’s supposed to be both faster and more power-efficient than ARM itself. Oh, and it’s probably cheaper, too.
Let that sink in for a moment. It runs ARM code, but does it better than ARM. How is that even possible? That’s like saying a copy of a Picasso painting is better than the original. Or that you can play Eruption better than Eddie Van Halen. Isn’t an actual ARM processor already the perfect device to execute ARM binaries, by definition?
Apparently not. At least, not according to the 250+ people on three continents working at Soft Machines. Their whole raison d’être is to out-ARM ARM with an entirely new kind of microarchitecture that’s both more efficient and more speedy than the CPUs coming out of Cambridge.
Sure, we’ve seen all this before. Not just ARM clones, but knockoffs of x86 processors, MIPS processors, and just about any other processor worth the time to reverse-engineer. Almost without exception, those clone makers have failed – at least in the commercial sense. A few CPU doppelgangers actually did work as advertised, but then got legislated out of existence. Chalk one up for the patent lawyers. More commonly, however, the clone CPU never lived up to its promised hype. Any number of companies can tell you that it’s pretty darned hard to make an x86 processor that’s better than Intel’s (or even AMD’s) in any useful way. You can get about 80% of the way there on startup capital and late-night engineering. It’s that last 20% that turns out to be tricky.
Customers don’t care how close you almost got to producing an interesting chip. They only care when it’s 100% done and 100% in their hands. Even then, they may not care enough to keep buying your device. “Good enough” is a legitimate technical target, but it’s not a compelling business strategy. If you’re going to peddle imitation ARM processors, they’d better be a whole lot better than the real thing in one or two significant ways. You don’t get bonus points for effort. It’s the deliverable that matters.
So that brings us to Soft Machines’ latest update on VISC. The first chip is due to tape out around the middle of this year, which is only a few months away now. Silicon is due to customers’ hands around Q3. In other words, we’re getting close. Close enough that we have some benchmark numbers.
Spoiler alert: The numbers are good. Very good, in fact. Another spoiler: There are a whole lot of caveats attached to the benchmark numbers, so don’t go thumbing your nose at your current ARM vendor just yet.
Let’s walk through the chart. You’ll see six curves: two for ARM-based processors, one for the Intel x86 processor, and three for Soft Machines. All six curves represent performance (on the horizontal axis) versus power consumption (vertical axis) measured in milliwatts. In this test, the single performance axis is really the geometric mean of two tests: SPECint2006 and SPECfp2006. But since SPEC itself is an amalgam of several discrete tests, you could say that the performance axis really represents a whole bunch of individual benchmark scores all shmoo’d together. It’s all so meta.
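If the geometric-mean bit is hazy, here is a minimal sketch of how two suite scores collapse into one number. The function name and the two scores are made up for illustration; they are not values from Soft Machines' chart.

```python
import math

def geometric_mean(scores):
    """Return the geometric mean of a sequence of positive scores."""
    if not scores or any(s <= 0 for s in scores):
        raise ValueError("scores must be non-empty and positive")
    # Average the logarithms, then exponentiate: the n-th root of the product.
    return math.exp(sum(math.log(s) for s in scores) / len(scores))

# Hypothetical scores, for illustration only (NOT taken from the chart).
specint_score = 16.0
specfp_score = 25.0

# One combined figure, which is what a single performance axis would plot.
combined = geometric_mean([specint_score, specfp_score])
print(f"combined score: {combined:.2f}")
```

Unlike an arithmetic average, the geometric mean keeps one outsized score from dominating the result, which is why benchmark suites tend to favor it.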
All six curves are, well, curves – not because we’re graphing on a funny log scale but because the processors’ power consumption grows faster than their performance does. That is, CPUs get less efficient the faster they go. No surprise there. We’ve known that for a long time. Haste makes waste.
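For a rough feel for why the curves bend, here is a back-of-the-envelope sketch using the textbook first-order model of dynamic power (capacitance times voltage squared times frequency). It assumes, purely for illustration, that supply voltage has to rise in proportion to clock frequency and that performance tracks frequency. The constants are invented, and none of this comes from Soft Machines' data or simulator.

```python
def dynamic_power(freq_ghz, volts_per_ghz=0.3, capacitance=1.0):
    """First-order dynamic power, P = C * V^2 * f, in arbitrary units.

    ASSUMPTION: voltage scales linearly with frequency, so power ends up
    growing with the cube of frequency. Constants are hypothetical.
    """
    voltage = volts_per_ghz * freq_ghz
    return capacitance * voltage ** 2 * freq_ghz

def efficiency(freq_ghz):
    """Performance per unit power, ASSUMING performance is proportional to frequency."""
    return freq_ghz / dynamic_power(freq_ghz)

# Doubling the clock under these assumptions:
power_growth = dynamic_power(2.0) / dynamic_power(1.0)   # power grows much faster...
efficiency_change = efficiency(2.0) / efficiency(1.0)    # ...so efficiency drops
print(f"power grows {power_growth:.1f}x, efficiency falls to {efficiency_change:.2f}x")
```

Real chips also burn leakage power and can't scale voltage indefinitely, so treat this as a cartoon of the trend, not a prediction for any of the six processors.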
In this chart, “good” is down and to the right. You want to be in that quadrant. That represents maximum performance with minimum power expenditure. It’s not possible, but that’s the ideal.
The takeaway here is how much farther to the right the three VISC processors are, compared to the ARM and x86 chips. That is, they deliver far more performance for the same power budget than the incumbents. For example, if you draw a horizontal line at the 2W mark, the three VISC processors deliver 1.7x, 1.9x and 2.4x better performance than the ARM Cortex-A72. Not bad.
Conversely, you can draw a vertical line to compare the six chips’ power consumption at a given performance level. If we drop a line right where the A72’s performance stops (at 3.4 GHz and about 17.0 on the performance axis), we intersect the beginning of the curve for the VISC cores, but at about one-quarter the power level.
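Both readings of the chart, the horizontal line at a fixed power budget and the vertical line at a fixed performance level, amount to interpolating along each curve. The sketch below shows the mechanics with linear interpolation. The two curves and the helper names are hypothetical, invented for illustration; they are not digitized from the chart and say nothing about the real chips.

```python
def interpolate(x, xs, ys):
    """Linearly interpolate y at x; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the sampled curve")
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# HYPOTHETICAL (performance score, power in mW) samples, not real data.
incumbent = [(4.0, 500.0), (8.0, 1500.0), (12.0, 4000.0)]
challenger = [(8.0, 500.0), (16.0, 1500.0), (24.0, 4000.0)]

def perf_at_power(curve, milliwatts):
    """Horizontal line: performance delivered at a fixed power budget."""
    perf, power = zip(*curve)
    return interpolate(milliwatts, power, perf)

def power_at_perf(curve, score):
    """Vertical line: power needed to reach a fixed performance level."""
    perf, power = zip(*curve)
    return interpolate(score, perf, power)

# Iso-power comparison at a 2 W (2000 mW) budget.
speedup = perf_at_power(challenger, 2000.0) / perf_at_power(incumbent, 2000.0)

# Iso-performance comparison where the incumbent's curve tops out.
power_ratio = power_at_perf(challenger, 12.0) / power_at_perf(incumbent, 12.0)

print(f"speedup at 2 W: {speedup:.2f}x, power at equal performance: {power_ratio:.2f}x")
```

Note that the interpolation refuses to go beyond the sampled points. Extending a curve past what was actually measured is extrapolation, which is a different and shakier exercise.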
Pretty nifty. Now for the caveats.
There’s good news and bad news. The good news is that Soft Machines used real chips, not simulations, to gather benchmark scores for its competitors. The Cortex-A72 came out of a Mediatek MTK8173, the processor in Amazon’s Kindle Fire TV. The Apple A9 was from an iPad, and Intel’s Skylake (Core i5-6200U) came out of a Dell Inspiron 13-7359. So they’re all real chips. Except when they’re not; the three VISC processors are simulated, of course, because Soft Machines won’t have silicon for another few months.
The other good news is that all the scores were taken using the same gcc compiler (more on this later), with the exception of the Apple A9, which used Apple’s compiler. In all fairness, Apple’s compiler produces slightly better scores than does gcc, so kudos to Soft Machines for giving the competition a small assist.
Now for the less-good news. All the scores have been scaled, massaged, or tweaked in some way in the name of “fairness.” For starters, all the benchmarks were run single-threaded; there are no multithreaded scores, which is a bit counterintuitive given VISC’s threading acumen. Indeed, all of the processors here are comfortably multithreaded, so confining them to single-threaded execution seems like running the Olympic 100-meter dash as a one-legged sack race. So there’s that.
Second, five of the six CPUs, both real and imagined, are nominally running with 2 MB of cache, even if the actual chips have smaller or larger caches. (The sixth CPU, VISC Tahoe, was simulated with 4 MB of cache.) Again, this is all in pursuit of fairness, but customers don’t buy fairness. If my Intel Core i5 comes with 3 MB of cache – which it does – then by golly, I want that reflected in its benchmark scores.
Similarly, all the processors were virtually scaled to a 16nm CMOS process, regardless of how they’re actually manufactured. This amounts to a promotion for the Mediatek chip, which is actually built on 28nm, but it probably doesn’t affect either the Apple A9 or the Intel Core i5, which comes out of Intel’s 14nm line. Because the scaling factor was so small, Soft Machines felt confident that they weren’t misrepresenting the competition’s chips.
Finally, the performance of the Cortex-A72 is theoretical. It doesn’t actually run at 3.4 GHz, as shown in the chart. Real Mediatek MTK8173s poop out at about 1.5 GHz, so fully half of that performance curve is bogus. But at least it’s the upper half; Soft Machines is extrapolating the A72’s performance so that it can play in the same ballpark as VISC. If anything, they’re being generous. But it’s still not real.
That’s a whole lot of small print. Whether Soft Machines’ data massaging is fair or not is a judgement call that I’ll leave up to you. In an effort to compare (ahem) apples to apples, the company may have instead just clouded the issue. I don’t like to see too many asterisks on a document I’m about to sign. Good, bad, or indifferent, I like to see my benchmarks neat and clean and unencumbered by footnotes.
Now for the final bit of good news. The three simulated VISC processors were all running ARM binaries – which, if you recall, is sort of the point. But that’s still worth repeating. They weren’t running ARM source code that had been recompiled using a magical Soft Machines compiler that understands the VISC instruction set. No, they ran ARM binaries, straight from gcc. The very same binaries that the ARM-based Mediatek chip ran. (The Apple chip used Apple’s compiler.)
So even if you squint hard and take these numbers with a large grain of salt, it still looks to be an impressive showing. Simulating power consumption is a tricky thing, and the devil so often lies in the details, so I’ll withhold judgement on that aspect of the equation. But, leaving that aside, the performance numbers alone look pretty good. If these benchmarks are to be believed (and if SPEC2006 is at all relevant to your workload), then VISC should be able to double the performance of a high-end ARM processor or a midrange Intel x86 processor. So unless the “fairness” tweaking that went into these scores has horribly misrepresented everything, it looks like Soft Machines is on the right track. Maybe, in a few more months, we’ll know if they’ve broken The Curse of the Compatible Processors.