
We’re the People’s Front of Judea!

Religion Intrudes on Microprocessor Design

Is it better to remain philosophically pure, or to tolerate outsiders in your midst? That’s the question posed by noted computer scientist David May in a recent talk on computer architecture.

May, who holds the dual roles of professor of Computer Science at the University of Bristol (England) and CTO of microprocessor firm XMOS, recently gave a talk entitled, “Heterogeneous Processors: Why?” Simultaneously blunt and obscure, the title refers to mixing two or more processor architectures on a single chip, or in a single system. (“Heterogeneous” being the currently trendy word among computer weenies for “different.”) In his talk, May makes the case that we don’t need to mix different processor types. Instead, we should make do with just one. Moreover, the processors we have today are too complex and really ought to be RISC-ified, stripped down to their bare essentials.

Those of you with long and nerdy memories will recognize this as classic David May. He’s the guy who developed the Transputer (exactly 30 years ago, as it happens), a remarkably streamlined microprocessor that relied on parallel processing. The Transputer even came with its own parallel programming language, Occam. Neither Occam nor the Transputer was terribly successful commercially, but they both laid the groundwork for many of the parallel structures we have today.

Now he’s advocating the same thing for today’s microprocessors, suggesting that even the wildly successful ARM architecture has forgotten its RISC roots and is corrupted with complexity. Indeed, May is working on a project to pare down an ARM processor to a mere 30-odd instructions and nothing else. (I imagine his CS interns bent over a lab bench in a crowded basement laboratory, backed by crackling Tesla coils and occasional claps of thunder, cutting away unneeded ARM instructions until only the core remains. “Come quickly, Doctor – it’s alive!!!”)

May’s thesis is that simpler processors breed simpler compilers and other tools, and that sticking to just one processor architecture means that only one set of such tools is required. Coding is easier when everything uses the same language, compiler, and toolchain. Mixing processors, in contrast, contaminates that glorious monoculture. You’ll need different compilers for the different instruction sets, and different tools for the different processors. Why invite gratuitous complexity?

Um, because it works, that’s why.

Far be it from me to argue with a distinguished professor of computer science at a major university… but here goes. Complexity works. It works really well. The “rainbow coalition” approach beats vanilla homogeneity every time. And RISC – real RISC – was a dumb idea anyway.

Back in the year MCMLXXXII we saw two seminal RISC architectures, MIPS and SPARC, pop out of Stanford University and UC Berkeley more or less simultaneously. These and other RISC designs focused on simplicity and minimalism, on the theory that “fast and light” would beat “slow and steady” by offering better performance, reduced power consumption, and smaller die size (i.e., less silicon). Like Danish Modern furniture, minimalism was trendy for a time and RISC seemed like the thinking person’s way forward. Comparatively complex chips (retroactively named CISC) were seen as outdated and overly byzantine. As the poster child for old-school complexity, Intel had a target painted on its back, while MIPS and SPARC (and others) spun off from their respective universities and became profitable independent companies.

Except it didn’t work. For starters, the RISC upstarts never delivered the big performance improvements they promised. At best, they were a few percentage points ahead of Intel’s (and AMD’s) x86 chips. “Just wait until the next generation,” the RISC faithful promised. “Then the scales will fall from your eyes and you will be a true believer.”

Then the RISC designers quietly started adding more complicated instructions. Because, you know, they had to get actual work done. Hardware instructions for multiplication and division crept in, then misaligned memory accesses, then floating-point math and vector operations, and so on. Down the slippery slope they went, until now it’s hard to tell a RISC processor from a CISC design. The difference is largely notional, and has more to do with marketing gloss than with any actual technical distinction. Like that Danish furniture, it’s described as “modern” even though it’s older than many of us.

What the RISC designers discovered, and what x86 designers already knew – but what Bristol’s CS department has yet to realize – is that hardware instructions serve a purpose. They’re not a gratuitous waste of transistors or a full-employment act for CPU designers and manual writers. They’re the most effective and efficient way to get a job done. Truly reduced instruction sets of the kind Prof. May is considering are enormously inefficient. RISC programs require a lot of memory because so many instructions are needed to accomplish simple tasks. Whatever transistors you save in the CPU core are spent many times over in fabricating DRAM cells. Any energy saved within the simple processor is squandered in exercising the memory bus. Programmers perform embarrassingly unnatural acts to get around the moral purity of an instruction set that seems designed to thwart efficiency. Like an Orthodox Jew on the Sabbath, RISC chips eschew real work.
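
To make that concrete, consider something as humble as integer multiplication. The sketch below is purely my own illustration (no particular real instruction set, and nothing from May’s talk): it shows roughly what a compiler or runtime library has to synthesize when the hardware offers only adds, shifts, and branches, with no multiply instruction in sight.

    #include <stdint.h>

    /* Hypothetical illustration only: multiplying two numbers on a core that
     * has no hardware MUL. The job becomes a shift-and-add loop of up to 32
     * iterations, each one burning instruction fetches, a branch, and energy.
     * A core with a hardware multiplier does the same thing in one instruction. */
    static uint32_t soft_mul(uint32_t a, uint32_t b)
    {
        uint32_t result = 0;
        while (b != 0) {
            if (b & 1u)       /* low bit of b set: accumulate the shifted multiplicand */
                result += a;
            a <<= 1;          /* advance to the next bit position */
            b >>= 1;
        }
        return result;
    }

One instruction’s worth of work becomes dozens of fetched, decoded, and executed instructions, plus the extra program memory to hold them. Scale that up to division, floating point, and vector math, and you can see where the transistors “saved” in the core actually went.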

If all processors looked the same, we’d have – What? – an ARM Cortex doing graphics? And network packet inspection? And encryption and decryption? With 30 instructions? Yes, that’s technically doable, as John von Neumann proved, but it’s hardly an efficient use of any conceivable resource. Not of transistors, not of memory, not of energy, and certainly not of people’s time. Bending over backwards to use a stripped-down CPU for complex tasks purely to conform to some abstract concept of architectural hygiene is the silliest kind of zealotry. 

I do like the idea of using Prof. May’s reductionist ARM as a teaching tool, however. For incoming CS students, it’s entirely proper and fitting that they should learn on a stripped-down, minimalist microprocessor design to get acquainted with the concepts. You don’t start a Mechanical Engineering student on a box-girder bridge before teaching the basic structural concepts. Once the Comp Sci weenies have been weaned off mini-ARM, they can start using normal, complex chips. Like a real ARM. Or MIPS, SPARC, PowerPC, or x86. Heck, they’re all RISC now.

3 thoughts on “We’re the People’s Front of Judea!”

  1. With all due respect, you are guilty of the sin of which you are accusing David – simplifying. Did you sit through the presentation? The link is in my column from last week.

  2. Yet another excellent article, Jim. Most of all, I like the statement “… that hardware instructions serve a purpose. They’re not a gratuitous waste of transistors or a full-employment act for CPU designers and manual writers.”

    One rationale for the RISC architecture was that roughly 70% of software code consists of load/store operations. However, to my knowledge, nobody investigated what percentage of CPU time the remaining 30% consumes. That may be one of the reasons pure RISC architectures failed.

  3. Doing good engineering requires using the appropriate solution for the problem at hand and there are times when a true RISC architecture is appropriate.

    For example, an FPGA-based microcontroller may need to be small and have a high clock speed because the requirements are to (1) minimize the resources consumed, (2) not adversely affect build time, and (3) easily meet clock constraints so as to simplify the FPGA system architecture.

    I designed a stack-based microcontroller for FPGA housekeeping and used a RISC architecture because it allowed me to satisfy these requirements. This wouldn’t have been possible with a CISC architecture.
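
For readers who haven’t met a stack machine, the fragment below is a rough C model of the general idea. It is purely illustrative: a core like the one described above would be written in an HDL, and none of these opcode names come from the comment. The point is how few zero-operand opcodes such a housekeeping core needs, which is why it can stay small and fast in FPGA fabric.

    #include <stdint.h>

    /* Purely illustrative model of a tiny stack-machine instruction set.
     * A real FPGA housekeeping core would be written in an HDL; the opcode
     * names and encoding here are invented for the sketch. */
    enum op { OP_PUSH, OP_ADD, OP_AND, OP_DUP, OP_DROP, OP_JZ, OP_HALT };

    typedef struct { uint8_t op; uint16_t arg; } insn_t;

    static uint16_t run(const insn_t *prog)
    {
        uint16_t stack[16];
        int sp = -1;                  /* top-of-stack index */
        int pc = 0;                   /* program counter */

        for (;;) {
            insn_t i = prog[pc++];
            switch (i.op) {
            case OP_PUSH: stack[++sp] = i.arg;               break;
            case OP_ADD:  stack[sp - 1] += stack[sp]; sp--;  break;
            case OP_AND:  stack[sp - 1] &= stack[sp]; sp--;  break;
            case OP_DUP:  stack[sp + 1] = stack[sp]; sp++;   break;
            case OP_DROP: sp--;                              break;
            case OP_JZ:   if (stack[sp--] == 0) pc = i.arg;  break;
            case OP_HALT: return stack[sp];
            }
        }
    }

    /* Example: { OP_PUSH, 2 }, { OP_PUSH, 3 }, { OP_ADD, 0 }, { OP_HALT, 0 }
     * leaves 5 on top of the stack. */

Because every instruction implicitly addresses the top of the stack, the decoder and datapath stay trivially small, which is exactly the property the design above relies on.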

