No less an authority than the United States Supreme Court just ruled that a program’s application programming interface can be copied under the doctrine of copyright “fair use.” Google copied thousands of lines of Oracle’s code in order to implement its own version of the Java API without actually licensing the official Java API. The Court ruled that Google didn’t need a license because that copying qualified as fair use.
In a sense, this was no big deal because most programmers were already operating under this assumption. It feels logical, and it generally passes the programmers’ sniff test. An API isn’t like “real” code. It’s just an interface, a definition or specification, and not the same as code that carries out the actual function.
Indeed, the lawyers arguing the case made this distinction, too. They described some parts of Java as “declaring code” and other parts as “implementing code.” The APIs fell into the former category, while the bulk of the software was in the latter. Here’s how the official decision describes it:
“…the declaring code’s shortcut function is similar to a gas pedal in a car that tells the car to move faster or the QWERTY keyboard on a typewriter that calls up a certain letter when you press a particular key. As those analogies demonstrate, one can think of the declaring code as part of an interface between human beings and a machine.”
There you have it. It’s an interface.
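To make the declaring/implementing split concrete, here is a minimal Python sketch. It is my own illustration, not anything from the case: the names `MaxApi`, `BranchingMax`, `ArithmeticMax`, and `largest` are all hypothetical. The declaration fixes only a name, its arguments, and its return type, and two unrelated implementations satisfy it.

```python
from typing import Protocol, Sequence


class MaxApi(Protocol):
    """Declaring code: a name, its arguments, and its return type. No logic."""

    def max(self, a: int, b: int) -> int: ...


class BranchingMax:
    """Implementing code, version one: compare and pick."""

    def max(self, a: int, b: int) -> int:
        return a if a >= b else b


class ArithmeticMax:
    """Implementing code, version two: same declaration, unrelated body."""

    def max(self, a: int, b: int) -> int:
        # a + b + |a - b| is always twice the larger value.
        return (a + b + abs(a - b)) // 2


def largest(api: MaxApi, values: Sequence[int]) -> int:
    """Caller code that relies only on the declaration, not on either body."""
    result = values[0]
    for value in values[1:]:
        result = api.max(result, value)
    return result
```

Code written against `MaxApi` runs unchanged on either implementation, which is roughly why programmers tend to see the declaration as an interface rather than as “real” code.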
So… does this interface definition extend to hardware, too? Is the pinout of a device subject to copyright, or the protocol of a serial bus, or the way a DRAM works? To blur the distinction even more, is a microprocessor’s instruction set also an interface, and, if so, is it protected under copyright law? And what effect might such a legal decision have on programming?
I would argue (in a decidedly non-judicial way) that an ISA is an interface. It’s where the software meets the hardware. Without a defined instruction set there’s no way to program a CPU or MCU. That’s what programming is.
Now, the mnemonics of that instruction set are a different matter. The processor doesn’t care if you call the MOV instruction MOV or MOVE or mov-it or Loretta. (“I wish I’d spelled creat with an ‘e’,” said Ken Thompson, co-creator of Unix.) Mnemonics are purely a human-readable shorthand for assemblers and debuggers. The chip never sees them, of course. You could rename every single instruction in your ISA and the chip would be none the wiser. You’d have a miserable time writing code, though.
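A toy assembler shows why the chip can’t tell the difference. This is only a sketch under stated assumptions: the three opcodes and their byte values are invented for illustration and don’t correspond to any real processor’s encoding, and `assemble` is a hypothetical helper.

```python
# Hypothetical 8-bit opcodes for an invented ISA (not any real chip's encoding).
OPCODES = {"MOV": 0x01, "ADD": 0x02, "JMP": 0x03}

# The very same opcodes, with every mnemonic renamed.
RENAMED = {"LORETTA": 0x01, "PLUS": 0x02, "GOTO": 0x03}


def assemble(source, mnemonics):
    """Translate lines of 'MNEMONIC operand' into opcode/operand byte pairs."""
    out = bytearray()
    for line in source:
        name, operand = line.split()
        out.append(mnemonics[name])         # the mnemonic disappears here
        out.append(int(operand, 0) & 0xFF)  # one-byte operand, decimal or hex
    return bytes(out)


official = assemble(["MOV 0x10", "ADD 5", "JMP 0"], OPCODES)
renamed = assemble(["LORETTA 0x10", "PLUS 5", "GOTO 0"], RENAMED)
```

Both programs assemble to the same six bytes. The names exist only in the lookup table on the human side of the assembler.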
Interestingly, you can publish a book of other people’s instruction sets. I’ve done it. The ISA is entirely somebody else’s creation, but writing it all down is permitted under fair use. You do have to write your own original descriptions of what each instruction does, and you have to draw your own diagrams, but the mnemonics themselves and what they do are not protected.
Can you protect the instruction set of a processor? Case history suggests you cannot. There are obviously copies of Intel’s x86 instruction set as well as previous MIPS workalikes, off-brand ARM chips, 8051 replicas, plus other clones, doppelgängers, knockoffs, reproductions, and imitations of almost every CPU or MCU you can think of. Some of these ISAs are harder to duplicate than others, and some have been more successful than others, but that’s beside the point. Those are technical and commercial concerns, respectively. Creating a CPU that runs someone else’s code is usually on solid legal ground.
The exception is when you duplicate the internal workings of the processor. That’s not okay. More than a few ARM and MIPS knockoffs met their end because they followed too closely the circuitry required to implement certain operations, not because of the operations themselves. They ran afoul of patent law, not copyright law. Simply having the FOOBAR instruction in your ISA repertoire is generally okay, but you have to implement it with “clean room” hardware that doesn’t step on someone else’s patents. That’s where it gets tricky. Intel and AMD processors have nearly identical instruction sets, but their internal microarchitectures are very different.
Let’s muddy the waters further. What if the processor is microcoded? Complex processors like those from Intel and AMD crack their complex opcodes into simpler micro-operations (uops). A single x86 instruction might translate into several uops that run under the direction of a uop sequencer. That sequencer is programmed; its firmware lives in ROM inside the processor. So, your Ryzen or Xeon is really running code that enables it to run code. It’s perpetually in x86 emulation mode. The “real” instruction set – the uops – is an internal secret.
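Here is a rough sketch of what “running code that enables it to run code” might look like. Everything in it is assumed for illustration: the instruction names, the uop names, and the `MICROCODE_ROM` table are invented, since real uop formats aren’t public.

```python
# Hypothetical microcode ROM: each programmer-visible instruction maps to a
# sequence of simpler micro-ops. All names here are invented.
MICROCODE_ROM = {
    "ADD_MEM_REG": ["LOAD_TMP", "ADD_TMP", "STORE_TMP"],
    "INC_MEM": ["LOAD_TMP", "INC_TMP", "STORE_TMP"],
}


def run_uop(uop, state, addr, reg):
    """Execute one micro-op against the machine state."""
    if uop == "LOAD_TMP":
        state["tmp"] = state["mem"][addr]
    elif uop == "ADD_TMP":
        state["tmp"] += state["regs"][reg]
    elif uop == "INC_TMP":
        state["tmp"] += 1
    elif uop == "STORE_TMP":
        state["mem"][addr] = state["tmp"]
    else:
        raise ValueError(f"unknown uop: {uop}")


def execute(instruction, state, addr, reg=None):
    """Sequencer: look up the instruction's uops and run them in order."""
    uops = MICROCODE_ROM[instruction]
    for uop in uops:
        run_uop(uop, state, addr, reg)
    return uops


state = {"mem": {0x20: 7}, "regs": {"r1": 5}, "tmp": 0}
trace = execute("ADD_MEM_REG", state, addr=0x20, reg="r1")
```

The programmer asks for one instruction and sees one result in memory. The three uops in `trace` never leave the chip, and the table could be rewritten tomorrow without breaking anyone’s code.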
Can you copy that? Is it protected code – the “implementing code” in the above legal definition – or is it simply an interface of “declaring code”? Nobody outside of Intel’s or AMD’s own engineers will ever see or use that microcode, and they can presumably change it any time they want to. The uops are completely invisible to normal programmers, who often don’t know they exist. So, the question may be moot. It’s not much of an interface if nobody interfaces to it. If a microprocessor crashes in the forest and no one’s there to reboot it…
Looking again at the Google v. Oracle decision, the Supreme Court said:
“The doctrine of ‘fair use’ is flexible and takes account of changes in technology. Computer programs differ to some extent from many other copyrightable works because computer programs always serve a functional purpose. Because of these differences, fair use has an important role to play for computer programs by providing a context-based check that keeps the copyright monopoly afforded to computer programs within its lawful bounds.”
They’re saying that programming isn’t the same as artwork, music, photography, stage plays, or movies because it’s functional, not decorative. Thus, the fair use doctrine is needed to prevent computer programs from being completely off-limits. It “keeps the copyright monopoly… within its lawful bounds.”
The Court also hints at an answer to our question of protecting instruction sets:
“For most of the packages in its new API, Google also wrote its own declaring code. For 37 packages, however, Google copied the declaring code from the Sun Java API. As just explained, that means that, for those 37 packages, Google necessarily copied both the names given to particular tasks and the grouping of those tasks into classes and packages.”
The key part here is, “Google necessarily copied… the names given to particular tasks…” That makes it sound like cloning mnemonics is okay, which we kind of already knew. After all, AMD uses the same mnemonics as Intel, even for new instructions that were added long after those companies’ licensing agreement expired. And other x86 chipmakers use the same mnemonics, too.
If copying the mnemonics is okay, is copying the binary encoding okay? That’s what really matters, since mnemonics are only human-readable, but opcodes are machine-readable. Precedent would suggest that it’s legal. We wouldn’t have binary compatible processors otherwise.
Bottom line, copyright law has little to say about microprocessors. You can’t prevent someone from describing your instruction set, or even duplicating its essential functions. In both human- and machine-readable form, cloning an existing ISA is fair game. Or fair use.
Patent protection still applies, of course, and it’s the more relevant concern in most cases. You can’t copy another processor’s circuitry if such circuitry is covered by patent. (You can if it’s not patented.) For most basic CPU operations, that’s not too onerous. It’s pretty easy to build an adder or a shifter without violating patents. Complex instructions are harder, and sometimes an instruction seems to be deliberately designed to lay legal landmines in front of the clone makers.
The third way to protect IP is to treat it as a trade secret. Patents and copyrights both require disclosure. You get limited legal protection in return for describing your technique well enough that others can duplicate it after that protection expires. Trade secrets offer no such protection, but also no disclosure. It’s… a secret. That’s a good approach for recipes (e.g., Coca-Cola) and magic tricks, but risky for electronics, which can often be reverse engineered. If someone discovers your trade secret by legitimate means, such as reverse engineering, you have no recourse.
The Supreme Court decision has clarified some software issues but seems to have little bearing on hardware, or the interface between hardware and software. That’s probably a good thing. Microcoded or hardwired, patented or open source, it looks like copyright law doesn’t apply to CPUs. Until it does.