“People who like eating sausage, and respect the law, should never see either one being made.” – John Godfrey Saxe
The one certainty in law is that it’s unpredictable. Yesterday’s laws didn’t anticipate today’s technical quandaries, so we call upon the courts to apply sober judgment and insightful wisdom. Did the Founding Fathers mean for us to stream recorded music via monthly subscription? Are the engineers making autonomous vehicles liable for their mishaps? And what do we do about APIs? Are they mechanical contraptions, works of art, tools, or something else?
The long-running saga of Oracle v. Google (now in its tenth year) has been decided – then overturned and redecided – by several courts already and may well be headed all the way to the U.S. Supreme Court for a final decision. That gives you some idea how complex and thorny this case is.
At issue is whether Google stole Java code from Oracle. At least, the case started out that way. Now, it’s come down to whether software APIs are protected under copyright law, like books and magazines. And, if they are, does that also mean you can copy short pieces of software and call it “fair use?”
First, some ground rules. Justice is blind, so no fair rooting for your favorite company to prevail just because you like them better. The fact that you like Company A better than Company B should have no bearing on the merits of the case.
Second, just because something seems right or reasonable doesn’t mean it’s legal. Standing back and squinting and then turning a thumbs-up or -down ain’t how the legal process is supposed to work. It’s precisely to avoid that kind of gut-level decision-making that we write down laws in the first place.
And third, just because a legal ruling would cause bad things doesn’t mean it’s wrong. You can’t judge the legality of something by its side effects.
Here’s what happened. Google started creating a new operating system called Android – perhaps you’ve heard of it? – and approached Oracle about getting a Java license. Problem was, Java licenses make you promise to remain compatible with all the other Java licensees, and Google didn’t want to do that, so they declined. Google wanted Android to be sort of like Java without actually being Java-compatible. They reckoned that launching a new OS from scratch would be hard work, but if Android had similar APIs to Java it would be more attractive and approachable to programmers. Why reinvent that particular wheel?
As any Android developer will tell you, it isn’t like vanilla Java. Sure, a lot of the APIs are similar, but Java apps don’t run on Android or vice versa. Google wrote nearly everything from scratch, including the code that implements the Java-esque functions. What Google did copy were the declarations for 37 Java API packages, which it reused in Android. Hence, Android looks a lot like Java from the outside, but not from the inside.
Oracle cried foul, alleging that Google had appropriated the Java APIs without a Java license. Google countered that Android was original work, including the code that implemented the disputed APIs. The APIs themselves can’t be stolen, Google argued. They’re an interface – it’s right there in the name – and interfaces aren’t covered by copyright, patent, or intellectual-property law. There’s nothing here to steal, thus no wrongdoing.
Oracle disagrees, and so do some of the courts. U.S. copyright law says that any original work is automatically copyrighted the moment it’s created. Unlike patents, you don’t have to apply for copyright – it just happens. The law is most often applied to written material (books, magazines, online articles, blog posts, etc.) but it also bears on music, movies, stage plays, dance, paintings, sculpture, sound recordings (not just music), photography, and even architectural works. Yes, you can copyright a building. By that reasoning, all software is covered under copyright the same way that writing is, and the courts have generally agreed.
But the law goes on to say, “In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.” (It’s all in Title 17 of the U.S. Code, Section 102(b), if you’re interested.)
In other words, you can copyright implementations but not methods. Copyright protects the words in a book, or the code on a screen, but not the plot of the novel or the purpose of the program. Patents work in a similar way: they protect the exact implementation, not the idea behind it, no matter how original.
Same goes for software: copyright protects your code from unauthorized copying, but not the general task it accomplishes. When Section 102(b) says the law does not protect a “…procedure, process… [or] method of operation…” that sounds a lot like software. So, where does that leave code copyright?
Oracle argues that Android’s APIs are too similar to Java’s APIs to be coincidence; Google must have copied them. Indeed, the courts ruled that Google copied over 11,000 lines of Oracle’s source code in creating its own Android APIs. Sounds like plagiarism, and a punishable offense. But Google argues that the code in question was only in the declaration of the APIs, not in their implementation, and therefore not subject to copyright law at all.
But isn’t all code automatically subject to copyright? Maybe… maybe not. That’s a gray area, and the crux of the legal battle. If the API declaration is an interface, then it’s more like a mechanical interface – say, the steering wheel and pedals in a car – and therefore not protected by copyright. The interface alone does no work. It’s just a definition, a boundary, a declaration. “This is what the function will do,” not “this is how it will be accomplished.” There’s no functional code to protect, argues Google.
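To make the declaration-versus-implementation distinction concrete, here is a minimal Java sketch. Every name in it (`Comparer`, `BranchingComparer`, `ArithmeticComparer`, `ApiDemo`) is made up for illustration; none of it is code from Java or Android. The interface is the “what,” and the two classes are independently written versions of the “how.”

```java
// Hypothetical names throughout; this is not code from Java or Android.

// The declaration: it names the operation, its inputs, and its result type,
// but contains no instructions for how to compute anything.
interface Comparer {
    int max(int a, int b);
}

// One implementation of the declaration, using a plain comparison.
class BranchingComparer implements Comparer {
    @Override
    public int max(int a, int b) {
        if (a >= b) {
            return a;
        }
        return b;
    }
}

// A second, independently written implementation of the same declaration.
class ArithmeticComparer implements Comparer {
    @Override
    public int max(int a, int b) {
        // max(a, b) == (a + b + |a - b|) / 2; long arithmetic avoids int overflow.
        long sum = (long) a + (long) b;
        long gap = Math.abs((long) a - (long) b);
        return (int) ((sum + gap) / 2);
    }
}

class ApiDemo {
    public static void main(String[] args) {
        Comparer[] implementations = { new BranchingComparer(), new ArithmeticComparer() };
        for (Comparer c : implementations) {
            // The caller sees only the declaration; either implementation will do.
            System.out.println(c.getClass().getSimpleName() + ": " + c.max(3, 7));
        }
    }
}
```

Both classes satisfy the same one-line declaration with entirely different bodies, and the calling code can’t tell them apart. Whether that one line can be owned is, roughly, what the dispute is about.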
Poppycock (or words to that effect), say Oracle’s lawyers. Code is code, and you swiped our code. Worse, you implemented our Java APIs and then used them against us by deliberately making Android incompatible with Java. You stole our software, hurt our business, and slowed the adoption of Java at a crucial moment in the mobile market.
In the world of familiar everyday copyrights, you’re allowed to quote short passages from a copyrighted work. That’s called “fair use,” and it’s what allows you and me to repeat favorite lines from a movie, or to quote passages from a book. Fair use allows book reviewers to reprint a paragraph or two from a novel, and it lets TV news programs air brief clips of video without licensing hassles. Overall, fair use is the loose gentlemen’s agreement that keeps us all honest and unencumbered by lawyers.
But does fair use extend to software? And if so, how much code am I allowed to borrow before it’s not fair anymore? Google is arguing that, even if APIs are covered by copyright – and the company is not prepared to admit that they are – then the doctrine of fair use must apply as well. That is, we’re all allowed to borrow snippets of code from time to time without legal ramifications. So, the 37 disputed APIs are just fair use.
Another, somewhat weaker, argument is that of “transformative use.” You’re given some legal leeway if you “transform” someone else’s music, book, or photograph in an artistic and original way. Rappers call it sampling. Like fair use, transformative use allows artists to create original works that recall previous works. You don’t have to ask permission to paint a picture of Andy Warhol painting a soup can. There’s obviously a line between “transforming” someone else’s art and just copying it, but we’ll leave that to the art connoisseurs. In Google’s case, it claims that Android is sufficiently different from Java that any code it borrowed was legally transformed, and therefore exempt. Others argue that Android and Java were essentially competitors, different but equivalent, and operated in the same markets. Thus, no transformation.
And that’s where the case stands. Some courts have found in favor of Google, but then had their rulings overturned. Others have declared Oracle the victor, but then were reversed. Clearly, there’s no clarity here.
It’s tempting to think about what we’d like to have happen, but that would violate the rules we set ourselves going in. It doesn’t matter what we want; what matters is the law. If the law doesn’t produce the outcome you want, change the law. It’s a cumbersome process, but it’s meant to be. One shouldn’t change laws capriciously.
Having said all that, I think APIs should be free. The whole point is to act as an agreed-upon interface layer between programmers; a middle ground where they and their code can meet. Without open APIs it would be hard to write any applications at all, and the API owner would become code czar, controlling who can and can’t create programs for their operating systems. (There are other ways to control third-party software, if that’s what you want.) Worse, we might get “API trolls” who search for programs with unlicensed function calls. The interface layer is a large part of the value of any OS.
It’s fun to predict the outcome of these legal rulings, because we’re always so wrong and left wondering what happened. Regardless, it’s sure to generate comments and debate among legal experts and programmers alike. But that’s kind of the point. We don’t all agree on what’s right and reasonable, so we employ courts as independent third parties to calmly and bloodlessly adjudicate these things. I know what I want to have happen. Now let’s see how it actually plays out.
UCB used the same strategy when they took V7 Unix source code, gutted all the actual code, leaving just function statements and data declarations, and re-wrote every function for FreeBSD. This certainly doesn’t meet the third-party clean room standard, but again, it’s not strictly required by the law either. And it was hard to claim that any of the students/staff/faculty/alumni involved were not tainted with direct knowledge of UNIX source code internals.
At the time, everybody was anti-Bell, and it was a horrible disaster in the USENIX forums for big bully Bell to try to reclaim their IP rights from UCB. That was actually the third flip-flop in the courts for UNIX, where copyright and proprietary non-disclosure were still being fought out for protection of software.
The interesting part is that while the Open Source, Free Software movement won against Bell for UNIX IP, the win at the same time completely guts the protections they think they have in Open Source copyright-based licenses.
They did prove that you don’t need a clean room, and that you can take anybody’s source code … examine it carefully, gut the code back to declarations and functions, and rewrite it.
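As a rough illustration of what “declarations only” looks like, here is a small Java sketch. Note the swapped-in technique: it uses reflection on a compiled class rather than editing source text by hand, and the `Original` and `DeclarationLister` classes are hypothetical stand-ins, not anything from UNIX, Java, or Android.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for "somebody's source code".
class Original {
    public static int clamp(int value, int low, int high) {
        return Math.max(low, Math.min(high, value));
    }

    public static boolean isEven(int n) {
        return (n & 1) == 0;
    }
}

class DeclarationLister {
    // Returns the public method declarations of a class, sorted, with no bodies.
    static List<String> declarationsOf(Class<?> type) {
        List<String> declarations = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers())) {
                // toGenericString() gives modifiers, return type, name, and parameter types.
                declarations.add(m.toGenericString());
            }
        }
        Collections.sort(declarations);
        return declarations;
    }

    public static void main(String[] args) {
        for (String declaration : declarationsOf(Original.class)) {
            System.out.println(declaration + ";");
        }
    }
}
```

What comes out is the skeleton: names, parameter types, and return types, with none of the logic. Rewriting each body against that skeleton is the step described above.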
The current fight is just the next stage in that evolution …
As a side note … do not code with lots of small functions; that just makes this form of rogue reverse engineering a piece of cake. If a function is called only once or a few times, expand it inline manually, except where recursion is used. When you are done, most functions should contain several sequential nested loops that are core to your algorithms. If you have any shot at defending against IP theft with copyright, it will be because of the complex nature of these compound nested looping structures. There are hundreds of ways to write any particular complex algorithm. With very small functions, there may really be only one or two ways to do the same task, defined by the data and function call. Large loopy functions are also a nightmare to reverse engineer back from machine code.
Many years ago I was hired to provide a clean room specification for a common piece of industrial equipment that the manufacturer had ceased updating/supporting. Threw a disassembler at a very large mass of binary code, and discovered an interesting pop-code based execution scheme that turned out to be Forth. Spent a few days with some Perl scripts and C code to translate the disassembly back into dirty Forth. Took a few hours with a logic analyzer on the machine to identify the primary control flow and document global variables, including all the setup variables. Two weeks after that, we delivered a clean room spec with pseudo code for the core machine operation. The client’s in-house team did the implementation, and they were running on their own code 4 months from start, including some critical automation upgrades that saved them from an upgrade and obsolescence cycle costing several tens of millions of dollars. They then purchased several dozen more of the machines off the used market at salvage prices, to expand operations with very competitive pricing.
It’s a bit harder to do that these days.