Damn You, Autocorrect!

Resist the Urge to Make Your Technology Too Helpful

There’s a blog post making the rounds from a German PhD student who discovered that some photocopiers can spontaneously change the numbers on printed documents. He’d photocopy a page with a column of numbers, but the resulting copy had different numbers. It’s magic!

To be clear, we’re not talking about text that was just fuzzy or hard to read. The copier actively changed the document. It was swapping one number for another.

This also wasn’t an OCR (optical character recognition) problem, misinterpreting the text and making a wrong guess. That’s common enough. No, the machine’s built-in OCR function was disabled. All the guy wanted was a simple, straightforward photocopy. Simple, right? And yet the machine, with no warning, clearly and deliberately changed numbers on his documents.

Well, “deliberately” in the sense that the copier’s firmware was responsible for the change. Here’s a sample: 


The original is on the left; the copy is on the right. On the top row, you can see how the “60” in the right-hand column of the original has been converted to an “80” on the copy. A similar transformation takes place on the third row, where “65” becomes “85.” Again, this isn’t just fuzziness caused by a poor-quality scan or print. In both cases, the digit 8 is clearly an 8, not a malformed 6. It has the characteristic “dent” in the left and right sides (typography nerds, feel free to jump in here), and looks like a snowman, with two loops stacked one atop the other. Closing up the 6 with a few extra pixels wouldn’t look like that.

The copier is actually substituting numbers, drawing 8s that aren’t there. So what’s going on?

Xerox’s firmware is too smart for its own good. Or, more accurately, the company’s programmers are too clever for their own good. Or they have too much time on their hands. Or poor management oversight. Or just a little too much extra ROM space that needed filling.

As the company admits, it’s a bug. (You think?) But it’s not just an unavoidable side effect caused by selecting the low-resolution settings. Even scanning a document at the highest possible resolution doesn’t eliminate the problem, Xerox says. In other words, there’s no way around it. You can’t trust your copied documents, period. But, the company adds, a fix is coming.

What they don’t say is that it’s a bug caused by poor decision-making, not just flawed code. In an effort to make copies that look nice and clean, Xerox machines try to sharpen up recognizable characters. In other words, they’re doing OCR even when you tell them specifically not to. In direct contradiction of the user’s expressed wishes (e.g., manually turning off the OCR feature), the copier does it anyway. And screws it up. And leaves no trace of its treachery.

Imagine photocopying a contract for someone to sign. It comes back with a signature, all nice and legal, but the numbers are wrong and you just got cheated out of $20,000. Are you going to suspect the photocopier – probably not – or are you going to chew out the assistant who obviously mistyped the contract? Try tracking down that bug. “Gee, the Word file on my computer looks right… it comes off the printer okay… I can’t figure out what happened, boss!”

In the game of office Clue, nobody suspects the programmer, in the copy room, with the firmware.

The failure here was in trying to do too much. And in ignoring user input. We’ve said it here before: never, ever ignore user input. It’s their machine, not yours. Learn to get out of the way. And always be up-front about what you are doing, so that intelligent users (or troubleshooters) can figure out what’s going on.

Xerox violated both of those rules, and then some. It (a) added a text clean-up feature before it was ready; (b) kept that feature enabled even when users expressly asked to turn it off; and (c) kept silent about what was going on. More subtly, it violated one of those strangely intangible aspects of all human/machine interfaces: it did something different from what we expect. Through years of experience, we’ve all learned that when we photocopy a document (sans OCR), we get a “dumb” copy, coffee stains and all. If the text gets too blurry to be legible, we solve it through other means, such as finding a cleaner original. But we expect the photocopier to just copy the $&@# image, not try to outsmart us and silently “improve” the document. The user didn’t ask for any improvement and isn’t expecting any. Thus, when the improvement algorithm fails, nobody thinks to look for it there. Especially because the OCR indicators are off. The copier is essentially lying, saying, “not me!”

The best user interfaces – indeed, the best products of any kind – work along that delicate balance between new features and old expectations. Music apps come with Play, Rewind, and Fast Forward buttons that are arranged as they have been for generations. Could that design be improved? Sure, but then we’d need to relearn it, and for what benefit? QWERTY keyboards are hardly the most efficient layout, but they’re familiar and (nearly) universal. A lot of good product design includes deliberately mimicking anachronisms. No product exists in a vacuum, nor does any user. As developers, we need to put ourselves in the user’s head and understand that every product, no matter how important, is but one of thousands in that user’s life. None of them has a monopoly on the user’s attention, and few are worth learning new habits for.

Making photocopies used to be the job you gave to the new intern. Not something that requires a PhD and a major corporation to figure out. 

5 thoughts on “Damn You, Autocorrect!”

  1. This is going to be more and more of an issue as developers try to harness “context” as a way to let machines figure out what you really want. It’s a thing I referred to a couple years back as the “annoying valley,” riffing off the “uncanny valley” thing. The closer you get to figuring it out – without actually getting it quite right – the more annoying it is – until it’s perfect. Then it’s fine.

    Of course, in this case, the results could be more than just annoying…

  2. Normally I would be 100% behind you on this – but

    Predictive texting on my phone is weirdly , uncannily great. It offers sensible options, and these are sometimes chosen from your own vocabulary. It beats the prediction in open office into a cocked hat, and is way better than many spell chequers.

    So some times those wierdo programmers get some things right.

  3. Wow… predictive typing on my last Android phone was so hopelessly bizarre that I shut it completely off… This is actually something that you can get 90% right and have it be OK; there’s no Annoying Valley. Like reading business cards with CardScan – you expect to do some manual tweaking, and as that becomes less and less necessary, it’s only goodness.

  4. I like the predictive typing on my Android phone, in fact, I’ve downloaded and installed an alternate keyboard (SwiftKey) because it not only predicts the word I’m typing, but gives me some options for the *NEXT* word, and frequently gives me the right word, especially for those short texts that are so often the same thing (it analyzes your texts and emails to come up with personal word usage patterns).

    On the other hand, the predictive typing on Lync (we use it for IM at work) is awful, have the time I send an IM and see that what got sent was not what I intended…

  5. I love that in these replies about predictive typing there are obvious misses. And I suspect that I’m only catching “have” of them becuse hour superoir abapitve mind can reed cleraly bad text anwyay.
