feature article
Subscribe Now

Who Invented the Johnson Decade Counter (and Why)?

I love digital logic. I love solving digital logic conundrums. And I especially love discovering interesting and unusual ways of doing things while also learning more about the people who came up with these ideas in the first place. Take Gray codes, for example. These were named after Frank Gray, who was a physicist and researcher at Bell Labs.

The Gray code, or reflected binary code (RBC), which first appeared in a 1953 patent, is a binary numeral system that is often used in electronics, but that also has many applications in mathematics. Consider a 3-bit binary count sequence: 000, 001, 010, 011, 100, 101, 110, 111, and back to 000. In this case, some transitions involve multiple bits changing at the same time. For example, the transition from 001 to 010 involves two bits changing simultaneously, while 011 to 100 involves all three bits changing.

By comparison, consider a 3-bit Gray code sequence: 000, 001, 011, 010, 110, 111, 101, 100, and back to 000. In this case, adjacent values always differ by only 1 bit.

The idea is simple enough… once someone shows it to you. I don’t know about you, but—if left to my own devices—I don’t think I would have come up with this in a thousand years.

As an aside (I’ve mentioned this before, but some readers may be new to this column), one of my favorite logic puzzles involves a black box with three inputs (A, B, C) and three outputs (NotA, NotB, NotC). 

The three outputs are the logical inversions of their corresponding inputs. That is, when A is 0, NotA is 1; when A is 1, NotA is 0; when B is 0, NotB is 1, and so forth.

Your mission is to come with a set of logic gates that will provide the desired function using only two NOT gates along with as many AND and OR gates as you wish, but no other inverting functions like NAND, NOR, XOR, or XNOR gates.

Although this may seem impossible at first glance, it can indeed be solved without any trickery or skullduggery. The various solution(s) are awesome. My favorite is presented in my Mysteries of the Ancients: Binary Coded Decimal (BCD) column.

Recently, I set myself a task of driving ten 7-segment displays using as few as possible (with a maximum of 12) digital input/output (I/O) pins on an Arduino Uno augmented by as few supporting chips as possible.

Why this particular task? Well, it arose as part of an educational series I’m working on to teach beginners about electronics in general and Arduinos in particular. Why use an Uno? I like the challenge of creating code for an 8-bit machine running at 16MHz with only 2KB of RAM.

And why only 12 digital I/Os? Well, the Arduino has 14 digital I/O pins, which are numbered from 0 to 13, but I’m reserving pins 0 and 1 for serial (UART) communications with my host PC. It also has six analog input pins that can also be used as digital I/Os, but I’m reserving those for other purposes.

I’m using 10-pin single digit common cathode 7-segment displays. Note that the “7-segment” portion of the moniker refers to the seven rectangular-ish elements used to display the digits; there’s also an eighth element that can be used to represent a decimal point (DP). 

Segment names and pin numbers for 7-Segment display (Source: Max Maxfield)

Let’s start with a single display. The easiest way to control this is to use the Arduino’s pins to drive the display’s pins directly (via current-limiting resistors, of course). 

 1 display, 8 Arduino pins (Source: Max Maxfield)

So far, so good, but what if we want to drive more displays. Using this “direct drive” approach, two displays would require 16 Arduino pins, but we have only 12 at our disposal. One thing we could do would be to multiplex the displays as shown below.

4 displays, 12 Arduino pins (Source: Max Maxfield)

The idea is that we start by using the Arduino to turn all the transistors off. Then we set the required 0 and 1 values on the pins driving the segments. Then we turn one of the transistors on to activate its associated display. We display the value for a short time (say 1/1000 of a second), then we turn that transistor off again. We repeat the process with different values for the other three displays

By cycling through the displays quickly enough, activating and deactivating them in turn, our “persistence of vision” makes it appear as though all the displays are active at the same time.

With respect to Arduino pins 13, 12, 11, and 12, we are cycling through the following four patterns: 0001, 0010, 0100, 1000, and back to 0001 again.

The problem is that we’ve now maxed-out our 12 pins. One way to free up some pins would be to use a 2:1 decoder. In this case, we could use two of the Arduino’s pins (say 11 and 10) to cycle through four binary combinations, 00, 01, 10, 11 and back to 00 again. Our 2:4 decoder can take these patterns and translate them into the 0001, 0010, 0100, 1000 equivalents we require to drive our transistors as illustrated below.

4 displays, 10 Arduino pins (Source: Max Maxfield)

We could use a similar approach to extend the number of displays. For example, we could use a 3:8 decoder to drive 8 displays, but this would leave us with only 1 pin free. Similarly, we could use a 4:16 decoder to drive up to 16 displays, thereby meeting or exceeding our goal of 10 displays, but this would once again consume all our pins as illustrated below.

10 displays, 12 Arduino pins (Source: Max Maxfield)

Of course, one way we could easily free up four more pins would be to add a BCD-to-7-segment decoder as illustrated below. The main downside to this is that we would lose the ability to control our displays’ DP segments, but that’s something we could worry about later.

10 displays, 8 Arduino pins (Source: Max Maxfield)

OK, so we can currently control 10 displays using only 8 of the Arduino’s pins with only 2 additional ICs. Can we do better? You betcha! What we could do would be to replace our 4:16 decoder (which requires 4 pins to drive it) with a decoded binary decade counter.

“What’s a decoded binary decade counter?” I hear you cry. Well, a 4-bit binary counter passes through 16 states (0 to 15) as follows: 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111, and then back to 0000 again.

By comparison, a binary decade counter passes through only 10 states (0 to 9) as follows: 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, and then back to 0000 again.

Finally, a decoded binary decade counter will take the output from the binary decade counter and decode it by setting one of its 10 output pins high: 1000000000, 0100000000, 0010000000, 0001000000, 0000100000, 0000010000, 0000000100, 0000000010, 0000000001, and then back to 100000000.

The point is that this decoded decade counter will require only two signals (pins) in the form of an initialize (INIT), which will place the counter in its initial 1000000000 state and a clock (CLK), which will increment it from one state to the next. This will leave us with 6 free pins while still requiring only 2 additional ICS as illustrated below.

10 displays, 6 Arduino pins (Source: Max Maxfield)

As another aside, a lot of people will use a chain of 8-bit shift registers to control large numbers of 7-segment displays, and these certainly have their advantages, not least that we could reduce the total pin count to be as low as 4 (CLEAR, CLK, DATA, and LOAD) or even 3 (CLK, DATA, and LOAD) if we implement the equivalent of a CLR by shifting in a stream of 0s. Also, we would regain control of all our decimal points. The downside is that this would require us to use 10 extra chips (one 8-bit shift register per display) and—for the purposes of our problem—we are trying to use as few extra chips as possible. But we digress…

It’s possible to create a pure 4-bit binary ripple counter using only four flip-flops (usually T-type or JK-type) daisy-chained together such that the output of each flip-flop acts as the clock signal to the next higher-order flip-flop. It’s only the flop-flop representing the least-significant bit that’s fed by the main CLK signal from the outside world. The only problem with this counter is that it can take between 1 and 4 flip-flop delays for the new 4-bit count value to appear on the counter’s outputs.

Creating a binary decade counter is another marmite de poisson (“kettle of fish”). In this case, we will have to implement our counter as a finite state machine (FSM). The heart of our counter will be four D-type flip-flops, and we will decode the “next state” (the 4-bit value to load into the counter on the next clock) from the “current state” (the 4-bit value that’s currently stored in the counter).

And, of course, once we have our binary decade counter, we are still going to need to decode its outputs to generate the ten decoded binary decade counter values we require.

Is there any way to generate an equivalent of the decoded binary decade counter without having to go through the hassle of actually implementing and decoding a binary decade counter? Well, by golly, there is! For example, we can implement this in the form of a 10-bit ring counter, which is essentially a 10-bit shift register formed using D-type flip-flops, where the output from the final stage in the register chain is connected to the chain’s input as illustrated below:

10-bit ring counter (Source: Max Maxfield)

This form of ring counter is called a straight ring counter or a one-hot ring counter. There’s another possibility. Rather than feeding back the true output from the last state, we can instead employ the complementary output, thereby resulting in a Johnson counter (also known as a twisted ring counter, a switch-tail ring counter, a walking ring counter, or a Möbius counter). This results in our circulating a stream of ones followed by zeros around the ring.

Just to keep things simple, let’s contrast and compare the sequences generated by a 5-stage straight ring counter (a) in the image below with a 5-stage Johnson counter (b) in the image below.

5-stage straight and Johnson ring counters (Source: Max Maxfield)

Observe that, in the case of the straight ring counter, the INIT signal drives the S (“set”) input to the first flip-flop and the R (“reset”) inputs to the remaining flip-flops. By comparison, in the case of the Johnson counter, the INIT signal drives the R inputs on all the flip-flops. The sequences resulting from these two implementations are as shown below.

5-stage straight and Johnson ring counter state sequences (Source: Wikipedia)

In a crunchy nutshell, Johnson counters offer twice as many count states from the same number of shift registers. Also of interest is the fact that the Johnson counter generates a code in which adjacent states differ by only one bit as in a Gray code, which can be useful if the bit pattern is going to be asynchronously sampled.

When a fully decoded (one-hot representation) of the counter state is required, as in the case of driving our ten 7-segment displays, the straight ring counter presents these values by default. Happily, all that is required to decode the five outputs from a 5-stage Johnson counter into the ten 1-hot signals we require is to add ten 5-input AND gates (remember that we get the complementary outputs from our flip-flops for free). The result is known as a Johnson decade counter.

The core Johnson counter itself was invented by American inventor, engineer, computer pioneer, and professor, Robert Royce “Bob” Johnson. If your knee-jerk reaction is that Bob invented his counter in the desire to save transistors, you would be wrong because he wasn’t using transistors, which were horrendously expensive and not widely available at the time. 

Bob’s counter was first described in a 1954 Patent. If you look at this patent, you will see that each of the register stages is implemented using two back-to-back vacuum tubes. So, rather than use 20 expensive vacuum tubes to implement a 10-stage straight ring counter, only 10 tubes were required to implement a 5-stage Johnson counter whose outputs could be decoded as required (a 5-input AND gate could be implemented using only 5 diodes and 1 resistor).

OK, so we know who invented the Johnson counter, but who invented the Johnson decade counter (that is, decoding the outputs from a 5-stage Johnson counter into ten 1-hot decade outputs)? I’ve not been able to find any evidence that Johnson did this himself. My suspicion is that some bright-spark engineer at TI or RCA saw a customer demand for a 1-hot decade counter and said, “I’ve got a good idea, let’s decode the output from a 5-stage Johnson counter and call the result a ‘Johnson decade counter.’”

If this is the case, my next question would be “Why?” Or, rather, “Why not just use a 10-stage straight ring counter and have done?” Based on what I said before, you may be thinking that a Johnson decade counter requires fewer transistors than a 10-stage straight ring counter, but I fear this is not the case. Take a CMOS implementation of a Johnson decade counter, for example. Let’s assume that a CMOS D-type flip-flop requires 20 transistors, so a 10-stage straight ring counter would require 10 x 20 = 200 transistors.

Now consider a CMOS Johnson decade counter like the CD74HC4017. In this case, we have only five D-type flip-flops, equating to 5 x 20 = 100 transistors. But we also require ten 5-input AND gates. Each input to a CMOS AND gate requires 2 transistors, so a 5-input AND will require 10 transistors, so ten 5-input ANDs will require 100 transistors, so our five flip-flops and ten 5-input AND gates will require a total of 200 transistors, which is the same as the straight rung counter, so why bother?

This is the sort of thing that keeps me up at nights LOL. How about you? Do you have any thoughts you’d care to share on any of this? As always, I welcome your insightful comments, penetrating questions, and sagacious suggestions.

12 thoughts on “Who Invented the Johnson Decade Counter (and Why)?”

  1. I’m not sure, but I suspect the Johnson decade counter was probably invented at the time of TTL, or even DTL. At this time a D flip-flop needed something like 20 transistors (similar to a CMOS D-type Flip-Flop), while a 5-input and used only 5 diodes (the resistor may be omitted and placed outside of the chip if needed). Furthermore a diode is a lot smaller than a transistor. So 10 5-input ands take less space than 50 transistors, while 5 additional D-type flip-flops would use 100 transistors.

    The reason why the Johnson decade counter was invented is thus not evident with CMOS technology, but may have been natural before, at TTL time…

    1. Hi Bernard — thanks for your feedback — that makes PERFECT sense because the original Johnson counter employed diode logic — i just hadn’t thought about that form of logic in chip form.

      1. Hi Max,

        I think you’re missing the real value of Johnson counters in CMOS because your understanding is off for how to decode them.

        You don’t need big N-input gates to get the one-hot outputs; regardless of the number of counter stages, you can decode each output with a single 4-transistor NOR gate if you happen to have the complements from the flops. You’re effectively looking for the single 0-1/1-0 transition in the encoded output vector, so you AND each encoded bit with the complement of the adjacent bits. Demorganify everything and you can use 4 transistor NORs.

        Hence, a straight ring counter is 1 flop per output while a Johnson is only 1/2 flop plus one NOR per output. 200 transistors vs. 140 transistors, plus a single gate propagation delay (which can be hidden with clock skew). A useful tool for very high speed designs!

  2. Great article Max. Two things. First, the patent tells us that the Johnson counter was developed to minimize components. I’ll go further and say it was designed to minimize active components, which were tubes at the time. The early germanium transistors were just starting to appear and were not in common use. The patent also shows us that an AND gate needed only semiconductor diodes and a resistor. No active components needed for the gates, so a decimal Johnson counter could be implemented with ten triodes (or five dual triodes), a handful of semiconductor diodes (which were starting to appear in production quantities from Hughes Semiconductor and other vendors back then), and some resistors. Tubes are expensive and dissipate a lot of power, so minimizing their use is important. Second, this counter was designed when computers used up and down counters – incrementers and decrementers – to perform arithmetic instead of ALUs. ENIAC was like that. These early computers mimicked the decimal (non-binary) architectures of the mechanical and electromechanical calculators that preceded them. That’s why fully decoded or unary counters were so popular. They were also a perfect match for Nixie tube indicators, which appeared in 1955.

    1. Hi Steve — I agree with all you say — it’s just that I couldn’t discover if Johnson invented the decade counter version of his core counter or if someone else took the Johnson counter as a base and did the decade counter version.

      1. Max, I’d say that the decade version of Johnson’s counter is fully envisioned in his original patent: “It should be noted that an indication or, the count configuration of any preselected count of the counter may be sensed in addition to the sensing of a complete counting cycle. That is, any count within the capacity of the counter may be indicated by an appropriate sensing circuit.” By the way, I noticed that the patent was assigned to Hughes, a major source of semiconductor diodes in 1954. Diode logic combined with tube flip-flops was well established by the early 1950s.

        1. Well, if you put it that way… LOL I agree with you. I have to say that I think the whole thing is very clever– using just 2 tubes per register stage — generating 2n states for every n register stages — using the fact that you get q and qb outputs from each stage for free, and then using simple diode-based AND functions to decode the state. Brilliant!

          1. Those guys were really clever in those days, because active elements were not “free.” My poster child of that era is the Librascope LGP-30, a 31-bit, desk-sized computer with drum memory for registers that needed only 113 tubes, but 1450 semiconductor diodes. The machine was designed by Stan Frankel, who ran the CalTech computing shop. Hughes had barrels of off-spec semiconductor diodes that they just gave to Frankel, so he used them liberally in his design to make the instruction decoder and logic gates.

  3. Hi Steve — I’ve not heard of the Librascope LGP-30 — maybe that could be the subject of one of your future columns (or have you already written about it in the past?)

    1. Hi Max,

      Yes, I’ve written about the LGP-30, but not in EEJournal. There’s an LGP-30 in the Computer History Museum’s collection. The LGP-30’s inventor, Stan Frankel, is the subject of a lengthy Web page (years in the making) on one of my sites:


      Also, I wrote about a replica LGP-30 that I saw in 2016 at a Vintage Computer Festival at the Computer History Museum on LinkedIn, before I started writing for EEJournal:


Leave a Reply

featured blogs
Jul 12, 2024
I'm having olfactory flashbacks to the strangely satisfying scents found in machine shops. I love the smell of hot oil in the morning....

featured video

Unleashing Limitless AI Possibilities with FPGAs

Sponsored by Intel

Industry experts discuss real-world AI solutions based on Programmable Logic, or FPGAs. The panel talks about a new approach called FPGAi, what it is and how it will revolutionize how innovators design AI applications.

Click here to learn more about Leading the New Era of FPGAi

featured paper

DNA of a Modern Mid-Range FPGA

Sponsored by Intel

While it is tempting to classify FPGAs simply based on logic capacity, modern FPGAs are alterable systems on chips with a wide variety of features and resources. In this blog we look closer at requirements of the mid-range segment of the FPGA industry.

Click here to read DNA of a Modern Mid-Range FPGA - Intel Community

featured chalk talk

Portenta C33
Sponsored by Mouser Electronics and Arduino and Renesas
In this episode of Chalk Talk, Marta Barbero from Arduino, Robert Nolf from Renesas, and Amelia Dalton explore how the Portenta C33 module can help you develop cost-effective, real-time applications. They also examine how the Arduino ecosystem supports innovation throughout the development lifecycle and the benefits that the RA6M5 microcontroller from Renesas brings to this solution.  
Nov 8, 2023