High-Speed Serial Comes to the Analog/Digital Divide

Everyone knows that if you want to do things slowly, you do them one at a time. If you want to get more done, you get more people to help do things in parallel. Right? I mean, in the world of electronics, think “serial,” and what might come to mind is the slow, stately procession of bits plodding from your desktop to some not-very-needy peripheral. You want speed? Check out the parallel port, where multiple lines are willing and able to deliver the kind of data demanded by your more high-maintenance attention-craving peripherals.

Historically, this was also the case when hooking chips together on a board; any kind of real data transfer went on a bus, which, by definition, consisted of parallel lines. And they got faster… and faster… until a couple of problems started to crop up. From an electrical standpoint, somewhere along the way you end up switching so fast that you can actually have multiple pulses on a wire at the same time, strung along, marching towards the end. And you’ve got a bunch of these wires, and they damn well better be EXACTLY the same length, or else you might mistakenly interpret line 5’s 791st bit as the 792nd bit. And with jitter, that could happen intermittently.

Oh, and then there’s the problem of how to clock the dang thing, especially if you’re going across a backplane from one board to another with no master clock shared by both boards. If things are slow, well, slight differences in clock phase and frequency, if they matter at all, can be accommodated by FIFOs, or heck, even just double-buffering to harden against metastability. For those of you liking million-dollar words, such a system is called plesiochronous, meaning that it’s more or less synchronous, but there may be clock differences within some specified limit.

But once things get going fast enough, you just don’t have the margin you need; the clock-to-signal relationship has to be very tightly controlled. Within a board, perhaps a common clock can suffice, but another approach used is clock-forwarding (or source-synchronous), where a clock signal line is added to the bus so the receiving chip can synchronize the received bits to the forwarded clock and then resynchronize through a FIFO to its own clock, once it’s straightened out which bits go where. The forwarded clock is presumably the same frequency as the receiving chip’s clock (or close), but the phase is likely to be different (if not, it’s an incredible coincidence). Whence comes our second million-dollar word: mesochronous, meaning the frequencies are the same but with differing phases.

But now comes the second major problem: routing these parallel lines on a board is non-trivial to say the least. Many of these are coming from underneath large ball grid arrays, and they need to snake their way out to daylight before they can look anything like a bus, and then they have to sneak around the balls on the receiving chip to get to their destination. And all of these lines, including the clock, must be extremely well-matched, not just in length, but in impedance. If one of the lines has to jump down to another metal layer to escape the confines of the ball grid, then some equivalent compensation must be provided to the other lines. Oh, and I forgot to mention… in order to switch this quickly, you can’t use standard old single-ended signals: you need differential pairs, so each line of the bus is actually two impedance-controlled wires. At some point you start tearing your hair out trying to route everything.

What’s old is new

So… if you went parallel in order to go faster than serial, and you now want to go faster than parallel, then you have to go… serial?? Strange as it seems, this is indeed what has happened. For one major reason: you can encode the clock onto the signal, transmitting both clock and data together on one line. How is this possible? By using a phase-locked loop (PLL) on the receive side as the data comes in and making sure that the data has enough transitions (or, as they say, has a high transition density). A PLL is essentially an electronic flywheel; it has momentum, and, as long as you give it a kick often enough to keep the speed up, it will keep spinning along and can act as a clock at a frequency determined by how often you kick it. Of course, you can’t predict exactly what your data will be, and you could very well have a long string of 1s or 0s, in which case our flywheel would start to slow down and eventually stop (“lose lock” in PLL lingo). So the data has to be encoded.
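To put a rough number on "enough transitions," here is a minimal Python sketch that measures the two things a receive PLL cares about in a bit stream: the longest run without a transition and the overall transition density. The function names and sample streams are made up for illustration; real receivers specify their own run-length limits.

```python
def longest_run(bits):
    """Length of the longest stretch of identical consecutive bits."""
    longest = current = 0
    previous = None
    for b in bits:
        # Extend the current run if the bit repeats, otherwise start a new run.
        current = current + 1 if b == previous else 1
        longest = max(longest, current)
        previous = b
    return longest


def transition_density(bits):
    """Fraction of adjacent bit pairs that differ (0.0 = flat, 1.0 = toggles every bit)."""
    if len(bits) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return changes / (len(bits) - 1)


# Illustrative streams: eight 1s then eight 0s, versus a bit that toggles every time.
raw = [1] * 8 + [0] * 8
toggling = [1, 0] * 8

print(longest_run(raw), transition_density(raw))            # longest run is 8
print(longest_run(toggling), transition_density(toggling))  # longest run is 1, density 1.0
```

The first stream gives the flywheel a single kick in sixteen bit times; the second kicks it on every bit. An encoding scheme's job is to push arbitrary data toward the second case without paying too much for it.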

There are lots of encoding schemes that ensure transitions (Manchester encoding, for example, if you can tolerate abrupt 180-degree phase shifts). But there’s one other consideration: how the lines are “coupled.” In the old days, you just attached drivers to wires and ensured that the driver was beefy enough to charge or discharge the wire fast enough. This is known as DC coupling, where there is a direct DC current path from the wire through the driver. Increasingly, AC coupling has been used for differential pairs, where the line actually floats, and a capacitor couples the signal onto the line. This eliminates any steady-state DC current.
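Since Manchester encoding came up, here is a small Python sketch of it. It guarantees a transition in the middle of every bit, and because each bit becomes one high half and one low half, it also happens to put equal numbers of 1s and 0s on the wire, which is friendly to an AC-coupled line. The polarity convention used here (1 becomes low-then-high, 0 becomes high-then-low) is an assumption for illustration; the opposite convention is also in use.

```python
def manchester_encode(bits):
    """Map each data bit to two half-bit symbols with a guaranteed mid-bit transition.

    Assumed convention: 1 -> (0, 1), 0 -> (1, 0).
    """
    out = []
    for b in bits:
        out.extend((0, 1) if b else (1, 0))
    return out


def manchester_decode(halves):
    """Invert manchester_encode; reject any pair that lacks a mid-bit transition."""
    if len(halves) % 2:
        raise ValueError("expected an even number of half-bit symbols")
    bits = []
    for first, second in zip(halves[0::2], halves[1::2]):
        if first == second:
            raise ValueError("no mid-bit transition: not valid Manchester data")
        bits.append(1 if (first, second) == (0, 1) else 0)
    return bits


print(manchester_encode([1, 1, 1, 1]))  # [0, 1, 0, 1, 0, 1, 0, 1]
```

The price is obvious from the output: every data bit costs two symbols on the wire, so the signaling rate doubles. That is a much steeper overhead than the block codes discussed next.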

So now we have one more consideration: if you couple more 1s on than 0s, on average, then the floating line will start drifting northwards, eventually looking like a permanent 1. You’re basically pumping more charge onto it than you’re taking back off. It’s no longer enough just to have sufficient transition density; it also has to be balanced between 1s and 0s. This brings into play the concept of “disparity”, a kind of running total of 1s and 0s that you want to keep balanced at zero. Any “code group” with equal numbers of 1s and 0s is fine on its own, but one with more 1s than 0s has to have an alternate encoding with more 0s than 1s; when it’s time to encode a byte (or “octet”), you look at the running disparity and pick one of the two possible encoded forms that will bring the running disparity back towards 0.
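The pick-the-right-form rule can be sketched in a few lines of Python. Everything in the table below is hypothetical: the symbol names and the 10-bit patterns are invented for illustration and are not real code groups from any standard. Starting the running disparity at -1 is also just a choice made here so that there is never a tie between the two forms.

```python
def disparity(code):
    """Ones minus zeros in a code group written as a string of '0' and '1'."""
    return code.count("1") - code.count("0")


def pick_form(forms, running):
    """Choose the candidate that leaves the running disparity closest to zero."""
    return min(forms, key=lambda code: abs(running + disparity(code)))


def encode_stream(symbols, table, running=-1):
    """Encode symbols one at a time, tracking running disparity as we go."""
    out = []
    for s in symbols:
        code = pick_form(table[s], running)
        running += disparity(code)
        out.append(code)
    return out, running


# Hypothetical table: "A" is unbalanced and needs two complementary forms,
# "B" is balanced and needs only one.
TOY_TABLE = {
    "A": ("1110101010", "0001010101"),  # disparity +2 and -2
    "B": ("1010101010",),               # disparity 0
}

codes, final = encode_stream(["A", "A", "B", "A"], TOY_TABLE)
print(codes)
print(final)  # 1
```

Sending "A" twice in a row produces the heavy-on-1s form first and the heavy-on-0s form second, so the running total never strays more than one step from zero.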

It’s also useful to be able to detect simple errors in transmission, suggesting a coding technique such that any single-bit error results in an illegal code. The other thing to watch out for is a situation where, for example, a synchronization signal might accidentally be detected when two consecutive “words” come together. If a synch word is “00000” (just making one up here), and two non-synch words “11000” and “00111” are sent back-to-back, the five consecutive 0s could be misinterpreted as a resynchronization. So, all told, we’ve got a lot of considerations to take into account when figuring out how the heck to encode this stuff.
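The made-up sync example above is easy to check mechanically. This Python sketch concatenates a list of equal-width words and reports every place the sync pattern shows up at an offset that is not a word boundary. The helper names are invented for illustration, and the sketch assumes every word is the same width as the sync pattern.

```python
def find_pattern(words, pattern):
    """Return every bit offset at which `pattern` appears in the concatenated stream."""
    stream = "".join(words)
    return [i for i in range(len(stream) - len(pattern) + 1)
            if stream.startswith(pattern, i)]


def false_syncs(words, pattern):
    """Offsets where the pattern appears straddling a word boundary.

    Assumes all words are as wide as the pattern, so any match that does not
    start on a multiple of that width spans two words.
    """
    width = len(pattern)
    return [i for i in find_pattern(words, pattern) if i % width != 0]


print(false_syncs(["11000", "00111"], "00000"))  # [2]
```

A well-designed code makes this list empty for every legal pair of adjacent code groups, which is one of the properties a sync character has to be chosen for.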

Fortunately, there’s a tried and true coding technique that accomplishes all of this: the 8B/10B code (so-called because it encodes 8 bits into a 10-bit code). I defy anyone looking through the details of the code not to be impressed by the thoroughness of thought that has gone into this scheme. It’s widely used in many transmission standards. It does have the downside of having high overhead: 20 percent (two bits out of each 10 transmitted) of the bandwidth is lost due to the encoding. There is a successor, the 64B/66B standard, that loses two bits in 66, but it has so far not been widely deployed.
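The overhead figures are simple arithmetic, shown here in Python with exact fractions so nothing gets lost in rounding. The function name is made up for illustration.

```python
from fractions import Fraction


def overhead(payload_bits, line_bits):
    """Fraction of transmitted bits that carry no payload."""
    return Fraction(line_bits - payload_bits, line_bits)


print(overhead(8, 10))    # 1/5, i.e. 20 percent
print(overhead(64, 66))   # 1/33, roughly 3 percent
```

So moving from 8B/10B to 64B/66B cuts the encoding tax from a fifth of the line rate to about a thirtieth.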

The FPGA world has seen adoption of a number of such serial standards, starting with Ethernet-related standards (where 8B/10B originated). Interestingly enough, a number of the standards have been perceived as (or have literally been) competing pairs: Serial RapidIO was seen as competing for attention with PCI Express. Both were highly touted (and continue to be), but these are heavyweight standards. Xilinx then developed Aurora as a lightweight proprietary scheme, and Altera followed with SerialLite, which could be even lighter or heavier, having various options. It’s unclear how much traction these proprietary standards have gotten, but they don’t appear to have taken over.

Taking it to analog

All of these schemes have been intended for interconnecting digital components. There’s one area that hasn’t yet benefited from this, and that’s the interface between analog and digital. There is yet another critical consideration here: the possible interference by the noisy digital circuits with the analog circuits. Analog chips are just now starting their migration from parallel to serial, and, in fact, so far it’s happening only on the analog-to-digital converter (ADC) side, since there are not yet any digital-to-analog converters (DACs) that are fast enough to warrant this.

Here it is really the interconnect issue driving migration to serial: let’s face it, DACs and ADCs aren’t typically large devices, so the area required for interconnect can exceed that required for the chip itself, especially in light of the differential signaling that’s even more critical here to reduce switching noise. Integration, the usual solution in an all-digital situation, has been limited because it’s very hard to integrate high-precision analog on the same chip with the same process as high-speed digital. So you end up having to go from chip to chip.

As an answer to the needs of this application, JEDEC created the JESD204 standard, intended for transferring data between converters and digital chips such as FPGAs or ASICs. It’s a standard that in many ways resembles the others mentioned above, but without some of the more involved features like acknowledgment mechanisms and flow control. Electrically, it looks like CML, and the typical high-speed I/Os available on FPGAs can be used. The switching speed is targeted for the range 0.3125 to 3.125 Gbps (including encoding overhead, so the actual data transfer rate is less).
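To get a feel for what that speed range means for a converter, here is a back-of-the-envelope Python sketch. It assumes 8B/10B is what sits on the wire, and it makes a simplifying framing assumption that is mine, not the standard's: each sample is padded up to a whole number of octets and nothing else shares the lane. The converter parameters in the example are hypothetical.

```python
# Line-rate limits from the text, in bits per second.
MIN_LINE_RATE = 312_500_000
MAX_LINE_RATE = 3_125_000_000


def line_rate(sample_rate_hz, bits_per_sample):
    """Line rate needed for one converter on one lane.

    Assumption: each sample is padded to whole octets, and each octet
    becomes 10 bits on the wire after 8B/10B encoding.
    """
    octets = -(-bits_per_sample // 8)  # ceiling division
    return sample_rate_hz * octets * 10


def fits_one_lane(sample_rate_hz, bits_per_sample):
    """True if the required line rate lands inside the targeted range."""
    return MIN_LINE_RATE <= line_rate(sample_rate_hz, bits_per_sample) <= MAX_LINE_RATE


# Hypothetical 14-bit converter at two sample rates.
print(line_rate(100_000_000, 14), fits_one_lane(100_000_000, 14))  # 2000000000 True
print(line_rate(200_000_000, 14), fits_one_lane(200_000_000, 14))  # 4000000000 False
```

Under these assumptions the top of the range leaves 2.5 Gbps for payload, so a 14-bit converter tops out somewhere past 150 Msps on a single lane.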

This is intended as a very short-range interconnect, and it is expected that both the digital and analog chips will have the same master clock. 8B/10B encoding is used to maintain DC balance and detect single-bit errors, but ultimately the signal is resynchronized at the receiving end to the master clock. In addition to the normal functioning of a reference clock in standard serializer/deserializer (SERDES) implementations, there are a couple of other reasons that the standard gives for using a single master clock. First, there may be more than one ADC sending data into the FPGA or ASIC, so picking one to be the reference clock becomes somewhat arbitrary. Second, if the selected link containing the master clock goes down, then the entire receiving side shuts down as well. But assuming the converter and FPGA/ASIC are more or less collocated, a single master clock for both shouldn’t be an issue.

In the case of interpolating DACs or decimating ADCs, which require a faster clock in the converter, you can either generate that faster clock by multiplying up from the master clock or start with the faster rate and derive the master clock by dividing down (being careful with the ratios to ensure that no weird tones end up on the analog side… yeah, analog is a pain in the… well… never mind).

The serial lines are intended primarily to be data only; control information is mostly sent out-of-band on what may be slower single-ended sideband signals. It is allowed, though, for some control information to travel in a packet trailer. An example of a sideband signal explicitly defined in the standard is the SYNC signal, which goes from the receiving end back to the sending end and indicates when the line needs to be resynched or when there’s been a data error. Interestingly, it’s explicitly noted that error handling is assumed to be done by the FPGA or ASIC side, even if the error occurred sending data to a DAC, simply because FPGAs and ASICs are better-equipped to perform that function.

Depending on the application, there may be periods of inactivity, when essentially there’s no data. You’ve still got to transmit a signal during this time to keep the receive flywheel spinning, but you’re also allowed to power down the link to save energy. Obviously there are things you have to take care of when powering down and up, like resynching the link; the standard leaves that as an exercise for the reader.

Whether due to steady data or lack of data, there may be long periods where a single code is sent over and over via the link. Yes, if it is an unbalanced code (one with nonzero disparity), it will be balanced by its partner, but then this pair gets transmitted over and over. And any time you send the same pattern over and over, you have a periodic signal, which creates a peak at some point in the emissions spectrum, potentially interfering with the analog side of things. So the standard provides for data scrambling using a specified polynomial. Data is first scrambled and then encoded before sending; it is decoded and then descrambled on the receiving side. Scrambling is optional, since it is perceived that the very acts of scrambling and descrambling could create noise, so not scrambling is still legit.
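For the curious, here is a minimal Python sketch of a self-synchronizing scrambler of the general kind used for this job. Treat the details as assumptions: the taps correspond to the polynomial 1 + x^14 + x^15, which I believe is the one the standard names, but check the spec before relying on it. The sketch also works one bit at a time for clarity, whereas real hardware would typically process whole octets in parallel. The seed and test pattern are made up.

```python
def scramble(bits, taps=(14, 15), state=None):
    """Self-synchronizing scrambler: each output bit is the input bit XORed
    with earlier OUTPUT bits at the tap delays (assumed taps: 14 and 15)."""
    history = list(state) if state is not None else [0] * max(taps)
    out = []
    for b in bits:
        s = b
        for t in taps:
            s ^= history[t - 1]        # history[0] is the most recent output bit
        out.append(s)
        history = [s] + history[:-1]   # shift the new output bit in
    return out


def descramble(bits, taps=(14, 15), state=None):
    """Inverse operation: XOR each received bit with earlier RECEIVED bits."""
    history = list(state) if state is not None else [0] * max(taps)
    out = []
    for s in bits:
        d = s
        for t in taps:
            d ^= history[t - 1]
        out.append(d)
        history = [s] + history[:-1]   # shift the received bit in
    return out


data = [1, 0, 1, 1, 0, 0, 1, 0] * 16   # the same octet sent sixteen times
seed = [1] + [0] * 14                  # arbitrary nonzero starting state
tx = scramble(data, state=seed)

print(descramble(tx, state=seed) == data)   # True
print(descramble(tx)[15:] == data[15:])     # True
```

The second check is the "self-synchronizing" part: a descrambler that starts in the wrong state still recovers once it has seen 15 received bits, because its shift register then holds exactly what the scrambler's does. One caveat worth knowing is that an all-zero state fed all-zero data stays all zero, which is why the sketch seeds the register with something nonzero.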

One of the common capabilities of other serial standards appears not to have been incorporated into JESD204: the ability to bond multiple lines into a wider channel. Ordinarily this is done essentially by “rasterizing” data onto multiple lines and then pulling it off and reassembling it on the other side, with a layer on both sides interpreting the multiple serial lines as a single entity. While there apparently has been discussion about this by potential users, it is, at this point, not included in the standard.
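For reference, the "rasterizing" idea is just round-robin striping. The Python sketch below is a generic illustration, not anything defined by JESD204 (which, as noted, leaves this out), and it ignores the hard part of real lane bonding, namely lining the lanes back up when they arrive with different delays. The function names are invented.

```python
def stripe(octets, lanes):
    """Deal octets round-robin across `lanes` serial lines."""
    return [list(octets[i::lanes]) for i in range(lanes)]


def unstripe(lane_data):
    """Reassemble the original order from round-robin striped lanes."""
    lanes = len(lane_data)
    total = sum(len(lane) for lane in lane_data)
    # Octet i was dealt to lane (i mod lanes) as that lane's (i div lanes)-th entry.
    return [lane_data[i % lanes][i // lanes] for i in range(total)]


payload = list(range(10))
striped = stripe(payload, 4)
print(striped)                       # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
print(unstripe(striped) == payload)  # True
```

The layer on each side that hides this dealing and gathering is what lets everything above it treat four lanes as one fat pipe.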

Lattice and Linear Technology appear to be first out of the chute implementing JESD204. On the FPGA side, what makes this feasible is a low-cost ECP2M family with high-speed serial link capability; historically, this feature has been available only on the highest-end devices in the FPGA world. That is starting to change, and Lattice has seen this low-cost application area as one that can benefit from low-cost SERDES FPGAs.

While serial interfaces are old hat for FPGAs, they’re very new for digital/analog converters. Linear Technology has announced their LTC2274 ADC, which has the I/Os and logic necessary for JESD204. Bear in mind that FPGAs, given the basic SERDES circuitry, can support many standards just by changing up the IP that implements the logic; the ADCs and DACs have to build the logic into hard silicon. So part of the pacing element in adoption of the standard will be the availability of converters that have built-in support. Linear Technology is hoping to get a jump on that trend.