
A Better Flytrap

DDR3 Controllers Hit the Market for SoCs and FPGAs

That DDR memories work at all seems like a miracle. I mean, it’s like someone woke up one morning and said, “Hmmm… You know, high-speed serial interconnect has complicated timing when you try to align a bunch of lanes… there HAS to be a way to take those concepts and make them even trickier to design.”

Here you’re taking a bank of memories and sending them data and address and clock and command and DQS signals, and all in “eye-diagram” territory. Skews and signal integrity are critical; this is not for the faint of heart. And, in fact, DDR architects have finally said “Uncle” for some of it.

Let’s start by reviewing the basic topology of a DDR memory. You’ve got a bunch of memory chips, each with data, address, control, and DQS. The DQS signal is the least familiar for those who’ve been fortunate enough not to have to think about how memories are accessed. Unlike older memories where the clock is used to time reads and writes, DQS is what really clocks data in during a write, or indicates data valid during a read; the clock signal is used as a reference timer for generating DQS. Each memory has its own DQS, which allows the timing of each memory to be tuned for its particular signals. Data and DQS lines are connected point-to-point; address and control lines are bussed to all the memories. Because the timing is so tight, the relative skews for all of these signals are critical.

The way the busses are created is through what is called T-branching – each time a bus branches, both sides of the branch have to be matched. And these busses branch more than once, so you end up with a complicated design job that more or less ends up looking something like clock tree synthesis on an IC. Each signal has a single source, and it has to arrive at multiple destinations at the same time. And they have to be impedance-matched, and each angle or via on the PCB acts as a discontinuity and, well, you can imagine it might get messy.
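To make the matching rule concrete, here is a minimal sketch that treats a T-branched net as a tree of trace segments and reports how far apart the longest and shortest source-to-chip paths are. The tree shape and the segment lengths are made-up illustration values, not taken from any real DIMM, and length is only a stand-in for the real problem, which also includes impedance and discontinuities.

```python
def leaf_lengths(node, so_far=0):
    """Return the total trace length from the source to every leaf.

    A node is (segment_length_mm, [child_nodes]); a leaf (a memory chip)
    has an empty child list. This representation is hypothetical.
    """
    length, children = node
    total = so_far + length
    if not children:
        return [total]
    result = []
    for child in children:
        result.extend(leaf_lengths(child, total))
    return result


def worst_mismatch_mm(tree):
    """Difference between the longest and shortest source-to-leaf path."""
    lengths = leaf_lengths(tree)
    return max(lengths) - min(lengths)


# Two levels of T-branching feeding four chips (assumed lengths in mm).
# The last stub is 1 mm too long, so the tree is not balanced.
example_tree = (10, [
    (5, [(3, []), (3, [])]),
    (5, [(3, []), (4, [])]),
])

print(leaf_lengths(example_tree))
print(worst_mismatch_mm(example_tree))
```

Every extra level of branching doubles the number of paths that have to agree, which is why this starts to look like clock tree synthesis.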

Actually, in theory, it’s a tad more complicated than that, since you also have to align the bussed signals, for each memory, with the data and DQS for that memory. If each of the data/DQS lines can be aligned using a technology like dynamic phase alignment, then all the data and DQS lines can be balanced and “equalized,” and the bussed signals can then be designed to arrive simultaneously at each chip.

If each of the data/DQS sets had a different arrival time, the job would get even nastier. Then again, the data isn’t a single signal – it’s a set of signals or lanes, and they have to be aligned with each other so that a single data grab on a given memory pulls the right data from each of the lanes, so, presumably, if you can align them with each other, then you can align them with the data/DQS signals from other memories as well. Having said that, aligning eight data signals and a DQS may be a big enough pain in the butt without having to align a bunch of sets of those.

Bottom line, it’s a tough timing job to get this all to work. Oh — and then it needs to continue working as the weather and temperature and phases of the moon change as well.

So… DDR2 doubled clock rates from DDR; hey, that was so much fun, let’s do it again! And let’s call it… hmmmmm… how… about… Oo! Oo! I’ve got it! How about DDR3? And now… we’ll pump data as fast as 1.6 Gb/s per pin. Nice.
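To put a number on how tight that is, here is a quick back-of-the-envelope calculation. It assumes the 1.6 Gb/s figure is per data pin and that, as the "double data rate" name implies, two bits go out per clock cycle; the function names are just for illustration.

```python
def unit_interval_ps(data_rate_gbps):
    """Width of a single bit on the wire, in picoseconds."""
    return 1000.0 / data_rate_gbps


def clock_mhz(data_rate_gbps):
    """Clock frequency, assuming two bits per clock cycle per pin."""
    return data_rate_gbps * 1000.0 / 2.0


rate = 1.6  # Gb/s per pin (assumed interpretation of the figure above)
print(f"bit time: {unit_interval_ps(rate):.0f} ps")
print(f"clock:    {clock_mhz(rate):.0f} MHz")
```

That works out to a 625 ps bit on an 800 MHz clock, and every skew, setup time, and hold time on the board has to fit inside that window.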

And this is where we abandon ship on the old bussing techniques. The kinds of timing tolerances are so crazy now that trying to use T-branching to balance everything out has been ruled out as being unworkable. And a new approach has been applied, taking a page from Fully-Buffered DIMMs. It’s called the “fly-by” technique. And it sounds simpler, and I’ll take designers’ word for it that it is simpler, but man, it still sounds like a tough thing to get right.

Instead of bussing the control signals to each memory, there’s one set of signals that simply starts at one side of the DIMM and crosses all the memories. Obviously, signals will arrive at the first memory before they get to the last memory. The idea here is to position signals on the control bus so that they arrive at their intended memories at the correct time. So the signal intended for, say, memory chip 3, will visit all the memories before and after hitting chip 3, but will arrive at chip 3 with timing such that it is recognized by chip 3 and not by the other chips.
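A rough sketch shows how staggered those arrivals can get. Everything numeric here is an assumption for illustration: eight chips, 12 mm between them, 30 mm of trace before the first chip, and a propagation delay of about 6.7 ps per mm (roughly 170 ps per inch, a common rule of thumb for FR4 traces).

```python
def flyby_arrival_ps(num_chips, lead_in_mm, pitch_mm, ps_per_mm):
    """Arrival time (ps) of one signal edge at each chip along a fly-by bus.

    The edge reaches chip i after travelling the lead-in trace plus
    i chip-to-chip segments.
    """
    return [(lead_in_mm + i * pitch_mm) * ps_per_mm for i in range(num_chips)]


# All four values below are assumed, not taken from a real DIMM layout.
arrivals = flyby_arrival_ps(num_chips=8, lead_in_mm=30.0,
                            pitch_mm=12.0, ps_per_mm=6.7)
skew_ps = arrivals[-1] - arrivals[0]

for chip, t in enumerate(arrivals):
    print(f"chip {chip}: {t:7.1f} ps")
print(f"first-to-last skew: {skew_ps:.1f} ps")
```

With these made-up numbers the last chip sees the edge a bit over 560 ps after the first one, which is most of a 625 ps bit at 1.6 Gb/s. That gap is what the controller has to account for on a per-chip basis.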

To be clear, we’re not talking about a pipeline here. We’re only talking wires, and signals are flying by, and we’re capturing them as they pass much as a frog catches a fly. The memory controller has to issue both the fly and the frog’s tongue and ensure that they meet. And it’s worse than that, it’s like the fly is being launched from California, the frog is in Japan, and the frog has a really long tongue that’s going to snag the fly 30,000 feet over oblivious beachgoers on Waikiki. No, it’s worse than that… you’ve got a bunch of flies and a bunch of frogs, and you’re going to send a stream of flies out and position them all so that the frogs all zap the right flies, taking into account variations in wind and air density and aerodynamics as the flies freak out. It just sounds unlikely, but it’s the new game. Scares the crap outa me. I guess it’s just that the old system was even scarier. Imagine that…

Part of the new game involves “read leveling” and “write leveling.” There’s a training sequence that executes before the controller goes into full operation to figure out the alignment of signals with respect to the various memories and their data/DQS lines. This basically teaches the controller how to set the timing of the signals it places on the control bus so that they reach their targets at the right time. Once set, the controller then has to compensate for variations in temperature and voltage to ensure that the memory continues working as conditions change.
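Here is a toy model of the write-leveling half of that training, as I understand the general idea: the controller nudges its DQS delay for one memory step by step, the memory samples the clock on each DQS rising edge and reports back what it saw, and the controller stops when the report flips from 0 to 1, meaning DQS has lined up with a rising clock edge at that chip. The flight times, step size, and function names are all hypothetical, and a real controller works with hardware delay taps and feedback on the data lines rather than a Python loop.

```python
def ck_level_at_dram(t_ps, ck_flight_ps, period_ps):
    """Clock level (0 or 1) seen at the memory at time t_ps.

    The clock leaves the controller with a rising edge at t = 0, is high
    for the first half of each period, and arrives ck_flight_ps later.
    """
    phase = (t_ps - ck_flight_ps) % period_ps
    return 1 if phase < period_ps / 2 else 0


def write_level(ck_flight_ps, dqs_flight_ps, period_ps, step_ps):
    """Sweep the DQS delay until the sampled clock goes from 0 to 1.

    Returns the first delay (ps) at which the memory reports a 1 right
    after reporting a 0, i.e. DQS now lands on a rising clock edge.
    """
    previous = None
    for delay in range(0, 2 * period_ps, step_ps):
        sample = ck_level_at_dram(dqs_flight_ps + delay, ck_flight_ps, period_ps)
        if previous == 0 and sample == 1:
            return delay
        previous = sample
    raise RuntimeError("no 0-to-1 transition found in the sweep range")


# Assumed numbers: 1250 ps clock period (800 MHz), 10 ps delay steps,
# clock takes 700 ps to fly by to this chip, DQS takes 300 ps point-to-point.
delay = write_level(ck_flight_ps=700, dqs_flight_ps=300,
                    period_ps=1250, step_ps=10)
print(f"DQS delay for this chip: {delay} ps")
```

In this example the sweep settles on 400 ps, exactly the difference between the two flight times. Repeat the sweep for each memory and you get one delay per data/DQS group, which is the "teaching" described above. Keeping those delays valid as temperature and voltage drift is a separate job that this sketch doesn't attempt.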

Both Denali and Virage have announced DDR3 controller IP for SoCs. The solutions are modular, and the issues of the higher-level controller tend to be separated from the thorny timing issues associated with the physical (PHY) level and the I/Os. Altera and Xilinx have also put their latest high-end devices to use as DDR3 controllers.

As for me, I think I’ll test out this new technology by heading out to Waikiki to gaze up into the night sky and see if I can spot any flies getting zapped. I could use the vacation.

Links for more information on any companies or products mentioned:
Altera DDR3
Denali DDR3
Virage DDR3
Xilinx DDR3
