feature article
Subscribe Now

A Better Flytrap

DDR3 Controllers Hit the Market for SoCs and FPGAs

That DDR memories work at all seems like a miracle. I mean, it’s like someone woke up one morning and said, “Hmmm…. You know, high-speed serial interconnect has complicated timing when you try to align a bunch of lanes… there HAS to be a way to take those concepts and make it even trickier to design.”

Here you’re taking a bank of memories and sending them data and address and clock and command and DQS signals, and all in “eye-diagram” territory. Skews and signal integrity are critical; this is not for the faint of heart. And, in fact, DDR architects have finally said “Uncle” for some of it.

Let’s start by reviewing the basic topology of a DDR memory. You’ve got a bunch of memory chips, each with data, address, control, and DQS. The DQS signal is the least familiar for those who’ve been fortunate enough not to have to think about how memories are accessed. Unlike older memories where the clock is used to time reads and writes, DQS is what really clocks data in during a write, or indicates data valid during a read; the clock signal is used as a reference timer for generating DQS. Each memory has its own DQS, which allows the timing of each memory to be tuned for its particular signals. Data and DQS lines are connected point-to-point; address and control lines are bussed to all the memories. Because the timing is so tight, the relative skews for all of these signals are critical.

The way the busses are created is through what is called T-branching – each time a bus branches, both sides of the branch have to be matched. And these busses branch more than once, so you end up with a complicated design job that more or less ends up looking something like clock tree synthesis on an IC. Each signal has a single source, and it has to arrive at multiple destinations at the same time. And they have to be impedance-matched, and each angle or via on the PCB acts as a discontinuity and, well, you can imagine it might get messy.

Actually, in theory, it’s a tad more complicated than that, since you also have to align, for each memory, with the data and DQS for that memory. If each of the data/DQS lines can be aligned using a technology like dynamic phase alignment, then all the data and DQS lines can be balanced and “equalized,” and the bussed signals can then be designed to arrive simultaneously at each chip.

If each of the data/DQS sets had a different arrival time, the job would get even nastier. Then again, the data isn’t a single signal – it’s a set of signals or lanes, and they have to be aligned with each other so that a single data grab on a given memory pulls the right data from each of the lanes, so, presumably, if you can align them with each other, then you can align them with the data/DQS signals from other memories as well. Having said that, aligning eight data signals and a DQS may be a big enough pain in the butt without having to align a bunch of sets of those.

Bottom line, it’s a tough timing job to get this all to work. Oh — and then it needs to continue working as the weather and temperature and phases of the moon change as well.

So… DDR2 doubled clock rates from DDR; hey, that was so much fun, let’s do it again! And let’s call it… hmmmmm… how… about… Oo! Oo! I’ve got it! How about DDR3? And now… we’ll pump data as fast as 1.6 Gb/s. Nice.

And this is where we abandon ship on the old bussing techniques. The kinds of timing tolerances are so crazy now that trying to use T-branching to balance everything out has been ruled out as being unworkable. And a new approach has been applied, taking a page from Fully-Buffered DIMMs. It’s called the “fly-by” technique. And it sounds simpler, and I’ll take designers’ word for it that it is simpler, but man, it still sounds like a tough thing to get right.

Instead of bussing the control signals to each memory, there’s one set of signals that simply starts at one side of the DIMM and crosses all the memories. Obviously, signals will arrive at the first memory before they get to the last memory. The idea here is to position signals on the control bus so that they arrive at their intended memories at the correct time. So the signal intended for, say, memory chip 3, will visit all the memories before and after hitting chip 3, but will arrive at chip 3 with timing such that is recognized by chip 3 and not by the other chips.

To be clear, we’re not talking about a pipeline here. We’re only talking wires, and signals are flying by, and we’re capturing them as they pass much as a frog catches a fly. The memory controller has to issue both the fly and the frog’s tongue and ensure that they meet. And it’s worse than that, it’s like the fly is being launched from California, the frog is in Japan, and the frog has a really long tongue that’s going to snag the fly 30,000 feet over oblivious beachgoers on Waikiki. No, it’s worse than that… you’ve got a bunch of flies and a bunch of frogs, and you’re going to send a stream of flies out and position them all so that the frogs all zap the right flies, taking into account variations in wind and air density and aerodynamics as the flies freak out. It just sounds unlikely, but it’s the new game. Scares the crap outa me. I guess it’s just that the old system was even scarier. Imagine that…

Part of the new game involves “read leveling” and “write leveling.” There’s a training sequence that executes before the controller goes into full operation to figure out the alignment of signals with respect to the various memories and their data/DQS lines. This basically teaches the controller how to set the timing of the signals it places on the control bus so that they reach their targets at the right time. Once set, the controller then has to compensate for variations in temperature and voltage to ensure that the memory continues working as conditions change.

Both Denali and Virage have announced DDR3 controller IP for SoCs. The solutions are modular, and the issues of higher-level controller tend to be separated from the thorny timing issues associated with the physical (PHY) level and the I/Os. Altera and Xilinx have also put their latest high-end devices to use as DDR3 controllers.

As for me, I think I’ll test out this new technology by heading out to Waikiki to gaze up into the night sky and see if I can spot any flies getting zapped. I could use the vacation.

Links for more information on any companies or products mentioned:
Altera DDR3
Denali DDR3
Virage DDR3
Xilinx DDR3

Leave a Reply

featured blogs
May 5, 2021
New 5G infrastructure is powering smart city projects worldwide; explore the importance of IoT security for smart city solutions in public safety & logistics. The post How 5G Networks Will Accelerate Development of Smart Cities appeared first on From Silicon To Software...
May 4, 2021
What a difference a year can make! Oh, we're not referring to that virus that… The post Realize Live + U2U: Side by Side appeared first on Design with Calibre....
May 3, 2021
As a NASA flight enthusiast, the idea of unmanned aerial vehicle systems (also known as drones) sounds like a lot of fun. A good example of how fun drones can be is through drone racing'¦yes you read that right'¦ drone racing! However, apart from how fun they can be, drones...
May 2, 2021
https://youtu.be/1HEd6JCriCQ Made in Groveland CA (camera Carey Guo) Monday: Package Assembly Design Kits Tuesday: Rapid Adoption of the Arm Server-Class Processors Wednesday: Arm V9A Thursday:... [[ Click on the title to access the full blog on the Cadence Community site. ]...

featured video

The Verification World We Know is About to be Revolutionized

Sponsored by Cadence Design Systems

Designs and software are growing in complexity. With verification, you need the right tool at the right time. Cadence® Palladium® Z2 emulation and Protium™ X2 prototyping dynamic duo address challenges of advanced applications from mobile to consumer and hyperscale computing. With a seamlessly integrated flow, unified debug, common interfaces, and testbench content across the systems, the dynamic duo offers rapid design migration and testing from emulation to prototyping. See them in action.

Click here for more information

featured paper

Designing TI mmWave radar made easier using our third-party ecosystem

Sponsored by Texas Instruments

If you are new to radar or interested in replacing your existing sensing technology with radar, there can be a significant learning curve to both designing your product and ramping to production. In order to lower this barrier, Texas Instruments created a third-party ecosystem of radar experts who can provide solutions no matter how much help you need.

Click to read more

Featured Chalk Talk

Protecting Circuitry with eFuse IC

Sponsored by Mouser Electronics and Toshiba

Conventional fuses are rapidly becoming dinosaurs in our electronic systems. Finally, there is circuit protection technology that doesn’t rely on disposable parts and molten metal. In this episode of Chalk Talk, Amelia Dalton chats with Jake Canon of Toshiba about eFuse - a smart solution that will get rid of those old-school fuses once and for all.

Click here for more information about Toshiba efuses