Heterogeneous FPGAs are Here!

In the wild world of semiconductor marketing, it is often difficult to tell what is true, what is false, what is hoped or wished, and what is probably-going-to-be-true-pretty-soon (so we’ll just go ahead as if it were true anyway).

3D FPGAs are like that.

We all want 3D FPGAs, right? Right? Well, maybe, uh – 3D ones certainly seem like they’d be an improvement over 2D ones and – I enjoyed “Avatar,” but my friend got motion sickness… OK, what can 3D FPGAs do that 2D ones can’t?

For a lot of us, the advent of 3D FPGAs is like the arrival, not so many years ago, of four valves per cylinder in our car engines, or “tuned port” fuel injection. Most people who bought those cars didn’t actually know what those terms meant or what those innovations did to improve their overall driving experience. They just thought the decals on the side looked cool. Engineering had delivered innovative and useful technology. Marketing had figured out a way to get people to buy it. The two didn’t have to be related.

Xilinx has just announced that they are now “shipping the world’s first heterogeneous 3D FPGA.” Cool! The business section of the New York Times even ran the press release so that Wall Street financiers can revel in the knowledge that the device has the ability to “deliver the eye and jitter characteristics needed to reach the performance required to interface to CFP2 optic modules.” I’ll bet THAT is a load off their minds. Or maybe the decals are just cool.

When your customers are engineers, you sometimes have to be a bit more rigorous in your explanations. Engineers are seldom impressed by cool decals. Engineers understand the substance behind the technology. Engineers will read an article like this and say “Hey, when the heck are you going to get on with telling us what 3D FPGAs can do?”

How about now?

The biggest potential benefit of interposer-based technologies like those used by Xilinx in the Virtex-7 H580T device (whose initial shipments are the subject of the recent news release) is the ability to mix and match semiconductor slices fabricated with different technologies onto a single silicon interposer where they can be interconnected with massive numbers of connections – without having to go through traditional IOs, bonding pads, pins, PCB traces, and back again.

Why is this good?

Let’s say you are designing a new FPGA. You want to choose the semiconductor process that works the best for your LUT fabric. It needs to be fast but have low leakage current. Maybe you end up choosing something like TSMC’s 28nm HPL (low-power with high-K metal gate) process. It works great! Now, you need to design yourself some 28 Gbps SerDes transceivers. Uh, oh. The process you just chose isn’t so good for that. However, people are already running some nice 28 Gbps transceivers using plain-old 40nm technology. How about we just make the LUT part with 28nm HPL and glue that to an interposer along with some 40nm slices with our super-duper transceivers on them? BAM! We’ve got a new chip with the best of both worlds. Problem solved.

Uh, not quite so fast, there. Problem solved, but many more problems created.

Xilinx is doing here what technology leaders are supposed to do. They’re leading. Sometimes, that means flirting with the “bleeding edge” a bit – which is OK as long as you get away with it most of the time. Many savvy semiconductor suppliers have experimented with interposer-based devices. The technology is new, but it is there. Most of those companies – nVidia, for example – have concluded that, for now, this more-appropriately-termed “2.5D” TSV/interposer-based technology is not yet ready for prime time.

But, nVidia’s prime time may not be the same as Xilinx’s prime time. The difference is target markets/applications and prices. As usual, it’s all about economics. That’s what those Wall Street types SHOULD have been reading about. nVidia probably isn’t that interested in developing a chip that can be produced only in small quantities and costs in the vicinity of $5,000 per copy. Xilinx, on the other hand, has customers who would love the opportunity to get their hands on one. It would enable them to create high-value applications that they could not realize any other way. Xilinx, therefore, presses ahead developing and shipping product while others do not. In doing so, Xilinx is doing the semiconductor industry (and themselves most likely) a huge favor by bringing TSVs and silicon interposers to life in real products and pushing them that much closer to viability for higher-volume production someday.

Putting the “poser” back in “interposer”…

When Xilinx did their first demonstrations of 28 Gbps transceivers about a year ago, we got some quiet but urgent chatter through our editorial sensor array from “persons not authorized to speak to the press.” “Xilinx is CHEATING!” they said. “Those 28 Gbps transceivers they’re demonstrating are not 28nm technology. They’re based on 40nm technology!” The assumption was that since Xilinx was in the middle of rolling out their 28nm FPGA families, any demonstration of 40nm transceivers was a sham. It meant Xilinx hadn’t yet conquered the (assumed) requirement to make 28 Gbps transceivers work with 28nm HPL process technology. The whisperers hadn’t yet connected the dots between Xilinx’s silicon interposer work and their demonstration of capabilities fabricated on different semiconductor processes.

Mixing and matching chunks of silicon made from different processes seems to offer all kinds of potential benefits. No longer will we have to make a huge compromise by selecting just one semiconductor process technology and then trying to design every part of a complex SoC with that same process. We can optimize our process selections for each part of our design – analog, digital, memory, programmable fabric, processor cores, multiple power/performance levels… the possibilities seem almost endless. The potential benefits seem huge. On the other hand, there are those rumors that TSMC will provide only ONE process technology at the upcoming 20nm node. That kind of news might rain on our parade a bit.

For now, what Xilinx is able to do with this technology is impressive. Xilinx’s Virtex-7 HT family boasts up to sixteen 28 Gbps and seventy-two 13.1 Gbps transceivers. Using the HPL process technology allows Xilinx to produce more LUT fabric with less leakage power, and moving the SerDes onto their own dies on the interposer also gives a nice bonus of isolation between the LUT fabric and the transceiver section, reducing noise significantly. Xilinx has sorted out how to mate these transceivers nicely with the new CFP2 optical modules to create some really compelling single-chip solutions for ultra-high bandwidth (nx100G and 400G) line-card applications.
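For a rough sense of scale, here is a back-of-the-envelope sketch of what those transceiver counts add up to. It uses only the lane counts and rates quoted above. Two things are assumptions of mine, not claims from Xilinx: that a single device carries both maximum counts at once, and that a 100G CFP2 port uses four electrical lanes. The figures are raw line rates and ignore encoding and protocol overhead.

```python
# Raw line-rate arithmetic for the transceiver counts quoted in the article.
# ASSUMPTION (not from the article): one device carries both maximum counts.
# ASSUMPTION (not from the article): a 100G CFP2 port uses 4 electrical lanes.

FAST_LANES = 16           # 28 Gbps transceivers (from the article)
FAST_RATE_GBPS = 28.0
SLOW_LANES = 72           # 13.1 Gbps transceivers (from the article)
SLOW_RATE_GBPS = 13.1
LANES_PER_100G_PORT = 4   # assumed, see note above


def aggregate_gbps(lanes: int, rate_gbps: float) -> float:
    """Raw one-direction line rate for a group of identical lanes."""
    return lanes * rate_gbps


fast_total = aggregate_gbps(FAST_LANES, FAST_RATE_GBPS)
slow_total = aggregate_gbps(SLOW_LANES, SLOW_RATE_GBPS)
ports_100g = FAST_LANES // LANES_PER_100G_PORT

print(f"28 Gbps lanes:   {fast_total:.1f} Gbps raw")
print(f"13.1 Gbps lanes: {slow_total:.1f} Gbps raw")
print(f"Combined:        {fast_total + slow_total:.1f} Gbps raw, per direction")
print(f"100G ports from the fast lanes (assumed 4 lanes each): {ports_100g}")
```

Under those assumptions, the sixteen fast lanes alone work out to four 100G ports, which squares with the 400G line-card applications mentioned above.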

If you’re not designing one of those applications (or something very much like them), you don’t want one of these devices. Frankly, they are out of your price range, and you can do what you need to with much less expensive devices. However, if you’re among the few who are trying to reach these frightening data speeds, this is currently the only game in town. For the rest of us, it is exciting just to see that silicon interposer technology is being shipped in commercial products – with the knowledge that someday (if not soon) similar technologies will be bringing big benefits to the systems we’re designing.