It’s time! Grab your popcorn and settle in for the ride, FPGA fans. The biennial spectacular has just begun. The first gladiator has entered the arena, waved to the crowd, and lifted his weapons. We are about to witness the technological battle of the century in programmable logic.
Every two years, the two big FPGA vendors play a high-stakes game of “chicken” to see who is first to announce their plans for the next semiconductor technology node. At stake – bragging rights for being “first” to announce the next level of capabilities. The risk? Your competitor gets to look at your announcement and use it as a baseline for their own – exploiting any weaknesses in your announced plans and tailoring their own message to stack up favorably. You also risk angering your customers if the announcement precedes the actual product by too much.
For the past three nodes or so, we’ve seen a back-and-forth battle between Altera and Xilinx. Most people think that Altera got the upper hand in 40/45nm products with their Stratix IV family. Two years later, Xilinx struck back hard at 28nm with Virtex-7. Now, it’s time for the “next” generation, and Altera is apparently ready to get the party started. The company has just announced their upcoming “Generation 10” FPGA families – and it looks like this node is gonna be a doozy!
Before we dive into the background too much, let’s try some numbers on for size. How about FOUR MILLION logic elements (or equivalents) on a single monolithic die? For reference, at 28nm Xilinx’s Virtex-7 2000T used several chips interconnected on a silicon interposer (at great expense) to achieve 2 million logic elements. For monolithic FPGAs, we are looking at a 4x generation-to-generation density increase.
OK, what about speed? Let’s start with SerDes. At 28nm we were awed by 28Gbps SerDes transceivers. Now – we’ve doubled again to 56Gbps. On the Fmax front – for logic implemented in the FPGA LUT fabric itself – a mind-blowing 1GHz – double the frequency of the 28nm ancestors. This is particularly surprising given that the Fmax numbers have been relatively flat for the past several generations. In the compromise between power, area, and speed – speed has been put on the back burner for a while, because, apparently, people didn’t like chips that melt. Now, our melting days must be behind us, because we’re cranking up the clock again.
Aha! (I hear you say) That must mean that power has gone through the roof. Not so fast, there, power pontificators. Altera says that you’ll be able to dial anything from a 70% reduction in power for the same performance – to a 50% improvement in performance at the same power – to a doubling of performance with a modest 30% increase in power. It sounds like free money.
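To put those three operating points on a common scale, here is a back-of-the-envelope sketch in Python that turns each claim into a performance-per-watt ratio. The inputs are the announced figures quoted above, not measurements; the function name and the idea of normalizing today's part to 1.0 for both performance and power are our own assumptions for illustration.

```python
# Rough perf-per-watt comparison of the three claimed operating points.
# The current-generation part is normalized to performance = 1.0, power = 1.0
# (an assumption made only so the claims can be compared as ratios).

def perf_per_watt(perf: float, power: float) -> float:
    """Relative performance per watt versus the normalized baseline."""
    return perf / power

# (relative performance, relative power) for each announced claim
operating_points = {
    "same performance, 70% less power": (1.0, 0.30),
    "50% more performance, same power": (1.5, 1.00),
    "2x performance, 30% more power": (2.0, 1.30),
}

for label, (perf, power) in operating_points.items():
    print(f"{label}: {perf_per_watt(perf, power):.2f}x perf/W")
```

If the claims hold, the low-power end works out to roughly 3.3x the performance per watt, while the two faster points both land near 1.5x. In other words, the biggest efficiency win is for designs that stay at today's clock rates.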
So – where does all this suspiciously magical goodness come from? Apparently FinFETs are as good as people say. Altera partnered with Intel for the new Stratix 10 family – and it will be fabricated on Intel’s 14nm Tri-Gate (Intel’s name for FinFETs) technology. Compared with planar transistors, FinFETs deliver significantly better performance/power characteristics, and FPGAs are perfectly suited to capitalize on that advantage.
Or, at least Altera’s FPGAs are. Xilinx hasn’t yet announced their next-node plans, but the company’s next families – originally expected to be based on TSMC’s 20nm planar technology – may be going for an accelerated schedule on TSMC 16nm FinFET. This Altera announcement undoubtedly puts enormous pressure on Xilinx to get to market with FinFET-based FPGAs of their own. Stratix 10 is expected to have test chips running in 2013 and software support in 2014, so the race is definitely not over. Intel is widely believed to be ahead of TSMC in 14/16nm technology in general – and in FinFETs in particular (Intel has FinFETs in their 22nm process as well) – so Altera would appear to be riding the favored horse here.
Altera is also throwing in yet another redesign of their DSP blocks – yielding what the company claims will be a 10x improvement in DSP performance. The new DSP blocks have floating-point capability hard-wired, so you won’t be using as much of your precious four million cells of logic fabric for DSP applications. We need more information to see the real-world implications of this DSP boost, but it sounds impressive on paper.
On the functionality front – Altera is announcing that both their Stratix and Arria families will have “SoC” versions in Generation 10. What they’re not yet announcing is what processor architecture will be used in the Stratix version of the SoC, so the question of Intel fabricating ARM-based processors will have to go unanswered for now. Let the religious-war speculation begin!
Stratix 10 is not the only family in “Generation 10,” of course. Apparently a single family does not a “Generation” make. Altera is also announcing “Arria 10” (there is no mention of a low-cost “Cyclone 10” as of yet). Arria 10 will not be based on the Intel 14nm Tri-Gate process. Instead, it will be fabricated using TSMC’s more-established 20nm planar process. As a result, Arria will ship sooner – with initial samples of Arria 10 devices available in early 2014.
So, what’s Arria 10 all about? At a high level – Arria 10 will give you the capabilities of the current high-end Stratix V FPGAs (and a bit more in several ways) in a mid-range FPGA. Specifically, Arria 10 is slated to have 15% better performance than the current Stratix V family, with densities ranging up to 1.15 million logic elements. The SoC version will have two ARM Cortex-A9 cores operating at up to 1.5 GHz (the current version operates at 800 MHz). On the SerDes front, Arria 10 will have 28Gbps transceivers – the same as today’s Stratix V FPGAs. What about memory? We’ll have 2666 Mbps DDR4 support and up to 15 Gbps Hybrid Memory Cube support. The company also claims that Arria 10 will reduce power consumption by around 40% compared with the current generation.
So – how long is it gonna take to run the design flow on these new monster devices? The good news there is that Altera also claims that the Quartus II software will deliver 8x faster compile times than the current software. If you factor in the increase in design complexity possible with these chips, that would seem to indicate that your design iteration times should at least stay the same – and probably get better. Again, we haven’t seen real-world examples, so this is speculation at this point.
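That reasoning is simple ratio arithmetic, sketched below in Python. The 4x design size and 8x tool speedup come from the claims above. Everything else is an assumption for illustration: the function name is hypothetical, and the model treats compile time as growing with design size raised to some exponent, which is not a measured property of Quartus II.

```python
def relative_iteration_time(size_growth: float, tool_speedup: float,
                            scaling_exponent: float = 1.0) -> float:
    """Compile time for the larger design on the faster tool, relative to today.

    Assumes compile time is proportional to size ** scaling_exponent.
    This is an illustrative model, not vendor data.
    """
    return (size_growth ** scaling_exponent) / tool_speedup

# 4x larger design, 8x faster software, compile time linear in design size
print(relative_iteration_time(4.0, 8.0))

# Same claims, but compile time growing faster than linearly (exponent 1.5)
print(relative_iteration_time(4.0, 8.0, scaling_exponent=1.5))
```

Under linear scaling, a design four times larger would compile in about half of today's time. With an exponent of 1.5 the two effects cancel and iteration time stays flat, which is why "at least stay the same" is the safer reading until real-world numbers show up.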
Altogether, this is an impressive announcement from Altera – although we need to point out that it is a “future” announcement, and the success or failure will completely depend on the company’s ability to deliver these capabilities on an aggressive schedule. With Xilinx breathing down their collective necks, they would seem to be highly motivated to do just that.