A Bonus Generation

Xilinx Rolls UltraScale

The FPGA world has a unique obsession with semiconductor process nodes. Every two years or so we witness an epic battle between the two major market-share holders, centered mostly around who gets their devices working first on the next new semiconductor process. Historically, the stakes were very high. With FPGAs being among the first devices to go to production on a new node, and with the high-margin spoils of victory going largely to the winner – the biennial financial fates of the two big FPGA companies rode heavily on winning the next-generation derby.

Now, Xilinx is starting to ship samples of their new UltraScale family, based on TSMC’s 20nm planar CMOS process. This kind of “first to ship” announcement is usually a sign of impending victory, as the first to sample is usually the first to ship in volume and the first to collect the bulk of eager early adopters champing at the bit to design the biggest, baddest silicon into their next system. Communications infrastructure has always been the largest segment of the FPGA market, and most of the big players in that segment have used devices from both Xilinx and Altera, so those big-budget projects are ripe for the picking any time a new leap in performance, density, and features becomes available.

In this case, things are looking pretty rosy for Xilinx, but the picture has gotten a lot more complicated, and it now involves a lot more than just who got to the next node first. Yes, there is certainly still big drama on the process front. In fact, you could argue that there is perhaps now more intrigue than ever. In the previous node, where both companies built product families based on TSMC’s 28nm process, Xilinx clearly came out on top, getting to market first and rolling in with a pile of innovations including Vivado – a brand-new, ground-up re-write of the company’s entire tool suite with ASIC-class capabilities – and some enormous interposer-based FPGAs aimed at the prototyping segment. Altera struggled with 28nm, shipped later than their competitor, and took a sizable financial and market-share hit as a result.

For the next generation, Altera pulled an unexpected rabbit out of the hat – making a deal with Intel to build their next family on Intel’s upcoming 14nm FinFET technology. In theory, this could be a big boost for Altera. Xilinx has stuck with TSMC and has continued on the track to deliver their new UltraScale family on TSMC’s 20nm planar technology, and then to move to FinFETs with TSMC’s upcoming 16nm process. Altera, by contrast, is skipping the 20nm node with their flagship Stratix family, so this current UltraScale play by Xilinx will go unanswered at the high end – for a few months at least.

Did we mention that the game has gotten more complicated? As intriguing as this perpetual process geometry chess match has become, it is probably high time for it to share the stage with other, equally important factors. Winning in FPGAs today takes a lot more than just fancy chips on the latest semiconductor process. Just ask the numerous startups that came to market with impressive devices, only to fail because they lacked the tools, the IP, and, most importantly, the armies of expert AEs deployed by the big two companies to help customers turn those cool chips into cool products.

In our own repeated surveys of the FPGA market, the number one factor in choosing an FPGA company for a particular design project is “Previous success with vendor’s tools and devices.” In other words, more than the fastest SerDes, the biggest IO counts, the fanciest DSP blocks, the lowest power, or the largest LUT arrays, FPGA designers care about their own confidence in just getting the darn thing to work. If we know a tool suite (including the bugs and their workarounds), have experience with all the quirks of a particular vendor’s chips, and have successfully dropped them into a socket on our boards, we are most likely to go with the same vendor again. It takes a pretty huge advantage in chip capabilities to get us to consider jumping ship and joining the other team.

Xilinx obviously understands this, and the UltraScale announcement shows that the company is taking the breadth of the fight seriously. Vivado now has a full generation of use under its belt, and it no longer walks on the wobbly legs of a few million lines of brand-new code. It brings the kind of performance and capability we will need if we plan to take advantage of the serious capabilities a family like UltraScale brings to the party. While FPGA tools have long paid lip service to faster compile times, sophisticated timing optimization, power optimization, and IP management, Xilinx has really nailed an industrial-strength solution with Vivado. It’s a good thing, too, because the aging ISE suite would not be up to the challenge of the over-four-million-logic-cell monster at the high end of the new UltraScale family.

UltraScale is interesting in that it is the first family that Xilinx has had the opportunity to design using Vivado. The company ran exhaustive architectural explorations – experimenting with different configurations of routing resources over a wide range of design styles – to come up with an architecture that would allow very high utilization in the vast majority of cases. For the past several generations, FPGA densities on data sheets have come along with a bit of a nod and a wink. We all knew that no matter what the sheet said, we couldn’t get close to 100% utilization in any real-world design. We had to upsize our chip choice to be sure that we not only had enough LUTs, but that we’d have the routing resources to be able to successfully place and route our design and meet our timing constraints. UltraScale’s better-optimized architecture means that we can drop this fudge factor, or at least make it significantly less pessimistic.
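To make the fudge factor concrete, here is a minimal Python sketch of the sizing arithmetic. The design size and both utilization figures are purely illustrative assumptions, not numbers from Xilinx or from this article, and `required_device_luts` is a hypothetical helper name.

```python
import math


def required_device_luts(design_luts: int, achievable_utilization: float) -> int:
    """Return the LUT capacity a device needs so the design fits
    at the given achievable utilization (a fraction in (0, 1])."""
    if not 0.0 < achievable_utilization <= 1.0:
        raise ValueError("achievable_utilization must be in (0, 1]")
    return math.ceil(design_luts / achievable_utilization)


# Hypothetical design size and utilization ceilings, for illustration only.
design_luts = 1_000_000
pessimistic = required_device_luts(design_luts, 0.70)  # assumed older-style fudge factor
optimistic = required_device_luts(design_luts, 0.90)   # assumed improved routability

print(f"Device size needed at 70% utilization: {pessimistic:,} LUTs")
print(f"Device size needed at 90% utilization: {optimistic:,} LUTs")
```

The point of the sketch is only that the required device size scales inversely with the utilization you can count on, so even a modest improvement in routability can move a design down a device size.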

The UltraScale families themselves boast some impressive numbers. The largest device in the upcoming Virtex UltraScale family – the VU440 – has an incredible 4.4 million “logic cells” (4-input LUT equivalents). Aimed at the ASIC prototyping market, the device has 1,456 user IOs, 48 transceivers running at 16.3 Gb/s, and 89 Mbits of block RAM. The company estimates that the device can implement the equivalent of 50 million ASIC gates. For people designing big prototyping boards or emulators, that kind of capacity is a really big deal. Other, smaller members of the family are aimed at more conventional markets, and they include even higher-performance SerDes – with 28 Gb/s backplane-capable transceivers and up to 33 Gb/s chip-to-chip/chip-to-optics transceivers. Of course, the devices include a very rich set of hard IP, including PCIe Gen3, 100 Gb/s Ethernet MAC, 150 Gb/s Interlaken, and DDR4 memory interfaces. All that IO capability makes the devices ripe for implementations of a number of single-chip 400 Gb/s applications.
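A quick back-of-the-envelope check on the VU440 figures quoted above can be sketched in Python. The inputs are the numbers from this article; the derived values are simple arithmetic, and the aggregate bandwidth is a raw line-rate sum that ignores encoding and protocol overhead (an assumption of this sketch, as are the function names).

```python
# Figures quoted in the article for the VU440.
LOGIC_CELLS = 4_400_000
ASIC_GATE_EQUIVALENT = 50_000_000
TRANSCEIVER_COUNT = 48
TRANSCEIVER_RATE_GBPS = 16.3


def gates_per_logic_cell() -> float:
    """ASIC gates implied per logic cell by the company's estimate."""
    return ASIC_GATE_EQUIVALENT / LOGIC_CELLS


def raw_aggregate_serial_gbps() -> float:
    """Raw one-direction line rate summed over all transceivers,
    ignoring encoding and protocol overhead."""
    return TRANSCEIVER_COUNT * TRANSCEIVER_RATE_GBPS


print(f"ASIC gates per logic cell: {gates_per_logic_cell():.1f}")
print(f"Raw aggregate serial bandwidth: {raw_aggregate_serial_gbps():.1f} Gb/s")
```

On these numbers, the company's estimate works out to roughly 11 ASIC gates per logic cell, and the 48 transceivers together offer a little under 800 Gb/s of raw line rate in each direction.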

It is a certainty that many such applications will come Xilinx’s way, as UltraScale will run without competition for some significant amount of time. Xilinx still has a lot of challenging work ahead of them, of course. Delivering the first samples of the first devices is far from having all members of a new family shipping in volume. But delivering those first samples is a benchmark that competitors still will not likely reach for quite some time, and, until then, Xilinx pretty much owns the playground. 
