The crushing exponential onslaught of Moore’s Law is a harsh mistress. Each new process node demands a new round of superlatives and introduces a new way of thinking about the world of electronic design. The things we knew before are now quaint, dated, outmoded. The new things we are just learning (which will be all-too-soon relegated to the “norm”) stretch our collective imaginations as we interpret the implications of a whole new vocabulary on our day-to-day design lives.
When we first hear about what the next node will bring, we are often dumbstruck. No matter how many times we have experienced this topsy-turvy inversion of our expectations, we are surprised anew. The human psyche was just not designed to deal with long-term exponential change.
Altera has just rolled out their roadmap for 20nm. Calling this an “FPGA” roadmap would do it a disservice. We are apparently on the threshold of an era where terms like “FPGA” may lose their meaning. Just as we no longer have one chip for our inverters and another for our NAND gates, or separate chips for our shift registers and our 8-bit latches, we will soon no longer have separate chips for FPGAs and our other major system components. The integration bar is going up. Everybody will be making “SoCs”.
Over the years, we have watched the integration train press forward with monolithic devices. The vanishing point, of course, was out there on the horizon – the place where everything in our entire system was contained in one chip. Unfortunately, the reality of CMOS processes defied that goal. The best process for fabricating memory was different from the best process for fabricating analog, processors, programmable logic, and so forth. Chip designers worked in vain to find the best compromise – a process that could do all of the things they wanted at the same time without giving up too much on any one function. Compromising while also trying to keep up with the rocket-treadmill of Moore’s Law proved untenable. Merchant fabs gave designers a buffet of process variations to choose from – passing the buck on the issue. It didn’t work.
At 20nm, two very interesting trends converge. First, it seems that TSMC will offer only one process at 20nm. Apparently, getting one process to work at that scale is hard enough, and the prospect of coming up with three or four different recipes for different design palates was a bit over the top. Second, at 20nm we will move closer to commoditization of heterogeneous 2.5D packaging.
Taking the first trend first, and specifically in the FPGA/programmable logic/“all-programmable”/SoC market, it means that Altera and Xilinx may no longer be able to claim differentiation from one another based on different underlying semiconductor processes. Although Xilinx has not disclosed 20nm details yet, we expect that both companies will be rolling out their 20nm offerings on exactly the same TSMC process. They’ll both have to find other fodder for press-release one-upmanship. Of course, we have not ruled out some surprise announcement that Xilinx would partner with another supplier – like Intel, for example – which could ignite the “our process is better than yours” battle anew.
The second trend – commoditization of heterogeneous 2.5D packaging – is more interesting. When the manufacturing techniques and standards have evolved so that heterogeneous 2.5D is commercially viable, the universe of ICs will change profoundly. No longer will companies have to choose a compromise process when they’re creating a massively-integrated SoC. They’ll be able to mix-and-match die/slices from a variety of processes – and even a variety of process nodes. Optimized, pre-tested memory slices will be parked alongside carefully-crafted multi-processor subsystems from a completely different manufacturing line. Flash memory with a much larger geometry will be included as well. Blazing-fast multi-gigabit IO and optical interfaces will be included – built on semiconductor processes that do not have to compromise their performance. Analog, and perhaps even MEMS components, may be integrated on the same interposer.
In short, the system that we would assemble today on a PCB may be composed entirely on an interposer and packaged up as a single “chip.” Why would we do this? Because it’s cool, of course. OK, there are other, more important reasons, too. First, our current HUGE IC packages max out at around 2,000 pins, which caps the number of connections any single IC can make to another at far fewer than that. Using an interposer, however, there can be tens of thousands of connections between chips. More connections mean more bandwidth and more flexibility in system design. Second, moving a signal out through a pin, across a PCB, and back into another device takes a lot of power. Moving signals from slice to slice through an interposer requires orders of magnitude less power. See? It’s cool.
In addition to increased chip-to-chip bandwidth and dramatically lower power consumption, there are other, more obvious potential benefits: board-space savings and improved reliability. Clearly, dropping your entire system – pre-tested and sealed – onto a board where it connects only to large-scale peripherals and physical components simplifies the rest of the design process dramatically.
We are nowhere near this vision yet, however. And we will not be there in a couple of years when 20nm goes into volume production, either. At 28nm, Xilinx has proven that heterogeneous 2.5D chips can be manufactured and tested – albeit at an enormous price premium. There are many problems to be solved before that vision can be commoditized: thermal issues, reliability, manufacturing costs, and standards for mixed-vendor collaboration on connections through an interposer. The entire industry will be involved in bringing the vision to reality, and it will take closer to a decade than a single process node. Once we’re there, however, the rules will be drastically different. When you have a “chip” with memory from one vendor, programmable fabric from another, a processing subsystem from another, and so forth, whose logo will be silkscreened on the top? Who will be the “integrator” trusted by their customers to bring the whole thing to market?
Altera’s 20nm announcement contains a lot more than visions of heterogeneous 2.5D packaging and the usual “bigger, faster, cheaper, cooler” rigmarole. Altera says they will have blazing-fast 40Gbps SerDes, upgraded versions of their already-awesome floating-point variable-precision DSP blocks, and a new spin on their existing “HardCopy” ASIC service – allowing HardCopy devices to be dropped into the above-mentioned heterogeneous 2.5D ICs along with other system components and programmable fabric.
The company is also pushing hard on the tool front. To us, it seems that tools will most likely be the next battlefield. With enormous SoCs containing sophisticated high-performance multi-core processing subsystems and copious amounts of programmable logic fabric, the hardware platform will finally exist to support the mythical “Hardware/Software Co-Design” vision that has been touted (but not delivered) by the industry for the past 20 years. In that vision, processing tasks can be seamlessly divided and even re-partitioned between software implementations and hardware implementations – giving the perfect balance of performance, power efficiency, and silicon real estate.
Now that we’ll have hardware that can actually do it, we’ll need the design tools to back that up. Nobody has yet created a tool suite that can smoothly and easily switch between hardware and software implementations of the same algorithm in the general case (not just for specialized problems like DSP datapaths). The EDA industry has struggled to bring the linchpins of such technology to life: high-level synthesis, algorithmic design tools, transaction-level verification flows, analysis tools for hardware/software partitioning… The state of the art of these tools today is still primitive compared with what we’ll need before the average software developer can run a performance analyzer on some code, flip a switch on a couple of performance-intensive routines or objects, and have them transparently compiled into optimized hardware accelerators. Altera’s approach to this problem seems to be based on OpenCL – an evolving standard for programming massively parallel processors like graphics chips. Other vendors will take different paths. None are yet close to practical reality.
So, why is Altera choosing to announce 20nm innovations now? With every process node, there is a cat-and-mouse marketing and PR strategy game as to when to announce. We have discussed, and even poked fun at, this game on previous process nodes. (See: 45nm Chicken) Both Altera and Xilinx have clearly been working on 20nm technology for years, and they will continue working for years before you hold your first 20nm product in your hands. There is no big technological milestone that was just achieved allowing one or the other company to say, “Ta-Daaaa! We are the first to announce 20nm!” In reality, this announcement could be made any time within a one- to two-year period.
Xilinx would say that Altera is announcing now because Altera is not happy with the way the game is playing out at 28nm. While both companies have respectable offerings at 28nm shipping today, Xilinx has certainly grabbed more attention with announcements of innovations – including the first 2.5D production chips. If you think you’re behind in the game you’re playing, change the game. Altera would probably counter that they’ve gained market share the past couple of years against Xilinx, and that their earlier 20nm disclosure is a sign of technological leadership on the new process node. We will let you decide which one of those to believe. Kool-Aid is on sale at EE Journal this week – both red and blue flavors.
Marketing hype and positioning aside, it’s impossible not to be a little dumbstruck by the possibilities of 20nm and fascinated by the vision of the devices we may be able to design within a couple of years. Electronic design is almost never boring. The assumptions on which we build our amazing products are constantly being re-written, and we have to race with our collective imaginations to keep pace. It’s a fun ride.