Intel FOVEROS 3D Packaging

Moore’s Law said that we could double the number of transistors on an integrated circuit every two years. But there were a lot of variables in that equation. For example, the industry has always assumed that an “integrated circuit” was a single, 2D monolithic silicon chip. But Moore’s Law didn’t make that distinction. It also didn’t specify an area for the silicon chip, meaning that more transistors could be crammed onto larger chips without breaking the rules of the game. And, for decades, engineers have wondered about breaking down the third wall and stacking circuits vertically. If we could do that (we reasoned), our transistor density would expand by the cube of any improvements in feature size, rather than just the square.
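To put rough numbers on the square-versus-cube argument, here is a minimal sketch. It assumes an idealized stack in which the vertical layer pitch shrinks by the same factor as the planar features, which no real process guarantees; the function name and the shrink factors are ours, chosen only for illustration.

```python
def density_gain(shrink: float, dims: int) -> float:
    """Idealized density multiplier when each of `dims` dimensions scales by `shrink`.

    shrink: new feature size / old feature size (e.g. 0.7 for a 30% linear shrink).
    dims:   2 for a planar chip, 3 for an idealized stack whose layer pitch
            shrinks by the same factor (an assumption, not a real-process property).
    """
    if not 0.0 < shrink <= 1.0:
        raise ValueError("shrink must be in (0, 1]")
    if dims not in (2, 3):
        raise ValueError("dims must be 2 or 3")
    return (1.0 / shrink) ** dims


if __name__ == "__main__":
    # Illustrative shrink factors only; none of these come from a real roadmap.
    for shrink in (0.9, 0.7, 0.5):
        planar = density_gain(shrink, 2)
        stacked = density_gain(shrink, 3)
        print(f"shrink {shrink:.1f}: 2D x{planar:.2f}, idealized 3D x{stacked:.2f}")
```

Halving the feature size gives 4x the density in two dimensions and 8x in the idealized three-dimensional case, which is the whole appeal of going vertical.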

Ah, but physics and math are harsh mistresses. As chips grow larger and denser, the probability of failures that affect yield increases dramatically. And, since heat is generated by every toggling transistor and needs somewhere to go, stacking die has always been extremely problematic from a thermal perspective. Like jetpacks and teleportation, our visions of huge 3D ICs were confined to the realm of sci-fi and fantasy. Of course, we’ve all seen the limited case of stacked memory die, but the true challenge of stacking arbitrary heterogeneous functional elements into one device has gone unanswered.
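The yield half of that complaint can be sketched with the simple Poisson defect model, Y = exp(-A * D0), a common textbook approximation rather than anything Intel has published. The defect density and die areas below are hypothetical, and the second function ignores assembly and test costs entirely.

```python
import math


def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under the Poisson model Y = exp(-A * D0)."""
    if area_cm2 < 0 or defects_per_cm2 < 0:
        raise ValueError("area and defect density must be non-negative")
    return math.exp(-area_cm2 * defects_per_cm2)


def silicon_per_good_system(total_area_cm2: float,
                            defects_per_cm2: float,
                            pieces: int = 1) -> float:
    """Wafer area consumed per working system when the design is split into
    `pieces` equal dies that are each tested before assembly.

    Assumes perfect die testing and 100% assembly yield (both optimistic).
    """
    if pieces < 1:
        raise ValueError("pieces must be at least 1")
    die_area = total_area_cm2 / pieces
    return pieces * die_area / poisson_yield(die_area, defects_per_cm2)


if __name__ == "__main__":
    d0 = 0.2  # hypothetical defects per square centimeter
    for area in (1.0, 2.0, 4.0):
        print(f"{area:.0f} cm^2 die: yield {poisson_yield(area, d0):.1%}")
    for pieces in (1, 2, 4):
        used = silicon_per_good_system(4.0, d0, pieces)
        print(f"4 cm^2 system in {pieces} piece(s): {used:.2f} cm^2 per good system")
```

Under these assumptions yield falls exponentially with die area, so the same logic carved into smaller, individually tested dies wastes less silicon per working system.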

Until now.

Intel has announced a new technology called FOVEROS, which is the closest thing yet to the idealized vision of a true 3D IC. And, as a demonstration vehicle, they’ve built an entire multi-processor computer in a package smaller than a dime. Of course, typing on that tiny keyboard will take some nimble fingers… (just kidding).

The first we heard of FOVEROS was at Intel’s Architecture Day last month (https://www.eejournal.com/article/intels-grand-vision/). FOVEROS, which one Intel spokesperson told us is a Greek word meaning “Awesome,” is a brand for a suite of packaging technologies that allows stacking of disparate die. Intel uses an active interposer which, unlike a typical passive silicon interposer, can contain active parts of the system. Now, Intel can combine chiplets based on different process technologies, each optimized for its task. By picking the best transistor for each function (CPU, IO, FPGA, RF, GPU, etc.), the system can be optimized, and we’re not stuck trying to get major blocks of the system to perform well with a process not well suited to the function.

Additionally, by stacking chiplets vertically (rather than horizontally as with a typical passive silicon interposer), Intel is able to get around a major bottleneck in high-performance system-in-package design: memory proximity. In addition to the through-silicon vias (TSVs) and traces found on a passive interposer, which connect power and data to the attached chiplets, the FOVEROS active interposer in the example Intel showed contains the platform controller hub (PCH) that manages IO for the system. We could think of it as a PCH with attachment vias that allow system components to be bonded directly on top.

At Architecture Day, Intel had a demo design built with FOVEROS fired up and working. It was an x86 design combining a large Sunny Cove x86 core with four smaller Atom cores, all on one 10nm device, to form a hybrid x86 CPU (similar to ARM’s big.LITTLE). The block diagram showed one Sunny Cove core with 0.5 MB of private medium-level cache, and four Atom cores with a shared 1.5 MB L2 cache. Then there is an “Uncore” that contains 4 MB of last-level cache, a quad-channel 4×16-bit memory controller supporting LPDDR4, a Gen11 graphics controller, a Gen 11.5 display controller, an imaging IPU, and MIPI with DisplayPort 1.4. On top of that was package-on-package (PoP) memory. The whole setup is in a package smaller than a dime.
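For a sense of scale, the figures from that block diagram can be totted up in a few lines. The cache sizes and the 4×16-bit memory interface come from the description above; the LPDDR4 transfer rate is our assumption, since Intel’s demo data rate is not stated here, and the helper names are made up for this sketch.

```python
def bus_width_bits(channels: int, bits_per_channel: int) -> int:
    """Total memory interface width across all channels."""
    return channels * bits_per_channel


def peak_bandwidth_gb_s(width_bits: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (decimal), ignoring all overheads."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


# Cache sizes in MB, as listed in the block diagram described above.
CACHES_MB = {
    "Sunny Cove private medium-level cache": 0.5,
    "Atom shared L2": 1.5,
    "Uncore last-level cache": 4.0,
}

# ASSUMPTION: 3200 MT/s is a plausible LPDDR4 speed grade, not a figure
# given for this device.
ASSUMED_MT_S = 3200

if __name__ == "__main__":
    width = bus_width_bits(channels=4, bits_per_channel=16)
    print(f"memory interface: {width} bits wide")
    print(f"total on-die cache: {sum(CACHES_MB.values()):.1f} MB")
    print(f"peak bandwidth at an assumed {ASSUMED_MT_S} MT/s: "
          f"{peak_bandwidth_gb_s(width, ASSUMED_MT_S):.1f} GB/s")
```

The bandwidth figure moves in direct proportion to whatever transfer rate is plugged in, so treat it as a ceiling under one assumption rather than a specification.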

Intel says that this is a real product, originally built at the request of a customer who asked for a device (for mobile use we assume?) that combined this kind of performance with 2 mW standby power. The SoP will apparently also be sold to the general market.

Packaging technology such as FOVEROS casts Moore’s Law in a whole new light. It opens up a wide gamut of applications where we can get much better performance, cost, and power consumption using approaches other than just “shrink it again, Sam.” With the daunting challenges posed by reaching the single-digit nanometer nodes, there are many system functions that simply do not make economic sense to fabricate on these exotic technologies. Further, at any process scale, there are variables that can optimize for power, density, performance, etc. Memory, logic, analog, RF, and IO (just to name some examples) all would be happier with different process recipes. Even within the logic parts of the design, components of the system that have large duty cycles can be optimized for power consumption, while short-duty performance-critical sections can have the dial turned toward maximizing frequency.

Of course, Intel just happened to come out with the FOVEROS announcement when they are woefully late on their 10nm schedules and clearly on a mission to explain that process node isn’t as important as we in the press make it out to be. You can make of that what you want. For our part, we think that advanced packaging technologies like FOVEROS have spectacular potential, and they are probably currently limited more by the lack of a robust ecosystem for designing with them (ready availability of a large selection of chiplets that are compatible within the same packaging ecosystem, and design tools that are centered around the idea of package-level system integration).

Working out that ecosystem will be an industry-wide task, and it will undoubtedly involve proprietary infighting and roadblocks, with the various big players trying to be the one that “owns” the final integration process. But it will be interesting to watch, and to see if the popular focus shifts slowly away from Moore’s Law related metrics (how many nm are you shipping today?) to more relevant measures of merit – cost, performance, power, reliability, and form factor for an entire system.