feature article
Subscribe Now

From Chiplet to Chiplet

Open Interconnect for an Open Market

A couple of months ago we talked about the new chip disintegration approach: chiplets. And even longer ago, we talked about zGlue’s fabric for interconnecting chiplets. But there are actually several ways in which chiplets are viewed as being co-assembled, which leaves the question, how are we going to connect these things together?

After all, if this whole don’t-integrate thing is going to work in the long term, then a bunch of interconnected chiplets must perform pretty much like the same functions would if they were all on the same chip. And on-chip interconnect is famously faster than historical chip-to-chip interconnect. So how do we bridge the gap between those two extremes – on-chip and legacy off-chip – in order to make chiplet assemblies competitive with chips?

Spoiler alert: this is still a work in progress.

That said, efforts are afoot to develop an open interconnect scheme so that an ecosystem for chiplets from different sources can thrive. Today, by contrast, chiplets tend to be interconnected by proprietary schemes, with each (large) company creating all of the pieces in-house. That makes it impossible to have an actual market, where integrators can select the best chiplets for their application, without artificial interconnect barriers.

Active in this effort is the Open Compute Project (OCP), which is trying to define an Open Domain-Specific Architecture (ODSA, which also seems to be the name of the working group) that will simplify – or at least standardize – the way we connect accelerators (which typically have domain-specific functionality) into systems. Their effort covers much more than interconnect, but that’s the focus for today.

The thing is, however, we’re talking interconnect speeds in the gigahertz range, so just running some metal on FR-4 from one chiplet to another ain’t gonna cut it. Length matters a lot here – so much so that some interconnect schemes are literally named after how long their “reach” is.

But first, let’s set the stage based on different ways of putting these chiplets together, zGlue’s specific fabric aside (and, to be clear, zGlue is an active participant in this project). Generically, there are three approaches:

  • MCM, which stands for multi-chip module, and which is not a new thing. While there’s nothing in the generic concept of a module that says anything about how it’s built, for the purposes of the ODSA, this means an organic substrate – specifically, the package material.
  • Interposers, meaning something like silicon or glass that can handle far more signals that are placed much closer together than is possible on an organic material. It acts as a shim between the dice and the package.
  • Finally, there’s Intel’s Embedded Multi-Die Interconnect Bridge, which is intended to be cheaper than an entire interposer. It can provide the interconnect density of an interposer, but it’s just a smaller chunk embedded into the package right where the interconnect will be. No need for interposer material anywhere else, cutting costs.

(Image courtesy ODSA)

For all of these, distances are small by PCB standards, and yet, even here, reach is a consideration. Millimeters matter. But we’re not talking just about physical interconnect here: we’re talking full protocol stacks, the upper layers of which would reside in firmware, with the lower layers being hardware.

(Click to enlarge. Image courtesy ODSA)

You can see that the various layers call out different protocols – or call for a new protocol. Let’s walk through them to see what they are and where they came from. Some of them may be familiar. We’ll spend more time on the PHY alternatives after.

  • CCIX: this is a coherency standard for use when connecting a CPU and an accelerator. It’s an extension to PCIe for chip-to-chip use. Rather than DMAing data to the accelerator, the CPU passes pointers.
  • TileLink: This is a chip-scale interconnect protocol with multiple masters and slaves. It supports cache coherency as well. It comes from SiFive, but it’s not limited to RISC-V CPUs.
  • ISF: This is developed by Netronome, an active participant in the ODSA project. They develop packet-processing solutions, not the least of which is an accelerator that they call a Flow Processor. In this context, ISF covers transport, routing, and link layers.
  • PIPE: this is a part of PCIe, covering the interface between the MAC sublayer and the PHY layer.

For PHY interconnect, there are two fundamental options: serial and parallel. The first, and most obvious, option is the ubiquitous PCIe PHY. There are two other serial links in the image above, and their difference is their intended reach. 

XSR is extra-short reach, for links up to 50 mm. USR is – you guessed it – ultra-short reach, for signals going no farther than 10 mm. (Yeah, they’re running out of superlatives… not sure what the next one will be called…). They both involve SERDES circuits using NRZ, PAM-4, or even Chord signaling. They’re specified by the Optical Internetworking Forum (OIF) for chip-to-chip and chip-to-optical connections. In the last OIF Common Electrical Interface (CEI) spec that I saw, both were positioned at 56 Gbps, DC coupled, with bit-error rates of 10-15. Presumably, USR will eventually reach higher bandwidths than XSR, since signals traverse a shorter distance.

Just to be clear on the signal formats, NRZ is your typical signal format – high is HI, low is LO, lasting for the duration of the symbol. We talked about PAM-4 not so long ago. Chord signaling is yet something different; it claims greater code efficiency than NRZ and better signal-integrity characteristics than PAM-4. It encodes n bits of data into n+1 physical bits. Honestly… I’ve looked into it a little bit, and it’s not so obvious how it works. So we’re going to need to come back to that one in the future if it starts to reach prime time.

For parallel interconnect, there are two options, both of which ODSA says require less sophisticated circuits than serial. You could even make them with 90-nm technology. First is BoW, which stands for “Bunch of Wires.” Yup, seriously. It has three modes:

  • Basic, supporting up to 5 Gbps. It’s unterminated, with transmit and receive on separate lines.
  • Fast, supporting up to 12 Gbps. It’s terminated, with transmit and receive on separate lines.
  • Turbo, supporting up to 12 Gbps for both transmit and receive on the same wire – a total of 24 Gbps. So it’s full duplex. The receive circuits must subtract out the transmitted signal in order to recover the received signal (just like the telephones of yore).

AIB, meanwhile, is a recent addition. Originating at Intel, with support from DARPA, it can drive single-data-rate (SDR) signals to 1 Gbps and double-data-rate (DDR) signals to 2 Gbps. Yes, it’s not as fast as BoW at present, but, being newer, there may be more headroom.

The big difference between BoW and AIB is the intended substrate. BoW was conceived for organic substrates, and all testing and characterization results reflect that. AIB was intended for interposers or EMIB and can therefore handle many more signals spaced more closely together. The ODSA folks say that BoW might operate just fine in an interposer system as well, but there’s no data yet to support that.

Finally, there may be another entrant from AMD. Their Infinity architecture has an Infinity Fabric for On-Package (IFOP) option. Their materials say it’s for chip-to-chip interconnect, which I presume makes it also suitable for chiplet-to-chiplet interconnect. What I don’t know is if they intend to promote it as an alternative to the ODSA options. I attempted to contact them, but, as of “print” time, I have received no response. I’ll update if I get one.

Whichever options win out, having open standards will be the only way that a chiplet market can take flight.

 

More info:

ODSA

AMD Infinity Architecture

Sourcing credit (all ODSA participants):

Bapi Vinnakota, Director of Silicon Architecture Program Management at Netronome

Bill Carter, CTO at OCP (Open Compute Project)

Quinn Jacobson, Strategic Architect at Achronix

2 thoughts on “From Chiplet to Chiplet”

Leave a Reply

featured blogs
Sep 20, 2021
As it seems to be becoming a (bad) habit, This Week in CFD is presented here as Last Week in CFD. But that doesn't make the news any less relevant. Great article on wind tunnels because they go... [[ Click on the title to access the full blog on the Cadence Community si...
Sep 18, 2021
Projects with a steampunk look-and-feel incorporate retro-futuristic technology and aesthetics inspired by 19th-century industrial steam-powered machinery....
Sep 15, 2021
Learn how chiplets form the basis of multi-die HPC processor architectures, fueling modern HPC applications and scaling performance & power beyond Moore's Law. The post What's Driving the Demand for Chiplets? appeared first on From Silicon To Software....
Aug 5, 2021
Megh Computing's Video Analytics Solution (VAS) portfolio implements a flexible and scalable video analytics pipeline consisting of the following elements: Video Ingestion Video Transformation Object Detection and Inference Video Analytics Visualization   Because Megh's ...

featured video

Accurate Full-System Thermal 3D Analysis

Sponsored by Cadence Design Systems

Designing electronics for the data center challenges designers to minimize and dissipate heat. Electrothermal co-simulation requires system components to be accurately modeled and analyzed. Learn about a true 3D solution that offers full system scalability with 3D analysis accuracy for the entire chip, package, board, and enclosure.

Click here for more information about Celsius Thermal Solver

featured paper

An Engineer's Guide to Designing with Precision Amplifiers

Sponsored by Texas Instruments

This e-book contains years of circuit design recommendations and insights from Texas Instruments industry experts and covers many common topics and questions you may encounter while designing with precision amplifiers.

Click to read more

featured chalk talk

Why Measure C02 Indoor Air Quality?

Sponsored by Mouser Electronics and Sensirion

The amount of carbon dioxide in the air can be a key indicator in indoor air quality and improving our indoor air quality can have a slew of benefits including increased energy efficiency, increased cognitive performance, and also the reduction of the risk for viral infection. In this episode of Chalk Talk, Amelia Dalton chats with Bernd Zimmermann from Sensirion about the reasons for measuring carbon dioxide in our indoor spaces, what this kind of measurement looks like, and why Sensirion’s new SCD4 carbon dioxide sensors are breaking new ground in this arena. (edited)

Click here for more information about Sensirion SCD4x Miniaturized CO2 Sensors