
From Chiplet to Chiplet

Open Interconnect for an Open Market

A couple of months ago we talked about the new chip disintegration approach: chiplets. And even longer ago, we talked about zGlue’s fabric for interconnecting chiplets. But there are actually several ways to co-assemble chiplets, which raises the question: how are we going to connect these things together?

After all, if this whole don’t-integrate thing is going to work in the long term, then a bunch of interconnected chiplets must perform pretty much like the same functions would if they were all on the same chip. And on-chip interconnect is famously faster than historical chip-to-chip interconnect. So how do we bridge the gap between those two extremes – on-chip and legacy off-chip – in order to make chiplet assemblies competitive with chips?

Spoiler alert: this is still a work in progress.

That said, efforts are afoot to develop an open interconnect scheme so that an ecosystem for chiplets from different sources can thrive. Today, by contrast, chiplets tend to be interconnected by proprietary schemes, with each (large) company creating all of the pieces in-house. That makes it impossible to have an actual market, where integrators can select the best chiplets for their application, without artificial interconnect barriers.

Active in this effort is the Open Compute Project (OCP), which is trying to define an Open Domain-Specific Architecture (ODSA, which also seems to be the name of the working group) that will simplify – or at least standardize – the way we connect accelerators (which typically have domain-specific functionality) into systems. Their effort covers much more than interconnect, but that’s the focus for today.

The thing is, however, we’re talking interconnect speeds in the gigahertz range, so just running some metal on FR-4 from one chiplet to another ain’t gonna cut it. Length matters a lot here – so much so that some interconnect schemes are literally named after how long their “reach” is.

But first, let’s set the stage based on different ways of putting these chiplets together, zGlue’s specific fabric aside (and, to be clear, zGlue is an active participant in this project). Generically, there are three approaches:

  • MCM, which stands for multi-chip module, and which is not a new thing. While there’s nothing in the generic concept of a module that says anything about how it’s built, for the purposes of the ODSA, this means an organic substrate – specifically, the package material.
  • Interposers, meaning something like silicon or glass that can handle far more signals, placed much closer together, than is possible on an organic material. The interposer acts as a shim between the dice and the package.
  • Finally, there’s Intel’s Embedded Multi-Die Interconnect Bridge (EMIB), which is intended to be cheaper than an entire interposer. It can provide the interconnect density of an interposer, but it’s just a smaller chunk of silicon embedded into the package right where the interconnect will be. No need for interposer material anywhere else, cutting costs.

(Image courtesy ODSA)

For all of these, distances are small by PCB standards, and yet, even here, reach is a consideration. Millimeters matter. But we’re not talking just about physical interconnect here: we’re talking full protocol stacks, the upper layers of which would reside in firmware, with the lower layers being hardware.

(Click to enlarge. Image courtesy ODSA)

You can see that the various layers call out different protocols – or call for a new one. Let’s walk through them to see what they are and where they came from. Some of them may be familiar. We’ll spend more time on the PHY alternatives afterward.

  • CCIX: this is a coherency standard for use when connecting a CPU and an accelerator. It’s an extension to PCIe for chip-to-chip use. Rather than DMAing data to the accelerator, the CPU passes pointers.
  • TileLink: This is a chip-scale interconnect protocol with multiple masters and slaves. It supports cache coherency as well. It comes from SiFive, but it’s not limited to RISC-V CPUs.
  • ISF: This was developed by Netronome, an active participant in the ODSA project. They develop packet-processing solutions, not the least of which is an accelerator that they call a Flow Processor. In this context, ISF covers transport, routing, and link layers.
  • PIPE: this is a part of PCIe, covering the interface between the MAC sublayer and the PHY layer.
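The CCIX point about passing pointers instead of DMAing data is worth a quick illustration. The following toy sketch is mine, not any actual CCIX API: without coherency, the CPU must copy the whole buffer to the accelerator’s local memory; with a coherent link, it can hand over a zero-copy reference to shared memory.

```python
# Toy illustration of what CCIX-style coherency buys you
# (hypothetical code; no real CCIX API is shown here).

def dma_style(accel, data: bytes) -> bytes:
    # Without coherency: copy the whole buffer to the accelerator's
    # local memory, compute, and copy the result back.
    local_copy = bytes(data)          # simulated DMA transfer
    return accel(local_copy)

def ccix_style(accel, data: bytearray) -> bytes:
    # With a coherent link: pass a pointer (here, a zero-copy
    # memoryview) and let the accelerator read shared memory directly.
    return accel(memoryview(data))

# A stand-in "accelerator" that doubles each byte.
double = lambda buf: bytes((2 * b) & 0xFF for b in buf)

payload = bytearray(range(8))
assert dma_style(double, bytes(payload)) == ccix_style(double, payload)
```

The result is the same either way; the difference is the copy that never happens in the coherent case.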

For PHY interconnect, there are two fundamental options: serial and parallel. The first, and most obvious, option is the ubiquitous PCIe PHY. There are two other serial links in the image above, and their difference is their intended reach. 

XSR is extra-short reach, for links up to 50 mm. USR is – you guessed it – ultra-short reach, for signals going no farther than 10 mm. (Yeah, they’re running out of superlatives… not sure what the next one will be called…). They both involve SERDES circuits using NRZ, PAM-4, or even Chord signaling. They’re specified by the Optical Internetworking Forum (OIF) for chip-to-chip and chip-to-optical connections. In the last OIF Common Electrical I/O (CEI) spec that I saw, both were positioned at 56 Gbps, DC coupled, with bit-error rates of 10⁻¹⁵. Presumably, USR will eventually reach higher bandwidths than XSR, since signals traverse a shorter distance.
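To put that error rate in perspective, here’s a quick back-of-the-envelope calculation (my arithmetic, not a number from the spec): at 56 Gbps and a bit-error rate of 10⁻¹⁵, a single lane would see roughly one bit error every five hours.

```python
# Mean time between bit errors for one 56-Gbps lane at BER 1e-15.
bit_rate = 56e9          # bits per second
ber = 1e-15              # expected errors per bit

errors_per_second = bit_rate * ber          # 5.6e-5 errors/s
seconds_per_error = 1 / errors_per_second   # ~17,857 s
hours_per_error = seconds_per_error / 3600  # ~5 hours

print(f"~{hours_per_error:.1f} hours between bit errors per lane")
```

Multiply that across many lanes running continuously and you can see why forward error correction still comes up in these discussions.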

Just to be clear on the signal formats, NRZ is your typical signal format – high is HI, low is LO, lasting for the duration of the symbol. We talked about PAM-4 not so long ago. Chord signaling is different again; it claims greater code efficiency than NRZ and better signal-integrity characteristics than PAM-4. It encodes n bits of data onto n+1 wires. Honestly… I’ve looked into it a little bit, and it’s not so obvious how it works. So we’re going to need to come back to that one in the future if it starts to reach prime time.
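The NRZ/PAM-4 distinction is easy to show concretely. The sketch below is my own illustration (the Gray-coded level mapping is a common convention, not something mandated here): NRZ carries one bit per symbol using two levels, while PAM-4 carries two bits per symbol using four levels, moving the same payload in half the symbols.

```python
# Toy encoders showing bits-per-symbol for NRZ vs. PAM-4.
# PAM-4 levels are Gray-coded so adjacent levels differ by one bit,
# limiting the damage from a single-level slicing error.

NRZ_LEVELS = {0: -1, 1: +1}
PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): +1, (1, 0): +3}

def encode_nrz(bits):
    # One symbol per bit.
    return [NRZ_LEVELS[b] for b in bits]

def encode_pam4(bits):
    # One symbol per bit pair.
    assert len(bits) % 2 == 0
    return [PAM4_LEVELS[(bits[i], bits[i + 1])]
            for i in range(0, len(bits), 2)]

bits = [1, 0, 1, 1, 0, 0, 0, 1]
print(encode_nrz(bits))   # 8 symbols
print(encode_pam4(bits))  # 4 symbols for the same payload
```

Same data, half the symbol rate – which is exactly the trade PAM-4 makes, at the cost of squeezing four levels into the same voltage swing.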

For parallel interconnect, there are two options, both of which ODSA says require less sophisticated circuits than serial. You could even make them with 90-nm technology. First is BoW, which stands for “Bunch of Wires.” Yup, seriously. It has three modes:

  • Basic, supporting up to 5 Gbps. It’s unterminated, with transmit and receive on separate lines.
  • Fast, supporting up to 12 Gbps. It’s terminated, with transmit and receive on separate lines.
  • Turbo, supporting up to 12 Gbps for both transmit and receive on the same wire – a total of 24 Gbps. So it’s full duplex. The receive circuits must subtract out the transmitted signal in order to recover the received signal (just like the telephones of yore).
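That Turbo-mode subtraction trick can be sketched numerically. This is my own idealized illustration – it assumes perfect linear superposition on the wire, with no noise or echo distortion:

```python
# BoW Turbo-mode idea: both ends drive the same wire simultaneously,
# so each receiver sees the sum of the two signals and subtracts its
# own transmission to recover the far end's symbols.

def shared_wire(tx_a, tx_b):
    # The wire carries the superposition of both transmitted signals.
    return [a + b for a, b in zip(tx_a, tx_b)]

def recover(wire_signal, own_tx):
    # Each end subtracts what it drove to recover the other's signal.
    return [w - t for w, t in zip(wire_signal, own_tx)]

tx_a = [+1, -1, -1, +1]   # end A's transmit symbols
tx_b = [-1, -1, +1, +1]   # end B's transmit symbols

wire = shared_wire(tx_a, tx_b)
assert recover(wire, tx_a) == tx_b   # A hears B
assert recover(wire, tx_b) == tx_a   # B hears A
```

Real hybrid circuits have to do this in analog, imperfectly – which is part of why Turbo mode requires more sophisticated receive circuitry than the Basic and Fast modes.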

AIB, meanwhile, is a recent addition. Originating at Intel, with support from DARPA, it can drive single-data-rate (SDR) signals to 1 Gbps and double-data-rate (DDR) signals to 2 Gbps. Yes, it’s not as fast as BoW at present, but, being newer, it may have more headroom.

The big difference between BoW and AIB is the intended substrate. BoW was conceived for organic substrates, and all testing and characterization results reflect that. AIB was intended for interposers or EMIB and can therefore handle many more signals spaced more closely together. The ODSA folks say that BoW might operate just fine in an interposer system as well, but there’s no data yet to support that.

Finally, there may be another entrant from AMD. Their Infinity architecture has an Infinity Fabric for On-Package (IFOP) option. Their materials say it’s for chip-to-chip interconnect, which I presume makes it also suitable for chiplet-to-chiplet interconnect. What I don’t know is if they intend to promote it as an alternative to the ODSA options. I attempted to contact them, but, as of “print” time, I have received no response. I’ll update if I get one.

Whichever options win out, having open standards will be the only way that a chiplet market can take flight.

 

More info:

ODSA

AMD Infinity Architecture

Sourcing credit (all ODSA participants):

Bapi Vinnakota, Director of Silicon Architecture Program Management at Netronome

Bill Carter, CTO at OCP (Open Compute Project)

Quinn Jacobson, Strategic Architect at Achronix
