
High Bandwidth, Modest Capacity

Micron Launches GDDR6; Cadence Talks AI/Crypto Memory

Aaaaand… it’s memory time again. I don’t keep up with every memory release (who could, without dedicating their life to nothing else?), but here and there a new-memory story comes with a technology or application angle. So, in that vein, we address memory in automotive and AI. Yes, two critical keywords in any tech article these days.

Automotive Moves to Graphics

I chatted with Micron Technology about their latest GDDR6 release. And the name of the game here is bandwidth, starting at 14 Gbps per pin and moving to 16 Gbps (with 20 Gbps working in the labs now). Memory capacity runs from 8 to 32 Gb.

This memory has two independent channels. In fact, you could pretty much think of them as two separate co-packaged dice. Each channel has its own memory, read/write access, and refresh, and the two can run asynchronously with respect to each other. If you want to use it as a single memory, you can gang the signals and buses together externally.

One of the things that’s changing with high-performance memory is the size of the payload: it’s shrinking. (How often do you hear about future workloads getting smaller?) So each channel has a 16-bit bus to be more efficient. If you gang the two channels together, then you get the more standard 32-bit bus.
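
To put rough numbers on the bus-width story, here’s a quick back-of-the-envelope sketch. The per-pin rates and bus widths are the ones quoted above; the arithmetic is mine, not a Micron figure.

```python
# Rough GDDR6 bandwidth arithmetic using the per-pin rates and bus widths
# quoted above; back-of-the-envelope only.

def bandwidth_gb_per_s(gbps_per_pin: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s for a given per-pin data rate and bus width."""
    return gbps_per_pin * bus_width_bits / 8  # 8 bits per byte

for rate in (14, 16, 20):  # Gbps per pin: today, next step, in the lab
    per_channel = bandwidth_gb_per_s(rate, 16)  # one 16-bit channel
    ganged = bandwidth_gb_per_s(rate, 32)       # both channels ganged
    print(f"{rate} Gbps/pin: {per_channel:.0f} GB/s per channel, "
          f"{ganged:.0f} GB/s with both channels ganged")
# At 16 Gbps/pin: 32 GB/s per channel, 64 GB/s with both channels ganged.
```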

So… OK… memory for graphics… um… where does the promised automotive thing enter the picture? Did we toss the word in here just to come up in more searches? Nope. Turns out that automotive designs – traditional users of LPDDR or plain-ol’ DDR – need more bandwidth for ADAS applications and, in particular, high-definition displays for L4/L5 levels of autonomy (in other words, high levels). Micron has worked with several other companies to help put together the circuits needed to create an entire system: Rambus for the PHY, Northwest Logic for the controller, and Avery Design Systems for verification IP.

But, of course, adapting to the functional needs of cars can be a deal with the devil when it comes to operational requirements – including reliability. Like, 7 to 10 years of reliability. According to Micron, this is tough to achieve with other memories, making their GDDR6 a better fit.

Memory for AI and Crypto

Next we look at two other application areas that share one characteristic with automotive. The applications are AI and cryptography, and what they share is the smaller transaction. But they still need super-fast access.

This topic came up at DAC in a conversation with Cadence’s Marc Greenberg. We didn’t really focus on new products specifically, but rather on developments in system design and how those developments are translating into possible future memory solutions (whenever any resulting products materialize).

With AI, you’re storing the weights for a neural-net engine; with cryptography, you’re storing hashes. According to Cadence, designers are looking for novel memory structures to give them higher bandwidth without necessarily delivering higher capacity. HBM2 and GDDR6 are examples of such newer memories that are up for consideration.

The reason for seeking out something new lies in a capacity gap among the standard memory options available today. Given that these are working-memory tasks, the options are SRAM and DRAM. AI memories tend to need on the order of 10 GB of capacity (plus or minus), which isn’t nothing, but it’s less than DRAM tends to deliver. That said, it’s way more than cache – which has space for only a few MB of data – can handle. So there’s this Goldilocks capacity region that these designers are jonesing for.
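
To give that Goldilocks region a rough shape, here’s an illustrative sketch. The 8–32 Gb device range comes from the Micron numbers above; the ~10 GB target and the cache size are just the ballpark figures from the paragraph, nothing more precise.

```python
# Rough sketch of the capacity gap: caches hold a few MB, these workloads
# want roughly 10 GB, and a handful of GDDR6 devices covers that.
# Device capacity is taken from the 8-32 Gb range above; the rest is
# illustrative.

target_gb = 10      # ballpark working-set size for an AI accelerator
cache_mb = 32       # "a few MB" of on-chip cache, illustrative
device_gbit = 16    # one GDDR6 die in the middle of the 8-32 Gb range

device_gbyte = device_gbit / 8
devices_needed = -(-target_gb // device_gbyte)  # ceiling division

print(f"On-chip cache: {cache_mb} MB "
      f"(roughly {target_gb * 1024 / cache_mb:.0f}x smaller than the target)")
print(f"Off-chip: {devices_needed:.0f} x {device_gbit} Gb GDDR6 devices = "
      f"{devices_needed * device_gbyte:.0f} GB")
```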

You might anticipate that SRAM-based cache would draw more power than DRAM. After all, an SRAM bit cell always burns power; as bit cells go, SRAM cells are considered pretty power-hungry. Of course, you get speed in the bargain, but it would be understandable if you thought (as I would have) that SRAM is the higher-power solution.

Not so, according to Cadence. Yes, the SRAM bit cell does draw more power, but it turns out that the bit cell isn’t what dominates power usage: data movement is. And with cache, you’re moving data a few millimeters across a die. With DRAM, you’re going out pins, through wires, and into other pins, and the power cost of doing so makes DRAM the overall higher-power solution.
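
A toy calculation shows why movement wins. The energy-per-bit numbers below are purely illustrative assumptions on my part (not Micron or Cadence figures); the point is the gap between on-die and off-chip moves, not the exact values.

```python
# Toy model: total transfer energy = bits moved x energy per bit moved.
# The per-bit energies are illustrative assumptions, not vendor data;
# what matters is the order-of-magnitude gap, not the exact numbers.

ON_DIE_PJ_PER_BIT = 1.0     # moving data a few millimeters across a die
OFF_CHIP_PJ_PER_BIT = 15.0  # out the pins, across the board, into DRAM

def transfer_energy_mj(gigabytes: float, pj_per_bit: float) -> float:
    """Energy in millijoules to move the given amount of data once."""
    bits = gigabytes * 8e9
    return bits * pj_per_bit * 1e-12 * 1e3  # pJ -> J -> mJ

moved_gb = 1.0  # move 1 GB of weights or hashes once
print(f"On-die move of {moved_gb} GB:   "
      f"{transfer_energy_mj(moved_gb, ON_DIE_PJ_PER_BIT):.0f} mJ")
print(f"Off-chip move of {moved_gb} GB: "
      f"{transfer_energy_mj(moved_gb, OFF_CHIP_PJ_PER_BIT):.0f} mJ")
```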

Is that solved with HBM2 and GDDR6? Not clear. GDDR6 power is lower than GDDR5 due to a lower VDD. HBM2 power is lower than HBM for the same reason. And, as far as I can tell, HBM2 runs with less power than GDDR6. But are they meeting the power needs of these non-graphics, smaller-payload applications?
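
For the VDD piece of that, dynamic power scales roughly as C·V²·f, so even a modest supply drop helps. Here’s a quick first-order sketch, using the commonly quoted 1.5 V (GDDR5) and 1.35 V (GDDR6) supplies and ignoring everything else that differs between the two; treat it as a rough estimate, not a measured comparison.

```python
# First-order estimate of the VDD effect: dynamic power ~ C * V^2 * f.
# 1.5 V and 1.35 V are the commonly quoted supplies for GDDR5 and GDDR6;
# this ignores I/O signaling, frequency, and everything else that matters.

V_GDDR5 = 1.50
V_GDDR6 = 1.35

scaling = (V_GDDR6 / V_GDDR5) ** 2
print(f"Dynamic power scales by ~{scaling:.2f}x "
      f"(~{(1 - scaling) * 100:.0f}% reduction from the supply drop alone)")
# ~0.81x, i.e. roughly a 19% reduction before any other differences.
```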

I checked back in with Cadence, and Mr. Greenberg clarified that power isn’t the driving feature here: bandwidth is. The catch is that, as noted, capacity needs are modest. These applications require more memory than can economically be included on-chip, so an off-chip solution is required. HBM2 and GDDR6 fit this space; their relatively lower power compared to alternatives or past generations certainly helps reduce the overall power of the solution, but it’s not the main story.

Sooo… what’ll it be? HBM2 or GDDR6? Or both? Poking around, HBM2 may have the power advantage, but it appears to have a significant cost disadvantage. Where bandwidth matters, as in gaming (where you see most of the HBM2 discussions), HBM2 can win. Its market has certainly been slower to evolve than some expected, but new offerings suggest that it’s still moving forward.

The DDR franchise, with its LP and G variants, offers more familiar names, so you might expect it to have an easier go of it. And high pricing is never a great thing in the automotive market. But what about AI or crypto? Well, it depends on where the system is. In the cloud? In a local server? Or in a gadget?

Acceptable price, performance, footprint, and power points will depend strongly on where the memory finds itself. AI, in particular, is new enough that it has a lot of settling out to do before we know whether it pervades absolutely everything or remains focused in more limited platforms. So we still have plenty of time before we know exactly what’s going to be required where.


More info:

Micron GDDR6

2 thoughts on “High Bandwidth, Modest Capacity”

  1. Doomed as an approach; it’s a long-term side effect of splitting the silicon processes for CPUs and memory (DRAM).

    The reason they need more bandwidth is that communication is usually a dimension down from storage, i.e. storage is over the area of the chip (2D) but communication is usually just the edge (1D), and if you die-stack you’ll have a volume (3D) vs at best the (bottom) surface (2D) for communication. Every process shrink makes the problem worse.

    Also known as the commuting vs computing problem – spending more energy on moving data than actually computing.

    Processor-in-memory works a lot better, but most CAD flows don’t support asynchronous design and RTL CPUs are generally too hot to stack, so my money is on these guys –

    http://etacompute.com/products/low-power-ip/

