
High Bandwidth, Modest Capacity

Micron Launches GDDR6; Cadence Talks AI/Crypto Memory

Aaaaand… it’s memory time again. I don’t keep up with every release of memory (who could keep up with that without dedicating their lives to nothing but that?), but here and there we have either technology or application angles to new-memory stories. So, in that vein, we address memory in automotive and AI. Yes, two critical keywords in any tech article these days.

Automotive Moves to Graphics

I chatted with Micron Technology about their latest GDDR6 release. And the name of the game here is bandwidth, starting at 14 Gbps per pin and moving to 16 (with 20 working in the labs now). Memory capacity runs from 8 to 32 Gb.

There are two independent channels to this memory. In fact, you could pretty much think of them as two separate co-packaged dice. Each channel has its own memory, read/write access, and refresh. The two channels can run asynchronously with respect to each other. If you want to use the device as a single memory, you can gang the signals and buses together externally.
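As a conceptual model only (this is not the actual GDDR6 protocol or command set, just a sketch of the idea), the two-channel arrangement looks something like this: each channel owns its own storage and can be accessed on its own, while "ganging" stripes a 32-bit word across the two 16-bit buses.

```python
# Conceptual sketch of a dual-channel memory: two independent address
# spaces, each with a 16-bit data path, plus an externally "ganged"
# mode that splits a 32-bit word across both channels.
class DualChannelMemory:
    def __init__(self):
        self.ch = [{}, {}]              # two independent address spaces

    def write(self, channel, addr, word16):
        self.ch[channel][addr] = word16 & 0xFFFF

    def read(self, channel, addr):
        return self.ch[channel][addr]

    def write_ganged(self, addr, word32):
        # Low half goes to channel 0, high half to channel 1.
        self.write(0, addr, word32 & 0xFFFF)
        self.write(1, addr, word32 >> 16)

    def read_ganged(self, addr):
        return self.read(0, addr) | (self.read(1, addr) << 16)

mem = DualChannelMemory()
mem.write_ganged(0x10, 0xDEADBEEF)
print(hex(mem.read(0, 0x10)), hex(mem.read(1, 0x10)))  # 0xbeef 0xdead
print(hex(mem.read_ganged(0x10)))                      # 0xdeadbeef
```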

One of the things that’s changing with high-performance memory is the size of the payload: it’s shrinking. (How often do you hear about future workloads getting smaller?) So each channel has a 16-bit bus to be more efficient. If you gang the two channels together, then you get the more standard 32-bit bus.
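The per-pin rates and bus widths above translate into peak bandwidth with simple arithmetic (peak figures only; sustained bandwidth depends on access patterns):

```python
# Peak bandwidth per GDDR6 device, from the figures in the article.
GBPS_PER_PIN = 16          # per-pin data rate in Gbps (14 at launch, 16 next)
PINS_PER_CHANNEL = 16      # each independent channel has a 16-bit bus
CHANNELS = 2

per_channel_gbps = GBPS_PER_PIN * PINS_PER_CHANNEL   # 256 Gbps per channel
device_gbps = per_channel_gbps * CHANNELS            # 512 Gbps ganged
device_gbytes_per_s = device_gbps / 8                # 64 GB/s per device

print(per_channel_gbps, device_gbps, device_gbytes_per_s)  # 256 512 64.0
```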

So… OK… memory for graphics… um… where does the promised automotive thing enter the picture? Did we toss the word in here just to come up in more searches? Nope. Turns out that automotive designs – traditional users of LPDDR or plain-ol’ DDR – need more bandwidth for ADAS applications and, in particular, for the high-definition displays that come with L4/L5 levels of autonomy (in other words, high levels). Micron has worked with several other companies to help put together the circuits needed to create an entire system: Rambus for the PHY, Northwest Logic for the controller, and Avery Design Systems for verification IP.

But, of course, adapting to the functional needs of cars can be a deal with the devil when it comes to operational requirements – including reliability. Like, 7 to 10 years of reliability. According to Micron, this is tough to achieve with other memories, making their GDDR6 a better fit.

Memory for AI and Crypto

Next we look at two other application areas that share one characteristic with automotive. The applications are AI and cryptography, and what they share is the smaller transaction. But they still need super-fast access.

Cadence raised this topic at DAC, where I spoke with Marc Greenberg. We didn’t really focus on new products specifically, but rather on developments in system design and how those are translating into possible future memory solutions (whenever any resulting products materialize).

With AI, you’re storing the weights for a neural-net engine; with cryptography, you’re storing hashes. According to Cadence, designers are looking for novel memory structures to give them higher bandwidth without necessarily delivering higher capacity. HBM2 and GDDR6 are examples of such newer memories that are up for consideration.

The reason for seeking out something new lies in a gap in capacity with the standard memory options available today. Given that these are working-memory tasks, the options are SRAM and DRAM. AI memories tend to need on the order of 10 GB of capacity (plus or minus), which isn’t nothing, but it’s less than DRAM tends to deliver. That said, it’s way more than cache – which has space for a few MB of data – can handle. So there’s this Goldilocks capacity region that these designers are jonesing for.
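To put that Goldilocks region in numbers – with an entirely hypothetical model, since the ~10-GB figure above is the only number from Cadence – the gap looks like this:

```python
# Back-of-the-envelope sizing of the capacity gap.  The network size is
# hypothetical; the point is where ~10 GB sits relative to the options.
params = 5_000_000_000            # a hypothetical 5-billion-weight network
bytes_per_weight = 2              # assuming 16-bit weights
weights_gb = params * bytes_per_weight / 1e9

cache_mb = 8                      # a few MB of on-chip SRAM cache
dram_gb = 64                      # a typical server DRAM complement

print(weights_gb)                      # 10.0 GB of weights
print(weights_gb * 1000 / cache_mb)    # ~1000x more than cache holds
print(weights_gb / dram_gb)            # but only a fraction of a DRAM pool
```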

One thing that you might anticipate would be that SRAM-based cache would draw more power than DRAM. After all, an SRAM bit cell always burns power; as bit cells go, SRAM cells are considered pretty power-hungry. Of course, you get speed in the bargain, but it would be understandable if you thought (as I would have) that SRAM is the higher-power solution.

Not so, according to Cadence. Yes, the SRAM bit cell does draw more power, but it turns out that that’s not what dominates power usage: data movement does. And with cache, you’re moving data a few millimeters across a die. With DRAM, you’re going out pins and through wires and into other pins, and the power cost of doing so makes DRAM the overall higher-power solution.
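The gap is large. Using order-of-magnitude energy figures in the spirit of Mark Horowitz’s oft-cited ISSCC 2014 numbers (assumed here for illustration, not vendor data), fetching a word from off-chip DRAM costs roughly a hundred times more energy than reading it from a small on-chip SRAM:

```python
# Illustrative per-access energy for a 32-bit word at a mid-2010s node.
# These are assumed round numbers for the comparison, not measurements.
E_SRAM_CACHE_PJ = 5.0     # read from a small on-chip SRAM (pJ)
E_DRAM_PJ = 640.0         # read that goes off-chip to DRAM (pJ)

ratio = E_DRAM_PJ / E_SRAM_CACHE_PJ
print(ratio)              # 128.0 -- roughly two orders of magnitude
```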

Is that solved with HBM2 and GDDR6? Not clear. GDDR6 power is lower than GDDR5 due to a lower VDD. HBM2 power is lower than HBM for the same reason. And, as far as I can tell, HBM2 runs with less power than GDDR6. But are they meeting the power needs of these non-graphics, smaller-payload applications?
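The VDD effect is easy to estimate, since dynamic power scales roughly with the square of the supply voltage (P ≈ C·V²·f). With GDDR5 at a nominal 1.5 V and GDDR6 at 1.35 V, the V² term alone accounts for a meaningful saving at equal activity:

```python
# First-order dynamic-power comparison from the supply voltages alone.
v_gddr5 = 1.5     # nominal VDD for GDDR5 (volts)
v_gddr6 = 1.35    # nominal VDD for GDDR6 (volts)

scaling = (v_gddr6 / v_gddr5) ** 2
print(round(scaling, 3))   # 0.81 -- about 19% less dynamic power
```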

I checked back in with Cadence, and Mr. Greenberg clarified that power isn’t the driving feature here: bandwidth is. The catch is that, as noted, capacity needs are modest. These applications require more memory than can economically be included on-chip, so an off-chip solution is required. HBM2 and GDDR6 fit this space; their relative lower power as compared to alternatives or past generations certainly helps to reduce the overall power of the solution, but it’s not the main story.

Sooo… what’ll it be? HBM2 or GDDR6? Or both? Poking around, HBM2 may have the power advantage, but it would appear to have a significant cost disadvantage. Where bandwidth matters – like, say, gaming (where you see most of the HBM2 discussions) – HBM2 can win. Its market has certainly been slower to evolve than some expected, but new offerings suggest that it’s still moving forward.

The DDR franchise, with its LP and G variants, contains more familiar names, so you might expect them to experience easier going. And high pricing is never a great thing in the automotive market. But what about AI, or crypto? Well, it depends on where the system is. In the cloud? In a server locally? Or in a gadget?

Acceptable price, performance, footprint, and power points will depend strongly on where the memory finds itself. AI, in particular, is new enough that it has a lot of settling out to do before we know whether it pervades absolutely everything or remains focused in more limited platforms. So we still have plenty of time before we know exactly what’s going to be required where.

 

More info:

Micron GDDR6

2 thoughts on “High Bandwidth, Modest Capacity”

  1. Doomed as an approach; it’s a long-term side effect of splitting the silicon processes for CPUs and memory (DRAM).

    The reason they need more bandwidth is that communication is usually a dimension down from storage, i.e. storage is over the area of the chip (2D) but communication is usually just the edge (1D), and if you die-stack you’ll have a volume (3D) vs at best the (bottom) surface (2D) for communication. Every process shrink makes the problem worse.

    Also known as the commuting vs computing problem – spending more energy on moving data than actually computing.

    Processor-in-memory works a lot better, but most CAD flows don’t support asynchronous design, and RTL CPUs are generally too hot to stack, so my money is on these guys –

    http://etacompute.com/products/low-power-ip/

