
A Bid to Simplify Flash Subsystem Design

Flash memory, once exotic and expensive, has followed in DRAM’s footsteps to become a familiar everyday technology. Perhaps even more so than DRAM: when was the last time you went to a drug store and picked up a DRAM card while you were there?

As with DRAM, this has been driven by price decreases: the price per megabyte of Flash is falling by roughly half every year, and volume has responded with a compound annual growth rate of about 170%, according to Denali Software, Inc. Couple this with the fact that Flash is non-volatile, and, well, it’s no surprise that it’s found not only in camera storage cards and cell phones, but is now becoming a part of the memory subsystem in computing platforms.
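To make the compounding concrete, here is a minimal sketch in Python. It assumes the two rates quoted above hold steady from year to year, and it assumes that “volume” means megabytes shipped, which the figures themselves don’t spell out. The function names are purely illustrative.

```python
# Rough compounding sketch; both rates are the approximate figures quoted above.
PRICE_FACTOR_PER_YEAR = 0.5  # price per megabyte roughly halves each year
VOLUME_CAGR = 1.70           # about 170% compound annual growth


def relative_price(years: int) -> float:
    """Price per megabyte after `years`, relative to today (today = 1.0)."""
    return PRICE_FACTOR_PER_YEAR ** years


def relative_volume(years: int) -> float:
    """Volume after `years`, relative to today (today = 1.0)."""
    return (1 + VOLUME_CAGR) ** years


if __name__ == "__main__":
    for years in range(1, 4):
        price = relative_price(years)
        volume = relative_volume(years)
        # Assumption: volume is measured in megabytes, so price * volume
        # approximates relative spending on Flash.
        print(f"year {years}: price x{price:.3f}, volume x{volume:.2f}, "
              f"spend x{price * volume:.2f}")
```

Under those assumptions, spending grows by a factor of about 0.5 × 2.7 = 1.35 per year even as the price per megabyte collapses.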

One technology advance that has made this reduction in price possible is the development of the Multi-Level Cell (MLC). Traditional Flash used a Single-Level Cell (SLC), which is the standard kind of memory bit we’re used to: one bit per cell, either on or off. With MLC technology, each cell can have more states than just on and off. Current technology allows four levels, meaning that two bits’ worth of data can be stored in a single cell. This doubles the capacity of an array of the same size.

Of course, it’s not that simple. With an SLC, you can be relatively sloppy when reading the cell data: near ground for a zero, and near the rail for a one, for example. With an MLC, you’re dividing that same voltage range into finer slices, meaning the sensing circuitry has to be much more discerning, and data retention becomes less forgiving. This has complicated both the internal read circuitry and the Flash controllers that have to interpret which bits go where. The next step, to four bits per cell, will make the challenge even greater: sixteen voltage levels per cell.
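The arithmetic behind those level counts is simple enough to sketch in a few lines of Python. The 3.0 V sensing range below is a made-up number for illustration, and splitting the range evenly between levels is a simplification; real devices don’t space their thresholds this neatly.

```python
def levels_per_cell(bits_per_cell: int) -> int:
    """Number of distinct states a cell must hold to store the given bits."""
    return 2 ** bits_per_cell


def window_per_level(voltage_range: float, bits_per_cell: int) -> float:
    """Voltage band available to each state, assuming an even split."""
    return voltage_range / levels_per_cell(bits_per_cell)


ASSUMED_RANGE_V = 3.0  # hypothetical sensing range, for illustration only

if __name__ == "__main__":
    for bits in (1, 2, 4):
        n = levels_per_cell(bits)
        band = window_per_level(ASSUMED_RANGE_V, bits)
        print(f"{bits} bit(s) per cell: {n} levels, about {band:.4f} V per level")
```

Each added bit halves the band available to every level, which is why the jump from two bits to four is so much harder on the sensing circuitry than the jump from one to two.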

Makers of specialty components have the luxury of calling their own shots when it comes to how their parts work, are pinned out, are timed, etc. The problem comes about when the components aren’t so specialized anymore, when they become commoditized, but with multiple vendors having different interfaces to their chips. In the case of Flash, the interfaces have been similar, but close doesn’t count, meaning extra work supporting multiple Flash vendors in a system.

The Open NAND Flash Interface (ONFi – love the lower-case “i”… At least they didn’t go all hacker on us and call it oNfI) was formed to unify the interface at the chip level. Now there’s a group working at the other side of things on a common API for software: the Non-Volatile Memory Host Controller Interface (NVMHCI), which is close to being approved in its first iteration.

All of this makes it easier to put Flash memory subsystems together and allows different Flash memories to be interchanged (from an interface standpoint). It is in this environment, then, that Denali is announcing a new FlashPoint™ platform that is intended to simplify the configuration of a Flash subsystem using PCI-Express as the data pipe.

Denali has historically concentrated on generating IP for use in System-on-Chip (SoC) designs, starting with DDR memory, and following on with NAND Flash and PCI-Express. So it’s somewhat natural that they would move up a level of abstraction and automate the pulling together of the PCI-Express, Flash, and control blocks. They see the use of this both for generating PCI-Express Flash cache chips and for integrating further into an SoC.

The package contains the hardware IP and a full software stack that talks to NVMHCI. Since that interface is new, they also provide an NVMHCI driver for customers that haven’t yet incorporated it. The ONFi interface is supported as well, allowing the subsystems to be built with existing Flash devices from a variety of manufacturers.

The contents of the FlashPoint™ platform have been disclosed only at a high level so far. It consists of a 32-bit RISC processor, RAM, ROM, and a number of control blocks, assembled in an architecture that they have tuned for performance. Within this environment, you can size the Flash memory and scale the number of Flash controllers according to the needs of the application. There is also an encryption engine that can be bypassed as well as a security block that supports passwords and partitioning. The hardware IP is provided in RTL. Not clear yet is whether this customization happens through a wizard-like tool or through tighter interaction with Denali on a project-by-project basis.

On the software/controller side, Denali’s existing Flash system has four layers, starting with hardware, then hardware abstraction, a Flash Translation Layer, and an OS/RTOS layer. The FlashPoint platform adds modules to the existing set. So at the hardware level, they have added the processor and RAM, a command pipeline, an auto-config block, and an NVMHCI block. At the firmware level, they have added power management, reliability monitors, and a memory map. And at the software level, they provide a protocol stack, command ordering, a task engine, and a system initialization block.

Customization on the software side can be achieved by choosing which modules to include in a system. Customizing the behavior of the modules themselves is apparently possible in theory, but isn’t really intended. First of all, there’s the issue of the memory footprint and trying to jam any new firmware into it. Then there’s the more practical matter of the software being available in binary only, complicating integration of anything new into the bundle. In reality, their intent is that the software modules remain intact, with any other functionality added by higher-level software that communicates with the Denali software via the NVMHCI API.

Denali promises bandwidth of 160 to 200 MB/s. While the platform can operate in any of the standard applications that use Flash, much attention is being focused on computer cache applications. Flash is moving into position as another piece of the memory architecture, to the point where Microsoft has provided new capabilities in Windows Vista® (the so-called ReadyBoost™ and ReadyDrive™ features) to support Flash as a cache to a hard drive or as an outright Solid State Drive (SSD), respectively. Denali expects the cost/capacity of Flash to make SSDs mainstream by mid-2009. They specifically point to having been able to build a 300 MB cache operating at 15K I/O operations per second (IOPS), and a smaller laptop cache at 10K IOPS.

Greater Flash usage can mean both higher performance and lower power, an attractive (and unusual) combination that suggests that Flash will find its way beyond the keychain and into more of the devices we use. If Flash subsystems get easier to design, hopefully that will happen sooner rather than later.
