
Renesas Electronics Develops New Processing-In-Memory Technology for Next-Generation AI Chips that Achieves AI Processing Performance of 8.8 TOPS/W

AI Accelerator Achieves Both High-Speed CNN Processing and Reduced Power Consumption
TOKYO, Japan, June 13, 2019 – Renesas Electronics Corporation (TSE: 6723), a premier supplier of advanced semiconductor solutions, today announced it has developed an AI accelerator that performs CNN (convolutional neural network) processing at high speed and low power. The accelerator is a step toward the next generation of Renesas embedded AI (e-AI), which will accelerate the increased intelligence of endpoint devices. A Renesas test chip featuring this accelerator has achieved a power efficiency of 8.8 TOPS/W (Note 1), which is in the industry’s highest class of power efficiency. The Renesas accelerator is based on the processing-in-memory (PIM) architecture, an increasingly popular approach for AI technology, in which multiply-and-accumulate operations are performed in the memory circuit as data is read out from that memory.
To create the new AI accelerator, Renesas developed the following three technologies. The first is a ternary-valued (-1, 0, 1) SRAM structure PIM technology that can perform large-scale CNN computations. The second is an SRAM circuit that uses comparators to read out memory data at low power. The third is a technology that prevents calculation errors caused by manufacturing process variations. Together, these technologies reduce both the memory access time in deep learning processing and the power required for the multiply-and-accumulate operations. As a result, the new accelerator achieves the industry’s highest class of power efficiency while maintaining an accuracy of more than 99 percent when evaluated in a handwritten character recognition test (MNIST).
Renesas presented these results on June 13 at the 2019 Symposia on VLSI Technology and Circuits, held June 9-14, 2019, in Kyoto, Japan. At the demonstration session, Renesas also demonstrated real-time image recognition using a prototype AI module in which this test chip, powered by a small battery, was connected with a microcontroller, a camera, other peripheral devices, and development tools.
 
Until now, the PIM architecture was unable to achieve an adequate accuracy level for large-scale CNN computations with single-bit calculations, since the binary (0, 1) SRAM structure was only able to handle data with values 0 or 1. Furthermore, manufacturing process variations reduced the reliability of these calculations, and workarounds were required. Renesas has now developed technologies that resolve these issues. The company will apply them, as a leading-edge technology that can implement revolutionary AI chips of the future, to the next generation of e-AI solutions for applications such as wearable equipment and robots that require both performance and power efficiency.
Key Features of the Newly Developed Technology for Next-Generation AI Chips:
 
Ternary (-1, 0, 1) SRAM structure PIM that can adjust its calculation bit number according to the accuracy required
The ternary (-1, 0, 1) SRAM structure PIM architecture adopts a combination of a ternary memory with a simple digital calculation block to hold increases in the amount of hardware and increases in calculation errors to a minimum. At the same time, it allows switching the number of bits between, for example, 1.5-bit (ternary) and 4-bit calculations according to the required accuracy. Since this can support different required accuracies and calculation scales on a per-user basis, users can optimize the balance between accuracy and power consumption.
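As a rough illustration of why ternary weights keep the calculation block simple, the following Python sketch models a multiply-and-accumulate operation in software. It is not the Renesas circuit: the function name `ternary_mac` and the sample values are assumptions made for this example. With weights limited to -1, 0, and 1, each "multiplication" reduces to adding, subtracting, or skipping the input.

```python
def ternary_mac(inputs, weights):
    """Multiply-and-accumulate with weights restricted to -1, 0, or 1.

    Hypothetical software model: each product collapses to an add,
    a subtract, or a skip, so no multiplier is needed.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    acc = 0
    for x, w in zip(inputs, weights):
        if w not in (-1, 0, 1):
            raise ValueError("weights must be ternary: -1, 0, or 1")
        if w == 1:
            acc += x      # weight +1: add the input
        elif w == -1:
            acc -= x      # weight -1: subtract the input
        # weight 0: the input contributes nothing
    return acc


result = ternary_mac([3, 5, 2, 7], [1, -1, 0, 1])  # 3 - 5 + 0 + 7
print(result)
```

A three-level weight carries log2(3), or about 1.58, bits of information, which is consistent with the "1.5-bit" label used above.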
High-precision/low-power memory data readout circuit that combines comparators and replica cells
When a PIM architecture is adopted, memory data is read out by detecting the value of the bit line current in the SRAM structure. Although A/D converters are effective for high-precision bit line current detection, this approach has the drawbacks of high power consumption and increased chip area. In this effort, Renesas combined a comparator (1-bit sense amplifier) with a replica cell in which the current can be controlled flexibly to develop a high-precision memory data readout circuit. Furthermore, this technology takes advantage of the fact that the number of nodes (neurons) activated by neural network operation is very small, about 1%, and achieves even lower power operation by stopping the readout circuits for nodes (neurons) that are not activated.
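The two ideas in this paragraph, a 1-bit comparison against a replica reference and skipping the readout for inactive nodes, can be sketched as a toy software model. Everything here is an assumption for illustration: the function names, the current values, and the premise that each node's activity flag is known before readout. It says nothing about how the actual circuit is built.

```python
def comparator(bitline_current, replica_current):
    """1-bit decision: is the bit line current above the replica reference?"""
    return bitline_current > replica_current


def read_active_nodes(bitline_currents, active, replica_current):
    """Read out only the nodes flagged as active.

    Returns the 1-bit outputs and the number of comparator evaluations,
    a stand-in for how many readout circuits had to be powered.
    """
    outputs = []
    evaluations = 0
    for current, is_active in zip(bitline_currents, active):
        if not is_active:
            outputs.append(0)   # readout circuit stays off for this node
            continue
        evaluations += 1
        outputs.append(1 if comparator(current, replica_current) else 0)
    return outputs, evaluations


currents = [0.2, 0.9, 0.4, 0.7]          # arbitrary units, made up
active = [True, True, False, False]      # only two nodes are active
outputs, evaluations = read_active_nodes(currents, active, replica_current=0.5)
print(outputs, evaluations)
```

In this model the number of comparator evaluations scales with the number of active nodes rather than the total, which is the source of the power saving described above.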
Variation avoidance technology that suppresses calculation errors due to manufacturing process variations
The PIM architecture faces the challenge of calculation errors caused by manufacturing process variations. These variations cause errors in the values of the bit line currents in the SRAM structure, and as a result, errors occur in the memory data readout. To resolve this issue, Renesas covered the inside of the chip with multiple SRAM calculation circuit blocks and used the blocks with minimal manufacturing process variations to perform the calculations. Since the activated nodes are only a small minority of all nodes, they can be allocated selectively to the SRAM calculation circuit blocks that have minimal manufacturing process variations. This allows calculation errors to be reduced to a level where they can be essentially ignored.
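The allocation idea can be illustrated with a minimal Python sketch, assuming each block has already been given a single variation score (for example, from a calibration measurement). The scores, block names, node names, and the function `allocate_active_nodes` are all hypothetical; the release does not describe how variation is measured or how the allocation is implemented.

```python
def allocate_active_nodes(active_nodes, block_variation):
    """Assign each active node to one of the lowest-variation blocks.

    block_variation maps a block name to a variation score (lower is better).
    Both the scores and the one-node-per-block rule are assumptions.
    """
    ranked_blocks = sorted(block_variation, key=block_variation.get)
    if len(active_nodes) > len(ranked_blocks):
        raise ValueError("more active nodes than available blocks")
    # Pair nodes with blocks in order of increasing variation.
    return dict(zip(active_nodes, ranked_blocks))


block_variation = {"b0": 0.08, "b1": 0.02, "b2": 0.05, "b3": 0.11}
allocation = allocate_active_nodes(["n7", "n42"], block_variation)
print(allocation)
```

Because only a small fraction of nodes is active at once, most of the higher-variation blocks are never used in this model.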
 
Since introducing the embedded AI (e-AI) concept in 2015, Renesas has moved forward with the development of several e-AI solutions. Renesas has defined “classes” based on the effectiveness of e-AI and applications that are implemented and has been developing e-AI solutions based on the following four classes:
  • Class 1: Judging the correctness or abnormality of signal waveform data.
  • Class 2 (100 GOPS/W class): Judging correctness or abnormality using real-time image processing.
  • Class 3 (1 TOPS/W class): Performing recognition in real time.
  • Class 4 (10 TOPS/W class): Enabling incremental learning at an endpoint.
 
Renesas introduced an e-AI development environment in 2017, and in 2018 announced the RZ/A2M microprocessor, which integrates Renesas’ exclusive DRP (dynamically reconfigurable processor) on chip. Renesas provides these technologies for applications up to class 2. To implement class 3 applications, Renesas has further improved the computational performance of this DRP technology.
 
Now, Renesas is unveiling the new, leading-edge technology developed through this effort. The new accelerator technology combines both low power consumption and improved computational performance and could be one of the key technologies to implement future class 4 applications. Renesas is committed to contributing to the realization of a smart society through increased intelligence that applies AI at both edges and endpoints in the IoT.
 
(Note 1) TOPS/W: Tera Operations per Second per Watt. This is a rating that indicates the number of computations that can be performed in one second with one watt of power. The performance of 8.8 TOPS/W corresponds to performing 8.8 × 10^12 calculations in one second with one watt of power.
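The arithmetic behind this rating can be worked through in a few lines of Python. One watt is one joule per second, so 8.8 TOPS/W also means 8.8 × 10^12 operations per joule. The 1 mW power budget below is an arbitrary example value, not a figure from Renesas, and the calculation assumes the efficiency stays constant at that power level.

```python
TOPS_PER_WATT = 8.8                        # figure quoted in this release

ops_per_joule = TOPS_PER_WATT * 1e12       # 1 W = 1 J/s, so ops/s/W = ops/J
energy_per_op_joules = 1.0 / ops_per_joule
energy_per_op_fj = energy_per_op_joules * 1e15   # femtojoules per operation


def ops_per_second(power_watts):
    """Throughput at a given power, assuming constant efficiency."""
    return ops_per_joule * power_watts


print(f"{energy_per_op_fj:.1f} fJ per operation")
print(f"{ops_per_second(0.001):.2e} operations per second at 1 mW")
```

At this efficiency each operation costs roughly 114 femtojoules, and a 1 mW budget would correspond to about 8.8 billion operations per second.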
 
 
About Renesas Electronics Corporation
Renesas Electronics Corporation (TSE: 6723) delivers trusted embedded design innovation with complete semiconductor solutions that enable billions of connected, intelligent devices to enhance the way people work and live. A global leader in microcontrollers, analog, power, and SoC products, Renesas provides comprehensive solutions for a broad range of automotive, industrial, home electronics, office automation, and information communication technology applications that help shape a limitless future. Learn more at renesas.com.
