
Supercomputing To Go

HPEC Raises its Head at SC|05

Some embedded applications are much tougher than the modest, power-sipping systems we usually associate with the term. There are cases when we need to deliver copious amounts of computing power while remaining off the grid. Last week, at Supercomputing 2005 in Seattle, there was ample evidence of just such compute power gone mad. Gigantic racks of powerful processors pumped piles of data through blazing-fast networks and onto enormous storage farms. The feel of the place was about as far from “embedded” as you can get, unless your idea of embedding somehow involves giant air conditioners and three-phase power.

Behind the huge storage clouds, teraflop racks, and nation-sized networks, however, there was considerable embedded computing activity going on. Although not the show’s main event, high-performance embedded computing (HPEC) was hanging out at the show and getting a good deal of quiet attention. It seems that not all of life’s difficult problems will hold still long enough for you to ship them off to a supercomputer facility. Sometimes, massive processing power is required to interpret images in real time, process radio signals on the fly, or solve complicated algorithms from inside a moving vehicle. It’s those applications that put the “E” in HPEC, and several forward-thinking companies were at the show, working to help the rest of us see the light.

To briefly trace the history of embedded systems architectures: over the past decade we have moved rapidly from systems-in-chassis to systems-on-board, and then to system-on-chip (SoC) integration. Each time we’ve integrated, our power density has increased as our form factors shrank. Interestingly, today, embedded systems have more in common with supercomputers than with commodity desktop and laptop machines. As we highlighted last week in “Changing Waves,” both supercomputers and embedded computers have hit the wall of diminishing returns on single-threaded, von Neumann processors and have moved into the domain of multi-core and alternative-architecture processing.

The HPEC folks have just hit the wall a little earlier and a little harder than the rest of us. Supercomputing in embedded applications is a challenging engineering problem with little wiggle room for tradeoffs and compromises. Three primary solution tracks are in evidence today, one with multi-core embedded processing, one with specialized processors such as DSPs, and one with reconfigurable accelerators. Supercomputing 2005 showed us a rich crop of companies targeting multi-core development, and many of the compiler and OS technologies that serve the massively parallel grids and clusters are similarly applicable to HPEC.
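
To get a feel for that first track, here’s a rough sketch (ours, not any vendor’s) of how a multi-core toolchain spreads an embedded-style signal-processing loop across cores. It’s plain C with an OpenMP directive, and the kernel, data, and sizes are invented purely for illustration.

/* Sketch: an embedded-style weighted-energy calculation spread across
 * cores with OpenMP. The kernel, array contents, and sizes are made up
 * for this example and not taken from any SC|05 demonstration. */
#include <omp.h>
#include <stdio.h>

#define N 1048576

int main(void)
{
    static float samples[N], weights[N];
    double energy = 0.0;

    /* Fill in synthetic test data. */
    for (int i = 0; i < N; i++) {
        samples[i] = (float)(i % 256) / 256.0f;
        weights[i] = 0.5f;
    }

    /* Each core accumulates a partial sum; the reduction combines them. */
    #pragma omp parallel for reduction(+:energy)
    for (int i = 0; i < N; i++)
        energy += (double)samples[i] * weights[i];

    printf("weighted energy: %f using up to %d threads\n",
           energy, omp_get_max_threads());
    return 0;
}

Compile it with a flag like gcc’s -fopenmp and the same source runs on anything from a dual-core embedded board to a rack full of processors, which is precisely why the cluster-side compiler and OS work carries over to HPEC.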

A variety of embedded boards and systems from companies like Nallatech, Star Bridge Systems, and Annapolis Micro Systems were on display. Most of these combine conventional processors feeding DSPs or FPGA accelerators, with generous helpings of memory for caching and FIFOs, and various high-performance I/O connections to hook up to the outside world. Performance claims and demonstrations on many of these devices were impressive, often rivaling or beating non-embedded supercomputers at the same task.
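
On the software side, the host’s job on boards like these boils down to keeping the accelerator fed. The sketch below shows the general buffering pattern in plain C; the push_to_accelerator() routine is a hypothetical stand-in for whatever DMA or driver call a particular board actually provides.

/* Sketch of the host-side pattern: the CPU stages blocks of samples in a
 * software FIFO and hands them off to the accelerator. Sizes, data, and
 * push_to_accelerator() are invented stand-ins, not any vendor's API. */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define BLOCK_WORDS 256
#define FIFO_DEPTH  8

typedef struct {
    uint32_t block[FIFO_DEPTH][BLOCK_WORDS];
    int head, tail, count;
} block_fifo;

static int fifo_push(block_fifo *f, const uint32_t *src)
{
    if (f->count == FIFO_DEPTH) return -1;        /* full: apply backpressure */
    memcpy(f->block[f->head], src, sizeof(f->block[0]));
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->count++;
    return 0;
}

static int fifo_pop(block_fifo *f, uint32_t *dst)
{
    if (f->count == 0) return -1;                 /* empty */
    memcpy(dst, f->block[f->tail], sizeof(f->block[0]));
    f->tail = (f->tail + 1) % FIFO_DEPTH;
    f->count--;
    return 0;
}

/* Hypothetical placeholder for the board-specific transfer call. */
static void push_to_accelerator(const uint32_t *blk)
{
    printf("DMA block starting with 0x%08x\n", (unsigned)blk[0]);
}

int main(void)
{
    static block_fifo f;
    uint32_t staging[BLOCK_WORDS];

    for (int n = 0; n < 4; n++) {                 /* pretend sensor frames */
        for (int i = 0; i < BLOCK_WORDS; i++)
            staging[i] = (uint32_t)(n * BLOCK_WORDS + i);
        fifo_push(&f, staging);
    }
    while (fifo_pop(&f, staging) == 0)
        push_to_accelerator(staging);
    return 0;
}

The FIFO gives the CPU somewhere to stage data when the accelerator is busy, which is exactly what all that on-board caching and FIFO memory is for.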

Unlike the HPC strategy of fitting the algorithm to the hardware, however, the HPEC community tends to fit the hardware to the algorithm. The reasons are economic. A typical supercomputer installation justifies its cost by lending its processing power to as many high-value problems as possible. These problems may be highly diverse, with their only commonality being the need for trillions of CPU cycles. In the embedded supercomputing domain, however, the machine is almost always optimized to solve one specific problem. It doesn’t have to be working on DNA sequence comparisons one day, hurricane forecasting the next day, and seismic data analysis on the third. This luxury allows for some serious specialization, and HPEC designers seldom fail to capitalize on that angle.

Like a race car, an HPEC system can be fine-tuned for precisely the problem it was conceived to solve. In the extreme, a custom ASIC can be designed to provide massive hardware acceleration of specific compute-intensive tasks with minimal power and space utilization. If more flexibility is needed, programmable logic devices can be used to provide reconfigurable algorithm acceleration with a slight power penalty (compared to an ASIC). In any event, making supercomputing embedded almost always involves some additional acceleration beyond simple multi-core processing.

With any of these acceleration strategies, however, there is a formidable programming problem. Supercomputing 2005 was ready with a number of solutions to that problem as well. Mitrionics was debuting their “Mitrion-C” compiler, which takes a C-like parallel programming language and generates a hardware-accelerated executable that can run on a variety of machines, from Cray XD1 supercomputers to custom embedded HPEC equipment with FPGAs. Celoxica showed continued success with their Handel-C environment for hardware acceleration of compute-intensive algorithms, aimed squarely at the embedded high-performance computing area. Star Bridge Systems demonstrated their “Viva” graphical language compilers generating reusable applications to run on a variety of hardware platforms, from accelerated HPCs to FPGA development boards.
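
These three tools don’t share a syntax, so rather than misquote any of them, here is an ordinary-C sketch of the kind of kernel they’re built to accelerate: fixed loop bounds, regular data access, and no pointer tricks, so that a hardware compiler can unroll the taps into parallel multipliers. It is illustration only, not actual Mitrion-C, Handel-C, or Viva code.

/* Sketch: a small FIR filter written with fixed bounds so a hardware
 * compiler could unroll the tap loop. Plain C for illustration only;
 * coefficients and input are invented. */
#include <stdio.h>

#define TAPS 8
#define SAMPLES 32

static void fir(const int x[SAMPLES], const int h[TAPS], int y[SAMPLES])
{
    for (int n = TAPS - 1; n < SAMPLES; n++) {
        int acc = 0;
        for (int k = 0; k < TAPS; k++)   /* candidate for full unrolling */
            acc += h[k] * x[n - k];
        y[n] = acc;
    }
}

int main(void)
{
    int x[SAMPLES], h[TAPS] = {1, 2, 3, 4, 4, 3, 2, 1}, y[SAMPLES] = {0};
    for (int i = 0; i < SAMPLES; i++)
        x[i] = (i % 4 == 0);             /* simple impulse train */
    fir(x, h, y);
    for (int i = 0; i < SAMPLES; i++)
        printf("%d ", y[i]);
    printf("\n");
    return 0;
}

Feed a kernel with this shape to any of those environments and the inner loop is the obvious candidate to become dedicated parallel hardware.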

While many of us may never need the gigaflops of compute power available with HPEC systems, it is still good to see the state of the art push ahead, giving everyone some extra breathing room. Even though we may not need the power today, it takes only a small market shift to turn a compute-intensive incremental feature into a must-have. If nothing else, Supercomputing 2005 showed us that the embedded MIPS will be there when we need them.
