
Playing “What If…” With Multicore Processors

Multicore processors are upon us, but how much do they really help? If your boss asked you right now, “How much faster will our code run on a two-, four-, or eight-core processor?” could you answer? How many of us have any idea how much performance we’d gain by moving from a single-core to a multicore processor?

Well, wonder no more. A Scottish prism is here to answer that very question.

“Prism” is the name of a new software-analysis tool from CriticalBlue, a Scottish company that’s been analyzing multicore software for a wee bit o’ time. The company was previously focused on SoC developers but has now turned its attention to the broader market of multicore programmers. All of us, in other words.

Prism is designed to answer the question, “how would my code perform if it ran on a multicore processor?” It’s an analysis tool, and an important one for anyone wondering how much performance headroom there might be in their existing code. Prism also enables “what if…” kinds of experimentation. You could see, for example, if your code would benefit from an 8-way processor or if it tops out after two cores. In short, Prism provides a quick way to estimate multicore performance without prototyping any multicore hardware. Prism is a type of crystal ball.

In Search of the Lost Vector

For the uninitiated, CriticalBlue made its name creating automatically generated coprocessors. Users of the company’s Cascade tool could feed it their C code and watch in awe as Cascade analyzed the code and synthesized a custom coprocessor tailored specifically to execute the thorniest parts of the code. Cascade was (and still is) impressive, but it’s useful only to SoC developers. Mere mortals using commercial chips couldn’t benefit from it.

That’s where Prism comes in. It’s a code-analysis and -optimization tool for the rest of us. It leverages CriticalBlue’s experience in spotting areas of potential parallelization, but instead of producing custom hardware it produces a report.

Prism doesn’t change any code. That’s still left up to the programmer(s). Like Cascade, Prism analyzes existing C source code looking for dependencies and opportunities for parallelization. But unlike Cascade, Prism doesn’t automatically generate any hardware or software. Instead, it highlights both the dependencies and the opportunities but leaves the actual implementation to human hands. Prism does the diagnosis, not the surgery.

Because it’s an Eclipse plug-in, Prism is easy to integrate into existing tool chains (assuming, of course, that you’re already using an Eclipse-based tool chain). It’s licensed on an annual basis, just like most software tools, and there are no royalties involved. Currently, Prism supports C code and three microprocessor instruction sets: ARM, MIPS, and Toshiba’s Venezia.

ARM and MIPS make sense – but Venezia? The odd one out here is Toshiba’s newest implementation of its MeP (media-enhanced processor), a powerful but little-known architecture that appears in some of Toshiba’s own consumer-electronics products. Toshiba is also, not incidentally, an investor in CriticalBlue, which may partly explain its support. Since MeP-based chips like Venezia always have multiple cores, they’re also logical targets for a tool like Prism. Indeed, many of Toshiba’s customers, as well as Toshiba itself, have been using Prism for several months.

Future releases of Prism may support PowerPC, Hitachi’s SuperH, or x86 instruction sets. As with most things, the decision will come down to customer demand and commercial support. For the time being, ARM and MIPS support should satisfy a broad segment of the market.

One Potato, Two Potato

Interestingly, Prism isn’t limited to modeling actual chips. It’s quite happy to tell you how your code would perform on a hypothetical 64-core ARM processor, even though no such chip exists. All Prism needs to know is the instruction set; the existence of a physical chip is irrelevant. This allows users to estimate how far their code can go before performance levels off – or even declines. Seriously serial programs with little room for parallelization may see no performance improvement at all, even on a dual-core processor. Heavily vectorized programs, on the other hand, may scale nicely from two cores to four, eight, sixteen or more. Prism makes it pretty easy to plot that curve in an afternoon.

And that’s what Prism is really all about: getting a quick peek into the performance gains that await (or not) by running on multicore processors. Most programmers have a vague notion that running their existing code on a multicore processor will probably afford some performance improvement, but quantifying that improvement is mostly guesswork. Without rewriting the code and actually trying it out on a new processor (or a good simulation of a new processor), there’s no way to know. That’s a time-consuming project just to satisfy idle curiosity. It also means guessing at where, and how, to modify the code. Would this function benefit from its own thread, or would this one? Trial and error is often the order of the day.

Yet if the code won’t scale, most programmers (and their bosses) would rather know now, so they can get an early start on rewriting. Or so they can cancel that order for multicore processors. Prism identifies where the code could be broken into multiple threads by highlighting lines of source code, displaying call graphs, and identifying dependencies. Core utilization charts give a quick visual indication of how evenly distributed the workload is. Lock contention, data races, hot spots, and other threading delays all appear visually. Modeling with a different number of cores is as simple as changing an option and rerunning the profiler.

Assuming Prism finds opportunities for parallelization, making the changes to the source code is still a manual process. Prism will show where to look, but not what to do. Even though CriticalBlue has lots of experience generating automatic solutions to multicore problems, it deliberately chose not to do so with Prism. The feedback from customers was that they’d prefer to do that work themselves, rather than have an automated tool fiddle with their code (which may have been automatically generated to begin with). If nothing else, Prism is a learning tool for programmers just coming to grips with multicore programming. You tweak the code, simulate it running on an arbitrary number of processor cores, and learn what helps and what doesn’t. At best, Prism points the way to unlocking parallelism you didn’t know was there or didn’t know how to exploit. On a good day, Prism could help add years of life to existing code.
