Ten-Step Program

Sequence PowerArtist Identifies Ways to Reduce Power

Power has become a key design consideration for SoCs in pretty much any application. We’ve looked at some ways of reducing power in past articles, largely at a high level. We continue here with a specific look at some techniques that can be identified by a new tool from Sequence called PowerArtist. This tool takes ten specific steps to identify ways to reduce power, although only a couple of them are automatically implemented. Most of the others need some engineering evaluation to decide whether to implement them and, if so, exactly how. Those are the so-called “guided” techniques, in that the tool guides the engineer towards power savings opportunities.

The focus of the power savings techniques was directed by some data that Sequence gathered regarding where power is typically consumed in an SoC. The top three items were the clocks (30-60%), memories (20-50%), and datapath (~20%). These are therefore the areas that PowerArtist considers. Four of its “PowerBots” – the name Sequence uses for each of the analysis engines – address clock power, three address memories, and three cover the datapath.

When it comes to clocks, it’s all about the enables. It is generally accepted that using clock enables can reduce power. What’s less well understood is that for active signals that would rarely be disabled, adding a clock enable can actually increase power, since the power added by additional enabling circuitry overwhelms any potential minor power reduction. So a key aspect of what PowerArtist does is to determine which clocks would actually benefit from having a clock enable.
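To make that trade-off concrete, here is a back-of-the-envelope sketch in Python. The model and the numbers are illustrative assumptions, not data from Sequence or a description of how PowerArtist computes anything: clock power is simply scaled by the fraction of cycles the enable is asserted, and the gating circuitry adds a fixed overhead.

```python
def net_gating_savings(clock_power_uw, enabled_fraction, gate_overhead_uw):
    """Return the power saved (positive) or lost (negative) by gating a clock.

    clock_power_uw   -- power burned clocking the registers ungated (assumed)
    enabled_fraction -- fraction of cycles the enable is asserted, 0.0 to 1.0
    gate_overhead_uw -- power added by the enabling circuitry (assumed)
    """
    if not 0.0 <= enabled_fraction <= 1.0:
        raise ValueError("enabled_fraction must be between 0 and 1")
    gated_power = clock_power_uw * enabled_fraction + gate_overhead_uw
    return clock_power_uw - gated_power


# A register bank that sits idle 80% of the time: gating is a clear win.
mostly_idle = net_gating_savings(100.0, enabled_fraction=0.2, gate_overhead_uw=5.0)

# A register bank that is almost always active: the overhead outweighs the gain.
mostly_busy = net_gating_savings(100.0, enabled_fraction=0.98, gate_overhead_uw=5.0)

print(f"mostly idle: {mostly_idle:+.1f} uW, mostly busy: {mostly_busy:+.1f} uW")
```

With these made-up figures, the mostly idle bank comes out well ahead while the mostly busy one ends up slightly worse off, which is exactly the case the analysis is meant to catch.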

Of course, simply adding gating to a clock signal may disrupt the timing of that clock and certainly will add skew and unbalance the clock tree. Synthesis engines already have the capability of recognizing certain styles of logic as opportunities for clock gating and can implement the gating, taking into account all of the timing considerations. Rather than duplicate this effort, PowerArtist doesn’t explicitly gate the clock signals; instead, it creates logic that will be recognized by the synthesis tool as a clock-gating opportunity and lets the synthesis engine take care of the implementation details. But synthesis engines don’t take the signaling activity into account and therefore can’t tell when clock gating would increase power rather than reduce it; PowerArtist does this analysis, guiding the synthesis tool accordingly.

There are four steps to clock power reduction. The first is to find additional opportunities to make use of existing enables. This is a guided step: the tool provides the designer with information on clocks without enables as well as the enable signals that exist, and the designer can apply those enables to additional registers if that makes sense for the function. Because reusing existing enable signals burns negligible additional power, this is the preferred way to reduce power. For any remaining non-enabled clocks that couldn’t use existing enables for whatever reason, the second step has PowerArtist generate edge-detection logic to create new enable signals – as long as this will result in power savings.

The third step is to make use of the built-in analysis engine within PowerArtist to look over the enable logic for all of the registers. The tool provides detailed power information on both the logic cone fanning into the register and the logic fanning out of the register. The designer can use this information to look for alternative enabling signals or strategies that might consume less power.

Finally, PowerArtist generates a “master list” of clock enables to be created by the synthesis tool. This is essentially a series of constraints that will ensure that clock enabling is done only for those scenarios that will result in lower power. All enabled clocks are considered in this analysis, whether generated by the user or PowerArtist. However, because the latter are created only if power will be reduced, they will always be part of the master list. Any user-created enables that won’t reduce power are left off the list.
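The filtering behind such a master list can be sketched in a few lines of Python. Everything here is hypothetical: the `EnableCandidate` record, the register names, and the savings estimates are invented for illustration and say nothing about the tool's actual data model or constraint format.

```python
from dataclasses import dataclass


@dataclass
class EnableCandidate:
    register: str
    source: str            # "user" or "tool"
    est_savings_uw: float  # assumed estimate; negative means gating costs power


def build_master_list(candidates):
    """Keep only the enables that are expected to reduce power."""
    return [c for c in candidates if c.est_savings_uw > 0]


candidates = [
    EnableCandidate("fifo_rd_ptr", "user", 12.5),
    EnableCandidate("status_reg", "user", -1.8),  # rarely disabled, so gating hurts
    EnableCandidate("dma_len", "tool", 4.0),      # tool-created, net positive by construction
]

master = build_master_list(candidates)
print([c.register for c in master])  # ['fifo_rd_ptr', 'dma_len']
```

The user-created enable with a negative estimate drops out, while the tool-created one survives because it would never have been proposed otherwise.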

The next three steps relate to memory. The first automatically applies gates to memory clocks to inhibit clocking when the memory is inactive; this is entirely analogous to the main clock gating applied above, except that it typically involves the memory select signals as part of the clock gating.

The second step is a check to find opportunities to split wide memories. The idea here is that if the entire address field isn’t changing – a reasonably common situation, especially when running through a contiguous address range – then by having two half-memories instead of one full memory, half the memory can essentially be static (presumably the MSB half) while the other half cycles through the addresses. This is a guided operation and is not automatically implemented by the tool.
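A quick way to see why the upper address bits tend to sit still is to count bit toggles over a contiguous sweep. The Python sketch below is only an illustration under stated assumptions: the 8-bit address width, the access pattern, and the choice to split on the address MSB are all invented, and the split shown is just one reading of the idea.

```python
def bit_toggle_counts(addresses, width):
    """Count how often each address bit changes between consecutive accesses."""
    counts = [0] * width
    for prev, curr in zip(addresses, addresses[1:]):
        changed = prev ^ curr
        for bit in range(width):
            if (changed >> bit) & 1:
                counts[bit] += 1
    return counts  # counts[0] is the LSB


ADDR_WIDTH = 8                   # assumed 256-entry memory
sweep = list(range(0x20, 0x60))  # 64 contiguous addresses (assumed pattern)

toggles = bit_toggle_counts(sweep, ADDR_WIDTH)

# If the memory were split in two on the address MSB, which halves get touched?
halves_touched = {addr >> (ADDR_WIDTH - 1) for addr in sweep}

print("toggles per bit, LSB first:", toggles)
print("halves touched:", halves_touched)
```

In this sweep the LSB toggles on every access while the MSB never moves, so only one of the two half-memories would see any activity.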

The third step aims at reducing signal switching on the data lines of the memory. Data inputs to a memory are useful only if the memory is being written. If the memory isn’t in write mode, then switching data inputs are simply wasting power. With a shared bus, the inputs and outputs use the same lines, but this step is focused on the cone of logic moving towards the memory and identifying logic to quiet the signals when not needed. It’s a guided step, so the tool will point out those opportunities, and the designer decides whether and how to implement any changes.
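The effect of quieting the data inputs can be illustrated by counting bit toggles with and without a hold on non-write cycles. This is a toy Python model with an invented trace. It assumes the data pins start at zero and that the quieting logic simply holds the last written value. The tool itself only points out the opportunity, and the real implementation is up to the designer.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def input_toggles(cycles, gate_on_write_enable):
    """Count bit toggles seen at the memory's data inputs.

    cycles -- list of (write_enable, data) pairs, one per clock cycle
    """
    toggles = 0
    pins = 0  # value currently on the data pins (assumed to start at 0)
    for write_enable, data in cycles:
        if gate_on_write_enable and not write_enable:
            continue  # hold the previous value; the pins stay quiet
        toggles += hamming(pins, data)
        pins = data
    return toggles


# Two writes separated by three cycles of don't-care bus activity.
cycles = [(1, 0xFF), (0, 0x00), (0, 0xFF), (0, 0x0F), (1, 0xF0)]

ungated = input_toggles(cycles, gate_on_write_enable=False)
gated = input_toggles(cycles, gate_on_write_enable=True)
print(ungated, gated)  # 36 12
```

The same two writes reach the memory either way; the gated version just skips the switching in between.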

Finally, there are three steps dedicated to reducing power in the datapath. The first recognizes that many multiplexers may spend a reasonable amount of time with a particular input selected. One such example would be multiplexers used to implement some sort of switching of test or diagnostic circuitry that is usually not used in standard operation; another would be any multiplexers used in the switching of the many “modes” now associated with complex circuits. Any input switching on an unselected multiplexer input is wasted power, so this step identifies such occurrences; implementation is left to the designer.
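As a rough illustration of what "wasted" means here, the Python sketch below splits the input toggles of a 2-input multiplexer into those on the selected input and those on the unselected one. The trace format and values are made up for the example and are not anything the tool consumes.

```python
def mux_toggle_breakdown(trace):
    """Split input toggles of a 2-input mux into useful and wasted.

    trace -- list of (select, in0, in1) tuples, one per clock cycle
    A toggle counts as wasted if it lands on the input that is not
    selected in the cycle where the new value appears.
    """
    useful = wasted = 0
    for (_, prev0, prev1), (sel, cur0, cur1) in zip(trace, trace[1:]):
        t0 = bin(prev0 ^ cur0).count("1")
        t1 = bin(prev1 ^ cur1).count("1")
        if sel == 0:
            useful += t0
            wasted += t1
        else:
            useful += t1
            wasted += t0
    return useful, wasted


# Input 0 is the normal path; input 1 stands in for a busy diagnostic bus.
trace = [
    (0, 0b0000, 0b1010),
    (0, 0b0001, 0b0101),
    (0, 0b0011, 0b1010),
    (1, 0b0111, 0b1011),
]

useful, wasted = mux_toggle_breakdown(trace)
print(useful, wasted)  # 3 9
```

In this trace three quarters of the switching happens on an input nobody is listening to, which is the kind of imbalance this step is looking for.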

The second datapath step recognizes that logic switching when a clock is disabled may be wasted activity, and it identifies opportunities to cut down the amount of logic switching by using existing enable signals. Again, the decision as to whether to do this and the implementation are handled by the designer.

The third step is similar to some of the ones we’ve already seen, but with a subtle difference. All prior attempts to quiet unnecessary activity have focused on signals switching over a series of clock cycles when those inputs will have no impact. The last datapath technique looks for activity within a clock cycle. Such activity is usually caused by intermediate logic states and hazards, and, while it poses no logic problem if properly timed, it does increase the amount of switching and therefore the power. This final step identifies opportunities to quiet such transient switching.
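Here is a toy Python model of such a within-cycle glitch. It assumes a zero-delay gate and made-up arrival times, so it is nowhere near a real timing simulation, but it shows how the output of an XOR can switch twice in a cycle even though it ends up where it started.

```python
def output_transitions(initial, arrivals, func):
    """Count output changes within one clock cycle.

    initial  -- dict of input values at the start of the cycle
    arrivals -- list of (time, input_name, new_value) events
    func     -- combinational function of the input dict
    Events that share a timestamp are applied together.
    """
    state = dict(initial)
    out = func(state)
    transitions = 0
    for t in sorted({when for when, _, _ in arrivals}):
        for when, name, value in arrivals:
            if when == t:
                state[name] = value
        new_out = func(state)
        if new_out != out:
            transitions += 1
            out = new_out
    return transitions


def xor_gate(s):
    return s["a"] ^ s["b"]


start = {"a": 0, "b": 0}

# Inputs arrive at different times: the output pulses high, then falls back.
skewed = output_transitions(start, [(1, "a", 1), (3, "b", 1)], xor_gate)

# Inputs arrive together: the output never moves.
aligned = output_transitions(start, [(2, "a", 1), (2, "b", 1)], xor_gate)

print(skewed, aligned)  # 2 0
```

Both cases settle to the same final value; only the skewed one pays for two extra output transitions along the way.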

One of the things Sequence seems to have tried to do is to make it easier to identify the impact of any changes. The GUI provides power information at a relatively high level of precision. A list of all potential RTL changes is provided, along with the available power savings (in absolute and percentage terms) and, for automatically-implemented changes, the area overhead of making the change. Those automatic changes can be accepted or rejected. The display allows simultaneous viewing and cross-probing between this list, the RTL, and a block schematic view. The tool also uses an OpenAccess database for all of the analysis results, allowing scripting for getting additional information that might not be part of the standard display.

The whole PowerArtist approach, being a collection of various and sundry techniques that can vary widely in their approach, lends itself to future additions of new engines and refinements to the existing ones. Sequence hasn’t made any specific promises to that effect, but it seems as though the foundation has been laid. It’s not a great leap of faith to think that more power savings could be on the horizon.
