Software Thermal Management with TI OMAP Processors

TI OMAP applications processors are powerful and flexible, which makes them well suited to designs with difficult power and thermal constraints. However, OMAP parts are complex, and the ways to use them are varied and intricate. This article focuses on software techniques that improve the thermal performance of OMAP-based designs by limiting the source of heat in integrated circuits: power.

Before getting into specifics about how to manage thermal problems with OMAP, let’s remember that to manage thermal issues, there are three approaches we can take:

Don’t generate heat in the first place. By conservation of energy, essentially all of the electrical power a device consumes ends up dissipated as heat, so the best way to generate less heat is to consume as little power as possible.
Transfer existing heat effectively. Once heat is generated, the job then becomes to transfer it effectively by providing an efficient path from the device to the environment via thermal pads, epoxy, clips, or any method that makes use of conduction, convection, or facilitates radiation.
Define your environment. If all else fails, a final way to limit heat is to define your operating environment. For example, specifying that the ambient operating temperature for the device must be at or below a given temperature constrains the environment to your advantage. However, most often the environment defines itself; hence this dimension of thermal performance management can only be used if you have complete control of the operating environment.

Of these three approaches, the OMAP is only suited to address the first: don’t generate heat in the first place. Heat in an embedded system is a byproduct of power thanks to the laws of thermodynamics. Fortunately for us, the OMAP is very good at managing power. By focusing on software techniques to manage power, we can affect the thermal profile of the device in which it operates.

In CMOS integrated circuits, dynamic power is proportional to switched capacitance, switching frequency, and the square of the supply voltage (P_dynamic = CfV²), as depicted in Figure 1. The relationship matters because it is nonlinear in practice: at a fixed voltage, power (and heat!) grows linearly with frequency, but higher frequencies require higher voltages, and power grows with the square of that voltage. OMAP parts have a number of mechanisms that can be used to dynamically scale between lightweight processing and heavyweight processing on demand, ultimately scaling and manipulating the dynamic power curve in your favor.
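As a worked illustration of P_dynamic = CfV², consider halving the frequency and, because the slower clock allows it, lowering the voltage by 20%. The 20% figure is a hypothetical chosen for easy arithmetic, not an OMAP specification.

```latex
\begin{align*}
P_{\text{dynamic}} &= C f V^{2} \\
\frac{P_{2}}{P_{1}} &= \frac{C \,(f/2)\,(0.8V)^{2}}{C f V^{2}}
                     = 0.5 \times 0.64 = 0.32
\end{align*}
```

Under these assumptions, dynamic power falls to 32% of its original value, whereas halving the frequency alone would only bring it to 50%.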

Figure 1

OMAP parts provide three mechanisms for you to manage power in your embedded system: SmartReflex™ (Adaptive Voltage Scaling and Dynamic Power Switching), DVFS (Dynamic Voltage and Frequency Scaling), and SLM (Static Leakage Management).

SmartReflex™

SmartReflex is a TI brand name that essentially encompasses two techniques: Adaptive Voltage Scaling (AVS) and Dynamic Power Switching (DPS).

Adaptive Voltage Scaling (AVS). Each processor is unique and performs differently due to process variations and temperature. Because of this, some devices (called strong or hot devices) can meet frequency requirements at lower operating voltages than those defined by the specification. OMAP processors have dedicated hardware control loops that monitor the performance of each core and can automatically lower the supply voltages (VDD_MPU_IVA and VDD_CORE) to the lowest level at which that specific processor still meets its frequency requirements, yielding considerable power savings. See Figure 2 for a visual depiction of AVS as it applies to the dynamic power curve and its goal of providing equivalent performance at lower power.

Figure 2

Dynamic Power Switching (DPS). With DPS, portions of the OMAP can be put into a low-power state while it is waiting for a timer or peripheral interrupt. By dynamically (and automatically) switching to lower power modes when system activity is low, leakage power is reduced, and less power is consumed.

SmartReflex is a feature that is essentially on or off. By enabling SmartReflex, you get all the benefits described above, and get to enjoy the fact that your dynamic power curve just got better. Except for a few types of applications (highly sensitive RF devices, for example), there is essentially no reason for you to turn SmartReflex off.

Dynamic Voltage and Frequency Scaling (DVFS)

Due to the nonlinear nature of the law of dynamic power, if the demand on the processor is light, it makes sense to reduce the operating frequency to save clock cycles and thereby reduce power consumption and heat generation. This technique is called dynamic frequency scaling (DFS).

When the frequency of a processor is reduced, that processor no longer requires the same voltage to operate. Hence when you reduce frequency, you should also be reducing voltage. The technique of adjusting input voltage to a processor core is called dynamic voltage scaling (DVS).

These two techniques (DFS + DVS), because they are so often combined together, are commonly called Dynamic Voltage and Frequency Scaling (DVFS). The OMAP processor families support DVFS. A visualization of this appears in Figure 3.

Figure 3

The OMAP defines the concept of an operating performance point (OPP). An OPP is a tuple consisting of a frequency (f) and a voltage (V) for each core that DVFS is enabled for. In the AM/DM37x, for instance, DVFS is enabled for VDD_MPU_IVA and VDD_CORE.

When using DVFS on an OMAP part, you can expect that the OMAP will provide the power states and the DVFS engine, but you must decide how and when to switch between the defined operating performance points (OPPs), and also handle coordinating system peripherals during low power suspend.

The DVFS engine in an OMAP part is in charge of sequencing transitions between OPPs. Specifically, when the new target frequency is higher (resulting from moving to a higher OPP), voltage is increased first, then the frequency. When the new target frequency is lower (resulting from moving to a lower OPP), frequency is reduced first, then the voltage. The DVFS engine handles this for you.
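The ordering rule above can be sketched in a few lines of shell. This is only an illustration of the sequencing that the DVFS engine performs for you, not code you need to write; `set_voltage_uv`, `set_frequency_khz`, and `opp_transition` are hypothetical names, and the two setters merely print what they would do.

```shell
#!/bin/sh
# Hypothetical stand-ins for the hardware operations: they only print.
set_voltage_uv()    { echo "voltage -> $1 uV"; }
set_frequency_khz() { echo "frequency -> $1 kHz"; }

# opp_transition CUR_KHZ NEW_KHZ NEW_UV
# Applies the new OPP in the safe order described in the text.
opp_transition() {
    if [ "$2" -gt "$1" ]; then
        # Higher OPP: raise the voltage first, so the core is never
        # clocked faster than its supply can support.
        set_voltage_uv "$3"
        set_frequency_khz "$2"
    elif [ "$2" -lt "$1" ]; then
        # Lower OPP: slow the clock first, then drop the voltage.
        set_frequency_khz "$2"
        set_voltage_uv "$3"
    fi
    # Same frequency: nothing to do.
}
```

For example, `opp_transition 300000 600000 1200000` prints the voltage step before the frequency step, while `opp_transition 600000 300000 1000000` prints them in the opposite order. The frequency and voltage values here are made up.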

To use DVFS on OMAP, you need to define policies (or use generic ones) that govern when to switch between OPPs in your application. In Linux, operating system support for DVFS is provided by the CPUFreq subsystem, which consists of a driver and one or more governors. For OMAP, TI has provided the driver support. Linux has a set of governors that you can use to optimize the DVFS subsystem based on the needs of your application.

Example governors include performance, powersave, userspace, and ondemand. The userspace governor allows applications to control when, specifically, to move between OPPs. The ondemand governor scales up when there is high CPU utilization and down when there is low CPU utilization. The CPUFreq framework allows applications to subscribe to DVFS change events, and you may write your own governor if you feel so inclined.
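The scale-up, scale-down behavior of a utilization-driven governor can be sketched as a small policy function. This is a deliberately simplified illustration, not the kernel's ondemand implementation; the OPP table, both thresholds, and the name `next_opp` are hypothetical.

```shell
#!/bin/sh
# OPP frequencies in kHz, lowest to highest (hypothetical values).
OPP_TABLE="300000 600000 800000 1000000"

# next_opp CUR_KHZ LOAD_PERCENT
# Prints the frequency (kHz) the policy would select next.
next_opp() {
    up_threshold=80     # above this load, jump straight to the top OPP
    down_threshold=30   # below this load, step down by one OPP
    prev=""
    for khz in $OPP_TABLE; do
        last=$khz       # after the loop, holds the highest OPP
    done
    if [ "$2" -gt "$up_threshold" ]; then
        echo "$last"
        return
    fi
    if [ "$2" -lt "$down_threshold" ]; then
        for khz in $OPP_TABLE; do
            if [ "$khz" -eq "$1" ]; then
                # Entry just below the current one, or the current one
                # if we are already at the lowest OPP.
                echo "${prev:-$khz}"
                return
            fi
            prev=$khz
        done
    fi
    # Moderate load (or unknown current frequency): stay put.
    echo "$1"
}
```

Jumping directly to the top OPP under heavy load favors responsiveness, while stepping down one OPP at a time avoids dropping performance too aggressively after a brief lull.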

See the cpufreq section of the Linux kernel documentation for more information. Here is a quick list of commands that you can use to query and change governors (note that you must first enable CPU frequency scaling, the OPP library, and the specific governors you want in the base kernel configuration).

List available governors
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors

List current active governor
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

Switch to a different governor
echo -n "<governor_name>" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

Switch to the userspace governor
echo -n "userspace" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
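Once the userspace governor is active, a specific frequency can be requested through the standard cpufreq sysfs files. The helper below is a minimal sketch: `set_speed_khz` and the `CPUFREQ_DIR` override are hypothetical names, and it assumes the driver exposes `scaling_available_frequencies` and `scaling_setspeed`, which not every platform does.

```shell
#!/bin/sh
# Directory holding the cpufreq files for CPU 0; override for testing.
CPUFREQ_DIR="${CPUFREQ_DIR:-/sys/devices/system/cpu/cpu0/cpufreq}"

# set_speed_khz KHZ
# Switches to the userspace governor and requests a fixed frequency.
# Returns non-zero if the driver does not list that frequency.
set_speed_khz() {
    for khz in $(cat "$CPUFREQ_DIR/scaling_available_frequencies"); do
        if [ "$khz" = "$1" ]; then
            echo "userspace" > "$CPUFREQ_DIR/scaling_governor"
            echo "$1" > "$CPUFREQ_DIR/scaling_setspeed"
            return 0
        fi
    done
    echo "unsupported frequency: $1" >&2
    return 1
}
```

Run as root on the target, `set_speed_khz 600000` would pin the CPU at 600 MHz, provided that value appears in the list of available frequencies. The value is an example only.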

Static Leakage Management (SLM)

When the OMAP is in a system idle mode, and no useful processing is taking place, the amount of current that it draws is called leakage. Ideally we want to minimize leakage to an acceptably low level and the process for doing this is called Static Leakage Management (SLM). It is not possible to eliminate leakage completely unless you cut power to the entire processor. Some leakage is acceptable to do things like keep the wakeup domain powered so that the processor can wake up quickly when an event prompts it to do so.

OMAP parts such as the OMAP35x and AM/DM37x support a variety of options for low power standby states that allow you as the designer to trade off the level of power savings with the speed/latency of wakeup.

Factors that impact the level of power savings during standby are: whether internal memory and logic should be powered up or down, whether clocks are turned on or off, or whether external voltage regulators are used or not.

Off mode (SYS_OFF_MODE) is a very low power mode that puts the chip into the lowest power state possible while still allowing it to wake up on select events. In off mode, system state is saved to external memory, most of the chip is turned off, and a portion of the wakeup domain is kept alive to continuously monitor for wakeup events.

In addition to off mode, the OMAP35x and AM/DM37x can gate the high frequency clock when it is not needed via the SYS_CLKREQ signal, and/or automatically deassert this line when in full-chip retention or off mode.

The OMAP has many features available for managing static leakage. This section only covers the highlights. See the Technical Reference Manual (TRM) for the specific part for all the details.

Kindle Fire Case Study

As a short case study, let’s look at the Kindle Fire from Amazon. Logic PD recently did a tear down of a Kindle Fire, and here is what we found.

The Kindle Fire uses a TI OMAP 4430, which is a dual-core ARM Cortex-A9, with a PowerVR SGX540 graphics core, an IVA3 hardware accelerator (DSP), and camera Image Signal Processor (ISP). Additionally, it has a 7” 600×1024 display, uses Wi-Fi 802.11n, has a 4400 mAh battery, USB 2.0, and comes with 8 GB of non-volatile storage. The software that runs today on the Kindle Fire is a customized version of Android Gingerbread 2.3.

After opening up the Kindle Fire, inspecting its contents, and being surprised at the minimalist approach that was taken to manage heat transfer within the plastic case, we were intrigued and dug further.

By rooting the device and inspecting the ways in which frequency and voltage vary within the Kindle Fire under normal usage scenarios, we confirmed that the Kindle Fire uses a wide variety of techniques (mostly the ones described in this article) to manage power dynamically. Managing power dynamically means managing heat dynamically, which reduces the need for complicated heat spreaders, heat sinks, thermal pads, epoxies, and metal encasements.

DVFS and Thermal Throttling

Our first test was to see how the Kindle Fire uses dynamic voltage and frequency scaling (DVFS). We put the Kindle Fire under load using the readily available AnTuTu benchmark; the results can be seen in Figure 4. We found that the Kindle Fire uses an ondemand governor that scales the frequency and voltage of each ARM core within the OMAP up when utilization increases.

Figure 4

We ran a similar exercise to understand the performance, power, and temperature of the Kindle Fire when in idle mode. Idle mode in this case means that the Fire was on, the display was lit, but no significant processing was taking place. The results for this test are shown in Figure 5.

Figure 5

From these two simple experiments, we draw a number of interesting observations:

The law of dynamic power is at work. As the frequency ramps up, you can see the compounding effect that switching frequency and voltage have on power consumption. As we discussed earlier, this is a nonlinear relationship, and ramping to the highest frequency to achieve the highest performance means paying a hefty penalty in terms of power consumption (battery life), and dissipated power in the form of heat.
Multicore performs better under load. Because the lower frequency allows a lower voltage, and dynamic power scales with the square of voltage, it makes sense that running two cores at 300 MHz would draw less power than running one core at 600 MHz. Our results confirm that this is true.
Single core performs better when idling. When the Fire is idling, it makes sense to turn off one of the ARM cores. However, we were surprised to find that the Kindle Fire does not turn off one of the cores while idling, and so draws power needlessly. Our recommendation to Amazon is to turn off one of these cores during idle mode, and start reaping the benefits of longer battery life.
Thermal throttling is used to limit IC damage. With a processor as beefy as the OMAP4, it’s easy to consume enough power to produce enough heat to degrade component reliability or cause failure. The Fire preempts this condition with thermal throttling. We saw this by continually running the benchmark and observing that over time we were getting consistently lower results as the OMAP heated up. By self-limiting frequency and voltage at the maximum operating temperature threshold, the Fire is able to ensure that components are not damaged, and that long-term reliability and durability are maintained.
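The idea behind thermal throttling can be sketched as a function that maps die temperature to a frequency cap. This is not how the Kindle Fire implements it, which we did not inspect at that level; the name `throttle_cap_khz`, the temperature thresholds, and the frequencies are all hypothetical.

```shell
#!/bin/sh
# throttle_cap_khz TEMP_MILLI_C
# Prints the highest CPU frequency (kHz) allowed at the given temperature.
# The input is in millidegrees Celsius, the unit reported by the Linux
# thermal sysfs "temp" files. Thresholds and frequencies are made up.
throttle_cap_khz() {
    if [ "$1" -ge 85000 ]; then
        echo 300000      # very hot: fall back to the lowest OPP
    elif [ "$1" -ge 75000 ]; then
        echo 600000      # warm: give up the top OPPs
    else
        echo 1000000     # cool: the top OPP remains available
    fi
}
```

On a Linux target, the result could be written to the cpufreq `scaling_max_freq` file, for example `throttle_cap_khz "$(cat /sys/class/thermal/thermal_zone0/temp)" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq`. Which thermal zone corresponds to the processor die is board-specific, so `thermal_zone0` is an assumption.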

Adaptive Voltage Scaling (AVS)

The Kindle Fire also uses Adaptive Voltage Scaling (AVS) in the form of SmartReflex. Although SmartReflex in the Fire is on all of the time, we were curious to see what sort of power, performance, and thermal impact there might be if we turned it off. You can see in Figure 6 that by having SmartReflex on vs off, the Fire draws ~400 mW less and runs ~5 °C cooler. Nice.

Figure 6

Suspend Power

For suspend mode, the Kindle Fire does a very good job of getting all peripherals coordinated and put into a very low power mode. In suspend, the Fire draws a mere 62 mW, runs at 31 °C, and only takes 180 ms to wake up. These results can be seen in Figure 7. When the Fire boots up on a cold start, it can take 50+ seconds to boot. However, by using low power suspend effectively, and only taking 180 ms to go from suspend to idle (system on, backlight on), it is able to satisfy two demanding requirements in any consumer device: long battery life, and “instant on”.
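To put the 62 mW suspend figure in perspective, a rough estimate of suspend time on a full charge follows. It assumes a nominal cell voltage of 3.7 V, which we did not measure, and it ignores battery self-discharge and conversion losses.

```latex
\begin{align*}
E_{\text{battery}} &\approx 4.4\,\text{Ah} \times 3.7\,\text{V} \approx 16.3\,\text{Wh} \\
t_{\text{suspend}} &\approx \frac{16.3\,\text{Wh}}{0.062\,\text{W}} \approx 263\,\text{h} \approx 11\ \text{days}
\end{align*}
```

Under those assumptions, the Fire could sit in suspend for roughly a week and a half on a single charge.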

Figure 7

Conclusion

Managing thermal performance in an embedded systems design comes down to managing power, because dissipated power manifests itself as heat. Fortunately, OMAP processors are very good at managing power, and can also scale up to provide laptop-like performance when needed.

OMAP processors were never designed to be used in a fully-on state all the time. If you did this, they would burn up quickly. OMAP was designed to be used in cell phones and tablets where short performance bursts and long periods of suspend are common.

OMAP is very good for small high-performance low-power multimedia handheld battery-powered devices that need this kind of burst scaling, yet draw tiny amounts of power at off-peak moments. As with most complex problems, complex solutions are required. Although TI OMAP processors are indeed complex, they may just be the hammer you need to tackle the trickiest power, performance, and thermal challenges in your next family of portable product designs.