United States Patent
Gary et al.

(10) Patent No.: US 7,155,617 B2
(45) Date of Patent: Dec. 26, 2006

(54) METHODS AND SYSTEMS FOR PERFORMING DYNAMIC POWER
     MANAGEMENT VIA FREQUENCY AND VOLTAGE SCALING

(75) Inventors: Scott P. Gary, Santa Barbara, CA (US);
     Robert J. Cyran, Delmont, PA (US);
     Vijaya B. P. Sarathy, Karnataka (IN)

(73) Assignee: Texas Instruments Incorporated, Dallas, TX (US)

(*) Notice: Subject to any disclaimer, the term of this patent is
    extended or adjusted under 35 U.S.C. 154(b) by 433 days.

(21) Appl. No.: 10/461,947
(22) Filed: Jun. 13, 2003

(65) Prior Publication Data
     US 2004/0025069 A1    Feb. 5, 2004

     Related U.S. Application Data
(60) Provisional application No. 60/400,426, filed on Aug. 1, 2002.

(51) Int. Cl.: G06F 1/32 (2006.01); G06F 13/10 (2006.01)
(52) U.S. Cl.: 713/300; 713/310; 713/320; 713/322; 703/21; 703/22
(58) Field of Classification Search: See application file for
     complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,201,059 A      4/1993   Nguyen
5,812,860 A      9/1998   Horden et al.
6,105,142 A *    8/2000   Goff et al.       713/324
6,131,166 A     10/2000   Wong-Insley
6,425,086 B1 *   7/2002   Clark et al.      713/322
6,519,707 B1 *   2/2003   Clark et al.      713/322
6,895,520 B1 *   5/2005   Altmejd et al.    713/324
6,927,605 B1 *   8/2005   Fetzer et al.     327/101
(Continued)

OTHER PUBLICATIONS

Jiong Luo, et al.; Battery-Aware Static Scheduling for Distributed
Real-Time Embedded System, Dept. of Electrical Eng., Princeton
Univ., Princeton, NJ, DAC 2001, Jun. 18-22, 2001, Las Vegas, NV,
US, 2001 ACM 1-58113-297-2/01/0006; 6 pgs.
(Continued)

Primary Examiner: Lynne H. Browne
Assistant Examiner: Michael J. Brown
(74) Attorney, Agent, or Firm: Robert D. Marshall, Jr.; W. James
     Brady; Frederick J. Telecky, Jr.

(57) ABSTRACT

Methods and systems are provided for dynamically managing the
power consumption of a digital system. These methods and systems
broadly provide for varying the frequency and voltage of one or
more clocks of a digital system upon request by an entity of the
digital system. An entity may request that the frequency of a clock
of the processor of the digital system be changed. After the
frequency is changed, the voltage point of the voltage regulator of
the digital system is automatically changed to the lowest voltage
point required for the new frequency if there is a single clock on
the processor. If the processor is comprised of multiple processing
cores with associated clocks, the voltage point is changed to the
lowest voltage point required by all frequencies of all clocks.

33 Claims, 5 Drawing Sheets

`
U.S. PATENT DOCUMENTS

2002/0083355 A1    6/2002   Clark et al.
2002/0188877 A1   12/2002   Buch

OTHER PUBLICATIONS

Seongsoo Lee, et al.; Run-Time Power Control Scheme Using
Software Feedback Loop for Low-Power Real-Time Applications,
Ctr. for Collaborative Research and Institute of Industrial Science,
Univ. of Tokyo, 2000 IEEE ISBN 0-7803-5974-7, pp. 381-386.

Luca Benini, et al.; System-Level Power Optimization: Techniques
and Tools, ISLPED99, San Diego, CA, USA, 1999 ACM
1-58113-133-X/99/0008, pp. 288-293.

Seongsoo Lee, et al.; Run-Time Voltage Hopping for Low-Power
Real-Time Systems, Ctr. for Collaborative Research and Institute of
Industrial Science, Univ. of Tokyo, 2000 ACM
1-58113-188-7/00/00006; pp. 806-809.

Tom Halfhill, Intel Spills the Beans About Banias - New Mobile
CPU and Chip Set Have Numerous Power-Saving Features,
Microprocessor, Special Expanded Issue Covering Microprocessor
Forum 2002, vol. 16, Archive 11, Nov. 2002, pp. 4-10.

Gang Quan, et al.; Energy Efficient Fixed-Priority Scheduling for
Real-Time Systems on Variable Voltage Processors, Dept. of
Computer Science and Eng., Univ. of Notre Dame, Notre Dame, IN,
USA, DAC 2001, Jun. 18-22, 2001, Las Vegas, NV, US, 2001 ACM
1-58113-297-2/01/0006; 6 pgs.

Texas Instruments Incorporated, Application Report: Analyzing
Target System Energy Consumption in Code Composer Studio™
IDE, Software Development Systems, pp. 1-12.

Dongkun Shin, et al.; Intra-Task Voltage Scheduling for Low-
Energy Hard Real-Time Applications, IEEE Design & Test of
Computers, Voltage Scheduling for Applications, Mar.-Apr. 2001,
0740-7475/01, 2001 IEEE, pp. 20-30.

Texas Instruments Incorporated, SCAA035B, Jun. 1997: CMOS
Power Consumption and Cpd Calculation, Software Development
Systems, 16 pgs.

Amit Sinha, et al.; Energy Efficient Real-Time Scheduling, Mass.
Inst. of Technology, Presentation, 19 pgs.

Intel, Intel® PCA Power Management, Software Design Guide,
Sep. 4, 2002, Rev. 1.0, 72 pgs.

Texas Instruments Incorporated, Application Report: Calculation of
TMS320LC54x Power Dissipation, Digital Signal Processing
Solutions, 1997, 62 pgs.

* cited by examiner

`
[Drawing Sheet 1 of 5. Recoverable block labels: APPLICATION,
APPLICATION, DRIVER.]

`
[Drawing Sheet 2 of 5. "PWRM - Power Manager Properties" dialog
(reference numerals 3000 to 3008) with tabs General, Idling,
V/F Scaling, and Sleep.
General tab: Enable PWRM Manager (3005); Call user hook function
at boot time (3006), Function: _OEM_turnOffAudioAmp; Reprogram
BIOS clock after frequency scaling (3007).
Idling tab: Idle DSP domains in the BIOS idle loop (3008), with
check boxes EMIF, CLKGEN, PERIPHS, CACHE, DMA, CPU.]

`
[Drawing Sheet 3 of 5. "PWRM - Power Manager Properties" dialog,
continued.
FIG. 3C, V/F Scaling tab: Initial frequency (index to frequency
table); Initial voltage (volts); Scale voltage along with
frequency; Wait while voltage is being scaled down.
Sleep tab: Enable deep sleep (3010), with check boxes EMIF,
CLKGEN, PERIPHS, CACHE, DMA, CPU; Wakeup interrupt mask, IER0:
0x0010; Wakeup interrupt mask, IER1: 0x0000; Enable sleep until
restart; Enable snooze mode (3011); Timer to be used for snooze
mode.]

`
`
[Drawing Sheet 4 of 5. FIG. 4, flowchart. Recoverable step labels:
BUILD APPLICATION; APPLICATION; MEASURE POWER; POWER OK?;
VISUALIZE POWER; EXAMINE PERIPHERAL ACTIVITY; ADJUST PERIPHERAL
ACTIVITY; EXAMINE CPU LOAD; DYNAMIC POWER MANAGEMENT.]

`
[Drawing Sheet 5 of 5. FIG. 5, block diagram (reference numerals
5000, 5002, 5004, 5012, 5014). Recoverable labels: POWER ANALYZER;
POWER DATA; TRIGGER DATA; IEEE 488 API PLUG-IN; IEEE 488 DRIVER;
I/O PIN; IEEE 488 CABLE; CURRENT PROBE.]

`
METHODS AND SYSTEMS FOR
PERFORMING DYNAMIC POWER
MANAGEMENT VIA FREQUENCY AND
VOLTAGE SCALING
`
This application claims priority to provisional application
Ser. No. 60/400,426 filed Aug. 1, 2002 (TI-34977PS). This
application is related to copending applications Ser. No.
10/461,289 entitled Methodology for Coordinating and Tuning
Application Power (TI-35526) and Ser. No. 10/461,025
entitled Methodology for Managing Power Consumption in
an Application (TI-35525).
`
`FIELD OF THE INVENTION
`
This invention generally relates to software development
systems, and more specifically to improvements in software
support for power management in systems and applications.
`
`BACKGROUND OF THE INVENTION
`
Power efficiency is a key requirement across a broad range of
systems, from small portable devices to rack-mounted processor
farms. Even in systems where high performance is key, power
efficiency is still a concern. Power efficiency is determined both
by hardware design and component choice and by software-based
runtime power management techniques.

In wired systems, power efficiency will typically enable a
reduction in power supply capacity, as well as a reduction in
cooling requirements and fan noise, and ultimately product cost.
Power efficiency can allow an increase in component density as
well. For example, a designer may be limited in the number of
processors that can be placed on a board simply because the
cumulative power consumption would exceed compliance limits for
the bus specification. Increased component density can result in
increased capacity, a reduction in product size, or both.

In mobile devices, power efficiency means increased battery life
and a longer time between recharges. It also enables selection of
smaller batteries, possibly a different battery technology, and a
corresponding reduction in product size.

Power efficiency is a key product differentiator. A simple example
is a buyer shopping for an MP3 player at an electronics store. In a
side-by-side comparison of two players with the same features, the
decision will likely go to the player with the longest time between
recharges. In many scenarios, the success or failure of a product
in its marketplace will be determined by its power efficiency.

The total power consumption of a CMOS circuit is the sum of both
active and static power consumption:
$P_{total} = P_{active} + P_{static}$. Active power consumption
occurs when the circuit is active, switching from one logic state
to another. Active power consumption is caused both by switching
current (that needed to charge internal nodes) and by through
current (that which flows when both P- and N-channel transistors
are momentarily on). Active power consumption can be approximated
by the equation
$P_{transient} = C_{pd} \times F \times V_{cc}^{2} \times N_{sw}$,
where $C_{pd}$ is the dynamic capacitance, $F$ is the switching
frequency, $V_{cc}$ is the supply voltage, and $N_{sw}$ is the
number of bits switching. An additional relationship is that
voltage ($V_{cc}$) determines the maximum switching frequency
($F$) for stable operation. The important concepts here are: 1) the
active power consumption is linearly related to switching
frequency, and quadratically related to the supply voltage, and
2) the maximum switching frequency is determined by the supply
voltage.
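
The linear and quadratic relationships described above can be illustrated with a minimal sketch. The function name and every numeric operating point below are hypothetical values chosen for illustration only; they are not taken from this disclosure or from any datasheet.

```python
def active_power(c_pd, freq_hz, v_cc, n_sw):
    """Approximate active power: C_pd * F * V_cc^2 * N_sw (in watts)."""
    return c_pd * freq_hz * v_cc ** 2 * n_sw


# Hypothetical operating points (illustrative values only).
full = active_power(c_pd=20e-12, freq_hz=200e6, v_cc=1.6, n_sw=32)
half_f = active_power(c_pd=20e-12, freq_hz=100e6, v_cc=1.6, n_sw=32)
half_f_low_v = active_power(c_pd=20e-12, freq_hz=100e6, v_cc=1.2, n_sw=32)

# Halving the frequency alone halves the active power (linear term).
ratio_frequency_only = half_f / full
# Also lowering the supply from 1.6 V to 1.2 V multiplies the power by
# (1.2 / 1.6)^2, the quadratic term.
ratio_frequency_and_voltage = half_f_low_v / full

print(ratio_frequency_only, ratio_frequency_and_voltage)
```

Under these assumed values, the combined frequency and voltage reduction leaves a little over a quarter of the original active power, whereas the frequency reduction alone leaves half.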

If an application can reduce the CPU clock rate and still meet its
processing requirements, it can have a proportional savings in
power dissipation. Because active power is also quadratically
related to the supply voltage, if the frequency can be reduced
safely, and this frequency is compatible with a lower operating
voltage available on the platform, then in addition to the savings
due to the reduced clock frequency, a potentially significant
additional savings can occur by reducing the voltage. However, it
is important to recognize that for a given task set, reducing the
CPU clock rate also proportionally extends the execution time of
the same task set, requiring careful analysis of the application
to ensure that it still meets its real-time requirements. The
potential savings provided by dynamic voltage and frequency
scaling (DVFS) have been extensively studied in academic
literature, with emphasis on ways to reduce the scaling latencies,
improve the voltage scaling range, and schedule tasks so that
real-time deadlines can still be met. For example, see Run-time
Power Control Scheme Using Software Feedback Loop for Low-Power
Real-time Applications, IEEE ISBN 0-7803-5974-7, Seongsoo Lee,
Takayasu Sakurai, 2000; Intra-Task Voltage Scheduling for
Low-Energy Hard Real-Time Applications, IEEE Design & Test of
Computers, Dongkun Shin, Jihong Kim, Seongsoo Lee, 2001; and
Run-time Voltage Hopping for Low-power Real-time Systems,
DAC2000, ACM 1-58113-188-7, Seongsoo Lee, Takayasu Sakurai, 2000.
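
The trade-off between a lower clock rate and a longer execution time can be sketched as a simple deadline check. This is a minimal illustration under stated assumptions: the function names are hypothetical, the cycle count is assumed to be independent of frequency, and the latency of the scaling operation itself is ignored.

```python
def scaled_execution_time(cycles, freq_hz):
    """Execution time in seconds for a fixed cycle count at a given clock rate."""
    return cycles / freq_hz


def lowest_safe_frequency(cycles, deadline_s, frequencies_hz):
    """Return the lowest listed frequency that still meets the deadline, or None."""
    for freq in sorted(frequencies_hz):
        if scaled_execution_time(cycles, freq) <= deadline_s:
            return freq
    return None  # no available frequency meets the deadline


# Hypothetical task: 1.2 million cycles, 10 ms deadline.
choice = lowest_safe_frequency(1_200_000, 0.010, [200e6, 50e6, 150e6, 100e6])
print(choice)
```

With these assumed numbers the 100 MHz setting would take 12 ms and miss the deadline, so the sketch selects 150 MHz. A real analysis would also have to budget for the scaling latency discussed in Table 3.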

Static power consumption is one component of the total power
consumption equation. Static power consumption occurs even when
the circuit is not switching, due to reverse-bias leakage.
Traditionally, the static power consumption of a CMOS circuit has
been very small in comparison to the active power consumption.
Embedded applications will typically idle the CPU clock during
inactivity to eliminate active power, which dramatically reduces
total power consumption. However, new higher-performance
transistors are bringing significant boosts in leakage currents,
which requires new attention to the static power consumption
component of the total power equation.

There are many known techniques utilized both in hardware design
and at run-time to help reduce power dissipation. Table 1 lists
some up-front hardware design decisions for reducing power
dissipation. Table 2 lists common techniques employed at run-time
to reduce power dissipation. Table 3 lists some fundamental
challenges to utilizing these power management techniques in
real-time systems.
`
TABLE 1

Decision: Choose a low-power technology base
Description: Choosing a power-efficient process (e.g., CMOS) is
perhaps the most important up-front decision, and directly drives
power efficiency.

Decision: Partition separate voltage and clock domains
Description: By partitioning separate domains, different components
can be wired to the appropriate power rail and clock line,
eliminating the need for all circuitry to operate at the maximum
required by any specific module.

Decision: Enable scaling of voltage and frequency
Description: Designing in programmable clock generators allows
application code a linear savings in power when it can scale down
the clock frequency. A programmable voltage source allows the
potential for an additional quadratic power savings when the
voltage can be reduced as well, because of reduced frequency. Also,
designing the hardware to minimize scaling latencies will enable
broader usage of the scaling technique.

Decision: Enable gating of different voltages to modules
Description: Some static RAMs require less voltage in retention
mode vs. normal operation mode. By designing in voltage gating
circuitry, power consumption can be reduced during inactivity while
still retaining state.

Decision: Utilize interrupts to alleviate polling by software
Description: Often software is required to poll an interface
periodically to detect events. For example, a keypad interface
routine might need to spin or periodically wake to detect and
resolve a keypad input. Designing the interface to generate an
interrupt on keypad input will not only simplify the software, but
it will also enable event-driven processing and activation of
processor idle and sleep modes while waiting for interrupts.

Decision: Reduce loading of outputs
Description: Decreasing capacitive and DC loading on output pins
will reduce total power consumption.

Decision: Use hierarchical memory model
Description: Depending on the application, utilizing cache and
instruction buffers can drastically reduce off-chip memory accesses
and subsequent power draw.

Decision: Boot with resources un-powered
Description: Many systems boot in a fully active state, meaning
full power consumption. If certain sub-systems can be left
un-powered on boot, and later turned on when really needed, it
eliminates unnecessary wasted power.

Decision: Minimize number of active phase lock loops (PLL)
Description: Using shared clocks can reduce the number of active
clock generators, and their corresponding power draw. For example,
a processor's on-board PLL can be bypassed in favor of an external
clock signal.

Decision: Use clock dividers for fast selection of an alternate
frequency
Description: A common barrier to highly dynamic frequency scaling
is the latency of re-locking a PLL on a frequency change. Adding a
clock divider circuit at the output of the PLL will allow
instantaneous selection of a different clock frequency.

TABLE 2

Technique: Gate clocks off when not needed
Description: As described above, active power dissipation in a CMOS
circuit occurs only when the circuit is clocked. By turning off
clocks that are not needed, unnecessary active power consumption is
eliminated. Most processors incorporate a mechanism to temporarily
suspend active power consumption in the CPU while waiting for an
external event. This idling of the CPU clock is typically triggered
via a 'halt' or 'idle' instruction, called during application or OS
idle time. Some processors partition multiple clock domains, which
can be individually idled to suspend active power consumption in
unused modules. For example, in the Texas Instruments TMS320C5510
DSP, six separate clock domains (CPU, cache, DMA, peripheral
clocks, clock generator, and external memory interface) can be
selectively idled.

Technique: Activate peripheral low-power modes
Description: Some peripherals have built-in low power modes that
can be activated when the peripheral is not immediately needed. For
example, a device driver managing a codec over a serial port can
command the codec to a low power mode when there is no audio to be
played, or if the whole system is being transitioned to a low-power
mode.

Technique: Leverage peripheral activity detectors
Description: Some peripherals have built-in activity detectors that
can be programmed to power down the peripheral after a period of
inactivity. For example, a disk drive can be automatically spun
down when the drive is not being accessed, and spun back up when
needed again.

Technique: Utilize auto-refresh modes
Description: Dynamic memories and displays will typically have a
self or auto-refresh mode where the device will efficiently manage
the refresh operation on its own.

Technique: On boot actively turn off unnecessary power consumers
Description: Processors typically boot up fully powered, at a
maximum clock rate, ready to do work. There will inevitably be
resources powered that are not needed yet, or that may never be
used in the course of the application. At boot time, the
application or OS may traverse the system, turning off or idling
unnecessary power consumers.

Technique: Gate power to subsystems only as needed
Description: A system may include a power-hungry module that need
not be powered at all times. For example, a mobile device may have
a radio subsystem that only needs to be ON when in range of the
device with which it communicates. By gating power OFF/ON on
demand, unnecessary power dissipation can be avoided.

Technique: Benchmark application to find minimum required frequency
and voltages
Description: Typically, systems are designed with excess processing
capacity built in, either for safety purposes, or for future
extensibility and upgrades. For the latter case, a common
development technique is to fully exercise and benchmark the
application to determine excess capacity, and then 'dial-down' the
operating frequency and voltage to that which enables the
application to fully meet its requirements, but minimizes excess
capacity. Frequency and voltage are usually not changed at runtime,
but are set at boot time, based upon the benchmarking activity.

Technique: Adjust CPU frequency and voltage based upon gross
activity
Description: Another technique for addressing excess processing
capacity is to periodically sample CPU utilization at runtime, and
then dynamically adjust the frequency and voltage based upon the
empirical utilization of the processor. This "interval-based
scheduling" technique improves on the power-savings of the previous
static benchmarking technique because it takes advantage of the
dynamic variability of the application's processing needs.

Technique: Dynamically schedule CPU frequency and voltage to match
predicted work load
Description: The "interval-based scheduling" technique enables
dynamic adjustments to processing capacity based upon history data,
but typically does not do well at anticipating the future needs of
the application, and is therefore not acceptable for systems with
hard real-time deadlines. An alternate technique is to dynamically
vary the CPU frequency and voltage based upon predicted workload.
Using dynamic, fine-grained comparison of work completed vs. the
worst-case execution time (WCET) and deadline of the next task, the
CPU frequency and voltage can be dynamically tuned to the minimum
required. This technique is most applicable to specialized systems
with data-dependent processing requirements that can be accurately
characterized. Inability to fully characterize an application
usually limits the general applicability of this technique. Study
of efficient and stable scheduling algorithms in the presence of
dynamic frequency and voltage scaling is a topic of much on-going
research.

Technique: Optimize execution speed of code
Description: Developers often optimize their code for execution
speed. However, in many situations the speed may be good enough,
and further optimizations are not considered. When considering
power consumption, faster code will typically mean more time for
leveraging idle or sleep modes, or a greater reduction in the CPU
frequency requirements. In some situations, speed optimizations may
actually increase power consumption (e.g., more parallelism and
subsequent circuit activity), but in others, there may be power
savings.

Technique: Use low-power code sequences and data patterns
Description: Different processor instructions exercise different
functional units and data paths, resulting in different power
requirements. Additionally, because of data bus line capacitances
and the inter-signal capacitances between bus lines, the amount of
power required is affected by the data patterns that are
transferred over the data buses. The power requirements are also
affected by the signaling patterns chosen (1s vs. 0s) for external
interfaces (e.g., serial ports). Analyzing the effects of
individual instructions and data patterns is an extreme technique
that is sometimes used to maximize power efficiency.

Technique: Scale application and OS footprint based upon minimal
requirements
Description: Architecting application and OS code bases to be
scalable can reduce memory requirements and, therefore, the
subsequent runtime power requirements. For example, by simply
placing individual functions or APIs into individual linkable
objects, the linker can link in only the code/data needed and avoid
linking dead code/data.

Technique: Use code overlays to reduce fast memory requirements
Description: For some applications, dynamically overlaying code
from non-volatile to fast memory will reduce both the cost and
power consumption of additional fast memory.

Technique: Trade off accuracy vs. power consumption
Description: Accepting less accuracy in some calculations can
drastically reduce processing requirements. For example, certain
signal processing applications can tolerate more noise in the
results, which enables reduced processing and reduced power
consumption.

Technique: Enter a reduced capability mode on a power change
Description: When there is a change in the capabilities of the
power source, e.g., when going from AC to battery power, a common
technique is to enter a reduced capability mode with more
aggressive runtime power management. A typical example is a laptop
computer, where the OS is notified on a switch to battery power,
and activates a different power management policy, with a lower CPU
clock rate, a shorter timeout before the screen blanks or the disk
spins down, etc. The OS power policy implements a tradeoff between
responsiveness and extending battery life. A similar technique can
be employed in battery-only systems, where a battery monitor
detects reduced capacity, and activates more aggressive power
management, such as slowing down the CPU, not enabling image
viewing on the digital camera's LCD display, etc.

TABLE 3

Challenge: Scaling CPU frequency with workload often affects
peripherals
Description: For many processors the same clock that feeds the CPU
also feeds on-chip peripherals, so scaling the clock based upon CPU
workload can have side effects on peripheral operation. The
peripherals may need to be reprogrammed before and/or after the
scaling operation, and this may be difficult if a pre-existing (non
power-aware) device driver is being used to manage the peripheral.
Additionally, if the scaling operation affects the timer generating
the OS system tick, this timer will need to be adapted to follow
the scaling operation, which will affect the absolute accuracy of
the time base.

Challenge: V/F scaling latencies can be large, and
platform-dependent
Description: The latency for voltage and frequency scaling
operations will vary widely across platforms. An application that
runs fine on one platform may not be portable to another platform,
and may not run on a revision to the same platform if the latencies
change much. For example, the time for a down-voltage scaling
operation is typically load-dependent, and if the load changes
significantly on the revised platform the application may not run
correctly.

Challenge: Might not have stable operation during V/F scaling
Description: Some processor vendors specify a non-operation
sequence during voltage or clock frequency changes to avoid
instabilities during the transition. In these situations, the
scaling code will need to wait for the transition to occur before
returning, increasing the scaling latency.

Challenge: V/F scaling directly affects ability to meet deadlines
Description: Changing CPU frequency (and voltage when possible)
will alter the execution time of a given task, potentially causing
the task to miss a real-time deadline. Even if the new frequency is
compatible with the deadline, there may still be a problem if the
latency to switch between V/F setpoints is too big.

Challenge: Scaling the CPU clock can affect ability to measure CPU
utilization
Description: If the clock that feeds the CPU also feeds the OS
timer, the OS timer will be scaled along with the CPU, which
compromises measurement of CPU utilization.

Challenge: Watchdogs still need to be kept happy
Description: Watchdog timers are used to detect abnormal program
behavior and either shut down or reboot a system. Typically the
watchdog needs to be serviced within a pre-defined time interval to
keep it from triggering. Power management techniques that slow down
or suspend processing can therefore inadvertently trigger
application failure.

Challenge: Idle and sleep modes typically collide with emulation,
debug, and instrumentation
Description: Depending upon the processor and the debug tools,
invoking idle and sleep modes can disrupt the transport of
real-time instrumentation and debugging information from the
target. In the worst case it may perturb and even crash the debug
environment. Similar concerns arise with V/F scaling, which may
cause difficulty for the emulation and debug circuitry. It may be
the case that power management is enabled when the system is
deployed, but only minimally used during development.

Challenge: Context save/restore can become non-trivial
Description: In a non-power managed environment the OS or
application framework will typically save and restore register
values during a context switch. As register banks, memories, and
other modules are powered OFF and back ON, the context to be saved
and restored can grow dramatically. Also, if a module is powered
down it may be difficult (and sometimes not possible) to fully
restore the internal state of the module.

Challenge: Most advanced power management techniques are still in
the research stage
Description: Many of the research papers that demonstrate
significant power savings use highly specialized application
examples, and do not map well to general application cases. Or,
they make assumptions regarding the ability to fully characterize
an application such that it can be guaranteed to be schedulable.
These techniques often do not map to 'real world', multi-function
programmable systems, and more research is needed for broader
applicability.

Challenge: Different types of applications call for different
techniques
Description: Different hardware platforms have varying levels of
support for the above listed techniques. Also, different
applications running on the same platform may have different
processing requirements. For some applications, only the
low-latency techniques (e.g., clock idling) are applicable, but for
others the higher-latency techniques can be used to provide
significant power savings when the application switches between
modes with significantly different processing requirements. For
example, one mode can be run at low V/F, and another mode, with
higher processing requirements, can be run at a higher V/F. If the
V/F latency is compatible with the mode switch time, the
application can use the technique.
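
The first challenge in Table 3, adapting the OS tick timer after a frequency change, can be illustrated with a short sketch. It assumes a hypothetical down-counting timer clocked from the scaled clock; the function names and clock values are illustrative only and do not describe any particular device.

```python
def timer_period_counts(timer_clock_hz, tick_hz):
    """Reload value that keeps the OS tick rate constant at a given timer clock."""
    return round(timer_clock_hz / tick_hz)


def tick_rate_error(timer_clock_hz, tick_hz):
    """Relative error of the achieved tick rate caused by integer rounding."""
    counts = timer_period_counts(timer_clock_hz, tick_hz)
    achieved_hz = timer_clock_hz / counts
    return (achieved_hz - tick_hz) / tick_hz


# A 1 kHz system tick before and after halving a hypothetical 200 MHz clock.
counts_fast = timer_period_counts(200e6, 1000)
counts_slow = timer_period_counts(100e6, 1000)
# A clock that does not divide evenly leaves a small residual error.
residual = tick_rate_error(100e6 / 3, 1000)
print(counts_fast, counts_slow, residual)
```

The residual error in the last case is one way the absolute accuracy of the time base can be affected when the timer follows the scaling operation.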
`
SUMMARY OF THE INVENTION

The present invention provides methods and systems for dynamically
managing power consumption in a digital system. Embodiments of the
invention permit entities of the digital system to vary the
frequency and voltage used to power a processor of the digital
system. The processor of the digital system may be comprised of a
single processing core with a single clock or a multiple of
processing cores, each with its own clock. An entity may request
that the current frequency of a clock be changed to a new frequency
and the change will be made. The voltage point of the voltage
regulator in the digital system is then automatically changed to
the minimum voltage required by that frequency if there is a single
clock. If there are multiple clocks in the processor, the voltage
point is automatically changed to the minimum voltage required by
all frequencies of all clocks. An entity may request a frequency
change for one clock or a multiple of clocks in a single request.

Digital systems are provided operable to dynamically manage power
consumption by providing for scaling of frequency and voltage
during operation of the digital systems. Embodiments of such
digital systems comprise a processor, a voltage regulator, and a
memory storing a power scaling library executable to enable any
entity of the digital system to change the frequency and voltage of
the digital system. The power scaling library is operable to cause
the voltage to be automatically changed to the minimum voltage
required by a frequency when an entity requests a frequency change.
The power scaling library may be further executable to enable an
entity to obtain a current frequency of the clock, to obtain a
current voltage of the voltage regulator, to obtain all valid
frequencies of the clock, to obtain a minimum required voltage for
each of the valid frequencies of the clock, and to obtain a maximum
latency for a change from a first frequency and voltage to a second
frequency and voltage.

In other embodiments, the processor of the digital system comprises
multiple processing cores, each with an associated clock. In such
embodiments, the power scaling library is executable to permit an
entity to cause the frequencies of two or more clocks to be
changed. The power scaling library is also executable to
automatically change the voltage to the minimum voltage required by
all frequencies of all clocks when the frequencies of the two or
more clocks are changed.

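The behavior summarized above can be sketched as a small model. This is an illustrative sketch only and is not the power scaling library of the embodiments: the class name, method names, clock names, and setpoint tables are all hypothetical, and the ordering of voltage and frequency writes when a frequency is raised is an assumption that is not stated in this summary.

```python
class PowerScaler:
    """Toy model of a frequency/voltage scaling library. All names are hypothetical."""

    def __init__(self, tables, initial, latency_us):
        # tables: {clock name: {frequency in Hz: minimum voltage in volts}}
        # initial: {clock name: starting frequency in Hz}
        # latency_us: assumed worst-case time for any setpoint change
        self.tables = tables
        self.freqs = dict(initial)
        self.latency_us = latency_us
        self.voltage = self._required_voltage(self.freqs)
        self.log = []  # order of simulated hardware writes

    def _required_voltage(self, freqs):
        # A single regulator must satisfy the most demanding clock.
        return max(self.tables[clock][f] for clock, f in freqs.items())

    # Query functions mirroring the capabilities listed in the summary.
    def get_frequency(self, clock):
        return self.freqs[clock]

    def get_voltage(self):
        return self.voltage

    def valid_frequencies(self, clock):
        return sorted(self.tables[clock])

    def min_voltage(self, clock, freq_hz):
        return self.tables[clock][freq_hz]

    def max_latency_us(self):
        return self.latency_us

    def change_frequencies(self, requests):
        """Change one or more clocks, then settle the voltage automatically."""
        for clock, f in requests.items():
            if f not in self.tables.get(clock, {}):
                raise ValueError(f"unsupported setpoint {f} for clock {clock!r}")
        target = {**self.freqs, **requests}
        needed = self._required_voltage(target)
        if needed > self.voltage:
            # Assumption: raise the voltage before speeding up a clock.
            self._set_voltage(needed)
        for clock, f in requests.items():
            self.freqs[clock] = f
            self.log.append(("freq", clock, f))
        if needed < self.voltage:
            self._set_voltage(needed)  # lower only after the clocks are slower

    def _set_voltage(self, volts):
        self.voltage = volts
        self.log.append(("volt", volts))


# Hypothetical two-clock processor sharing one voltage regulator.
tables = {
    "cpu": {50e6: 1.1, 100e6: 1.2, 200e6: 1.6},
    "dsp": {60e6: 1.1, 120e6: 1.3},
}
scaler = PowerScaler(tables, {"cpu": 200e6, "dsp": 120e6}, latency_us=300)
scaler.change_frequencies({"cpu": 50e6})                # dsp still needs 1.3 V
voltage_after_cpu_slowdown = scaler.get_voltage()
scaler.change_frequencies({"cpu": 100e6, "dsp": 60e6})  # two clocks, one request
voltage_after_both = scaler.get_voltage()
```

In this sketch, slowing only the `cpu` clock cannot drop the shared voltage below what the `dsp` clock still requires, which corresponds to the rule that the voltage is the minimum required by all frequencies of all clocks.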
`BRIEF DESCRIPTION OF THE DRAWINGS
`
Particular embodiments in accordance with the invention will now be
described, by way of example only, and with reference to the
accompanying drawings:
FIG. 1 presents a logical architecture of an embodiment of a system
that permits applications to utilize power management techniques
compatible with application requirements;
FIG. 2 illustrates a registration and notification method of a
power management system;
FIGS. 3A-3D illustrate the static configuration process of an
embodiment of a power management system;
FIG. 4 illustrates a method for application development that
includes developing a power management strategy for the
application; and
FIG. 5 presents an embodiment of a minimally intrusive system for
power profiling of an embedded application that enables the method
of FIG. 4.
Corresponding numerals and symbols in the different figures and
tables refer to corresponding parts unless otherwise indicated.
`
`DETAILED DESCRIPTION OF EMBODIMENTS
`OF THE INVENTION
`
The present invention provides systems and methods to permit
application developers to select and utilize power management
techniques that are compatible with specific application
requirements. Although these systems and methods are described
below in relation to a real-time operating system (RTOS), they may
be easily adapted by one skilled in the art to other operating
systems or application environments without an operating system.

FIG. 1 presents a logical architecture of an embodiment of a system
that permits applications to utilize power management techniques
compatible with application requirements. Power management module
(PWRM) 1000 is added to an application comprising real-time
operating system (RTOS) 1002, processor 1004, and various device
drivers 1006. Conceptually, PWRM 1000 is an adjunct to RTOS 1002.
PWRM 1000 does not operate as another task in RTOS 1002. Instead,
PWRM 1000 provides a set of application program interfaces (APIs)
that execute in the context of application control threads and
device drivers.

The capabilities of real-time operating systems are well known to
those skilled in the art. One representative example of an RTOS is
DSP/BIOS from Texas Instruments Incorporated. DSP/BIOS is a
scalable, instrumented real-time kernel for digital signal
processors. The kernel is optimized for resource-constrained,
real-time embedded
`applicati