Mittal et al.

US005719800A
[11] Patent Number: 5,719,800
[45] Date of Patent: Feb. 17, 1998

[54] PERFORMANCE THROTTLING TO REDUCE IC POWER CONSUMPTION

[75] Inventors: Millind Mittal, South San Francisco, Calif.;
     Robert Valentine, Qiryat Tivon, Israel

[73] Assignee: Intel Corporation, Santa Clara, Calif.

[21] Appl. No.: 497,853
[22] Filed: Jun. 30, 1995

[51] Int. Cl.6: G06F 1/32
[52] U.S. Cl.: 364/707; 395/750
[58] Field of Search: 364/707, 395/750

[56] References Cited

U.S. PATENT DOCUMENTS
5,504,907   4/1996  Steward et al. ... 395/750
5,511,203   4/1996  Wisor
5,539,681   7/1996  Alexander et al.
5,546,591   8/1996  Wurzburg et al.
5,557,551   9/1996  Craft
5,560,0e0   9/1996  Nakatani et al.
5,576,738  11/1996  Anwyl et al.
5,579,524  11/1996  Kikinis ... 395/707

Primary Examiner: David H. Malzahn
Attorney, Agent, or Firm: Blakely, Sokoloff, Taylor & Zafman
`
[57] ABSTRACT

The power consumed within an integrated circuit (IC) is reduced
without substantial impact on its performance for typical
applications by throttling the performance of particular functional
units within the IC. Artificial worst-case power consumption is
reduced by throttling down the activity levels of long-duration
sequences of high-power operations. The recent utilization levels of
particular functional units within an IC are monitored, for example,
by computing each functional unit's average duty cycle over its
recent operating history. If this activity level is greater than a
threshold, then the functional unit is operated in a reduced-power
mode. The threshold value is set large enough to allow short bursts
of high utilization to occur without impacting performance. The
invention allows an integrated circuit to dynamically make the
tradeoff between high-speed operation and low-power operation, by
throttling back performance of localized functional units when their
utilization exceeds a sustainable level. Additionally, this dynamic
power/speed tradeoff can be optimized across multiple functional
units within an IC or among multiple ICs within a system.
Additionally, this dynamic power/speed tradeoff can be altered by
providing software control over throttling parameters.
`
`32 Claims, 5 Drawing Sheets
`
[Front-page drawing. Recoverable labels: functional unit 105; current
activity information 108; activity monitor 106; activity level 109;
mode controller 107; mode control signal 110.]
`
`ADVANCED MICRO DEVICES, INC.
`Exh. 2010
`LG ELECTRONICS, INC. v. ADVANCED MICRO DEVICES, INC.
`IPR2015-00324
`
`Page 1 of 15
`
`
`
[Sheet 1 of 5: FIG. 1(a) and FIG. 1(b). The drawing text did not
survive extraction.]
`
`
`
[Sheet 2 of 5: FIG. 2. Recoverable labels: floating point unit;
up/down counter with up/down control input; divide by 2 (202); system
clock (201).]
`
`
`
[Sheet 3 of 5: FIG. 3. Recoverable labels: active increment register
304; inactive decrement register 305; threshold register 306; memory
address requested; cache active 316; activity-level register 309;
address miss 303; reduce power 312; comparator 310; 314; access
external memory 315.]
`
`
`
[Sheet 4 of 5: FIG. 4. The drawing text did not survive extraction.]
`
`
`
[Sheet 5 of 5: FIG. 5. The drawing text did not survive extraction.]
`
`
`
`PERFORMANCE THROTTLING TO REDUCE
`IC POWER CONSUMPTION
`
`FIELD OF THE INVENTION
`
The invention relates generally to reducing the power consumption of
Integrated Circuits (ICs), and particularly of Very Large Scale
Integration (VLSI) ICs. In particular, it relates to methods and
apparatus for throttling the performance of particular functional
units within an IC as needed to control worst-case power consumption.
`
BACKGROUND OF THE INVENTION

Reducing the power consumed by an IC has significant advantages:
(1) less power must be supplied to the IC; and (2) less heat must be
dissipated by the IC and the devices surrounding it. Reducing power
consumption is especially important when an IC is going to be used in
a portable computing device, such as a hand-held or notebook-size
digital device.

Portable devices often operate for extended periods of time using
only the power supplied by an internal battery. Because the size,
weight and storage capacity of a portable battery are very limited,
conserving power is critical in portable devices. The less power its
ICs consume, the longer the portable device can operate without
changing or recharging its batteries.

Further, portable devices generally must dissipate the heat that
their components generate without the assistance of the mechanical
heat sinks or radiators and cooling fans that can easily be used in a
desk-top or rack-mount computer system. When the ICs within a
portable device consume less power, it operates at a lower
temperature. Elevated temperatures within a computing device can make
its components operate unreliably or have shorter lifetimes.

The power consumed by an IC can be reduced by lowering the speed at
which it operates. For an IC fabricated using CMOS technology, which
dominates the manufacture of commercial ICs, the power the IC
consumes is directly proportional to both its clock rate and its
operating voltage. If either clock rate or voltage is reduced, then
the power consumed is reduced. Reducing the voltage also requires
lowering the clock rate, unless an offsetting improvement is made in
the manufacturing technology.

Because typically a fixed number of clock cycles is required to
perform a particular operation, approaches to reducing IC power
consumption that reduce the clock rate of the IC unfortunately also
reduce performance. Thus, there is a need to reduce the power
consumed by an IC without reducing its performance.

For many complex ICs, the power consumed varies widely with the task
that they are performing. If more of the circuit nodes within the IC
transition from one to zero or vice versa, then more power is
consumed. Thus, in order to specify the typical power consumption of
a particular IC, it is necessary to define a benchmark sequence of
operations that constitutes its typical usage. Such a benchmark would
likely include substantial amounts of idle time, because computing
devices designed for interactive use spend a large percentage of time
waiting for user input. Once such a benchmark suite of typical
operations is defined, then the power consumed by an IC in performing
those operations can be measured or estimated. Such a typical power
consumption value would be useful, for example, in estimating the
battery life of a portable computing device under normal use.

It is desirable to reduce the power consumed by an IC by reducing or
eliminating node transitions in functional units within the IC that
are not being used during a particular sequence of operations. If an
IC shuts down functional units when they are not being used, then
typical power consumption can be significantly reduced with little or
no impact on performance.

However, shutting down functional units is likely to have little
impact on worst-case power consumption, which often arises when the
IC is performing sequences of operations that utilize many of the
functional units within the IC. Worst-case power consumption is
likely to be substantially higher than typical power consumption.

Often particular functional units or logic blocks within an IC can be
identified that tend to consume a disproportionate share of the IC's
power, for example, the circuitry in a microprocessor that performs
floating-point arithmetic. The power consumed by a microprocessor is
significantly less if it is not called on to perform many
floating-point operations. The worst-case power consumption of a
microprocessor might involve a sequence of floating-point operations
that operates on data values chosen to maximize node transitions from
one to zero and vice versa, and that executes repeatedly using cache
memory within the microprocessor so as to avoid reading or writing
main memory. Additionally, if the microprocessor performs speculative
evaluations of upcoming operations based on predicting which way a
branch operation will go, power consumption would be increased by
increasing the percentage of branch operations for which the
microprocessor's prediction is accurate. This is because an
inaccurate prediction flushes the instruction-execution pipeline,
thus leaving some functional units idle as the pipeline refills.

The designer of the system in which the IC is to be used must know
what the maximum power consumed by the IC will be for any possible
sequence of operations. In order to make a system that incorporates
an IC robust, the IC's maximum worst-case power must be known and
specified. Reducing the worst-case power consumption of an IC is very
important for reliability purposes, for heat dissipation purposes and
for power-supply capacity purposes. Thus, there is a need to reduce
the worst-case power consumed by an IC with little or no reduction in
performance.

A worst-case sequence of operations, as described above, is important
for estimating worst-case power consumption, which is essential for
the above-mentioned purposes. But such a sequence can be considered
artificial, i.e., it may not be encountered in practical applications
of a microprocessor. For example, it is artificial to use a
worst-case power sequence based on lots of floating-point
computations in rating a microprocessor to be used in a portable
computing device where floating-point operations are infrequently
used. It may not be important in typical applications of portable
computing devices that long sequences of floating-point arithmetic be
performed at maximum speed.

If the performance of typical operations is maintained, then it may
be acceptable to throttle back the performance of less typical or
artificial sequences of operations for the sake of reducing power.
Thus, there is a need to reduce the worst-case power consumed by an
IC without reducing performance for normal applications.
`
`SUMMARY OF THE INVENTION
A novel method and apparatus for controlling power consumption within
an IC reduces worst-case power consumption without substantially
lowering performance for typical applications. Worst-case power
consumption is reduced by throttling down the activity levels of
long-duration sequences of high-power operations.

Within any IC, a number of particular functional units can consume
inordinate amounts of power. For example, floating-point arithmetic
units and cache memories are two types of functional units within a
microprocessor IC that can consume substantial amounts of power. The
invention allows IC designers to identify any number of such
high-power functional units within the IC they are designing, and
place each under the control of its own power controller.

Further, the invention allows IC designers to place the IC they are
designing as a whole under the control of an overall power
controller. In the case of a microprocessor IC, the power consumption
as a whole can effectively be throttled by lowering either the
instruction retirement rate or the instruction issue rate.

In one embodiment, the power controller comprises an activity monitor
and a mode controller. The activity monitor tracks the recent
utilization level of a particular functional unit within the IC, for
example, by computing its average duty cycle over its recent
operating history. If this activity level is greater than a
threshold, then the mode controller switches the functional unit to
operate in a reduced-power mode. The threshold value is set large
enough to allow short bursts of high utilization to occur without
impacting performance.
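
As an illustration only, the activity monitor and mode controller of this embodiment can be sketched in software. The class name, the window length, and the choice to treat cycles before the start of the history as idle are assumptions made for the sketch; the text does not prescribe them, and a hardware implementation would look different.

```python
from collections import deque


class WindowedPowerController:
    """Hypothetical sketch: an activity monitor plus a mode controller."""

    def __init__(self, window: int, threshold: float):
        # Recent per-cycle activity flags (1 = busy, 0 = idle).
        self.history = deque(maxlen=window)
        # Duty cycle above which the unit is throttled (assumed parameter).
        self.threshold = threshold
        self.mode = "normal"

    def activity_level(self) -> float:
        # Average duty cycle over the window; cycles not yet recorded
        # count as idle, so a single busy cycle cannot trigger throttling.
        return sum(self.history) / self.history.maxlen

    def step(self, active: bool) -> str:
        # Activity monitor: record whether the unit was busy this cycle.
        self.history.append(1 if active else 0)
        # Mode controller: compare the activity level with the threshold.
        if self.activity_level() > self.threshold:
            self.mode = "reduced"
        else:
            self.mode = "normal"
        return self.mode


controller = WindowedPowerController(window=4, threshold=0.5)
pattern = [True, False, True, True, True, False, False, False]
modes = [controller.step(busy) for busy in pattern]
```

In this sketch the activity pattern is supplied from outside; in a real design the reduced-power mode would itself lower the unit's activity, which closes the feedback loop.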

Embodiments of the invention exist that add only minimal cost and
complexity to the IC's design, for example, one up-down counter and
some control circuitry for each functional unit being controlled. On
the other hand, the invention is flexible in that it encompasses a
wide variety of techniques for monitoring utilization of different
functional units, for reducing the power they consume and for setting
their throttling parameters.

In accordance with another aspect of the invention, the dynamic
power/speed tradeoff of the invention can be optimized across
multiple functional units within an IC or among multiple ICs within a
system. The invention includes optimization schemes wherein the
maximum power consumed by a particular functional unit can be
increased or decreased depending on the power being consumed
elsewhere within the same IC, or on other ICs within the same system.

In accordance with another aspect of the invention, the dynamic
power/speed tradeoff of the invention can be controlled by software,
such as platform software executing at system boot time, or operating
system software, or possibly even applications software.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
The invention is illustrated in the following drawings, in which
known circuits are shown in block-diagram form for clarity. These
drawings and the following textual description are for explanation
and for aiding the reader's understanding, but the invention should
not be taken as being limited to the preferred embodiments and design
alternatives illustrated therein.

FIG. 1(a) shows the blocks of logic circuitry of the invention.
FIG. 1(b) is a state diagram showing the transitions of a functional
unit from its normal mode or state to its reduced-power mode and back
again, according to the invention.
FIG. 2 shows the blocks of logic circuitry in an embodiment of the
invention that enforces a 50% maximum sustainable duty cycle on a
floating-point functional unit.
FIG. 3 shows the blocks of logic circuitry in an embodiment of the
invention that enforces a programmable maximum sustainable duty cycle
on a cache memory.
FIG. 4 shows the blocks of logic circuitry in an embodiment of the
invention that disables instruction cache prefetching based on the
recent utilization level of the data cache.
FIG. 5 shows the blocks of logic circuitry in an embodiment of the
invention where a power coordinator reads the activity levels of
various functional units within an IC and alters, based on those
activity levels, the throttling parameters of other functional units
to dynamically optimize the power/speed tradeoff.
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`
`Overview
`
The invention allows an IC to dynamically make the tradeoff between
high-speed operation and low-power operation, by throttling back
performance of a functional unit when its recent utilization exceeds
a sustainable level. Thus, the invention allows the IC to dynamically
throttle back the execution rate of maximum worst-case power
consumption sequences of operations so as to not exceed the
worst-case power consumption allowable, thus avoiding reliability,
heat dissipation or power supply problems.

At the same time, the invention minimizes any performance impact that
such throttling has on realistic sequences of operations. This power
reduction is done in a way that does not have a substantial effect on
the performance of the IC for typical tasks. The localized control
and the threshold value that the invention provides minimize
performance impacts. Further, the performance impact is predictable
and repeatable for those sequences of operations that the invention
does throttle.

The purpose of an IC is not to run some artificial, non-realistic
maximum worst-case power consumption sequence of operations at high
performance. Rather, it is to run realistic or typical sequences of
operations at high performance. In some cases, there can be a
substantial difference in power consumption between the typical
worst-case power consumption and the artificial worst-case power
consumption. The effectiveness of adding the invention to a
particular IC design depends on the amount of difference between that
design's artificial worst-case power consumption and its typical
worst-case power consumption.

A preferred way to look at typical worst-case power consumption is to
look at realistic sequences of operations typically used to perform
actual work and identify from among those sequences the particular
sequence that maximizes power consumption. Such a sequence could be
determined by profiling the power consumption of sequences of
operations in a mix of popular software programs, and choosing from
among those sequences the sequence with the highest power
consumption.

According to the present invention, artificial sequences of
operations that keep high-power functional units active for longer
than a threshold are performed in low-power mode. Thus, the invention
prevents the IC from consuming power in excess of its specified
maximum regardless of the sequence of operations it is performing.
This is critical in the case of malicious software, such as a virus,
that might deliberately attempt to damage a microprocessor IC or the
system that includes the microprocessor by causing excess power
consumption.
`
The invention is independent of the technique of reducing overall
power consumption by reducing voltage and/or clock rate. It can be
used in conjunction with that approach, or in lieu of that approach.
For example, if an IC would operate at 100 MHz except for excessive
worst-case power consumption at that speed, then (i) the clock rate
could be lowered to reduce worst-case power consumption; (ii) the
invention could be employed to reduce worst-case power consumption;
or (iii) a combination of both techniques could be employed. For some
power-limited designs, using the invention could make the difference
in whether or not a particular target clock rate can be met.

FIG. 1(a) is a block diagram of one embodiment of the invention.
Functional unit 105 provides current activity information 108 to
activity monitor 106. Current activity information 108 describes what
tasks or operations functional unit 105 is currently performing, or
indicates that it is currently idle. Based on this current activity
information 108, activity monitor 106 generates activity level 109,
and provides it to mode controller 107. Activity level 109 could be a
number, a set of signals each indicating that the activity level is
within a specified range, or even a single bit. Based on activity
level 109, mode controller 107 generates mode control signal 110,
which is coupled to functional unit 105. Mode controller 107 switches
functional unit 105 between a normal mode of operation 101 (typically
one with high performance and high power consumption) and a
reduced-power mode 102 (typically one lower in performance and lower
in power consumption).

Activity monitor 106 monitors the recent utilization of functional
unit 105, via current activity information 108. Current activity
information 108 could be a special signal generated by functional
unit 105, or it could simply be the commands that functional unit 105
receives and responds to. Monitoring the recent utilization could
consist of, for example, computing the average duty cycle of the
functional unit over the preceding thousand cycles. If this activity
level exceeds a threshold, then mode controller 107 places functional
unit 105 in reduced-power mode. Further, if it is desired to monitor
the overall power consumption of an IC, then its substrate
temperature could be measured and this value used as the activity
level of the invention.

FIG. 1(b) is a state-transition diagram of the operation of the
invention. It shows how mode controller 107 causes functional unit
105 to transition between normal mode 101 and reduced-power mode 102.
When the functional unit is in normal mode 101 and the recent
utilization is greater than the threshold, then transition 103 occurs
in which the mode controller places the functional unit in
reduced-power mode 102. Similarly, when in reduced-power mode 102 and
the recent utilization is less than the threshold, then the mode
controller takes transition 104 to restore the functional unit it
controls to normal mode 101.

Preferably, the threshold value used is set based on profiling the
realistic worst-case power consumption benchmark being used in the
design of this particular IC. The threshold is preferably set large
enough that all or most bursts of high activity occurring in this
benchmark are shorter than this threshold, and thus can be speedily
executed with little or no throttling.

In the case where heat dissipation is the primary determinant of how
much power can be consumed, the threshold may be on the order of a
hundred thousand (100,000) operations. A spike in power consumption
of one millisecond (1 ms) may well be tolerable from a thermal point
of view. If the IC is clocked at 100 MHz, then a 1 ms spike is
100,000 clock cycles. A substantial amount of high-speed computation
can be performed in a high-power burst of 100,000 clock cycles. Thus,
the invention allows bursts of high activity, unless their duration
exceeds the threshold.

Design Alternatives for Monitoring Utilization

The invention is flexible in that it encompasses a wide range of
methods and devices for monitoring activity levels. These design
alternatives range from very simple to quite complex. In fact, each
functional unit controlled may have a different monitoring technique
to which it is best suited.

A particularly simple monitoring technique is to use an up/down
counter as an activity-level register whose contents indicate the
current utilization of the functional unit being monitored. In a
simple implementation, the up/down counter increments its contents by
one during each clock cycle that the functional unit is active and
decrements its contents by one for each clock cycle the functional
unit is inactive. A slightly more complex design alternative is to
increment not for each clock cycle, but rather once for each complex
operation that the functional unit performs, and to decrement for
each corresponding period that the functional unit is inactive.
Another design alternative is for the activity monitor to increment
by a value other than one, to decrement by a value other than one, or
both.

If the value by which the contents of the activity-level register is
increased during each active cycle equals the value by which the
activity-level register is decreased during each inactive cycle, then
the activity monitor functions to enforce a maximum sustainable duty
cycle of fifty percent (50%). In an up-down counter implementation,
care must be taken that the contents of the activity-level register
never go below zero, or alternatively that a negative number as the
value in the activity-level register is distinguished from a
roll-over condition in which the value becomes too large in the
positive direction.
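
A minimal sketch of such an activity-level register follows. It takes the first option mentioned above and holds the value at zero rather than letting it go negative. The class name, the parameter names, and the choice to also saturate at an upper limit (so the value cannot roll over) are assumptions made for illustration.

```python
class ActivityLevelRegister:
    """Hypothetical saturating up/down counter used as an activity level."""

    def __init__(self, active_increment=1, inactive_decrement=1,
                 max_value=2**16 - 1):
        self.active_increment = active_increment      # added per active cycle
        self.inactive_decrement = inactive_decrement  # removed per idle cycle
        self.max_value = max_value                    # assumed upper clamp
        self.value = 0

    def clock(self, active: bool) -> int:
        if active:
            # Clamp at the top so the register never rolls over.
            self.value = min(self.value + self.active_increment,
                             self.max_value)
        else:
            # Clamp at zero so the register never goes negative.
            self.value = max(self.value - self.inactive_decrement, 0)
        return self.value


register = ActivityLevelRegister(active_increment=2, inactive_decrement=1,
                                 max_value=5)
rising = [register.clock(True) for _ in range(3)]    # saturates at max_value
falling = [register.clock(False) for _ in range(7)]  # saturates at zero
```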

The current value of the activity-level register is compared against
a threshold value. The threshold value is independent of the maximum
sustainable duty cycle. It is set so as to be large enough so that
short bursts of high activity can execute at full speed. Preferably,
the threshold value is set by profiling the sequence of operations
selected as the realistic worst-case power consumption benchmark. The
threshold value can be thought of as a deficit limit which the
functional unit cannot exceed without having its speed throttled
down. Carrying this analogy further, the current value of the
activity-level register can be thought of as its current power
deficit.

If a maximum sustainable duty cycle value other than fifty percent
(50%) is desired, then it is necessary to have the active increment
be unequal in magnitude to the inactive decrement. For example, an
increment of two and a decrement of one produce a thirty-three
percent (33%) maximum sustainable duty cycle. The sustainable duty
cycle is given by Equation 1:

Equation 1: $\mathrm{SDC} = \frac{\mathrm{ID}}{\mathrm{AI} + \mathrm{ID}}$

SDC represents the sustainable duty cycle, AI represents the active
increment amount and ID represents the inactive decrement amount. In
Equation 1, AI and ID are each positive and represent the absolute
value of the increment and decrement values actually used. Preferably
the active increment value is positive and the inactive decrement
value is negative.
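
One way to see where this ratio comes from: at a sustained duty cycle d the register neither grows nor shrinks on average, so d·AI = (1 − d)·ID, which gives d = ID / (AI + ID). The short sketch below evaluates that ratio for the two cases given in the text; the function name is hypothetical and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def sustainable_duty_cycle(active_increment: int,
                           inactive_decrement: int) -> Fraction:
    """Return ID / (AI + ID), with both magnitudes given as positive values."""
    if active_increment <= 0 or inactive_decrement <= 0:
        raise ValueError("AI and ID must be positive magnitudes")
    return Fraction(inactive_decrement,
                    active_increment + inactive_decrement)


equal_steps = sustainable_duty_cycle(1, 1)   # increment 1, decrement 1
two_to_one = sustainable_duty_cycle(2, 1)    # increment 2, decrement 1
```

Equal magnitudes give one half, and an increment of two with a decrement of one gives one third, matching the fifty percent and thirty-three percent figures above.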
`
If the active increment value, AI, is chosen to be one, then the
maximum number of consecutive cycles that the functional unit can be
active is equal to the threshold value. In general, the maximum burst
length is given by Equation 2:

Equation 2: $\mathrm{MBL} = \frac{\mathrm{TH}}{\mathrm{AI}}$

MBL represents the number of functional-unit cycles in the maximum
burst length and TH represents the threshold value used to compare
with the current activity value.
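
As a worked example, reading Equation 2 as the threshold divided by the active increment, the sketch below ties the burst length to the thermal example given in the text (a 1 ms spike at 100 MHz). The function names are hypothetical.

```python
def max_burst_length(threshold: int, active_increment: int) -> float:
    """Maximum burst length in functional-unit cycles: TH / AI."""
    if active_increment <= 0:
        raise ValueError("active increment must be positive")
    return threshold / active_increment


def cycles_in_interval(clock_hz: int, interval_us: int) -> int:
    """Number of clock cycles in an interval given in microseconds."""
    return clock_hz * interval_us // 1_000_000


# A 1 ms spike at 100 MHz spans 100,000 clock cycles.
spike_cycles = cycles_in_interval(100_000_000, 1_000)

# With AI = 1, a threshold of that size permits a burst of the same length;
# doubling AI halves the permitted burst.
burst_ai_1 = max_burst_length(spike_cycles, 1)
burst_ai_2 = max_burst_length(spike_cycles, 2)
```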

More sophisticated activity monitoring schemes are possible within
the scope of the invention. For example, the type of operation the
functional unit is asked to perform could be monitored by an activity
monitor that associates a particular activity increment with each
possible type of operation. In such a scheme, the contents of the
activity-level register could simply decrement at a constant rate.
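
A rough sketch of such a per-operation scheme is shown below. The operation names, their weights, the decay rate, and the threshold are all invented for illustration; the text does not specify any of them.

```python
# Hypothetical activity increment for each type of operation.
OPERATION_COST = {"idle": 0, "add": 1, "multiply": 3, "divide": 6}

# Hypothetical constant amount removed from the register every cycle.
DECAY_PER_CYCLE = 2


def run(operations, threshold):
    """Return (activity level, throttle flag) after each operation."""
    level = 0
    trace = []
    for op in operations:
        # Add the weight of this operation, subtract the constant decay,
        # and never let the register drop below zero.
        level = max(level + OPERATION_COST[op] - DECAY_PER_CYCLE, 0)
        trace.append((level, level > threshold))
    return trace


trace = run(["divide", "divide", "add", "idle", "idle"], threshold=5)
```

With these assumed weights, two back-to-back divides push the level over the threshold, and it falls back below the threshold once cheaper operations follow.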
`
`Design Alternatives for Reducing Power
The invention is flexible in that it encompasses a wide range of
design alternatives for reducing the power of the functional unit
that it controls. These design alternatives can range from very
simple to quite complex. In fact, each functional unit controlled may
have a different power reduction technique for which it is most
suited.

A simple way to reduce the power consumed by the functional unit is
to reduce its clock rate. This could be performed by dividing the
clock which it normally receives by two, or by suppressing every
other clock pulse. In the case where the maximum sustainable duty
cycle is fifty percent, dividing the clock provided to the functional
unit by two when the threshold is exceeded enforces this maximum duty
cycle. Alternatively, the clock rate could be reduced by a factor
other than two.

Many ICs include cache memory to keep an internal, and thus quickly
accessible, copy of data that is available at a slower speed in some
type of external memory. Cache memory is used for performance
reasons. Much less delay is involved in accessing the information
from an on-chip cache than in accessing it from a device external to
the IC.

Cache memories can be major consumers of power within an IC. Thus, it
may be desirable to place on-chip cache memories under the control of
the invention. A simple scheme for reducing the power consumption of
the cache memory is to force access to the external memory (even if a
copy of the data is present in the on-chip cache) when necessary to
reduce power consumption because the cache memory's maximum
sustainable duty cycle has been exceeded.
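
The forced-miss idea can be sketched in software as follows. The class, its counters, and the decision not to fill the cache while throttled are assumptions made for illustration and are not taken from the text; the `reduce_power` flag stands in for the signal a mode controller would supply.

```python
class ThrottledCache:
    """Hypothetical cache model that can be forced to bypass itself."""

    def __init__(self, external_memory: dict):
        self.external = external_memory  # slower backing store
        self.lines = {}                  # on-chip copies, keyed by address
        self.cache_accesses = 0
        self.external_accesses = 0

    def read(self, address, reduce_power: bool):
        # Normal mode: serve from the cache when a copy is present.
        if not reduce_power and address in self.lines:
            self.cache_accesses += 1
            return self.lines[address]
        # Miss, or forced miss in reduced-power mode: go off chip.
        self.external_accesses += 1
        value = self.external[address]
        if not reduce_power:
            # Assumed policy: only fill the cache when not throttled.
            self.lines[address] = value
        return value


cache = ThrottledCache({0x10: 7})
first = cache.read(0x10, reduce_power=False)   # miss, fills the cache
second = cache.read(0x10, reduce_power=False)  # hit
third = cache.read(0x10, reduce_power=True)    # forced external access
```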

It will be clear to one skilled in the art that an IC may have other
on-chip functional units whose functions can be performed at lower
speed by off-chip circuits. These are candidates for the same power
reduction technique as used for cache memory, that is, have the
off-chip circuit perform the operation when needed to reduce on-chip
power consumption.

In rare cases it may be cost effective to include on an IC two
complete implementations of a particular functional unit, one being
high speed and high power and the other being low speed and low
power. In this case, the mode controller of the invention selects
which is to be used based on the current utilization of the
functional unit and the current value of its threshold parameter.

In the case of a microprocessor that performs speculative instruction
execution, instructions are started through the
instruction-evaluation pipeline anticipating that a conditional
branch instruction will (or will not) be taken. If the prediction as
to whether or not the branch is taken is correct, then a significant
performance speed-up is achieved. But sometimes the branch prediction
is wrong, and as soon as this is known, the results of the
speculative evaluations are discarded and the correct instructions
are started through the instruction-evaluation pipeline. A preferred
reduced-power mode for a microprocessor that performs speculative
instruction execution may be to reduce or eliminate speculative
instruction execution.

Another example of speculative operation is cache prefetching. Many
ICs with on-chip cache memories anticipate that instruction or data
memory accesses will be sequential. To increase performance, they
prefetch to the instruction or data cache some number of words
adjacent to the currently requested instruction or data address. A
preferred reduced-power mode for a cache memory may comprise
disabling some or all of its speculative prefetches. In general, a
preferred reduced-power mode for any functional unit may be to reduce
or eliminate its speculative activities.
`
Controlling a Floating-Point Arithmetic Unit by an
Up/Down Counter
`
FIG. 2 shows an embodiment of the invention that enforces a maximum
sustainable duty cycle of fifty percent (50%) on floating-point unit
206. In its normal operating mode, multiplexer 203 passes system
clock 201 on to the clock input of floating-point unit 206. In its
reduced-power mode, multiplexer 203 passes the output of
divide-by-two circuit 202 on to the clock input of floating-point
unit 206, thus cutting both its speed and power consumption in half.

Floating-point unit 206 provides active signal 207 to the up/down
control input of up/down counter 205. For each cycle of system clock
201 for which active signal 207 is true, up/down counter 205
increments its contents by one. For each cycle of system clock 201
for which active signal 207 is false, up/down counter 205 decrements
its contents by one. If a decrement would take its contents below
zero, then up/down counter 205 stays at zero.

The select input of multiplexer 203 is driven by
most-significant-bit output 204 from up/down counter 205. This bit
provides the feedback to control whether or not the invention places
floating-point unit 206 into reduced-power mode. Thus in this
embodiment, the threshold is predetermined and must be a power of
two. Which power of two is used is selected by the number of bits in
up/down counter 205. When the contents of up/down counter 205 are
large enough that its most significant bit is a one, then the
reduced-power mode is entered and multiplexer 203 selects the output
of divide-by-two circuit 202 to clock floating-point unit 206.

Clocking floating-point unit 206 at half the frequency has the effect
of enforcing a fifty percent (50%) maximum duty cycle on
floating-point unit 206 during the period that it is in reduced-power
mode, i.e., the period that most-significant bit 204 is a one. During
this period, floating-point unit active signal 207 is true for every
other cycle of system clock 201.
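
The behavior of this embodiment can be approximated with a small cycle-level simulation. This is a sketch under stated assumptions, not a model of the actual circuit: the divide-by-two clock is represented as enabling the unit on every other system cycle, the counter is assumed to saturate at its maximum value, and the function name and the four-bit counter width are chosen only for illustration.

```python
def simulate_fig2(demand, counter_bits=4):
    """Return the per-cycle active signal for a stream of work requests."""
    max_count = (1 << counter_bits) - 1
    count = 0                 # contents of the up/down counter
    divided_clock = False     # toggles once per system clock cycle
    active_trace = []
    for wants_work in demand:
        divided_clock = not divided_clock
        # The most significant bit of the counter selects the mode.
        msb = (count >> (counter_bits - 1)) & 1
        # Reduced-power mode: the unit is clocked only on alternate cycles.
        clocked = divided_clock if msb else True
        active = bool(wants_work) and clocked
        if active:
            count = min(count + 1, max_count)  # assumed upper saturation
        else:
            count = max(count - 1, 0)          # counter stays at zero
        active_trace.append(active)
    return active_trace


# A sustained demand: the unit has work available on every cycle.
trace = simulate_fig2([True] * 40, counter_bits=4)
early_burst = trace[:8]
late_activity = sum(trace[-20:])
```

With a four-bit counter the most significant bit first becomes one at a count of eight, so the first eight cycles of the burst run at full speed; after that the unit is active on every other cycle, which is the fifty percent duty cycle described above.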

The increment magnitude and decrement magnitude used in this
embodiment of the invention are equal; that is, the contents of
up/down counter 205 are either increased or decreased by one.
Therefore, the maximum sustainable duty cycle allowed for
floating-point unit 206 is fifty percent (50%). Therefore, if the
sequence of operations being performed by the IC attempts to sustain
a floating-point duty cycle of more than fifty percent (50%) for
longer than the burst allowed by the predetermined threshold, then
the floating-point unit's performance is throttled down to stay
`wit