`COMPUTER SCIENCE
`AND TECHNOLOGY
`
`EXECUTIVE EDITORS
`
`Allen Kent
`
`James G. Williams
`
`UNIVERSITY OF PITTSBURGH
`PITTSBURGH, PENNSYLVANIA
`
`ADMINISTRATIVE EDITOR
`
`Carolyn M. Hall
`
`ARLINGTON, TEXAS
`
VOLUME 36

SUPPLEMENT 21
`
`
`
MARCEL DEKKER, INC.

New York · Basel · Hong Kong
`
`
`Exhibit 1017
`Apple v. Qualcomm
`IPR2018-01249
`
`
`
COPYRIGHT © 1997 BY MARCEL DEKKER, INC.
`ALL RIGHTS RESERVED
`
`Neither this book nor any part may be reproduced or
transmitted in any form or by any means, electronic
`or mechanical, including photocopying, microfilming,
`and recording, or by any information storage and
`retrieval system, without permission in writing from
`the publisher.
`
MARCEL DEKKER, INC.
270 Madison Avenue, New York, New York 10016

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 74-29436
`ISBN: 0-8247-2289-2
`
`Current Printing (last digit)
10 9 8 7 6 5 4 3 2 1
`
`PRINTED IN UNITED STATES OF AMERICA
`
`
`
`
DESIGN TECHNOLOGIES FOR LOW-POWER VLSI
`
`MOTIVATION
`
In the past, the major concerns of the very-large-scale integration (VLSI) designer
were area, performance, cost, and reliability; power was mostly of only secondary
importance. In recent years, however, this has begun to change and, increasingly,
power is being given comparable weight to area and speed considerations. Several
factors have contributed to this trend. Perhaps the primary driving factor has been
the remarkable success and growth of the class of personal computing devices
(portable desktops, audio- and video-based multimedia products) and wireless
communications systems (personal digital assistants and personal communicators)
which demand high-speed computation and complex functionality with low power
consumption.
In these applications, average power consumption is a critical design concern.
The projected power budget for a battery-powered, A4 format, portable multimedia
terminal, when implemented using off-the-shelf components not optimized for
low-power operation, is about 40 W. With advanced nickel-metal-hydride (secondary)
battery technologies offering around 65 W hr/kg (1), this terminal would require an
unacceptable 6 kg of batteries for 10 hr of operation between recharges. Even with
new battery technologies such as rechargeable lithium ion or lithium polymer cells,
it is anticipated that the specific energy of batteries will increase only to about
90-110 W hr/kg over the next 5 years (1), which still leads to an unacceptable
3.6-4.4 kg of battery cells. In the absence of low-power design techniques, current
and future portable devices will suffer from either very short battery life or a very
heavy battery pack.
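
The battery-mass figures quoted above follow from dividing the required energy (power times operating time) by the specific energy of the battery. The short sketch below reproduces that arithmetic; the function name and the choice of the three specific-energy values as test points are illustrative assumptions, while the 40 W, 10 hr, 65 W hr/kg, and 90-110 W hr/kg figures come from the text.

```python
def battery_mass_kg(power_w, hours, specific_energy_wh_per_kg):
    """Battery mass needed to deliver `power_w` for `hours` at a given specific energy."""
    return power_w * hours / specific_energy_wh_per_kg


POWER_W = 40  # projected terminal power budget quoted in the text
HOURS = 10    # operating time between recharges quoted in the text

# Specific energies quoted in the text (W hr/kg); the labels are illustrative.
for label, wh_per_kg in [("NiMH", 65), ("Li-based, low end", 90), ("Li-based, high end", 110)]:
    mass = battery_mass_kg(POWER_W, HOURS, wh_per_kg)
    print(f"{label}: {mass:.1f} kg")
```

The three results round to 6.2, 4.4, and 3.6 kg, in line with the "6 kg" and "3.6-4.4 kg" figures in the paragraph above.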
`
There also exists strong pressure for producers of high-end products to reduce
their power consumption. Contemporary performance-optimized microprocessors
dissipate as much as 15-30 W at 100-200 MHz clock rates (2)! In the future, it can
be extrapolated that a 10-cm² microprocessor, clocked at 500 MHz (which is a not
too aggressive estimate for the next decade), would consume about 300 W. The cost
associated with packaging and cooling such devices is prohibitive. Because core
power consumption must be dissipated through the packaging, increasingly
expensive packaging and cooling strategies are required as chip power consumption
increases. Consequently, there is a clear financial advantage to reducing the power
consumed in high-performance systems.
In addition to cost, there is the issue of reliability. High-power systems often
run hot, and high temperature tends to exacerbate several silicon failure
mechanisms. Every 10°C increase in operating temperature roughly doubles a
component’s failure rate (3). In this context, peak power (maximum possible power
dissipation) is a critical design factor, as it determines the thermal and electrical
limits of designs, impacts the system cost, size, and weight, dictates the specific
battery type, component and system packaging, and heat sinks, and aggravates the
resistive and inductive voltage-drop problems. It is therefore essential to have the
peak power under control.
`Another crucial driving factor is that excessive power consumption is becom-
`ing the limiting factor in integrating more transistors on a single chip or on a
`multiple-chip module. Unless power consumption is dramatically reduced, the re-
`sulting heat will limit the feasible packing and performance of VLSI circuits and
`systems.
From the environmental viewpoint, the smaller the power dissipation of electronic
systems, the lower the heat pumped into the rooms, the lower the electricity
consumed and, hence, the lower the impact on the global environment, the less the
office noise (e.g., due to elimination of a fan from the desktop), and the less
stringent the environment/office power delivery or heat-removal requirements.
The motivations for reducing power consumption differ from application to
application. In the class of micropowered battery-operated, portable applications,
such as cellular phones and personal digital assistants, the goal is to keep the battery
lifetime and weight reasonable and the packaging cost low. Power levels below 1-2
W, for instance, enable the use of inexpensive plastic packages. For high-performance,
portable computers, such as laptop and notebook computers, the goal is to
reduce the power dissipation of the electronics portion of the system to a point
which is about half of the total power dissipation (including that of display and
hard disk). Finally, for high-performance, nonbattery-operated systems, such as
workstations, desktop computers, and multimedia digital signal processors, the
overall goal of power minimization is to reduce system cost (cooling, packaging,
and energy bill) while ensuring long-term device reliability. These different
requirements impact how power optimization is addressed and how much the designer is
willing to sacrifice in cost or performance to obtain lower power dissipation.
The next question is to determine the objective function to minimize during
low-power design. The answer varies from one application domain to the next. If
extending the battery life is the only concern, then the energy (i.e., the power-delay
product) should be minimized. In this case, the battery consumption is minimized
even though an operation may take a very long time. On the other hand, if both the
battery life and the circuit delay are important, then the energy-delay product must
be minimized (4). In this case, one can alternatively minimize the energy/delay ratio
(i.e., the power) subject to a delay constraint. In most design scenarios, the circuit
delay is set based on system-level considerations, and hence during circuit
optimization, one minimizes power under user-specified timing constraints.
`
SOURCES OF POWER DISSIPATION
`
Power dissipation in Complementary Metal-Oxide-Semiconductor (CMOS) circuits is
caused by three sources: (1) the leakage current, which is primarily determined by
the fabrication technology and consists of the reverse-bias current in the parasitic
diodes formed between the source and drain diffusions and the bulk region in a MOS
transistor, as well as the subthreshold current that arises from the inversion charge
that exists at gate voltages below the threshold voltage; (2) the short-circuit
(rush-through) current, which is due to the DC path between the supply rails during
output transitions; and (3) the charging and discharging of capacitive loads during
logic changes.
The diode leakage occurs when a transistor is turned off and another active
transistor charges up or down the drain with respect to the first transistor’s bulk
potential. The resulting current is proportional to the area of the drain diffusion
and the leakage current density. The diode leakage is typically 1 pA for a 1-µm
minimum feature size. The subthreshold leakage current for long-channel devices
increases linearly with the ratio of the channel width over channel length and
decreases exponentially with $V_{gs} - V_t$, where $V_{gs}$ is the gate bias and $V_t$ is the
threshold voltage. Several hundred millivolts of “off bias” (say, 300-400 mV) typically
reduces the subthreshold current to negligible values. With reduced power supply and
device threshold voltages, the subthreshold current will, however, become more
pronounced. In addition, at short channel lengths, the subthreshold current also
becomes exponentially dependent on the drain voltage $V_{ds}$ instead of being
independent of $V_{ds}$ (see Ref. 5 for a recent analysis). The subthreshold current will
remain 10°-10° times smaller than the “on current,” even at submicron device sizes.
The short-circuit (crowbar current) power consumption for an inverter gate is
proportional to the gain of the inverter, the cubic power of supply voltage minus
device threshold, the input rise/fall time, and the operating frequency (6). The
maximum short-circuit current flows when there is no load; this current decreases
with the load. If gate sizes are selected so that the input and output rise/fall times
are about equal, the short-circuit power consumption will be less than 15% of the
dynamic power consumption. If, however, the design for high performance is taken
to the extreme where large gates are used to drive relatively small loads, then there
will be a stiff penalty in terms of short-circuit power consumption.
The short-circuit and leakage currents in CMOS circuits can be made small
with proper circuit and device design techniques. The dominant source of power
dissipation is thus the charging and discharging of the node capacitances (also
referred to as the dynamic power dissipation), which is given by

$$P = 0.5\, C\, V_{dd}^{2}\, E(sw)\, f_{clk} \qquad (1)$$

where $C$ is the physical capacitance of the circuit, $V_{dd}$ is the supply voltage, $E(sw)$
(referred to as the switching activity) is the average number of transitions in the
circuit per $1/f_{clk}$ time, and $f_{clk}$ is the clock frequency.
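
As a minimal numerical sketch of the dynamic power expression (1) above, the snippet below evaluates it at one operating point and again at half the supply voltage. The function name and the operating point (1 nF of capacitance, 3.3 V, an activity of 0.5, a 100 MHz clock) are hypothetical values chosen only for illustration; they are not taken from the text.

```python
def dynamic_power(capacitance_f, vdd_v, activity, f_clk_hz):
    """Equation (1): P = 0.5 * C * Vdd^2 * E(sw) * f_clk, in watts."""
    return 0.5 * capacitance_f * vdd_v ** 2 * activity * f_clk_hz


# Hypothetical operating point, chosen only for illustration.
baseline = dynamic_power(capacitance_f=1e-9, vdd_v=3.3, activity=0.5, f_clk_hz=100e6)
half_vdd = dynamic_power(capacitance_f=1e-9, vdd_v=1.65, activity=0.5, f_clk_hz=100e6)

print(f"baseline: {baseline:.4f} W")
print(f"half supply voltage: {half_vdd:.4f} W")
print(f"ratio: {baseline / half_vdd:.1f}")
```

Because the supply voltage enters quadratically, halving it with all other factors held fixed cuts the dynamic power by a factor of 4, whereas halving the capacitance, activity, or clock frequency would only halve it.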
`
`LOW-POWER DESIGN SPACE
`
The previous section revealed the three degrees of freedom inherent in the
low-power design space: voltage, physical capacitance, and data activity. Optimizing
for power entails an attempt to reduce one or more of these factors. This section
briefly discusses each of these factors, describing their relative importance, as well
as the interactions that complicate the power optimization process.
`
`Voltage
Because of its quadratic relationship to power, voltage reduction offers the most
effective means of minimizing power consumption. Without requiring any special
circuits or technologies, a factor of 2 reduction in supply voltage yields a factor of 4
decrease in power consumption. Furthermore, this power reduction is a global
effect, experienced not only in one subcircuit or block of the chip but throughout
the entire design. Because of these factors, designers are often willing to sacrifice
increased physical capacitance or circuit activity for reduced voltage.
Unfortunately, we pay a speed penalty for supply voltage reduction, with delays
drastically increasing as $V_{dd}$ approaches the threshold voltage $V_t$ of the devices. This
tends to limit the useful range of $V_{dd}$ to a minimum of about (2-3) $V_t$.
In Ref. 7, an architecture-driven voltage scaling strategy is presented in which
parallel and pipelined architectures are used to compensate for the increased gate
delays at reduced supply voltages and meet throughput constraints. Another
approach to reducing the supply voltage without loss in throughput is to modify the
$V_t$ of the devices. Reducing the $V_t$ allows the supply voltage to be scaled down
without loss in speed. The limit of how low the $V_t$ can go is set by the requirement
to set adequate noise margins and to control the increase in subthreshold leakage
currents. The optimum $V_t$ must be determined based on the current drives at
low-supply-voltage operation and control of the leakage currents. Because the inverse
threshold slope (S) of a MOS Field-Effect Transistor (MOSFET) is invariant with
scaling, for every 80-100 mV (based on the operating temperature) reduction in $V_t$,
the standby current will be increased by one order of magnitude. This tends to limit
$V_t$ to about 0.3 V for room-temperature operation of CMOS circuits. Another important
concern in the low-$V_{dd}$, low-$V_t$ regime is the fluctuation in $V_t$. Basically, delay
increases by 3× for a $\Delta V_t$ of ±0.15 V at $V_{dd}$ of 1 V. This is a major limitation
on how low $V_{dd}$ can go unless the $V_t$ fluctuation is cancelled by circuit techniques
such as the self-adjusting threshold scheme, which will reduce the $V_t$ fluctuation to
±0.05 V at $V_{dd}$ of 1 V (8).
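
The rule of thumb above, one decade of standby current per 80-100 mV of threshold reduction, can be turned into a small calculation. The sketch below assumes a purely exponential dependence with a fixed slope S in mV per decade; the function name and the 200 mV example reduction are illustrative assumptions, not values from the text.

```python
def standby_current_multiplier(vt_reduction_mv, slope_mv_per_decade):
    """Factor by which standby current grows when the threshold voltage is lowered.

    Assumes one decade of current per `slope_mv_per_decade` of threshold reduction.
    """
    return 10 ** (vt_reduction_mv / slope_mv_per_decade)


# Hypothetical 200 mV threshold reduction, evaluated at both ends of the 80-100 mV range.
for slope in (80, 100):
    factor = standby_current_multiplier(200, slope)
    print(f"S = {slope} mV/decade: standby current grows by about {factor:.0f}x")
```

Under this assumption, even a modest threshold reduction multiplies the standby current by two to three orders of magnitude, which is why leakage bounds how far the threshold voltage can be lowered.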
`
`Physical Capacitance
Dynamic power consumption depends linearly on the physical capacitance being
switched. So, in addition to operating at low voltages, minimizing capacitances
offers another technique for minimizing power consumption. In order to consider
this possibility we must first understand what factors contribute to the physical
capacitance of a circuit.
Power dissipation is dependent on the physical capacitances seen by individual
gates in the circuit. Estimating this capacitance at the behavioral or logical levels of
abstraction is difficult and imprecise, as it requires estimation of the load
capacitances from structures which are not yet mapped to gates in a cell library; this
calculation can, however, be done easily after technology mapping by using the
logic and delay information from the library.
Interconnect plays an increasing role in determining the total chip area, delay,
and power dissipation, and, hence, must be accounted for as early as possible during
the design process. The interconnect capacitance estimation is, however, a difficult
task even after technology mapping, due to lack of detailed place and route
information. Approximate estimates can be obtained by using information derived from
a companion placement solution (9) or by using stochastic/procedural interconnect
models (10). Interconnect capacitance estimation after layout is straightforward
and, in general, accurate.
With this understanding, we can now consider how to reduce physical
capacitance. From the previous discussion, we recognize that capacitances can be kept
at a minimum by using less logic, smaller devices, and fewer and shorter wires. Example
techniques for reducing the active area include resource sharing, logic minimization,
and gate sizing. Example techniques for reducing the interconnect include register
sharing, common subfunction extraction, placement, and routing. As with voltage,
however, we are not free to optimize capacitance independently. For example,
reducing device sizes reduces physical capacitance, but it also reduces the current drive
of the transistors, making the circuit operate more slowly. This loss in performance
might prevent us from lowering $V_{dd}$ as much as we might otherwise be able to do.
`
`Switching Activity
In addition to voltage and physical capacitance, switching activity also influences
dynamic power consumption. A chip may contain an enormous amount of physical
capacitance, but if there is no switching in the circuit, then no dynamic power will
be consumed. The data activity determines how often this switching occurs. There
are two components to switching activity: $f_{clk}$, which determines the average
periodicity of data arrivals, and $E(sw)$, which determines how many transitions each
arrival will generate. For circuits that do not experience glitching, $E(sw)$ can be
interpreted as the probability that a power-consuming transition will occur during a
single data period. Even for these circuits, the calculation of $E(sw)$ is difficult, as it
depends not only on the switching activities of the circuit inputs and the logic function
computed by the circuit but also on the spatial and temporal correlations among the
circuit inputs. The data activity inside a 16-bit multiplier may change by as much as
one order of magnitude as a function of input correlations (11).
For certain logic styles, however, glitching can be an important source of
signal activity and, therefore, deserves some mention here. Glitching refers to
spurious and unwanted transitions that occur before a node settles down to its final
steady-state value. Glitching often arises when paths with unbalanced propagation
delays converge at the same point in the circuit. Because glitching can cause a
node to make several power-consuming transitions, it should be avoided whenever
possible.
The data activity $E(sw)$ can be combined with physical capacitance $C$ to obtain
switched capacitance, $C_{sw} = C\,E(sw)$, which describes the average capacitance
charged during each data period $1/f_{clk}$. It should be noted that it is the switched
capacitance that determines the power consumed by a CMOS circuit.
`
`Calculation of Switching Activity
Calculation of the switching activity in a logic circuit is difficult, as it depends on a
number of circuit parameters and technology-dependent factors which are not readily
available or precisely characterized. Some of these factors are described next.
`
`Input Pattern Dependence
Switching activity at the output of a gate depends not only on the switching activities
at the inputs and the logic function of the gate but also on the spatial and temporal
dependencies among the gate inputs. For example, consider a two-input AND gate
g with independent inputs i and j whose signal probabilities are 1/2; then $E_g(sw)$ =
3/8. This holds because in 6 out of 16 possible input transitions, the output of the
two-input AND gate makes a transition. Now suppose it is known that only patterns
00 and 11 can be applied to the gate inputs and that both patterns are equally likely;
then $E_g(sw)$ = 1/2. Alternatively, assume that it is known that every 0 applied to
input i is immediately followed by a 1, whereas every 1 applied to input j is
immediately followed by a 0; then $E_g(sw)$ = 4/9. Finally, assume that it is known that i
changes exactly if j changes value; then $E_g(sw)$ = 1/4. The first case is an example
of spatial correlations between gate inputs, the second case illustrates temporal
correlations on gate inputs, and the third case describes an instance of
spatiotemporal correlations.
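
The four activity values quoted above (3/8, 1/2, 4/9, and 1/4) can be checked by enumerating the allowed input transitions of a two-input AND gate and counting those that toggle the output. The sketch below assumes that all allowed transitions are equally likely, and it reads the temporal case as "every 0 on input i is followed by a 1, and every 1 on input j is followed by a 0"; the function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import product


def and_gate_activity(allowed):
    """Fraction of equally likely allowed transitions that toggle a 2-input AND output.

    `allowed(i0, j0, i1, j1)` says whether the input pattern (i0, j0) may be
    followed by the input pattern (i1, j1).
    """
    transitions = [t for t in product((0, 1), repeat=4) if allowed(*t)]
    toggles = sum((i0 & j0) != (i1 & j1) for i0, j0, i1, j1 in transitions)
    return Fraction(toggles, len(transitions))


# Independent inputs: all 16 transitions are possible.
independent = and_gate_activity(lambda i0, j0, i1, j1: True)

# Spatial correlation: only patterns 00 and 11 are ever applied.
spatial = and_gate_activity(lambda i0, j0, i1, j1: i0 == j0 and i1 == j1)

# Temporal correlation: i never stays at 0, j never stays at 1.
temporal = and_gate_activity(
    lambda i0, j0, i1, j1: not (i0 == 0 and i1 == 0) and not (j0 == 1 and j1 == 1)
)

# Spatiotemporal correlation: i changes exactly when j changes.
spatiotemporal = and_gate_activity(lambda i0, j0, i1, j1: (i0 != i1) == (j0 != j1))

print(independent, spatial, temporal, spatiotemporal)
```

Exhaustive enumeration is only practical for a single small gate; it is used here purely to confirm the quoted fractions.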
The straightforward approach of estimating power by using a simulator is
greatly complicated by this pattern-dependence problem.
It is clearly infeasible to estimate the power by exhaustive simulation of the
circuit. Recent techniques overcome this difficulty by using probabilities that
describe the set of possible logic values at the circuit inputs and developing
mechanisms to calculate these probabilities for gates inside the circuit. Alternatively,
exhaustive simulation may be replaced by Monte Carlo simulation with well-defined
stopping criteria for specified relative or absolute error in power estimates for a
given confidence level (12).
`
`Delay Model
Based on the delay model used, the power estimation techniques could account
for steady-state transitions (which consume power but are necessary to perform a
computational task) and/or hazards and glitches (which dissipate power without
doing any useful computation). Sometimes, the first component of power
consumption is referred to as the functional activity, whereas the latter is referred to as
the spurious activity. It is shown in Ref. 13 that the mean value of the ratio of the
hazardous component to the total power dissipation varies significantly with the
considered circuits (from 9% to 38% in random logic circuits) and that the spurious power
dissipation cannot be neglected in CMOS circuits. The spurious activity is much
higher in certain data path modules (such as adders and multipliers). Indeed, in a
32-bit pipelined multiplier, the power dissipation due to hazard activity is three to
four times higher than that due to functional activity! The spurious power
dissipation is likely to become even more important in future scaled technologies.
Current power estimation techniques often handle both zero-delay (nonglitch)
and real-delay models. In the first model, it is assumed that all changes at the circuit
inputs propagate through the internal gates of the circuits instantaneously. The
latter model assigns to each gate in the circuit a finite delay and can thus account
for the hazards in the circuit. A real-delay model significantly increases the
computational requirements of the power estimation techniques while improving the
accuracy of the estimates.
Calculation of the spurious activity in a circuit is, in general, very difficult and
requires careful logic- and/or circuit-level characterization of the gates in a library
as well as detailed knowledge of the circuit structure.
`
`Logic Function
Switching activity at the output of a logic gate is also strongly dependent on the
Boolean function of the gate itself. This is because the logic function of a gate
determines the probability that the present value of the gate output is different from
its previous value. For example, under the assumption that the input signals are
uncorrelated, switching activity at the output of a (static) two-input NAND or NOR
gate is 3/8, whereas that at the output of a two-input XOR gate is 1/2. Indeed,
switching activity at the output of a K-input NAND or NOR gate approaches $1/2^{K-1}$
for large K, whereas that for a K-input XOR gate remains at 1/2.
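
A brief sketch of where these numbers come from: if successive output values are independent and the output is 1 with probability p, the output changes between two consecutive periods with probability 2p(1 - p). The code below applies this to K-input static gates, assuming uncorrelated inputs that are each 1 with probability 1/2; the function names are illustrative.

```python
from fractions import Fraction


def static_gate_activity(p_one):
    """Probability that an output with P(1) = p_one differs between two independent periods."""
    return 2 * p_one * (1 - p_one)


def nand_activity(k):
    """K-input static NAND: the output is 0 only when all K inputs are 1."""
    return static_gate_activity(1 - Fraction(1, 2 ** k))


def nor_activity(k):
    """K-input static NOR: the output is 1 only when all K inputs are 0."""
    return static_gate_activity(Fraction(1, 2 ** k))


def xor_activity(k):
    """K-input static XOR: the output is 1 for exactly half of the input patterns."""
    return static_gate_activity(Fraction(1, 2))


for k in (2, 4, 8):
    print(k, nand_activity(k), nor_activity(k), xor_activity(k), Fraction(1, 2 ** (k - 1)))
```

For K = 2 this gives 3/8 for NAND and NOR and 1/2 for XOR; as K grows, the NAND and NOR values stay just below $1/2^{K-1}$ and approach it.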
`
`Logic Style
Switching activity of the circuits is also a function of the logic style used to
implement the circuit. The functional activity in dynamic circuits is always higher than
that in static implementation of the same circuit, as all nodes are precharged to
some value (one in N-type dynamic and zero in P-type dynamic) before the new
input data arrives. This effectively increases the number of power-consuming
transitions. For example, under pseudorandom input signals, switching activities of
two-input N-type dynamic NAND, NOR, and XOR gates are 3/2, 1/2, and 1,
respectively, and those of the P-type version of these same gates are 1/2, 3/2, and 1,
respectively. These values should be compared to the switching activities of these
gates in static CMOS, which are 3/8, 3/8, and 1/2, respectively. Note, however, that
the physical capacitance in dynamic logic tends to be smaller than that in static
logic, so the choice between dynamic and static logic implementations is not as
clear-cut as it would be otherwise. Dynamic circuits are also glitch-free!
`
`Circuit Structure
`
The major difficulty in computing the switching activities is the reconvergent nodes.
Indeed, if a network consists of simple gates and has no reconvergent fanout nodes
(i.e., circuit nodes that receive inputs from two paths that fan out from some other
circuit node), then the exact switching activities can be computed during a single
post-order traversal of the network. For networks with reconvergent fanout, the
problem is much more challenging, as internal signals may become strongly
correlated and exact consideration of these correlations cannot be performed with
reasonable computational effort or memory usage. Current power estimation
techniques either ignore these correlations or approximate them, thereby improving the
accuracy at the expense of longer run times. Exact methods (i.e., symbolic
simulation) have also been proposed but are impractical due to excessive time and memory
requirements.
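
The single post-order traversal mentioned above can be sketched for a network without reconvergent fanout. The code below is a minimal illustration under stated assumptions: a zero-delay model, spatially independent primary inputs, temporally independent input values, and a network given as a nested tuple in which each primary input appears once. The tuple encoding, gate names, and function names are hypothetical choices made for this example.

```python
from fractions import Fraction


def signal_probability(node, input_probs):
    """Post-order evaluation of P(node = 1) in a tree network of simple gates."""
    if isinstance(node, str):  # primary input
        return input_probs[node]
    op, *children = node
    probs = [signal_probability(child, input_probs) for child in children]
    if op == "NOT":
        return 1 - probs[0]
    if op == "AND":  # all inputs must be 1
        result = Fraction(1)
        for p in probs:
            result *= p
        return result
    if op == "OR":  # output is 0 only if all inputs are 0
        all_zero = Fraction(1)
        for p in probs:
            all_zero *= 1 - p
        return 1 - all_zero
    raise ValueError(f"unsupported gate type: {op}")


def switching_activity(node, input_probs):
    """Zero-delay activity 2p(1 - p), assuming temporally independent values."""
    p = signal_probability(node, input_probs)
    return 2 * p * (1 - p)


# Hypothetical tree network: (a AND b) OR (NOT c), every input 1 with probability 1/2.
tree = ("OR", ("AND", "a", "b"), ("NOT", "c"))
probs = {name: Fraction(1, 2) for name in "abc"}
print(signal_probability(tree, probs), switching_activity(tree, probs))
```

Multiplying child probabilities is valid here only because the children of every gate depend on disjoint sets of inputs. With reconvergent fanout that independence no longer holds, which is the difficulty the paragraph above describes.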
`
`Statistical Variation of Circuit Parameters
In real networks, statistical perturbations of circuit parameters may change the
propagation delays and produce changes in the number of transitions because of the
appearance or disappearance of hazards. It is therefore useful to determine the
change in the signal transition count as a function of these statistical perturbations.
Variation of gate-delay parameters may change the number of hazards occurring
during a transition as well as their duration. For this reason, it is expected that the
hazardous component of power dissipation is more sensitive to integrated-circuit
(IC) parameter fluctuations than the power required to perform the transition
between the initial and final states of each node.
`
`POWER ESTIMATION TECHNIQUES
`
Low-power design cannot be achieved without accurate power prediction and
optimization tools or without power-efficient gate and module libraries. Therefore,
there is a critical need for computer-aided design (CAD) tools that estimate power
dissipation during the design process, so that the power budget can be met without
a costly redesign effort, and that enable efficient design and characterization of the
design libraries.
In the following section, various techniques for power estimation at the
circuit, logic, and behavioral levels will be reviewed. These techniques are divided into
two general categories: simulation based and nonsimulation based.
`
`Simulative Approaches
`
`Brute-Force Simulation
`
Circuit simulation-based techniques (14,15) simulate the circuit with a
representative set of input vectors. They are accurate and capable of handling various device
models, different circuit design styles, single-phase and multiphase clocking
methodologies, tristate drives, and so forth. However, they suffer from memory and
execution time constraints and are not suitable for large, cell-based designs. In
addition, it is difficult to generate a compact stimulus vector set to calculate
accurate activity factors at the circuit nodes. The size of such a vector set is dependent
on the application and the system environment (16).
`PowerMill (17) is a transistor-level power simulator and analyzer which ap-
`plies an event-driven timing simulation algorithm (based on simplified table-driven
`device models, circuit partitioning, and single-step nonlinear iteration) to increase
`the speed by two to three orders of magnitude over SPICE.
Switch-level simulation techniques are, in general, much faster than
circuit-level simulation techniques but are not as accurate or versatile. Standard
switch-level simulators [such as IRSIM (18)] can be easily modified to report the switched
capacitance (and thus dynamic power dissipation) during a simulation run.
The Verilog-XL logic simulator is a Verilog-based gate-level simulation
program that relies on the accuracy of the macromodels built for the gates in the
application-specific IC (ASIC) library as well as gate-level timing analysis to
produce fast and accurate power estimates. The accuracy depends heavily on the quality
of the macromodels, the glitch filtering scheme used, and the accuracy of physical
capacitances provided at the gate level. The speed is three to four orders of
magnitude faster than SPICE.
Most of the high-level power-prediction tools use profiling and simulation to
address data dependencies. Important statistics include the number of operations of
a given type, the number of bus, register, and memory accesses, and the number of
I/O operations executed within a given period (19,20). Instruction-level simulation
or behavioral simulators are easily (and have indeed been) adapted to produce this
information.
`
`Hierarchical Simulation
`
A simulation method based on a hierarchy of simulators is presented in Ref. 21.
The idea is to use a hierarchy of power simulators (e.g., at architectural, gate level,
and circuit level) to achieve a reasonable accuracy and efficiency tradeoff. Another
good example is Entice-Aspen (22). This power analysis system consists of two
components: Aspen, which computes the circuit activity information, and Entice,
which computes the power characterization data. A stimulus file is to be supplied to
Entice, where power and timing delay vectors are specified. The set of power vectors
discretizes all possible events in which power can be dissipated by the cell. With the
relevant parameters set according to the user’s specs, a SPICE circuit simulation is
invoked to accurately obtain the power dissipation of each vector. During logic
simulation, Aspen monitors the transition count of each cell and computes the total
power consumption as the sum of the power dissipation for all cells in the power
vector path.
`
`Monte Carlo Simulation
`
A Monte Carlo simulation approach for power estimation which alleviates the input
pattern dependence problem has been proposed in Ref. 12. This approach consists
of applying randomly generated input patterns at the circuit inputs and monitoring
the power dissipation per time interval T using a simulator. Based on the assumption
that the power consumed by the circuit over any period T has a normal distribution,
and for a desired percentage error in the power estimate and a given confidence
level, the number of required power samples is estimated. The designer can use an
existing simulator (circuit level, gate level, or behavioral) in the inner loop of the
Monte Carlo program, thus trading accuracy for higher efficiency. The convergence
time for this approach is fast when estimating the total power consumption of
the circuit. However, when signal probability (or power consumption) values on
individual lines of the circuit are required, the convergence rate is very slow (23).
The method does not handle spatial correlations at the circuit inputs.
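
A minimal sketch of the Monte Carlo idea described above follows. It is not the procedure of Ref. 12: it uses a generic normal-approximation stopping rule (stop when the confidence-interval half-width falls below a chosen fraction of the running mean), and the "simulator" is a hypothetical stand-in that counts output toggles of a single two-input AND gate under random inputs. All names, the 5% error target, the value z = 1.96 (a 95% confidence level), and the sample limits are assumptions made for illustration.

```python
import random
import statistics


def monte_carlo_power(sample_power, rel_error=0.05, z=1.96,
                      min_samples=30, max_samples=100_000):
    """Draw power samples until the normal-approximation confidence interval is tight.

    Stops when z * s / sqrt(n) <= rel_error * mean, or when max_samples is reached.
    Returns the estimated mean and the number of samples used.
    """
    samples = [sample_power() for _ in range(min_samples)]
    while len(samples) < max_samples:
        mean = statistics.fmean(samples)
        half_width = z * statistics.stdev(samples) / len(samples) ** 0.5
        if half_width <= rel_error * mean:
            break
        samples.append(sample_power())
    return statistics.fmean(samples), len(samples)


def make_toy_sampler(rng, cycles=50):
    """Hypothetical stand-in for a simulator: toggles per cycle of one AND gate."""
    def sample():
        prev = rng.getrandbits(1) & rng.getrandbits(1)
        toggles = 0
        for _ in range(cycles):
            cur = rng.getrandbits(1) & rng.getrandbits(1)
            toggles += cur != prev  # count an output transition
            prev = cur
        return toggles / cycles
    return sample


estimate, used = monte_carlo_power(make_toy_sampler(random.Random(12345)))
print(f"estimated activity per cycle: {estimate:.3f} using {used} samples")
```

For this toy gate the exact activity is 3/8, so the estimate should land close to 0.375. In a real flow the inner sampler would be replaced by a circuit-, gate-, or behavioral-level simulator returning the power measured over one interval T.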
`
`Nonsimulative Approaches
`
`Behavioral Level
For functional units (adders, multipliers, and registers) or for memories, power
estimates are directly obtained from the design library whereby each functional unit
has been simulated using pseudorandom white noise data and the average switched
capacitance per clock cycle has been calculated and stored in the library.
The power model for a functional unit may be parametrized in terms of its
input bit width. For example, the power dissipation of an adder (or a multiplier) is
linearly (or quadratically) dependent on its input bit width. The library thus
contains interface descriptions of each module, description of its parameters, its area,
delay, and internal power dissipation (assuming pseudorandom white noise data
inputs). The latter is determined by extracting a circuit- or logic-level model from
the layout or logic-level descriptions of the module, simulating it using a long stream
of randomly generated input patterns and calculating the average power dissipation
per pattern. These characteristics are available in terms of the parameter values
(i.e., equations) or in the form of tables. Multiparameter modules are characterized
with respect to all the parameters, yielding a multiparameter equation or table.
Multifunction modules (e.g., Arithmetic Logic Unit (ALU)) are characterized for
each function separately.
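
The bit-width parametrization described above can be sketched as a tiny library lookup. The coefficients below are hypothetical placeholders standing in for values that would be obtained by characterizing each module with random input patterns; the function names, the dictionary layout, and the operating point are likewise assumptions made for this example.

```python
# Hypothetical characterization data: average switched capacitance per clock
# cycle (in farads) per bit for the adder and per bit squared for the multiplier.
LIBRARY = {
    "adder": {"coefficient_f": 50e-15, "exponent": 1},       # linear in bit width
    "multiplier": {"coefficient_f": 30e-15, "exponent": 2},  # quadratic in bit width
}


def switched_capacitance(module, bit_width):
    """Average switched capacitance per cycle for a module of the given bit width."""
    entry = LIBRARY[module]
    return entry["coefficient_f"] * bit_width ** entry["exponent"]


def module_power(module, bit_width, vdd_v, f_clk_hz):
    """Power from switched capacitance: P = 0.5 * C_sw * Vdd^2 * f_clk."""
    return 0.5 * switched_capacitance(module, bit_width) * vdd_v ** 2 * f_clk_hz


# Hypothetical operating point: 3.3 V supply, 50 MHz clock.
for width in (8, 16, 32):
    adder_mw = module_power("adder", width, 3.3, 50e6) * 1e3
    mult_mw = module_power("multiplier", width, 3.3, 50e6) * 1e3
    print(f"{width:2d} bits: adder {adder_mw:.3f} mW, multiplier {mult_mw:.3f} mW")
```

Doubling the bit width doubles the adder estimate and quadruples the multiplier estimate. These figures assume white noise inputs, so they still need to be adjusted for the actual input activities.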
The power model thus generated and stored for each module in the library has
to be “conditioned” or “modulated” by the actual input switching activities in order
to provide power estimates which are sensitive to the input activities. In Refs. 20
and 24b, the model consists of a single physical capacitance value and a single
switching activity value which represents the average switching activity on each
input bit. In Ref. 24a, a more detailed model is presented, where it is projected that
data in the datapath of a digital system can be divided into two regions: the least
significant bits (LSB), which act as uncorrelated white noise, and the most
significant bits (MSB), which correspond to sign bits and exhibit strong temporal
dependence. The power model thus uses two capacitance values and requires two input
switching activity values corresponding to the LSB and MSB regions. Both models
ignore the spatial correlations among bits of the same input or across bits of
different inputs.
Another parametric model is described in Ref. 25, where the power dissipation
of the various components of a typical processor architecture is expressed as a
function of a set of primary parameters. The technique suffers from an abundance of
parameters, requires a lot of fine-tuning for specific architectures, and is sensitive
to mismatches in the modeling assumptions.
`Word-level behavior of a data input can be properly captured by its probabil-
`ity density function (pdf). Similarly, spatial correlation between two data inputs can
be captured by their joint pdf. This observation is used in Refs. 26 and 27 to develop
`a probabilistic technique for behavioral-level power prediction which consists of
`four steps: (1) building the joint pdf of the input variables of a data flow graph
`(DFG) based on the given input vectors, (2) computing the joint pdf for any combi-
`nation of internal arcs in the DFG, (3) calculating the switching activity a