`
COMPUTER SCIENCE
AND TECHNOLOGY

EXECUTIVE EDITORS

Allen Kent
James G. Williams

UNIVERSITY OF PITTSBURGH
PITTSBURGH, PENNSYLVANIA

ADMINISTRATIVE EDITOR

Carolyn M. Hall

ARLINGTON, TEXAS

VOLUME 36

SUPPLEMENT 21

MARCEL DEKKER, INC.

NEW YORK - BASEL - HONG KONG
Exhibit 1017
Apple v. Qualcomm
IPR2018-01249
`
COPYRIGHT © 1997 BY MARCEL DEKKER, INC.
ALL RIGHTS RESERVED

Neither this book nor any part may be reproduced or
transmitted in any form or by any means, electronic
or mechanical, including photocopying, microfilming,
and recording, or by any information storage and
retrieval system, without permission in writing from
the publisher.

MARCEL DEKKER, INC.
270 Madison Avenue, New York, New York 10016

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 74-29436
ISBN: 0-8247-2289-2

Current Printing (last digit)
10 9 8 7 6 5 4 3 2 1

PRINTED IN UNITED STATES OF AMERICA
`
`
`
DESIGN TECHNOLOGIES FOR LOW-POWER VLSI

MOTIVATION

In the past, the major concerns of the very-large-scale integration (VLSI) designer
were area, performance, cost, and reliability; power considerations were mostly of
secondary importance. In recent years, however, this has begun to change and,
increasingly, power is being given comparable weight to area and speed
considerations. Several factors have contributed to this trend. Perhaps the primary
driving factor has been the remarkable success and growth of the class of personal
computing devices (portable desktops, audio- and video-based multimedia products)
and wireless communications systems (personal digital assistants and personal
communicators), which demand high-speed computation and complex functionality with
low power consumption.
In these applications, average power consumption is a critical design concern.
The projected power budget for a battery-powered, A4-format, portable multimedia
terminal, when implemented using off-the-shelf components not optimized for
low-power operation, is about 40 W. With advanced nickel-metal-hydride (secondary)
battery technologies offering around 65 W hr/kg (1), this terminal would require an
unacceptable 6 kg of batteries for 10 hr of operation between recharges. Even with
new battery technologies such as rechargeable lithium ion or lithium polymer cells,
it is anticipated that the specific energy of batteries will increase only to about
90-110 W hr/kg over the next 5 years (1), which still leads to an unacceptable
3.6-4.4 kg of battery cells. In the absence of low-power design techniques, current
and future portable devices will suffer from either a very short battery life or a
very heavy battery pack.
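The battery weights quoted above follow from dividing the required energy (power times operating time) by the specific energy of the battery. The short Python sketch below reproduces that arithmetic; the function name `battery_mass_kg` is an illustrative choice and does not come from the text.

```python
def battery_mass_kg(power_w, hours, specific_energy_wh_per_kg):
    """Battery mass needed to supply power_w for the given number of hours."""
    return power_w * hours / specific_energy_wh_per_kg


# Figures quoted in the text: a 40 W terminal running 10 hr between recharges,
# with batteries offering 65, 90, or 110 W hr/kg.
for wh_per_kg in (65, 90, 110):
    mass = battery_mass_kg(40, 10, wh_per_kg)
    print(f"{wh_per_kg} W hr/kg -> {mass:.1f} kg")
```

The three cases give roughly 6.2, 4.4, and 3.6 kg, which matches the 6 kg and 3.6-4.4 kg figures in the text.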
`
There also exists strong pressure for producers of high-end products to reduce
their power consumption. Contemporary performance-optimized microprocessors
dissipate as much as 15-30 W at 100-200 MHz clock rates (2)! In the future, it can
be extrapolated that a 10-cm² microprocessor, clocked at 500 MHz (which is a not
too aggressive estimate for the next decade), would consume about 300 W. The cost
associated with packaging and cooling such devices is prohibitive. Because core
power consumption must be dissipated through the packaging, increasingly expensive
packaging and cooling strategies are required as chip power consumption increases.
Consequently, there is a clear financial advantage to reducing the power
consumed in high-performance systems.
In addition to cost, there is the issue of reliability. High-power systems often
run hot, and high temperature tends to exacerbate several silicon failure
mechanisms. Every 10°C increase in operating temperature roughly doubles a
component’s failure rate (3). In this context, peak power (maximum possible power
dissipation) is a critical design factor, as it determines the thermal and
electrical limits of designs; impacts the system cost, size, and weight; dictates
the specific battery type, component and system packaging, and heat sinks; and
aggravates the resistive and inductive voltage-drop problems. It is therefore
essential to have the peak power under control.
`
Another crucial driving factor is that excessive power consumption is becoming
the limiting factor in integrating more transistors on a single chip or on a
multiple-chip module. Unless power consumption is dramatically reduced, the
resulting heat will limit the feasible packing and performance of VLSI circuits and
systems.
From the environmental viewpoint, the smaller the power dissipation of electronic
systems, the less heat is pumped into rooms, the less electricity is consumed and,
hence, the lower the impact on the global environment, the lower the office noise
(e.g., because a fan can be eliminated from the desktop), and the less stringent
the office power-delivery and heat-removal requirements.
The motivations for reducing power consumption differ from application to
application. In the class of micropowered battery-operated, portable applications,
such as cellular phones and personal digital assistants, the goal is to keep the
battery lifetime and weight reasonable and the packaging cost low. Power levels
below 1-2 W, for instance, enable the use of inexpensive plastic packages. For
high-performance, portable computers, such as laptop and notebook computers, the
goal is to reduce the power dissipation of the electronics portion of the system to
a point which is about half of the total power dissipation (including that of
display and hard disk). Finally, for high-performance, nonbattery-operated systems,
such as workstations, desktop computers, and multimedia digital signal processors,
the overall goal of power minimization is to reduce system cost (cooling, packaging,
and energy bill) while ensuring long-term device reliability. These different
requirements impact how power optimization is addressed and how much the designer is
willing to sacrifice in cost or performance to obtain lower power dissipation.
The next question is which objective function to minimize during
low-power design. The answer varies from one application domain to the next. If
extending the battery life is the only concern, then the energy (i.e., the
power-delay product) should be minimized. In this case, the battery consumption is
minimized even though an operation may take a very long time. On the other hand, if
both the battery life and the circuit delay are important, then the energy-delay
product must be minimized (4). In this case, one can alternatively minimize the
energy/delay ratio (i.e., the power) subject to a delay constraint. In most design
scenarios, the circuit delay is set based on system-level considerations, and hence
during circuit optimization, one minimizes power under user-specified timing
constraints.
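The three objectives named in this paragraph can be written side by side. The symbols $P$ (average power), $t_d$ (delay of an operation), $E$ (energy per operation), and $T_{max}$ (the delay bound) are notation assumed here for illustration; the text itself does not fix symbols for them.

```latex
\begin{align*}
\text{energy (power-delay product):}\quad & E = P\, t_d \\
\text{energy-delay product:}\quad & E\, t_d = P\, t_d^{2} \\
\text{power under a delay constraint:}\quad & \min\; P = E / t_d
  \quad \text{subject to}\quad t_d \le T_{max}
\end{align*}
```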
`
SOURCES OF POWER DISSIPATION

Power dissipation in Complementary Metal-Oxide-Semiconductor (CMOS) circuits is
caused by three sources: (1) the leakage current, which is primarily determined by
the fabrication technology and consists of the reverse-bias current in the parasitic
diodes formed between source and drain diffusions and the bulk region in a MOS
transistor as well as the subthreshold current that arises from the inversion charge
that exists at gate voltages below the threshold voltage; (2) the short-circuit
(rush-through) current, which is due to the DC path between the supply rails during
output transitions; and (3) the charging and discharging of capacitive loads during
logic changes.
The diode leakage occurs when a transistor is turned off and another active
transistor charges up or down the drain with respect to the first transistor’s bulk
potential. The resulting current is proportional to the area of the drain diffusion
and the leakage current density. The diode leakage is typically 1 pA for a 1-µm
minimum feature size. The subthreshold leakage current for long-channel devices
increases linearly with the ratio of the channel width over channel length and
depends exponentially on VGS - Vt, where VGS is the gate bias and Vt is the
threshold voltage, so that it falls off rapidly as VGS drops below Vt. Several
hundred millivolts of “off bias” (say, 300-400 mV) typically reduces the
subthreshold current to negligible values. With reduced power supply and device
threshold voltages, the subthreshold current will, however, become more pronounced.
In addition, at short channel lengths, the subthreshold current also becomes
exponentially dependent on drain voltage VD, instead of being independent
of VDS (see Ref. 5 for a recent analysis). The subthreshold current will remain
10^3 to 10^5 times smaller than the “on current,” even at submicron device sizes.
The short-circuit (crowbar current) power consumption for an inverter gate is
proportional to the gain of the inverter, the cubic power of supply voltage minus
device threshold, the input rise/fall time, and the operating frequency (6). The
maximum short-circuit current flows when there is no load; this current decreases
with the load. If gate sizes are selected so that the input and output rise/fall times
are about equal, the short-circuit power consumption will be less than 15% of the
dynamic power consumption. If, however, the design for high performance is taken
to the extreme where large gates are used to drive relatively small loads, then there
will be a stiff penalty in terms of short-circuit power consumption.
The short-circuit and leakage currents in CMOS circuits can be made small
with proper circuit and device design techniques. The dominant source of power
dissipation is thus the charging and discharging of the node capacitances (also
referred to as the dynamic power dissipation), which is given by

$$P = 0.5\, C\, V_{dd}^{2}\, E(sw)\, f_{clk} \qquad [1]$$

where C is the physical capacitance of the circuit, Vdd is the supply voltage, E(sw)
(referred to as the switching activity) is the average number of transitions in the
circuit per 1/fclk time, and fclk is the clock frequency.
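As a quick numerical check of Eq. [1], the following Python sketch evaluates the dynamic power for one operating point and for the same circuit at half the supply voltage. The capacitance, activity, and frequency values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def dynamic_power(c_farads, vdd_volts, e_sw, f_clk_hz):
    """Eq. [1]: P = 0.5 * C * Vdd^2 * E(sw) * fclk, in watts."""
    return 0.5 * c_farads * vdd_volts ** 2 * e_sw * f_clk_hz


# Hypothetical operating point (not taken from the text): 1 nF of physical
# capacitance, an average of 0.5 transitions per clock period, 100 MHz clock.
p_5v = dynamic_power(c_farads=1e-9, vdd_volts=5.0, e_sw=0.5, f_clk_hz=100e6)
p_2v5 = dynamic_power(c_farads=1e-9, vdd_volts=2.5, e_sw=0.5, f_clk_hz=100e6)

print(f"P at 5.0 V: {p_5v:.3f} W")
print(f"P at 2.5 V: {p_2v5:.3f} W")
print(f"ratio: {p_5v / p_2v5:.1f}")
```

Because Vdd enters Eq. [1] squared, halving the supply voltage in this sketch divides the dynamic power by four while the other three factors are held fixed.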
`
`LOW-POWER DESIGN SPACE
`
`The previous section revealed the three degrees of freedom inherent in the low-
`power design space: voltage, physical capacitance, and data activity. Optimizing for
`power entails an attempt to reduce one or more of these factors. This section briefly
`discusses each of these factors, describing their relative importance, as well as the
`interactions that complicate the power optimization process.
`
`Voltage
`
Because of its quadratic relationship to power, voltage reduction offers the most
effective means of minimizing power consumption. Without requiring any special
circuits or technologies, a factor of 2 reduction in supply voltage yields a factor
of 4 decrease in power consumption. Furthermore, this power reduction is a global
effect, experienced not only in one subcircuit or block of the chip but throughout
the entire design. Because of these factors, designers are often willing to accept
increased physical capacitance or circuit activity in exchange for reduced voltage.
Unfortunately, we pay a speed penalty for supply voltage reduction, with delays
drastically increasing as Vdd approaches the threshold voltage Vt of the devices.
This tends to limit the useful range of Vdd to a minimum of about (2-3)Vt.
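The trade-off described above can be illustrated numerically. The text gives no delay formula, so the sketch below assumes a common first-order (long-channel, square-law) approximation in which gate delay is proportional to Vdd / (Vdd - Vt)^2; both that model and the 0.7 V threshold and 5 V reference supply are assumptions made here for illustration only.

```python
def relative_power(vdd, vdd_ref):
    """Dynamic power at vdd relative to vdd_ref (quadratic dependence on Vdd)."""
    return (vdd / vdd_ref) ** 2


def relative_delay(vdd, vt, vdd_ref):
    """Gate delay at vdd relative to vdd_ref.

    ASSUMPTION: first-order model, delay proportional to Vdd / (Vdd - Vt)^2.
    This model is not given in the text.
    """
    if vdd <= vt or vdd_ref <= vt:
        raise ValueError("the assumed model is only meaningful for Vdd > Vt")

    def delay(v):
        return v / (v - vt) ** 2

    return delay(vdd) / delay(vdd_ref)


VT = 0.7       # hypothetical threshold voltage, volts
VDD_REF = 5.0  # hypothetical reference supply, volts

for vdd in (5.0, 3.3, 2.1, 1.4, 1.0):
    print(f"Vdd = {vdd:.1f} V ({vdd / VT:.1f} Vt): "
          f"power x{relative_power(vdd, VDD_REF):.3f}, "
          f"delay x{relative_delay(vdd, VT, VDD_REF):.1f}")
```

Under this assumed model the power savings grow steadily as Vdd is lowered, while the delay penalty grows slowly at first and then very quickly once Vdd falls below a few multiples of Vt, which is the behavior the text describes.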
In Ref. 7, an architecture-driven voltage scaling strategy is presented in which
parallel and pipelined architectures are used to compensate for the increased gate
delays at reduced supply voltages and meet throughput constraints. Another
approach to reduce the supply voltage without loss in throughput is to modify the Vt
of the devices. Reducing the Vt allows the supply voltage to be scaled down without
loss in speed. The limit of how low the Vt can go is set by the requirement to
maintain adequate noise margins and to control the increase in subthreshold leakage
currents. The optimum Vt must be determined based on the current drives at
low-supply-voltage operation and control of the leakage currents. Because the
inverse subthreshold slope (S) of a MOS Field-Effect Transistor (MOSFET) is
invariant with scaling, for every 80-100 mV (based on the operating temperature)
reduction in Vt, the standby current will be increased by one order of magnitude.
This tends to limit Vt to about 0.3 V for room-temperature operation of CMOS
circuits. Another important concern in the low-Vdd, low-Vt regime is the
fluctuation in Vt. Basically, delay increases by 3x for a delta Vt of ±0.15 V at
Vdd of 1 V. This is a major limitation on how low Vdd can go unless the Vt
fluctuation is cancelled by circuit techniques such as the self-adjusting threshold
scheme, which will reduce the Vt fluctuation to ±0.05 V at Vdd of 1 V (8).
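The rule that the standby current grows by one decade for every 80-100 mV of threshold reduction can be turned into a one-line estimate. The sketch below is only a restatement of that rule; the function name and the example threshold shift are illustrative assumptions.

```python
def standby_current_multiplier(delta_vt_mv, slope_mv_per_decade):
    """Factor by which standby current grows when Vt is lowered by delta_vt_mv.

    slope_mv_per_decade is the inverse subthreshold slope S; the text quotes
    80-100 mV per decade depending on operating temperature.
    """
    return 10.0 ** (delta_vt_mv / slope_mv_per_decade)


# Hypothetical example: lowering Vt by 200 mV at the two ends of the quoted range.
for s in (80.0, 100.0):
    factor = standby_current_multiplier(200.0, s)
    print(f"S = {s:.0f} mV/decade: standby current x{factor:.0f}")
```

Even a modest threshold reduction therefore costs two or more orders of magnitude in standby current, which is why leakage control bounds how far Vt can be lowered.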
`
`Physical Capacitance
`
`Dynamic power consumption depends linearly on the physical capacitance being
`switched. So, in addition to operating at low voltages, minimizing capacitances
`offers another technique for minimizing power consumption. In order to consider
`this possibility we must first understand what factors contribute to the physical
`capacitance of a circuit.
`Power dissipation is dependent on the physical capacitances seen by individual
`gates in the circuit. Estimating this capacitance at the behavioral or logical levels of
`abstraction is difficult and imprecise, as it requires estimation of the load capaci-
`tances from structures which are not yet mapped to gates in a cell library; this
`calculation can, however, be done easily after technology mapping by using the
`logic and delay information from the library.
`Interconnect plays an increasing role in determining the total chip area, delay,
`and power dissipation, and, hence, must be accounted for as early as possible during
`the design process. The interconnect capacitance estimation is, however, a difficult
`task even after technology mapping, due to lack of detailed place and route infor-
`mation. Approximate estimates can be obtained by using information derived from
`a companion placement solution (9) or by using stochastic/procedural interconnect
`models (10). Interconnect capacitance estimation after layout is straightforward
`and, in general, accurate.
With this understanding, we can now consider how to reduce physical capacitance.
From the previous discussion, we recognize that capacitances can be kept at a
minimum by using less logic, smaller devices, and fewer and shorter wires. Example
techniques for reducing the active area include resource sharing, logic minimization,
and gate sizing. Example techniques for reducing the interconnect include register
sharing, common subfunction extraction, placement, and routing. As with voltage,
however, we are not free to optimize capacitance independently. For example,
reducing device sizes reduces physical capacitance, but it also reduces the current
drive of the transistors, making the circuit operate more slowly. This loss in
performance might prevent us from lowering Vdd as much as we might otherwise be
able to do.
`
`Switching Activity
`
In addition to voltage and physical capacitance, switching activity also influences
dynamic power consumption. A chip may contain an enormous amount of physical
capacitance, but if there is no switching in the circuit, then no dynamic power will
be consumed. The data activity determines how often this switching occurs. There
are two components to switching activity: fclk, which determines the average
periodicity of data arrivals, and E(sw), which determines how many transitions each
arrival will generate. For circuits that do not experience glitching, E(sw) can be
interpreted as the probability that a power-consuming transition will occur during a
single data period. Even for these circuits, the calculation of E(sw) is difficult,
as it depends not only on the switching activities of the circuit inputs and the
logic function computed by the circuit but also on the spatial and temporal
correlations among the circuit inputs. The data activity inside a 16-bit multiplier
may change by as much as one order of magnitude as a function of input
correlations (11).
For certain logic styles, however, glitching can be an important source of
signal activity and, therefore, deserves some mention here. Glitching refers to
spurious and unwanted transitions that occur before a node settles down to its final
steady-state value. Glitching often arises when paths with unbalanced propagation
delays converge at the same point in the circuit. Because glitching can cause a
node to make several power-consuming transitions, it should be avoided whenever
possible.
The data activity E(sw) can be combined with the physical capacitance C to obtain
the switched capacitance, Csw = C E(sw), which describes the average capacitance
charged during each data period 1/fclk. It should be noted that it is the switched
capacitance that determines the power consumed by a CMOS circuit.
`
`Calculation of Switching Activity
`
`Calculation of the switching activity in a logic circuit is difficult, as it depends on a
`number of circuit parameters and technology-dependent factors which are not readily
`available or precisely characterized. Some of these factors are described next.
`
`Input Pattern Dependence
`
Switching activity at the output of a gate depends not only on the switching
activities at the inputs and the logic function of the gate but also on the spatial
and temporal dependencies among the gate inputs. For example, consider a two-input
AND gate g with independent inputs i and j whose signal probabilities are 1/2; then
Eg(sw) = 3/8. This holds because in 6 out of 16 possible input transitions, the
output of the two-input AND gate makes a transition. Now suppose it is known that
only patterns 00 and 11 can be applied to the gate inputs and that both patterns are
equally likely; then Eg(sw) = 1/2. Alternatively, assume that it is known that every
0 applied to input i is immediately followed by a 1, whereas every 1 applied to
input j is immediately followed by a 0; then Eg(sw) = 4/9. Finally, assume that it
is known that i changes exactly if j changes value; then Eg(sw) = 1/4. The first
case is an example of spatial correlations between gate inputs, the second case
illustrates temporal correlations on gate inputs, and the third case describes an
instance of spatiotemporal correlations.
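The four values in this example can be checked by enumerating pairs of consecutive input patterns and weighting each pair by its probability. The Python sketch below does this with exact fractions. Where the text leaves a probability unspecified (how input i behaves after a 1, how input j behaves after a 0, and how often the two inputs flip together in the last case), a fair coin flip is assumed; those choices are assumptions of this sketch, as are all function and variable names.

```python
from fractions import Fraction
from itertools import product

STATES = list(product((0, 1), repeat=2))  # joint input patterns (i, j)
ZERO, QUARTER, HALF = Fraction(0), Fraction(1, 4), Fraction(1, 2)


def and_gate_activity(stationary, transition):
    """Probability that the AND output differs between consecutive cycles.

    stationary[s]    : probability that the current input pattern is s
    transition(s, t) : probability that pattern t follows pattern s
    """
    total = ZERO
    for s, t in product(STATES, repeat=2):
        if (s[0] & s[1]) != (t[0] & t[1]):
            total += stationary[s] * transition(s, t)
    return total


# Independent, equiprobable inputs with no memory between cycles.
uniform = {s: QUARTER for s in STATES}
independent = and_gate_activity(uniform, lambda s, t: QUARTER)

# Spatial correlation: only 00 and 11 occur, equally likely, no memory.
only_equal = {s: (HALF if s[0] == s[1] else ZERO) for s in STATES}
spatial = and_gate_activity(
    only_equal, lambda s, t: HALF if t[0] == t[1] else ZERO)


# Temporal correlation: a 0 on i is always followed by 1, and a 1 on j is
# always followed by 0. ASSUMED: the unconstrained moves are fair coin flips.
def next_i(prev, nxt):
    return Fraction(int(nxt == 1)) if prev == 0 else HALF


def next_j(prev, nxt):
    return Fraction(int(nxt == 0)) if prev == 1 else HALF


p_i = {0: Fraction(1, 3), 1: Fraction(2, 3)}  # stationary distribution of i
p_j = {0: Fraction(2, 3), 1: Fraction(1, 3)}  # stationary distribution of j
temporal = and_gate_activity(
    {s: p_i[s[0]] * p_j[s[1]] for s in STATES},
    lambda s, t: next_i(s[0], t[0]) * next_j(s[1], t[1]))


# Spatiotemporal correlation: i changes exactly when j changes.
# ASSUMED: both inputs flip together with probability 1/2 in each cycle.
def flip_together(s, t):
    flipped = (1 - s[0], 1 - s[1])
    return HALF if t in (s, flipped) else ZERO


spatiotemporal = and_gate_activity(uniform, flip_together)

print(independent, spatial, temporal, spatiotemporal)
```

Under these assumptions the enumeration reproduces the values 3/8, 1/2, 4/9, and 1/4 quoted in the text, which shows how strongly the same gate's activity depends on the statistics of its inputs.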
`
The straightforward approach of estimating power by using a simulator is
greatly complicated by this pattern-dependence problem.
It is clearly infeasible to estimate the power by exhaustive simulation of the
circuit. Recent techniques overcome this difficulty by using probabilities that
describe the set of possible logic values at the circuit inputs and developing
mechanisms to calculate these probabilities for gates inside the circuit.
Alternatively, exhaustive simulation may be replaced by Monte Carlo simulation with
well-defined stopping criteria for specified relative or absolute error in power
estimates for a given confidence level (12).
`
`Delay Model
`
Based on the delay model used, the power estimation techniques could account
for steady-state transitions (which consume power but are necessary to perform a
computational task) and/or hazards and glitches (which dissipate power without
doing any useful computation). Sometimes, the first component of power consumption
is referred to as the functional activity, whereas the latter is referred to as the
spurious activity. It is shown in Ref. 13 that the mean value of the ratio of the
hazardous component to the total power dissipation varies significantly with the
considered circuits (from 9% to 38% in random logic circuits) and that the spurious
power dissipation cannot be neglected in CMOS circuits. The spurious activity is
much higher in certain data path modules (such as adders and multipliers). Indeed,
in a 32-bit pipelined multiplier, the power dissipation due to hazard activity is
three to four times higher than that due to functional activity! The spurious power
dissipation is likely to become even more important in future scaled technologies.
`Current power estimation techniques often handle both zero-delay (nonglitch)
`and real-delay models. In the first model, it is assumed that all changes at the circuit
`inputs propagate through the internal gates of the circuits instantaneously. The
`latter model assigns to each gate in the circuit a finite delay and can thus account
`for the hazards in the circuit. A real-delay model significantly increases the compu-
`tational requirements of the power estimation techniques while improving the accu-
`racy of the estimates.
`Calculation of the spurious activity in a circuit is, in general, very difficult and
`requires careful logic- and/or circuit-level characterization of the gates in a library
`as well as detailed knowledge of the circuit structure.
`
`Logic Function
`
Switching activity at the output of a logic gate is also strongly dependent on the
Boolean function of the gate itself. This is because the logic function of a gate
determines the probability that the present value of the gate output is different
from its previous value. For example, under the assumption that the input signals
are uncorrelated, switching activity at the output of a (static) two-input NAND or
NOR gate is 3/8, whereas that at the output of a two-input XOR gate is 1/2. Indeed,
switching activity at the output of a K-input NAND or NOR gate approaches 1/2^(K-1)
for large K, whereas that for a K-input XOR gate remains at 1/2.
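These figures follow from the signal probability of the gate output. If p is the probability that the output is 1 and consecutive input patterns are independent, the output changes with probability 2p(1 - p). The sketch below, which assumes equiprobable and uncorrelated inputs and uses illustrative function names, obtains p by enumerating the truth table.

```python
from fractions import Fraction
from itertools import product


def static_activity(gate, k):
    """Output switching activity of a static k-input gate.

    ASSUMES equiprobable, spatially and temporally uncorrelated inputs, so the
    activity is 2 * p * (1 - p), where p is the probability of a 1 at the output.
    """
    ones = sum(gate(bits) for bits in product((0, 1), repeat=k))
    p = Fraction(ones, 2 ** k)
    return 2 * p * (1 - p)


def nand(bits):
    return int(not all(bits))


def nor(bits):
    return int(not any(bits))


def xor(bits):
    return sum(bits) % 2


for k in (2, 4, 8):
    print(k, static_activity(nand, k), static_activity(nor, k),
          static_activity(xor, k), Fraction(1, 2 ** (k - 1)))
```

For a K-input NAND the output is 0 for only one of the 2^K input patterns, so the activity is 2 (1/2^K)(1 - 1/2^K), which tends to 1/2^(K-1) as K grows; the XOR output is 1 for exactly half of the patterns, so its activity stays at 1/2.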
`
`Logic Style
`
Switching activity of the circuits is also a function of the logic style used to
implement the circuit. The functional activity in dynamic circuits is always higher
than that in a static implementation of the same circuit, as all nodes are
precharged to some value (one in N-type dynamic and zero in P-type dynamic) before
the new input data arrives. This effectively increases the number of power-consuming
transitions. For example, under pseudorandom input signals, switching activities of
two-input N-type dynamic NAND, NOR, and XOR gates are 3/2, 1/2, and 1, respectively,
and those of the P-type version of these same gates are 1/2, 3/2, and 1,
respectively. These values should be compared to the switching activities of these
gates in static CMOS, which are 3/8, 3/8, and 1/2, respectively. Note, however, that
the physical capacitance in dynamic logic tends to be smaller than that in static
logic, so the choice between dynamic and static logic implementations is not as
clear-cut as it would be otherwise. Dynamic circuits are also glitch-free!
`
`Circuit Structure
`
The major difficulty in computing the switching activities is the reconvergent nodes.
Indeed, if a network consists of simple gates and has no reconvergent fanout nodes
(i.e., circuit nodes that receive inputs from two paths that fan out from some other
circuit node), then the exact switching activities can be computed during a single
post-order traversal of the network. For networks with reconvergent fanout, the
problem is much more challenging, as internal signals may become strongly correlated
and exact consideration of these correlations cannot be performed with reasonable
computational effort or memory usage. Current power estimation techniques either
ignore these correlations or approximate them, the latter improving accuracy at
the expense of longer run times. Exact methods (i.e., symbolic simulation) have also
been proposed but are impractical due to excessive time and memory requirements.
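For the fanout-free case, the post-order computation can be sketched in a few lines of Python. The sketch assumes a zero-delay model, independent primary inputs, and no temporal correlation, so that a node with signal probability p switches with probability 2p(1 - p). The `Node` class and the small example network are hypothetical and serve only to illustrate the traversal.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                 # "IN", "NOT", "AND", or "OR"
    prob: float = 0.5         # signal probability; used for primary inputs only
    fanin: list = field(default_factory=list)


def signal_probability(node):
    """Probability that the node is 1, computed children-first (post-order).

    ASSUMES a tree-structured network (no reconvergent fanout), so the fanins
    of every gate are statistically independent.
    """
    if node.kind == "IN":
        return node.prob
    probs = [signal_probability(child) for child in node.fanin]
    if node.kind == "NOT":
        return 1.0 - probs[0]
    if node.kind == "AND":
        result = 1.0
        for p in probs:
            result *= p
        return result
    if node.kind == "OR":
        all_zero = 1.0
        for p in probs:
            all_zero *= 1.0 - p
        return 1.0 - all_zero
    raise ValueError(f"unknown gate kind: {node.kind}")


def switching_activity(node):
    """Zero-delay switching activity, assuming no temporal correlation."""
    p = signal_probability(node)
    return 2.0 * p * (1.0 - p)


# Hypothetical network: h = (a AND b) OR (NOT c), with equiprobable inputs.
a, b, c = Node("IN"), Node("IN"), Node("IN")
g = Node("AND", fanin=[a, b])
h = Node("OR", fanin=[g, Node("NOT", fanin=[c])])

print(switching_activity(g), switching_activity(h))
```

If the same input fed two branches that later reconverge, the product rules used here would no longer be exact, because the fanins of the reconvergent gate would be correlated; that is the difficulty the text describes.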
`
`Statistical Variation of Circuit Parameters
`
In real networks, statistical perturbations of circuit parameters may change the
propagation delays and produce changes in the number of transitions because of the
appearance or disappearance of hazards. It is therefore useful to determine the
change in the signal transition count as a function of these statistical
perturbations. Variation of gate-delay parameters may change the number of hazards
occurring during a transition as well as their duration. For this reason, it is
expected that the hazardous component of power dissipation is more sensitive to
integrated-circuit (IC) parameter fluctuations than the power required to perform
the transition between the initial and final states of each node.
`
`POWER ESTIMATION TECHNIQUES
`
Design for low power cannot be achieved without accurate power
prediction and optimization tools or without power-efficient gate and module
libraries. Therefore, there is a critical need for computer-aided design (CAD) tools
that estimate power dissipation during the design process, so that the power budget
can be met without a costly redesign effort, and that enable efficient design
and characterization of the design libraries.
In the following section, various techniques for power estimation at the circuit,
logic, and behavioral levels will be reviewed. These techniques are divided into
two general categories: simulation based and nonsimulation based.
`
Simulative Approaches
`
`Brute-Force Simulation
`
`Circuit simulation-based techniques (14,15) simulate the circuit with a representa-
`tive set of input vectors. They are accurate and capable of handling various device
`models, different circuit design styles, single-phase and multiphase clocking meth-
`odologies, tristate drives, and so forth. However, they suffer from memory and
`execution time constraints and are not suitable for large, cell-based designs. In
`addition, it is difficult to generate a compact stimulus vector set to calculate accu-
`rate activity factors at the circuit nodes. The size of such a vector set is dependent
`on the application and the system environment (16).
PowerMill (17) is a transistor-level power simulator and analyzer which applies
an event-driven timing simulation algorithm (based on simplified table-driven
device models, circuit partitioning, and single-step nonlinear iteration) to increase
the speed by two to three orders of magnitude over SPICE.
Switch-level simulation techniques are, in general, much faster than circuit-level
simulation techniques but are not as accurate or versatile. Standard switch-level
simulators [such as IRSIM (18)] can be easily modified to report the switched
capacitance (and thus dynamic power dissipation) during a simulation run.
The Verilog-XL logic simulator is a Verilog-based gate-level simulation program
that relies on the accuracy of the macromodels built for the gates in the
application-specific IC (ASIC) library as well as gate-level timing analysis to
produce fast and accurate power estimates. The accuracy depends heavily on the
quality of the macromodels, the glitch filtering scheme used, and the accuracy of
physical capacitances provided at the gate level. The speed is three to four orders
of magnitude faster than SPICE.
`
Most of the high-level power-prediction tools use profiling and simulation to
address data dependencies. Important statistics include the number of operations of
a given type, the number of bus, register, and memory accesses, and the number of
I/O operations executed within a given period (19,20). Instruction-level
or behavioral simulators are easily (and have indeed been) adapted to produce this
information.
`
`Hierarchical Simulation
`
A simulation method based on a hierarchy of simulators is presented in Ref. 21.
The idea is to use a hierarchy of power simulators (e.g., at the architectural,
gate, and circuit levels) to achieve a reasonable accuracy and efficiency tradeoff.
Another good example is Entice-Aspen (22). This power analysis system consists of
two components: Aspen, which computes the circuit activity information, and Entice,
which computes the power characterization data. A stimulus file is supplied to
Entice, where power and timing delay vectors are specified. The set of power vectors
discretizes all possible events in which power can be dissipated by the cell. With
the relevant parameters set according to the user’s specs, a SPICE circuit
simulation is invoked to accurately obtain the power dissipation of each vector.
During logic simulation, Aspen monitors the transition count of each cell and
computes the total power consumption as the sum of the power dissipation for all
cells in the power vector path.
`
Monte Carlo Simulation

A Monte Carlo simulation approach for power estimation which alleviates the input
pattern dependence problem has been proposed in Ref. 12. This approach consists
of applying randomly generated input patterns at the circuit inputs and monitoring
the power dissipation per time interval T using a simulator. Based on the assumption
that the power consumed by the circuit over any period T has a normal distribution,
and for a desired percentage error in the power estimate and a given confidence
level, the number of required power samples is estimated. The designer can use an
existing simulator (circuit level, gate level, or behavioral) in the inner loop of the
Monte Carlo program, thus trading accuracy for higher efficiency. The convergence
time for this approach is fast when estimating the total power consumption of
the circuit. However, when signal probability (or power consumption) values on
individual lines of the circuit are required, the convergence rate is very slow (23).
The method does not handle spatial correlations at the circuit inputs.
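The general shape of such a Monte Carlo loop can be sketched as follows. This is a simplified illustration and not the exact procedure of Ref. 12: it uses a sequential stopping rule based on a normal-theory confidence interval, and the "simulator" in the inner loop is a hypothetical stand-in that counts output toggles of a single two-input AND gate driven by random inputs. All names and default values are assumptions of this sketch.

```python
import math
import random


def monte_carlo_power(sample_power, max_rel_error=0.05, z=1.96,
                      min_samples=30, max_samples=100_000, seed=0):
    """Average sample_power(rng) until the confidence interval is tight enough.

    Stops when z * std / sqrt(n) <= max_rel_error * mean, i.e. when the
    estimated relative error at the chosen confidence level (z = 1.96 for
    95% under the normality assumption) falls below max_rel_error.
    Returns (estimated mean, number of samples used).
    """
    rng = random.Random(seed)
    n, mean, m2 = 0, 0.0, 0.0  # Welford running mean and squared deviations
    while n < max_samples:
        x = sample_power(rng)
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n >= min_samples:
            std = math.sqrt(m2 / (n - 1))
            if mean > 0 and z * std / math.sqrt(n) <= max_rel_error * mean:
                break
    return mean, n


def toy_and_gate_sample(rng, cycles=50):
    """HYPOTHETICAL inner-loop simulator: average output toggles per cycle of a
    two-input AND gate over one interval T of `cycles` random input patterns."""
    prev = rng.getrandbits(1) & rng.getrandbits(1)
    toggles = 0
    for _ in range(cycles):
        cur = rng.getrandbits(1) & rng.getrandbits(1)
        toggles += int(cur != prev)
        prev = cur
    return toggles / cycles


estimate, samples_used = monte_carlo_power(toy_and_gate_sample)
print(f"estimated activity per cycle: {estimate:.3f} using {samples_used} samples")
```

In a real flow the toy sampler would be replaced by a call to a circuit-level, gate-level, or behavioral simulator that returns the power measured over one interval T; the estimate for the toy gate should land near the value 3/8 derived earlier for independent inputs.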
`
`Nonsimulative Approaches
`
Behavioral Level

For functional units (adders, multipliers, and registers) or for memories, power
estimates are directly obtained from the design library, in which each functional
unit has been simulated using pseudorandom white noise data and the average switched
capacitance per clock cycle has been calculated and stored.
`The power model for a functional unit may be parametrized in terms of its
`input bit width. For example, the power dissipation of an adder (or a multiplier) is
`linearly (or quadratically) dependent on its input bit width. The library thus con-
`tains interface descriptions of each module, description of its parameters, its area,
`delay, and internal power dissipation (assuming pseudorandom white noise data
`inputs). The latter is determined by extracting a circuit- or logic-level model from
`the layout or logic-level descriptions of the module, simulating it using a long stream
`of randomly generated input patterns and calculating the average power dissipation
`per pattern. These characteristics are available in terms of the parameter values
`(i.e., equations) or in the form of tables. Multiparameter modules are characterized
`with respect to all the parameters, yielding a multiparameter equation or table.
`Multifunction modules (e.g., Arithmetic Logic Unit (ALU)) are characterized for
`each function separately.
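A parametrized library entry of this kind can be sketched as a function of the input bit width. The Python fragment below assumes the simplest possible forms, a switched capacitance that is linear in bit width for an adder and quadratic for a multiplier, and converts it to power using the dynamic power expression of Eq. [1] with the switching activity already folded into the switched capacitance. The coefficients are hypothetical placeholders, not characterized library data.

```python
def adder_switched_cap(bit_width, cap_per_bit):
    """Average switched capacitance per cycle of an adder (linear in width)."""
    return cap_per_bit * bit_width


def multiplier_switched_cap(bit_width, cap_per_bit_squared):
    """Average switched capacitance per cycle of a multiplier (quadratic)."""
    return cap_per_bit_squared * bit_width ** 2


def module_power(switched_cap, vdd, f_clk):
    """P = 0.5 * Csw * Vdd^2 * fclk, where Csw = C * E(sw)."""
    return 0.5 * switched_cap * vdd ** 2 * f_clk


# HYPOTHETICAL coefficients in farads, standing in for values that a real
# library would obtain by simulating each module with white noise inputs.
ADDER_CAP_PER_BIT = 50e-15
MULT_CAP_PER_BIT_SQ = 20e-15

for width in (8, 16, 32):
    p_add = module_power(adder_switched_cap(width, ADDER_CAP_PER_BIT), 3.3, 50e6)
    p_mul = module_power(
        multiplier_switched_cap(width, MULT_CAP_PER_BIT_SQ), 3.3, 50e6)
    print(f"{width:2d} bits: adder {p_add * 1e6:.1f} uW, "
          f"multiplier {p_mul * 1e6:.1f} uW")
```

Doubling the bit width doubles the adder estimate and quadruples the multiplier estimate, which is the scaling behavior the paragraph attributes to the two module types.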
`
The power model thus generated and stored for each module in the library has
to be “conditioned” or “modulated” by the actual input switching activities in order
to provide power estimates which are sensitive to the input activities. In Refs. 20
and 24b, the model consists of a single physical capacitance value and a single
switching activity value which represents the average switching activity on each
input bit. In Ref. 24a, a more detailed model is presented, where it is projected that
data in the datapath of a digital system can be divided into two regions: the least
significant bits (LSB), which act as uncorrelated white noise, and the most
significant bits (MSB), which correspond to sign bits and exhibit strong temporal
dependence. The power model thus uses two capacitance values and requires two input
switching activity values corresponding to the LSB and MSB regions. Both models
ignore the spatial correlations among bits of the same input or across bits of
different inputs.
Another parametric model is described in Ref. 25, where the power dissipation
of the various components of a typical processor architecture is expressed as a
function of a set of primary parameters. The technique suffers from an abundance of
parameters, requires a lot of fine-tuning for specific architectures, and is sensitive
to mismatches in the modeling assumptions.
`Word-level behavior of a data input can be properly captured by its probabil-
`ity density function (pdf). Similarly, spatial correlation between two data inputs can
`be captured by their joint pdf. This observation is used in Refs. 26 and 27 to develop
`a probabilistic technique for behavioral-level power prediction which consists of
`four steps: (1) building t