`
`677
`
`Instance name
`
`in pin-->out pin
`
`tr
`
`total
`
`incr
`
`cell
`
`END OF PATH
`D.a r ff b2
`- -
`-
`INBUF 24
`a 2
`BEGIN OF PATH
`
`PAD--->Y
`
`R
`R
`R
`
`4.52
`4.52
`0.00
`
`0.00
`4.52
`0.00
`
`DFI
`INBUF
`
`---------------------CLOCK to SETUP longest path--------------------(cid:173)
`Rise delay, worst case
`
`Instance name
`
`in pin-->out pin
`
`tr
`
`total
`
`incr
`
`cell
`
`END OF PATH
`D.sel r ff
`I 1 CMS
`I 3 CMS
`a r ff bl
`- -
`-
`BEGIN OF PATH
`
`SI0--->Y
`SOO--->Y
`CLK--->Q
`
`R
`R
`R
`R
`
`9.99
`9.99
`9.99
`5.60
`
`0.00
`0.00
`4.40
`5.60
`
`DFI
`CMS
`CMS
`DFI
`
`---------------------CLOCK to OUTPAD longest path-------------------(cid:173)
`Rise delay, Worst case
`
`Instance name
`
`in pin-->out pin
`
`tr
`
`total
`
`incr
`
`cell
`
`END OF PATH
`outp_2_
`OUTBUF 31
`outp_ff_b2
`BEGIN OF PATH
`
`D--->PAD
`CLK--->Q
`
`R
`R
`R
`
`11. 95
`11. 95
`4.40
`
`7.55
`4.40
`
`OUTBUF
`DFI
`
`The timing analyzer has examined the following:
`
`1. Paths that start at an input pad and end on the data input of a sequential logic
`cell (the D input to a D flip-flop, for example). We might call this an entry
`path (or input-to-D path) to a pipe lined design. The longest entry delay (or
`input-to-setup delay) is 4.52 ns.
`2. Paths that start at a clock input to a sequential logic cell and end at the data
`input of a sequential logic cell. This is a stage path (register-to-register path or
`clock-to-D path) in a pipeline stage. The longest stage delay (clock-to-D
`delay) is 9.99 ns.
`3. Paths that start at a sequential logic cell output and end at an output pad. This
`is an exit path (clock-to-output path) from the pipeline. The longest exit delay
`(clock-to-output delay) is 11.95 ns.
`
`692
`
`
`
`678
`
`CHAPTER 13 SIMULATION
`
`By pipelining the design we added three clock periods of latency, but we
`increased the estimated operating speed. The longest prelayout critical path is now
`an exit delay, approximately 12 ns-more than doubling the maximum operating fre(cid:173)
`quency. Next, we route the registered version of the design. The Actel software
`informs us that the postroute maximum stage delay is 11.3 ns (close to the preroute
`estimate of 9.99 ns). To check this figure we can perform another timing analysis.
`This time we shall measure the stage delays (the start points are all clock pins, and
`the end points are all inputs to sequential cells, in our case the D input to a D flip(cid:173)
`flop). We need to define the sets of nodes at which to start and end the timing analy(cid:173)
`sis (similar to the path clusters we used to specify timing constraints in logic synthe(cid:173)
`sis). In the Actel timing analyzer we can use predefined sets 'clock' (flip-flop
`clock pins) and' gated' (flip-flop inputs) as follows:
`
`timer> startset clock
`timer> endset gated
`timer> longest
`1st
`longest path to all endpins
`Rank Total Start pin
`First Net
`o
`11.3 a r ff b2:CLK
`a r 2
`1
`6.6 sel r ff:CLK
`sel r
`8 similar lines omitted ...
`
`End Net
`block 0 OUTI
`DEF NET 50
`
`End pin
`sel r ff:D
`outp_ff_bO:D
`
`We could try to reduce the long stage delay (11.3 ns), but we have already seen from
`the preroute timing estimates that an exit delay may be the critical path. Next, we
`check some other important timing parameters.
`
`13.7.1 Hold Time
`Hold-time problems can occur if there is clock skew between adjacent flip-flops, for
`example. We first need to check for the shortest exit delays using the same sets that
`we used to check stage delays,
`
`timer> shortest
`1st shortest path to all endpins
`Rank Total Start pin
`First Net
`4.0 b rr ff bl :CLK b rr 1
`0
`-
`-
`-
`-
`-
`-
`1
`4.1 a rr ff b2:CLK a rr 2
`-
`-
`- -
`-
`-
`8 similar lines omitted ...
`
`End Net
`DEF NET 48
`DEF NET 46
`
`End pin
`outp_ff_bl:D
`outp_ff_b2:D
`
`The shortest path delay, 4 ns, is between the clock input of a D flip-flop with
`instance name b_rr_ff_bI (call this x) and the D input of flip-flop instance name
`outp _ ff_bI (Y). Due to clock skew, the clock signal may not arrive at both flip(cid:173)
`flops simultaneously. Suppose the clock arrives at flip-flop Y 3 ns earlier than at flip(cid:173)
`flop x. The D input to flip-flop Y is only stable for (4 - 3) = 1 ns after the clock edge.
`To check for hold-time violations we thus need to find the clock skew corresponding
`
`693
`
`
`
`13.7 STATIC TIMING ANALYSIS
`
`679
`
`to each clock-to-D path. This is tedious and normally timing-analysis tools check
`hold-time requirements automatically, but we shall show the steps to illustrate the
`process.
`
`13.7.2 Entry Delay
`Before we can measure clock skew, we need to analyze the entry delays, including
`the clock tree. The synthesis tools automatically add I/O pads and the clock cells.
`This means that extra nodes are automatically added to the netlist with automatically
`generated names. The EDIF conversion tools may then modify these names. Before
`we can perform an analysis of entry delays and the clocknetwork delay, we need to
`find the input node names. By looking for the EDIF I rename I construct in the
`EDIF netlist we can associate the input and output node names in the behavioral
`Verilog model, comp _ mux _ rrr, and the ED IF names,
`
`piron% grep rename comp_mux_rrr_o.edn
`(direction INPUT))
`(port (rename a_2_ "a[2]")
`8 similar lines renaming ports omitted ...
`(net (rename a_rr_O_ "a_rr[O]")
`(joined
`9 similar lines renaming nets omitted ...
`piron%
`
`Thus, for example, the ED IF conversion program has renamed input port a [ 2 ]
`to a _2_ because the design tools do not like the Verilog bus notation using square
`brackets. Next we find the connections between the ports and the added I/O cells by
`looking for I PAD I in the Actel format netlist, which indicates a connection to a pad
`and the pins of the chip, as follows:
`
`piron% grep PAD comp_mux_rrr_o.adl
`NET DEF_NET_148; outp_2_, OUTBUF 31:PAD.
`NET DEF_NET_151; outp_1_, OUTBUF_32:PAD.
`NET DEF NET 154; outp_O_, OUTBUF 33:PAD.
`a 2 , INBUF 24:PAD.
`NET DEF NET 127;
`-
`-
`a 1 , INBUF 25:PAD.
`NET DEF NET 130 ;
`-
`a ° , INBUF 26:PAD.
`NET DEF NET 133;
`-
`b 2 , INBUF 27:PAD.
`NET DEF NET 136;
`-
`b 1 , INBUF 28:PAD.
`NET DEF NET 139;
`-
`b 0 , INBUF 29:PAD.
`NET DEF NET 142;
`NET DEF NET 145; clock, CLKBUF 30:PAD.
`-
`piron%
`
`This tells us, for example, that the node we called clock in our behavioral
`model has been joined to a node (with automatically generated name) called
`CLKBUF_30:PAD, using a net (connection) named DEF_NET_145 (again automati(cid:173)
`cally generated). This net is the connection between the node clock that is dangling
`in the behavioral model and the clock-buffer pad cell that the synthesis tools auto(cid:173)
`matically added.
`
`694
`
`
`
`680
`
`CHAPTER 13 SIMULATION
`
`13.7.3 Exit Delay
`We now know that the clock-pad input is CLKBUF 30 : PAD, so we can find the exit
`delays (the longest path between clock-pad input and an output) as follows (using
`the clock-pad input as the start set):
`
`timer> startset clockpad
`Working startset 'clockpad' contains 0 pins.
`
`timer> adds tart CLKBUF_30:PAD
`Working startset 'clockpad' contains 2 pins.
`
`I shall explain why this set contains two pins and not just one presently. Next, we
`define the end set and trace the longest exit paths as follows:
`
`timer> endset outpad
`Working endset 'outpad' contains 3 pins.
`
`timer> longest
`1st
`longest path to all endpins
`Rank Total Start pin
`First Net
`16.1
`CLKBUF 30/UO:PAD DEF NET 144
`0
`CLKBUF 30/UO:PAD DEF NET 144
`16.0
`1
`16.0
`CLKBUF 30/UO:PAD DEF NET 144
`2
`3 pins
`
`End Net
`DEF NET 154
`DEF NET 151
`DEF NET 148
`
`End pin
`OUTBUF 33:PAD
`OUTBUF 32:PAD
`OUTBUF 31:PAD
`
`This tells us we have three paths from the clock-pad input to the three output pins
`
`(outp [ ° ], outp [ 1], and outp [2]). We can examine the longest exit delay in more
`
`detail as follows:
`
`timer> expand 0
`1st longest path to OUTBUF - 33:PAD (rising) (Rank: 0)
`Total Delay Typ Load Macro
`Start pin
`Net name
`16.1
`3.7 Tpd
`0 OUTBUF
`OUTBUF 33:D
`DEF NET 154
`12.4
`4.5 Tpd
`1 DF1
`outp_ff_bO:CLK
`DEF NET 1530
`7.9
`7.9 Tpd 16 CLKEXT 0
`CLKBUF 30/UO:PAD DEF NET 144
`
`The input-to-clock delay, tIC, due to the clock-buffer cell (or macro) CLKEXT_O,
`instance name CLKBUF_30/uo, is 7.9ns. The clock-to-Q delay, tCQ' of flip-flop cell
`DF1, instance name outp_ff_bO, is 4.5 ns. The delay, tQo, due to the output buffer
`cell OUTBUF, instance name OUTBUF _33, is 3.7 ns. The longest path between clock(cid:173)
`pad input and the output, tco, is thus
`
`tco = tIC + tCQ + tQo= 16.1 ns.
`
`(13.23)
`
`This is the critical path and limits the operating frequency to (1 / 16.1 ns) "'" 62 MHz.
`
`695
`
`
`
`13.7 STATIC TIMING ANALYSIS
`
`681
`
`When we created a start set using CLKBUF_30 : PAD, the timing analyzer told us
`that this set consisted of two pins. We can list the names of the two pins as follows:
`
`timer> showset clockpad
`Pin name
`CLKBUF 30/UO : PAD
`CLKBUF 30/Ul : PAD
`2 pins
`
`Net name
`<no net>
`DEF NET 145
`
`Macro name
`CLKEXT 0
`CLKTRI 0
`
`The clock-buffer instance name, CLKBUF _30 luO, is hierarchical (with a '/' hierar(cid:173)
`chy separator). This indicates that there is more than one instance inside the clock(cid:173)
`buffer cell, CLKBUF 30. Instance CLKBUF 30/uo is the input driver, instance
`-
`-
`CLKBUF_30/Ul is the output driver (which is disabled and unused in this case).
`
`13.7.4 External Setup Time
`Each of the six chip data inputs must satisfy the following set-up equation:
`
`tsu (external) > tsu (internal) - (clock delay) + (data delay)
`
`(13.24)
`
`(where both clock and data delays end at the same flip-flop instance). We find the
`clock delays in Eq. 13.24 using the clock input pin as the start set and the end set
`I clock I . The timing analyzer tells us all 16 clock path delays are the same at 7.9 ns
`in our design, and the clock skew is thus zero. Actel's clock distribution system min(cid:173)
`imizes clock skew, but clock skew will not always be zero. From the discussion in
`Section 13.7.1, we see there is no possibility of internal hold-time violations with a
`clock skew of zero.
`Next, we find the data delays in Eq, 13.24 using a start set of all input pads and
`an end set of I gated I ,
`
`timer> longest
`... lines omitted ...
`longest path to all endpins
`1st
`End Net
`Rank 'Total Start pin
`First Net
`DEF NET 1320 DEF NET 1320
`10.0 INBUF 26:PAD
`10
`DEF NET 1380 DEF NET 1380
`9.7 INBUF 28 :PAD
`11
`-
`-
`DEF NET 1290 DEF NET 1290
`9.4 INBUF 25:PAD
`12
`-
`-
`DEF NET 1350 DEF NET 1350
`9.3 INBUF 27:PAD
`13
`DEF NET 1410 DEF NET 1410
`9.2 INBUF 29:PAD
`14
`-
`-
`DEF NET 1260 DEF NET 1260
`9.1 INBUF 24:PAD
`15
`-
`16 pins
`
`End pin
`a r ff bO:D
`-
`-
`-
`b r ff bl: D
`-
`-
`-
`a r ff bl:D
`-
`b r ff b2:D
`-
`-
`-
`b r ff bO :D
`- -
`-
`a r ff b2:D
`- -
`-
`
`We are only interested in the last six paths of this analysis (rank 10-15) that describe
`the delays from each data input pad (a [ 0 ] , a [ 1 ], a [ 2 ], b [ 0 ], b [ 1] , b [ 2 ]) to the D
`input of a flip-flop. The maximum data delay, 10 ns, occurs on input buffer instance
`name INBUF 26 (pad 26); pin INBUF 26 : PAD is node a 0
`III the EDIF file or
`
`696
`
`
`
`682
`
`CHAPTER 13 SIMULATION
`
`input a [ 0] in our behavioral model. The six tsu (external) equations corresponding
`to Eq, 13.24 may be reduced to the following worst-case relation:
`
`tsu (external)max > tsu (internal) -7.9 ns + max (9.1 ns, 10.0 ns)
`
`> tsu (internal) + 2.1 ns
`
`(13.25)
`
`We calculated the clock and data delay terms in Eq. 13.24 separately, but timing
`analyzers can normally perform a single analysis as follows:
`
`tsu (external)max > tsu (internal) -
`
`(clock delay - data delaY)min'
`
`(13.26)
`
`Finally, we check that there is no external hold-time requirement. That is to say,
`we must check that tsu (external) is never negative or
`
`tsu (external)min > tsu (internal) - (clock delay - data delaY)max > 0
`
`> tsu (internal) + 1.2 ns > O.
`
`(13.27)
`
`Since tsu (internal) is always positive on Actel FPGAs, tsu (external)min is always
`positive for this design. In large ASICs, with large clock delays, it is possible to
`have external hold-time requirements on inputs. This is the reason that some FPGAs
`(Xilinx, for example) have programmable delay elements that deliberately increase
`the data delay and eliminate irksome external hold-time requirements.
`
`13.8 Formal Verification
`
`Using logic synthesis we move from a behavioral model to a structural model. How
`are we to know (other than by trusting the logic synthesizer) that the two representa(cid:173)
`tions are the same? We have already seen that we may have to alter the original ref(cid:173)
`erence model because the HDL acceptable to a synthesis tool is a subset of HDL
`acceptable to simulators. Formal verification can prove, in the mathematical sense,
`that two representations are equivalent. If they are not, the software can tell us why
`and how two representations differ.
`
`13.8.1 An Example
`We shall use the following VHDL entity with two architectures as an example: 3
`entity Alarm is
`port(Clock, Key, Trip: in bit; Ring: out bit);
`end Alarm;
`
`--1
`--2
`--3
`
`3By one of the architects of the Compass VFormal software, Erich Marschner.
`
`697
`
`
`
`The following behavioral architecture is the reference model:
`
`13.8 FORMAL VERIFICATION
`
`683
`
`States;
`
`architecture RTL of Alarm is
`type States is (Armed, Off, Ringing); signal State
`begin
`process (Clock) begin
`if Clock =
`'I' and Clock'EVENT then
`case State is
`'I' then State <= Armed; end if;
`when Off => if Key =
`when Armed => if Key =
`'0' then State <= Off;
`elsif Trip =
`'I' then State <= Ringing;
`end if;
`when Ringing => if Key
`end case;
`end ifi
`end process;
`Ring <= 'I' when State
`end RTLi
`
`'0' then State <= Off; end ifi
`
`Ringing else '0';
`
`The following synthesized structural architecture is the derived model:
`
`library cells; use cells.all; II ... contains logic cell models
`architecture Gates of Alarm is
`i end component;
`component Inverter port(i : in BITiZ : out BIT)
`i end component;
`component NAnd2 port(a,b : in BITiZ : out BIT)
`component NAnd3 port(a,b,c : in BIT;z : out BIT)
`; end component;
`component OFF port(d,c : in BIT; q,qn : out BIT)
`; end component;
`signal, State, NextState : BIT_VECTOR(1 downto 0);
`signal sO, sl, s2, s3 : BIT;
`begin
`g2: Inverter port map ( i => State(O), Z => sl );
`g3: NAnd2 port map ( a => sl, b => State(I), Z => s2 );
`g4: Inverter port map ( i => s2, Z => Ring );
`g5: NAnd2 port map
`a => State(I), b => Key, Z => sO );
`g6: NAnd3 port map ( a => Trip, b => sl, C => Key, Z => s3 );
`g7: NAnd2 port map ( a => sO, b => s3, Z => NextState(l) );
`g8: Inverter port map (
`i => Key, Z => NextState(O) );
`state_ff_bO: OFF port map
`( d => NextState(O), c => Clock, q => State(O), qn => open );
`state_ff_bl: OFF port map
`d => NextState(l), c => Clock, q => State(I), qn => open );
`end Gates;
`
`--1
`--2
`--3
`--4
`--5
`--6
`--7
`--8
`--9
`--10
`--11
`--12
`--13
`--14
`--15
`--16
`
`--1
`--2
`--3
`--4
`--5
`--6
`--7
`--8
`--9
`--10
`--11
`--12
`--13
`--14
`--15
`--16
`--17
`--18
`--19
`--20
`--21
`
`698
`
`
`
`684
`
`CHAPTER 13 SIMULATION
`
`To compare the reference and the derived models (two representations), formal
`verification performs the following steps: (1) the HDL is parsed, (2) a finite-state
`machine compiler extracts the states present in any sequential logic, (3) a proof
`generator automatically generates formulas to be proved, (4) the theorem prover
`attempts to prove the formulas. The results from the last step are as follows:
`
`formulas to be proved:
`formulas proved VALID:
`
`8
`8
`
`that
`tells us
`the software
`then proving formulas
`By constructing and
`architecture RTL implies architecture Gates (implication is the default
`proof mechanism-we could also have asked if the architectures are exactly equiva(cid:173)
`lent). Next, we shall explore what this means and how formal verification works.
`
`13.8.2 Understanding Formal Verification
`The formulas to be proved are generated in a separate file of proof statements:
`
`# axioms
`Let Axiom ref = Axioms Of alarm-rtl
`Let Axiom_der = Axioms Of alarm-gates
`ProveNotAlwaysFalse (Axiom_ref)
`Prove (Axiom_ref => Axiom_der)
`# assertions
`Let Assert ref = Asserts Of alarm-rtl
`Let Assert der = Asserts Of alarm-gates
`Prove (Axiom_ref => (Assert_ref => Assert_der))
`# clocks
`Let ClockEvents ref = Clocks Of alarm-rtl
`Let ClockEvents der = Clocks Of alarm-gates
`Let Master clock event ref =
`Value (master_clock'event Of alarm-rtl)
`Prove (Axiom_ref => (ClockEvents_ref <=> ClockEvents_der))
`# next state of memories
`Prove ((Axiom_ref And Master_clock event ref) =>
`(Transition (state(l) Of alarm-rtl) <=>
`Transition (state_ff_b1.t Of alarm-gates)))
`Prove ((Axiom_ref And Master_clock event ref) =>
`(Transition (state(O) Of alarm-rtl) <=>
`Transition (state_ff_bO.t Of alarm-gates)))
`# validity value of outbuses
`Prove (Axiom_ref => (Domain (ring Of alarm-rtl) <=>
`Domain (ring Of alarm-gates)))
`Prove (Axiom_ref => (Domain (ring Of alarm-rtl) =>
`(Value (ring Of alarm-rtl) <=>
`Value (ring Of alarm-gates))))
`
`//1
`//2
`//3
`//4
`//5
`//6
`//7
`//8
`//9
`//10
`/ /11
`//12
`/ /13
`//14
`//15
`//16
`/ /17
`//18
`/ /19
`//20
`//21
`//22
`//23
`//24
`//25
`//26
`//27
`//28
`
`699
`
`
`
`13.8 FORMAL VERIFICATION
`
`685
`
`Formal verification makes strict use of the terms axiom and assertion. An axiom
`is an explicit or implicit fact. For example, if a VHDL signal is declared to be type
`BIT, an implicit axiom is that this signal may only take the logic values I 0 I and
`I 1 I • An assertion is derived from a statement placed in the HDL code. For exam(cid:173)
`ple, the following VHDL statement is an assertion:
`
`assert Key /= '1' or Trip /= '1' or NextState = Ringing
`report "Alarm on and tripped but not ringing";
`
`A VHDL assert statement prints only if the condition is FALSE. We know
`from de Morgan's theorem that (A + B + C) I = A I B 'e'. Thus, this statement checks
`for a burglar alarm that does not ring when it is on and we are burgled.
`I => I means implies. In logic calculus we
`In the proof statements the symbol
`write A =} B to mean A implies B. The symbol '<=> I means equivalence, and this
`to mean: A is equivalent to B.
`is stricter than implication. We write A q B
`Table 13 .13 show the truth tables for these two logic operators.
`
`TABLE 13.13
`
`Implication and equivalence.
`
`A
`F
`F
`T
`T
`
`B
`F
`T
`F
`T
`
`A=>B
`T
`T
`F
`T
`
`A<=>B
`T
`F
`F
`T
`
`13.8.3 Adding an Assertion
`If we include the assert statement from the previous section in architecture
`RTL and repeat formal verification, we get the following message from the FSM
`compiler:
`
`<E> Assertion may be violated
`SEVERITY: ERROR
`REPORT: Alarm on and tripped but not ringing
`FILE: ... /alarm-rtl3.vhdl
`FSM: a1arm-rtl3
`STATEMENT or DECLARATION: line8
`... /alarm-rtl3.vhdl (line 8)
`Context of the message is:
`(key And trip And memoryofdriver __ state(O»
`
`This message tells us that the assert statement that we included may be triggered
`under a certain condition: (key And
`state (0». The prefix
`trip And
`'memoryofdriver_' is used by the theorem prover to refer to the memory element
`
`700
`
`
`
`686
`
`CHAPTER 13 SIMULATION
`
`used for state (0). The state' off' in the reference model corresponds to state (0)
`in the encoding that the finite-state machine compiler has used (and also to state (0)
`in the derived model). From this message we can isolate the problem to the following
`case statement (the line numbers follow the original code in architecture RTL):
`
`case State is
`when Off => if Key = 'I' then State <= Armed; end if;
`when Armed => if Key =
`'0' then State <= Off:
`elsif Trip = 'I' then State <= Ringing:
`end if:
`when Ringing => if Key =
`end case:
`
`'0' then State <= Off: end if:
`
`--6
`--7
`--8
`--9
`--10
`--11
`--12
`
`'1' and Key =
`When we start in state Off and the two inputs are Trip =
`, 1 " we go to state Armed, and not to state Ringing. On the subsequent clock
`cycle we will go state Ringing, but only if Trip does not change. Since we have
`all seen "Mission Impossible" and the burglar who exits the top-secret computer
`room at the Pentagon at the exact moment the alarm is set, we know this is perfectly
`possible and the software is warning us of this fact. Continuing on, we get the fol(cid:173)
`lowing results from the theorem prover:
`
`Prove (Axiom_ref => (Assert_ref => Assert_der))
`Formula is NOT VALID
`But is VALID under Assert Context of alarm-rtl3
`
`We included the assert statement in the reference model (architecture
`RTL) but not in the derived model (architecture Gates). Now we are really
`mixed up: The assertion statement in the reference model says one thing, but the
`case statement in the reference model describes another. The theorem prover
`retorts: "The axioms of the reference model do not imply that the assertions of the
`reference model imply the assertions of the derived model." Translation: "These two
`architectures differ in some way." However, if we assume that the assertion is true
`(despite what the case statement says) then the formula is true. The prover is also
`saying: "Make up your mind, you cannot have it both ways." The prover goes on to
`explain the differences between the two representations:
`
`***Difference is:
`(Not state(l) And key And state(O) And trip)
`There are 1 cubes and 4 literals in the complete equation
`
`***Local Variable Assert der is:
`Not key Or Not state(O) Or Not trip
`There are 3 cubes and 3 literals in the complete equation
`
`***Local Variable Assert ref is: 1
`
`***Local Variable Axiom ref is:
`Not state(l) Or Not state(O)
`
`701
`
`
`
`13.8 FORMAL VERIFICATION
`
`687
`
`There are 2 cubes and 2 literals in the complete equation
`
`8
`formulas to be proved:
`7
`formulas proved VALID:
`formulas VALID under assert context of der.model:
`
`1
`
`Study these messages hard and you will see that the differences between the two
`models are consistent with our explanation.
`
`13.8.4 Completing a Proof
`To fix the problem we change the code as follows:
`
`case State is
`when Off =>
`
`if Key = '1' then
`if Trip =
`'1' then NextState <= Ringing;
`else NextState <= Armed;
`end if;
`end if;
`'0' then NextState <= Off;
`when Armed => if Key =
`elsif Trip =
`'1' then NextState <= Ringing;
`end if;
`when Ringing => if Key = '0' then NextState <= Off; end if;
`end case;
`
`This results in a minor change in the synthesized netlist,
`
`i => State(O), z => sl ) ;
`g2: Inverter port map
`a => 51, b => State(I), z => s2 ) ;
`port map
`g3: NAnd2
`i => s2, Z => Ring ) ;
`g4: Inverter port map
`a => State( 1), b => Key, z => sO
`port map
`) ;
`gS: NAnd2
`a => Trip, b => sl, C => Key, z => s3
`g6: NAnd3
`port map
`a => sO, b => s3, Z => NextState(l) ) ;
`port map
`g7: NAnd2
`i => Key, z => NextState(O)
`g8: Inverter port map
`) ;
`d => NextState(O), c => Clock, q =>
`state ff bO: DFF port map
`State(O), qn => open );
`state ff bl: DFF port map
`State(I), qn => open );
`
`d => NextState(I), c => Clock, q =>
`
`) ;
`
`Repeating the formal verification confirms and formally proves that the derived
`model will operate correctly. Strictly, we say that the operation of the derived model
`is implied by the reference model.
`
`702
`
`
`
`688
`
`CHAPTER 13 SIMULATION
`
`13.9 Switch-Level Simulation
`
`The switch-level simulator is a more detailed level of simulation than we have dis(cid:173)
`cussed so far. Figure 13.1 shows the circuit schematic of a true single-phase flip(cid:173)
`flop using true single-phase clocking (TSPC). TSPC has been used in some full(cid:173)
`custom ICs to attempt to save area and power.
`
`(a)
`
`(b)
`
`chargeDecayTime = 5 ns
`~
`chargeDecayTime = co
`...
`
`3/2,0 $----1---1
`3/2,0 N
`z
`Z
`
`3/2,0
`n
`(L
`
`D 15<'l-lPi>-_----1
`
`FIGURE 13.1 A TSPC (true single-phase clock) flip-flop.
`(a) The schematic (all devices are W /L = 3/2) created using a
`Compass schematic-entry tool. (b) The switch-level simula(cid:173)
`tion results (Compass MixSim). The parameter
`chargeDecayTime sets the time after which the simulator
`sets an undriven node to an invalid logic level (shown
`shaded).
`
`P3
`
`P2
`
`P1
`
`N2
`
`N1
`
`ON
`
`C
`
`D
`
`0
`
`100
`
`time/ns
`
`In a CMOS logic cell every node is driven to a strong '1' or a strong '0'. This
`is not true in TSPC, some nodes are left floating, so we ask the switch-level simula(cid:173)
`tor to model charge leakage or charge decay (normally we need not worry about this
`low-level device issue). Figure 13.1 shows the waveform results. After five clock
`cycles, or 100 ns, we set the charge decay time to 5 ns. We notice two things. First,
`some of the node waveforms have values that are between logic '0' and '1'. Sec(cid:173)
`ond, there are shaded areas on some node waveforms that represent the fact that,
`during the period of time marked, the logic value of the node is unknown. We can
`see that initially, before t = 100 ns (while we neglect the effects of charge decay), the
`circuit functions as a flip-flop. After t= lOOns (when we begin including the effects
`of charge decay), the simulator tells us that this circuit may not function correctly. It
`
`703
`
`
`
`13.10. TRANSISTOR-LEVEL SIMULATION
`
`689
`
`is unlikely that all the charge would leak from a node in 5 ns, but we could not stop
`the clock in a design that uses a TSPC flip-flop. In ASIC design we do not use dan(cid:173)
`gerous techniques such as TSPC and therefore do not normally need to use switch(cid:173)
`level simulation.
`A switch-level simulator keeps track of voltage levels as well as logic levels,
`and it may do this in several ways. The simulator may use a large possible set of dis(cid:173)
`crete values or the value of a node may be allowed to vary continuously.
`
`13.10 Transistor-Level Simulation
`
`Sometimes we need to simulate a logic circuit with more accuracy than provided by
`switch-level simulation. In this case we turn to simulators that can solve circuit
`equations exactly, given models for the nonlinear transistors, and predict the analog
`behavior of the node voltages and currents in continuous time. This type of
`transistor-level simulation or circuit-level simulation is costly in computer time.
`It is impossible to simulate more than a few hundred logic cells using a circuit-level
`simulator. Virtually all circuit-level simulators used for ASIC design are commercial
`versions of the SPICE (or Spice, Simulation Program with Integrated Circuit
`Emphasis) developed at UC Berkeley.
`
`FIGURE 13.2 Output buffer
`(OB.IN) schematic (created using
`Capilano's OesignWorks)
`
`+5V
`"
`
`rn2
`
`output
`~----------~--~.~
`
`rnl
`
`'~-l-'
`
`-::-
`
`c1
`10pF
`
`13.10.1 A PSpice Example
`Figure 13.2 shows the schematic for the output section of a CMOS I/O buffer driv(cid:173)
`ing a 10 pF output capacitor representing an off-chip load. The PSpice input file that
`follows is called a deck (from the days of punched cards):
`
`OB September 5, 1996 17:27
`.TRAN/OP Ins 20ns
`. PROBE
`cl output Ground 10pF
`
`704
`
`
`
`690
`
`CHAPTER 13 SIMULATION
`
`PWL(OuS 5V IOns 5V 12ns OV 20ns OV)
`DC OV
`
`Ground
`input
`VIN
`Ground
`VGround
`0
`DC 5V
`Vdd +5V
`0
`input Ground Ground NMOS W=lOOu L=2u
`ml output
`input +5V +5V PMOS W=200u L=2u
`m2 output
`.model nmos nmos level=2 vto=O.78 tox=400e-IO nsub=8.0e15 xj=-O.15e-6
`+ ld=O.20e-6 uo=650 ucrit=O.62e5 uexp=O.125 vmax=5.1e4 neff=4.0
`+ delta=1.4 rsh=37 cgso=2.95e-IO cgdo=2.95e-IO cj=195e-6 cjsw=500e-12
`+ mj=O.76 mjsw=O.30 pb=O.80
`.model pmos pmos level=2 vto=-O.8 tox=400e-10 nsub=6.0e15 xj=-O.05e-6
`+ ld=O.20e-6 uo=255 ucrit=O.86e5 uexp=O.29 vmax=3.0e4 neff=2.65
`+ delta=l rsh=125 cgso=2.65e-IO cgdo=2.65e-IO cj=250e-6 cjsw=350e-12
`+ mj=O.535 mjsw=O.34 pb=O.80
`.end
`
`Figure 13.3 shows the input and output waveforms as well as the current flow(cid:173)
`ing in the devices.We can quickly check our circuit simulation results as follows.
`The total charge transferred to the 10 pF load capacitor as it charges from 0 V to 5 V
`is 50 pC (equal to 5 V x 10 pF). This total charge should be very nearly equal to the
`integral of the drain current of the pull-up (p-channel) transistor h(m2). We can get
`a quick estimate of the integral of the current by approximating the area under the
`waveform for id(m2) in Figure 13.3 as a triangle-half the base (about 12ns) mul(cid:173)
`tiplied by the height (about SmA), so that
`
`f
`
`22ns
`0.5 (SmA) (l2ns) z 50pC = 5 (lOpF) .
`I L (m2) dt =
`
`(13.2S)
`
`IOns
`
`Notice that the two estimates for the transferred charge are equal.
`Next, we can check the time derivative of the pull-up current. (We can also do
`this by using the Probe program and requesting a plot of did (m2 ); the symbol dn
`represents the time derivative of quantity n for Probe. The symbol id (m2) requests
`Probe to plot the drain current of m2.) The maximum derivative should be roughly
`equal to the maximum change of the drain current (tlh(m2) = 8 mA) divided by the
`time taken for that change (about tlt = 2 ns from Figure 13.3) or
`
`tlt
`
`2ns
`
`-1
`6
`4 xlO As
`
`.
`
`(13.29)
`
`705
`
`
`
`13.10 TRANSISTOR-LEVEL SIMULATION
`
`691
`
`6.00V+--------------4--------------~--------------+--------------+
`
`o r - - - -o r - - - - - - - - - . . \
`
`\ I
`
`-1.00V+--------------4--------------~--------------+--------------+
`15ns
`Ons
`5ns
`10ns
`20ns
`o v( input) II v(OI..~tput)
`
`Time
`
`O~-----------------D--4
`
`10mA4-----------------~---------------+----------------~----------------+
`
`OmA
`0
`5ns
`Ons
`o id(ml) II -id(m2)
`
`•
`
`o-----------.--~~--~~--------+_-----[r-------_+
`20ns
`10ns
`15ns
`
`Time
`FIGURE 13.3 Output Buffer (OB.IN). (Top) The input and output voltage waveforms.
`(Bottom) The current flowing in the drains of the output devices.
`
`The large time derivative of the device current, here 4 MAs-I, causes problems
`in high-speed CMOS I/O. This sharp change in current must flow in the supply leads
`to the chip, and through the inductance associated with the bonding wires to the chip
`which may be of the order of lOnanohenrys. An electromotive force (emf), Vp , will
`be generated in the inductance as follows,
`
`706
`
`
`
`692
`
`CHAPTER 13 SIMULATION
`
`dI
`-1
`6
`V P = -L dt = -lOnH (4xlO ) As = -40 m V .
`
`(13.30)
`
`The result is a glitch in the power supply voltage during the buffer output transient.
`This is known as supply bounce or ground bounce. To limit the amount of bounce
`we may do one of two things:
`
`1. Limit the power supply lead inductance (minimize L)
`2. Reduce the current pulse (minimize dIldt)
`
`We can work on the first solution by careful design of the packages and by using
`parallel bonding wires (inductors add in series, reduce in parallel).
`
`13.10.2 SPICE Models
`Table 13.14 shows the SPICE parameters for the typical 0.5!lm CMOS process
`(0.6!lm drawn gate length), 05, that we used in Section 2.1. These LEVEL = 3
`parameters may be used with Spice3, PSpice, and HSPICE (see also Table 2.1 and
`Figure 2.4).
`There are several levels of the SPICE MOSFET models, the following is a sim(cid:173)
`plified overview (a huge number of confusing variations, fixes, and options have
`been added to these models-see Meta Software's HSPICE User's Manual, Vol. II,
`for a comprehensive description [1996]):
`
`1. LEVEL = 1 (Schichman-Hodges model) uses the simple square-law IDs-V DS
`relation we derived in Section 2.1 (Eqs. 2.9 and 2.12).
`2. LEVEL = 2 (Grove-Frohman model) uses the 3/2 power equations that result
`if we include the variation of threshold voltage across the channel.
`3. LEVEL = 3 (empirical model) uses empirical equations.
`4. The UCB BSIMI model (~1984, PSpice LEVEL = 4, HSPICE LEVEL = 13)
`focuses on modeling observed device data rather than on device physics. A
`commercial derivative (HSPICE LEVEL = 28) is widely used by ASIC vendors.
`5. The UCB BSIM2 model (~1991, the commercial derivative is HSPICE
`LEVEL = 39) improves modeling of subthreshold conduction.
`6. The DCB BSIM3 model (~1995, the commercial derivative is HSPICE
`LEVEL = 49) corrects potential nonphysical behavior of earlier models.
`
`Table 13.15 shows the BSIMI parameters (in the PSpice LEVEL = 4 format) for
`the G5 process. The Berkeley short-channel IGFET model (BSIM) family models
`capacitance in terms of charge. In Sections 2.1 and 3.2 we treated the gate-drain
`
`707
`
`
`
`13.10 TRANSISTOR-LEVEL SIMULATION
`
`693
`
`TABLE 13.14 SPICE transistor model parameters (LEVEL = 3).
`
`SPICE
`parameter
`CGBO
`
`CGDO
`
`CGSO
`
`CJ
`
`CJSW
`
`DELTA
`
`ETA
`
`GAMMA
`
`KAPPA
`
`KP
`
`LD
`
`LEVEL
`
`MJ
`
`2E-4
`
`5E-8
`
`3
`
`0.56
`
`0.52
`
`Explanation
`Gate-bulk overlap capacitance (CGBoh, not CGBzero)
`
`Gate-drain overlap capacitance (CGDoh, not CGDzero)
`
`Gate-source overlap capacitance (CGSoh, not CGSzero)
`
`Junction area capacitance
`
`Junction sidewall capacitance
`
`Narrow-width factor for adjusting threshold voltage
`
`Static-feedback factor for adjusting threshold voltage
`
`Body-effect factor
`
`Saturation-field factor (channel-