throbber
Deel a ration of Rachel J, Watters on Authentication of Publication
`
`I, Rachel J. Watters, am a librarian, and the Director of Wisconsin TechSearch
`
`("WTS"), located at 728 State Street, Madison, Wisconsin, 53706. WTS is an
`
`interlibrary loan department at the University of Wisconsin-Madison. I have worked as
`
`a librarian at the University of Wisconsin library system since 1998. I have been
`
`employed at WTS since 2002, first as a I ibrarian and, beginning in 2011, as the Director.
`
`Through the course of my employment, I have become well informed about the
`
`operations of the University of Wisconsin library system, which follows standard library
`
`practices.
`
`This Declaration relates to the dates of receipt and availability of the following:
`
`Hennessy, J.L. and Patterson, D.A. (1998) Computer
`Organization and Design: the Hardware/Software Interface. San
`Francisco, CA: Morgan Kaufmann Publishers.
`
`Standard operaNng procedures for materials at the University of Wisconsin-
`
`Madison Libraries. When a volume was received by the Library, it would be checked
`
`in, added to library holdings records, and made available to readers as soon after its
`
`arrival as possible. The procedure normally took a few days or at most 2 to 3 weeks.
`
`Exhibit A to this Declaration is true and accurate copy of the front matter of the
`
`Computer Organi::ation and Design: the Hardware/5ic~ftware Jnte1:face ( 1998)
`
`publication, which incl udcs a stamp on the verso page showing that this book is the
`
`property of the University of Wisconsin-Madison Libraries.
`
`1
`
`INTEL - 1013
`
`

`

`Declaration of Rachel J. Watters on Authentication of Publication
`
`Attached as Exhibit B is the cataloging system record of the University of
`
`Wisconsin-Madison Libraries for its copy of the Computer Organization and Design:
`
`the Hardware/Sofnvare Interface ( 1998) publication. As shown in the "Receiving date"
`
`field of this Exhibit, the University of Wisconsin-Madison Libraries owned this book
`
`and had it cataloged in the system as of June 6, 1999.
`
`Members of the interested public could locate the Computer Organization and
`
`Design: the Hardware/Software Interface (1998) publication after it was cataloged by
`
`searching the public library catalog or requesting a search through WTS. The search
`
`could be done by title, author, and/or subject key words. Members of the interested
`
`public could access the publication by locating it on the library's shelves or requesting it
`
`from WTS.
`
`I declare that all statements made herein of my own kQ.owledge are true and that
`
`all statements made on information and belief are believed to be true; and further that
`
`these statements were made with the knowledge that willful false statements and the like
`
`so made are punishable by fine or imprisonment, or both, under Section I 001 of Title I 8
`
`of the United States Code.
`
`Date: January 30, 2020
`
`Wisconsin TechSearch
`Memorial Library
`728 State Street
`Madison, Wisconsin 53706
`
`Director
`
`2
`
`INTEL - 1013
`
`

`

`SECOND ED IT ION
`
`Computer Organization and Design
`
`THE HARDWARE / SOFTWARE INTERFACE
`
`INTEL - 1013
`
`

`

`T R A D E M A R K S
`
`Th" following trad emarks arc the property ol the following organiz,,lions:
`
`T,·X is a trademark of America! M,1the matical Sodety.
`
`Ap ple II and Macintosh are t rad e mMks of Apple Compuk r~, Inc.
`CDC 6b0~, CDC 7b00, CDC STAR·lOO, CYBER-JSU. CYBl::R·
`1811/990, and CYBER-20.:; are trademarks of Control Data C,irpor,, (cid:173)
`tion.
`
`11,e Cosmic Cub" is a tr,,dcmark o f C.11ifornia lns Htutc of Te,:hnnl(cid:173)
`ogy.
`C PJJOIJ is a trademark of Conner Pe ripherals.
`Cra y. CllAY-1, CRAY J<IIJ, CRAY T<l0, C RAY X-MP/ 416, and
`CRAY Y-M I' are lr~d.imarks of Cray Res~arch.
`
`Alpha , AlphaSer,·t:r, AlphnStation, D EC, UEC~ystem, DECsyst.-m
`3100, DECstation, l'OP-8. PDP-11, Unib us, VAX, VAX 37ll0, and
`VAXl 1 /7&J MC trademarks of Digit,11 E<1uipmen t Corporation.
`MP2361A, Su per Eagle, VPIOO, VP2lllJ, and Vl'l'300,u,• tra demarks
`of Fujitsu Corpora tion.
`
`Gnu C Compiler is a trademark of Fre.i Softw,irc Fo111\t1ath:,n.
`
`Goodyear MPP is a trad emark of Goodyear Tire am\ Rubber Co.,
`Inc.
`Apollo ON 300, Apollo DN 10000, Corw~x. I IP, HP Pred,ion
`Archil~cturc, HPPA, H P850, HP 30Ull, HP 300/70, P/\-RTSC, and
`Prl!Cision M<' r.igist.ired trademarks of H.,wlet-l',1eka rd Cllmpany.
`432,960 CA, 4004, 8008, 8080, SU86, llll117, 8088, 80136, 11028b, 8038b,
`80480, Della, iAJ'X ~32. i8ol!, Intel, lnt.il4&, Intel Hypercub.-. iP(cid:173)
`SC/2, MMX, Mu ltib~s, Multibus JI, Parai,;on, and Pentium a re
`trado1marks of Intel Corporation. ln td ln~id o1 is a registered trade·
`m a rk vf Intel Corporation.
`
`360, 360/30, 3b0/ 40, 300/50, 3()()/C,5, 360/ 85, 3bll/<11, 370, 370/158,
`370/165, 370/ 168, 371>-XA, F.SA/370, 701, 704, 709, ~01, 3033, 3080,
`3080 series, 30RO VI', 3081, 3090, 30911/ltlO, 30'l0/20U, 3090/ 400,
`3090/600, 3090/6005, 30'10 VI', ,1330, 3380, 33801), 3380 Disk Modd
`AK4, 3380), 3390, 38110-23, J<J9il, 7090, 7W4, IBM, IBM l'C, IBM PC(cid:173)
`AT, IBM SVS, ISAM, MVS, Pl..8, Pow.,rPC. POWERstation, RT-I'C,
`RAMAC, RS/6000, Sage, Stretch, System/:160, Vector Faility . aml
`VM ar.- trademarks of International Business Machines Corpor,1•
`lion. PO WERserver, RISC Sy~tem/60Ull, ~nd SP2 nrc rcKistered
`trademarks <>f ln t<:mational Business Machines Corp<>r,, tiun.
`
`ICL DAI' is~ tradt.'mark nf lnterMtional C nmputers Lim\to!d.
`In mos and Transputer Mc trademarks of In mos.
`
`FulurcBus is a tradcmMk of the Institute o f 1::lectrical and Electron(cid:173)
`ic l:ngin.,ers.
`
`KSR-1 is a trademark of Kendall S,1u,,re Research.
`
`MASP/\R MP-1 And MASPAR MI'-2 ,,r~ tr~dcmarks of MMPar
`Corp<•rilti,m.
`Mll"S, R2000, R31)0U, ,,ml RI 0000 are regis te red trad"marks o f
`MIPS Tcdm,,logy, Inc.
`
`Windows is a trademark of Microsoft Corporation.
`
`Nu Bus is., tr.1dcmark of Massachu~••tts Institute of Technology.
`D•lta Series l:!61!~, System V/AA R32V1, VME bus, b809, b8UOO.
`btlOIO, 1>81120, hROJll, ol\881, (>81!82, 8800U, K!«l<lO 1.8 .4m14, ~8100,
`and ~ll20IJ Me trademarks uf Mt>tnmla Corf)(>ration.
`
`Ncube an d nCu bc / ten are tradt1m,uks of Nt-ube Corporation.
`NEC is a ro,gl~tercd tr,1d~n,ark of :--JECCurporation.
`Network Com pute r is a trademark of Ornd• Corpor,,tion.
`
`l'a rsyt,·c C.C is a tT.iti~mark o f l'arsytcc, Inc.
`lm primi,, I l'l-2, S.1br.i, Sabre '17W<i, S¢ai;atl', ,md Wren IV are
`trad,•marks of SeaKat~ Teclnmloi,;y, Inc.
`
`NUMA•Q , S...")""111, and Symmd ry are trademarks of Sequent
`Cllmputcrs.
`
`Power C h,,11.,ngc, Silicon Gra phics, Silicon Graphics 43/ 240,
`5ilicon Gr,,phlc~ 4D/ 611, Silicon Grnphic. 4D/240, and Silkun
`Gr,,p hics 4 D !kri.-$ are trademarks of SIiicon Graphics. O rlgin2000
`is a registered trademark of Silicon Graphics.
`SPEC is a r'-'11isle rL'<f lr,,d ~mark of the Standard Performance Eval(cid:173)
`uatio n Corpora tion.
`Spice is a trad~milrk o( Uniwn-ity of California al Berkeley.
`
`En k>rprise, Java, Sun, 5un Ultra, Sun Microsystems, and Ultra are
`trademarks o f Su n Microsystems, Inc. SPARC and UltraSPARC
`a,., registered trademarks of SPARC In terna tional, Inc., licensed tv
`Sun Microsy.stenis, Inc.
`
`Conncctit>n Machin e, C M· 2, and C M-.5are trad emarks of Thinking
`Machines.
`Burroui,;hts b511ll, B500ll, B5500, 0 -machine, UNIV /\C, U NIV AC !,
`and U NIV AC 1103 are trademarks of UN !SYS.
`
`Alto, PARC, P,,lo Alto Rcs..,arch Cc11tcr. and Xerox are trademarks
`of Xerox Corpor,,ti<•n.
`Tht1 UNIX tradcn,ark i~ licensed excluslw ly through X/ Ope11
`Comp,,ny Ltd.
`
`All 0 th..,. product nanws arc trad emarks or rcgiskred trademarks
`of their respcctiv.., m mJMni,.,., Where tradema rks ap pear in this
`book and Morgan K,rnfm,,nn Publishe rs w as aw.ue of a tr,,d.,mark
`dai,n, the tr.,dcmarks haw bcc11 print~d in initial caps <>r a ll ca ps.
`
`INTEL - 1013
`
`

`

`S E C O N D ED IT ION
`
`Computer Organization and Design
`
`THE HARDWARE / SOFTWARE
`
`INTERFACE
`
`John L. Henne"y
`Stanford University
`
`David A. Patterson
`University of California, Berkeley
`
`With a contribution by
`James R. Larus
`University of Wisconsin
`
`M~(cid:141)
`
`Morgan Kaufmann Publishers, Inc.
`San Francisco, California
`
`INTEL - 1013
`
`

`

`Kurt 1<. Wendt Librnry
`University of Wizconsii,-Madison
`215 N. Randall A11enue
`Madison, WI 53706-1680
`
`Spon~rlng Editor Denise Pcnrust:
`Production Manager Yonic Overhm
`Production Editor Julie Pabst
`Editorial Coordinator Jane Elliott
`Text and Cover Dealgn Ross Carron Design
`Uluetratlon Alexand~r Tesb ln Associates, with second edition modifications by Dartmouth
`Publishing, Inc.
`Chapter Opener Illustrations Canary Studios
`Copyedltor Ken De!lilPcnt,1
`Composition Nancy Lc,~an
`Proofreader ),mnifcr McClain
`Indexer Steve Rath
`Printer Courier Corporation
`
`Morgan Kaufmann Pul,lisher.i, Inc.
`Editorial and Sales Office:
`340 Pine Street, Sixth Floor
`San Fnmcisco, CA <,(cid:157) J0-1-320~
`USA
`
`Tdephone
`-115/392-26~5
`F-'<:simile 415/<182-2!>65
`Email mkp@mkf• . .-o1111
`WWW hftp;//11•111w.mk11.com
`Onfor toll fl>'(' 800/745-7323
`
`•£> 1998 by Morgan Kaufmann Publishers, Inc.
`All rights reserved
`Printed in lht: United States t'f Americd
`
`()2 01 00 <l<J
`
`5 4 3 2
`
`No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
`or by any means-elt'ctronic, mechanical, photocopying, recording, or otherwise-without the prior
`written permission of the publisher.
`
`Advice, Prelse, and Errors: Any cmr,·~pondence rel,11,,d to this p ublication nr int<>ndcd for the nuthnrs
`shouM be st!nt dectronically to n n/21•11s,@111kj1.COIII. lnform,1tinn ,egarding error sightings is ,•11cournged.
`Any t>rrm sightings that ar,· i\cc:cptl'd for corredion in subsequent p rintin11s will be rewarded by th,,
`,rnthors with ,1 pa>•mcnt of SUH) (U.S.) per correction at the linw of their implcnwntalion in a reprint.
`
`Library of Congress Cataloglng-iit-Publlcatlon Data
`Patterson, Dav id /1..
`Co rnput.-r org,,nization ,md d,•s ign : the ha rdw.nc / software int,·rlace
`I D,wid A. Palterson, John L. H,mnessy.-2nd ed .
`p. cm.
`Includes bibliographic,,! rderences and index.
`IS!JN 1-5;,X60--lQI-X (p:1pcr)
`ISBN l·SS~hll--l21H> (dnth).-
`1. Comp1•t~r organization. 2. c,.»nputers-Desi11,n and constructio n.
`I. li,•nnt'~sy, John L.
`II. Tille
`3. Compute r interfaces.
`QA76.<J.CM3H46
`1997
`tl04.2'2-dc2.I
`
`<J7-Jf,(}5U
`
`INTEL - 1013
`
`

`

`viii
`
`Content•
`
`Contents
`
`Foreword vi
`by John H. Crawfl11·d
`
`Worked Examples xiii
`
`Computer Organization and Design Online xvi
`
`Preface xix
`
`CHAPTERS
`
`a Computer Abstractions and Technology 2
`
`J...,t.
`1.2
`·--1.a
`":i.4
`._..t..5
`1.e
`.' 1.7
`__ 1.8
`1.9
`. -1.t.O
`
`Introduction 3
`Below Your Program 5
`Under the Covers 10
`Integrated Circuits: Fueling Innovation 21
`Real Stuff: Manufacturing Pentium Chips 24
`Fallaclea and Pitfalls 29
`Concluding Remarks 30
`Hiatorical Perapective end Further Reading 32
`KeyTerms 44
`Exercises 45
`
`II The Role of Performance s2
`
`2,1
`lntrocluction 54
`2.2 Measuring Performance 58
`2.3 Relating the Metric• 60
`2.4 Clloosing Programs to Evaluate Performance 66
`2 .5 Comparing and Summarizing Performance 69
`2.8 Real Stuff: The SPEC95 Benchmark• and Performance of Recent
`Processors 71
`2. 7 Fallaciea and Pitfall& 75
`2.8 Concluding Remarka 82
`
`INTEL - 1013
`
`

`

`Contents
`
`ix
`
`2.9 Historical Perspective and Further Reading 8:1
`2.10 Key Terms 8CJ
`2.11 Exercises 90
`
`-II Instructions: Language of the Machine
`
`104
`
`Introduction 106
`3.1
`3.2 Operations of the Computer Hardware 107
`3.3 Operands of the Computer Hardware 109
`3.4 Representing Instructions in the Computer 116
`Instructions for Making Decisions 122
`3.5
`3,6 Supporting Procedures In Computer Hardware 132
`3.7 Beyond Numbers 142
`3.8 other Styles of MIPS Addressing 145
`3.9 Starting a Program 15b
`3.10 An Example to Put It All Together 163
`3.11 Arrays versus Pointers 171
`3.12 Real Stuff: PowerPC and 80x86 Instructions 175
`3 .13 Fallacies and Pitfalls 185
`3.14 Concluding Remarks 11'7
`3.15 Historical Perspective and Further Reading 18'1
`3,16 Key Terms 196
`3.17 Exercises 19h
`
`a Arithmetic for Computers 20H
`
`4.1
`Introduction 210
`':,4 ,2 Signed and Unsigned Numbers 210
`_, 4,3 Addition and Subtraction 220
`4.4 Logical Operations 225
`4.5 Constructing an Arithmetic Logic Unit 2:10
`4.6 Multiplication 250
`4. 7 Division 265
`4.8 Floating Point 275
`4.9 Real Stuff: Floating Point in the PowerPC and 80x86 301
`4.10 Fallacies and Pitfalls 304
`.11 Concluding Remarks 3()8
`4 .12 Historical Perspective and Further Reading 312
`4.13 Key Terms 322
`4.14 Exercises 322
`
`II The Processor: Datapath and Control 330
`
`5 1
`Introduction 338
`U,- Building a Datapath 343
`5.3 A Simple Implementation Scheme 351
`
`INTEL - 1013
`
`

`

`X
`
`Contents
`
`5.4 A Multicycle Implementation 377
`5.5 Microprogramming: Simpllfylng Control Design 399
`5.6 Exceptions 410
`5.7 Real Stuff: The Pentium Pro lmplementatlon 416
`5.8 Fallacies and Pitfalls 419
`5.9 Concluding Remark& 421
`5.10 Historical Perspective and Further Reading 423
`5.11 Kay Terms 426
`5.12 Exercises 427
`
`II Enhancing Performance with Pipelining
`
`434
`
`~ An Overview of Pipelining 436
`6.2 A Pipelined Datapath 449
`6.3 Pipelined Control 466
`6.4 Data Ha:rards and Forwarding 476
`6.5 Data Hazards and Stall• 489
`6.6 Branch Hazards 496
`8. 7 Exceptions 505
`6.8 Supenicalar and Dynamic Pipelining 510
`6.9 Real Stuff: PowerPC 604 and Pentium Pro Pipelines 517
`6.10 Fallacies and Pitfalls 520
`6.11 Concluding Remarks 521
`6.12 Hlstoricel Perspective and Further Reading 525
`6.13 Key Terms 529
`6.14 Exercises 529
`
`(cid:127) Large and Fast: Exploiting Memory Hierarchy 538
`
`Introduction 540
`7.1
`7.2 The Basics of Caches 545
`7.3 Measuring and Improving Cache Performance 564
`7.4 Virtual Memory 579
`7.5 A Common Framework for Memory Hierarchies 603
`7.8 Real Stuff: The Pentium Pro and PowerPC 604 Memory Hierarchies 611
`7.7 Fallacies and Pitfalls 615
`7,8 Concluding Remarks 618
`7.9 Historical Perspective and Further Reading 621
`7.10 Key Terms 627
`7.11 Exercises 628
`
`a Interfacing Processors and Peripherals 636
`
`Introduction 638
`8.1
`8.2 1/0 Performance Measures: Some Examples from Disk and FIie
`Syetems 641
`8.3 Types and Characteristic(cid:127) of 1/0 Devices 644
`
`INTEL - 1013
`
`

`

`Content"
`
`xi
`
`8.4 Buses: Connecting 1/0 Devices to Processor and Memory b55
`Interfacing 1/ 0 Devices to the Memory, Processor, and Operating
`8.5
`System 673
`8.6 Designing an 1/0 System b84
`8.7 Real Stuff: A Typical Desktop 1/ 0 System 687
`8.8 Fallacies and Pitfalls 688
`8.9 Concluding Remarks 690
`8.10 Hlstorical Perspective and Further Reading
`8.11 Key Terms 700
`8.12 Ell.ercises 700
`
`t,94
`
`a Multiprocessors 710
`
`Introduction 712
`9.1
`9.2 Programming Multiprocesaors 714
`9.3 Multiprocessors Connected by a Single Bus 717
`9.4 Multiprocessors Connected by a Network 727
`9.5 Clusters 734
`9.6 Network Topologies 73h
`9. 7 Real Stuff: Future Directions for Multiprocessors 740
`9,8 Fallacies and Pitfalls 743
`9.9 Concluding Remarks-Evolution versus Revolution in Computer
`Architecturo 746
`9.10 Historical Perspective and Further Reading 748
`9.11 Key Terms 756
`9.12 Exercises 750
`
`A-2
`
`APPENDICES
`
`II
`
`Assemblers, Linkers, and the SPIM Simulator
`by James R. um1s, Uni,•crsity of Wisconsin
`A.1
`Introduction A-3
`A.2 Assemblers A-10
`A.3 Linkers A-17
`A.4 Loading A-19
`/\-20
`A.5 Memory Usage
`A.6 Procedure Call Convention A -22
`A, 7 Exceptions and Interrupts A-32
`A.8
`Input and Output A-3t,
`A.9 SPIM A-38
`A.10 MIPS R2000 Assembly Language A-49
`A.11 Concluding Remarks /\-75
`A.12 Key Terms A -7o
`A.13 Exercises A-7n
`
`INTEL - 1013
`
`

`

`xii
`
`Contents
`
`II The Basics of Logic Design B-2
`
`Introduction B-3
`B.1
`8.2 Gates, Truth Tables, and Logic Equations B-4
`B.3 Combinational Logic B-8
`B.4 Clocks B-18
`B.S Memory Elements B-21
`B.6 Finite State Machines B-35
`B. 7 Timing Methodologies B-39
`B.8 Concluding Remarks B-44
`B.9 Key Terms B-45
`B.10 Exercise& 8-45
`
`II Mapping Control to Hardware c-2
`
`Introduction C-3
`C.J.
`Implementing Combinational Control Units C-4
`C.2
`C.3
`Implementing Finite State Machine Control C-8
`Implementing the Next-State Function with a Sequencer C-21
`C.4
`C.5 Translating a Microprogram to Hardware C-28
`C.8 Concluding Remarks C-31
`C. 7 Key Terms C-32
`C.B Ellerclses C-32
`
`Glossary G-1
`
`Index 1-1
`
`INTEL - 1013
`
`

`

`K.F. WENDT LIBRARY
`UW COLLEGE OF ENGR.
`215 N. RANDALL t\VENUE
`MADISON ,. 11 53706
`
`INTEL - 1013
`
`

`

`234
`
`Chapter 4 Arithmetic for Computers
`
`The Sum bit is set when exactly one input is 1 or when all three inputs are 1..
`The Sum results in a complex Boolean equation (recall that a mec1ns NOT a):
`Sum = (a· b · Carryln) +{a· b · Carry In)+ (a· b · Carryln) +(a · b · Carryln)
`
`)· 1[) The drawing of the logic for the Sum bit in the adder black box is left as an
`
`/
`
`exercise (see Exercise 4.43).
`Figure 4.14 shows a I-bit ALU derived by combining the c1dder with the ear-
`lier components. Sometimes designers also want the ALU to perform a few
`more simple operations, such as generating 0. The easiest way to add an oper-
`11tion is to expand the multiplexor controlled by the Operation line and, for this
`example, to connect 0 directly to the new input of that expanded multiplexor.
`
`A 32-Bit ALU
`Now that we have completed the 1-bit ALU, the full 32-bit ALU is created by
`connecting adjacent ''black boxes." Using xi to mean the ith bit of x, Figure 4.15
`shows a 32-bit ALU. Just as a single stone can cause ripples to radiate to the
`shores of a quiet lake, a single carry out of the least significant bit (Result0)
`can ripple all the way through the adder, causing a carry out of the most sig(cid:173)
`nificant bit (Result31). Hence, the adder created by directly linking the cc1rries
`of 1-bit adders is called a ripple carry adder. We'll see a faster way to connect
`the 1-bit adders starting on page 241.
`
`Operation
`Carryln
`
`Result
`
`1
`
`2
`
`;..
`
`FIGURE 4.14 A 1•blt ALU that performs AND, OR, and addition (see Flgure 4.13).
`
`CarryOut
`
`INTEL - 1013
`
`

`

`4.5 Conltl"l.lctlng an Arithmetic Logic Unit
`
`235
`
`Carryln
`
`Operation
`
`ao
`
`bO
`
`al
`
`b1
`
`a2
`
`b2
`
`Carryln
`
`ALUO
`
`Carryout
`
`Carryln
`
`ALU1
`
`Carryout
`
`Carryln
`
`ALU2
`
`Carryout
`
`ResultO
`
`Resultl
`
`Result2
`
`a31
`
`b31
`
`Carryln
`
`ALU31
`
`1------- Result31
`
`FIGURE 4.15 A 32-blt ALU con1tructed from 32 1-blt ALU•. CarryOut of the less significant
`bit is connected to the Carryln of the more significant bit. Thi~ organizotiun is callt'CI ripple carry.
`
`Subtraction is the same as adding the negative version of an operand, and
`this is how adders perform subtraction. Recall that the shortcut for negating a
`two's complement number is to invert each bit (sometimes called the one's co111-
`plcmc11f as explained in the elaboration on page 219) and then add 1. To invert
`each bit, we simply add a 2:1 multiplexor that chooses between b and b, as
`Figure 4.16 shows.
`Suppose we connect 32 of these 1-bit ALUs, as we did in Figure 4.15. The
`added multiplexor gives the option of b or its inverted value, depending on
`Binvert, but this is only one step in negating a two's complement number. No(cid:173)
`tice that the least significant bit still has a Carryln signal, even though it's un(cid:173)
`necessary for addition. What happens if we set this Carryln to l inste21d of O?
`
`INTEL - 1013
`
`

`

`236
`
`Chapter 4 Arithmetic for Computers
`
`Binvert
`
`Operation
`Carryln
`
`a --1---- -1 - - -- - i
`
`0
`
`1
`
`2
`
`Result
`
`Carryout
`
`FIGURE 4.16 A 1-blt ALU that performs AND, OR, and addition on a and b or a and b, By
`selecting b Cl3invert = 1) and selling Carryln to I in thi: lens! significant bit of the ALU, we get
`two's complement ,ubtrnction of b from .1 in!-IPad of ~ddition of b to a .
`
`The adder will then calculate c1 + b + 1. Dy selecting the inverted version of b,
`we get exactly what w~ ant:
`;;_
`-
`.
`a + b + Ii = a + (b + I) == a + ( -b) = a - b
`'
`The simplicity of the hardware design of a two's complement adder helps
`explain why two's complement representation hns become the universal stan·
`dard for integer computer arithmetic.
`
`Tailoring the 32-Bit ALU to MIPS
`This set of operations-add, subtract, AND, OR-is found in the ALU of
`almost every computer. If w e look at Figure 4.7 un page 228, we see that the
`operations of most MIPS instructions can be performed by this ALU. But the
`design of the ALU is incomplete.
`One instruction that still needs support is the set on less than instruction
`(s1 t). Rccdll that the operation produces 1 if rs< rt, and O otherwise. Conse·
`,1ucntly, s l t will set all but the least significant bit to 0, with the least signifi(cid:173)
`cant bit set according to the comparison. For the ALU to perform s 1 t, we first
`need to expand the three-input multiplexor in Figure 4.16 to add an input for
`the s l t result. We call that new input Less. and use it only for s l t .
`
`X
`
`..
`
`INTEL - 1013
`
`

`

`i
`
`,(
`
`(
`
`4 .5 Constructing an Arithmetic Logic Unit
`
`237
`
`,
`
`The top dra\•ving of Figure 4.17 shows the nev,, 1-bit ALU w ilh the expanded
`multiplexor. From the description of s l t above, we must connect Oto the Less
`input for the upper 31 bits of the ALU, since those bits me always set to O. What
`remains to consider is how to compare and set the least sig11ificn11t bit for set on
`less than instructions.
`What happens if we subtract b from a? If the difference is negative, then a<
`b since
`')
`f.
`
`(a-b) < O(cid:157)
`((a - b)+b)<(O+b)
`(cid:157) a<b
`We want the least significant bit oL.a sel on less than operation to be a l il a <
`b; t~at is, a 1 if a - bis neg__ativc and a O if it's positive. This desired result cor(cid:173)
`responds exactly to the sign-bit values: 1 means negative and O means posi(cid:173)
`tive. Following this line of argument, we need only connect the sign bit from
`the adder output to the least significan t bit to get set on less than.
`Unfortunately, the Result output from the most significant ALU bit in the
`top of Figure 4.17 for the s l t operation is not the output of the adder; the ALU
`Ol1tput for the s l t operation is obviously the input value Less.
`Thus, we need a new 1-bit ALU for the most significant bit that has an extra
`output bit: the adder output. The bottom d rawing of Figure 4.17 shows the de(cid:173)
`sign, with this new adder output line called Set, and used only for s l t. As long
`as we need a special ALU for tht! most significant bit, we added the overflow
`detection logic since it is also associated with that bit.
`Alas, the test of less than is a little more complicated than just described be(cid:173)
`cause of overflow; Exercise 4.23 on page 326 explores what must be done.
`Figure4.18 shows the 32-bit ALU.
`Notice that every time we want the ALU to subtract, we set both Carryln
`and Binvert to 1. For adds or logical operations, we want both control lines to
`be 0. We can therefore simplify control of the ALU by combining the Carry In
`and Binvcrt to a single control line called B11egnte.
`To furth er tailor the ALU to the MIPS instruction set, we must support con(cid:173)
`ditional branch instructions. These instructions branch either if two registers
`are equal or if they are unequal. The easiest way to test equality with the ALU
`is to subtract b from a and then test to see if the result is O since
`(a - b = 0) (cid:157)
`a = b
`Thus, if we add hardware to test if the result is 0, we can test for equality.
`The simplest way is to OR all the outputs together and then send that signal
`through an inverter:
`
`Zero = (Result31 + Result30 + . .. + Result2 + Resultl + ResultO)
`Figure 4.19 shows the revised 32-bit ALU. We can think of the combination
`of the 1-bit Bnegate line and the 2-bit Operation lines as 3-bit control lines for
`
`INTEL - 1013
`
`

`

`238
`
`Chapter 4 Artthrnetlc for Computer•
`
`Blnvert
`
`Operation
`Carryln
`
`a_,._- ----+----...i
`
`0
`
`1
`
`2
`
`Result
`
`Less -+-----------+---+1 3
`
`Carryout
`
`B1nvert
`
`Operation
`Carryln
`
`a -+-----+----+--i
`
`0
`
`1
`
`2
`
`Result
`
`Less -+ - - -----+-11-+---1t--tr-+13
`
`- - - - -set
`
`Over11ow
`detection
`
`r-----+-... Overflow
`
`FIGURE 4.17 (Top) A 1·blt ALU that perform• AND, OR, and addllloa on II and b orb, and
`(bottom) • 1•blt ALU for the moat elgnlflcant bit. The top draw in~ includes n direct input th.it
`is C'Onnected to perform the set on less than operiltinn (see Fi~urc 4. 18); the bottom has a direct out·
`put from the adder for the less than compariso11 called Set. (Refer to Exercis.- 4.42 to sw how to c:il·
`culate overflow with fower inputs.)
`
`INTEL - 1013
`
`

`

`4.1 Constructing •• Arithmetic Logic Unit
`
`239
`
`Binvert
`
`Carryln
`
`Operation
`
`ao
`bO
`
`a1
`b1
`0
`
`a2
`b2
`0
`
`Less
`Carryout
`
`Carryln
`ALU1
`
`ResultO
`
`=lesultl
`
`Carryln
`ALU2 1------1---+ Result2
`Less
`Carryout
`
`Carryln
`
`a31
`b31
`0
`
`Carryln 1------- Result31
`1 - - - - - - Set
`ALU31
`1-------.. ... Overflow
`Less
`
`FIGURE 4.18 A 32-blt ALU constructed from the 31 copies of the 1.-1t ALU In the top of
`Figure 4.17 and one 1•blt AW In the bottom of that figure. The Less inputs are connected to
`0 except for the least significant bit, and that is connected to th<! Set output of the most significant
`bit. If the ALU performs a - b and we select th<! input 3 in the multiplexor in Figure 4.17, then
`Result = 0 ... 001 if n < b, and Result = 0 .. . 000 otherwise.
`
`the ALU, telling it to perform add, subtract, AND, OR, or set on less than.
`Figure 4.20 shows the ALU control lines and the corresponding ALU opera(cid:173)
`tion.
`Finally, now that we have seen what is inside a 32-bit ALU, we will use the
`universal symbol for a complete ALU, as shown in Figure 4.21.
`
`INTEL - 1013
`
`

`

`240
`
`Chapter 4 Arithmetic for Computer•
`
`Bnegate
`
`Operation
`
`ao
`bO
`
`al
`bl
`0
`
`a2
`b2
`0
`
`ResultO
`
`Resultl
`
`Result2
`
`Carryln
`ALU1
`Less
`Carryout
`
`Carryln
`ALU2
`Less
`Carryout
`
`[p-,,m
`
`a31
`b31
`0
`
`Result31
`Carryln
`1------, Set
`ALU31
`Less 1---- -+--- -- -- -- -----+ Overflow
`
`FIGURE 4,19 The final 32-blt ALU. This adds a Zero detector to Figure 4.18.
`
`ALU control lines
`
`Function
`
`000
`001
`010
`110
`111
`
`r - -
`
`and
`or
`add
`subtract
`set on less than
`
`FIGURE 4.20 Th• value& of the t,.ree ALU c:ontrol lines Bnegate and Operation and the
`corresponding ALU operations.
`
`INTEL - 1013
`
`

`

`4.5 Com1tructlng an Arithmetic Logic Unit
`
`241
`
`ALU operation
`
`Zero
`Result
`Overflow
`
`carryout
`
`FIGURE 4.21 The symbol co111monly used to represent an ALU, as shown In Figure 4.19.
`Thi~ symb()I is also used to r1cprc~ent an ,,ddcr, s,, it is norm,,lly lnbelcu either with ALU or
`Adder.
`
`Carry Lookahead
`The next quc::;tion is, How quick1y cc1n this ALU <1dd two 32-bit operands? We
`Ci\n d etermine the a and b inputs, but the Carry In input depends on the oper(cid:173)
`ation in the adjacent 1-bit adder. If w e trace all the way through the chain of
`dependencies, we connect the most signific.int bit to the le<1st significant bit,
`so the most significant bit of the sum must wait for tlv.~ scqucntinl evalu.ition of
`all 32 1-bit adders. This sequential chilin reaction is too slow to be used in
`time-criticc1l hardware.
`There are a variety of schemes to anticipate the car ry so that the worst-case
`scenario is a function of the log2 of the number of bits in the adder. These a n(cid:173)
`ticipa tory signals are foster because they go through few er gates in sequence,
`but it takes many more gates to anticipate the proper carry.
`A key to understanding fast carry schemes is to remember that, unlike soft(cid:173)
`ware, hardware executes in parallel whenever inputs change.
`
`Fast Carry Using "Infinite" Hardware
`Appendix B mentions that any equation can be represented in two levels of
`logic. Since the only external inputs are the two upe rnnds a nd the Carryln to
`the least significant bit of the adder, in theory we could calculate the Carry In
`values to all the remaining bits of the adder in just twu levels of logic.
`For example, the Carry In for bit 2 of tlw adder is exactly the CarryOut of l;iit
`l, so the formirla is
`Carryin2 == ( b I · Carry In 1 }+ ( a I · Carry lnl J + (a I · b I )
`
`Simila rly, Carryln1 is defined as
`Carrylnl = l bll - Carryln0J + (a0 · Carryin0) + (aO · bO)
`
`INTEL - 1013
`
`

`

`-
`
`250
`
`Chapter 4 Arithmetic for Computera
`
`(cid:127)
`
`MuHipllcation
`
`Multiplicafio11 is vexnlion,
`Division is as bad;
`The rnle of three doth puzzlr: me,
`And pmctici: drives me mad.
`
`Anonymous, Elizabethan manuscript, 1570
`
`With the construction of the ALU and explanation of addition, subtraction,
`and shifts, we are ready to build the more vexing operation of multiply.
`• But first let's review the multiplication of decimal numbers in longhand to
`remind ourselves of the steps and the names of the operands. For reasons that
`will become clear shortly, we limit this decimal example to using only the dig(cid:173)
`its O and 1. Multiplying lOOOtt!n by 10011en;
`
`Multiplicand
`Multiplier
`
`lOOOten
`· 1001 ten
`
`X
`
`Product
`
`1000
`0000
`0000
`1000
`1001 OOOten
`The first operand is called the 11111/tiplimnd and the second the multiplier. The
`final result is called the product. As you may recall, the algorithm learned in
`grammar school is to take the digits of the multiplier one at a time from right
`to left, multiplying the multiplicand by the single digit of the multiplier and
`shifting the intermediate product one digit to the left of the earlier inter(cid:173)
`mediate products.
`The firs t observation is that the number of digits in the product is consider(cid:173)
`ably larger than the number in either the multiplicand or the multiplier. In fact,
`if we ignore the sign bits, the length of the multiplication of an 11-bit multipli(cid:173)
`cand and an 111-bit multiplier is a product that is I!+ m bits long. That is, 11 + m
`bits are required to represent all possible products. Hence, like add, multiply
`must cope with overflow because we frequently want a 32-bit product as the
`res ult of multiplying two 32-bit numbers.
`In this example we restricted the decimal digits to O and 1. With only two
`choices, each step of the multiplication is simple:
`
`1. Just place a copy of the multiplicand (1 x multiplicand) in the proper
`place if the multiplier digit is a 1, or
`
`2. Place O (0 x m ultiplicand) in the proper place if the digit is O.
`
`INTEL - 1013
`
`

`

`4.6 Multlpllcatlon
`
`2U
`
`Although the decimal example above happened to use only O and 1, multipli(cid:173)
`cation of binary numbers must always use O ,rnd 1, and thus always offers
`only these two choices.
`Now that we have reviewed the bnsics of multiplication, the traditional next
`step is to provide the h ighly optimized multiply hardware. We break with tra(cid:173)
`dition in the belief that you will gain a better understanding by seeing the evo(cid:173)
`lution of the multiply hardwc1re and algorithm through three generations. The
`rest of this section presents successive refinements of the hardware and the
`algorithm until we have a version used in some computers. For now, let's
`assume that we are multiplying only positive numbers.
`
`First Version of the Multiplication Algorithm and Hardware
`The initial design mimics the algorithm we learned in grammar school; the
`hardware is shown in Figure 4.25. We have drawn the hardware so that data
`flows from top to bottom to more closely resemble the paper-and-pencil
`method.
`Let's assume that the multiplier is in the 32-bit Multiplier register and that
`the 64-bit Product register is initialized to 0. From the paper-and-pencil exam(cid:173)
`ple above, it's dear that i.ve will need to move the multiplicand left one digit
`each step as it may be added to the intermediate products. Over 32 steps a
`
`--Mult1phcand
`
`Shirt ie1't
`
`64-bit ALU
`
`Product
`
`64 bits
`
`Write , __ ....,
`
`Control test
`
`-Multiplier
`
`Shi~ right
`32 bits
`
`FIGURE 4.25 First version of tlle multlplic:atlon h11rdw11re. The Multiplicand regbter, ALU.
`and Product rcgbter arc all r,4 bil~ wide, with only tht> Multiplier register t·ontainin~ :12 bit~. The
`32-bit m11ltiplic,md :,,tarts in the right half of the Multiplicand register, ,rnd is shifted left 1 bit nn
`each step. The multiplier is ~hifted in the opposite direction at e,Kh step. The al~orithm starts
`with the product infti,ilized toll. Control decides wlwn to ~hift the ~foltiplic,rnd and \.lultiplier
`n:gi~ters ,md when tu write new ,·alue:,, into the Prnduct n:gister.
`
`INTEL - 1013
`
`

`

`252
`
`Chapter 4 Arithmetic for Computer•
`
`-
`
`32-bit multiplicand would move 32 bits to the left. Hence we need a 64-bit Mul(cid:173)
`tiplicand register, initialized with the 32-bit multiplicand in the right half and
`O in the left half. This register is then shifted left

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket