`
`
`~----~~~~----~/
`
`!
`j
`I
`\-
`
`'
`
`~·· • 0 007 615 824 1 0
`-·~-
`
`/
`
`";
`'
`
`·" .~ .;"·
`
`--
`-<·--~ _______ :: __ :_~--~-~=--~ ~--
`
`-- -~~---:-:-:-<- -- -<-~
`' - - - ~ .- -
`
`.-
`
`- -
`--< __ -_-___ -~-<--<.<-~ ·. -_-
`
`- --
`- ---- ·- ..... -- -
`-.- -.-.. - -----.·--.:. ---;~:=.=-·~----~-
`
`.
`
`..... '
`
`.. "
`
`_-_- ---
`
`-
`
`:--(cid:173)
`. ~- ,_ '-''"'
`
`-
`
`--.
`
`---
`
`-
`
`-
`
`--
`
`----
`
`. " -
`
`:'-,>---.:-::;:_::o;:_::~:,:-:::':-~--::-
`
`-- c::.o>;i._.:>:;.:<·""C"-:::---:c·"""
`
`:-__ . ---~--
`
`-- -:··;.x,.-~-
`
`COMP.UTEI\,..dRGANiZATlON .·
`
`-~DESIGN
`.-c· THE HARDWARE/ SOFTWARE
`
`">·
`
`...
`
`·,
`
`,_
`
`-.. INTERFACE
`
`PAnERSON
`HENNESSY
`
`-tn
`=o
`"'3:
`=~ >c
`~---~
`0""
`-~~
`>o
`~~
`"'~
`... --tN
`'Vi> oz
`~:= >(cid:173)~-~
`
`FT MEADE
`GenCo! I
`
`QA 76
`.9
`: .C643 H46
`1994
`
`--j
`-,
`-
`- - ~
`
`COPY 2
`
`MORGAN
`-KAUFMANN
`
`--------
`--- :_::.:·:·_:>--:·;_~_~-:~ --~~~~=<-->;-· ~--·
`- :.~ --=->-:_---;:~.
`---_-_- __ -~:::---::._~:~:-- .. -
`-
`
`--"-__,-; .. ··--<~-:.-~-~-:::- .. -:-_ ---
`-----------
`
`~--
`
`.
`
`wi~~-~~€~S~Ushe~ r ~~~~;y;:; .•. =~:r.id oman················
`
`Hughes, Exh. 1020, p. 1
`
`
`
`Computer Organization and Design
`
`THE HARDWARE/SOFTWARE
`
`INTERFACE
`
`...
`
`..
`
`John L. Hennessy
`Stanford University
`
`David A. Patterson
`University of California at Berkeley
`
`With a contribution by
`James R. Larus
`University of Wisconsin
`
`Morgan Kaufmann Publishers
`San Mateo, California
`
`Hughes, Exh. 1020, p. 2
`
`
`
`Senior Editor: Bruce M. Spatz
`Production Manager: Y onie Overton
`Editorial Coordinator: Douglas Sery
`Copyeditlng: Steve Hiatt and Gary Morris
`Text Design: Ross Carron Design
`Illustration: Alexander Teshin Associates
`Composition/Color Separation/Postscript
`Programming: Edward W. Sznyter, Babel Press
`Cover Design: David Lance Goines
`Additional Cover Mechanical Art: Patty King
`Chapter Opener Illustrations: Jo Jackson
`Indexing: Steve Rath
`Proofreading: Gary Morris
`Electronic Prepress: The Courier Connection
`Printer: Courier Corporation
`
`Morgan Kaufmann Publishers, Inc.
`Editorial Office:
`2929 Campus Drive, Suite 260
`San Mateo, CA 94403
`
`© 1994 by Morgan Kaufmann Publishers, Inc.
`All rights reserved
`Printed in the United States of America
`
`97 96 95 94
`
`54 3 2 1
`
`No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
`or by any means-electronic, mechanical, photocopying, recording, or otherwise--without the prior
`written permission of the publisher.
`
`Advice, Praise, and Errors: Any correspondence related to this publication or intended for the authors
`should be addressed to the editorial offices of Morgan Kaufmann Publishers, Inc., Dept. P&H APE. In(cid:173)
`formation regarding error sightings is encouraged. Any error sightings that are accepted for correction
`in subsequent printings will be rewarded by the authors with a payment of $1.00 (U.S.) per correction at
`the time of their implementation in a reprint. Electronic mail can be sent to errors@cs.berkeley.edu.
`
`Instructor Support: For information on the SPIM software simulator and other instructor materials
`available to adaptors, please contact the editorial offices of Morgan Kaufmann Publishers, Inc.
`
`f
`
`I I
`I
`I
`
`Cataloging-in-Publication Data
`
`Patterson, David A.
`Computer organization and design: the hardware/ software interface
`I David A. Patterson, John L. Hennessy.
`p. em.
`Includes bibliographical references and index.
`ISBN 1-55860-281-X
`1. Computer organization. 2. Computers--Design and construction.
`I. Hennessy, John L.
`3. Computer interfaces.
`II. Title
`<r-;-
`QA76.9.C643P37 1994
`004.2'2-dc20
`*17639
`CIP
`
`Hughes, Exh. 1020, p. 3
`
`
`
`--
`
`viii
`
`~
`l
`! ;
`\
`
`I
`
`Contents
`
`Foreword vi
`by Maurice V. Wilkes
`
`Preface xiii
`
`The SPIM Simulator for the MIPS R2000/R3000 xxiii
`by James R. Larus, University of Wisconsin
`
`CHAPTERS
`
`Computer Abstractions and Technology 2
`
`1.1
`Introduction 3
`1.2 Below Your Program 5
`1.3 Under the Covers 10
`1.4
`Integrated Circuits: Fueling Innovation 21
`1.5 Fallacies and Pitfalls 26
`1.6 Concluding Remarks 28
`1.7 Historical Perspective and Further Reading 30
`1.8 Exercises 41
`
`The Role of Performance 46
`
`2.1
`Introduction 48
`2.2 Measuring Performance 52
`2.3 Relating the Metrics 54
`2.4 Popular Performance Metrics 60
`2.5 Choosing Programs to Evaluate Performance 66
`2.6 Comparing and Summarizing Performance 68
`2.7 Fallacies and Pitfalls 70
`2.8 Concluding Remarks 76
`2.9 Historical Perspective and Further Reading 77
`2.10 Exercises 81
`
`Hughes, Exh. 1020, p. 4
`
`
`
`Contents
`
`ix
`
`Instructions: Language of the Machine 92
`
`Introduction 94
`3.1
`3.2 Operations of the Computer Hardware 95
`3.3 Operands of the Computer Hardware 97
`3.4 Representing Instructions in the Computer 103
`3.5
`Instructions for Making Decisions 110
`3.6 Supporting Procedures in Computer Hardware 119
`3. 7 Other Styles of MIPS Addressing 124
`3.8 Alternatives to the MIPS Approach 130
`3.9 An Example to Put It All Together 135
`3.10 A Longer Example 138
`3.11 Arrays versus Pointers 143
`3.12 Fallacies and Pitfalls 147
`3.13 Concluding Remarks 148
`3.14 Historical Perspective and Further Reading 150
`3.15 Exercises 155
`
`k~Q:j Arithmetic for Computers
`
`166
`
`Introduction 168
`4.1
`4.2 Negative Numbers 168
`4.3 Addition and Subtraction 175
`4.4 Logical Operations 179
`4.5 Constructing an Arithmetic Logic Unit 182
`4.6 Multiplication 198
`4.7 Division 212
`4.8 Floating Point 225
`4.9 Fallacies and Pitfalls 244
`4.10 Concluding Remarks 246
`4.11 Historical Perspective and Further Reading 249
`4.12 Exercises 258
`
`The Processor: Datapath and Control 268
`
`5.1
`Introduction 270
`5.2 Building a Datapath 276
`5.3 A Simple Implementation Scheme 283
`5.4 A Multiple Clock Cycle Implementation 312
`5.5 Microprogramming: Simplifying Control Design 333
`5.6 Exceptions 344
`5. 7 Fallacies and Pitfalls 350
`5.8 Concluding Remarks 351
`5.9 Historical Perspective and Further Reading 353
`5.10 Exercises 357
`
`Hughes, Exh. 1020, p. 5
`
`
`
`X
`
`Contents
`
`Enhancing Performance with Pipelining 362
`
`Introduction 364
`6.1
`6.2 A Pipelined Datapath 367
`6.3 Pipelined Control 381
`6.4 Data Hazards 390
`6.5 Control for Data Hazards: Stalls 399
`6.6 Reducing Data Hazards: Forwarding 412
`6. 7 Branch Hazards 424
`6.8 Exceptions 430
`6.9 Performance of Pipelined Systems 435
`6.10 Fallacies and Pitfalls 436
`6.11 Concluding Remarks 438
`6.12 Historical Perspective and Further Reading 441
`6.13 Exercises 445
`
`Large and Fast: Exploiting Memory Hierarchy 452
`
`Introduction 454
`7.1
`7.2 Caches 458
`7.3 Virtual Memory 481
`7.4 A Common Framework for Memory Hierarchies 501
`7.5 Fallacies and Pitfalls 515
`7.6 Concluding Remarks 519
`7.7 Historical Perspective and Further Reading 521
`7.8 Exercises 527
`
`Interfacing Processors and Peripherals 532
`
`Introduction 534
`8.1
`8.2 1/0 Performance Measures: Some Examples from Disk and
`File Systems 537
`8.3 Types and Characteristics of 1/0 Devices 539
`8.4 Buses: Connecting 1/0 Devices to Processor and Memory 548
`Interfacing 1/0 Devices to the Memory, Processor, and
`8.5
`Operating System 565
`8.6 Fallacies and Pitfalls 576
`8. 7 Concluding Remarks 578
`8.8 Historical Perspective and Further Reading 581
`8.9 Exercises 584
`
`Hughes, Exh. 1020, p. 6
`
`
`
`Contents
`
`xi
`
`~ Parallel Processors 594
`
`Introduction 596
`9.1
`9.2 SIMD Computers-Single Instruction Stream, Multiple Data Streams 596
`9.3 MIMD Computers-Multiple Instruction Streams, Multiple Data
`Streams 602
`9.4 Programming MIMDs 603
`9.5 MIMDs Connected by a Single Bus 607
`9.6 MIMDs Connected by a Network 619
`9. 7 Future Directions for Parallel Processors 630
`9.8 Fallacies and Pitfalls 637
`9.9 Concluding Remarks-Evolution versus Revolution in Computer
`Architecture 640
`9.10 Historical Perspective and Further Reading 642
`9.11 Exercises 646
`
`APPENDICES
`
`Assemblers, Linkers, and the SPIM Simulator A-2
`by James R. Larus, University of Wisconsin
`A.1
`Introduction A-3
`A.2 Assemblers A-10
`A.3 Linkers A-17
`A.4 Loading A-19
`A.S Memory Usage A-19
`A.6 Procedure Call Convention A-21
`A.7 Exceptions and Interrupts A-30
`A.S
`Input and Output A-34
`A.9 SPIM A-36
`A.10 MIPS R2000 Assembly Language A-47
`A.11 Concluding Remarks A-71
`A.12 Exercises A-72
`
`The Basics of Logic Design B-2
`
`Introduction B-3
`B.1
`B.2 Gates, Truth Tables, and Logic Equations B-4
`B.3 Combinational Logic B-8
`B.4 Clocks B-18
`B.S Memory Elements B-21
`B.6 Finite State Machines B-35
`B. 7 Timing Methodologies B-39
`B.S Exercises B-45
`
`~
`------------------...... ==-------;,_,~\
`
`Hughes, Exh. 1020, p. 7
`
`
`
`__ ,
`
`r~
`;,.·;
`
`xii
`
`Contents
`
`Mapping Control to Hardware c-2
`
`Introduction C -3
`C.1
`Implementing Finite State Machine Control C-4
`c.2
`Implementing the Next·State Function with a Sequencer C-15
`C.3
`C.4 Translating a Microprogram to Hardware C-23
`C.S Concluding Remarks C-27
`C.G Exercises C-28
`
`Introducing C to Pascal Programmers D-2
`
`Introduction D-3
`D.1
`D.2 Variable Declarations D-3
`D.3 Assignment Statements D-4
`D.4 Relational Expressions and Conditional Statements D-5
`D.S Loops D-6
`D.G Examples to Put it All Together D-7
`D. 7 Exercises D-8
`
`Another Approach to Instruction Set Architecture-VAX E-2
`
`Introduction E-3
`E.1
`E.2 VAX Operands and Addressing Modes E-4
`E.3 Encoding VAX Instructions E-7
`E.4 VAX Operations E-9
`E.S An Example to Put It All Together: swap E-ll
`E.G A Longer Example: sort E-15
`E. 7 Fallacies and Pitfalls E-19
`E.S Concluding Remarks E-22
`E.9 Historical Perspective and Further Reading E-23
`E.10 Exercises E-25
`
`Index
`
`I-1
`
`-;
`
`Hughes, Exh. 1020, p. 8
`
`
`
`I
`
`I
`
`~ i
`:I
`·j
`
`I
`
`i ;
`i j'
`'
`
`3.2 Operations of the Computer Hardware
`
`95
`
`the simulator that comes with this book. We conclude with a look at the histor(cid:173)
`ical evolution of instruction sets and an overview of other machine dialects.
`The chosen instruction set comes from MIPS Computer Company and is
`typical of instruction sets designed since the early 1980s. We reveal the MIPS
`instruction set a piece at a time, giving the rationale along with the machine
`structures. This step-by-step tutorial weaves the components with their expla(cid:173)
`nations, making assembly language more palatable. To keep the overall pic(cid:173)
`ture in mind, each section ends with a figure summarizing the MIPS
`instruction set revealed thus far, highlighting the portions presented in that
`section.
`
`Operations of the Computer Hardware
`
`There must certainly be instructions for performing the fundamental arithmetic
`operations.
`
`Burks, Goldstine, and von Neumann, 1947
`
`Every computer must be able to perform arithmetic. The MIPS notation
`add a, b, c
`instructs a computer to add the two variables b and c and to put their sum in a.
`This notation is rigid in that each MIPS arithmetic instruction must always
`have exactly three variables. For example, suppose we want to place the sum
`of variables b, c, d, and e into variable a. The following sequence of instruc(cid:173)
`tions adds the variables:
`#The sum of b and c is placed in a.add a.
`add a, b, c
`#The sum of b, c and d is now in a.add a.
`a, d
`#The sum of b, c. d and e is now in a.
`a, e
`Thus it takes three instructions to take the sum of four variables.
`The words to the right of the sharp symbol(#) on each line above are com(cid:173)
`ments for the human reader, and they are ignored by the computer. Note that
`unlike other programming languages, each line of this language can contain,
`at most, one instruction. Another difference is that comments terminate at the
`end of a line.
`The natural number of operands for an operation like addition is three: the
`two numbers being added together and a placeholder for the sum. Requiring
`every instruction to have exactly three operands, no more and no less, con(cid:173)
`forms to the philosophy of keeping the hardware simple. Hardware for a vari(cid:173)
`able number of operands is more complicated than hardware for a fixed
`
`Hughes, Exh. 1020, p. 9
`
`
`
`96
`
`Chapter 3
`
`Instructions: Language of the Machine
`
`I''' 1 '1"' {"f:.<o]j,'jh ~ I
`adrl
`Arithmetic
`subtract
`
`MIPS assembly language
`I~'Et'tlo ,,.,,
`.. lll"''i'
`a = b + c
`add a.b.c
`a = b - c
`sub a. b. c
`
`.~
`
`~!
`
`I
`
`-l! c
`
`.-
`Always 3 operands
`Always 3 operands
`
`FIGURE 3.1 MIPS architecture revealed In section 3.1. The real machine operands will be
`unveiled in the next section. Highlighted portions in such summaries show MIPS structures
`introduced in this section; for this first figure, all is new.
`
`number. This situation illustrates the first of four underlying principles of
`hardware design:
`Principle 1: Simplicity favors regularity.
`We can now show, in the two examples that follow, the relationship of pro(cid:173)
`grams written in programming languages to programs in this more primitive
`notation. Figure 3.1 summarizes the portions of MIPS described in this section.
`
`Example
`
`This segment of a C program contains the five variables a, b, c, d, and e:
`
`b + c;
`a
`d = a - e;
`
`(Reminder: We are using the C programming language; readers familiar
`with another language should refer to Appendix D for a short comparison
`of C with Pascal.) The translation from C to MIPS instructions is performed
`by the compiler. Show the MIPS code produced by the C compiler.
`
`Answer
`
`The C compiler could produce
`
`add a. b. c
`subd, a, e
`
`These instructions are symbolic representations of what the processor
`understands; they are called assembly language instructions.
`
`Hughes, Exh. 1020, p. 10
`
`
`
`3.3 Operands of the Computer Hardware
`
`97
`
`Example
`
`A more complex C statement contains the five variables f, g, h, i, and j:
`
`f = (g +h) -
`
`(i + j);
`
`What would the C compiler produce?
`
`Answer
`
`The statement might be compiled into the following three MIPS instruc(cid:173)
`tions:
`
`add tO,g,h # temporary variable tO contains g+h
`add tl,i,j #temporary variable t1 contains i+j
`sub f,t0,t1 # f gets t0-t1, or (g+h)-(i+j)
`
`Note that the compiler created two new variables, tO and t 1, to express the
`program in the restricted three-operands-per-instruction notation of the
`machine.
`
`[iJ Operands of the Computer Hardware
`
`Unlike programs in high-level languages, the operands of arithmetic instruc(cid:173)
`tions cannot be any variables; they must be from a limited number of special
`locations called registers. Registers are the bricks of computer construction, for
`registers are primitives used in hardware design that are also visible to the pro(cid:173)
`grammer when the computer is completed. The size of a register in the MIPS
`architecture is 32 bits; groups of 32 bits occur so frequently that they are given
`the name word in the MIPS architecture.
`One major difference between the variables of a programming language
`and registers is the limited number of registers, typically between 16 and 32 on
`current computers. MIPS has 32 registers, using the notation $0, $1, ... , $31 to
`represent them. (See section 3.14 for the history of the number of registers.)
`The reason for this limit may be found in the second of our four underlying
`principles of hardware technology:
`Principle 2: Smaller is faster.
`A very large number of registers would increase the clock cycle time simply
`because it takes electronic signals longer when they must travel farther. Guide(cid:173)
`lines such as "smaller is faster" are not absolutes; 31 registers may not be faster
`than 32. Yet the truth behind such observations causes computer designers to
`
`i
`
`!
`I I .
`
`i
`
`Hughes, Exh. 1020, p. 11
`
`
`
`/:
`
`98
`
`Chapter 3
`
`Instructions: Language of the Machine
`
`take them seriously.ln this case, the ~esigr;er m~st balance the craving of pro(cid:173)
`grams for more registers with the designer s desire to keep th~ clock cycle fast.
`Chapters 5 and 6 show the central role that r~gisters play~ har~ware con(cid:173)
`struction; as we shall see in this chapter, effectiVe use of registers IS a key to
`program performance.
`
`Example
`
`It is the compiler's job to associate program variables with registers. Take,
`for instance, the C statement from our earlier example:
`
`f = (g +h) -
`
`(i + j);
`
`The variables f, g, h, i, and j can be assigned to the registers $16, $17, $18,
`$19, and $20, respectively. What is the compiled MIPS assembly code?
`
`Answer
`
`The compiled program is
`
`add $8,$17,$18
`add $9,$19,$20
`sub $16,$8,$9
`
`# register $8 contains g+h
`# register $9 contains i+j
`# f gets $8-$9, or (g+h)-(i+j)
`
`Registers $8 and $9 correspond to tO and t1 in the earlier example.
`
`Programming languages have simple variables that contain single data ele(cid:173)
`ments as in these examples, but they also have more complex data structures
`such as arrays. These complex data structures can contain many more data el(cid:173)
`ements than there are registers in a machine. How can a computer represent
`and access such large structures?
`Recall the five components of a computer introduced in Chapter 1 and de(cid:173)
`picted on the second page of this chapter (page 93). The processor can store
`only a small amount of data in registers, but computer memory contains mil(cid:173)
`lions of data elements. Hence data structures, like arrays, are kept in memory.
`As explained above, arithmetic operations occur only on registers in MIPS;
`thus MIPS must include instructions that transfer data between memory and
`registers. Such instructions are called data transfer instructions. To access a
`word in memory, the instruction must supply its address. Memory is really just
`a large, single-dimensional array, with the address acting as the index to that
`array. Addresses start at 0. For example, in Figure 3.2, the address of the third
`data element is 2, and the value of Memory[2] is 1000.
`
`Hughes, Exh. 1020, p. 12
`
`
`
`Wll"..E..-o~-------------------~--------------------------- --."'"
`; 1 "tn .:.--..:
`1 ~~c.
`i
`'·I j;:!
`I I
`I i
`
`3.3 Operands of the Computer Hardware
`
`99
`
`Processor
`
`Memory
`
`Address
`
`0
`
`1
`
`2
`
`3
`
`Data
`
`100
`
`10
`
`1000
`
`1
`
`FIGURE 3.2 Memory addresses and contents of memory at those locations. This is a sim(cid:173)
`plification of the MIPS addressing; Figure 3.3 shows MIPS addressing for sequential words in
`memory.
`
`The data transfer instruction that moves data from memory to a register is
`called load. The format of the load instruction is the name of the operation fol(cid:173)
`lowed by the register to be loaded, then the start address of the array, and
`finally a register that contains the index of the element of the array to be load(cid:173)
`ed. Thus the memory address of the array element is formed by the sum of the
`constant portion of the instruction and a register. The MIPS name for this in(cid:173)
`struction is l w, standing for load word.
`
`Example
`
`Assume that A is an array of 100 elements and that the compiler has associ(cid:173)
`ated the variables g, h, and i with the registers $17, $18, and $19. Let's also
`assume that the array starts at address As tart. Translate this C statement:
`
`g = h + A[i];
`
`Answer
`
`The C assignment statement becomes
`
`lw
`add
`
`$8,Astart($19)
`$17,$18,$8
`
`#Temporary reg $8 gets A[i]
`# g = h + A[i]
`
`Hughes, Exh. 1020, p. 13
`
`
`
`"·~/
`;.t,...'_-
`/·~r
`
`100
`
`Chapter 3
`
`Instructions: Language of the Machine
`
`The load instruction l w adds the starting address of the array A (named
`As tart here) to the index i in register $19 to form the address of element
`A [; ] • The register added to the address is therefore called the index register.
`The processor then reads the value from memory at that address and places
`it into register $8, which is used as a temporary variable. The following add
`instruction can operate on the value in $8 (which equals A[ i J) since it is in
`a register. The instruction adds A [ i J to h and puts the sum in the register
`corresponding to g.
`
`In addition to associating variables with registers, it is up to
`the compiler to allocate data structures like arrays to loca(cid:173)
`tions in memory. The compiler can then place the proper
`starting address into the data transfer instructions.
`Since 8-bit bytes are useful in many programs, the MIPS
`architecture addresses individual bytes. The address of a
`word is therefore actually the same as one of the 4 bytes in a
`word. Hence, addresses of sequential words differ by 4. Figure 3.3 shows the
`actual addresses for Figure 3.2. (Appendix A, section A.6 on page A-21, shows
`the two ways to number bytes in a word.)
`
`Processor
`
`Memory
`
`Address
`
`0
`
`4
`
`8
`
`12
`
`Data
`
`100
`
`10
`
`1000
`
`1
`
`dd
`FIGURE 3.3 Actual MIPS mem
`ory a
`resses and contents of memory for those words.
`Th h
`e c anged addresses are highlighted to contrast to Figure 3.2.
`
`Hughes, Exh. 1020, p. 14
`
`
`
`3.3 Operands of the Computer Hardware
`
`101
`
`Byte addressing also affects the index i. To get the proper byte address in
`the code above, register $19 must have 4 x i so that the sum of $19 and
`As t a r t will select A[ i J and not A [ i I 4 J .
`
`The instruction complementary to load is called store; it transfers data from
`a register to memory. The format of a store is similar to that of a load:
`the
`name of the operation, followed by the register to be stored, then the starting
`address of the array, and finally a register that contains the index to the ele(cid:173)
`ment of the array to be stored. The MIPS name is sw, standing for store word.
`Once again, the MIPS address is specified in part by a constant and in part by
`a register.
`
`Example
`
`Assume the variables g and h are associated with the registers $17 and $18.
`To accommodate the byte addresses of MIPS, assume that register $19 now
`has the value 4 xi. Chapter 4 explains how to multiply in MIPS; for now
`assume the multiplication has already occurred. What is the MIPS assembly
`code for the C statement below?
`
`A[i] = h + A[i];
`
`Answer
`
`The C assignment statement becomes
`
`$8,Astart($19)
`lw
`add $8,$18,$8
`sw
`$8,Astart($19)
`
`#Temporary reg $8 gets A[i]
`#Temporary reg $8 gets h+A[i]
`# Stores h+A[i] back into A[i]
`
`Instead of placing the sum of h and A [ i J into register $17, as in the prior
`example, the sum is placed into temporary register $8 and then stored back
`into A[ i].
`
`These are the instructions that transfer words between memory and regis(cid:173)
`ters in the MIPS architecture. Other brands of computers use instructions in
`addition to load and store to transfer data; these alternatives are described in
`section 3.8. Figure 3.4 summarizes the portions of MIPS described in this sec(cid:173)
`tion.
`
`Hughes, Exh. 1020, p. 15
`
`
`
`258
`
`Chapter 4 Arithmetic for Computers
`
`Exercises
`
`Never give in, never give in, never, never, never-in nothing, great or small, large
`or petty-never give in ....
`
`Winston Churchill, Address at Harrow School, October 29, 1941
`
`4.1 [15] <§3, 4.2, 4.8> The Big Picture on page 243 mentions that bits have no
`inherent meaning. Given the bit pattern
`
`10001111111011111100 0000 0000 0000
`
`What does it represent, assuming that it is
`
`a. a two's complement integer?
`
`b. an unsigned integer?
`
`c. a single precision floating-point number?
`
`d. a MIPS instruction?
`
`4.2 [10] <§4.2, 4.4, 4.8> This exercise is similar to Exercise 4.1, but this time use
`the bit pattern
`
`0000 0000 0000 0000 0000 0000 0000 0000
`
`4.3 [3] <§4.2> Convert 512ten into a 32-bit two's complement binary number.
`
`4.4 [3] <§4.2> Convert -1,023ten into a 32-bit two's complement binary num(cid:173)
`ber.
`
`4.5 [5] <§4.2> Convert -4,000,000ten into a 32-bit two's complement binary
`number.
`
`4.6 [5] <§4.2> What decimal number does this two's complement binary
`number represent: 111111111111111111111110 0000 1100tw
`?
`
`0
`4.7 [5] <§4.2> What decimal number does this two's complement binary
`number represent: 1111111111111111 1111 1111 11111111two?
`
`4.8 [5] <§4.2> What decimal number does this two's complement binary
`number represent: 01111111111111111111 1111 11111111tw
`?
`
`0
`4.9 [5] <§4.2> What binary number does this hexadecimal number represent:
`7fff fffahex? What decimal number does it represent?
`
`Hughes, Exh. 1020, p. 16
`
`
`
`---------------'-- ..
`
`·~
`
`4.12 Exercises
`
`259
`
`1\
`I
`!
`'I
`
`4.10 [5] <§4.2> What hexadecimal number does this binary number
`represent: 1100101011111110111110101100 1110tw0 ?
`4.11 [5] <§4.8> Using the notation in the Hardware Software Interface sec(cid:173)
`tions on pages 226 and 227, show the MIPS binary floating-point formats in
`single precision and double precision for 10ten·
`
`4.12 [5] <§4.8> This exercise is similar to Exercise 4.11, but this time replace
`the number 10ten with 10.5ten·
`4.13 [10] <§4.8> This exercise is similar to Exercise 4.11, but this time replace
`the number 10ten with 0.1ten·
`4.14 [10] <§4.10> For the program gee (Figure 4.46 on page 248), find the 10
`most frequently executed MIPS instructions. List them in order of popularity,
`from most used to least used. Show the rank, name, and percentage of instruc(cid:173)
`tions executed for each instruction. If there is a tie for a given rank, list all in(cid:173)
`structions that tie with the same rank, even if this results in more than 10
`instructions.
`
`4.15 [10] <§4.10> This exercise is similar to Exercise 4.14, but this time replace
`the program gee with the program spice.
`
`4.16 <§4.10> {§4.14, 4.15} These questions examine the relative frequency of
`instructions in different programs.
`
`a.
`
`b.
`
`c.
`
`d.
`
`[5] Which instructions are found in both the answer to Exercise 4.14 and in
`the answer to Exercise 4.15?
`
`[5] What percentage of gee instructions executed is due to the instructions
`identified in Exercise 4.14a?
`
`[5] What percentage of gee instructions executed is due to the instructions
`identified in Exercise 4.14?
`
`[5] What percentage of spice instructions executed is due to the instruc(cid:173)
`tions identified in Exercise 4.14a?
`
`e.
`
`[5] What percentage of spice instructions executed is due to the instruc-
`tions identified in Exercise 4.15?
`4.17 [10] <§4.10> {ex. 4.14, 4.15, 4.16} If you were designing a machine to ex(cid:173)
`ecute the MIPS instruction set, what are the five instructions that you would
`try to make as fast as possible, based on the answers to Exercises 4.14 through
`4.16? Give your rationale.
`4.18 [15] <§2, 4.10> Using Figure 4.46 on page 248, calculate the average clock
`cycles per instruction (CPI) for the program gee. Figure 4.47 gives the average
`
`Hughes, Exh. 1020, p. 17
`
`
`
`260
`
`Chapter 4 Arithmetic for Computers
`
`I [•.'f·
`
`·"!.{t
`
`.,,,, '.(
`
`r:.:"::::"'
`Loads and stores
`
`Conditional branch
`
`Jumps
`Integer multiply
`
`Integer divide
`Floating-point add and subtract
`Floating-point multiply, single precision
`Floating-point multiply, double precision
`Floating-point divide, single precision
`
`Floating-point divide, double precision
`
`'!(~~(!;!~~
`01.4
`01.8
`01.2
`10.0
`30.0
`02.0
`04.0
`05.0
`12.0
`19.0
`
`FIGURE 4.47 CPI for MIPS Instruction categories.
`
`CPI per instruction category, taking into account cache misses and other ef(cid:173)
`fects. Assume that instructions omitted from the table have a CPI of 1.0.
`
`4.19 [15] <§2, 4.10> This exercise is similar to Exercise 4.18, but this time re(cid:173)
`place the program gee with the program spice.
`
`4.20 [5] <§4.2> Why doesn't MIPS have a subtract immediate instruction?
`
`4.21 [10] <§4.2> Find the shortest sequence of MIPS instructions to determine
`the absolute value of a two's complement integer. Convert this instruction (ac(cid:173)
`cepted by the MIPS assembler):
`
`abs
`
`$10,$11
`
`This instruction means that register $10 has a copy of register $11 if ~egister
`$11 is positive, and the two's complement of register $11 if $11 is negative.
`4.22 [10] <§4.3> Find the shortest sequence of MIPS instructions to determine
`if there is a carry out from the addition of two registers, say registers $11 and
`$12. Place a 0 or 1 in register $10 if carry out is 0 or 1, respectively.
`
`4.23 [15] <§4.3> {ex. 4.22} Find the shortest sequence of MIPS instructions to
`perform double precision integer addition. Assume that one 64-bit, two's com(cid:173)
`plement integer is in registers $12 and $13 and another is in registers $14 and
`$~.5. ~e sum is to be placed in registers $10 and $11. In this example th~ most
`slgmflcant word of the 64-bit integer is found in the even-numbered reglsters,
`and the least significant word is found in the odd-numbered registers.
`
`Hughes, Exh. 1020, p. 18
`
`
`
`4.12 Exercises
`
`261
`
`4.24 [20] <§4.6> Find the shortest sequence of MIPS instructions to perform
`double precision integer multiplication. Assume that one 64-bit, unsigned inte(cid:173)
`ger is in registers $12 and $13 and another is in registers $14 and $15. The
`128-bit product is to be placed in registers $8, $9, $10 and $11. The most signif(cid:173)
`icant word is found in the lower numbered registers, and the least significant
`word is found in the higher numbered registers in this example. Hint: Write
`out the formula for (ax 232 +b) x (c x 232 +d).
`4.25 [15] <§4.5> The ALU supported set on less than ( s l t) using just the sign
`bit of the adder. Let's try a set-on-less-than operation using the values -7ten
`and 6ten· To make it simpler to follow the example, let's limit the binary repre(cid:173)
`sentations to 4 bits: 1001two and OllOtwo·
`1001two- OllOtwo = 1001two + 1010two = OOlltwo
`
`This result would suggest that -7 > 6, which is clearly wrong. Hence we must
`factor in overflow in the decision. Modify the 1-bit ALU in Figure 4.15 on
`page 192 to handle s l t correctly. Make your changes on a photocopy of this
`figure to save time.
`4.26 [15] <§4.4> Some computers have explicit instructions to extract an arbi(cid:173)
`trary field from a 32-bit register and place it in the least significant bits of a reg(cid:173)
`ister. The figure below shows the desired operation:
`
`31
`
`31
`0 ...
`
`31-jbits
`
`0
`
`32- G - i) bits
`
`j- i bits
`
`Find the shortest sequence of MIPS instructions that extracts a field for the
`constant values i = 7 and j = 19 from register $16 and places it in register $17.
`
`In More Depth: Logical Instructions
`
`The full MIPS instruction set has two more logical operations not mentioned
`thus far: xor and nor. The operation xor stands for exclusive OR, and nor
`stands for not OR. The table that follows defines these operations on a bit(cid:173)
`by-bit basis. These instructions will be useful in the following two exercises.
`
`Hughes, Exh. 1020, p. 19
`
`
`
`262
`
`Chapter 4 Arithmetic for Computers
`
`0
`0
`1
`1
`
`0
`1
`0
`1
`
`0
`1
`
`1
`0
`
`1
`0
`0
`0
`
`4.27 [15] <§4.4> Show the minimal MIPS instruction sequence for a new in(cid:173)
`struction called swap that exchanges two registers. After the sequence com(cid:173)
`pletes, the Destination register has the original value of the Source register,
`and the Source register has the original value of the Destination register. Con(cid:173)
`vert this instruction:
`swap rlO,r20
`
`The hard part is that this sequence must use only these two registers! Hint: Try
`to use the new logical instructions: What is the value of (0 xor A)? (B xor B)?
`4.28 [5] <§4.4> Show the minimal MIPS instruction sequence for a new in(cid:173)
`struction called not that takes the one's complement of a Source register and
`places it in a Destination register. Convert this instruction (accepted by the
`MIPS assembler):
`not rlO,r20
`Hint: Try to use the new logical instructions.
`
`4.29 [20] <§4.5> A simple check for overflow during addition is to see if the
`Carry In to the most significant bit is not the same as the CarryOut of the most
`significant bit. Prove that this check is the same as in Figure 4.3 on page 177.
`
`4.30 [10] <§4.5> Draw the gates for the Sum bit of an adder, given the equa(cid:173)
`tion on page 187.
`
`4.31 [5] ~§4.5> Rewrite the equations on page 197 for a carry lookahead lo?ic
`for a 16-bit adder using a new notation. First use the names for the Carry In sig(cid:173)
`nals of the individual bits of the adder. That is, use c4, c8, c12, ... instead of
`C1, C2, C3, .... Also, let Pi,j mean a propagate signal for bits ito j, and Gi,j
`mean a generate signal for bits ito j. For example, the equation
`C2 = G1 + (P1·GO) + (P1·PO ·cO)
`
`can be rewritten as
`cS = G7,4+ (P7,4·G3,o) + (P7,4.p3,a·cO)
`This more general notation is useful in creating wider adders.
`
`Hughes, Exh. 1020, p. 20
`
`
`
`4.12 Exercises
`
`263
`
`4.32 [15] <§4.5> {ex. 4.31} Write the equations for a carry lookahead logic for
`a 64-bit adder using the new notation from Exercise 4.31 and using 16-bit
`adders as building blocks.
`
`4.33 [10] <§4.5> Now calculate the relative performance of adders. Assume
`that hardware corresponding to any equation containing only OR or AND
`terms, such as the equations for pi and gi on pa