throbber
- I
`
`
`~----~~~~----~/
`
`!
`j
`I
`\-
`
`'
`
`~·· • 0 007 615 824 1 0
`-·~-
`
`/
`
`";
`'
`
`·" .~ .;"·
`
`--
`-<·--~ _______ :: __ :_~--~-~=--~ ~--
`
`-- -~~---:-:-:-<- -- -<-~
`' - - - ~ .- -
`
`.-
`
`- -
`--< __ -_-___ -~-<--<.<-~ ·. -_-
`
`- --
`- ---- ·- ..... -- -
`-.- -.-.. - -----.·--.:. ---;~:=.=-·~----~-
`
`.
`
`..... '
`
`.. "
`
`_-_- ---
`
`-
`
`:--(cid:173)
`. ~- ,_ '-''"'
`
`-
`
`--.
`
`---
`
`-
`
`-
`
`--
`
`----
`
`. " -
`
`:'-,>---.:-::;:_::o;:_::~:,:-:::':-~--::-
`
`-- c::.o>;i._.:>:;.:<·""C"-:::---:c·"""
`
`:-__ . ---~--
`
`-- -:··;.x,.-~-
`
`COMP.UTEI\,..dRGANiZATlON .·
`
`-~DESIGN
`.-c· THE HARDWARE/ SOFTWARE
`
`">·
`
`...
`
`·,
`
`,_
`
`-.. INTERFACE
`
`PAnERSON
`HENNESSY
`
`-tn
`=o
`"'3:
`=~ >c
`~---~
`0""
`-~~
`>o
`~~
`"'~
`... --tN
`'Vi> oz
`~:= >(cid:173)~-~
`
`FT MEADE
`GenCo! I
`
`QA 76
`.9
`: .C643 H46
`1994
`
`--j
`-,
`-
`- - ~
`
`COPY 2
`
`MORGAN
`-KAUFMANN
`
`--------
`--- :_::.:·:·_:>--:·;_~_~-:~ --~~~~=<-->;-· ~--·
`- :.~ --=->-:_---;:~.
`---_-_- __ -~:::---::._~:~:-- .. -
`-
`
`--"-__,-; .. ··--<~-:.-~-~-:::- .. -:-_ ---
`-----------
`
`~--
`
`.
`
`wi~~-~~€~S~Ushe~ r ~~~~;y;:; .•. =~:r.id oman················
`
`Hughes, Exh. 1020, p. 1
`
`

`

`Computer Organization and Design
`
`THE HARDWARE/SOFTWARE
`
`INTERFACE
`
`...
`
`..
`
`John L. Hennessy
`Stanford University
`
`David A. Patterson
`University of California at Berkeley
`
`With a contribution by
`James R. Larus
`University of Wisconsin
`
`Morgan Kaufmann Publishers
`San Mateo, California
`
`Hughes, Exh. 1020, p. 2
`
`

`

`Senior Editor: Bruce M. Spatz
`Production Manager: Y onie Overton
`Editorial Coordinator: Douglas Sery
`Copyeditlng: Steve Hiatt and Gary Morris
`Text Design: Ross Carron Design
`Illustration: Alexander Teshin Associates
`Composition/Color Separation/Postscript
`Programming: Edward W. Sznyter, Babel Press
`Cover Design: David Lance Goines
`Additional Cover Mechanical Art: Patty King
`Chapter Opener Illustrations: Jo Jackson
`Indexing: Steve Rath
`Proofreading: Gary Morris
`Electronic Prepress: The Courier Connection
`Printer: Courier Corporation
`
`Morgan Kaufmann Publishers, Inc.
`Editorial Office:
`2929 Campus Drive, Suite 260
`San Mateo, CA 94403
`
`© 1994 by Morgan Kaufmann Publishers, Inc.
`All rights reserved
`Printed in the United States of America
`
`97 96 95 94
`
`54 3 2 1
`
`No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
`or by any means-electronic, mechanical, photocopying, recording, or otherwise--without the prior
`written permission of the publisher.
`
`Advice, Praise, and Errors: Any correspondence related to this publication or intended for the authors
`should be addressed to the editorial offices of Morgan Kaufmann Publishers, Inc., Dept. P&H APE. In(cid:173)
`formation regarding error sightings is encouraged. Any error sightings that are accepted for correction
`in subsequent printings will be rewarded by the authors with a payment of $1.00 (U.S.) per correction at
`the time of their implementation in a reprint. Electronic mail can be sent to errors@cs.berkeley.edu.
`
`Instructor Support: For information on the SPIM software simulator and other instructor materials
`available to adaptors, please contact the editorial offices of Morgan Kaufmann Publishers, Inc.
`
`f
`
`I I
`I
`I
`
`Cataloging-in-Publication Data
`
`Patterson, David A.
`Computer organization and design: the hardware/ software interface
`I David A. Patterson, John L. Hennessy.
`p. em.
`Includes bibliographical references and index.
`ISBN 1-55860-281-X
`1. Computer organization. 2. Computers--Design and construction.
`I. Hennessy, John L.
`3. Computer interfaces.
`II. Title
`<r-;-
`QA76.9.C643P37 1994
`004.2'2-dc20
`*17639
`CIP
`
`Hughes, Exh. 1020, p. 3
`
`

`

`--
`
`viii
`
`~
`l
`! ;
`\
`
`I
`
`Contents
`
`Foreword vi
`by Maurice V. Wilkes
`
`Preface xiii
`
`The SPIM Simulator for the MIPS R2000/R3000 xxiii
`by James R. Larus, University of Wisconsin
`
`CHAPTERS
`
`Computer Abstractions and Technology 2
`
`1.1
`Introduction 3
`1.2 Below Your Program 5
`1.3 Under the Covers 10
`1.4
`Integrated Circuits: Fueling Innovation 21
`1.5 Fallacies and Pitfalls 26
`1.6 Concluding Remarks 28
`1.7 Historical Perspective and Further Reading 30
`1.8 Exercises 41
`
`The Role of Performance 46
`
`2.1
`Introduction 48
`2.2 Measuring Performance 52
`2.3 Relating the Metrics 54
`2.4 Popular Performance Metrics 60
`2.5 Choosing Programs to Evaluate Performance 66
`2.6 Comparing and Summarizing Performance 68
`2.7 Fallacies and Pitfalls 70
`2.8 Concluding Remarks 76
`2.9 Historical Perspective and Further Reading 77
`2.10 Exercises 81
`
`Hughes, Exh. 1020, p. 4
`
`

`

`Contents
`
`ix
`
`Instructions: Language of the Machine 92
`
`Introduction 94
`3.1
`3.2 Operations of the Computer Hardware 95
`3.3 Operands of the Computer Hardware 97
`3.4 Representing Instructions in the Computer 103
`3.5
`Instructions for Making Decisions 110
`3.6 Supporting Procedures in Computer Hardware 119
`3. 7 Other Styles of MIPS Addressing 124
`3.8 Alternatives to the MIPS Approach 130
`3.9 An Example to Put It All Together 135
`3.10 A Longer Example 138
`3.11 Arrays versus Pointers 143
`3.12 Fallacies and Pitfalls 147
`3.13 Concluding Remarks 148
`3.14 Historical Perspective and Further Reading 150
`3.15 Exercises 155
`
`k~Q:j Arithmetic for Computers
`
`166
`
`Introduction 168
`4.1
`4.2 Negative Numbers 168
`4.3 Addition and Subtraction 175
`4.4 Logical Operations 179
`4.5 Constructing an Arithmetic Logic Unit 182
`4.6 Multiplication 198
`4.7 Division 212
`4.8 Floating Point 225
`4.9 Fallacies and Pitfalls 244
`4.10 Concluding Remarks 246
`4.11 Historical Perspective and Further Reading 249
`4.12 Exercises 258
`
`The Processor: Datapath and Control 268
`
`5.1
`Introduction 270
`5.2 Building a Datapath 276
`5.3 A Simple Implementation Scheme 283
`5.4 A Multiple Clock Cycle Implementation 312
`5.5 Microprogramming: Simplifying Control Design 333
`5.6 Exceptions 344
`5. 7 Fallacies and Pitfalls 350
`5.8 Concluding Remarks 351
`5.9 Historical Perspective and Further Reading 353
`5.10 Exercises 357
`
`Hughes, Exh. 1020, p. 5
`
`

`

`X
`
`Contents
`
`Enhancing Performance with Pipelining 362
`
`Introduction 364
`6.1
`6.2 A Pipelined Datapath 367
`6.3 Pipelined Control 381
`6.4 Data Hazards 390
`6.5 Control for Data Hazards: Stalls 399
`6.6 Reducing Data Hazards: Forwarding 412
`6. 7 Branch Hazards 424
`6.8 Exceptions 430
`6.9 Performance of Pipelined Systems 435
`6.10 Fallacies and Pitfalls 436
`6.11 Concluding Remarks 438
`6.12 Historical Perspective and Further Reading 441
`6.13 Exercises 445
`
`Large and Fast: Exploiting Memory Hierarchy 452
`
`Introduction 454
`7.1
`7.2 Caches 458
`7.3 Virtual Memory 481
`7.4 A Common Framework for Memory Hierarchies 501
`7.5 Fallacies and Pitfalls 515
`7.6 Concluding Remarks 519
`7.7 Historical Perspective and Further Reading 521
`7.8 Exercises 527
`
`Interfacing Processors and Peripherals 532
`
`Introduction 534
`8.1
`8.2 1/0 Performance Measures: Some Examples from Disk and
`File Systems 537
`8.3 Types and Characteristics of 1/0 Devices 539
`8.4 Buses: Connecting 1/0 Devices to Processor and Memory 548
`Interfacing 1/0 Devices to the Memory, Processor, and
`8.5
`Operating System 565
`8.6 Fallacies and Pitfalls 576
`8. 7 Concluding Remarks 578
`8.8 Historical Perspective and Further Reading 581
`8.9 Exercises 584
`
`Hughes, Exh. 1020, p. 6
`
`

`

`Contents
`
`xi
`
`~ Parallel Processors 594
`
`Introduction 596
`9.1
`9.2 SIMD Computers-Single Instruction Stream, Multiple Data Streams 596
`9.3 MIMD Computers-Multiple Instruction Streams, Multiple Data
`Streams 602
`9.4 Programming MIMDs 603
`9.5 MIMDs Connected by a Single Bus 607
`9.6 MIMDs Connected by a Network 619
`9. 7 Future Directions for Parallel Processors 630
`9.8 Fallacies and Pitfalls 637
`9.9 Concluding Remarks-Evolution versus Revolution in Computer
`Architecture 640
`9.10 Historical Perspective and Further Reading 642
`9.11 Exercises 646
`
`APPENDICES
`
`Assemblers, Linkers, and the SPIM Simulator A-2
`by James R. Larus, University of Wisconsin
`A.1
`Introduction A-3
`A.2 Assemblers A-10
`A.3 Linkers A-17
`A.4 Loading A-19
`A.S Memory Usage A-19
`A.6 Procedure Call Convention A-21
`A.7 Exceptions and Interrupts A-30
`A.S
`Input and Output A-34
`A.9 SPIM A-36
`A.10 MIPS R2000 Assembly Language A-47
`A.11 Concluding Remarks A-71
`A.12 Exercises A-72
`
`The Basics of Logic Design B-2
`
`Introduction B-3
`B.1
`B.2 Gates, Truth Tables, and Logic Equations B-4
`B.3 Combinational Logic B-8
`B.4 Clocks B-18
`B.S Memory Elements B-21
`B.6 Finite State Machines B-35
`B. 7 Timing Methodologies B-39
`B.S Exercises B-45
`
`~
`------------------...... ==-------;,_,~\
`
`Hughes, Exh. 1020, p. 7
`
`

`

`__ ,
`
`r~
`;,.·;
`
`xii
`
`Contents
`
`Mapping Control to Hardware c-2
`
`Introduction C -3
`C.1
`Implementing Finite State Machine Control C-4
`c.2
`Implementing the Next·State Function with a Sequencer C-15
`C.3
`C.4 Translating a Microprogram to Hardware C-23
`C.S Concluding Remarks C-27
`C.G Exercises C-28
`
`Introducing C to Pascal Programmers D-2
`
`Introduction D-3
`D.1
`D.2 Variable Declarations D-3
`D.3 Assignment Statements D-4
`D.4 Relational Expressions and Conditional Statements D-5
`D.S Loops D-6
`D.G Examples to Put it All Together D-7
`D. 7 Exercises D-8
`
`Another Approach to Instruction Set Architecture-VAX E-2
`
`Introduction E-3
`E.1
`E.2 VAX Operands and Addressing Modes E-4
`E.3 Encoding VAX Instructions E-7
`E.4 VAX Operations E-9
`E.S An Example to Put It All Together: swap E-ll
`E.G A Longer Example: sort E-15
`E. 7 Fallacies and Pitfalls E-19
`E.S Concluding Remarks E-22
`E.9 Historical Perspective and Further Reading E-23
`E.10 Exercises E-25
`
`Index
`
`I-1
`
`-;
`
`Hughes, Exh. 1020, p. 8
`
`

`

`I
`
`I
`
`~ i
`:I
`·j
`
`I
`
`i ;
`i j'
`'
`
`3.2 Operations of the Computer Hardware
`
`95
`
`the simulator that comes with this book. We conclude with a look at the histor(cid:173)
`ical evolution of instruction sets and an overview of other machine dialects.
`The chosen instruction set comes from MIPS Computer Company and is
`typical of instruction sets designed since the early 1980s. We reveal the MIPS
`instruction set a piece at a time, giving the rationale along with the machine
`structures. This step-by-step tutorial weaves the components with their expla(cid:173)
`nations, making assembly language more palatable. To keep the overall pic(cid:173)
`ture in mind, each section ends with a figure summarizing the MIPS
`instruction set revealed thus far, highlighting the portions presented in that
`section.
`
`Operations of the Computer Hardware
`
`There must certainly be instructions for performing the fundamental arithmetic
`operations.
`
`Burks, Goldstine, and von Neumann, 1947
`
`Every computer must be able to perform arithmetic. The MIPS notation
`add a, b, c
`instructs a computer to add the two variables b and c and to put their sum in a.
`This notation is rigid in that each MIPS arithmetic instruction must always
`have exactly three variables. For example, suppose we want to place the sum
`of variables b, c, d, and e into variable a. The following sequence of instruc(cid:173)
`tions adds the variables:
`#The sum of b and c is placed in a.add a.
`add a, b, c
`#The sum of b, c and d is now in a.add a.
`a, d
`#The sum of b, c. d and e is now in a.
`a, e
`Thus it takes three instructions to take the sum of four variables.
`The words to the right of the sharp symbol(#) on each line above are com(cid:173)
`ments for the human reader, and they are ignored by the computer. Note that
`unlike other programming languages, each line of this language can contain,
`at most, one instruction. Another difference is that comments terminate at the
`end of a line.
`The natural number of operands for an operation like addition is three: the
`two numbers being added together and a placeholder for the sum. Requiring
`every instruction to have exactly three operands, no more and no less, con(cid:173)
`forms to the philosophy of keeping the hardware simple. Hardware for a vari(cid:173)
`able number of operands is more complicated than hardware for a fixed
`
`Hughes, Exh. 1020, p. 9
`
`

`

`96
`
`Chapter 3
`
`Instructions: Language of the Machine
`
`I''' 1 '1"' {"f:.<o]j,'jh ~ I
`adrl
`Arithmetic
`subtract
`
`MIPS assembly language
`I~'Et'tlo ,,.,,
`.. lll"''i'
`a = b + c
`add a.b.c
`a = b - c
`sub a. b. c
`
`.~
`
`~!
`
`I
`
`-l! c
`
`.-
`Always 3 operands
`Always 3 operands
`
`FIGURE 3.1 MIPS architecture revealed In section 3.1. The real machine operands will be
`unveiled in the next section. Highlighted portions in such summaries show MIPS structures
`introduced in this section; for this first figure, all is new.
`
`number. This situation illustrates the first of four underlying principles of
`hardware design:
`Principle 1: Simplicity favors regularity.
`We can now show, in the two examples that follow, the relationship of pro(cid:173)
`grams written in programming languages to programs in this more primitive
`notation. Figure 3.1 summarizes the portions of MIPS described in this section.
`
`Example
`
`This segment of a C program contains the five variables a, b, c, d, and e:
`
`b + c;
`a
`d = a - e;
`
`(Reminder: We are using the C programming language; readers familiar
`with another language should refer to Appendix D for a short comparison
`of C with Pascal.) The translation from C to MIPS instructions is performed
`by the compiler. Show the MIPS code produced by the C compiler.
`
`Answer
`
`The C compiler could produce
`
`add a. b. c
`subd, a, e
`
`These instructions are symbolic representations of what the processor
`understands; they are called assembly language instructions.
`
`Hughes, Exh. 1020, p. 10
`
`

`

`3.3 Operands of the Computer Hardware
`
`97
`
`Example
`
`A more complex C statement contains the five variables f, g, h, i, and j:
`
`f = (g +h) -
`
`(i + j);
`
`What would the C compiler produce?
`
`Answer
`
`The statement might be compiled into the following three MIPS instruc(cid:173)
`tions:
`
`add tO,g,h # temporary variable tO contains g+h
`add tl,i,j #temporary variable t1 contains i+j
`sub f,t0,t1 # f gets t0-t1, or (g+h)-(i+j)
`
`Note that the compiler created two new variables, tO and t 1, to express the
`program in the restricted three-operands-per-instruction notation of the
`machine.
`
`[iJ Operands of the Computer Hardware
`
`Unlike programs in high-level languages, the operands of arithmetic instruc(cid:173)
`tions cannot be any variables; they must be from a limited number of special
`locations called registers. Registers are the bricks of computer construction, for
`registers are primitives used in hardware design that are also visible to the pro(cid:173)
`grammer when the computer is completed. The size of a register in the MIPS
`architecture is 32 bits; groups of 32 bits occur so frequently that they are given
`the name word in the MIPS architecture.
`One major difference between the variables of a programming language
`and registers is the limited number of registers, typically between 16 and 32 on
`current computers. MIPS has 32 registers, using the notation $0, $1, ... , $31 to
`represent them. (See section 3.14 for the history of the number of registers.)
`The reason for this limit may be found in the second of our four underlying
`principles of hardware technology:
`Principle 2: Smaller is faster.
`A very large number of registers would increase the clock cycle time simply
`because it takes electronic signals longer when they must travel farther. Guide(cid:173)
`lines such as "smaller is faster" are not absolutes; 31 registers may not be faster
`than 32. Yet the truth behind such observations causes computer designers to
`
`i
`
`!
`I I .
`
`i
`
`Hughes, Exh. 1020, p. 11
`
`

`

`/:
`
`98
`
`Chapter 3
`
`Instructions: Language of the Machine
`
`take them seriously.ln this case, the ~esigr;er m~st balance the craving of pro(cid:173)
`grams for more registers with the designer s desire to keep th~ clock cycle fast.
`Chapters 5 and 6 show the central role that r~gisters play~ har~ware con(cid:173)
`struction; as we shall see in this chapter, effectiVe use of registers IS a key to
`program performance.
`
`Example
`
`It is the compiler's job to associate program variables with registers. Take,
`for instance, the C statement from our earlier example:
`
`f = (g +h) -
`
`(i + j);
`
`The variables f, g, h, i, and j can be assigned to the registers $16, $17, $18,
`$19, and $20, respectively. What is the compiled MIPS assembly code?
`
`Answer
`
`The compiled program is
`
`add $8,$17,$18
`add $9,$19,$20
`sub $16,$8,$9
`
`# register $8 contains g+h
`# register $9 contains i+j
`# f gets $8-$9, or (g+h)-(i+j)
`
`Registers $8 and $9 correspond to tO and t1 in the earlier example.
`
`Programming languages have simple variables that contain single data ele(cid:173)
`ments as in these examples, but they also have more complex data structures
`such as arrays. These complex data structures can contain many more data el(cid:173)
`ements than there are registers in a machine. How can a computer represent
`and access such large structures?
`Recall the five components of a computer introduced in Chapter 1 and de(cid:173)
`picted on the second page of this chapter (page 93). The processor can store
`only a small amount of data in registers, but computer memory contains mil(cid:173)
`lions of data elements. Hence data structures, like arrays, are kept in memory.
`As explained above, arithmetic operations occur only on registers in MIPS;
`thus MIPS must include instructions that transfer data between memory and
`registers. Such instructions are called data transfer instructions. To access a
`word in memory, the instruction must supply its address. Memory is really just
`a large, single-dimensional array, with the address acting as the index to that
`array. Addresses start at 0. For example, in Figure 3.2, the address of the third
`data element is 2, and the value of Memory[2] is 1000.
`
`Hughes, Exh. 1020, p. 12
`
`

`

`Wll"..E..-o~-------------------~--------------------------- --."'"
`; 1 "tn .:.--..:
`1 ~~c.
`i
`'·I j;:!
`I I
`I i
`
`3.3 Operands of the Computer Hardware
`
`99
`
`Processor
`
`Memory
`
`Address
`
`0
`
`1
`
`2
`
`3
`
`Data
`
`100
`
`10
`
`1000
`
`1
`
`FIGURE 3.2 Memory addresses and contents of memory at those locations. This is a sim(cid:173)
`plification of the MIPS addressing; Figure 3.3 shows MIPS addressing for sequential words in
`memory.
`
`The data transfer instruction that moves data from memory to a register is
`called load. The format of the load instruction is the name of the operation fol(cid:173)
`lowed by the register to be loaded, then the start address of the array, and
`finally a register that contains the index of the element of the array to be load(cid:173)
`ed. Thus the memory address of the array element is formed by the sum of the
`constant portion of the instruction and a register. The MIPS name for this in(cid:173)
`struction is l w, standing for load word.
`
`Example
`
`Assume that A is an array of 100 elements and that the compiler has associ(cid:173)
`ated the variables g, h, and i with the registers $17, $18, and $19. Let's also
`assume that the array starts at address As tart. Translate this C statement:
`
`g = h + A[i];
`
`Answer
`
`The C assignment statement becomes
`
`lw
`add
`
`$8,Astart($19)
`$17,$18,$8
`
`#Temporary reg $8 gets A[i]
`# g = h + A[i]
`
`Hughes, Exh. 1020, p. 13
`
`

`

`"·~/
`;.t,...'_-
`/·~r
`
`100
`
`Chapter 3
`
`Instructions: Language of the Machine
`
`The load instruction l w adds the starting address of the array A (named
`As tart here) to the index i in register $19 to form the address of element
`A [; ] • The register added to the address is therefore called the index register.
`The processor then reads the value from memory at that address and places
`it into register $8, which is used as a temporary variable. The following add
`instruction can operate on the value in $8 (which equals A[ i J) since it is in
`a register. The instruction adds A [ i J to h and puts the sum in the register
`corresponding to g.
`
`In addition to associating variables with registers, it is up to
`the compiler to allocate data structures like arrays to loca(cid:173)
`tions in memory. The compiler can then place the proper
`starting address into the data transfer instructions.
`Since 8-bit bytes are useful in many programs, the MIPS
`architecture addresses individual bytes. The address of a
`word is therefore actually the same as one of the 4 bytes in a
`word. Hence, addresses of sequential words differ by 4. Figure 3.3 shows the
`actual addresses for Figure 3.2. (Appendix A, section A.6 on page A-21, shows
`the two ways to number bytes in a word.)
`
`Processor
`
`Memory
`
`Address
`
`0
`
`4
`
`8
`
`12
`
`Data
`
`100
`
`10
`
`1000
`
`1
`
`dd
`FIGURE 3.3 Actual MIPS mem
`ory a
`resses and contents of memory for those words.
`Th h
`e c anged addresses are highlighted to contrast to Figure 3.2.
`
`Hughes, Exh. 1020, p. 14
`
`

`

`3.3 Operands of the Computer Hardware
`
`101
`
`Byte addressing also affects the index i. To get the proper byte address in
`the code above, register $19 must have 4 x i so that the sum of $19 and
`As t a r t will select A[ i J and not A [ i I 4 J .
`
`The instruction complementary to load is called store; it transfers data from
`a register to memory. The format of a store is similar to that of a load:
`the
`name of the operation, followed by the register to be stored, then the starting
`address of the array, and finally a register that contains the index to the ele(cid:173)
`ment of the array to be stored. The MIPS name is sw, standing for store word.
`Once again, the MIPS address is specified in part by a constant and in part by
`a register.
`
`Example
`
`Assume the variables g and h are associated with the registers $17 and $18.
`To accommodate the byte addresses of MIPS, assume that register $19 now
`has the value 4 xi. Chapter 4 explains how to multiply in MIPS; for now
`assume the multiplication has already occurred. What is the MIPS assembly
`code for the C statement below?
`
`A[i] = h + A[i];
`
`Answer
`
`The C assignment statement becomes
`
`$8,Astart($19)
`lw
`add $8,$18,$8
`sw
`$8,Astart($19)
`
`#Temporary reg $8 gets A[i]
`#Temporary reg $8 gets h+A[i]
`# Stores h+A[i] back into A[i]
`
`Instead of placing the sum of h and A [ i J into register $17, as in the prior
`example, the sum is placed into temporary register $8 and then stored back
`into A[ i].
`
`These are the instructions that transfer words between memory and regis(cid:173)
`ters in the MIPS architecture. Other brands of computers use instructions in
`addition to load and store to transfer data; these alternatives are described in
`section 3.8. Figure 3.4 summarizes the portions of MIPS described in this sec(cid:173)
`tion.
`
`Hughes, Exh. 1020, p. 15
`
`

`

`258
`
`Chapter 4 Arithmetic for Computers
`
`Exercises
`
`Never give in, never give in, never, never, never-in nothing, great or small, large
`or petty-never give in ....
`
`Winston Churchill, Address at Harrow School, October 29, 1941
`
`4.1 [15] <§3, 4.2, 4.8> The Big Picture on page 243 mentions that bits have no
`inherent meaning. Given the bit pattern
`
`10001111111011111100 0000 0000 0000
`
`What does it represent, assuming that it is
`
`a. a two's complement integer?
`
`b. an unsigned integer?
`
`c. a single precision floating-point number?
`
`d. a MIPS instruction?
`
`4.2 [10] <§4.2, 4.4, 4.8> This exercise is similar to Exercise 4.1, but this time use
`the bit pattern
`
`0000 0000 0000 0000 0000 0000 0000 0000
`
`4.3 [3] <§4.2> Convert 512ten into a 32-bit two's complement binary number.
`
`4.4 [3] <§4.2> Convert -1,023ten into a 32-bit two's complement binary num(cid:173)
`ber.
`
`4.5 [5] <§4.2> Convert -4,000,000ten into a 32-bit two's complement binary
`number.
`
`4.6 [5] <§4.2> What decimal number does this two's complement binary
`number represent: 111111111111111111111110 0000 1100tw
`?
`
`0
`4.7 [5] <§4.2> What decimal number does this two's complement binary
`number represent: 1111111111111111 1111 1111 11111111two?
`
`4.8 [5] <§4.2> What decimal number does this two's complement binary
`number represent: 01111111111111111111 1111 11111111tw
`?
`
`0
`4.9 [5] <§4.2> What binary number does this hexadecimal number represent:
`7fff fffahex? What decimal number does it represent?
`
`Hughes, Exh. 1020, p. 16
`
`

`

`---------------'-- ..
`
`·~
`
`4.12 Exercises
`
`259
`
`1\
`I
`!
`'I
`
`4.10 [5] <§4.2> What hexadecimal number does this binary number
`represent: 1100101011111110111110101100 1110tw0 ?
`4.11 [5] <§4.8> Using the notation in the Hardware Software Interface sec(cid:173)
`tions on pages 226 and 227, show the MIPS binary floating-point formats in
`single precision and double precision for 10ten·
`
`4.12 [5] <§4.8> This exercise is similar to Exercise 4.11, but this time replace
`the number 10ten with 10.5ten·
`4.13 [10] <§4.8> This exercise is similar to Exercise 4.11, but this time replace
`the number 10ten with 0.1ten·
`4.14 [10] <§4.10> For the program gee (Figure 4.46 on page 248), find the 10
`most frequently executed MIPS instructions. List them in order of popularity,
`from most used to least used. Show the rank, name, and percentage of instruc(cid:173)
`tions executed for each instruction. If there is a tie for a given rank, list all in(cid:173)
`structions that tie with the same rank, even if this results in more than 10
`instructions.
`
`4.15 [10] <§4.10> This exercise is similar to Exercise 4.14, but this time replace
`the program gee with the program spice.
`
`4.16 <§4.10> {§4.14, 4.15} These questions examine the relative frequency of
`instructions in different programs.
`
`a.
`
`b.
`
`c.
`
`d.
`
`[5] Which instructions are found in both the answer to Exercise 4.14 and in
`the answer to Exercise 4.15?
`
`[5] What percentage of gee instructions executed is due to the instructions
`identified in Exercise 4.14a?
`
`[5] What percentage of gee instructions executed is due to the instructions
`identified in Exercise 4.14?
`
`[5] What percentage of spice instructions executed is due to the instruc(cid:173)
`tions identified in Exercise 4.14a?
`
`e.
`
`[5] What percentage of spice instructions executed is due to the instruc-
`tions identified in Exercise 4.15?
`4.17 [10] <§4.10> {ex. 4.14, 4.15, 4.16} If you were designing a machine to ex(cid:173)
`ecute the MIPS instruction set, what are the five instructions that you would
`try to make as fast as possible, based on the answers to Exercises 4.14 through
`4.16? Give your rationale.
`4.18 [15] <§2, 4.10> Using Figure 4.46 on page 248, calculate the average clock
`cycles per instruction (CPI) for the program gee. Figure 4.47 gives the average
`
`Hughes, Exh. 1020, p. 17
`
`

`

`260
`
`Chapter 4 Arithmetic for Computers
`
`I [•.'f·
`
`·"!.{t
`
`.,,,, '.(
`
`r:.:"::::"'
`Loads and stores
`
`Conditional branch
`
`Jumps
`Integer multiply
`
`Integer divide
`Floating-point add and subtract
`Floating-point multiply, single precision
`Floating-point multiply, double precision
`Floating-point divide, single precision
`
`Floating-point divide, double precision
`
`'!(~~(!;!~~
`01.4
`01.8
`01.2
`10.0
`30.0
`02.0
`04.0
`05.0
`12.0
`19.0
`
`FIGURE 4.47 CPI for MIPS Instruction categories.
`
`CPI per instruction category, taking into account cache misses and other ef(cid:173)
`fects. Assume that instructions omitted from the table have a CPI of 1.0.
`
`4.19 [15] <§2, 4.10> This exercise is similar to Exercise 4.18, but this time re(cid:173)
`place the program gee with the program spice.
`
`4.20 [5] <§4.2> Why doesn't MIPS have a subtract immediate instruction?
`
`4.21 [10] <§4.2> Find the shortest sequence of MIPS instructions to determine
`the absolute value of a two's complement integer. Convert this instruction (ac(cid:173)
`cepted by the MIPS assembler):
`
`abs
`
`$10,$11
`
`This instruction means that register $10 has a copy of register $11 if ~egister
`$11 is positive, and the two's complement of register $11 if $11 is negative.
`4.22 [10] <§4.3> Find the shortest sequence of MIPS instructions to determine
`if there is a carry out from the addition of two registers, say registers $11 and
`$12. Place a 0 or 1 in register $10 if carry out is 0 or 1, respectively.
`
`4.23 [15] <§4.3> {ex. 4.22} Find the shortest sequence of MIPS instructions to
`perform double precision integer addition. Assume that one 64-bit, two's com(cid:173)
`plement integer is in registers $12 and $13 and another is in registers $14 and
`$~.5. ~e sum is to be placed in registers $10 and $11. In this example th~ most
`slgmflcant word of the 64-bit integer is found in the even-numbered reglsters,
`and the least significant word is found in the odd-numbered registers.
`
`Hughes, Exh. 1020, p. 18
`
`

`

`4.12 Exercises
`
`261
`
`4.24 [20] <§4.6> Find the shortest sequence of MIPS instructions to perform
`double precision integer multiplication. Assume that one 64-bit, unsigned inte(cid:173)
`ger is in registers $12 and $13 and another is in registers $14 and $15. The
`128-bit product is to be placed in registers $8, $9, $10 and $11. The most signif(cid:173)
`icant word is found in the lower numbered registers, and the least significant
`word is found in the higher numbered registers in this example. Hint: Write
`out the formula for (ax 232 +b) x (c x 232 +d).
`4.25 [15] <§4.5> The ALU supported set on less than ( s l t) using just the sign
`bit of the adder. Let's try a set-on-less-than operation using the values -7ten
`and 6ten· To make it simpler to follow the example, let's limit the binary repre(cid:173)
`sentations to 4 bits: 1001two and OllOtwo·
`1001two- OllOtwo = 1001two + 1010two = OOlltwo
`
`This result would suggest that -7 > 6, which is clearly wrong. Hence we must
`factor in overflow in the decision. Modify the 1-bit ALU in Figure 4.15 on
`page 192 to handle s l t correctly. Make your changes on a photocopy of this
`figure to save time.
`4.26 [15] <§4.4> Some computers have explicit instructions to extract an arbi(cid:173)
`trary field from a 32-bit register and place it in the least significant bits of a reg(cid:173)
`ister. The figure below shows the desired operation:
`
`31
`
`31
`0 ...
`
`31-jbits
`
`0
`
`32- G - i) bits
`
`j- i bits
`
`Find the shortest sequence of MIPS instructions that extracts a field for the
`constant values i = 7 and j = 19 from register $16 and places it in register $17.
`
`In More Depth: Logical Instructions
`
`The full MIPS instruction set has two more logical operations not mentioned
`thus far: xor and nor. The operation xor stands for exclusive OR, and nor
`stands for not OR. The table that follows defines these operations on a bit(cid:173)
`by-bit basis. These instructions will be useful in the following two exercises.
`
`Hughes, Exh. 1020, p. 19
`
`

`

`262
`
`Chapter 4 Arithmetic for Computers
`
`0
`0
`1
`1
`
`0
`1
`0
`1
`
`0
`1
`
`1
`0
`
`1
`0
`0
`0
`
`4.27 [15] <§4.4> Show the minimal MIPS instruction sequence for a new in(cid:173)
`struction called swap that exchanges two registers. After the sequence com(cid:173)
`pletes, the Destination register has the original value of the Source register,
`and the Source register has the original value of the Destination register. Con(cid:173)
`vert this instruction:
`swap rlO,r20
`
`The hard part is that this sequence must use only these two registers! Hint: Try
`to use the new logical instructions: What is the value of (0 xor A)? (B xor B)?
`4.28 [5] <§4.4> Show the minimal MIPS instruction sequence for a new in(cid:173)
`struction called not that takes the one's complement of a Source register and
`places it in a Destination register. Convert this instruction (accepted by the
`MIPS assembler):
`not rlO,r20
`Hint: Try to use the new logical instructions.
`
`4.29 [20] <§4.5> A simple check for overflow during addition is to see if the
`Carry In to the most significant bit is not the same as the CarryOut of the most
`significant bit. Prove that this check is the same as in Figure 4.3 on page 177.
`
`4.30 [10] <§4.5> Draw the gates for the Sum bit of an adder, given the equa(cid:173)
`tion on page 187.
`
`4.31 [5] ~§4.5> Rewrite the equations on page 197 for a carry lookahead lo?ic
`for a 16-bit adder using a new notation. First use the names for the Carry In sig(cid:173)
`nals of the individual bits of the adder. That is, use c4, c8, c12, ... instead of
`C1, C2, C3, .... Also, let Pi,j mean a propagate signal for bits ito j, and Gi,j
`mean a generate signal for bits ito j. For example, the equation
`C2 = G1 + (P1·GO) + (P1·PO ·cO)
`
`can be rewritten as
`cS = G7,4+ (P7,4·G3,o) + (P7,4.p3,a·cO)
`This more general notation is useful in creating wider adders.
`
`Hughes, Exh. 1020, p. 20
`
`

`

`4.12 Exercises
`
`263
`
`4.32 [15] <§4.5> {ex. 4.31} Write the equations for a carry lookahead logic for
`a 64-bit adder using the new notation from Exercise 4.31 and using 16-bit
`adders as building blocks.
`
`4.33 [10] <§4.5> Now calculate the relative performance of adders. Assume
`that hardware corresponding to any equation containing only OR or AND
`terms, such as the equations for pi and gi on pa

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket