Petitioners SK hynix Inc., SK hynix America Inc. and SK hynix memory solutions Inc.
`Ex. 1017, Cover
`
`

`
`SECOND EDITION
`
`Computer Organization and Design
`
`THE HARDWARE/SOFTWARE INTERFACE
`
`
`

`
`TRADEMARKS
`
`The following trademarks are the property of the following organizations:
`
`TeX is a trademark of American Mathematical Society.
`
`Apple II and Macintosh are trademarks of Apple Computers, Inc.
`
`CDC 6600, CDC 7600, CDC STAR-100, CYBER-180, CYBER-
`180/990, and CYBER-205 are trademarks of Control Data Corpora-
`tion.
`
`The Cosmic Cube is a trademark of California Institute of Technol-
`ogy.
`
`CP3100 is a trademark of Conner Peripherals.
`
`Cray, CRAY-1, CRAY J90, CRAY T90, CRAY X-MP/416, and
`CRAY Y-MP are trademarks of Cray Research.
`
`Alpha, AlphaServer, AlphaStation, DEC, DECsystem, DECsystem
`3100, DECstation, PDP-8, PDP-11, Unibus, VAX, VAX 8700, and
`VAX11/780 are trademarks of Digital Equipment Corporation.
`
`MP2361A, Super Eagle, VP100, VP200, and VPP300 are trademarks
`of Fujitsu Corporation.
`
`Gnu C Compiler is a trademark of Free Software Foundation.
`
`Goodyear MPP is a trademark of Goodyear Tire and Rubber Co.,
`Inc.
`
`Apollo DN 300, Apollo DN 10000, Convex, HP, HP Precision
`Architecture, HPPA, HP850, HP 3000, HP 300/70, PA-RISC, and
`Precision are registered trademarks of Hewlett-Packard Company.
`
`432, 960 CA, 4004, 8008, 8080, 8086, 8087, 8088, 80186, 80286, 80386,
`80486, Delta, iAPX 432, i860, Intel, Intel486, Intel Hypercube, iPSC/2,
`MMX, Multibus, Multibus II, Paragon, and Pentium are trademarks of
`Intel Corporation. Intel Inside is a registered trademark of Intel
`Corporation.
`
`360, 360/30, 360/40, 360/50, 360/65, 360/85, 360/91, 370, 370/158,
`370/165, 370/168, 370-XA, ESA/370, 701, 704, 709, 801, 3033, 3080,
`3080 series, 3080 VF, 3081, 3090, 3090/100, 3090/200, 3090/400,
`3090/600, 3090/600S, 3090 VF, 3330, 3380, 3380D, 3380 Disk Model
`AK4, 3380J, 3390, 3880-23, 3990, 7090, 7094, IBM, IBM PC, IBM PC-
`AT, IBM SVS, ISAM, MVS, PL.8, PowerPC, POWERstation, RT-PC,
`RAMAC, RS/6000, Sage, Stretch, System/360, Vector Facility, and
`VM are trademarks of International Business Machines Corpora-
`tion. POWERserver, RISC System/6000, and SP2 are registered
`trademarks of International Business Machines Corporation.
`
`ICL DAP is a trademark of International Computers Limited.
`
`Inmos and Transputer are trademarks of Inmos.
`
`FutureBus is a trademark of the Institute of Electrical and Electron-
`ic Engineers.
`
`KSR-1 is a trademark of Kendall Square Research.
`
`MASPAR MP-1 and MASPAR MP-2 are trademarks of MasPar
`Corporation.
`
`MIPS, R2000, R3000, and R10000 are registered trademarks of
`MIPS Technology, Inc.
`
`Windows is a trademark of Microsoft Corporation.
`
`NuBus is a trademark of Massachusetts Institute of Technology.
`
`Delta Series 8608, System V/88 R32V1, VME bus, 6809, 68000,
`68010, 68020, 68030, 68881, 68882, 88000, 88000 1.8.4m14, 88100,
`and 88200 are trademarks of Motorola Corporation.
`
`Ncube and nCube/ten are trademarks of Ncube Corporation.
`
`NEC is a registered trademark of NEC Corporation.
`
`Network Computer is a trademark of Oracle Corporation.
`
`Parsytec GC is a trademark of Parsytec, Inc.
`
`Imprimis, IPI-2, Sabre, Sabre 97209, Seagate, and Wren IV are
`trademarks of Seagate Technology, Inc.
`
`NUMA-Q, Sequent, and Symmetry are trademarks of Sequent
`Computers.
`
`Power Challenge, Silicon Graphics, Silicon Graphics 43/240,
`Silicon Graphics 4D/60, Silicon Graphics 4D/240, and Silicon
`Graphics 4D Series are trademarks of Silicon Graphics. Origin2000
`is a registered trademark of Silicon Graphics.
`
`SPEC is a registered trademark of the Standard Performance Eval-
`uation Corporation.
`
`Spice is a trademark of University of California at Berkeley.
`
`Enterprise, Java, Sun, Sun Ultra, Sun Microsystems, and Ultra are
`trademarks of Sun Microsystems, Inc. SPARC and UltraSPARC
`are registered trademarks of SPARC International, Inc., licensed to
`Sun Microsystems, Inc.
`
`Connection Machine, CM-2, and CM-5 are trademarks of Thinking
`Machines.
`
`Burroughs 6500, B5000, B5500, D-machine, UNIVAC, UNIVAC I,
`and UNIVAC 1103 are trademarks of UNISYS.
`
`Alto, PARC, Palo Alto Research Center, and Xerox are trademarks
`of Xerox Corporation.
`
`The UNIX trademark is licensed exclusively through X/Open
`Company Ltd.
`
`All other product names are trademarks or registered trademarks
`of their respective companies. Where trademarks appear in this
`book and Morgan Kaufmann Publishers was aware of a trademark
`claim, the trademarks have been printed in initial caps or all caps.
`
`
`

`
`SECOND EDITION
`
`Computer Organization and Design
`
`THE HARDWARE/SOFTWARE INTERFACE
`
`John L. Hennessy
`Stanford University
`
`David A. Patterson
`University of California, Berkeley
`
`With a contribution by
`James R. Larus
`University of Wisconsin
`
`Morgan Kaufmann Publishers, Inc.
`
`San Francisco, California
`
`
`

`
`Sponsoring Editor Denise Penrose
`Production Manager Yonie Overton
`Production Editor Julie Pabst
`Editorial Coordinator Jane Elliott
`Text and Cover Design Ross Carron Design
`Illustration Alexander Teshin Associates, with second edition modifications by Dartmouth
`Publishing, Inc.
`Chapter Opener Illustrations Canary Studios
`Copyeditor Ken DellaPenta
`Composition Nancy Logan
`Proofreader Jennifer McClain
`Indexer Steve Rath
`Printer Courier Corporation
`
`Morgan Kaufmann Publishers, Inc.
`Editorial and Sales Office:
`340 Pine Street, Sixth Floor
`San Francisco, CA 94104-3205
`USA
`
`Telephone 415/392-2665
`Facsimile 415/982-2665
`Email mkp@mkp.com
`WWW http://www.mkp.com
`Order toll free 800/745-7323
`
`© 1998 by Morgan Kaufmann Publishers, Inc.
`All rights reserved
`Printed in the United States of America
`
`04 03
`
`10 9
`
`No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
`or by any means--electronic, mechanical, photocopying, recording, or otherwise--without the prior
`written permission of the publisher.
`
`Advice, Praise, and Errors: Any correspondence related to this publication or intended for the authors
`should be sent electronically to cod2bugs@mkp.com. Information regarding error sightings is encouraged.
`Any error sightings that are accepted for correction in subsequent printings will be rewarded by the
`authors with a payment of $1.00 (U.S.) per correction at the time of their implementation in a reprint.
`
`Library of Congress Cataloging-in-Publication Data
`Patterson, David A.
`Computer organization and design : the hardware/software interface
`/ David A. Patterson, John L. Hennessy.--2nd ed.
`p. cm.
`Includes bibliographical references and index.
`ISBN 1-55860-428-6 (cloth).--ISBN 1-55860-491-X (paper)
`1. Computer organization. 2. Computers--Design and construction.
`3. Computer interfaces. I. Hennessy, John L. II. Title
`QA76.9.C643H46 1997
`004.2’2--dc21
`97-16050
`
`
`

`
`TO LINDA AND ANDREA
`
`
`

`
`Foreword
`
`by John H. Crawford
`Intel Fellow, Director of Microprocessor Architecture
`Intel Corporation, Santa Clara, California
`
`Computer design is an exciting and competitive discipline. The microproces-
`sor industry is on a treadmill where we double microprocessor performance
`every 18 months and double microprocessor complexity--measured by the
`number of transistors per chip--every 24 months. This unprecedented rate of
`change has been evident for the entire 25-year history of the microprocessor,
`and it promises to continue for many years to come as the creativity and
`energy of many people are harnessed to drive innovation ahead in spite of the
`challenge of ever-smaller dimensions. This book trains the student with the
`concepts needed to lay a solid foundation for joining this exciting field. More
`importantly, this book provides a framework for thinking about computer
`organization and design that will enable the reader to continue the lifetime of
`learning necessary for staying at the forefront of this competitive discipline.
`The text focuses on the boundary between hardware and software and ex-
`plores the levels of hardware in the vicinity of this boundary. This boundary
`is captured in a computer’s architecture specification. It is a critical boundary
`for a successful computer product: an architect must define an interface that
`can be efficiently implemented by hardware and efficiently targeted by com-
`pilers. The interface must be able to retain these efficiencies for many genera-
`tions of hardware and compiler technology, much of which will be unknown
`at the time the architecture is specified. This boundary is central to the disci-
`pline of computer design: it is where compilation (in software) ends and inter-
`pretation (in hardware) begins.
`This book builds on introductory programming skills to introduce the con-
`cepts of assembly language programming and the tools needed for this task:
`the assembler, linker, and loader. Once these prerequisites are completed, the
`remainder of the book explores the first few levels of hardware below the ar-
`chitectural interface. The basic concepts are motivated and introduced with
`clear and intuitive examples, then elaborated into the "real stuff" used in to-
`day’s modern microprocessors. For example, doing the laundry is used as an
`analogy in Chapter 6 to explain the basic concepts of pipelining, a key tech-
`nique used in all modern computers. In Chapter 4, algorithms for the basic
`floating-point arithmetic operators such as addition, multiplication, and divi-
`sion are first explained in decimal, then in binary, and finally they are elabo-
`rated into the best-known methods used for high-speed arithmetic in today’s
`computers.
`New to this edition are sections in each chapter entitled "Real Stuff." These
`sections describe how the concepts from the chapter are implemented in com-
`mercially successful products. These provide relevant, tangible examples of
`the concepts and reinforce their importance. As an example, the Real Stuff in
`Chapter 6, Enhancing Performance with Pipelining, provides an overview of a
`dynamically scheduled pipeline as implemented in both the IBM/Motorola
`PowerPC 604 and Intel’s Pentium Pro microprocessor.
`The history of computing is woven as a thread throughout the book to re-
`ward the reader with a glimpse of key successes from the brief history of this
`young discipline. The other side of history is reported in the Fallacies and Pit-
`falls section of each chapter. Since we can learn more from failure than from
`success, these sections provide a wealth of learning!
`The authors are two of the most admired teachers, researchers, and practi-
`tioners of the art of computer design today. John Hennessy has straddled both
`sides of the hardware/software boundary, providing technical leadership for
`the legendary MIPS compiler as well as the MIPS hardware products through
`many generations. David Patterson was one of the original RISC proponents:
`he coined the acronym RISC, evangelized the case for RISC, and served as a
`key consultant on Sun Microsystems’ SPARC line of processors. Continuing
`his talent for marketable acronyms, his next breakthrough was RAID (Redun-
`dant Arrays of Inexpensive Disks), which revolutionized the disk storage in-
`dustry for large data servers, and then NOW (Networks of Workstations).
`Like other great "software" products, this second edition went through an
`extensive beta testing program: 13 beta sites tested the draft manuscript in
`classes to "debug" the text. Changes from this testing have been incorporated
`into the "production" version.
`Patterson and Hennessy have succeeded in taking the first edition of their
`excellent introductory textbook on computer design and making it even better.
`This edition retains all of the good points of the original, yet adds significant
`new content and some minor enhancements. What results is an outstanding in-
`troduction to the exciting field of computer design.
`
`Contents
`
`Foreword vi
`by John H. Crawford
`
`Worked Examples xiii
`
`Computer Organization and Design Online xvi
`
`Preface xix
`
`CHAPTERS
`
`Computer Abstractions and Technology
`
`1.1 Introduction 3
`1.2 Below Your Program 5
`1.3 Under the Covers 10
`1.4 Integrated Circuits: Fueling Innovation 21
`1.5 Real Stuff: Manufacturing Pentium Chips 24
`1.6 Fallacies and Pitfalls 29
`1.7 Concluding Remarks 30
`1.8 Historical Perspective and Further Reading 32
`1.9 Key Terms 44
`1.10 Exercises 45
`
`The Role of Performance 52
`
`2.1 Introduction 54
`2.2 Measuring Performance 58
`2.3 Relating the Metrics 60
`2.4 Choosing Programs to Evaluate Performance 66
`2.5 Comparing and Summarizing Performance 69
`2.6 Real Stuff: The SPEC95 Benchmarks and Performance of Recent
`Processors 71
`2.7 Fallacies and Pitfalls 75
`2.8 Concluding Remarks 82
`2.9 Historical Perspective and Further Reading 83
`2.10 Key Terms 89
`2.11 Exercises 90
`
`Instructions: Language of the Machine 104
`
`3.1 Introduction 106
`3.2 Operations of the Computer Hardware 107
`3.3 Operands of the Computer Hardware 109
`3.4 Representing Instructions in the Computer 116
`3.5 Instructions for Making Decisions 122
`3.6 Supporting Procedures in Computer Hardware 132
`3.7 Beyond Numbers 142
`3.8 Other Styles of MIPS Addressing 145
`3.9 Starting a Program 156
`3.10 An Example to Put It All Together 163
`3.11 Arrays versus Pointers 171
`3.12 Real Stuff: PowerPC and 80x86 Instructions 175
`3.13 Fallacies and Pitfalls 185
`3.14 Concluding Remarks 187
`3.15 Historical Perspective and Further Reading 189
`3.16 Key Terms 196
`3.17 Exercises 196
`
`Arithmetic for Computers 208
`
`4.1 Introduction 210
`4.2 Signed and Unsigned Numbers 210
`4.3 Addition and Subtraction 220
`4.4 Logical Operations 225
`4.5 Constructing an Arithmetic Logic Unit 230
`4.6 Multiplication 250
`4.7 Division 265
`4.8 Floating Point 275
`4.9 Real Stuff: Floating Point in the PowerPC and 80x86 301
`4.10 Fallacies and Pitfalls 304
`4.11 Concluding Remarks 308
`4.12 Historical Perspective and Further Reading 312
`4.13 Key Terms 322
`4.14 Exercises 322
`
`The Processor: Datapath and Control 336
`
`5.1 Introduction 338
`5.2 Building a Datapath 343
`5.3 A Simple Implementation Scheme 351
`5.4 A Multicycle Implementation 377
`5.5 Microprogramming: Simplifying Control Design 399
`5.6 Exceptions 410
`5.7 Real Stuff: The Pentium Pro Implementation 416
`5.8 Fallacies and Pitfalls 419
`5.9 Concluding Remarks 421
`5.10 Historical Perspective and Further Reading 423
`5.11 Key Terms 426
`5.12 Exercises 427
`
`Enhancing Performance with Pipelining 434
`
`6.1 An Overview of Pipelining 436
`6.2 A Pipelined Datapath 449
`6.3 Pipelined Control 466
`6.4 Data Hazards and Forwarding 476
`6.5 Data Hazards and Stalls 489
`6.6 Branch Hazards 496
`6.7 Exceptions 505
`6.8 Superscalar and Dynamic Pipelining 510
`6.9 Real Stuff: PowerPC 604 and Pentium Pro Pipelines 517
`6.10 Fallacies and Pitfalls 520
`6.11 Concluding Remarks 521
`6.12 Historical Perspective and Further Reading 525
`6.13 Key Terms 529
`6.14 Exercises 529
`
`Large and Fast: Exploiting Memory Hierarchy 538
`
`7.1 Introduction 540
`7.2 The Basics of Caches 545
`7.3 Measuring and Improving Cache Performance 564
`7.4 Virtual Memory 579
`7.5 A Common Framework for Memory Hierarchies 603
`7.6 Real Stuff: The Pentium Pro and PowerPC 604 Memory Hierarchies 611
`7.7 Fallacies and Pitfalls 615
`7.8 Concluding Remarks 618
`7.9 Historical Perspective and Further Reading 621
`7.10 Key Terms 627
`7.11 Exercises 628
`
`Interfacing Processors and Peripherals 636
`
`8.1 Introduction 638
`8.2 I/O Performance Measures: Some Examples from Disk and File
`Systems 641
`8.3 Types and Characteristics of I/O Devices 644
`8.4 Buses: Connecting I/O Devices to Processor and Memory 655
`8.5 Interfacing I/O Devices to the Memory, Processor, and Operating
`System 673
`8.6 Designing an I/O System 684
`8.7 Real Stuff: A Typical Desktop I/O System 687
`8.8 Fallacies and Pitfalls 688
`8.9 Concluding Remarks 690
`8.10 Historical Perspective and Further Reading 694
`8.11 Key Terms 700
`8.12 Exercises 700
`
`Multiprocessors 710
`
`9.1 Introduction 712
`9.2 Programming Multiprocessors 714
`9.3 Multiprocessors Connected by a Single Bus 717
`9.4 Multiprocessors Connected by a Network 727
`9.5 Clusters 734
`9.6 Network Topologies 736
`9.7 Real Stuff: Future Directions for Multiprocessors 740
`9.8 Fallacies and Pitfalls 743
`9.9 Concluding Remarks--Evolution versus Revolution in Computer
`Architecture 746
`9.10 Historical Perspective and Further Reading 748
`9.11 Key Terms 756
`9.12 Exercises 756
`
`APPENDICES
`
`Assemblers, Linkers, and the SPIM Simulator A-2
`by James R. Larus, University of Wisconsin
`
`A.1 Introduction A-3
`A.2 Assemblers A-10
`A.3 Linkers A-17
`A.4 Loading A-19
`A.5 Memory Usage A-20
`A.6 Procedure Call Convention A-22
`A.7 Exceptions and Interrupts A-32
`A.8 Input and Output A-36
`A.9 SPIM A-38
`A.10 MIPS R2000 Assembly Language A-49
`A.11 Concluding Remarks A-75
`A.12 Key Terms A-76
`A.13 Exercises A-76
`The Basics of Logic Design B-2
`
`B.1 Introduction B-3
`B.2 Gates, Truth Tables, and Logic Equations B-4
`B.3 Combinational Logic B-8
`B.4 Clocks B-18
`B.5 Memory Elements B-21
`B.6 Finite State Machines B-35
`B.7 Timing Methodologies B-39
`B.8 Concluding Remarks B-44
`B.9 Key Terms B-45
`B.10 Exercises B-45
`
`Mapping Control to Hardware C-2
`
`C.1 Introduction C-3
`C.2 Implementing Combinational Control Units C-4
`C.3 Implementing Finite State Machine Control C-8
`C.4 Implementing the Next-State Function with a Sequencer C-21
`C.5 Translating a Microprogram to Hardware C-28
`C.6 Concluding Remarks C-31
`C.7 Key Terms C-32
`C.8 Exercises C-32
`
`Glossary G-1
`
`Index I-1
`
`Worked Examples
`
`Chapter 2: The Role of Performance
`
`Throughput and Response Time 56
`Relative Performance 57
`Improving Performance 60
`Using the Performance Equation 62
`Comparing Code Segments 64
`MIPS as a Performance Measure 78
`
`Chapter 3: Instructions: Language of the Machine
`
`Compiling Two C Assignment Statements into MIPS 108
`Compiling a Complex C Assignment into MIPS 109
`Compiling a C Assignment Using Registers 110
`Compiling an Assignment When an Operand Is in Memory 112
`Compiling Using Load and Store 113
`Compiling Using a Variable Array Index 114
`Translating a MIPS Assembly Instruction into a Machine Instruction 117
`Translating MIPS Assembly Language into Machine Language 119
`Compiling an If Statement into a Conditional Branch 123
`Compiling if-then-else into Conditional Branches 124
`Compiling a Loop with Variable Array Index 126
`Compiling a while Loop 127
`Compiling a Less Than Test 128
`Compiling a switch Statement by Using a Jump Address Table 129
`Compiling a Procedure that Doesn’t Call Another Procedure 134
`Compiling a Recursive Procedure, Showing Nested Procedure Linking 136
`Compiling a String Copy Procedure, Showing How to Use C Strings 143
`Translating Assembly Constants into Machine Language 145
`Loading a 32-Bit Constant 147
`Showing Branch Offset in Machine Language 149
`Branching Far Away 150
`Decoding Machine Code 154
`Linking Object Files 160
`Compiling an Assignment Statement into Accumulator Instructions 190
`Compiling an Assignment Statement into Memory-Memory Instructions 192
`Compiling an Assignment Statement into Stack Instructions 193
`
`Chapter 4: Arithmetic for Computers
`ASCII versus Binary Numbers 212
`Binary to Decimal Conversion 214
`Signed versus Unsigned Comparison 215
`Negation Shortcut 216
`Sign Extension Shortcut 217
`Binary-to-Hexadecimal Shortcut 218
`Binary Addition and Subtraction 220
`C Bit Fields 229
`Both Levels of the Propagate and Generate 247
`Speed of Ripple Carry versus Carry Lookahead 248
`First Multiply Algorithm 253
`Second Multiply Algorithm 256
`Third Multiply Algorithm 257
`Booth’s Algorithm 261
`Multiply by 2^i via Shift 262
`First Divide Algorithm 268
`Third Divide Algorithm 271
`Floating-Point Representation 279
`Converting Binary to Decimal Floating Point 280
`Decimal Floating-Point Addition 282
`Decimal Floating-Point Multiplication 287
`Compiling a Floating-Point C Program into MIPS Assembly Code 293
`Compiling Floating-Point C Procedure with Two-Dimensional Matrices into MIPS 294
`Rounding with Guard Digits 297
`
`Chapter 5: The Processor: Datapath and Control
`Composing Datapaths 351
`Implementing Jumps 370
`Performance of Single-Cycle Machines 373
`Performance of a Single-Cycle CPU with Floating-Point Instructions 375
`CPI in a Multicycle CPU 397
`
`Chapter 6: Enhancing Performance with Pipelining
`Single-Cycle versus Pipelined Performance 438
`Stall on Branch Performance 442
`Forwarding with Two Instructions 446
`Reordering Code to Avoid Pipeline Stalls 447
`Labeled Pipeline Execution, Including Control 471
`Dependency Detection 479
`Forwarding 485
`Pipelined Branch 498
`Loops and Prediction 501
`Comparing Performance of Several Control Schemes 504
`Exception in a Pipelined Computer 507
`Simple Superscalar Code Scheduling 513
`Loop Unrolling for Superscalar Pipelines 513
`
`Chapter 7: Large and Fast: Exploiting Memory Hierarchy
`
`Bits in a Cache 550
`Mapping an Address to a Multiword Cache Block 556
`Calculating Cache Performance 565
`Cache Performance with Increased Clock Rate 567
`Associativity in Caches 571
`Size of Tags versus Set Associativity 575
`Performance of Multilevel Caches 576
`Overall Operation of a Memory Hierarchy 595
`
`Chapter 8: Interfacing Processors and Peripherals
`Impact of I/O on System Performance 639
`Disk Read Time 648
`Performance of Two Networks 654
`FSM Control for I/O 662
`Performance Analysis of Synchronous versus Asynchronous Buses 662
`Performance Analysis of Two Bus Schemes 665
`Overhead of Polling in an I/O System 676
`Overhead of Interrupt-Driven I/O 679
`Overhead of I/O Using DMA 681
`I/O System Design 685
`
`Chapter 9: Multiprocessors
`Speedup Challenge 715
`Speedup Challenge, Bigger Problem 716
`Parallel Program (Single Bus) 718
`Parallel Program (Message Passing) 729
`
`Appendix A: Assemblers, Linkers, and the SPIM Simulator
`Local and Global Labels A-11
`String Directive A-15
`Macros A-15
`Stack in Recursive Procedure A-28
`Interrupt Handler A-34
`
`Appendix B: The Basics of Logic Design
`
`Truth Tables B-5
`Logic Equations B-6
`Sum of Products B-11
`PLAs B-13
`Don’t Cares B-16
`
`Appendix C: Mapping Control to Hardware
`Logic Equations for Next-State Outputs C-12
`Control ROM Entries C-17
`
`As we will see in Chapter 9, memory systems are also a central design issue
`for parallel processors. The growing importance of the memory hierarchy in
`determining system performance in both uniprocessor and multiprocessor
`systems means that this important area will continue to be a focus of both de-
`signers and researchers for some years to come.
`
`7.9 Historical Perspective and Further Reading
`
`... the one single development that put computers on their feet was the invention
`of a reliable form of memory, namely, the core memory .... Its cost was reasonable,
`it was reliable and, because it was reliable, it could in due course be made large.
`
`Maurice Wilkes,
`Memoirs of a Computer Pioneer, 1985
`
`The developments of most of the concepts in this chapter have been driven by
`revolutionary advances in the technology we use for memory. Before we dis-
`cuss how memory hierarchies were developed, let’s take a brief tour of the
`development of memory technology. In this section, we focus on the technolo-
`gies for building main memory and caches; Chapter 8 will provide some of
`the history of developments in disk technology.
`The ENIAC had only a small number of registers (about 20) for its storage
`and implemented these with the same basic vacuum tube technology that it
`used for building logic circuitry. However, the vacuum tube technology was
`far too expensive to be used to build a larger memory capacity. Eckert came up
`with the idea of developing a new technology based on mercury delay lines. In
`this technology, electrical signals were converted into vibrations that were sent
`down a tube of mercury, reaching the other end, where they were read out and
`recirculated. One mercury delay line could store about 0.5 Kbits. Although
`these bits were accessed serially, the mercury delay line was about a hundred
`times more cost-effective than vacuum tube memory. The first known working
`mercury delay lines were developed at Cambridge for the EDSAC. Figure 7.36
`shows the mercury delay lines of the EDSAC, which had 32 tanks and a total
`of 512 36-bit words.
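The EDSAC figures quoted above can be checked against the per-line capacity of about 0.5 Kbits. The following is only a small arithmetic sketch: the variable and function names are illustrative, and it assumes that 1 Kbit means 1024 bits and that the words were spread evenly over the tanks, neither of which the text states.

```python
# Figures quoted in the text for the EDSAC's mercury delay line memory.
WORDS = 512          # total words
BITS_PER_WORD = 36   # word width in bits
TANKS = 32           # number of mercury delay lines (tanks)


def bits_per_tank(words: int, bits_per_word: int, tanks: int) -> float:
    """Average number of bits held by one delay line (assumes an even split)."""
    return words * bits_per_word / tanks


total_bits = WORDS * BITS_PER_WORD
per_tank = bits_per_tank(WORDS, BITS_PER_WORD, TANKS)
per_tank_kbits = per_tank / 1024  # assumption: 1 Kbit = 1024 bits

print(total_bits, per_tank, per_tank_kbits)
```

Under these assumptions each tank holds a little over half a Kbit, which is consistent with the rounded figure given for a single delay line.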
`Despite the tremendous advance offered by the mercury delay lines, they
`were terribly unreliable and still rather expensive. The breakthrough came
`with the invention of core memory by J. Forrester at MIT as part of the Whirl-
`wind project, in the early 1950s (see Figure 7.37). Core memory uses a ferrite
`core, which can be magnetized, and once magnetized, acts as a store (just as a
`magnetic recording tape stores information). A set of wires running through
`the center of the core, which had a dimension of 0.1-1.0 millimeters, make it
`possible to read the value stored on any ferrite core. The Whirlwind eventually
`included a core memory with 2048 16-bit words, or a total of 32 Kbits. Core
`memory was a tremendous advance: It was cheaper, faster, much more reli-
`able, and had higher density. Core memory was so much better than the alter-
`natives that it became the dominant memory technology only a few years after
`its invention and remained so for nearly 20 years.
`
`FIGURE 7.36 The mercury delay lines in the EDSAC. This technology made it possible to
`build the first stored-program computer. The young engineer in this photograph is none other
`than Maurice Wilkes, the lead architect of the EDSAC. Photo courtesy of the Computer Museum,
`Boston.
`The technology that replaced core memory was the same one that we now
`use both for logic and memory: the integrated circuit. While registers were
`built out of transistorized memory in the 1960s, and IBM machines used tran-
`sistorized memory for microcode store and caches in 1970, building main
`memory out of transistors remained prohibitive until the development of the
`integrated circuit. With the integrated circuit, it became possible to build a
`DRAM (dynamic random access memory--see Appendix B for a description).
`The first DRAMs were built at Intel in 1970, and the machines using DRAM
`memories (as a high-speed option to core) came shortly thereafter; they used
`1-Kbit DRAMs. In fact, computer folklore says that Intel developed the micro-
`processor partly to help sell more DRAM. Figure 7.38 shows an early DRAM
`board. By the late 1970s, core memory became a historical curiosity. Just as core
`memory technology had allowed a tremendous expansion in memory size,
`DRAM technology allowed a comparable expansion. In the 1990s, many per-
`sonal computers have as much memory as the largest machines using core
`memory ever had.
`
`FIGURE 7.37 A core memory plane from the Whirlwind containing 256 cores arranged in
`a 16 x 16 array. Core memory was invented for the Whirlwind, which was used for air defense
`problems, and is now on display at the Smithsonian. (Incidentally, Ken Olsen, the founder and
`president of Digital for 20 years, built the machine that tested these core memories; it was his first
`computer.) Photo courtesy of the Computer Museum, Boston.
`
`Nowadays, DRAMs are typically packaged with multiple chips on a little
`board called SIMM (single inline memory module) or DIMM (dual inline
`memory module). The SIMM shown in Figure 7.39 contains a total of 1 MB and
`sells for about $5 in 1997. In 1997, SIMMs and DIMMs are available with up to
`64 MB. While DRAMs will remain the dominant memory technology for some
`time to come, dramatic innovations in the packaging of DRAMs to provide
`both higher bandwidth and greater density are ongoing.
`
`FIGURE 7.38 An early DRAM board. This board uses 18-Kbit chips. Photo courtesy of IBM.
`
`FIGURE 7.39 A 1-MB SIMM, built in 1986, using 1-Mbit chips. This SIMM, used in a Mac-
`intosh, sells for about $5/MB in 1997. In 1997, most main memory is packed in either SIMMs or
`DIMMs similar to this, though using much higher-density memory chips (16-Mbit or 64-Mbit).
`Photo courtesy of MIPS Technology, Inc.
`
`The Development of Memory Hierarchies
`
`Although the pioneers of computing foresaw the need for a memory hier-
`archy and coined the term, the automatic management of two levels was first
`proposed by Kilburn and his colleagues and demonstrated at the University
`of Manchester with the Atlas computer, which implemented virtual memory.
`This was the year before the IBM 360 was announced. IBM planned to include
`virtual memory with the next generation (System/370), but the OS/360 oper-
`ating system wasn’t up to the challenge in 1970. Virtual memory was
`announced for the 370 family in 1972, and it was for this machine that the
`term translation-lookaside buffer was coined. The only computers today without
`virtual memory are a few supercomputers, and even they may add this fea-
`ture in the near future.
`The problems of inadequate address space have plagued designers repeat-
`edly. The architects of the PDP-11 identified a small address space as the only
`architectural mistake that is difficult to recover from. When the PDP-11 was de-
`signed, core memory densities were increasing at a very slow rate, and the
`competition from 100 other minicomputer companies meant that DEC might
`not have a cost-competitive product if every address had to go through the
`16-bit datapath twice. Hence the decision to add just 4 more address bits than
`the predecessor of the PDP-11. The architects of the IBM 360 were aware of the
`importance of address size and planned for the architecture to extend to 32 bits
`of address. Only 24 bits were used in the IBM 360, however, because the low-
`end 360 models would have been even slower with the larger addresses. Un-
`fortunately, the expansion effort was greatly complicated by programmers
`who stored extra information in the upper 8 "unused" address bits.
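To see why data stored in the "unused" upper bits complicates a later expansion, consider the following minimal sketch. It is not the IBM 360's actual encoding: the function names, the 8-bit tag, and the sample values are hypothetical and chosen only to illustrate the hazard of packing extra information above a 24-bit address.

```python
ADDR_BITS_USED = 24
ADDR_MASK_24 = (1 << ADDR_BITS_USED) - 1  # 0x00FFFFFF
WORD_MASK_32 = 0xFFFFFFFF


def pack(address: int, tag: int) -> int:
    """Store a hypothetical 8-bit tag above a 24-bit address in one 32-bit word."""
    if not 0 <= address <= ADDR_MASK_24:
        raise ValueError("address does not fit in 24 bits")
    if not 0 <= tag <= 0xFF:
        raise ValueError("tag does not fit in 8 bits")
    return (tag << ADDR_BITS_USED) | address


def address_24(word: int) -> int:
    """Address seen by hardware that ignores the upper 8 bits."""
    return word & ADDR_MASK_24


def address_32(word: int) -> int:
    """Address seen by hardware that honors all 32 bits."""
    return word & WORD_MASK_32


word = pack(0x012345, tag=0x80)
print(hex(address_24(word)))  # the address the programmer intended
print(hex(address_32(word)))  # a different location once all bits count
```

With 24-bit addressing the tag is harmless, because the hardware masks it away. Once the upper bits become part of the address, the same word refers to a different location, so every program that used this trick has to be found and changed before the address space can grow.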
`Running out of a
