`
`In Praise of Memory Systems: Cache, DRAM, Disk
`
`Memory Systems: Cache, DRAM, Disk is the first book that takes on the whole hierarchy in a way that is
`consistent, covers the complete memory hierarchy, and treats each aspect in significant detail. This book will
`serve as a definitive reference manual for the expert designer, yet it is so complete that it can be read by a relative
`novice to the computer design space. While memory technologies improve in terms of density and performance,
`and new memory device technologies provide additional properties as design options, the principles and methodology
`presented in this amazingly complete treatise will remain useful for decades. I only wish that a book
`like this had been available when I started out more than three decades ago. It truly is a landmark publication.
`Kudos to the authors.
`-Al Davis, University of Utah
`
`Memory Systems: Cache, DRAM, Disk fills a huge void in the literature about modern computer architecture.
`The book starts by providing a high level overview and building a solid knowledge basis and then provides the
`details for a deep understanding of essentially all aspects of modern computer memory systems including architectural
`considerations that are put in perspective with cost, performance and power considerations. In addition,
`the historical background and politics leading to one or the other implementation are revealed. Overall,
`Jacob, Ng, and Wang have created one of the truly great technology books that turns reading about bits and bytes
`into an exciting journey towards understanding technology.
`
`-Michael Schuette, Ph.D., VP of Technology Development at OCZ Technology
`
`This book is a critical resource for anyone wanting to know how DRAM, cache, and hard drives really work.
`It describes the implementation issues, timing constraints, and trade-offs involved in past, present, and future
`designs. The text is exceedingly well-written, beginning with high-level analysis and proceeding to incredible
`detail only for those who need it. It includes many graphs that give the reader both explanation and intuition.
`This will be an invaluable resource for graduate students wanting to study these areas, implementers, designers,
`and professors.
`
`-Diana Franklin, California Polytechnic University, San Luis Obispo
`
`Memory Systems: Cache, DRAM, Disk fills an important gap in exploring modern disk technology with accuracy,
`lucidity, and authority. The details provided would only be known to a researcher who has also contributed
`in the development phase. I recommend this comprehensive book to engineers, graduate students, and researchers
`in the storage area, since details provided in computer architecture textbooks are woefully inadequate.
`-Alexander Thomasian, IEEE Fellow, New Jersey Institute of Technology and Thomasian and Associates
`
`Memory Systems: Cache, DRAM, Disk offers a valuable state of the art information in memory systems that
`can only be gained through years of working in advanced industry and research. It is about time that we have
`such a good reference in an important field for researchers, educators and engineers.
`
`-Nagi Mekhiel, Department of Electrical and Computer Engineering, Ryerson University, Toronto
`
`This is the only book covering the important DRAM and disk technologies in detail. Clear, comprehensive, and
`authoritative, I have been waiting for such a book for long time.
`-Yiming Hu, University of Cincinnati
`
`Samsung Electronics Co., Ltd.
`Ex. 1021, p. i
`
`
`
`Memory is often perceived as the performance bottleneck in computing architectures. Memory Systems:
`Cache, DRAM, Disk, sheds light on the mystical area of memory system design with a no-nonsense approach to
`what matters and how it affects performance. From historical discussions to modern case study examples this
`book is certain to become as ubiquitous and used as the other Morgan Kaufmann classic textbooks in computer
`engineering including Hennessy and Patterson's Computer Architecture: A Quantitative Approach.
`
`-R. Jacob Baker, Micron Technology, Inc. and Boise State University.
`
`Memory Systems: Cache, DRAM, Disk is a remarkable book that fills a very large void. The book is remarkable
`in both its scope and depth. It ranges from high performance cache memories to disk systems. It spans circuit
`design to system architecture in a clear, cohesive manner. It is the memory architecture that defines modern
`computer systems, after all. Yet, memory systems are often considered as an appendage and are covered in a
`piecemeal fashion. This book recognizes that memory systems are the heart and soul of modern computer
`systems and takes a 'holistic' approach to describing and analyzing memory systems.
`
`The classic book on memory systems was written by Dick Matick of IBM over thirty years ago. So not only does
`this book fill a void, it is a long-standing void. It carries on the tradition of Dick Matick's book extremely well,
`and it will doubtless be the definitive reference for students and designers of memory systems for many years to
`come. Furthermore, it would be easy to build a top-notch memory systems course around this book. The authors
`clearly and succinctly describe the important issues in an easy-to-read manner. And the figures and graphs are
`really great-one of the best parts of the book.
`
`When I work at home, I make coffee in a little stove-top espresso maker I got in Spain. It makes good coffee very
`efficiently, but if you put it on the stove and forget it's there, bad things happen-smoke, melted gasket-'burned
`coffee meltdown.' This only happens when I'm totally engrossed in a paper or article. Today, for the first time, it
`happened twice in a row-while I was reading the final version of this book.
`-Jim Smith, University of Wisconsin-Madison
`
`
`
`
`Memory Systems
`Cache, DRAM, Disk
`
`
`
`
`
`
`
`Memory Systems
`Cache, DRAM, Disk
`
`Bruce Jacob
`University of Maryland at College Park
`
`Spencer W. Ng
`Hitachi Global Storage Technologies
`
`David T. Wang
`MetaRAM
`
`With Contributions By
`
`Samuel Rodriguez
`Advanced Micro Devices
`
`ELSEVIER
`
`AMSTERDAM • BOSTON • HEIDELBERG • LONDON
`NEW YORK • OXFORD • PARIS • SAN DIEGO
`SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
`Morgan Kaufmann is an imprint of Elsevier
`
`MORGAN KAUFMANN PUBLISHERS
`
`
`
`
`Publisher                      Denise E.M. Penrose
`Acquisitions Editor            Chuck Glaser
`Publishing Services Manager    George Morrison
`Senior Production Editor       Paul Gottehrer
`Developmental Editor           Nate McFadden
`Assistant Editor               Kimberlee Honjo
`Cover Design                   Joanne Blank
`Text Design                    Dennis Schaefer
`Composition                    diacriTech
`Interior printer               Maple-Vail Book Manufacturing Group
`Cover printer                  Phoenix Color
`
`Morgan Kaufmann Publishers is an imprint of Elsevier.
`30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
`
`This book is printed on acid-free paper.
`
`© 2008 by Elsevier Inc. All rights reserved.
`
`Designations used by companies to distinguish their products are often claimed as trademarks or registered trademarks. In all
`instances in which Morgan Kaufmann Publishers is aware of a claim, the product names appear in initial capital or all capital
`letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and
`registration.
`
`No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means-electronic,
`mechanical, photocopying, scanning, or otherwise--without prior written permission of the publisher.
`
`Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone:
`(+44) 1865 843830, fax: (+44) 1865 853333, E-mail: permissions@elsevier.com. You may also complete your request online
`via the Elsevier homepage (http://elsevier.com), by selecting "Support & Contact" then "Copyright and Permission" and then
`"Obtaining Permissions."
`
`Library of Congress Cataloging-in-Publication Data
`Application submitted
`
`ISBN: 978-0-12-379751-3
`
`For information on all Morgan Kaufmann publications,
`visit our Web site at www.mkp.com or www.books.elsevier.com
`Printed in the United States of America
`08 09 10 11 12
`5 4 3 2 1
`
`Working together to grow
`libraries in developing countries
`www.elsevier.com | www.bookaid.org | www.sabre.org
`ELSEVIER
`Sabre Foundation
`
`
`
`
`Dedication
`
`Jacob To my parents, Bruce and Ann Jacob, my wife,
`Dorinda, and my children, Garrett, Carolyn,
`and Nate
`
`Ng Dedicated to the memory of my parents
`Ching-Sum and Yuk-Ching Ng
`
`Wang Dedicated to my parents Tu-Sheng Wang and
`Hsin-Hsin Wang
`
`
`
`
`You can tell whether a person plays or not by the way he carries the
`instrument, whether it means something to him or not.
`Then the way they talk and act. If they act too hip, you know they can't
`play [jack].
`
`-Miles Davis
`
`[...] in connection with musical continuity, Cowell remarked at the
`New School before a concert of works by Christian Wolff, Earle Brown,
`Morton Feldman, and myself, that here were four composers who were
`getting rid of glue. That is: Where people had felt the necessity to stick
`sounds together to make a continuity, we four felt the opposite necessity
`to get rid of the glue so that sounds would be themselves.
`
`Christian Wolff was the first to do this. He wrote some pieces vertically
`on the page but recommended their being played horizontally left to
`right, as is conventional. Later he discovered other geometrical means
`for freeing his music of intentional continuity. Morton Feldman divided
`pitches into three areas, high, middle, and low, and established
`a time unit. Writing on graph paper, he simply inscribed numbers of
`tones to be played at any time within specified periods of time.
`
`There are people who say, "If music's that easy to write, I could do it." Of
`course they could, but they don't. I find Feldman's own statement more
`affirmative. We were driving back from some place in New England
`where a concert had been given. He is a large man and falls asleep
`easily. Out of a sound sleep, he awoke to say, "Now that things are so
`simple, there's so much to do." And then he went back to sleep.
`
`-John Cage, Silence
`
`
`
`
`Contents
`
`Preface "It's the Memory, Stupid!" .............................................. xxxi
`
`Overview On Memory Systems and Their Design ................................. 1
`Ov.1 Memory Systems .......................................................................................... 2
`Ov.1.1 Locality of Reference Breeds the Memory Hierarchy ......................................... 2
`Ov.1.2 Important Figures of Merit ................................................................................. 7
`Ov.1.3 The Goal of a Memory Hierarchy ..................................................................... 10
`Ov.2 Four Anecdotes on Modular Design .......................................................... 14
`Ov.2.1 Anecdote I: Systemic Behaviors Exist ............................................................... 15
`Ov.2.2 Anecdote II: The DLL in DDR SDRAM ............................................................. 17
`Ov.2.3 Anecdote III: A Catch-22 in the Search for Bandwidth ................................... 18
`Ov.2.4 Anecdote IV: Proposals to Exploit Variability in Cell Leakage ........................ 19
`Ov.2.5 Perspective ......................................................................................................... 19
`Ov.3 Cross-Cutting Issues .................................................................................. 20
`Ov.3.1 Cost/Performance Analysis ............................................................................... 20
`Ov.3.2 Power and Energy ............................................................................................. 26
`Ov.3.3 Reliability .......................................................................................................... 32
`Ov.3.4 Virtual Memory ................................................................................................. 34
`Ov.4 An Example Holistic Analysis .................................................................... 41
`Ov.4.1 Fully-Buffered DIMM vs. the Disk Cache ......................................................... 41
`Ov.4.2 Fully Buffered DIMM: Basics ............................................................................ 43
`Ov.4.3 Disk Caches: Basics ........................................................................................... 46
`Ov.4.4 Experimental Results ........................................................................................ 47
`Ov.4.5 Conclusions ....................................................................................................... 52
`Ov.5 What to Expect ............................................................................................ 54
`
`
`
`
`
`Part I Cache ............................................... 55
`Chapter 1 An Overview of Cache Principles ......................................... 57
`1.1 Caches, 'Caches,' and "Caches" .................................................................... 59
`1.2 Locality Principles ........................................................................................ 62
`1.2.1 Temporal Locality .................................................................................................. 63
`1.2.2 Spatial Locality ...................................................................................................... 63
`1.2.3 Algorithmic Locality .............................................................................................. 64
`1.2.4 Geographical Locality? Demographical Locality? ............................................... 65
`1.3 What to Cache, Where to Put It, and How to Maintain It ............................. 66
`1.3.1 Logical Organization Basics: Blocks, Tags, Sets .................................................... 67
`1.3.2 Content Management: To Cache or Not to Cache ................................................ 68
`1.3.3 Consistency Management: Its Responsibilities ..................................................... 69
`1.3.4 Inclusion and Exclusion ........................................................................................ 70
`1.4 Insights and Optimizations .......................................................................... 73
`1.4.1 Perspective .............................................................................................................. 73
`1.4.2 Important Issues, Future Directions ..................................................................... 77
`
`Chapter 2 Logical Organization ......................................................... 79
`2.1 Logical Organization: A Taxonomy .............................................................. 79
`2.2 Transparently Addressed Caches ................................................................. 82
`2.2.1 Implicit Management: Transparent Caches ......................................................... 86
`2.2.2 Explicit Management: Software-Managed Caches .............................................. 86
`2.3 Non-Transparently Addressed Caches ......................................................... 90
`2.3.1 Explicit Management: Scratch-Pad Memories ..................................................... 91
`2.3.2 Implicit Management: Self-Managed Scratch-Pads ............................................ 92
`2.4 Virtual Addressing and Protection ............................................................... 92
`2.4.1 Virtual Caches ........................................................................................................ 93
`2.4.2 ASIDs and Protection Bits ...................................................................................... 96
`2.4.3 Inherent Problems ................................................................................................. 96
`2.5 Distributed and Partitioned Caches ............................................................ 97
`2.5.1 UMA and NUMA ................................................................................................... 98
`2.5.2 COMA ..................................................................................................................... 99
`2.5.3 NUCA and NuRAPID ............................................................................................. 99
`2.5.4 Web Caches ........................................................................................................... 100
`2.5.5 Buffer Caches ....................................................................................................... 102
`
`
`
`
`
`2.6 Case Studies ................................................................................................ 103
`2.6.1 A Horizontal-Exclusive Organization: Victim Caches, Assist Caches ................ 103
`2.6.2 A Software Implementation: BSD's Buffer Cache ............................................... 104
`2.6.3 Another Dynamic Cache Block: Trace Caches .................................................... 106
`
`Chapter 3 Management of Cache Contents .......................................... 117
`3.1 Case Studies: On-Line Heuristics ............................................................... 120
`3.1.1 On-Line Partitioning Heuristics ......................................................................... 120
`3.1.2 On-Line Prefetching Heuristics ........................................................................... 129
`3.1.3 On-Line Locality Optimizations ......................................................................... 141
`3.2 Case Studies: Off-Line Heuristics ............................................................... 151
`3.2.1 Off-Line Partitioning Heuristics ......................................................................... 151
`3.2.2 Off-Line Prefetching Heuristics ........................................................................... 155
`3.2.3 Off-Line Locality Optimizations ......................................................................... 161
`3.3 Case Studies: Combined Approaches ......................................................... 169
`3.3.1 Combined Approaches to Partitioning ............................................................... 170
`3.3.2 Combined Approaches to Prefetching ................................................................ 174
`3.3.3 Combined Approaches to Optimizing Locality .................................................. 180
`3.4 Discussions ................................................................................................. 202
`3.4.1 Proposed Scheme vs. Baseline ............................................................................. 202
`3.4.2 Prefetching vs. Locality Optimizations .............................................................. 203
`3.4.3 Application-Directed Management vs. Transparent Management .................. 203
`3.4.4 Real Time vs. Average Case .................................................................................. 204
`3.4.5 Naming vs. Cache Conflicts ................................................................................. 205
`3.4.6 Dynamic vs. Static Management ........................................................................ 208
`3.5 Building a Content-Management Solution ............................................... 212
`3.5.1 Degree of Dynamism ........................................................................................... 212
`3.5.2 Degree of Prediction ............................................................................................. 213
`3.5.3 Method of Classification ..................................................................................... 213
`3.5.4 Method for Ensuring Availability ....................................................................... 214
`
`Chapter 4 Management of Cache Consistency .................................... 217
`4.1 Consistency with Backing Store ................................................................. 218
`4.1.1 Write-Through ..................................................................................................... 218
`4.1.2 Delayed Write, Driven By the Cache .................................................................... 219
`4.1.3 Delayed Write, Driven by Backing Store ............................................................. 220
`
`
`
`
`4.2 Consistency with Self .................................................................................. 220
`4.2.1 Virtual Cache Management ................................................................................ 220
`4.2.2 ASID Management ............................................................................................... 225
`4.3 Consistency with Other Clients .................................................................. 226
`4.3.1 Motivation, Explanation, Intuition .................................................................... 226
`4.3.2 Coherence vs. Consistency ................................................................................... 231
`4.3.3 Memory-Consistency Models .............................................................................. 233
`4.3.4 Hardware Cache-Coherence Mechanisms .......................................................... 240
`4.3.5 Software Cache-Coherence Mechanisms ........................................................... 254
`
`Chapter 5 Implementation Issues ....................................................... 257
`5.1 Overview ..................................................................................................... 257
`5.2 SRAM Implementation ............................................................................... 258
`5.2.1 Basic 1-Bit Memory Cell ...................................................................................... 259
`5.2.2 Address Decoding ................................................................................................. 262
`5.2.3 Physical Decoder Implementation ...................................................................... 268
`5.2.4 Peripheral Bitline Circuits ................................................................................... 278
`5.2.5 Sense Amplifiers ................................................................................................... 281
`5.2.6 Write Amplifier ..................................................................................................... 283
`5.2.7 SRAM Partitioning ............................................................................................... 285
`5.2.8 SRAM Control and Timing .................................................................................. 286
`5.2.9 SRAM Interface .................................................................................................... 292
`5.3 Advanced SRAM Topics ............................................................................... 293
`5.3.1 Low-Leakage Operation ...................................................................................... 293
`5.4 Cache Implementation ............................................................................... 297
`5.4.1 Simple Caches ...................................................................................................... 297
`5.4.2 Processor Interfacing ........................................................................................... 298
`5.4.3 Multiporting ......................................................................................................... 298
`
`Chapter 6 Cache Case Studies ......................................................... 301
`6.1 Logical Organization .................................................................................. 301
`6.1.1 Motorola MPC7450 .............................................................................................. 301
`6.1.2 AMD Opteron ....................................................................................................... 301
`6.1.3 Intel Itanium-2 .................................................................................................... 303
`
`
`
`
`
`6.2 Pipeline Interface ....................................................................................... 304
`6.2.1 Motorola MPC7450 .............................................................................................. 304
`6.2.2 AMD Opteron ....................................................................................................... 304
`6.2.3 Intel Itanium-2 .................................................................................................... 304
`6.3 Case Studies of Detailed Itanium-2 Circuits .............................................. 305
`6.3.1 L1 Cache RAM Cell Array ..................................................................................... 305
`6.3.2 L2 Array Bitline Structure .................................................................................... 305
`6.3.3 L3 Subarray Implementation .............................................................................. 308
`6.3.4 Itanium-2 TLB and CAM Implementation ........................................................ 308
`
`Part II DRAM ........................................... 313
`
`Chapter 7 Overview of DRAMs ...................................................... 315
`7.1 DRAM Basics: Internals, Operation ........................................................... 316
`7.2 Evolution of the DRAM Architecture ......................................................... 322
`7.2.1 Structural Modifications Targeting Throughput ............................................... 322
`7.2.2 Interface Modifications Targeting Throughput .................................................. 328
`7.2.3 Structural Modifications Targeting Latency ....................................................... 330
`7.2.4 Rough Comparison of Recent DRAMs ................................................................ 331
`7.3 Modern-Day DRAM Standards ................................................................... 332
`7.3.1 Salient Features of JEDEC's SDRAM Technology ................................................ 332
`7.3.2 Other Technologies, Rambus in Particular ......................................................... 335
`7.3.3 Comparison of Technologies in Rambus and JEDEC DRAM ............................. 341
`7.3.4 Alternative Technologies ...................................................................................... 343
`7.4 Fully Buffered DIMM: A Compromise of Sorts .......................................... 348
`7.5 Issues in DRAM Systems, Briefly ................................................................ 350
`7.5.1 Architecture and Scaling ..................................................................................... 350
`7.5.2 Topology and Timing ........................................................................................... 350
`7.5.3 Pin and Protocol Efficiency ................................................................................. 351
`7.5.4 Power and Heat Dissipation ............................................................................... 351
`7.5.5 Future Directions ................................................................................................ 351
`
`
`
`
`
`Chapter 8 DRAM Device Organization: Basic Circuits and
`Architecture ................................................................... 353
`8.1 DRAM Device Organization ....................................................................... 353
`8.2 DRAM Storage Cells .................................................................................... 355
`8.2.1 Cell Capacitance, Leakage, and Refresh ........................................................... 356
`8.2.2 Conflicting Requirements Drive Cell Structure ................................................ 356
`8.2.3 Trench Capacitor Structure ............................................................................... 357
`8.2.4 Stacked Capacitor Structure .............................................................................. 357
`8.3 RAM Array Structures ................................................................................. 358
`8.3.1 Open Bitline Array Structure ............................................................................. 359
`8.3.2 Folded Bitline Array Structure ........................................................................... 360
`8.4 Differential Sense Amplifier ....................................................................... 360
`8.4.1 Functionality of Sense Amplifiers in DRAM Devices ........................................ 361
`8.4.2 Circuit Diagram of a Basic Sense Amplifier ..................................................... 362
`8.4.3 Basic Sense Amplifier Operation ....................................................................... 362
`8.4.4 Voltage Waveform of Basic Sense Amplifier Operation .................................... 363
`8.4.5 Writing into DRAM Array .................................................................................. 365
`8.5 Decoders and Redundancy ........................................................................ 366
`8.5.1 Row Decoder Replacement Example ................................................................ 368
`8.6 DRAM Device Control Logic ....................................................................... 368
`8.6.1 Synchronous vs. Non-Synchronous .................................................................. 369
`8.6.2 Mode Register-Based Programmability ............................................................ 370
`8.7 DRAM Device Configuration ...................................................................... 370
`8.7.1 Device Configuration Trade-offs ....................................................................... 371
`8.8 Data I/O ....................................................................................................... 372
`8.8.1 Burst Lengths and Burst Ordering .................................................................... 372
`8.8.2 N-Bit Prefetch ..................................................................................................... 372
`8.9 DRAM Device Packaging ............................................................................ 373
`8.10 DRAM Process Technology and Process Scaling
`Considerations ........................................................................................... 374
`8.10.1 Cost Considerations ........................................................................................... 375
`8.10.2 DRAM- vs. Logic-Optimized Process Technology ............................................. 375
`
`Chapter 9 DRAM System Signaling and Timing ................................. 377
`9.1 Signaling System ........................................................................................ 377
`
`
`
`
`
`9.2 Transmission Lines on PCBs ...................................................................... 379
`9.2.1 Brief Tutorial on the Telegrapher's Equations ................................................... 380
`9.2.2 RC and LC Transmission Line Models ............................................................... 382
`9.2.3 LC Transmission Line Model for PCB Traces ..................................................... 383
`9.2.4 Signal Velocity on the LC Transmission Line .................................................... 383
`9.2.5 Skin Effect of Conductors ................................................................................... 384
`9.2.6 Dielectric Loss .............................................