David A. Patterson
John L. Hennessy

MORGAN KAUFMANN

Sony Corp., et al. v.
Memory Integrity, LLC
IPR2015-BD153
EXHIBIT

THIRD EDITION
Computer Organization and Design

THE HARDWARE/SOFTWARE INTERFACE

David A. Patterson
University of California, Berkeley

John L. Hennessy
Stanford University

With a contribution by
Peter J. Ashenden
Ashenden Designs Pty Ltd

James R. Larus
Microsoft Research

Daniel J. Sorin
Duke University

ELSEVIER

AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO

Morgan Kaufmann is an imprint of Elsevier

MORGAN KAUFMANN PUBLISHERS
Senior Editor                     Denise E. M. Penrose
Publishing Services Manager       Simon Crump
Editorial Assistant               Summer Block
Cover Design                      Ross Caron Design
Cover and Chapter Illustration    Chris Asimoudis
Text Design                       GGS Book Services
Composition                       Nancy Logan and Dartmouth Publishing, Inc.
Technical Illustration            Dartmouth Publishing, Inc.
Copyeditor                        Ken DellaPenta
Proofreader                       Jacqui Brownstein
Indexer                           Linda Buskus
Interior printer                  Courier
Cover printer                     Courier

Morgan Kaufmann Publishers is an imprint of Elsevier.
500 Sansome Street, Suite 400, San Francisco, CA 94111

This book is printed on acid-free paper.

© 2005 by Elsevier Inc. All rights reserved.

Designations used by companies to distinguish their products are often claimed as trademarks or registered trademarks. In all instances in which Morgan Kaufmann Publishers is aware of a claim, the product names appear in initial capital or all capital letters. Readers, however, should contact the appropriate companies for more complete information regarding trademarks and registration.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, scanning, or otherwise) without prior written permission of the publisher.

Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: permissions@elsevier.com.uk. You may also complete your request on-line via the Elsevier homepage (http://elsevier.com) by selecting "Customer Support" and then "Obtaining Permissions."

Library of Congress Cataloging-in-Publication Data
Application submitted
ISBN: 1-55860-604-1

For information on all Morgan Kaufmann publications,
visit our Web site at www.mkp.com.

Printed in the United States of America
04 05 06 07 08    5 4 3 2 1
473
7.2 The Basics of Caches

[Figure 7.3 diagram: the CPU sits at the top of the levels in the memory hierarchy; distance from the CPU in access time increases downward, as does the size of the memory at each level.]

FIGURE 7.3 This diagram shows the structure of a memory hierarchy: as the distance from the processor increases, so does the size. This structure, with the appropriate operating mechanisms, allows the processor to have an access time that is determined primarily by level 1 of the hierarchy and yet have a memory as large as level n. Maintaining this illusion is the subject of this chapter. Although the local disk is normally the bottom of the hierarchy, some systems use tape or a file server over a local area network as the next levels of the hierarchy.
The Basics of Caches

In our library example, the desk acted as a cache: a safe place to store things (books) that we needed to examine. Cache was the name chosen to represent the level of the memory hierarchy between the processor and main memory in the first commercial computer to have this extra level. Today, although this remains the dominant use of the word cache, the term is also used to refer to any storage managed to take advantage of locality of access. Caches first appeared in research computers in the early 1960s and in production computers later in that same decade; every general-purpose computer built today, from servers to low-power embedded processors, includes caches.
In this section, we begin by looking at a very simple cache in which the processor requests are each one word and the blocks also consist of a single word. (Readers already familiar with cache basics may want to skip to Section 7.3 on page 492.)
Cache: a safe place for hiding or storing things.
Webster's New World Dictionary of the American Language, Third College Edition (1988)
474
Chapter 7 Large and Fast: Exploiting Memory Hierarchy

[Figure 7.4 diagram: the cache contents (a) before and (b) after the reference to Xn.]

FIGURE 7.4 The cache just before and just after a reference to a word Xn that is not initially in the cache. This reference causes a miss that forces the cache to fetch Xn from memory and insert it into the cache.
Figure 7.4 shows such a simple cache, before and after requesting a data item that is not initially in the cache. Before the request, the cache contains a collection of recent references X1, X2, ..., Xn-1, and the processor requests a word Xn that is not in the cache. This request results in a miss, and the word Xn is brought from memory into the cache.
In looking at the scenario in Figure 7.4, there are two questions to answer: How do we know if a data item is in the cache? Moreover, if it is, how do we find it? The answers to these two questions are related. If each word can go in exactly one place in the cache, then it is straightforward to find the word if it is in the cache. The simplest way to assign a location in the cache for each word in memory is to assign the cache location based on the address of the word in memory. This cache structure is called direct mapped, since each memory location is mapped directly to exactly one location in the cache. The typical mapping between addresses and cache locations for a direct-mapped cache is usually simple. For example, almost all direct-mapped caches use the mapping

(Block address) modulo (Number of cache blocks in the cache)

This mapping is attractive because if the number of entries in the cache is a power of two, then modulo can be computed simply by using the low-order log2 (cache size in blocks) bits of the address; hence the cache may be accessed directly with the low-order bits. For example, Figure 7.5 shows how the memory addresses

direct-mapped cache A cache structure in which each memory location is mapped to exactly one location in the cache.
[Figure 7.5 diagram: a cache with eight entries, indexed 000 through 111, and the memory addresses 00001, 00101, 01001, 01101, 10001, 10101, 11001, and 11101 that map into it.]

FIGURE 7.5 A direct-mapped cache with eight entries showing the addresses of memory words between 0 and 31 that map to the same cache locations. Because there are eight words in the cache, an address X maps to the cache word X modulo 8. That is, the low-order log2(8) = 3 bits are used as the cache index. Thus, addresses 00001two, 01001two, 10001two, and 11001two all map to entry 001two of the cache, while addresses 00101two, 01101two, 10101two, and 11101two all map to entry 101two of the cache.
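The modulo-equals-low-order-bits relationship described above can be checked directly. The following is a minimal sketch (my own illustration, not code from the book; the function names are assumptions):

```python
# For a power-of-two number of cache blocks, taking the block address
# modulo the number of blocks keeps exactly the low-order
# log2(number of blocks) bits of the address.
def cache_index(block_address, num_blocks):
    assert num_blocks & (num_blocks - 1) == 0, "must be a power of two"
    return block_address % num_blocks

def low_order_bits(block_address, num_blocks):
    # num_blocks - 1 is a mask of log2(num_blocks) one-bits
    return block_address & (num_blocks - 1)

# Figure 7.5's eight-entry cache: addresses 1, 9, 17, 25
# (00001, 01001, 10001, 11001 in binary) all land in entry 001two.
for addr in (1, 9, 17, 25):
    assert cache_index(addr, 8) == low_order_bits(addr, 8) == 0b001
```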
between 1ten (00001two) and 29ten (11101two) map to locations 1ten (001two) and 5ten (101two) in a direct-mapped cache of eight words.
Because each cache location can contain the contents of a number of different
memory locations, how do we know whether the data in the cache corresponds to a requested word? That is, how do we know whether a requested word is in the cache or not? We answer this question by adding a set of tags to the cache. The tags contain the address information required to identify whether a word in the cache corresponds to the requested word. The tag needs only to contain the upper portion of the address, corresponding to the bits that are not used as an index into the cache. For example, in Figure 7.5 we need only have the upper 2 of the 5 address bits in the tag, since the lower 3-bit index field of the address selects the block. We exclude the index bits because they are redundant: by definition, the index field of every address that maps to a given cache entry must have the same value.
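To make the index/tag split concrete, here is a small sketch (my own illustration, not the book's code) for the 5-bit word addresses of Figure 7.5:

```python
# Split a 5-bit word address into a 3-bit index (which cache block)
# and a 2-bit tag (which of the addresses mapping there it is).
INDEX_BITS = 3  # log2(8 cache blocks)

def split_address(addr):
    index = addr & ((1 << INDEX_BITS) - 1)  # low-order bits select the block
    tag = addr >> INDEX_BITS                # remaining upper bits are the tag
    return tag, index

# Address 29 = 11101two: tag 11two, index 101two.
assert split_address(0b11101) == (0b11, 0b101)
```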
We also need a way to recognize that a cache block does not have valid information. For instance, when a processor starts up, the cache does not have good

tag A field in a table used for a memory hierarchy that contains the address information required to identify whether the associated block in the hierarchy corresponds to a requested word.
valid bit A field in the tables of a memory hierarchy that indicates that the associated block in the hierarchy contains valid data.
data, and the tag fields will be meaningless. Even after executing many instructions, some of the cache entries may still be empty, as in Figure 7.4. Thus, we need to know that the tag should be ignored for such entries. The most common method is to add a valid bit to indicate whether an entry contains a valid address. If the bit is not set, there cannot be a match for this block.
For the rest of this section, we will focus on explaining how reads work in a cache and how the cache control works for reads. In general, handling reads is a little simpler than handling writes, since reads do not have to change the contents of the cache. After seeing the basics of how reads work and how cache misses can be handled, we'll examine the cache designs for real computers and detail how these caches handle writes.
Accessing a Cache

Figure 7.6 shows the contents of an eight-word direct-mapped cache as it responds to a series of requests from the processor. Since there are eight blocks in the cache, the low-order 3 bits of an address give the block number. Here is the action for each reference:
Decimal address   Binary address   Hit or miss   Assigned cache block
of reference      of reference     in cache      (where found or placed)

22                10110two         miss (7.6b)   (10110two mod 8) = 110two
26                11010two         miss (7.6c)   (11010two mod 8) = 010two
22                10110two         hit           (10110two mod 8) = 110two
26                11010two         hit           (11010two mod 8) = 010two
16                10000two         miss (7.6d)   (10000two mod 8) = 000two
 3                00011two         miss (7.6e)   (00011two mod 8) = 011two
16                10000two         hit           (10000two mod 8) = 000two
18                10010two         miss (7.6f)   (10010two mod 8) = 010two
When the word at address 18 (10010two) is brought into cache block 2 (010two), the word at address 26 (11010two), which was in cache block 2 (010two), must be replaced by the newly requested data. This behavior allows a cache to take advantage of temporal locality: recently accessed words replace less recently referenced words. This situation is directly analogous to needing a book from the shelves and having no more space on your desk; some book already on your desk must be returned to the shelves. In a direct-mapped cache, there is only one place to put the newly requested item and hence only one choice of what to replace.
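The whole lookup, including tags and valid bits, can be replayed in a few lines. The sketch below is my own simulation of the reference sequence in the table above, not code from the book:

```python
# A direct-mapped cache of 8 one-word blocks. Each entry keeps a
# valid bit and a tag (the address bits above the 3-bit index).
NUM_BLOCKS = 8
INDEX_BITS = 3

valid = [False] * NUM_BLOCKS
tags = [0] * NUM_BLOCKS

def access(addr):
    """Return 'hit' or 'miss'; on a miss, install the block."""
    index = addr % NUM_BLOCKS
    tag = addr >> INDEX_BITS
    if valid[index] and tags[index] == tag:
        return "hit"
    valid[index], tags[index] = True, tag  # replace whatever was there
    return "miss"

refs = [22, 26, 22, 26, 16, 3, 16, 18]
results = [access(a) for a in refs]
# Reproduces the table: miss, miss, hit, hit, miss, miss, hit, miss.
# The final miss on 18 evicts 26, since both map to block 010two.
```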

9.6 Network Topologies
9-27
Chapter 8 reviewed off-the-shelf, switched, local area networks that are the foundation of clusters. In this section we describe proprietary networks used in multiprocessors.
Network costs include the number of switches, the number of links on a switch to connect to the network, the width (number of bits) per link, and the length of the links when the network is mapped into a physical machine. For example, on a machine that scales between tens and hundreds of processors, some links may be metal rectangles within a chip that are a few millimeters long, and others may be cables that must stretch several meters from one cabinet to another. Network performance is multifaceted as well. It includes the latency on an unloaded network to send and receive a message, the throughput in terms of the maximum number of messages that can be transmitted in a given time period, delays caused by contention for a portion of the network, and variable performance depending on the pattern of communication. Another obligation of the network may be fault tolerance, since very large systems may be required to operate in the presence of broken components.
Networks are normally drawn as graphs, with each arc of the graph representing a link of the communication network. The processor-memory node is shown as a black square, and the switch is shown as a colored circle. In this section, all links are bidirectional; that is, information can flow in either direction. All networks consist of switches whose links go to processor-memory nodes and to other switches. The first improvement over a bus is a network that connects a sequence of nodes together:
This topology is called a ring. Since some nodes are not directly connected, some messages will have to hop along intermediate nodes until they arrive at the final destination.
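Because the links are bidirectional, a message can take the shorter way around the ring. A small sketch of the hop count (my own illustration, assuming nodes are numbered 0 through P-1):

```python
# Hop count between nodes i and j in a P-node bidirectional ring:
# go clockwise or counterclockwise, whichever is shorter.
def ring_hops(i, j, p):
    d = abs(i - j) % p
    return min(d, p - d)

# In an 8-node ring, node 0 reaches node 3 in 3 hops,
# but reaches node 6 in only 2 hops (going the other way around).
```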
Unlike a bus, a ring is capable of many simultaneous transfers. Because there are numerous topologies to choose from, performance metrics are needed to distinguish these designs. Two are popular. The first is total network bandwidth, which is the bandwidth of each link multiplied by the number of links. This represents the very best case. For the ring network above with P processors, the total network bandwidth would be P times the bandwidth of one link; the total network bandwidth of a bus is just the bandwidth of that bus, or two times the bandwidth of that link.
network bandwidth Informally, the peak transfer rate of a network; can refer to the speed of a single link or the collective transfer rate of all links in the network.
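Total network bandwidth as just defined is simple arithmetic; this sketch (my own illustration, not from the book) makes the ring calculation explicit:

```python
# Best-case total network bandwidth: per-link bandwidth times the
# number of links that could transfer simultaneously.
def total_bandwidth(link_bw, num_links):
    return link_bw * num_links

# A ring of P processors has P links, so an 8-node ring with
# 1 Gbit/s links has a total network bandwidth of 8 Gbit/s.
ring_total = total_bandwidth(1, 8)  # 8
```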