(19) United States
(12) Patent Application Publication    (10) Pub. No.: US 2002/0184445 A1
     Cherabuddi                        (43) Pub. Date: Dec. 5, 2002

(54) DYNAMICALLY ALLOCATED CACHE MEMORY FOR A MULTI-PROCESSOR UNIT

(76) Inventor: Rajasekhar Cherabuddi, Cupertino, CA (US)

     Correspondence Address:
     WILLIAM L. PARADICE, III
     425 CALIFORNIA STREET
     SUITE 900
     SAN FRANCISCO, CA 94104 (US)

(21) Appl. No.: 09/838,921

(22) Filed: Apr. 20, 2001

Publication Classification

(51) Int. Cl. ................................. G06F 13/00
(52) U.S. Cl. ................................. 711/130

(57) ABSTRACT

The resources of a partitioned cache memory are dynamically allocated between two or more processors on a multi-processor unit (MPU). In one embodiment, the MPU includes first and second processors, and the cache memory includes first and second partitions. A cache access circuit selectively transfers data between the cache memory partitions to maximize cache resources. In one mode, both processors are active and may simultaneously execute separate instruction threads. In this mode, the cache access circuit allocates the first cache memory partition as dedicated cache memory for the first processor, and allocates the second cache memory partition as dedicated cache memory for the second processor. In another mode, one processor is active, and the other processor is inactive. In this mode, the cache access circuit allocates both the first and second cache memory partitions as cache memory for the active processor.

[Front-page drawing: computer system with multiple MPUs coupled to a system bus]
[Drawing sheets 1-4 of 4: FIGS. 1-4, described below]
DYNAMICALLY ALLOCATED CACHE MEMORY FOR A MULTI-PROCESSOR UNIT
BACKGROUND

[0001] 1. Field of Invention

[0002] This invention relates generally to multi-processor computer systems, and specifically to cache memories of multi-processor computer systems.

[0003] 2. Description of Related Art

[0004] Some manufacturers combine two or more central processing units (CPUs) on a single chip and sell the chip as a multi-processor unit (MPU). The MPU takes advantage of parallel processing to increase performance over a single CPU. An MPU typically includes a cache memory to store data in anticipation of future use by the CPUs. The cache memory is smaller and faster than the MPU's main memory, and thus can transfer data to the CPUs in much less time than data from the main memory. When data requested by the CPUs is in the cache memory, there is a cache hit, and CPU performance approaches the speed of the cache memory. Conversely, when there is a cache miss, the requested data must be retrieved from main memory, and thus CPU performance approaches the speed of main memory. Thus, increased performance may be achieved by maximizing the percentage of cache hits during operation.

[0005] Some MPU architectures include a single cache memory that is shared by each of the MPU's CPUs. Since data stored in the shared cache memory is shared by each CPU on the chip, it is not necessary to store duplicate sets of data, which increases cache efficiency. Further, if one of the CPUs on the chip becomes defective, or is otherwise not required for a particular operation, the other CPU(s) may still access the entire cache memory. However, since more than one CPU may access the same cache memory locations, chip-level snoop operations are required between the CPUs on each MPU. These snoop operations are in addition to any system-level snoop operations between MPUs on a common bus. The additional circuitry required to perform the chip-level snoop operations undesirably increases the size and complexity of the associated cache controllers.

[0006] Other MPU architectures include a dedicated cache memory for each of the MPU's CPUs. Since only one CPU has access to any given cache memory location, snoop operations between the CPUs on the MPUs may be performed at the system level rather than the chip level. Accordingly, the cache controllers for dedicated cache memories are smaller and simpler than the cache controllers for a shared cache memory. However, if one of the CPUs becomes defective or is otherwise not required for a particular application, its dedicated cache memory is not accessible by the other CPU(s), thereby wasting cache resources.

[0007] Thus, there is a need for better management of cache resources on an MPU without requiring large and complicated cache controllers.
SUMMARY

[0008] A method and apparatus are disclosed that overcome problems in the art described above. In accordance with the present invention, the resources of a partitioned cache memory are dynamically allocated between two or more processors on a multi-processor unit (MPU) according to a desired system configuration or to the processing needs of the processors. In some embodiments, the MPU includes first and second processors, and the cache memory includes first and second partitions. In one embodiment, each cache memory partition is a 2-way associative cache memory. A cache access circuit provided between the cache memory and the processors selectively transfers addresses and data between the first and/or second CPUs and the first and/or second cache memory partitions to maximize cache resources.

[0009] In one mode, both processors are set as active, and may simultaneously execute separate instruction threads. In this two-thread mode, the cache access circuit allows each processor to use a corresponding cache memory partition as a dedicated cache. For example, during cache read operations, the cache access circuit provides addresses from the first processor to the first cache memory partition and addresses from the second processor to the second cache memory partition, and returns data from the first cache memory partition to the first processor and data from the second cache memory partition to the second processor. Similarly, during cache write operations, the cache access circuit routes addresses and data from the first processor to the first cache memory partition and routes addresses and data from the second processor to the second cache memory partition. Thus, the first and second processors may use the first and second cache memory partitions, respectively, as dedicated 2-way associative caches.

[0010] In another mode, one processor is set as the active processor, and the other processor is set as the inactive processor. In this one-thread mode, the cache access circuit allows the active processor to use both the first and second cache memory partitions. For example, during cache read operations, the cache access circuit provides addresses from the active processor to both the first and second cache memory partitions, and returns matching data from the first and second cache memory partitions to the active processor. Similarly, during cache write operations, the cache access circuit routes addresses and data from the active processor to the first and second cache memory partitions. In this manner, the active processor may collectively use the first and second cache memory partitions as a 4-way associative cache.

[0011] The ability to dynamically allocate cache resources between multiple processors advantageously allows the entire cache memory to be used, irrespective of whether one or both processors are currently active, thereby maximizing cache resources while allowing for both one-thread and two-thread execution modes. In addition, the present invention may be used to maximize cache resources when one of the on-board processors is defective. For example, if one processor is found to be defective during testing, it may be set as inactive, and the cache access circuit may allocate the entire cache memory to the other processor.
BRIEF DESCRIPTION OF THE DRAWINGS

[0012] FIG. 1 is a block diagram of a computer system within which embodiments of the present invention may be implemented;

[0013] FIG. 2 is a block diagram of a multi-processor unit having a dynamically allocated cache memory in accordance with the present invention;

[0014] FIG. 3 is a state diagram illustrating state transitions for the multi-processor unit of FIG. 2; and

[0015] FIG. 4 is a block diagram of one embodiment of the multi-processor unit of FIG. 2.

[0016] Like reference numerals refer to corresponding parts throughout the drawing figures.
DETAILED DESCRIPTION

[0017] The present invention is described below with reference to an MPU having two processors for simplicity only. It is to be understood that embodiments of the present invention are equally applicable to MPUs having any number of processors. Further, although described as having 2-way associative cache memory partitions, the dynamically allocated cache memory of the present invention may be configured for any desired level of associativity. In addition, the particular logic levels assigned to signals discussed herein are arbitrary and, thus, may be reversed where desirable. Accordingly, the present invention is not to be construed as limited to the specific examples described herein but rather includes within its scope all embodiments defined by the appended claims.
[0018] FIG. 1 shows a computer system 10 within which embodiments of the present invention may be implemented. System 10 is shown to include four MPUs 11 connected to each other and to a main memory 12, an input/output (I/O) device 13, and a network 14 via a system bus 15. Main memory 12 is shared by MPUs 11, and may be any suitable random access memory (RAM) such as, for example, DRAM. I/O device 13 allows a user to interact with system 10, and may include, for example, a computer monitor, keyboard, and/or mouse. Network 14 may be any suitable network such as, for example, a local area network, a wide area network, and/or the Internet. Additional devices may be connected to the system bus 15 as desired.
[0019] FIG. 2 shows an MPU 20 that is one embodiment of MPU 11 of FIG. 1. MPU 20 is shown to include first and second processors such as central processing units (CPUs) 21a-21b, a cache access circuit 22, and a dynamically allocated cache memory 23. CPUs 21a-21b are well-known processing devices. Cache memory 23 is partitioned into first and second cache memory partitions 23a-23b, and is preferably a high-speed cache memory device such as SRAM, although other cache devices may be used. For the purpose of discussion herein, each cache memory partition 23a-23b is configured as a 2-way associative cache memory. Of course, in actual embodiments, the cache memory partitions may be configured for other levels of associativity.

[0020] Cache access circuit 22 selectively couples the first and/or second CPUs 21a-21b to the first and/or second cache memory partitions 23a-23b. As explained in detail below, cache access circuit 22 allows the resources of cache memory 23 to be dynamically allocated between the first and second CPUs 21a-21b according to each CPU's processing requirements to more efficiently utilize cache resources.
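Although the patent describes hardware, the allocation behavior is easy to model in software. The following C sketch defines hypothetical identifiers used by the illustrative fragments later in this description; all names are ours, not the patent's.

#include <stdbool.h>

/* Hypothetical software model of MPU 20. The mode signal M,
 * described below with reference to FIG. 3, selects between the
 * one-thread (1T) and two-thread (2T) states. */
typedef enum { MODE_1T = 0, MODE_2T = 1 } mpu_mode_t;

enum { CPU_21A = 0, CPU_21B = 1 };             /* CPUs 21a-21b       */
enum { PARTITION_23A = 0, PARTITION_23B = 1 }; /* partitions 23a-23b */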
[0021] Referring also to FIG. 1, system 10 includes well-known system operating software that assigns tasks of one or more computer programs running thereon to the various MPUs 20 for execution. The operating software, which is often referred to as the system kernel, also assigns tasks between the CPUs 21a-21b of each MPU 20. For applications that include a single instruction execution thread and are thus best executed using only one CPU 21, e.g., for applications having highly sequential instruction code, the kernel assigns all the tasks to one CPU and idles the other CPU. Conversely, for applications that can be divided into two parallel instruction execution threads, e.g., for applications having parallel execution loops, the kernel may assign different threads to CPUs 21a-21b for simultaneous execution therein.
[0022] FIG. 3 illustrates state transitions of MPU 20 between a one-thread (1T) state and a two-thread (2T) state. In one embodiment, upon power-up of MPU 20, the kernel sets a mode signal M=0 to initialize MPU 20 to the 1T state. The kernel sets one of the CPUs 21 to an active state and sets the other CPU 21 to an inactive state. For purposes of discussion herein, during the 1T state the kernel sets CPU 21a as the active CPU and sets CPU 21b as the inactive CPU, although in other embodiments the kernel may set CPU 21b as the active CPU and set CPU 21a as the inactive CPU. While in the 1T state, the kernel assigns tasks of the computer program(s) only to the active CPU 21a, while the other CPU 21b remains idle. In response to M=0, cache access circuit 22 couples the first CPU 21a to both the first and second cache memory partitions 23a-23b to allow the first CPU 21a to use all resources of cache memory 23. In this state, the active CPU 21a may use cache memory partitions 23a-23b as a 4-way associative cache memory.
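Continuing the sketch above, the 1T/2T allocation policy might be expressed as the following predicate (a hedged illustration assuming CPU 21a is the active CPU, as in the discussion here; this is not the patent's circuit):

/* True if the given CPU may access the given cache partition in
 * the given mode. In the 1T state the active CPU (assumed here to
 * be CPU 21a) owns both partitions; in the 2T state each CPU owns
 * only its corresponding partition. */
bool cpu_may_access_partition(mpu_mode_t m, int cpu, int partition)
{
    if (m == MODE_1T)
        return cpu == CPU_21A;  /* active CPU uses 23a and 23b */
    return cpu == partition;    /* 2T: dedicated partitions    */
}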
[0023] If, during execution of the computer program(s), the kernel determines that certain tasks may be executed in parallel, and thus may be divided into 2 threads, the kernel may transition MPU 20 to the 2T state by changing the mode signal to M=1. When M=1, the kernel sets both CPUs 21a-21b to the active state, and thereafter assigns one execution thread to CPU 21a and another execution thread to CPU 21b in a well-known manner. In response to M=1, dirty data in cache memory partition 23b is written back to main memory 12 using a well-known writeback operation, thereby flushing cache memory partition 23b. The cache access circuit 22 couples the first CPU 21a to the first cache memory partition 23a for exclusive access thereto, and couples the second CPU 21b to the second cache memory partition 23b for exclusive access thereto. In this state, CPU 21a may use cache memory partition 23a as a dedicated 2-way associative cache memory, and CPU 21b may use cache memory partition 23b as a dedicated 2-way associative cache memory.
[0024] Thereafter, if the kernel determines that only one of CPUs 21a-21b is necessary for a particular instruction code sequence, the kernel may transition MPU 20 to the 1T state by changing the mode signal to M=0, flushing the second cache memory partition 23b, and then assigning execution of the instruction code sequence to the active CPU 21a.
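The flush-on-transition behavior of paragraphs [0023] and [0024] can be sketched as below; flush_partition() is an assumed stand-in for the well-known writeback operation, not a function the patent defines.

/* Stand-in for the writeback operation: dirty lines of the given
 * partition are written back to main memory 12. */
static void flush_partition(int partition)
{
    (void)partition;  /* a real model would walk the partition */
}

/* Transition MPU 20 between the 1T and 2T states. On either
 * transition, partition 23b changes ownership, so it is flushed
 * before the new mode takes effect. */
static void set_mode(mpu_mode_t *m, mpu_mode_t next)
{
    if (*m == next)
        return;
    flush_partition(PARTITION_23B);
    *m = next;
}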
[0025] By dynamically allocating resources of cache memory 23 in response to specific needs of associated CPUs 21a-21b, embodiments of the present invention maximize cache performance by ensuring that both cache memory partitions 23a-23b are utilized, irrespective of whether one or both CPUs 21a-21b are active. Thus, in the 1T state, both cache memory partitions 23a-23b are allocated to the active CPU, and in the 2T state, each cache memory partition 23a and 23b is allocated only to its corresponding CPU 21a and 21b, respectively. Since allocation of cache memory partitions 23a-23b is controlled by cache access circuit 22, cache memory 23 does not require any special hardware, and thus may be of conventional architecture. Further, since cache memory 23 is not shared between CPUs 21a-21b, all snoop operations may be performed at the system level. As a result, the cache controllers (not shown in FIG. 2) in CPUs 21a-21b are much simpler and occupy less silicon area than cache controllers for shared cache memory systems.
[0026] The ability to dynamically allocate cache resources is also useful in situations where portions of MPU 20 are defective. For example, during testing of MPU 20, if CPU 21b is found to be defective or otherwise unusable, the kernel may be configured to maintain MPU 20 in the 1T state, where CPU 21a is the active CPU and has access to both cache memory partitions 23a-23b, and CPU 21b is inactive. Thus, in contrast to MPUs that have dedicated cache memory for each on-board CPU, the failure of one CPU 21 on MPU 20 does not render any part of cache memory 23 inaccessible.
[0027] FIG. 4 shows an MPU 40 that is one embodiment of MPU 20, and includes CPUs 21a-21b, cache access circuit 22, and cache memory partitions 23a-23b. Each CPU 21 is shown to include a CPU core 41 and a cache controller 42. Each cache controller 42, which may be of conventional architecture, transfers addresses and data between its associated CPU core 41 and cache access circuit 22, and includes (or is associated with) a memory element 43. Memory element 43 may be any suitable memory device including, for example, a register or memory cell. Although shown in FIG. 4 as being internal to cache controller 42, memory element 43 may be external to cache controller 42. CPU core 41 includes other well-known elements of CPU 21 including, for instance, L1 cache memory, instruction units, fetch and decode units, execution units, register files, write cache(s), and so on.
[0028] Cache memory partition 23a includes two data RAM arrays 51-52 having corresponding searchable tag arrays 61-62, respectively, while cache memory partition 23b includes two data RAM arrays 53-54 having corresponding searchable tag arrays 63-64, respectively. Cache memory partition 23a includes a well-known address converter 56a that converts a main memory address received from cache access circuit 22 into a cache address that is used to concurrently address the tag arrays 61-62 and the data arrays 51-52. Similarly, cache partition 23b includes a well-known address converter 56b that converts an address received from cache access circuit 22 into a cache address that is used to concurrently address the tag arrays 63-64 and the data arrays 53-54.

[0029] Data arrays 51-54 each include a plurality of cache lines for storing data retrieved from main memory 12. Cache lines in data arrays 51-54 may be any suitable length. In one embodiment, each cache line of data arrays 51-54 stores 32 bytes of data. Each data array 51-54 also includes a well-known address decoder (not shown for simplicity) that selects a cache line for read and write operations in response to a received cache index. Data arrays 51-52 provide data at a selected cache line to a MUX 57a, and data arrays 53-54 provide data at a selected cache line to a MUX 57b.
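As a rough software analogue of address converters 56a-56b, a main memory address can be split into tag, index, and offset fields. The 32-byte line size comes from the embodiment above; the 256-set array depth is purely our assumption, since the patent does not give array dimensions.

#include <stdint.h>

#define LINE_BYTES 32u   /* cache line size in this embodiment  */
#define NUM_SETS  256u   /* assumed array depth (not specified) */

typedef struct {
    uint32_t tag;    /* compared against tag arrays 61-64 */
    uint32_t index;  /* selects a cache line in each array */
    uint32_t offset; /* byte offset within the cache line  */
} cache_addr_t;

/* Hypothetical analogue of address converters 56a-56b. */
static cache_addr_t convert_address(uint32_t main_addr)
{
    cache_addr_t a;
    a.offset = main_addr % LINE_BYTES;
    a.index  = (main_addr / LINE_BYTES) % NUM_SETS;
    a.tag    = main_addr / (LINE_BYTES * NUM_SETS);
    return a;
}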
[0030] Tag arrays 61-64 each include a plurality of lines for storing tag information for corresponding cache lines in data arrays 51-54, respectively. Tag arrays 61-62 provide tags at the selected cache line to a comparator 58a which, in response to a comparison with a tag address received from address converter 56a, generates a select signal for MUX 57a. Similarly, tag arrays 63-64 provide tags at the selected cache line to a comparator 58b which, in response to a comparison with a tag address received from address converter 56b, generates a select signal for MUX 57b. Comparators 58a and 58b are well-known.
[0031] Cache access circuit 22 is shown to include four multiplexers (MUXes) 44-47, two AND gates 48a and 48b, and two comparators 49a and 49b, although after reading this disclosure it will be evident to those skilled in the art that various other logic configurations may be used to selectively route addresses and data between MPU 20 and cache memory 23. MUXes 44-45 selectively provide address information from CPUs 21a-21b to cache memory partitions 23a-23b, respectively, and MUXes 46-47 selectively provide data from cache memory partitions 23a-23b to CPUs 21a-21b, respectively. MUXes 44-45 are controlled by control signals C44 and C45, respectively. MUX 46 is controlled by AND gate 48a, which includes a first input terminal coupled to receive a control signal C46 and a second input terminal coupled to comparator 49a. Comparator 49a includes input terminals coupled to receive select signals from comparators 58a and 58b of cache memory 23. MUX 47 is controlled by AND gate 48b, which includes a first input terminal coupled to receive a control signal C47 and a second input terminal coupled to comparator 49b. Comparator 49b includes input terminals coupled to receive select signals from comparators 58a and 58b of cache memory 23. Comparators 49a and 49b are well-known. Values for signals C44 and C46 may be stored in memory 43a of cache controller 42a, and values for signals C45 and C47 may be stored in memory 43b of cache controller 42b.

[0032] Specifically, MUX 44 selectively provides address and data information to cache memory partition 23a from either CPU 21a or CPU 21b in response to C44, and MUX 45 selectively provides address and data information to cache memory partition 23b from either CPU 21a or CPU 21b in response to C45. MUX 46 selectively returns data to CPU 21a from either cache memory partition 23a or 23b in response to AND gate 48a, and MUX 47 selectively returns data to CPU 21b from either cache memory partition 23a or 23b in response to AND gate 48b.
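The steering performed by MUXes 44-47 and AND gates 48a-48b can be modeled combinationally, as in the sketch below. The encodings follow paragraphs [0034] and [0038] (a select of 0 picks the partition's "own" CPU); hit_in_23b stands in for the output of comparator 49a, and all function names are ours.

typedef struct { int c44, c45, c46, c47; } ctrl_signals_t;

/* MUX 44: which CPU drives address converter 56a of partition 23a. */
static int partition_23a_source(ctrl_signals_t c)
{
    return c.c44 ? CPU_21B : CPU_21A;
}

/* MUX 45: which CPU drives address converter 56b of partition 23b. */
static int partition_23b_source(ctrl_signals_t c)
{
    return c.c45 ? CPU_21A : CPU_21B;
}

/* MUX 46: which partition's data returns to CPU 21a. AND gate 48a
 * gates the comparator 49a result with C46, so C46=0 always picks
 * partition 23a and C46=1 lets a hit in 23b win. */
static int cpu_21a_data_source(ctrl_signals_t c, int hit_in_23b)
{
    return (c.c46 & hit_in_23b) ? PARTITION_23B : PARTITION_23A;
}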
[0033] For simplicity, MUXes 44-45 are shown in FIG. 4 as routing both address and data information to cache memory partitions 23a-23b, respectively. However, in other embodiments, cache access circuit 22 may include a duplicate set of MUXes to route data to respective cache memory partitions 23a-23b, in which case MUXes 44-45 route only address information to respective cache memory partitions 23a-23b.

[0034] When MPU 20 is in the 2T state (e.g., when M=1), each CPU 21a-21b is processing its own instruction thread, and the kernel sets signals C44-C47 to logic low (i.e., logic 0) to simultaneously provide CPU 21a with exclusive use of cache memory partition 23a and to provide CPU 21b with exclusive use of cache memory partition 23b. Thus, C44=0 forces MUX 44 to provide an address or data from CPU 21a to cache memory partition 23a, C45=0 forces MUX 45 to provide an address or data from CPU 21b to cache memory partition 23b, C46=0 forces the output of AND gate 48a to logic 0 to force MUX 46 to provide data from cache memory partition 23a to CPU 21a, and C47=0 forces the output of AND gate 48b to logic 0 to force MUX 47 to provide data from cache memory partition 23b to CPU 21b.
[0035] To request data from cache memory partition 23a, CPU 21a provides a main memory address to address converter 56a via MUX 44. Address converter 56a converts the main memory address to a cache address that includes a tag address and a cache index. The cache index is used to select a cache line in data arrays 51-52 and associated tag arrays 61-62. If there is data stored at the selected cache line in data arrays 51 and/or 52, the data is read out to MUX 57a. Also, the tag fields from the selected line of tag arrays 61-62 are read out to comparator 58a, which also receives the tag address from address converter 56a. Comparator 58a compares the tag address with the tag fields provided by tag arrays 61-62, and in response thereto provides a select signal to MUX 57a that selects whether data from data array 51 or 52 (or neither, if there is no matching data) is read out to MUX 46 of cache access circuit 22. Since C46=0, MUX 46 provides matching data from cache memory partition 23a to cache controller 42a of CPU 21a.
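Functionally, each partition performs a 2-way set-associative lookup: the cache index selects one line per way, the stored tags are compared against the tag address, and at most one way's data is returned. A minimal sketch, reusing the types from the earlier fragments (array sizes remain assumptions):

#include <stddef.h>

typedef struct {
    bool     valid[2][NUM_SETS];            /* one bit per way/set */
    uint32_t tag[2][NUM_SETS];              /* tag arrays 61-62    */
    uint8_t  data[2][NUM_SETS][LINE_BYTES]; /* data arrays 51-52   */
} partition_t;

/* Models comparator 58a steering MUX 57a: returns the matching
 * cache line, or NULL when neither way's tag matches. */
static const uint8_t *partition_lookup(const partition_t *p, cache_addr_t a)
{
    for (int way = 0; way < 2; way++)
        if (p->valid[way][a.index] && p->tag[way][a.index] == a.tag)
            return p->data[way][a.index];
    return NULL;
}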
[0036] CPU 21b may simultaneously request data from cache memory partition 23b in a similar manner. Thus, a main memory address provided by CPU 21b to address converter 56b via MUX 45 is converted into a cache address that includes a tag address and a cache index. The cache index selects a cache line in data arrays 53-54 and associated tag arrays 63-64. If there is data stored at the selected cache line in data arrays 53 and/or 54, the data is read out to MUX 57b. Also, the tag fields from the selected line of tag arrays 63-64 are read out to comparator 58b, which also receives the tag address from address converter 56b. Comparator 58b compares the tag address with the tag fields provided by tag arrays 63-64, and in response thereto provides a select signal to MUX 57b that selects whether data from data array 53 or 54 (or neither, if there is no matching data) is read out to MUX 47 of cache access circuit 22. Since C47=0, MUX 47 provides matching data from cache memory partition 23b to cache controller 42b of CPU 21b.

[0037] In this manner, CPU 21a may use cache memory partition 23a as a dedicated 2-way associative cache while CPU 21b simultaneously and independently uses cache memory partition 23b as a dedicated 2-way associative cache.
[0038] When MPU 20 transitions to the 1T state (e.g., M=0), the kernel sets CPU 21a as the active CPU and sets CPU 21b as the inactive CPU (as mentioned earlier, in other embodiments the kernel may set CPU 21b as the active CPU and set CPU 21a as the inactive CPU). The kernel also sets signal C44 to logic low and sets signals C45-C46 to logic high (i.e., logic 1) to provide CPU 21a with use of both cache memory partitions 23a-23b. Thus, C44=0 forces MUX 44 to provide an address or data from CPU 21a to cache memory partition 23a, C45=1 forces MUX 45 to provide the same address or data from CPU 21a to cache memory partition 23b, and C46=1 allows a result signal from comparator 49a to select whether data from cache memory partition 23a or 23b is returned to CPU 21a. Since CPU 21b is inactive, C47 is a don't care (d/c) for M=0.

[0039] To request data from both cache memory partitions 23a-23b, CPU 21a provides a main memory address to address converter 56a via MUX 44 and to address converter 56b via MUX 45. Thus, the cache address is provided to data arrays 51-54 and to tag arrays 61-64. Data arrays 51-52 read out the selected cache line to MUX 57a, and tag arrays 61-62 read out corresponding tag fields to comparator 58a. Comparator 58a compares the tag fields with the tag address received from address converter 56a, and selects which data (if any) MUX 57a forwards to MUX 46. Similarly, data arrays 53-54 read out the selected cache line to MUX 57b, and tag arrays 63-64 read out corresponding tag fields to comparator 58b. Comparator 58b compares the tag fields with the tag address received from address converter 56b, and selects which data (if any) MUX 57b forwards to MUX 46.

[0040] The select signals provided by comparators 58a and 58b are compared in comparator 49a to generate a select signal that is provided to MUX 46 via AND gate 48a to select which data (if any) is returned to CPU 21a. Thus, if there is matching data in either cache memory partition 23a or cache memory partition 23b, it is returned to CPU 21a via MUX 46. In this manner, data arrays 51-54 provide a 4-way associative cache memory for CPU 21a. Values for control signals C44-C47 for the 1T and 2T states are summarized below in Table 1.
TABLE 1

mode    C44    C45    C46    C47
 1T      0      1      1     d/c
 2T      0      0      0      0
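Combining the fragments above, the 1T state's effective 4-way behavior reduces to probing both partitions with the same address and returning whichever hit comparator 49a reports, e.g.:

/* 1T state: MUXes 44-45 send the same address to both partitions,
 * and at most one partition hits; AND gate 48a and comparator 49a
 * then steer MUX 46 to the matching data. */
static const uint8_t *lookup_1t(const partition_t *p23a,
                                const partition_t *p23b,
                                uint32_t main_addr)
{
    cache_addr_t a = convert_address(main_addr);
    const uint8_t *d = partition_lookup(p23a, a);
    return d ? d : partition_lookup(p23b, a);
}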
[0041] As discussed above, the ability to easily transition between using cache memory 23 as two dedicated 2-way associative cache memories for respective CPUs 21a-21b, and using cache memory 23 as a 4-way associative memory for only one CPU 21a, advantageously allows for use of the entire cache memory 23, irrespective of whether MPU 20 is executing one or two threads, and thereby maximizes the effectiveness of cache memory 23. Further, since CPUs 21a-21b do not simultaneously share access to the same data in cache memory 23, cache controllers 42a and 42b do not need to perform separate chip-level snoop operations, and thus are much simpler and occupy less silicon area than cache controllers for a shared cache memory system.

[0042] While particular embodiments of the present invention have been shown and described, it will be obvious to those skilled in the art that changes and modifications may be made without departing from this invention in its broader aspects and, therefore, the appended claims are to encompass within their scope all such changes and modifications as fall within the true spirit and scope of this invention. For example, although described above as having two partitions, in actual embodiments cache memory 23 may have any number of partitions.
I claim:

1. A method of dynamically allocating a cache memory between first and second processors, comprising:

partitioning the cache memory into first and second partitions;

in a first mode, allocating the first cache memory partition for exclusive use by the first processor, and allocating the second cache memory partition for exclusive use by the second processor; and

in a second mode, allocating the first cache memory partition and the second cache memory partition for exclusive use by the first processor.

2. The method of claim 1, further comprising:

during the first mode, providing access to the first cache memory partition for the first processor and providing access to the second cache memory partition for the second processor.

3. The method of claim 2, wherein during the first mode each cache memory partition operates as a 2-way associative cache memory.

4. The method of claim 1, further comprising:

during the second mode, providing access to the first cache memory partition and the second cache memory partition for the first processor.

5. The method of claim 3, wherein during the second mode the first and second cache memory partitions collectively operate as a 4-way associative cache memory.

6. The method of claim 4, further comprising:

flushing the second cache partition during the second mode.

7. The method of claim 4, further comprising:

setting the second processor to an inactive state during the second mode.

8. A method of dynamically allocating a cache memory between first and second processors, comprising:

partitioning the cache memory into first and second partitions;

selectively coupling the first cache memory partition to the first and second processors in response to a mode signal; and

selectively coupling the second cache memory partition to the first and second processors in response to the mode signal.

9. The method of claim 8, further comprising:

when the mode signal is in a first state,

allocating the first cache memory partition as dedicated cache memory for the first processor, and

allocating the second cache memory partition as dedicated cache memory for the second processor.

10. The method of claim 9, wherein during the first state, each cache memory partition operates as a 2-way associative cache memory.

11. The method of claim 9, further comprising:

when the mode signal is in a second state, allocating the first and second cache memory partitions as cache memory for the first processor.

12. The method of claim 11, wherein during the second state, the first and second cache memory partitions collectively operate as a 4-way associative cache memory.

13. The method of claim 11, further comprising:

setting the second processor to an inactive state during the second state.

14. The method of claim 11, further comprising:

flushing the second cache partition during the second state.

15. A multi-processor system, comprising:

a first processor;

a second processor;

a cache memory including first and second partitions; and

a cache access circuit for selectively coupling the first cache memory partition to the first and second processors, and for selectively coupling the second cache memory partition to
