`
`ARM Ex. 1015
`IPR Petition - USP 5,463,750
`
`
`
U.S. Patent
`
`May 2, 1995
`
`Sheet 1 of 3
`
`5,412,787
`
[FIG. 1 drawing sheet; OCR labels not recoverable]
`ARM_VPT_IPR_00000526
`
`
`
`
U.S. Patent
`
`May 2, 1995
`
`Sheet 2 of 3
`
`5,412,787
`
[FIG. 2 drawing sheet; OCR labels not recoverable]
`ARM_VPT_IPR_00000527
`
`
`
`
`
`
`
U.S. Patent
`
`May 2, 1995
`
`Sheet 3 of 3
`
`5,412,787
`
[FIG. 3 drawing sheet: TAGS; TLB [0] - WORD 0; TLB [0] - WORD 1; TLB [0] - WORD 2; TLB [0] - WORD 3; 16k; remaining labels not recoverable]
`ARM_VPT_IPR_00000528
`
`
`
`1
`
`5,412,787
`
TWO-LEVEL TLB HAVING THE SECOND LEVEL
TLB IMPLEMENTED IN CACHE TAG RAMS
`
`This is a continuation of application Ser. No.
`07/616,540 filed on Nov. 21, 1990, now abandoned.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`
`The present invention relates to a two-level transla-
`tion lookaside buffer (TLB), and more particularly, to a
`two-level TLB having a first level TLB on the CPU
`and a second level TLB residing in otherwise unused
`portions of the cache tag RAMs.
`2. Description of the Prior Art
Conventional computer processing systems fre-
quently include a very large main memory address
`space that a user accesses via virtual memory addresses.
`These virtual memory addresses are typically con-
`verted to physical memory addresses by any of a num-
`ber of techniques so that the processor can access the
`desired information stored in the main memory. How-
`ever, since access to the main memory is often quite
`time consuming, many computer systems employ a
`cache memory for interfacing the main memory to the
`processor. The cache memory typically includes the
`memory pages and associated tags most likely to be
`asked for by the processor, and since the cache memory
`is typically small and located proximate the processor, it
`can be accessed much faster than the main memory.
`Cache memories thus help to significantly improve
`processing speed for typical applications.
Certain conventional cache memories comprise a
high speed data RAM and a parallel high speed tag
RAM. The address of each entry in the cache is gener-
ally the same as the low order portion of the main mem-
ory address to which the entry corresponds, where the
high order portion of the main memory address is
stored in the tag RAM as tag data. Thus, if main mem-
ory has 2^m blocks of 2^n words each, the i'th word in the
cache data RAM will be a copy of the i'th word of one
of the 2^m blocks in main memory, and the identity of
`that block is stored in the tag RAM. Then, when the
`processor requests data from memory, the low order
`portion of the address is supplied as an address to both
`the cache data and tag RAMs. The tag for the selected
`cache entry is compared with the high order portion of
`the processor’s address and, if it matches, the data from
`the cache data RAM is enabled onto the data bus. If the
`
`tag does not match the high order portion of the proces-
`sor’s address, then the data is fetched from main mem-
`ory. The data is also placed in the cache for potential
`future use, overwriting the previous entry. On a data
`write from the processor, either the cache RAM or
`main memory or both may be updated, and flags such as
`“data valid” and “data dirty” may be used to indicate to
`one device that a write has occurred in the other. The
`
`use of such a small, high speed cache in the computer
`design permits the use of relatively slow but inexpensive
`RAM for the large main memory space without sacri-
ficing processing speed.
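The direct-mapped lookup described above can be sketched as follows (a simplified behavioral model only, not the patent's circuitry; the sizes and names such as CACHE_BLOCKS are illustrative):

```python
# Simplified direct-mapped cache model: the low-order address bits index
# the cache, and the high-order bits are compared against the stored tag.
CACHE_BLOCKS = 8  # illustrative size; real caches are far larger

data_ram = [None] * CACHE_BLOCKS   # cached words
tag_ram = [None] * CACHE_BLOCKS    # high-order address bits per entry
main_memory = {addr: addr * 10 for addr in range(64)}  # toy backing store

def read(addr):
    index = addr % CACHE_BLOCKS        # low-order portion of address
    tag = addr // CACHE_BLOCKS         # high-order portion of address
    if tag_ram[index] == tag:          # tag matches: cache hit
        return data_ram[index], True
    # Miss: fetch from main memory and overwrite the previous entry.
    word = main_memory[addr]
    data_ram[index], tag_ram[index] = word, tag
    return word, False
```

Note that two addresses differing only in their high-order bits map to the same index, so a second access to either one after the other is a miss, which motivates the set-associative arrangement discussed later.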
`An address translation unit is used in conjunction
`with cache memories in such virtual memory systems
`for performing the aforementioned virtual to physical
`address translation. Generally, the address translation
`unit provides a map to the desired page of main mem-
`ory, and such a map is typically stored as a page table.
`To increase the speed of access to entries in the page
`
`5
`
`10
`
`15
`
`2f}
`
`25
`
`30
`
`35
`
`4O
`
`45
`
`55
`
`65
`
`ARM_VPT_IPR_00000529
`
`2
`table, and hence the speed of address translation, trans-
`lation lookaside buffers (TLBs) are often employed
`with or in place of the page tables. TLBs generally
operate as caches for the page tables and, when used,
`allow faster access to the page tables. The TLBs, as
`with the data caches, are typically small and may be
`located proximate to or actually on the processor chip.
`The speed of the processor can thus be improved with-
`out significantly increasing its chip area. A conven-
`tional address translation system is described, for exam-
ple, by Moussouris et al. in U.S. Pat. No. 4,953,073.
`Most existing reduced instruction set computer
`(RISC) designs use only a single level TLB. However,
`one known design implements a second-level TLB in
`the cache data array in order to take advantage of the
`system clock frequencies available with modern VLSI
`technologies. For example, in the MIPS RC6280 CPU,
`the primary cache is split between instructions and data,
`is direct mapped, virtually addressed, and write
through. The second level cache is unified, two-way set
associative, physically addressed, and copy back. The
principal tags for the second level cache are virtual,
rather than physical, so as to facilitate a small on-chip
TLB. The CPU chip contains the virtual address gener-
ation logic which forms the index into the primary
cache, a 96-bit first level TLB which is used to form the
`index into the second level cache, and the cache control
`state machines which handle the management of the
`cache subsystem and memory mapping functions.
The MIPS RC6280 CPU utilizes a two-level TLB so
that the address translation may be provided on-chip.
The first level TLB comprises a 16-entry, 96-bit “TLB-
slice” located on the CPU chip, while the second level
TLB backs up the first level TLB with a 4,096-entry full
TLB stored in the second level cache. The TLB-slice
consists of two direct mapped tables (one for instruc-
`tions, one for data) which deliver just enough bits of the
`physical page number to complete the second level
`cache index. On the other hand, the second-level TLB
`is disposed in a reserved section of the second level
`cache in order to simplify TLB miss software. How-
`ever, by implementing the second-level TLB in the
`cache data array in this manner, the maximum amount
`of cache data memory available is thereby limited. Of
`course, this adversely affects system performance.
`Accordingly, it is desired to design a two-level TLB
so that address translation may be performed on-chip
`without limiting the maximum amount of cache data
`memory available. The present invention has been de-
`signed to meet this need.
SUMMARY OF THE INVENTION
`
`The present invention relates to a processor which
`implements two levels of translation lookaside buffers
`(TLBs). Separate TLBs are provided for instructions
`(ITLB) and data (DTLB). The ITLB and DTLB each
have first-level TLBs which are small, two-set associa-
`tive, have a one-cycle access time and reside on the
`CPU chip. The second-level TLBs, on the other hand,
`are large, direct mapped, have seven-cycle access time,
`and reside in otherwise unused portions of the cache tag
`RAMs. By so combining the second-level TLBs with
`the cache tag RAMs, a single set of external memory
`devices may be used to serve both functions.
`The present invention thus relates to a computer
`system which has a memory for storing blocks of data
`and processing means for providing virtual addresses of
`data in the memory which is to be processed. In accor-
`
`
`
`5,412,787
`
`3
`dance with the invention, virtual address translation is
`performed using a first-level translation lookaside buffer
`(TLB) which stores address translation information for
`translating the virtual addresses to physical addresses of
`the data in the memory, and tag memory means divided
`into a first area for storing data tag information corre-
`sponding to the blocks of data in the memory and a
`second area for storing a second-level translation looka-
`side buffer (TLB) having address translation informa-
`tion which is accessed for a virtual to physical address
`translation when the address translation information for
`
`making the virtual to physical address translation is not
`available in the first-level translation lookaside buffer.
Thus, in accordance with the invention, the second-
`level TLB is incorporated into unused portions of the
`tag memory means. Preferably, the first and second
`areas are one of the lower or upper half of the tag mem-
`ory means such that the processing means can access
the second-level TLB by specifying a most significant
address bit of an address to the tag memory means
which corresponds to the second area. Also, the se-
cond-level TLB is preferably direct mapped to the
memory means.
`
In accordance with the invention, the processing
means and the first-level TLB are located proximate
`each other on a semiconductor chip. The first-level
`TLB also may be two-set associative. In such an em-
`bodiment, the memory preferably comprises first and
`second data RAMs while the tag memory means com-
`prises first and second tag RAMs corresponding to the
`first and second data RAMs. Each entry in the second-
`level TLB preferably occupies two data words in the
`second area in each of the first and second tag RAMs so
`as to allow parallel access. In a preferred embodiment,
`each entry in the second-level TLB occupies two data
`words of the second area in each of the first and second
`tag RAMs. Also, in such a preferred embodiment, the
`tag memory means has an addressable data size approxi-
`mately one-fourth to one-half that of the memory to
`which it corresponds.
`In accordance with another aspect of the invention,
`the computer system has separate TLBs for instruction
`and data. Such a computer system in accordance with
`the invention preferably comprises first memory means
`for storing blocks of data, second memory means for
`storing blocks of instruction data and processing means
`for providing virtual addresses of the data in the first
`memory means and virtual addresses of the instruction
`data in the second memory means. Address translation
`is preferably performed using a first-level data transla-
`tion lookaside buffer (DTLB) for storing address trans-
`lation information for translating the virtual addresses
`to physical addresses of the data in the first memory
`means and a first-level instruction translation lookaside
`buffer (ITLB) for storing address translation informa-
tion for translating the virtual addresses to physical
`addresses of the instruction data in the second memory
`means. However, the invention is further characterized
`by first tag memory means divided into a first area for
`storing data tag information corresponding to the
`blocks of data in first memory means and a second area
for storing a second-level data translation lookaside
buffer (DTLB) having address translation information
`which is accessed for a virtual to physical address trans-
`lation when the address translation information for
`making the virtual to physical address translation is not
`available in the first-level DTLB. The invention also
`includes a second tag memory means divided into a first
`
`ARM_VPT_IPR_00000530
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`4
`area for storing instruction tag information correspond-
`ing to the blocks of instruction data in the second mem-
ory means and a second area for storing a second-level
instruction translation lookaside buffer (ITLB) having
`address translation information which is accessed for a
`
`virtual to physical address translation when the address
`translation information for making the virtual to physi-
`cal address translation is not available in the first-level
`ITLB.
`
`The invention also encompasses a method of translat-
ing a virtual address output by a processor into a physi-
`cal address of a memory containing blocks of data for
`processing by the processor. Such a method in accor-
`dance with the invention preferably comprises the steps
`of:
`
`(a) providing a virtual address from the processor
`corresponding to a physical address of the memory
`at which data to be processed is stored;
`(b) searching a first-level translation lookaside buffer
`(TLB) for particular address translation informa-
`tion which can be used by the processor for trans-
`lating the virtual address to the physical address,
`and if the particular address translation information
`is found, performing a virtual to physical address
`translation;
`'
(c) when the particular address translation informa-
tion cannot be found in the first-level TLB, search-
ing a second-level translation lookaside buffer
(TLB) stored in a tag memory for the particular
`address translation information, and if the particu-
`lar address translation is found, copying the partic-
`ular address translation information to the first-
`level TLB and repeating step (b); and
`(d) when the particular address translation informa-
`tion cannot be found in the second-level TLB, generat-
`ing a TLB miss signal and continuing processing of the
`processor at a TLB miss interruption vector address.
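The lookup sequence in steps (a)-(d) can be sketched as follows (a behavioral model only; the dictionary-based TLBs and the miss-vector constant are illustrative stand-ins, not the patent's hardware):

```python
# Behavioral model of the two-level TLB walk in steps (a)-(d).
TLB_MISS_VECTOR = 0x200  # illustrative interruption vector address

first_level = {}                      # small on-chip TLB: vpage -> ppage
second_level = {0x12: 0x7A}           # larger TLB held in the tag RAMs

def translate(vpage):
    # Step (b): search the first-level TLB.
    if vpage in first_level:
        return ("hit-L1", first_level[vpage])
    # Step (c): on an L1 miss, search the second-level TLB and, on a
    # hit, copy the entry up to the first level and retry step (b).
    if vpage in second_level:
        first_level[vpage] = second_level[vpage]
        return ("hit-L2", first_level[vpage])
    # Step (d): miss in both levels -> TLB miss, vector to the handler.
    return ("miss", TLB_MISS_VECTOR)
```

After a second-level hit, the copied entry makes the next access to the same page a first-level hit, so only the first reference pays the second-level penalty.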
`In accordance with the method of the invention, the
`tag memory is divided into a first area for storing data
`tag information corresponding to the blocks of data in
`the memory and a second area for storing the second-
`level TLB. In such a case, step (c) preferably comprises
`the step of specifying a most significant address bit of an
`address to the tag memory which corresponds to the
`second area. Step (c) may also comprise the step of
reading at least two words in the tag memory corre-
sponding to a single entry in the second-level TLB. In
`other words, since the TLB entries are typically larger
`than the tag entries, more than one tag RAM location
`may be used for storing a single TLB entry.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`The above and other objects and advantages of the
`invention will become more apparent and more readily
`appreciated from the following detailed description of
`the presently preferred exemplary embodiments of the
`invention taken in conjunction with the accompanying
`drawings, of which:
`FIG. 1 illustrates a high performance processing sys-
`tem having separate two-level TLBs for instructions
`and data in accordance with the invention.
`
`FIG. 2 illustrates the implementation of the second-
`level TLB in accordance with the invention.
`
`65
`
`FIG. 3 illustrates the technique by which TLB entries
`are stored in the two-set associative cache data tag
`RAMs in accordance with the invention.
`
`
`
`5
`
`5,412,787
`
`DETAILED DESCRIPTION OF THE
`PRESENTLY PREFERRED EMBODIMENT
`
`A preferred embodiment of the invention will now be
described with reference to FIGS. 1-3. It will be appre-
`ciated by those of ordinary skill in the art that the de-
`scription given herein with respect to those figures is for
`exemplary purposes only and is not intended in any way
`to limit the scope of the invention. All questions regard-
`ing the scope of the invention may be resolved by refer-
`ring to the appended claims.
`FIG. 1 illustrates a processor which implements a
`two-level TLB in accordance with the invention. As
shown, processor chip 100 includes CPU 110, a first-
level TLB for instructions (ITLB 120), instruction
cache control unit 130, a separate first-level TLB for
data (DTLB 140), and a data cache control unit 150.
The first level TLBs ITLB 120 and DTLB 140 are
preferably small, two-set associative memories having a
`one-cycle access time. As shown, ITLB 120 and DTLB
`140 reside on the processor chip 100 with the CPU 110
`and the instruction cache control unit 130 and data
`cache control unit 150 so as to allow on-chip address
`generation and translation.
As also shown in FIG. 1, the processor chip 100
communicates with an external instruction cache com-
prising instruction cache (IC) data RAMs 160 and 162
and IC tag RAMs 164 and 166. The outputs of these
respective RAMs are provided to IC multiplexer 168
`which outputs the desired instruction address in main
`memory under control of the output of ITLB 120 and
`instruction cache control unit 130. The processor chip
100 also communicates with an external data cache
comprising data RAMs 170 and 172 and data tag RAMs
174 and 176. The outputs of the respective RAMs are
`multiplexed in odd address multiplexer 178 and even
`address multiplexer 180 under control of DTLB 140
`and data cache control unit 150 so as to provide the
`desired data address to the main memory.
`The data cache/address translation system of FIG. 1
generally functions as follows. CPU 110 outputs a vir-
`tual address corresponding to a physical address of data
`in the main memory which is desired for that processing
`step. The virtual address is then converted to a physical
`address using the two-level TLBs of the invention. For
`the vast majority of instruction and data accesses, the
needed virtual address translation information will re-
`side in the first-level TLB (ITLB 120 or DTLB 140)
`and can be accessed with no processing or pipeline
`penalties. When the needed translation does not exist in
`the first-level TLB, however, CPU 110 stalls the pipe-
line and accesses the second-level TLBs. If the transla-
tion exists in the second-level TLB, it will be copied to the
`appropriate first-level TLB, ITLB 120 or DTLB 140,
`and instruction execution by CPU 110 will continue.
`However, if the translation does not exist in the second-
`level TLB either, execution will continue at the appro-
priate TLB-miss interruption vector address in accor-
`dance with known techniques. By providing a first-level
`TLB that can be accessed with no penalty and by only
`accessing the second-level TLB in the case of a first-
`level TLB miss, a high percentage of code can run on
`CPU 110 with no TLB penalty. Also, by making the
`first-level TLB two-set associative, many applications
`may be prevented from thrashing when addresses map
to the same TLB location. Moreover, since the first-
level TLBs are small, they may be placed on the proces-
sor chip 100 and accessed very quickly. On the other
`
`ARM_VPT_IPR_00000531
`
`6
`hand, by providing a large second-level TLB that can
`be accessed directly (such as through direct mapping),
`the second level TLB can be accessed with very few
penalty cycles (7 penalty cycles in a preferred embodi-
ment), and as a result, a very high TLB hit rate can be
`maintained and system degradation due to TLB misses
`kept very small even in applications that require many
`TLB entries.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`65
`
`The implementation of such a two-level TLB in the
`processor of the invention is distinguished in that the
`second-level TLB is kept in a normally unused portion
`of the cache tag RAMs 164 and 166 for the instruction
data and cache tag RAMs 174 and 176 for the data.
Thus, in accordance with the invention, the second-
level TLB is not placed in data RAMs 160, 162 or 170,
`172 as proposed in the aforementioned prior art. Rather,
`the present inventors have discovered that the second-
`level TLBs may be more advantageously placed in
`otherwise unused portions of the cache tag RAMs 164,
`166 or 174, 176. As known by those skilled in the art,
`these tag RAMs typically hold physical address infor-
`mation and various flags, such as “valid” flags, “dirty”
`flags, and “shared bit” flags. Since the tag RAMs typi-
`cally do not use a large amount of memory space com-
`pared to their corresponding data RAMs, a sufficient
`amount of memory space is typically left over to allow
`the second-level TLBs to be incorporated therein as
`proposed by the present inventors without performance
degradation caused by limiting the maximum amount of
data cache memory area as in the prior art device dis-
cussed above.
`
For example, as shown in FIG. 2, the IC data RAM
`entries are typically organized into blocks so that each
`block of data requires only one entry in the correspond-
`ing tag RAMs. Thus, if the same depth RAMs are used
`for data and tag entries, and if each block consists of
`more than one data entry, then there will be unused
`portions of the tag RAMs. Typically, even when the
`depth of the tag RAMs is smaller than the depth of the
corresponding data RAMs, portions of the tag RAMs
`are still typically unused.
In the example illustrated in FIG. 2, the instruction
cache data RAMs 160 and 162 are 256-kbyte RAMs
organized into 4-byte (32 bit) word entries and 8k of
32-byte blocks. Thus, if eight “64k by 4” RAM parts are
used to realize the 256-kbyte instruction cache, then 8k
tag entries are needed. In other words, one entry is put
into the tag RAMs 164a and 166a for each 32 byte block
of instruction data in IC data RAMs 160 and 162. Ac-
cordingly, even if a “16k by 4” RAM is used for the
tag RAMs 164 and 166, then only half of the tag depth
is needed for the instruction tag entries. The remainder
of the IC tag RAMs 164 and 166 may hence be used for
the second-level instruction TLBs 164b and 166b in
`
`accordance with the invention without giving up other-
`wise used cache memory space. In other words, the
`unneeded portion of IC tag RAMs 164 and 166 may be
`used to hold the second-level TLB entries without pro-
`cessor performance penalty.
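The sizing arithmetic of the FIG. 2 example can be checked with a short sketch (the constants mirror the example's 256-kbyte cache, 32-byte blocks, and 16k-deep tag RAM):

```python
# Worked arithmetic for the FIG. 2 example: how much tag-RAM depth is
# left over for the second-level TLB.
cache_bytes = 256 * 1024      # 256-kbyte instruction cache
block_bytes = 32              # one tag entry per 32-byte block
tag_depth = 16 * 1024         # depth of a "16k by 4" tag RAM part

tag_entries_needed = cache_bytes // block_bytes   # one per block: 8k
spare_tag_words = tag_depth - tag_entries_needed  # free for the TLB

print(tag_entries_needed, spare_tag_words)  # 8192 8192
```

Exactly half the tag depth is spare, matching the text's observation that the upper half of each IC tag RAM is available for the second-level TLB.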
`In accordance with the invention, CPU 110 logically
splits the tag RAMs 164, 166 and 174, 176 into two
portions by using an address pin to select either the
cache tag half (164a, 166a) or the TLB half (164b, 166b)
for access. This may be simply done by toggling the
most significant bit of the address to the cache tag
RAMs 164 or 166 during access. However, because
most TLB entries are typically much wider than tag
entries, processor 110 must read more than one address
`
`
`
`7
`of the second-level TLB 164b, 1666 for each TLB entry.
`For example, each tag entry may have 24 bits, where 21
`are address bits (for a memory with 23' pages) and three
`are the aforementioned flags. TLB entries, on the other
`hand, are generally on the order of 88 bits for such a 5
`system and hence require four 2A—bit tag word entries (4
`words by 24 bits each) for each second-level TLB en-
`try. Thus, in such a case it is necmsmy for one TLB
`entry to occupy four entries of the cache tag RAMs
`164, 166.
`FIG. 3 shows the instruction cache tag RAMs 164
`and 166 in more detail. As noted above, since CPU 110
`implements two-set associative caches, there are two
`instruction cache tag RAMs. When such two-set asso-
`ciative caches are used, each TLB entry preferably
`takes up two consecutive words in each TLB cache for
`a total of four words. In other words, each second-level
`TLB entry comprises four words, two in each IC tag
`RAM, so that four reads must be split over the two
associative IC tag RAMs 164, 166 for the TLB entries
`to be accessed. Preferably, both of the TLB caches 164
`and 166 are read simultaneously by CPU 110 so that
`only two reads are necessary for accessing each TLB
`entry. Thus, when the second-level TLB is accessed by
`CPU 110 for an address translation, CPU 110 specifies
`the MSB of the address to the tag RAM as that of the
`second-level TLB, thereby ignoring the instruction tag
entries 164a, 166a, and if the desired address translation
`entry is found, that entry may be copied to the first-
`level TLB with two reads so that processing may then
`continue.
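A sketch of how one wide TLB entry might be packed into four 24-bit tag words, two per tag RAM, and recovered with two simultaneous reads (the field widths follow the example above; the packing order is illustrative, not taken from the patent):

```python
# Pack a TLB entry of up to 96 bits into four 24-bit tag words, placing
# two words in each of the two associative tag RAMs so that both RAMs
# can be read in parallel: two read cycles recover the whole entry.
WORD_BITS = 24
MASK = (1 << WORD_BITS) - 1

def pack_entry(entry):
    words = [(entry >> (WORD_BITS * i)) & MASK for i in range(4)]
    ram_164b = words[0::2]   # words 0 and 2 in the first tag RAM
    ram_166b = words[1::2]   # words 1 and 3 in the second tag RAM
    return ram_164b, ram_166b

def unpack_entry(ram_164b, ram_166b):
    # Read cycle i fetches one word from each RAM simultaneously.
    words = [ram_164b[0], ram_166b[0], ram_164b[1], ram_166b[1]]
    entry = 0
    for i, w in enumerate(words):
        entry |= w << (WORD_BITS * i)
    return entry
```

Because each read cycle pulls one word from each of the two tag RAMs at once, the four words of an entry cost only two read cycles, as the text describes.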
`
`10
`
`15
`
`25
`
`30
`
`35
`
`45
`
`Although only the IC tag RAMs 164, 166 have been
`described with reference to FIGS. 2 and 3, it should be
`apparent to one skilled in the art that the same tech-
`niques may be used for implementing the second-level
`TLB in the data tag RAMs 174 and 176. Best results are
`achieved in accordance with the invention when the tag
RAMs have approximately ¼ to ½ the number of entries
as the data RAMs. For the example given above, each
of the second-level TLBs 164b and 166b have 8k of
addressable space; therefore, 4k entries (16k words di-
vided by 4 words per entry) are available in the second-
level TLB, which is sufficient for most purposes.
`The design of the present invention thus has two
`major advantages over existing methods having sepa-
`rate RAMs for implementing off-chip TLBs. The first
advantage is the ability to implement the external se-
cond-level TLB memory arrays using only a small num-
ber of additional signal pads/pins on the processor chip
100. Reduced pin count can be translated to reduced
`package cost or higher performance due to utilization of
`more pins for other performance enhancing features.
`Maximum pin count limits on integrated circuit devices
`are a reality of technological limitations and are difficult
to exceed without large expenses and technological
`risks. The second advantage is reduced cost for a given
`level of performance. By combining the second-level
TLBs with existing cache tags, a single set of external
`memory devices may be used for both purposes. Even if
higher capacity memory devices are required than
`would ordinarily be required for the cache tag function
`alone, there is still a significant savings over the cost of
`two sets of smaller devices. The design also results in
`reduced system power, PC board area requirements,
reduced PC board routing complexity, and reduced
`power and area for circuits on the processor chip 100.
`Those skilled in the art will readily appreciate that
`many modifications to the invention are possible within
`
`ARM_VPT_IPR_00000532
`
`5,412,787
`
`8
`the scope of the invention. For example, data RAMs
`and tag RAMs of varying sizes may be used in accor-
dance with the amount of space available or the budget
`for memory. Any such arrangement is possible so long
`as extra memory space is available in the tag RAMs for
`storing the second-level cache of the invention, which is
`usually the case for the reasons noted herein. Accord-
`ingly, the scope of the invention is not intended to be
`limited by the preferred embodiment described above,
`but only by the appended claims.
`We claim:
`
`-
`
`1. A computer system comprising:
`first memory means for storing blocks of data;
`second memory means for storing blocks of instruc-
`tion data;
`a first-level data translation lookaside buffer (DTLB)
`for storing address translation information for use
`in translating virtual addresses to physical ad-
`dresses of said blocks of data stored in said first
`memory means;
`a first-level instruction translation lookaside buffer
`
`(ITLB) for storing address translation information
`for use in translating virtual addresses to physical
`addresses of said blocks of instruction data stored
`in said second memory means;
`first tag memory means divided into a first area for
`storing data tag information corresponding to said
`blocks of data in said first memory means and a
`second area for storing a second-level data transla-
`tion lookaside buffer (DTLB) for storing address
`translation information for use in translating virtual
`addresses to physical addresses of said blocks of
`data stored in said first memory means;
`second tag memory means divided into a first area for
`storing data tag information corresponding to said
`blocks of instruction data in said second memory
`means and a second area for storing a second-level
instruction translation lookaside buffer (ITLB) for
`storing address translation information for use in
translating virtual addresses to physical addresses
`of said blocks of instruction data stored in said
`second memory means; and
`processing means for (1) providing a virtual address
`of one of data stored in said first memory means
and of instruction data stored in said second mem-
ory means, (2) when a virtual address of data stored
`in said first memory means is provided, searching
`said first-level DTLB for address translation infor-
`mation for said virtual address of data stored in said
`
`first memory means and translating said virtual
`address of data stored in said first memory means to
`a physical address of a corresponding block of data
`stored in said first memory means if said address
`translation information for said virtual address of
`data stored in said first memory means is found in
said first-level DTLB, else searching said first-level
ITLB for address translation information for said
`virtual address of instruction data stored in said
`second memory means and translating said virtual
`address of instruction data stored in said second
`
`memory means to a physical address of a corre-
`sponding block of instruction data stored in said
`second memory means if said address translation
`information for said virtual address of instruction
`data stored in said second memory means is found
`in said first-level ITLB, (3) when a virtual address
`of data stored in said first memory means is pro-
`vided and said address translation information for
`
`
`
`9
`said virtual address of data stored in said first mem-
`ory means is not found in said first-level DTLB,
`searching said second-level DTLB stored in said
`first tag memory means for address translation
`information for said virtual address of data stored
`in said first memory means and translating said
`virtual address of data stored in said first memory
`means to said physical address of said correspond-
`ing block of data stored in said first memory means
`if said address translation information for said vir-
`tual address of data stored in said first memory
`means is found in said second-level DTLB, else
`when said address translation information for said
`virtual address of data stored in said first memory
`means is not found in said first-level ITLB, search-
`ing said second-level ITLB stored in said second
`tag memory means for address translation informa-
`tion for said virtual address of instruction data
`stored in said second memory means and translat-
`ing said virtual address of instruction data stored in
`said second memory means to said physical address
`of said corresponding block of instruction data
`stored in said second memory means if said address
`translation information for said virtual address of
`instruction data stored in said second memory
`means is found in said second-level ITLB, and (4)
`when said processor has completed a virtual ad-
`dress to physical address translation, accessing one
`of said first memory means and said second mem-
`ory means at the physical address of one of said
`corresponding block of data stored in said first
memory means and said corresponding block of
`instruction data stored in said second memory
`means.
`
`2. A method of translating a virtual address output by
`a processor into a physical address of a memory con-
`taining blocks of data to be processed by said processor,
`comprising the steps of:
`(a) providing a virtual address from said processor
`corresponding to said physical address of said
`memory where data to be processed is stored;
`(b) searching a first-level translation lookaside buffer
(TLB) for address translation information for said
virtual address and translating said virtual address
`to said physical address if said address translation
`information for said virtual address is found in said
`first-level TLB;
`(c) when said address translation information for said
`virtual address is not found in said first-level TLB,
searching a second-level translation lookaside
buffer (TLB) stored in a tag memory for said ad-
dress translation information for said virtual ad-
dress, and if said address translation information for
`said virtual address is found in said second-level
`TLB, copying said address translation information
`found in said second-level TLB to said first-level
`
`TLB and repeating step (b); and
`(d) when said address translation information is not
`found in said second-level TLB, generating a TLB
`miss signal and continuing processing by said pro-
`cessor at a TLB miss interruption vector address.
3. The method of translating as in claim 2, comprising
`the steps of dividing said tag memory into a first area
`comprising a portion of said tag memory having a most
significant address bit of a first value and a second area
`comprising a portion of said tag memory having a most
significant address bit of a second value and storing data
`tag information corresponding to said blocks of data in
`
`ARM_VPT_IPR_00000533
`
`5,412,787
`
`10
`said memory in said first area and said second-level
`TLB in said second area.
`4. The method of translating as in claim 3, wherein
`step (c) comprises the step of accessing said second-
`level TLB by addressing said tag memory with an ad-
`dress having a most significant address bit of said sec-
`ond value.
`
`10
`
`15
`
`25
`
`30
`
`35
`
`45
`
`SD
`
`55
`
`65
`
`5. The method of translating as in claim 2, wherein
`step (c) comprises the step of simultaneously reading at
`least two data words in said second area of said tag
`memory during said searching of said second-level
`TLB.
`
`6. A computer system comprising:
`memory means for storing blocks