`Hattersley et a1.
`
`lllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllll
`US005341485A
`[11] Patent Number:
`5,341,485
`[45] Date of Patent:
`Aug. 23, 1994
`
`[54]
`
`[75]
`
`[73]
`
`[21]
`[22]
`[51]
`[52]
`
`[58]
`
`[56]
`
`MULTIPLE VIRTUAL ADDRESS
`TRANSLATION PER COMPUTER CYCLE
`
`Inventors: John R. Hattersley, Saugerties, N.Y.;
`‘Thomas D. Kim, Highland Park, NJ .;
`Jeffery Y. Lee, Saugerties, N.Y.;
`Forrest A. Reiley, Eastham, Mass.
`
`Assignee: International Business Machines
`Corporation, Armonk, NY.
`
`Appl. No.: 697,168
`Filed:
`May 7, 1991
`
`Int. Cl.5 ............................................ .. G06F 12/10
`US. Cl. ........................... .. 395/400; 364/DIG. 1;
`364/DIG. 2
`Field of Search ............... .. 395/400; 364/DIG. 1,
`364/DIG. 2, 246.3, 246.4, 966.3, 966.4, 964.33
`
`References Cited
`U.S. PATENT DOCUMENTS
`
`4,136,385 l/l979 Gannon et al.
`
`.... .. 395/400
`
`4,188,662 2/1980 Ishibashi . . . . . . . . . . . .
`
`. . . . .. 395/400
`
`4,293,910 10/1981 Flusche B181. ................ .. 395/425
`4,456,954 6/1984 Bullions, III et al.
`.... .. 395/400
`4,500,952 2/1985 Heller et a1. .............. .. 395/400
`
`4,691,281 9/1987 Furui . . . . . . . . .
`4,700,291 10/1987 saw
`
`. . . . .. 395/400
`............ .. 395/425
`
`4,727,484 2/ 1988 Saito . . . . . .
`
`. . . . .. 395/400
`
`395/425
`4,736,293 4/ 1988 Patrick
`395/400
`4,754,394 6/ 1988 Brantley, Jr. et al.
`4,769,770 9/ 1988 Miyadera et al. ............. .. 395/400
`4,794,521 12/1988 Ziegler et al. . . . .
`. . . . .. 395/425
`
`4,849,876 7/1989 Ozawa et al. . . . . .
`
`. . . . .. 395/400
`
`4,875,155 10/1989 Iskiyan et a1. . . . . .
`
`. . . . .. 395/425
`
`4,905,141 2/1990 Brenza . . . . . . . . . . . . . . .
`
`. . . . .. 395/425
`
`5,099,415 3/1992 Osler et a1. ................ .. 395/400
`5,111,389 5/1992 McAuliffe et al.
`.... .. 395/800
`5,133,059 7/1992 Ziegler et a1. .................... .. 395/425
`
`OTHER PUBLICATIONS
`Norton et al., “A Class of Boolean Linear Transforma
`tions for Con?ict Free Power-of-two Stride Access”,
`Proceedings of the 1987 International Conference on
`Parallel Processing pp. 247-254.
`Primary Examiner-—-Joseph L. Dixon
`Assistant Examiner-Jack A. Lane
`Attorney, Agent, or Firm-Floyd A. Gonzalez; Robert
`L. Troike
`ABSTRACT
`[57]
`Dynamic address translation structures and procedures
`are capable of multiple address translations for the same
`processor in a single cycle. According to one approach,
`a plurality of directory look aside tables (DLATs) are
`used to provide multiple address translation. The
`DLATs are accessed in parallel by separate virtual
`address generators. To avoid the problem of generating
`the same address multiple times for each of the DLATs,
`a generated address for one DLAT may be written to
`all the DLATs or, alternatively, if a miss occurs in one
`DLAT, a search is made of the other DLATs before the
`address is generated. In the former case, an address
`written to all the DLATs may overwrite an address that
`will be needed for a future translation by one of the
`other DLATs. This is avoided in the latter case, but
`translations in other DLATs are interrupted when a
`miss occurs in one of the DLATs. This, in turn, may be
`avoided by employing “shadow” DLATs which are
`copies of the DLATs. The shadow DLATs are
`searched when a miss occurs in one of the DLATs
`thereby avoiding any interruption of translations by the
`DLATs themselves. Rather than use multiple DLATs,
`a single interleaved DLAT may be used by multiple
`address generators.
`
`13 Claims, 13 Drawing Sheets
`
`TRANSATKN
`
`TRANSLA'HW
`
`IRANSLA‘HON
`
`1
`
`APPLE 1010
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 1 of 13
`
`5,341,485
`
`A0
`
`A11 A12
`A19 A20
`A31
`T DISPLACEMENT VIRTUAL ADDRESS
`PAGE
`SEGMENT I
`NUMBER SX } NUMBER PX }
`V = (S, P, D)
`DX
`H :1.
`Prior Art
`
`/--23
`
`SEGMENT
`TABLE
`
`VIRTUAL ADDRESS GENERATOR
`A11'A12
`A31
`ArsiAzo
`A0
`SEGMENT I
`PAGE
`1 DISPLACEMENT
`:
`PX
`DX
`SX
`.___T
`
`I
`
`\
`
`DLAT
`p21
`
`22
`
`S
`
`i
`
`‘L.
`
`4;
`
`PAGE '
`TABLE
`
`6.2
`Fl
`Prior Art
`
`RA31
`RA19 {RA20
`RAD
`28 -—~
`FlELD
`{DISPLACEMENT
`REAL ADDRESS REGISTER
`
`2
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 2 of 13
`
`5,341,485
`
`TRANSLATION
`
`' TRANSLATION
`
`TRANSLATION
`
`31K
`ADDRESS
`GENERATOR
`
`32 *\
`ADDRESS ‘
`GENERATOR
`
`33 '-\
`ADDRESS
`GENERATOR
`
`DIRECTORY
`LOOK- ASIDE
`TABLE
`
`I \34
`
`l
`
`DIRECTORY
`LOOK-ASIDE
`TABLE
`
`DIRECTORY
`LOOK-ASIDE
`TABLE
`
`\ss 1 \se
`
`ADDRESS
`TO MEMORY
`
`ADDRESS
`TO MEMORY
`ELGJ
`
`ADDRESS
`To MEMORY
`
`AGEN 1
`
`AGEN__2__
`
`AGENi
`
`XLATE ADD 1
`XLATE ADD 2
`M ADD 3
`
`(f)
`W (C
`I
`N’C’Y
`
`3
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 3 of 13
`
`5,341,485
`
`TRANSLATION
`
`'
`
`32
`
`33
`
`‘ ADDRESS
`GENERATOR’
`
`'T
`
`DIRECTORY
`LOOK-ASIDE
`TABLE
`\34
`
`ADDRESS
`GENERATOR
`
`IT
`
`‘
`
`DIRECTORY
`LOOK-ASIDE
`TABLE
`\35
`EIQS
`
`ADDRESS
`GENERATOR
`
`:1
`
`DIRECTORY
`LOOK-ASIDE
`TABLE
`\ 36
`
`AGEN 1
`
`AGEN 2
`
`AGEN 3
`
`%%
`NCY
`
`XLATE ADD 1
`
`XLATE ADD 2 (IN DLAT)
`
`XLATE ADD 3 (IN DLAT)
`
`Hi6.
`
`K N+1 CY
`
`4
`
`
`
`US. Patent
`
`Aug.
`
`’23
`
`1994
`
`Sheet 4 of 13
`
`5,341,485
`
`UN
`
`mN
`
`Nh<._o_$652_8.:T:5
`
`<N
`
`AN
`
`0N
`
`on
`
`mN
`
`mm
`
`<NNzmo<_STzmo<
`
`<mnzmu<
`
`mmwmoo<
`
`moh<mmzue
`
`mmumoa<
`
`«9552mm
`
`mmmmoe<
`
`mo~<mmzmo
`
`20F<4mz<~=
`
`
`
`ZOF<._mz<E.zo=<4mz<m=
`
`on
`
`mm.
`
`<n
`
`n.533
`
`Noo<zo:.<._x_E_22zocfix
`
`mm
`
`m.DD<zo:.<._x
`
`>mOHou~=o
`
`ma_m<lv_oo._
`
`m..m<._.
`
`>mohbmm5
`
`mo_m<...xoo._
`
`3m:
`
`Hm:
`
`ma_m<lxoo._
`
`5.0.98555
`
`z_<u<m2:
`
`m9:
`
`.2:
`
`5
`
`
`
`
`
`
`
`
`
`
`
`
`US. Patent
`
`uA
`
`1
`
`5
`
`5,341,485
`
`w_m:__mm":T2%
`
`
`
`”52828mozmuzuo”5282.628%?
`889%88%?Ee8mm«mn28
`
`mmmm8mm5N282
`
`o.m.i.282
`
`8.9-63899163
`
`
`
`8o_m<..v_oo._mS528%5.88%328%3,ma,2<~N:30
`
`m96E22%mm:
`
`.m_mm__<~_N8<Efix
`am...<nn92HEX
`
`m_E_SE85x
`
`5m:3%5m:85<nn2.56
`
`ad
`
`6
`
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 6 of 13
`
`5,341,485
`
`*m
`
`co2_quzmmo<m
`
`8z.ozazmmw<m
`
`92.ozazmHE;
`
`Z2.ozEzmH5};
`
`mo_m<lxoo._
`
`>mobbm~=o
`
`WEE.
`
`m0h<mw2mo
`$552a
`
`m0H<mm2mo
`mmuxom<Nm
`
`mobxmuzmo
`magma. B
`
`
`
`0—mp<9Fzwo<
`
`ONONmm<NNzmo<
`
`ononmm<mnzmo<
`
`UNmN«NN:30
`
`onmm<nn.530
`
`m:4:_285x
`
`NZOF<._x
`
`3.0..”—
`
`ommm5.n52%
`
`SN203%
`
`5n322%
`
`_302%
`
`7
`
`
`
`
`
`
`
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 7 of 13
`
`5,341,485
`
`98.2.5245m5E“.
`mo“.52m9.3::$
`
`m><m90072mmes
`
`>0QMmOZDI
`
`9.9...
`
`_>02_m22mij_>0z_N84Eix_>02__ooqHEX__.__N02E;zmo____N92EsEu______F.92ESzmo
`
`>N>U_U
`
`8
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 8 of 13
`
`5,341,485
`
`
`
`>0 N+2 mwxdg.
`
`N Qa< ES zwo
`
`N oo< ES zmo
`
`
`
`m 604 wEQJX
`
`9
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 9 of 13
`
`5,341,485
`
`CY1 CY2 CY3 CY4 CY5 CY6 CY7
`ADDRESS GEN 1 REQUESlI IA I 18 I 10 I 10 I
`ADDRESS GEN 2 REQUES_T_I 2A I
`I 28 I 2D I 2D I
`ADDRESS GEN s REouESTI 3A I
`I 38 I so I 30 I
`
`DLAT xLATES ADD1
`DLAT XLATES ADD 2
`DLAT XLATES ADD 3
`
`I IA I 18 I 10 I 10 I
`I 2A I 28 I 2c I 20 I
`I 3A I 38 I 30 I 3D I
`
`DLAT ‘SECTION 00
`DLAT SECTION 01
`DLAT SECTION 10
`DLAT SECTION 11
`
`I IA I 2A I 3A I
`I 18 I 28 I 38 I
`I 1D I 2c I 30 I
`I 10 I 20 I 30
`
`EIQJQ
`
`m R
`OF INSA'ITDRDUORON
`
`GENERATE
`IN VIRTUAL --—61
`MODE
`USING
`DIRECTORY
`LOOK-ASIDE
`TABLE
`
`H
`
`f‘ 62
`L
`GENERATE FIRST ADDRESS
`'" V'RTUAL MODE
`USING DIREcToRY
`LooK-ASIDE TABLE
`m I I—
`INCREMENT REAL
`ADDRESS BY STRIDE
`
`54
`
`ADDRES
`CROSS PAGE
`BOUNDRY
`?
`
`65'\
`ADDRESS IS TRANSLATED
`USE NEW ADDRESS
`I
`L____
`ADDRESS
`FOR MEMORY
`
`10
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 10 of 13
`
`5,341,485
`
`IsT ADDREss
`OF INSTRUCTION
`
`1ST ADDREss
`OF INSTRUCTION
`I
`ADDREss
`71
`REGISTER *-
`
`73
`
`DIRECTORY
`I
`LOOK~ASIDE ,_7
`ADDREss
`'REGISTER _"’ TABLE
`2
`IsT vIRT
`I
`ADDREss
`L
`71
`
`I
`ADDER W73
`
`I
`
`ADDER
`
`'
`
`-
`
`-
`
`'
`
`DIRECTORY
`LOOK-ASIDE #72
`TABLE
`I
`REAL ADDRESS
`FOR MEMORY
`
`‘G 17
`
`I
`
`PAGE
`CROSSING ~74
`DETECTION
`LOGIC
`I
`ADDREss
`FOR
`MEMORY
`
`EIQJQ
`
`VIRTUAL ADDRESS GENERATION MODE
`
`ADDREss GENERATOR
`
`VIRTUAL
`MODE
`
`1
`
`REAL ADDRESS GENERATION MODE
`
`DIRECTORY LOOK-ASIDE
`
`TABLE
`
`ADDRESS
`FOR MEMORY
`19
`
`ADDRESS
`GENERATOR
`REAL
`<--
`MODE
`I
`ADDRESS
`FOR MEMORY
`
`DIRECTORY
`LOOK-ASIDE
`TABLE
`
`Elli-2Q
`
`11
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 11 of 13
`
`5,341,485
`
`VIRTUAL ADDRESS GENERATION MODE
`ADDRESS‘
`6515M’ R
`MODE
`
`DIRECTORY
`LOOI<—ASIDE TABLE
`PAGE ENDING IN 00
`
`F IG,Z1
`
`PAGE ENDING IN 01
`
`PAGE ENDING IN 10
`
`PAGE ENDING IN 11
`I
`ADDRESS
`'F OR MEMORY
`
`REAL ADDRESS GENERATION MODE
`
`ADDRESS
`GENERATOR
`REAL
`MODE
`I
`ADDRESS
`FOR MEMORY
`
`1
`
`.
`
`DIRECTORY
`LOOK-ASIDE TABLE
`PAGE ENDING IN 00
`
`PAGE ENDING IN 01
`
`PAGE ENDING IN 10
`
`PAGE ENDING IN 11
`
`12
`
`
`
`US. Patent
`
`Aug. 23, 1994
`
`Sheet 12 of 13
`
`5,
`
`341,485
`
`mwmmoo<N
`
`$0szoh
`
`mmmmoa<N
`
`E0292oh
`
`mmwmoc<N
`
`E26:0»
`
`mNdE
`
`E052Emco<45m
`
`#5szOFNoo<45¢
`
`
`
`EOE:o.—moo<41mm
`
`Noo<ESzuo Noo<ESzmu
`
`
`_oo<ES28
`
`2mmmm9<m5s_3o2.ESx1
`
`3
`
`13
`
`
`
`
`
`
`US. Patent
`
`Aug. 23, 1994 I
`
`Sheet 13 of 13
`
`5,341,485
`
`mzo\\
`
`uzo\\
`
`mzo\,\
`
`on mm
`
`$55.2 n ma 3 ON 3
`
`mag?‘ ¢ an on mm <n
`
`wage: k 2 E 2
`
`2 up M: E
`
`MN ON 8 mm
`
`z 3
`
`2.5 E 55x
`
`5... n co< ES zwo
`
`1
`
`5 P oa< ES 2%
`
`<~ N Qo< ES zmu
`
`
`
`E262 oh N oo< #mm
`
`5.2m: 2 n 02 Sm
`
`14
`
`
`
`1
`
`MULTIPLE VIRTUAL ADDRESS TRANSLATION
`PER COMPUTER CYCLE
`
`5,341,485
`2
`The requests may be, for example, three separate in
`structions so that three addresses must be generated
`every cycle to make the memory requests. What is
`therefore needed are new dynamic address translation
`(DAT) structures and procedures which are capable of
`generating multiple addresses for the same processor in
`a single cycle.
`
`DESCRIPTION
`BACKGROUND OF THE INVENTION
`1. Field of the Invention
`The present invention generally relates to virtual
`storage mechanisms for data processing systems and,
`more particularly, to new dynamic lookaside address
`translation (DLAT) structures and procedures which
`are capable of generating multiple addresses for a pro
`cessor in a single cycle.
`2. Description of the Prior Art
`Virtual storage organization and management for
`data processing systems are described, for example, by
`Harvey M. Deitel in An Introduction to Operating Sys
`tems, Addison-Wesley (1984), by Harold Lorin and
`Harvey H. Deitel in Operating Systems, Addison-Wes
`ley (1981), and by Harold S. Stone in High-Performance
`Computer Architecture, Addison-Wesley (1987). In a
`virtual storage system, paging is a relocation and ad
`dress-to-physical-location binding mechanism provid
`ing the user of the system with what appears to be a
`considerably larger memory space than is really avail
`able. The key feature of the virtual storage concept is
`disassociating the addresses referenced in a running
`process from the addresses available in main storage.
`The addresses referenced by the running process are
`called virtual addresses, while the addresses available in
`main storage are called real addresses. The virtual ad
`dresses must be mapped into real addresses as the pro
`cess executes, and that is the function of the dynamic
`address translation (DAT) mechanism. One such mech
`35
`anism employs a directory look aside table (DLAT),
`sometimes referred to as a translation lookaside buffer
`(TLB), which stores recent virtual address translations.
`For virtual addresses stored in the DLAT, the transla
`tion process requires only a single or, at most, a couple
`of machine cycles. For addresses not stored in the
`DLAT, the DAT process may take from ?fteen to sixty
`cycles.
`Translations from the virtual address to' the real ad
`dress must be made to ?nd where the addressed instruc
`45
`tion or data is in main memory. This is typically done on
`a page basis. In fact, the translations stored in the
`DLAT are actually only page translations, and the last
`bits of an address are the location in that page, so only
`the page address must be translated. Often, the ad
`dresses are in a speci?c order as in scienti?c computing
`where the addresses are at speci?c increments in mem
`ory. These increments are called a “stride”. If all ad- '
`dresses are in incremental order, the stride is one, but if
`every other address is used, the stride is two, and so
`forth. This permits easy prediction of future addresses,
`In scienti?c or vector computing, an instruction speci
`?es a starting address, the stride and number of oper
`ands in the instruction. This allows the address genera
`tion to increment the earlier address by the stride to
`obtain the next address.
`In typical applications, a processor generates only
`one address per cycle. Some processors have more than
`one address generator going to a DLAT (or TLB), but
`still only one address is actually translated per cycle. As
`processors have evolved, there has developed a need to
`65
`generate and translate more than a single address per
`cycle. Speci?cally, the processor requires more than
`one memory request every cycle to be fully' utilized.
`
`15
`
`25
`
`30
`
`50
`
`55
`
`SUMMARY OF THE INVENTION
`It is therefore an object of the present invention to
`provide new dynamic address translation structures and
`procedures which are capable of multiple address trans
`lations for the same processor in a single cycle.
`According to the invention, a plurality of DLATs are
`used to provide multiple address translation. The
`DLATs are accessed in parallel by separate virtual
`address generators. To avoid the problem of generating
`the same address multiple times fop each of the DLATs,
`a generated address for one DLAT may be written to
`all the DLATs or, alternatively, if a miss occurs in one
`DLAT, a search is made of the other DLATs before the
`address is generated. In the former case, an address
`written to all the DLATs may overwrite an address that
`will be needed for a future translation by one of the
`other DLATs. This is avoided in the latter case, but
`translations in other DLATs are interrupted when a
`miss occurs in one of the DLATs. This, in turn, may be
`avoided by employing “shadow” DLATs which are
`copies of the DLATs. The shadow DLATs are
`searched when a miss occurs in one of the DLATs
`thereby avoiding any interruption of translations by the
`DLATs themselves.
`Rather than use multiple DLATs, a single interleaved
`DLAT may be used by multiple address generators. '
`The DLAT is partitioned into several sections, and the
`last bits of a page address are used to select the section
`of the DLAT to be addressed for an address translation.
`Performance may be further enhanced for either the
`case of multiple DLATs or a single, interleaved DLAT
`by the use of mode switching. Since translations need be
`made only when crossing a page boundary, the DLAT
`is accessed only when a page crossing is detected. This
`has the further advantage of reducing traf?c to the
`DLAT.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`The foregoing and other objects, aspects and advan
`tages will be better understood from the following de
`tailed description of a preferred embodiment of the
`invention with reference to the drawings, in which:
`FIG. 1 is a block diagram illustrating the format of a
`virtual address;
`FIG. 2 is a block diagram of a conventional dynamic
`address translation structure capable of translating (at
`most) one address per processor cycled
`FIG. 3 is a block diagram showing a translation struc
`ture in which the DLAT is replicated to provide three
`independent address generators;
`FIG. 4 is a timing diagram illustrating the operation
`of the translation structure of FIG. 3 when the same
`address must be written to each of the DLATs;
`FIG. 5 is a block diagram showing the translation
`structure of FIG. 3 modi?ed so that generated page
`addresses are written to all the DLATs;
`FIG. 6 is a timing diagram illustrating the operation
`of the translation structure of FIG. 5 indicating the time
`savings achieved by this structure;
`
`15
`
`
`
`5
`
`25
`
`35
`
`5,341,485
`3
`4
`prise the segment bits, bits A12 through A19 comprise
`FIG. 7 is a block diagram showing the translation
`the page bits, and bits A20 through A31 comprise the
`structure of FIG. 3 modi?ed so that a miss in one
`displacement bits.
`DLAT causes a search to be made of the other DLATs
`As shown in FIG. 2, the virtual address is generated
`before a page address is generated;
`FIG. 8 is a timing diagram illustrating the operation
`by address generator 20. The address generator 20 is
`part of a central processing unit (CPU) (not shown).
`of the translation structure of FIG. 3 when a DLAT
`The most recently referenced pages have entries in the
`miss occurs;
`FIG. 9 is a timing diagram illustrating the operation
`DLAT 21. For a DLAT with 256 congruence classes,
`bits A12 through A19 of the virtual address are used to
`of the translation structure of FIG. 7 where a DLAT
`address the DLAT. The virtual page identi?cation bits
`miss occurs but a hit is found in one of the other
`DLATs;
`from the addressed entry read out of the DLAT 21 are
`compared in comparator 22 with bits A1 through All
`FIG. 10 is a block diagram showing the translation
`structure of FIG. 7 modi?ed with the addition of
`of the virtual address. If there is no match, a DLAT
`“shadow” DLATs;
`miss has occurred. On a DLAT miss, address transla
`FIG. 11 is a timing diagram illustrating the operation
`tion is obtained through, for example, a segment/page
`table search and placed in the DLAT.
`of the translation structure shown in FIG. 10;
`The segment/page table search begins by adding the
`FIG. 12 is a block diagram showing a DLAT struc
`value in the segment table origin register 23 and the bits
`ture implemented as an interleaved DLAT;
`FIG. 13 is a timing diagram showing the operation of
`A0 to All of the virtual address in adder 24 to obtain an
`index value for the segment map table 25. The entry
`the DLAT structure shown in FIG. 3 beginning at a
`start up condition where the address translation for the
`output from the segment map table 25 is, in turn, used as
`an index for the page map table 26 entry, there being a
`page is not in the DLATs;
`FIG. 14 is a timing diagram showing the operation of
`separate page map table for each segment. The entry
`output from the page map table 26 provides the page
`the DLAT structure shown in FIG. 12 beginning at a
`frame at which the virtual page resides in real storage
`start up condition where the address translation for the
`and is passed by OR gates 27 and concatenated with the
`page is not in the DLAT;
`FIG. 15 is a timing diagram showing the operation of
`displacement bits A20 through A31 of the virtual ad
`the DLAT structure shown in FIG. 12 wherein the
`dress generator 20 to form the real address in real ad
`dress register 28.
`stride is one, all translations are the same page, and the
`translation is in the buffer;
`On the other hand, if there is a match in the DLAT
`30
`FIG. 16 is a ?ow diagram illustrating the logic of
`21, the comparator 22 enables AND gates 29 which
`address generation mode determination according to a
`passes the entry output from the DLAT 21 to OR gates
`further aspect of the invention;
`27. In this case, the entry output from the DLAT 21 is
`FIG. 17 is a functional block diagram illustrating
`the associated real address ?eld which is concatenated
`operation in virtual mode address translation;
`to the displacement bits A20 through A31 to form the
`real storage address in register 28. Obviously, this pro
`FIG. 18 is a functional block diagram illustrating
`operation in real mode address translation;
`cess of address translation is considerably faster than
`FIG. 19 is a block diagram showing the operation of
`that of the segment/page table search which occurs on
`a single DLAT in virtual mode address translation;
`a DLAT miss. The segment/ page table search may take
`FIG. 20 is a block diagram showing the operation of
`?fteen to eighty cycles to complete, whereas a DLAT
`a single DLAT in real mode address translation;
`access can be completed in one cycle.
`FIG. 21 is a block diagram showing the operation of
`Normally, most address translation requests are made
`a single, interleaved DLAT for multiple virtual mode
`by a search of the DLAT, and while the segment/page
`address translations;
`table search takes a greater number of processor cycles
`FIG. 22 is a block diagram showing the operation of
`than making the translation by means of the DLAT, the
`45
`a single, interleaved DLAT for multiple real mode
`segment/page table search is itself not without the pos
`sibility of a translation failure. For example, the segment
`address translations;
`FIG. 23 is a timing diagram illustrating the operation
`map table search may indicate that the segment is not in
`primary or main storage, causing the operating system
`of multiple, non-interleaved DLATs when no real
`mode switching is available; and
`to locate the segment on secondary storage, i.e., a direct
`FIG. 24 is a timing diagram illustrating the improved
`access storage device (DASD), create a page table for
`operation of multiple, non-interleaved DLATs when
`the segment, and load the appropriate page into primary
`storage, possibly replacing an existing page in the pro
`real mode switching is available for the case of stride
`one and all addresses in the same page.
`cess.
`Even if the segment is in primary storage, the desired
`page may not be in primary storage, causing the operat
`ing system to locate the page on secondary storage and
`loading the page in primary storage, again possibly
`replacing an existing page in the process. The process of
`accessing secondary storage can take up to several hun
`dred processor cycles.
`The foregoing description is for a conventional
`DLAT structure intended to make (at most) one trans
`lation per processor cycle. The subject invention modi
`?es this structure so that multiple translations per pro
`cessor cycle can be made. In the description of the
`preferred embodiments of the invention, the example of
`making three translations every cycle is used through
`
`DETAILED DESCRIPTION OF A PREFERRED
`EMBODIMENTS OF THE INVENTION
`The description which follows uses the term
`“DLAT” for dynamic look-aside table, but those skilled
`in the art will understand that this term may be used
`interchangeably with “TLB” for translation look-aside
`buffer. For purposes of the following description, a
`paging/segmentation virtual address system is assumed.
`In such systems, the virtual address format is as shown
`in FIG. 1 and comprises s-bits for the segment index
`65
`(SX), p-bits for the page index (PX), and d-bits for the
`displacement index (DX). The virtual address may be,
`for example, 32 bits of which hits A0 through A11 com
`
`50
`
`55
`
`16
`
`
`
`O
`
`25
`
`30
`
`5,341,485
`5
`out. It will, of course, be understood that this is merely
`illustrative, and those skilled in the art will be able to
`apply the teaching of the invention to any number of
`translations per cycle for a speci?c application.
`Referring now to FIG. 3, there is shown a DLAT
`structure comprising three address generators
`(AGENs) 31, 32 and 33. Each address generator gener
`ates a virtual address and passes the address to THE
`respective DLAT 34, 35 or 36 for a translation. If the
`DLAT does not have that translation (a DLAT miss),
`the address must be generated, as previously described.
`The main problem with this approach, as illustrated in
`FIG. 4, is that often the translation will have been made
`for one DLAT and the same translation will be needed
`for the others. Thus, for example, the three address
`generators generate virtual address in parallel. A miss
`occurs in the ?rst DLAT requiring N cycles to trans
`late. Similarly, misses for the same address occur in the
`second and third DLATs, each requiring N cycles to
`translate. Since the same translation might be made
`three times for the same page, a total of 3N cycles might
`be required for the translation. Note that the transla
`tions must be sequentially performed by the operating
`system.
`A solution to making the same translation three times
`is to write to all three DLATs when a translation is
`made, as illustrated in FIG. 5. Thus, as indicated in FIG.
`6, the translation need only be made once by the operat
`ing system and will, thereafter, be available in all the
`DLATs in only N cycles,
`This solution, however, risks that a translation could
`be removed from a DLAT that might be needed in the
`future. Therefore, rather than write to all three
`DLATs, another solution is for each of the DLATs to
`be searched in succession before resorting to a seg
`ment/page table search or a translation performed by
`the operating system. This is shown in FIG. 7, where
`after a miss occurs DLAT 34, a search is made of
`DLATs 35 and 36. Only if a miss occurs in all three
`DLATs is a translation made.
`FIG. 8 shows the consequences of a miss in the trans
`lation structure of FIG. 3. As described before, N cy
`cles will be required to make the translation when a
`DLAT miss occurs. On the other hand, if the transla
`45
`tion was in DLAT 35, for example, the solution shown
`in FIG. 7 would provide the translation on the next
`cycle as illustrated in FIG. 9. However, with the trans
`lation structure of FIG. 7, a BLAT miss will interrupt a
`cycle of translations in the other DLATs 35 and 36, but
`this is overall a good tradeoff.
`As indicated, the solution of FIG. 7 has the drawback
`that searching the other DLATs suspends translations
`being made by those DLATs. Rather than using the
`DLATs directly, a modi?cation of the solution shown
`in FIG. 7 is to use “shadow” copies of the DLATs, as ‘
`illustrated in FIG. 10. Each of the DLATs 34, 35 and 36
`are copied as “shadow” DLATs 44, 45 and 46, respec
`tively. When a DLAT miss occurs, the shadow copies
`are searched without interrupting the translations being
`made by the other DLATs. Thus, as illustrated in FIG.
`11, even when a DLAT miss occurs, translations con
`tinue uninterrupted in the other DLATs and, if a match
`is found in one of the shadow DLATs, only one addi
`tional cycle is required for the translation. This modi?
`cation of the solution shown in FIG. 7 does require
`twice as much hardware, but this can be justi?ed where
`speed is important.
`
`6
`It is also possible to interleave the DLAT in a similar
`way as a cache is interleaved. In FIG. 12, three address
`generators 51, 52 and 53 address a common DLAT 54.
`The DLAT 54 is partitioned into several sections, four
`being shown in FIG. 12. The last bits of a page address
`from the address generators 51, 52 and 53 are used to
`select the section of the DLAT to be addressed for an
`address translation. Since the three address generators
`go to the four-way interleaved DLAT for translation,
`the DLAT can hold many more translations for a simi~
`lar amount of hardware as the approaches shown in
`FIGS. 5, 5, 7, and 10.
`FIG. 15 illustrates the operation of the translation
`structure of FIG. 5 at start up where the address transla
`tions for the pages are not yet in the DLATs. At start
`up, each address generated ?rst goes to the correspond
`ing DLAT, but since the DLATs have no entries for
`the addresses, the translations must be made in a serial
`mode. Again, as in the case illustrated in FIG. 4, 3N
`cycles are required. In contrast, under the same condi
`tions, only N+2 cycles are required for the translation
`structure shown in FIG. 12, as illustrated FIG. 14. As
`suming that address in generators 51, 52 and 53 all gen
`erate addresses in the same page, the page translation is,
`at start up, not in the DLAT 54. Only one translation
`must be made, because once the translation is placed in
`the DLAT 54, the other address generators can access
`it. Since an application program usually has similar
`stride values, the interleaved DLAT will generally not
`have con?icts after the ?rst cycle as the translations are
`now programmed and offset by one cycle. Thus, the
`interleaved DLAT can produce three translations per
`cycle, as illustrated in FIG. 15. FIG. 15 also illustrates
`a partition page ending in 00, a page ending in 01, a page
`ending in 10, and a page ending in 11.
`The invention also addresses an enhancement to the
`dynamic address translation mechanism where address
`generation is done in either real or virtual mode. When
`done in real mode, the real address of the required data
`in memory is generated, and no translation is required.
`In virtual mode, the generated address must be trans
`lated. This method uses the fact that these translations
`only need to be made when the new addresses are in a
`new page. If the address remains in a page boundary,
`the real address only needs to be incremented by the
`stride. When a page boundary crossing is detected, it is
`then necessary to make a translation from the virtual to
`new real address. By going to the DLAT only when it
`is really required, the time required to generate a real
`address is reduced by removing the cycle required for
`the DLAT translation. In addition, the traf?c to the
`DLAT is reduced, allowing more than one address
`generator to go to the same DLAT. For an interleaved
`DLAT, this procedure reduces con?icts of the address
`generators going to the same section of the DLAT.
`This, in turn, increases the speed of address generation
`or allows more address generators to be hooked to the
`DLAT with little or no impact to the speed of address
`generation per generator.
`FIG. 16 shows the logic for switching between vir
`tual and real modes. The process begins in decision
`block 60. If the stride is low enough to expect more than
`four addresses to be generated in a page, a switch is
`made to real mode; otherwise, the translation is made in
`virtual mode using the DLAT in function block 61. If a
`switch is made to real mode, the DLAT is used to make
`the ?rst translation in function block 62, then in function
`block 65, the translated virtual address (real) is incre
`
`60
`
`65
`
`17
`
`
`
`10
`
`30
`
`5,341,485
`7
`8
`mented by the stride, removing the DLAT from the
`Having thus described our invention, what we claim
`translation process. After generating each real address
`as new and desire to secure by Letters Patent is as fol
`lows:
`by incrementing the translated virtual address, a test is
`1. An address translation mechanism for a virtual
`made in decision block 64 to determine if a page bound
`storage system in a data processing system which sup
`ary has been crossed. If not, the incremented address is
`ports multiple virtual address translations per computer
`used as the translated address in function block 65, and
`cycle, said address translation mechanism comprising:
`the process loops back to function block 63 where the
`means for storing a plurality of virtual addresses to be
`address is again incremented. When a page crossing is
`translated, said virtual addresses including segment
`detected, the process loops back to function block 62 to
`index bits, page index bits and displacement index
`generate the ?rst address in the next page in virtual
`bits;
`mode using the DLAT.
`dynamic look-aside table means for storing virtual~to
`The functions performed in the flow diagram of FIG.
`real translation information; and
`16 are implemented in the hardware illustrated in FIGS.
`accessing means responsive to said page index bits of
`17 and 18. FIG. 17 represents the hardware con?gura
`said virtual addresses for accessing during a com
`tion in the virtual mode, while FIG. 18 represents the
`puter cycle said dynamic look-aside table means to
`hardware con?guration in the real mode. Beginning
`simultaneously generate real addresses for each of
`with FIG. 17, as in the conventional DLAT structure
`said virtual addresses,
`shown in FIG. 2, the ?rst address of a vector instruction
`wherein an instruction speci?es a starting address, a
`is stored in a register 71 and sent to the DLAT 72. The
`stride and a number of operands in the instruction,
`20
`remaining addresses are generated by incrementing the
`the stride being an increment between successive
`virtual address from the register 71 by the stride in
`addresses, said mechanism further comprising:
`adder 73 and sending the incremented addresses to the
`mode switching means for determining when the
`DLAT 72. The register 71 and the adder 73 constitute
`stride is less than a predetermined fraction of a
`the address generator. In real mode, as shown in FIG.
`page size and for switching between a real mode of
`18, the address register 71 initially searches the DLAT
`translation and a virtual mode of translation; and
`72 for a match at the beginning of a page. The transla
`incrementing means operable in said real mode for
`tion made by the DLAT 72 is incremented by the stride
`incrementing a ?rst real address multiple times
`in adder 73, and each real address output from the adder
`within a page boundary to generate real addresses.
`73 is similarly incremented. Page crossing detection
`2. The addres