`Sachs et al.
`
[11] Patent Number: 4,933,835
[45] Date of Patent: Jun. 12, 1990

[54] APPARATUS FOR MAINTAINING CONSISTENCY OF A CACHE MEMORY
WITH A PRIMARY MEMORY
[75] Inventors: Howard G. Sachs, Los Altos; James Y. Cho, Los Gatos;
Walter H. Hollingsworth, Campbell, all of Calif.
[73] Assignee: Intergraph Corporation, Huntsville, Ala.
[21] Appl. No.: 300,174
[22] Filed: Jan. 19, 1989

Related U.S. Application Data
[63] Continuation of Ser. No. 915,272, Oct. 3, 1986, abandoned,
Continuation-in-part of Ser. No. 704,568, Feb. 22, 1985, abandoned.

[51] Int. Cl.5 .......... G06F 9/00
[52] U.S. Cl. .......... 364/200; 364/243.4; 364/243.41; 364/243.42; 364/243.43
[58] Field of Search .......... 364/200, 100
[56] References Cited
U.S. PATENT DOCUMENTS
3,693,765   9/1972  Reiley et al. .......... 364/200
3,723,976   3/1973  Alvarez et al. ......... 364/200
3,761,881   9/1973  Anderson et al. ........ 364/200
3,764,996  10/1973  Ross ................... 364/200
3,896,419   7/1975  Lange et al. ........... 364/200
3,898,624   8/1975  Tobias ................. 364/200
3,902,164   8/1975  Kelley et al. .......... 364/200
3,956,737   5/1976  Ball ................... 364/200
4,037,209   7/1977  Nakajima et al. ........ 364/200
4,057,848  11/1977  Hayashi ................ 364/200
4,068,303   1/1978  Morita ................. 364/200
4,077,059   2/1978  Cordi et al. ........... 364/200
4,144,563   3/1979  Heuer et al. ........... 364/200
4,151,593   4/1979  Jenkins et al. ......... 364/200
(List continued on next page.)
`
FOREIGN PATENT DOCUMENTS
58-58666    4/1983  Japan .................. 364/200
60-41146    3/1985  Japan .
60-120450   6/1985  Japan .
60-144847   7/1985  Japan .
1444228     7/1976  United Kingdom ......... 364/200
`
OTHER PUBLICATIONS
Losq, et al., "Conditional Cache Miss Facility for Handling
Short/Long Cache Requests", IBM TDB, vol. 25, No. 1, Jun. '82,
pp. 110-111.
MC68120/MC68121 Intelligent Peripheral Controller Users Manual,
Motorola, Inc.
Electronics International, vol. 55, No. 16, Aug. 1982, pp. 112-117,
N.Y., US; P. Knudsen: "Supermini Goes Multiprocessor Route to Put
it up Front in Performance".
`Primary Examiner-Raulfe B. Zache
`Assistant Examiner-John G. Mills
`Attorney, Agent, or Firm—Townsend and Townsend
[57] ABSTRACT
A microprocessor system is disclosed having a high speed system
bus for coupling system elements, and having a dual bus
microprocessor with separate ultra high speed instruction and data
cache-MMU interfaces coupled to independently operable instruction
and data cache-MMUs, respectively. A main memory is coupled to the
system bus for selectively storing and outputting digital
information. The instruction and data cache-MMUs are coupled to the
main memory via the system bus for independently storing and
outputting digital information to respective mapped addressable
very high speed cache memory. The microprocessor is coupled via
separate and independent very high speed instruction and data buses
to each of the instruction cache-MMU and data cache-MMU,
respectively, for processing data received from the data cache-MMU
responsive to instructions received from the instruction cache-MMU.
The instruction bus and data bus are exclusive and independent of
one another, and allow for simultaneous very high-speed transfer.
The data cache-MMU and instruction cache-MMU each have separate
dedicated system bus interfaces for coupling to the main memory and
to other peripheral devices which are coupled to the system bus.
Numerous other system elements can also be coupled to the system
bus, including an interrupt controller, an I/O processor, a bus
arbiter, an array processor, and other peripheral controller
devices.
`23 Claims, 23 Drawing Sheets
`
`
U.S. Patent    Jun. 12, 1990    Sheet 1 of 23    4,933,835

[FIG. 1: system block diagram. Labels partly legible in the scan
include main memory, interrupt controller, array processor, and
instruction cache-MMU; the drawing itself is not recoverable.]
`
`
`
`
`
`
`
`
`
[Sheet 2 of 23: drawing and labels not recoverable from the scan.]
`
`
`
`
`
`
`
[Sheet 3 of 23: FIG. 3, instruction bus interface 1310; drawing not
recoverable. Legible labels: ISEND, input register 1312, IH, IHD
1313, multiplexors 1315 and 1319, instruction register B, branch
decoder, IADF, IASF, output register, program counter, cache advance
logic, buffer advance logic, MCLK, PADV.]
`
`
`
[Sheet 4 of 23: drawing and labels not recoverable from the scan.]
`
`
`
[Sheet 5 of 23: figure label appears to read FIG. 5; drawing not
recoverable. Legible labels: I-CAMMU, nTRM, TG, nRMW (output only),
RESET, BERR, nMSBE, AD, nRDYi, RDYo, CABSYo, MCLK, and the notes
"open collector" and "pull up resistor".]
`
`
`
[Sheet 6 of 23: drawing not recoverable from the scan. Partly
legible labels: real memory, virtual memory.]
`
`
`
`
[Sheet 7 of 23: drawing and labels not recoverable from the scan.]
`
`
`
`
`
[Sheet 8 of 23: FIG. 8A (TLB subsystem; labels VA, RA) and FIG. 8B
(HTLB); drawings not recoverable from the scan.]
`
`
`
[Sheet 9 of 23: drawing and labels not recoverable from the scan.]
`
`
`
`
`
`
`
[Sheet 10 of 23: FIGS. 10A-B (cache memory subsystem 320) and
FIGS. 11A-B (TLB memory subsystem 350); drawings not recoverable.
Legible field widths:
FIG. 10: RA = 20 bits, DT = 32 bits; the widths of E, LV, and LD
are illegible.
FIG. 11: 64 lines; VA = 14 bits, RA = 20 bits, ST = 5 bits,
PL = 4 bits, SV = 1 bit, UV = 1 bit, D = 1 bit, R = 1 bit.]
`
`
`
[Sheet 11 of 23: FIG. 12 and FIG. 13; drawings not recoverable.
Legible content follows.

FIG. 12: word addresses 100, 104, 108, 10C, 110, 114, 118, 11C,
120, 124, 128, 12C, shown in columns 3, 2, 1, 0 with the quad-word
boundaries marked.

FIG. 13: PAGE 0 through PAGE 7 and PAGE 1,048,575, spanning
addresses 0000 0000 to FFFFFFFF, with regions labelled VECTORS,
MM IO (twice), and BOOT ROM (twice), and the table:

VIRTUAL PAGE NO.   VIRTUAL ADD   REAL ADD   ST
0                  00000XXX      00000XXX   001XX
1                  00001XXX      00001XXX   001XX
2                  00002XXX      00002XXX   001XX
3                  00003XXX      00003XXX   001XX
4                  00004XXX      00000XXX   1X0XX
5                  00005XXX      00001XXX   1X0XX
6                  00006XXX      00000XXX   1X1XX
7                  00007XXX      00001XXX   1X1XX]
`
`
`
[Sheet 12 of 23: figure label illegible; the labels match the
description of FIG. 14. Drawing not recoverable. Legible labels:
cache memory subsystem, line register 400, quad-word line boundary
register 410 (AIR 2), address register VA (AIR 1), quad boundary
comparator 420, control logic 430, processor/cache bus.]
`
`
`
[Sheet 13 of 23: drawing and labels not recoverable from the scan.]
`
`
`
[Sheet 14 of 23: FIG. 17A and FIG. 17B; drawings not recoverable.
Legible labels: FIG. 17A: main memory, copy back, private page,
shared page, fast write operation. FIG. 17B: write thru, copy back,
data consistency. Further labels on the sheet: cache memory
subsystem, memory subsystem, DAT.]
`
`
`
[Sheet 15 of 23: FIG. 19, address translation data flow; drawing
not recoverable. Legible content follows.

Virtual address VA<31:0>: SEGMENT (10 bits, VA<31:22>), PAGE
(10 bits, VA<21:12>), DISP (12 bits, VA<11:0>).
STO register: supplies the segment table origin to the segment
table entry address accumulator.
Segment table in main memory: entries SEGMENT 0 through SEGMENT
1023.
Page table in main memory: entries PAGE 0 through PAGE 1023, each
with fields RA, ST, PL, D, R, PF.
Real address accumulator: forms the 32-bit real address from a
20-bit RA and the 12-bit displacement.

PF = PAGE FAULT
PL = ACCESS PROTECTION
ST = SYSTEM TAGS
D = DIRTY FLAG
R = REFERENCED FLAG]
`
`
`
[Sheet 16 of 23: drawing and labels not recoverable from the scan.]
`
`
`
`
`
[Sheet 17 of 23: drawing not recoverable from the scan; the bus and
register labels are too garbled to transcribe reliably.]
`
`
`
`
`
`
[Sheet 18 of 23: FIG. 22; drawing not recoverable. Legible labels:
ROM, ROM(51:48), ROM(47:00), PCin, PCinc, VECTOR, MUX SLT, CRhold,
STKset, signals.]
`
`
`
[Sheet 19 of 23: figure titled CAMMU CONTROL, consistent with the
description of FIG. 23; drawing not recoverable. Legible labels:
cache memory 320, cache control 830, CAWT, CAEN, CA HIT/MISS, quad
comp. 420, CMPEN, START, CPU control, system bus control, micro
engine 650, END, TLBEN, TLB memory 350, TLB HIT/MISS, TLB control
820, TLBWT.]
`
`
`
[Sheet 20 of 23: FIG. 24, CPU CONTROL; drawing not recoverable.
Legible labels: input decoder, output decode, SYSBUSY from SYSCTL,
TLBHIT, CAHIT from CACTL, END from engine, CPSTATE to engine,
INST/nDATA, MCLK, outputs to CAMMU.]
`
`
`
[Sheet 21 of 23: FIG. 25, TLB CONTROL; drawing not recoverable.
Legible labels: LOAD/STORE/TAS, protection fault level decoder, W/X
select, TLB replacement logic, TLB memory, SV W/X, UV W/X, TLBWT
from engine, TLB HIT to CPCTL, mapped I/O decoder, RESET, R/W
registers, R/W TLB.]
`
`
`
[Sheet 22 of 23: FIG. 26, CACHE CONTROL; drawing not recoverable.
Legible labels: W/X select, CA HIT, cache hit detector, cache
memory, cache replacement logic, CAWT, inputs from engine, from
SYSCTL, and from CPCTL.]
`
`
`
[Sheet 23 of 23: FIG. 27, SYSTEM BUS CONTROL; drawing not
recoverable. Legible labels: system output reg., input reg., system
read/write logic, transfer end counter, ADDR/DATA, MCLK, TG, bus
watch start, bus watch detector, mode control register, read/write,
outputs to CACTL and to the system bus.]
`
`
`
`
`APPARATUS FOR MAINTAINING CONSISTENCY
`OF A CACHE MEMORY WITH A PRIMARY
`MEMORY
`
`This application is a continuation of U.S. patent appli-
`cation Ser. No. 915,272, which is a continuation-in-part
`of U.S. patent application Ser. No. 704,568, both now
`abandoned.
`
`BACKGROUND
`
`This invention relates to computer system architec-
`tures and more particularly to a microprocessor system
`having a system bus for coupling system elements, and
`having a dual bus microprocessor with separate instruc-
`tion and data cache interfaces coupled to independently
`operable instruction and data caches which are coupled
`to the system bus.
`Prior microprocessor system architectures have pro-
`vided a single external cache subsystem for data and/or
`instructions. Such systems have typically provided for
`direct microprocessor interface to both the cache sys-
`tem and other system elements. In prior systems, a sin-
`gle address/data/control bus provided for interfacing
`to the cache system and to other system elements. Some
`newer microprocessor designs have provided a separate
`interface to a single cache system for data and/or in-
`structions. Some have additionally provided a separate
`general bus for coupling of all system elements to the
`microprocessor,
`including main memory, peripheral
`controller chips, etc. Transfer of digital information to
`and from the microprocessor in these prior art designs
`could either occur between microprocessor and the
`cache system or the microprocessor and peripheral
`controllers or main memory directly. Furthermore, the
`cache system memory cycle required address informa-
`tion from the processor to the cache system for each
`transfer of digital information to the processor from the
`cache system. While the cache system could return one
`or more words of data per cache system data transfer,
`each cache system memory access cycle required a
`separate address be provided from the processor.
`SUMMARY
`
`In accordance with the present invention, a micro-
`processor-based computing system is provided, which
`has a system bus, a main memory and instruction and
`data cache and memory management units (cache-
`MMU) coupled to the system bus. The system bus pro-
`vides for communication of digital information. The
`main memory selectively stores and outputs digital in-
`formation from an addressable high speed read-write
`memory. The instruction cache-MMU manages selec-
`tive access to the main memory via the system bus and
`provides for the selective storage and output of digital
`instruction words to a mapped addressable very high
`speed cache memory, and therefrom to the processor
`via a very high speed processor/cache bus. A data
`cache-MMU manages access to the main memory for
`selectively storing and outputting digital data words to
`and from a mapped addressable very high speed cache
`memory, to and from main memory via the system bus.
`A processor is independently coupled to each of the
`instruction cache-MMU and data cache-MMU via inde-
`pendent very high speed buses. The processor provides
`means for processing data received from the data cache-
`MMU responsive to instructions simultaneously re-
`ceived from the instruction cache-MMU.
`
`The data cache and instruction cache each have sepa-
`rate dedicated system bus interfaces for coupling to the
`main memory and to other peripheral devices coupled
`to the system bus. Numerous other system elements can
`be coupled to the system bus. These include an interrupt
`controller, an I/O processor, a bus arbiter, an array
`processor, and other peripheral interface or peripheral
`controller devices. The I/O processor provides intelli-
`gent interface to various I/O devices and other proto-
`cols and buses. The bus arbiter is coupled to the devices
`coupled to the system bus, such as the instruction and
data caches, the I/O processor, etc. The bus arbiter
`provides means for selectively resolving channel access
`conflicts between the various elements coupled to the
`system bus so as to maintain the integrity of communi-
`cations on the system bus.
`The data cache contains an address register which is
`loaded with an address from the processor prior to each
`transfer of a defined number of words of data between
`the data cache and the processor. The instruction cache
`contains a program counter which is loaded with an
`address from the processor, and which is advanced by a
`cache advance signal from the microprocessor. The
`instruction cache program counter is loaded with an
`address only during branch instructions and context
`switches. This provides for continuous transfer of in-
`structions from the instruction cache to the processor
`responsive to a single initial address until a branch or
`context switch occurs.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`These and other features and advantages of the pres-
`ent invention will become apparent from the following
`detailed description of the drawings, wherein:
`FIG. 1 illustrates a block diagram of a microproces-
`sor-based dual cache/dual bus system architecture in
`accordance with the present invention;
`FIG. 2 shows CPU 110 of FIG. 1 in more detail;
`FIG. 3 shows the CPU instruction bus interface of
`FIG. 2 in more detail;
`FIG. 4 is an electrical diagram illustrating the instruc-
`tion cache/processor bus, the data cache/processor bus,
`and the system bus;
`FIG. 5 illustrates the system bus to cache interface of
`FIG. 4 in greater detail;
`FIG. 6 is an electrical diagram illustrating the dri-
`vers/receivers between the instruction cache-MMU
`and the system bus;
`FIGS. 7A-C illustrate the virtual memory, real mem-
`ory, and virtual address concepts as utilized with the
`present invention;
`FIG. 8 illustrates an electrical block diagram of a
`cache memory management unit;
`FIG. 8A shows the translation lookaside buffer sub-
`system (TLB) in more detail;
`FIG. 8B shows the hardwired translation lookaside
buffer (HTLB) in more detail;
`FIG. 9 is a detailed block diagram of the cache mem-
`ory management unit of FIG. 8;
FIGS. 10A-B illustrate the storage structure within
the cache memory subsystem 320;
FIGS. 11A-B illustrate the TLB memory subsystem
350 storage structure in greater detail;
`FIG. 12 illustrates the cache memory quadword
`boundary organization;
`FIG. 13 illustrates the hardwired virtual to real trans-
`lations provided by the TLB subsystem;
`
`FIG. 14 illustrates the cache memory subsystem and
`affiliated cache-MMU architecture which support the
`quadword boundary utilizing line registers and line
`boundary registers;
`FIG. 15 illustrates the load timing for the cache-
`MMU systems 120 and 130 of FIG. 1;
`FIG. 16 illustrates the store operation for the cache-
`MMU systems 120 and 130 of FIG. 1, for storage from
`the CPU to the cache-MMU in copyback mode, and for
`storage from the CPU to the cache-MMU and the main
`memory for the write-through mode of operation;
FIG. 17A illustrates the data flow of store operations
in Copy-Back mode, and FIG. 17B illustrates the data
flow of operations in Write-Thru mode;
`FIG. 18 illustrates the data flow and state flow inter-
`action of the CPU, cache memory subsystem, and TLB
`memory subsystem;
`FIG. 19 illustrates the data flow and operation of the
`DAT and TLB subsystems in performing address trans-
`lation;
`FIG. 20 illustrates a block diagram of the cache-
`MMU system, including bus interface structures inter-
`nal to the cache-MMU;
`FIG. 21 is a more detailed electrical block diagram of
`FIG. 20;
`FIG. 22 is a detailed electrical block diagram of the
`control logic microengine 650 of FIG. 21;
`FIG. 23 illustrates an arrangement of the major con-
`trol and timing circuits for the Cache-MMU;
`FIG. 24 illustrates CPU control circuit 810 of FIG.
`23 in greater detail;
`FIG. 25 illustrates TLB control circuitry 820,
`TLBCTL, of FIG. 23 in greater detail;
`FIG. 26 illustrates the cache control circuit 830,
`CACTL, of FIG. 23 in greater detail; and
`FIG. 27 illustrates the System Bus control circuit 840,
`SYSCTL, of FIG. 23 in greater detail.
`DETAILED DESCRIPTION OF THE
`DRAWINGS
`
`Referring to FIG. 1, a system embodiment of the
`present invention is illustrated. A central processing
`unit 110 is coupled via separate and independent very
`high speed cache/processor buses, an instruction bus
`121 and a data bus 131, coupling to an instruction cache-
`memory management unit 120 and a data cache-mem-
`ory management unit 130, respectively, each having an
`interface to main memory 140 through system bus 141.
`Main memory 140 contains the primary storage for the
`system, and may be comprised of dynamic RAM, static
`RAM, or other medium to high speed read-write mem-
`ory. Additionally, a system status bus 115 is coupled
`from the CPU 110 to each of the instruction cache-
`memory management unit 120 and data cache-memory
`management unit 130.
Additionally, as illustrated in FIG. 1, other system
elements can be coupled to the system bus 141, such as
an I/O processing unit, IOP 150, which couples the
system bus 141 to the I/O bus 151. The I/O bus 151 may
be a standard bus interface, such as Ethernet, Unibus,
VMEbus or Multibus. I/O bus 151 can couple to the
secondary storage or other peripheral devices, such as
hard disks, floppy disks, printers, etc. Multiple IOPs can
be coupled to the system bus 141 and thereby can
communicate with the main memory 140.
The CPU 110 is also coupled via interrupt lines 111 to
an interrupt controller 170. Each of the units contending
for interrupt priority to the CPU has separate interrupt
lines coupled into the interrupt controller 170. As
illustrated in FIG. 1, the array processor 188 has an
interrupt output 165 and the IOP 150 has an interrupt
output 155. Controller 170 prioritizes and arbitrates
priority of interrupt requests to the CPU 110.
`A system clock 160 provides a master clock MCLK
`to the CPU 110, instruction cache-memory manage-
`ment unit 120 and data cache-memory management unit
`130 for synchronizing operations. In addition, a bus
`clock BCLK output from the system clock 160, pro-
`vides bus synchronization signals for transfers via the
`system bus 141, and is coupled to all system elements
`coupled to the system bus 141.
Where multiple devices request access to the system
bus 141 at the same time, a bus arbitration unit 180
is provided which prioritizes access and avoids
collisions.
`
`FIG. 2 shows CPU (processor) 110 in greater detail.
`Instructions from instruction cache-MMU 120 enter
`from instruction bus 121 to instruction bus interface unit
`1310 where they are held in prefetch buffer 1311 until
`needed for execution by instruction control unit 1320.
`Instructions are also supplied as needed from macro
`instruction unit 1330 which holds frequently used in-
`struction sequences in read only memory. Instructions
`first enter register 102 and then register 104 (instruction
`registers B and C, respectively) which form a two stage
`instruction decoding pipeline. Control signals from
`instruction decoder 103 are timed and gated to all parts
`of the processor for instruction execution. For speed of
`execution, instruction decoder 103 is preferably imple-
`mented in the form of sequential state machine logic
`circuitry rather than slower microcoded logic circuitry.
`Program counter 1321 contains the address of the in-
`struction currently being executed in instruction regis-
`ter C. The execution unit 105, comprising integer execu-
`tion unit 1340 and floating point execution unit 1350,
`executes data processing instructions. Data is received
`from and transmitted to data cache-MMU 130 over data
`cache-MMU bus 131 through data bus interface 109.
`Instruction interface 1310 of processor 110 includes a
multi-stage instruction buffer 1311 which provides means
for storing, in seriatim, a plurality of instruction parcels,
one per stage. A cache advance signal ISEND is sent by
the instruction interface when it has free space. This signals
`instruction cache-MMU 120 to provide an additional
`32-bit word containing two 16-bit instruction parcels
`via instruction bus 121. This multi-stage instruction
`buffer increases the average instruction throughput
`rate.
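The interplay between the multi-stage buffer and ISEND can be sketched in Python. This is only an illustrative model, not the disclosed circuit: the class name `PrefetchBuffer` and its methods are hypothetical, parcels are modelled as plain integers, and two points are assumptions rather than statements from the description, namely that ISEND is raised only when at least two stages are empty (room for both parcels of one 32-bit word) and that the high-order parcel is consumed before the low-order one.

```python
from collections import deque


class PrefetchBuffer:
    """Toy model of a four-stage instruction parcel buffer (hypothetical names)."""

    STAGES = 4  # four register stages, one parcel per stage

    def __init__(self):
        self.parcels = deque()  # oldest parcel at the left

    def isend(self):
        # Assumption: request a word only when both of its parcels fit.
        return self.STAGES - len(self.parcels) >= 2

    def receive_word(self, word):
        """Accept one 32-bit word as two 16-bit parcels."""
        if not self.isend():
            raise RuntimeError("no room for two parcels; ISEND would not be asserted")
        self.parcels.append((word >> 16) & 0xFFFF)  # assumed order: high parcel first
        self.parcels.append(word & 0xFFFF)          # then the low parcel

    def next_parcel(self):
        """Hand the oldest parcel to the decoder, freeing one stage."""
        return self.parcels.popleft()


buf = PrefetchBuffer()
buf.receive_word(0x12345678)
buf.receive_word(0x9ABCDEF0)
full = buf.isend()        # four parcels held, so no request
first = buf.next_parcel()
one_free = buf.isend()    # a single free stage cannot take a whole word
second = buf.next_parcel()
two_free = buf.isend()    # two free stages: a new word can be requested
```

Under these assumptions the buffer stops requesting words as soon as three of its four stages are occupied, which keeps the cache from delivering a word whose second parcel would have nowhere to go.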
`
`Responsive to the occurrence of a context switch or
`branch in the operation of the microprocessor system,
`instruction interface 1310 selectively outputs an instruc-
`tion address for storage in an instruction cache-MMU
`120 program counter. A context switch can include a
`trap, an interrupt, or initialization. The cache advance
`signal provides for selectively incrementing the instruc-
`tion cache-MMU program counter, except during a
`context switch or branch.
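A minimal sketch of this addressing scheme, in Python, under stated assumptions: the class `InstructionCachePC`, its method names, and the 4-byte step per advance are illustrative choices that do not come from the description, which says only that the counter is loaded on a branch or context switch and is otherwise incremented by the cache advance signal.

```python
class InstructionCachePC:
    """Toy model of the program counter kept in the instruction cache-MMU."""

    WORD_BYTES = 4  # assumption: one 32-bit word is delivered per advance

    def __init__(self):
        self.pc = 0
        self.addresses_sent = 0  # times the processor had to supply an address

    def load(self, address):
        """Branch or context switch: the processor supplies a new address."""
        self.pc = address
        self.addresses_sent += 1

    def advance(self):
        """Cache advance signal: return the fetch address, then step to the next word."""
        fetch_address = self.pc
        self.pc += self.WORD_BYTES
        return fetch_address


icache = InstructionCachePC()
icache.load(0x1000)                                   # initial address
straight_line = [icache.advance() for _ in range(3)]  # no addresses sent here
icache.load(0x2000)                                   # taken branch
after_branch = icache.advance()
```

In this model three sequential fetches cost a single address transfer, which is the saving the description attributes to keeping the program counter on the cache side of the bus.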
`In FIG. 3, prefetch buffer 1311 is shown in detail,
`comprising the four prefetch buffer register stages IH,
`IL, IA and IC. The IH register stage holds a 16-bit
`instruction parcel in register 1312 plus an additional bit
`of control information in register 1313, IHD, which bit
`is set to indicate whether IH currently contains a parcel.
`Each of the register stages is similarly equipped to con-
`tain an instruction parcel and an associated control bit.
Buffer advance logic circuit 1314 administers the parcel
and control bit contents of the four register stages. In
response to the parcel advance control signal PADV
from instruction decoder 103, buffer advance logic
circuit 1314 gates the next available instruction parcel
into instruction register 102 through multiplexor 1315,
and marks empty the control bit associated with the
register stage from which the parcel was obtained. In
response to the control bits of the four stages, circuit
1314 advances the parcels to fill empty register stages.
As space becomes available for new instruction parcels
from the instruction cache-MMU, cache advance logic
circuit 1316 responds to the control bits to issue the
ISEND signal on instruction bus 121. Instruction cache-MMU
responds with a 32-bit word containing two
parcels. The high order parcel is received in IH, and the
low order parcel in IL through multiplexor 1319.
On each MCLK cycle both the buffer advance and
cache advance circuits attempt to keep the prefetch
buffer stages full as conditions permit. The buffer
advance and cache advance circuits are implemented in
combinational logic in a manner that is evident to those
skilled in the art. For example, cache advance circuit 1316
produces ISEND in response to the negation of the following
boolean logic expression:
(ICD·IAD·ILD + ICD·IAD·IHD + ICD·ILD·IHD + IAD·ILD·IHD).
The first two terms indicate that IA and IC are full with
either IH or IL full. The last two terms indicate that IL
and IH are full with either IC or IA full. In all of these
cases, there is no available register space in the prefetch
buffer, while in all other cases, there is space.
Instruction parcels stored in instruction register 102
are partially decoded before being sent to instruction
register 104 to complete the decoding process. Decoding
of branch instructions is done by branch decoder
1317, a part of decoder 103, in response to instruction
register 102. In the case of a branch instruction, the
branch address is set into program counter 1321 from
the processor S bus, cache advance circuit 1316 is
inhibited from sending ISEND and the prefetch buffer is
flushed (signal path 1318). Branch decoder 1317 instead
sends IASF to the instruction cache-MMU. This causes
instruction cache-MMU 120 to take the new branch
address from cache bus 121.
The MCLK is the clock to the entire main clock (e.g.
33 MHz) logic. BCLK is the system bus clock, preferably
at either 1/2 or 1/4 of the MCLK.
For the system bus 141 synchronization, BCLK is
delivered to all the units on the system bus, i.e. IOPs,
bus arbiter, caches, interrupt controllers, the main memory
and so forth. All signals must be generated onto the
bus and be sampled on the rising edge of BCLK. The
propagation delay of the signals must be within the one
cycle of BCLK in order to guarantee the synchronous
mode of bus operation. The phase relationships between
BCLK and MCLK are strictly specified. In one embodiment,
BCLK is a 50% duty-cycle clock of twice or
four times the cycle time of MCLK, which depends
upon the physical size and loads of the system bus 141.
As illustrated in FIG. 1, the transfer of instructions is
from the instruction cache-MMU 120 to the processor
110. The transfer of data is bidirectional between the
data cache-MMU 130 and processor 110. Instruction
transfer is from the main memory 140 to the instruction
cache-MMU 120. Instruction transfer occurs whenever
an instruction is required which is not resident in the
cache memory of instruction cache-MMU 120. The
transfer of data between the data cache-MMU 130 and
main memory 140 is bidirectional. The memory management
units of the instruction cache-MMU 120 and
data cache-MMU 130 perform all memory management,
protection, and virtual to physical address translation.
As illustrated in FIGS. 1, 7A-C, and 8, the processor
110 provides virtual address outputs which have a
mapped relationship to a corresponding physical address
in main memory. The memory management units
of the instruction and data cache-MMUs 120 and 130
are responsive to the respective virtual address outputs
from the instruction and data interfaces of the processor
110, such that the memory management units selectively
provide physical address and the associated
mapped digital information for the respective virtually
addressed location. When the requested information for
the addressed location is not stored in the respective
cache-MMU memories (i.e. a cache miss), the micro
engine of the cache-MMUs provides a translated physical
address for output to the main memory 140. The
corresponding information is thereafter transferred
from the main memory 140 to the respective instruction
cache-MMU 120 or to or from the data cache-MMU
130, and as needed to the processor 110.
The two separate cache interface buses, the instruction
bus 121 and the data bus 131 are each comprised of
multiple signals. As illustrated in FIGS. 4 and 5, for one
embodiment, the signals on both the data cache bus 131
and the instruction cache bus 121 are as follows:

DATA CACHE BUS

ADF<31:0>: address/data bus
These lines are bidirectional and provide an address/data
multiplexed bus. The CPU puts an address on
these lines for one clock cycle. On store operations, the
address is followed by the data. On load or TAS (i.e.
test and set) operations, these bus lines become idle
(floating) after the address cycle, so that these lines are
ready to receive data from the Data Cache-MMU. The
Data Cache-MMU then puts the addressed data on the
lines.
FC<3:0>: function code/trap code
The CPU puts "the type of data transfer" on
FC<3:0> lines for one clock cycle at the address cycle.
The D-CACHE, or I-CACHE, sends back "the
type of trap" on abnormal operations along with TSTB
(i.e. Trap Strobe Signal).

Transfer type
(On ASF Active)

FC<3 2 1 0>
   0 0 0 0    load singleword mode
   0 0 0 1    load doubleword mode
   0 0 1 0    load byte
   0 0 1 1    load halfword
   0 1 0 0    Test and set
   1 X 0 0    store singleword
   1 X 0 1    store doubleword
   1 X 1 0    store byte
   1 X 1 1    store halfword

The D-cache puts the TRAP code on FC to respond to
the CPU.

Trap Code
(on TSTB active)

FC<3 2 1 0>
   X 0 0 0
   X 0 0 1    memory error (MSBE)
   X 0 1 0    memory error (MDBE)
   X 0 1 1
   X 1 0 0    page fault
   X 1 0 1    protection fault (READ)
   X 1 1 0
   X 1 1 1    protection fault (WRITE)
`
`
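The transfer-type table above lends itself to a small decoder. The following Python sketch is illustrative only: the function name `decode_transfer_type` and the returned strings are hypothetical, and treating FC<2> as a don't-care for stores follows the X entries in the table. Codes that the table does not list (0101 through 0111) are rejected rather than guessed.

```python
def decode_transfer_type(fc):
    """Decode a 4-bit FC<3:0> value sampled while ASF is active.

    Returns (operation, size); raises ValueError for codes the table omits.
    """
    if not 0 <= fc <= 0xF:
        raise ValueError("FC is a 4-bit field")
    sizes = {0b00: "singleword", 0b01: "doubleword", 0b10: "byte", 0b11: "halfword"}
    size = sizes[fc & 0b11]      # FC<1:0> selects the operand size
    if fc & 0b1000:              # FC<3> = 1: store, FC<2> is a don't-care
        return ("store", size)
    if fc & 0b0100:              # FC<3> = 0 and FC<2> = 1
        if fc == 0b0100:
            return ("test and set", None)
        raise ValueError(f"FC={fc:04b} is not listed in the table")
    return ("load", size)        # FC<3:2> = 00


load_half = decode_transfer_type(0b0011)
tas = decode_transfer_type(0b0100)
store_a = decode_transfer_type(0b1001)
store_b = decode_transfer_type(0b1101)  # same store, FC<2> ignored
```

The trap codes returned on TSTB reuse the same four lines, so a real interface would have to select between the two tables according to which strobe is active; that selection is left out of this sketch.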
`
`
`ASF: address strobe
ASF is activated by the CPU indicating that the 'address'
and 'type of data transfer' are valid on the
ADF<31:0> and FC<3:0> lines, respectively. ASF is
activated one half a clock cycle prior to the address
being activated on the ADF bus.
`RSP: response signal
`On load operations, the RSP signal is activated by the
`D-cache indicating that data is ready on the ADF bus.
`RSP is at the same timing as the data on the ADF bus.
`The D-cache sends data to CPU on a load operation.
`On store operations, RSP is activated when the data
`cache-MMU becomes ready to accept the next opera-
`tion.
`On load-double, RSP is sent back along with each
`data parcel transfer.
`On store-double, only one RSP is sent back after the
`second data parcel is accepted.
`TSTB: TRAP strobe
`TSTB, along with