Sachs et al.

[11] Patent Number: 4,933,835
[45] Date of Patent: Jun. 12, 1990

[54] APPARATUS FOR MAINTAINING CONSISTENCY OF A CACHE MEMORY
     WITH A PRIMARY MEMORY
[75] Inventors: Howard G. Sachs, Los Altos; James Y. Cho, Los Gatos;
     Walter H. Hollingsworth, Campbell, all of Calif.
[73] Assignee: Intergraph Corporation, Huntsville, Ala.
[21] Appl. No.: 300,174
[22] Filed: Jan. 19, 1989

Related U.S. Application Data
[63] Continuation of Ser. No. 915,272, Oct. 3, 1986, abandoned,
     continuation-in-part of Ser. No. 704,568, Feb. 22, 1985, abandoned.

[51] Int. Cl.5 .......... G06F 9/00
[52] U.S. Cl. ........... 364/200; 364/243.4; 364/243.41; 364/243.42; 364/243.43
[58] Field of Search .... 364/200, 100
[56] References Cited

U.S. PATENT DOCUMENTS
3,693,765   9/1972   Reiley et al. ....... 364/200
3,723,976   3/1973   Alvarez et al. ...... 364/200
3,761,881   9/1973   Anderson et al. ..... 364/200
3,764,996  10/1973   Ross ................ 364/200
3,896,419   7/1975   Lange et al. ........ 364/200
3,898,624   8/1975   Tobias .............. 364/200
3,902,164   8/1975   Kelley et al. ....... 364/200
3,956,737   5/1976   Ball ................ 364/200
4,037,209   7/1977   Nakajima et al. ..... 364/200
4,057,848  11/1977   Hayashi ............. 364/200
4,068,303   1/1978   Morita .............. 364/200
4,077,059   2/1978   Cordi et al. ........ 364/200
4,144,563   3/1979   Heuer et al. ........ 364/200
4,151,593   4/1979   Jenkins et al. ...... 364/200
(List continued on next page.)

FOREIGN PATENT DOCUMENTS
58-58666    4/1983   Japan ............... 364/200
60-41146    3/1985   Japan .
60-120450   6/1985   Japan .
60-144847   7/1985   Japan .
1444228     7/1976   United Kingdom ...... 364/200

OTHER PUBLICATIONS
Losq, et al., "Conditional Cache Miss Facility for Handling Short/Long
Cache Requests", IBM TDB, vol. 25, No. 1, Jun. '82, pp. 110-111.
MC68120/MC68121 Intelligent Peripheral Controller Users Manual,
Motorola, Inc.
Electronics International, vol. 55, No. 16, Aug. 1982, pp. 112-117,
N.Y., US; P. Knudsen: "Supermini Goes Multiprocessor Route to Put it
up Front in Performance".
Primary Examiner - Raulfe B. Zache
Assistant Examiner - John G. Mills
Attorney, Agent, or Firm - Townsend and Townsend

[57] ABSTRACT
A microprocessor system is disclosed having a high speed system bus
for coupling system elements, and having a dual bus microprocessor
with separate ultra high speed instruction and data cache-MMU
interfaces coupled to independently operable instruction and data
cache-MMU, respectively. A main memory is coupled to the system bus
for selectively storing and outputting digital information. The
instruction and data cache-MMU's are coupled to the main memory via
the system bus for independently storing and outputting digital
information to respective mapped addressable very high speed cache
memory. The microprocessor is coupled via separate and independent
very high speed instruction and data buses to each of the instruction
cache-MMU and data cache-MMU, respectively, for processing data
received from the data cache-MMU responsive to instructions received
from the instruction cache-MMU. The instruction bus and data bus are
exclusive and independent of one another, and allow for simultaneous
very high-speed transfer. The data cache-MMU and instruction cache-MMU
each have separate dedicated system bus interfaces for coupling to the
main memory and to other peripheral devices which are coupled to the
system bus. Numerous other system elements can also be coupled to the
system bus, including an interrupt controller, an I/O processor, a bus
arbiter, an array processor, and other peripheral controller devices.
23 Claims, 23 Drawing Sheets
`
`
`
U.S. Patent    Jun. 12, 1990    Sheet 1 of 23    4,933,835

[Sheets 1 and 2 of 23 (presumably FIGS. 1 and 2): drawings not legible in this copy.]
`
`
`
`
`
`
`
[Sheet 3 of 23: FIG. 3, instruction bus interface 1310. Legible labels: ISEND; input register 1312 (IH) with control bit 1313 (IHD); cache advance logic; multiplexers 1315 and 1319; instruction register B (102); branch decoder; program counter; buffer advance logic; output register; IADF; IASF; MCLK; PADV.]
`
`
`
[Sheet 4 of 23: FIG. 4. Legible labels: I-BUS; D-BUS; D-CAMMU; Vcc; Vss.]
`
`
`
[Sheet 5 of 23: FIG. 5. Legible labels: I-CAMMU; AD(31:0); TG(4:0); RDY/; RESET/; nRESET; nBERR; nMDBE; nTRM; nDIRO; nICA; nRMW (output only); MCLK; BCLK; BGI; BRI; open collector; pull-up resistor.]
`
`
`
[Sheets 6 and 7 of 23: drawings not legible in this copy.]
`
`
`
`
`
`
`
`
`
`
`
[Sheet 8 of 23: FIG. 8A, TLB subsystem. Legible labels: AIR; PGHIT.]
`
`
`
[Sheet 9 of 23: drawing not legible in this copy.]
`
`
`
`
`
`
`
[Sheet 10 of 23: FIGS. 10A-B and 11A-B, storage structures.
FIG. 10 (cache memory subsystem 320): 128 lines; RA = 21 bits; fields LV, LD and DT (remaining field widths not legible).
FIG. 11 (TLB memory subsystem): 64 lines; VA = 14 bits; RA = 20 bits; ST = 5 bits; PL = 4 bits; SV = 1 bit; UV = 1 bit; D = 1 bit; R = 1 bit.]
`
`
`
[Sheet 11 of 23: FIGS. 12 and 13.
FIG. 12: word addresses 100, 104, 108, 10C, 110, 114, 118, 11C, 120, 124, 128, 12C, with quad-word boundaries marked (word positions 3, 2, 1, 0).
FIG. 13: memory map from 0000 0000 to FFFF FFFF (page 0 through page 1,048,575), with regions labelled VECTORS, MM IO and BOOT ROM, and the following hardwired translations:

VIRTUAL PAGE NO.   VIRTUAL ADD   REAL ADD   ST
0                  00000XXX      00000XXX   001XX
1                  00001XXX      00001XXX   001XX
2                  00002XXX      00002XXX   001XX
3                  00003XXX      00003XXX   001XX
4                  00004XXX      00000XXX   1X0XX
5                  00005XXX      00001XXX   1X0XX
6                  00006XXX      00000XXX   1X1XX
7                  00007XXX      00001XXX   1X1XX
]
`
`
`
[Sheet 12 of 23: FIG. 14. Legible labels: cache memory subsystem; line register 400; quad-word line boundary register 410 (AIR 2); address register VA (AIR 1); quad boundary comparator 420; quad boundary control logic 430; processor/cache bus (32 bits).]
`
`
`
[Sheet 13 of 23: drawing not legible in this copy.]
`
`
`
[Sheet 14 of 23: FIGS. 17A and 17B. Legible labels: copy-back; write-through; main memory; private page; shared page; "fast write operation"; "data consistency"; cache memory subsystem; memory subsystem.]
`
`
`
[Sheet 15 of 23: FIG. 19, address translation data flow. Legible labels: virtual address VA divided into SEGMENT (10 bits), PAGE (10 bits) and DISP (12 bits); STO register; segment table entry address accumulator; segment table (segment 0 to segment 1023); page table entry address accumulator; main memory page table (page 0 to page 1023); entry fields RA, ST, PL, D, R, PF; RA to TLB; real address accumulator producing the 32-bit real address.
Legend: PF = page fault; PL = access protection; ST = system tags; D = dirty flag; R = referenced flag.]
`
`
`
[Sheets 16 and 17 of 23: drawings not legible in this copy.]
`
`
`
`
`
`
[Sheet 18 of 23: FIG. 22. Legible labels: ROM; ROM(51:48); ROM(47:00); PCin; PCinc; vector; status signals; mux SEL; CRhold; STKset; output signals.]
`
`
`
[Sheet 19 of 23: FIG. 23, CAMMU control. Legible labels: cache memory 320; cache control 830; CAWT; CAEN; CA HIT/MISS; quad comparator 420; CMPEN; CPU control 810; START; system bus control; micro engine 650; END; TLBEN; TLB memory 350; TLB HIT/MISS; TLB control 820; TLBWT.]
`
`
`
[Sheet 20 of 23: FIG. 24, CPU control. Legible labels: input decoder; output decoder; TLBHIT; SYSBUSY (from SYSCTL); CAHIT (from CACTL); CPSTATE (to engine); END (from engine); MCLK; INST/nDATA.]
`
`
`
[Sheet 21 of 23: FIG. 25, TLB control. Legible labels: LOAD/STORE/TAS; protection fault level decoder (to CPCTL); TLB memory; TLB replacement logic; write strobe generator; W/X select; TLBWT (from engine); TLB HIT (to CPCTL); memory mapped I/O decoder; RESET; R/W registers; R/W TLB.]
`
`
`
[Sheet 22 of 23: FIG. 26, cache control. Legible labels: cache replacement logic; cache hit detector; cache memory; W/X select; CA HIT; CAWT; inputs from SYSCTL, from engine and from CPCTL.]
`
`
`
[Sheet 23 of 23: FIG. 27, system bus control. Legible labels: system output register; system input register; system address/data bus; read/write logic; transfer end counter; bus clock detector; bus watch detector; mode control register; RDY; nDIR; BCLK; MCLK; TG; CBUSY; to system bus; from TLBCTL.]
`
`
`
`1
`
`4,933,835
`
`APPARATUS FOR MAINTAINING CONSISTENCY
`OF A CACHE MEMORY WITH A PRIMARY
`MEMORY
`
This application is a continuation of U.S. patent application
Ser. No. 915,272, which is a continuation-in-part of U.S. patent
application Ser. No. 704,568, both now abandoned.
`
`BACKGROUND
`
`This invention relates to computer system architec-
`tures and more particularly to a microprocessor system
`having a system bus for coupling system elements, and
`having a dual bus microprocessor with separate instruc-
`tion and data cache interfaces coupled to independently
`operable instruction and data caches which are coupled
`to the system bus.
`Prior microprocessor system architectures have pro-
`vided a single external cache subsystem for data and/or
`instructions. Such systems have typically provided for
`direct microprocessor interface to both the cache sys-
`tem and other system elements. In prior systems, a sin-
`gle address/data/control bus provided for interfacing
`to the cache system and to other system elements. Some
`newer microprocessor designs have provided a separate
`interface to a single cache system for data and/or in-
`structions. Some have additionally provided a separate
`general bus for coupling of all system elements to the
`microprocessor,
`including main memory, peripheral
`controller chips, etc. Transfer of digital information to
`and from the microprocessor in these prior art designs
`could either occur between microprocessor and the
`cache system or the microprocessor and peripheral
`controllers or main memory directly. Furthermore, the
`cache system memory cycle required address informa-
`tion from the processor to the cache system for each
`transfer of digital information to the processor from the
`cache system. While the cache system could return one
`or more words of data per cache system data transfer,
`each cache system memory access cycle required a
`separate address be provided from the processor.
`SUMMARY
`
`In accordance with the present invention, a micro-
`processor-based computing system is provided, which
`has a system bus, a main memory and instruction and
`data cache and memory management units (cache-
`MMU) coupled to the system bus. The system bus pro-
`vides for communication of digital information. The
`main memory selectively stores and outputs digital in-
`formation from an addressable high speed read-write
memory. The instruction cache-MMU manages selective
access to the main memory via the system bus and
`provides for the selective storage and output of digital
`instruction words to a mapped addressable very high
`speed cache memory, and therefrom to the processor
`via a very high speed processor/cache bus. A data
`cache-MMU manages access to the main memory for
`selectively storing and outputting digital data words to
`and from a mapped addressable very high speed cache
`memory, to and from main memory via the system bus.
`A processor is independently coupled to each of the
`instruction cache-MMU and data cache-MMU via inde-
`pendent very high speed buses. The processor provides
`means for processing data received from the data cache-
`MMU responsive to instructions simultaneously re-
`ceived from the instruction cache-MMU.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`65
`
`25
`
`2
The data cache and instruction cache each have separate
dedicated system bus interfaces for coupling to the main memory
and to other peripheral devices coupled to the system bus.
Numerous other system elements can be coupled to the system bus.
These include an interrupt controller, an I/O processor, a bus
arbiter, an array processor, and other peripheral interface or
peripheral controller devices. The I/O processor provides an
intelligent interface to various I/O devices and other protocols
and buses. The bus arbiter is coupled to the devices coupled to
the system bus, such as the instruction and data caches, the I/O
processor, etc. The bus arbiter provides means for selectively
resolving channel access conflicts between the various elements
coupled to the system bus so as to maintain the integrity of
communications on the system bus.
`The data cache contains an address register which is
`loaded with an address from the processor prior to each
`transfer of a defined number of words of data between
`the data cache and the processor. The instruction cache
`contains a program counter which is loaded with an
`address from the processor, and which is advanced by a
`cache advance signal from the microprocessor. The
`instruction cache program counter is loaded with an
`address only during branch instructions and context
`switches. This provides for continuous transfer of in-
`structions from the instruction cache to the processor
`responsive to a single initial address until a branch or
`context switch occurs.
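
The addressing behaviour described in the preceding paragraph can be sketched as a small software model. This is an illustrative sketch only, not the disclosed circuit: the class name `InstructionCacheModel`, its method names, the dict-backed memory and the 4-byte step per advance are all assumptions made for the example.

```python
class InstructionCacheModel:
    """Toy model of an instruction cache that keeps its own program counter."""

    WORD_BYTES = 4  # assumption: one 32-bit word is delivered per advance

    def __init__(self, memory):
        self.memory = memory            # hypothetical: word address -> 32-bit word
        self.program_counter = 0
        self.addresses_received = 0     # address transfers from the processor

    def load_address(self, address):
        """Processor supplies an address (branch or context switch only)."""
        self.program_counter = address
        self.addresses_received += 1

    def advance(self):
        """Cache advance signal: deliver the current word, then step the counter."""
        word = self.memory[self.program_counter]
        self.program_counter += self.WORD_BYTES
        return word


memory = {addr: 0x1000 + addr for addr in range(0, 64, 4)}
icache = InstructionCacheModel(memory)
icache.load_address(0)                          # single initial address
stream = [icache.advance() for _ in range(3)]   # sequential fetch, no new address
icache.load_address(32)                         # branch: counter is reloaded
stream.append(icache.advance())
```

In this model four words are delivered while the processor supplies only two addresses, which is the saving the paragraph above describes relative to sending one address per transfer.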
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`These and other features and advantages of the pres-
`ent invention will become apparent from the following
`detailed description of the drawings, wherein:
`FIG. 1 illustrates a block diagram of a microproces-
`sor-based dual cache/dual bus system architecture in
`accordance with the present invention;
`FIG. 2 shows CPU 110 of FIG. 1 in more detail;
`FIG. 3 shows the CPU instruction bus interface of
`FIG. 2 in more detail;
`FIG. 4 is an electrical diagram illustrating the instruc-
`tion cache/processor bus, the data cache/processor bus,
`and the system bus;
`FIG. 5 illustrates the system bus to cache interface of
`FIG. 4 in greater detail;
FIG. 6 is an electrical diagram illustrating the drivers/receivers
between the instruction cache-MMU and the system bus;
`FIGS. 7A-C illustrate the virtual memory, real mem-
`ory, and virtual address concepts as utilized with the
`present invention;
`FIG. 8 illustrates an electrical block diagram of a
`cache memory management unit;
`FIG. 8A shows the translation lookaside buffer sub-
`system (TLB) in more detail;
FIG. 8B shows the hardwired translation lookaside
buffer (HTLB) in more detail;
FIG. 9 is a detailed block diagram of the cache memory
management unit of FIG. 8;
FIGS. 10A-B illustrate the storage structure within
the cache memory subsystem 320;
FIGS. 11A-B illustrate the TLB memory subsystem
350 storage structure in greater detail;
`FIG. 12 illustrates the cache memory quadword
`boundary organization;
`FIG. 13 illustrates the hardwired virtual to real trans-
`lations provided by the TLB subsystem;
`
`25
`
`
`
`3
`FIG. 14 illustrates the cache memory subsystem and
`affiliated cache-MMU architecture which support the
`quadword boundary utilizing line registers and line
`boundary registers;
`FIG. 15 illustrates the load timing for the cache-
`MMU systems 120 and 130 of FIG. 1;
`FIG. 16 illustrates the store operation for the cache-
`MMU systems 120 and 130 of FIG. 1, for storage from
`the CPU to the cache-MMU in copyback mode, and for
`storage from the CPU to the cache-MMU and the main
`memory for the write-through mode of operation;
FIG. 17A illustrates the data flow of store operations
in Copy-Back mode, and FIG. 17B illustrates the data
flow of operations in Write-Thru mode;
`FIG. 18 illustrates the data flow and state flow inter-
`action of the CPU, cache memory subsystem, and TLB
`memory subsystem;
`FIG. 19 illustrates the data flow and operation of the
`DAT and TLB subsystems in performing address trans-
`lation;
`FIG. 20 illustrates a block diagram of the cache-
`MMU system, including bus interface structures inter-
`nal to the cache-MMU;
`FIG. 21 is a more detailed electrical block diagram of
`FIG. 20;
`FIG. 22 is a detailed electrical block diagram of the
`control logic microengine 650 of FIG. 21;
`FIG. 23 illustrates an arrangement of the major con-
`trol and timing circuits for the Cache-MMU;
`FIG. 24 illustrates CPU control circuit 810 of FIG.
`23 in greater detail;
`FIG. 25 illustrates TLB control circuitry 820,
`TLBCTL, of FIG. 23 in greater detail;
`FIG. 26 illustrates the cache control circuit 830,
`CACTL, of FIG. 23 in greater detail; and
`FIG. 27 illustrates the System Bus control circuit 840,
`SYSCTL, of FIG. 23 in greater detail.
`DETAILED DESCRIPTION OF THE
`DRAWINGS
`
`Referring to FIG. 1, a system embodiment of the
`present invention is illustrated. A central processing
`unit 110 is coupled via separate and independent very
`high speed cache/processor buses, an instruction bus
`121 and a data bus 131, coupling to an instruction cache-
`memory management unit 120 and a data cache-mem-
`ory management unit 130, respectively, each having an
`interface to main memory 140 through system bus 141.
`Main memory 140 contains the primary storage for the
`system, and may be comprised of dynamic RAM, static
`RAM, or other medium to high speed read-write mem-
`ory. Additionally, a system status bus 115 is coupled
`from the CPU 110 to each of the instruction cache-
`memory management unit 120 and data cache-memory
`management unit 130.
Additionally, as illustrated in FIG. 1, other system elements
can be coupled to the system bus 141, such as an I/O processing
unit, IOP 150, which couples the system bus 141 to the I/O bus
151. The I/O bus 151 may be a standard bus interface, such as
Ethernet, Unibus, VMEbus or Multibus. I/O bus 151 can couple to
the secondary storage or other peripheral devices, such as hard
disks, floppy disks, printers, etc. Multiple IOPs can be coupled
to the system bus 141 and thereby can communicate with the main
memory 140.
The CPU 110 is also coupled via interrupt lines 111 to an
interrupt controller 170. Each of the units contending for
interrupt priority to the CPU has separate interrupt lines coupled
into the interrupt controller 170. As illustrated in FIG. 1, the
array processor 188 has an interrupt output 165 and the IOP 150
has an interrupt output 155. Controller 170 prioritizes and
arbitrates priority of interrupt requests to the CPU 110.
A system clock 160 provides a master clock MCLK to the CPU 110,
instruction cache-memory management unit 120 and data cache-memory
management unit 130 for synchronizing operations. In addition, a
bus clock BCLK output from the system clock 160 provides bus
synchronization signals for transfers via the system bus 141, and
is coupled to all system elements coupled to the system bus 141.
Where multiple devices request access to the system bus 141 at
the same time, a bus arbitration unit 180 is provided which
prioritizes access and avoids collisions.
`
`FIG. 2 shows CPU (processor) 110 in greater detail.
`Instructions from instruction cache-MMU 120 enter
`from instruction bus 121 to instruction bus interface unit
1310 where they are held in prefetch buffer 1311 until
`needed for execution by instruction control unit 1320.
`Instructions are also supplied as needed from macro
`instruction unit 1330 which holds frequently used in-
`struction sequences in read only memory. Instructions
`first enter register 102 and then register 104 (instruction
`registers B and C, respectively) which form a two stage
`instruction decoding pipeline. Control signals from
`instruction decoder 103 are timed and gated to all parts
`of the processor for instruction execution. For speed of
execution, instruction decoder 103 is preferably implemented
in the form of sequential state machine logic
`circuitry rather than slower microcoded logic circuitry.
`Program counter 1321 contains the address of the in-
`struction currently being executed in instruction regis-
`ter C. The execution unit 105, comprising integer execu-
`tion unit 1340 and floating point execution unit 1350,
`executes data processing instructions. Data is received
`from and transmitted to data cache-MMU 130 over data
`cache-MMU bus 131 through data bus interface 109.
Instruction interface 1310 of processor 110 includes a
multi-stage instruction buffer 1311 which provides means for
storing, seriatim, a plurality of instruction parcels, one per
stage. A cache advance signal ISEND is sent by the instruction
interface whenever it has free space. This signals instruction
cache-MMU 120 to provide an additional 32-bit word containing two
16-bit instruction parcels via instruction bus 121. This
multi-stage instruction buffer increases the average instruction
throughput rate.
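
The cache advance decision and the two-parcel word can be illustrated with a short sketch. It assumes the four buffer stages IH, IL, IA and IC and the "no space" expression that this specification gives in its description of FIG. 3; the function names are hypothetical, and the mapping of the high-order parcel to bits 31:16 of the word is an assumption for the example rather than something stated in the text.

```python
from itertools import product


def isend(ihd, ild, iad, icd):
    """Return True when the cache advance signal ISEND should be issued.

    Each argument is the control bit of one stage (True = stage holds a parcel).
    ISEND is the negation of "three or more of the four stages are full".
    """
    no_space = (
        (icd and iad and ild)
        or (icd and iad and ihd)
        or (icd and ild and ihd)
        or (iad and ild and ihd)
    )
    return not no_space


def split_word(word):
    """Split a 32-bit word into (high-order parcel, low-order parcel)."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


# The four-term expression is equivalent to: ISEND exactly when at least
# two stages are empty, i.e. when a whole two-parcel word still fits.
for bits in product([False, True], repeat=4):
    assert isend(*bits) == (sum(bits) <= 2)

high, low = split_word(0x12345678)
```

Writing the condition as a count of full stages makes the intent easier to see: a new word is requested only while both of its parcels can be stored.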
`
`Responsive to the occurrence of a context switch or
`branch in the operation of the microprocessor system,
`instruction interface 1310 selectively outputs an instruc-
`tion address for storage in an instruction cache-MMU
`120 program counter. A context switch can include a
`trap, an interrupt, or initialization. The cache advance
`signal provides for selectively incrementing the instruc-
`tion cache-MMU program counter, except during a
`context switch or branch.
`In FIG. 3, prefetch buffer 1311 is shown in detail,
`comprising the four prefetch buffer register stages IH,
IL, IA and IC. The IH register stage holds a 16-bit
instruction parcel in register 1312 plus an additional bit
`of control information in register 1313, IHD, which bit
`is set to indicate whether IH currently contains a parcel.
`Each of the register stages is similarly equipped to con-
`tain an instruction parcel and an associated control bit.
`Buffer advance logic circuit 1314 administers the parcel
`
`26
`
`26
`
`
`
`4,933,835
`
`6
`
`agement units of the instruction cache-MMU 120 and
`data cache-MMU 130 perform all memory manage-
`ment, protection, and virtual to physical address trans-
`lation.
`-
`As illustrated in FIGS. 1, 7A-C, and 8, the processor
`110 provides virtual address outputs which have a
`mapped relationship to a corresponding physical ad-
`dress in main memory. The memory management units
`of the instruction and data cache-MMUs 120 and 130
`are responsive to the respective virtual address outputs
`from the instruction and data interfaces of the processor
`110, such that the memory management units selec-
`tively provide physical address and the associated
`mapped digital information for the respective virtually
`addressed location. When the requested information for
`the addressed location is not stored in the respective
`cache-MMU memories (i.e. a cache miss), the micro
`engine of the cache-MMUs provides a translated physi-
`cal address for output to the main memory 140. The
`corresponding information is
`thereafter
`transferred
`from the main memory 140 to the respective instruction
`cache-MMU 120 or to or from the data cache-MMU
`130, and as needed to the processor 110.
`The two separate cache interface buses, the instruc-
`tion bus 121 and the data bus 131 are each comprised of
`multiple signals. As illustrated in FIGS. 4 and 5, for one
`embodiment, the signals on both the data cache bus 131
`and the instruction cache bus 121 are as follows:
`
`DATA CACHE BUS
`
`ADF<31:0>: address/data bus
`These lines are bidirectional and provide an address-
`/data multiplexed bus. The CPU puts an address on
`these lines for one clock cycle. 0n store operations, the
`address is followed by the data. On load or TAS (i.e.
`test and set) operations,
`these bus lines become idle
`(floating) after the address cycle, so that these lines are
`ready to receive data from the Data Cache-MMU. The
`Data Cache-MMU then puts the addressed data on the
`lines.
`FC<3=O>= function code/trap code
`The CPU puts “the type of data transfer" on
`FC<3:0> lines for one clock cycle at the address cy-
`cle. The D-CACHE, or I-CACHE, sends back “the
`type of trap" on abnormal operations along with TSTB
`(i.e. Trap Strobe Signal).
`
`
`Transfer tm
`
`On ASF Active
`
`FC < 3
`2
`l
`D
`>
`0
`0
`0
`0
`load singleword mode
`0
`D
`0
`1
`load doubleword mode
`0
`0
`1
`0
`load byte
`0
`D
`l
`I
`load halfword
`O
`l
`O
`0
`Test and set
`1
`X
`0
`0
`store singieword
`1
`X
`0
`1
`store doubleword
`1
`X
`1
`0
`store byte
`
`1 1X1 store halfword
`
`
`
`
`The D-cache puts the TRAP code on PC to respond to
`the CPU.
`
`
`Trap Code
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`65
`
`(on TSTB active!
`FC < 3
`2
`l
`X
`0
`0
`
`0
`
`0
`
`>
`
`5
`and control bit contents of the four register stages. In
`response to the parcel advance control signal PADV
`from instruction decoder 103, buffer advance logic
`circuit 1314 gates the next available instruction parcel
`into instruction register 102 through multiplexer 1315,
`and marks empty the control bit associated with the
`register stage from which the parcel was obtained. In
`response to the control bits of the four stages, circuit
`1314 advances the parcels to fill empty register stages.
`As space becomes available for new instruction parcels
`from the instruction cache-MMU, cache advance logic
`circuit 1316 responds to the control bits to issue the
`ISEND signal on instruction bus 121. Instruction cache-
`MMU responds with a 32-bit word containing two
`parcels. The high order parcel is received in IR, and the
`low order parcel in IL through multiplexor 1319.
`On each MCLK cycle both the buffer advance and
`cache advance circuits attempt to keep the prefetch
`buffer stages full as conditions permit. The buffer ad-
`vance and cache circuits are implemented in combina-
`tional logic in a manner that is evident to those skilled in
`the art. For example, cache advance circuit 1316 pro-
`duces ISEND in response to the negation of the follow-
`ing boolean logic expression: (ICD,IAD,ILD+ICD-
`,IAD,IHD+ICD,ILD,II-ID+IAD,ILD,IHD).
`The
`first two terms indicates that IA and IC are full with
`either IH or IL full. The last two terms indicate that IL
`and 1H are full with either 1C or IA full. In all of these
`cases, there is no available register space in the prefetch
`buffer, while in all other cases, there is space.
`Instruction parcels stored in instruction register 102
`are partially decoded before being sent to instruction
`register 104 to complete the decoding process. Decod-
`ing of branch instructions is done by branch decoder
`1317, a part of decoder 103, in response to instruction
`register 102. In the case of a branch instruction, the
`branch address is set into program counter 1321 from
`the processor S bus, cache advance circuit 1316 is inhib-
`ited from sending ISEND and the prefetch buffer is
`flushed (signal path 1318). Branch decoder 1317 instead
`sends IASF to the instruction cache-MMU. This causes
`instruction cache-MMU 120 to take the new branch
`address from cache bus 121.
`The MCLK is the clock to the entire main clock, (e.g.
`33 MHz), logic. BCLK is the system bus clock, prefera-
`bly at either i or i of the MCLK.
`For the system bus 141 synchronization, BCLK is
`delivered to all the units on the system bus, i.e. IOPs,
`bus arbiter, caches, interrupt controllers, the main mem-
`ory and so forth. All signals must be generated onto the
`bus and be sampled on the rising edge of BCLK. The
`propagation delay of the signals must be within the one
`cycle of BCLK in order to guarantee the synchronous
`mode of bus operation. The phase relationships between
`BCLK and MCLK are strictly specified. In one em-
`bodiment, BCLK is a 50% duty-cycle clock of twice or
`four times the cycle time of MCLK, which depends
`upon the physical size and loads of the system bus 141.
`As illustrated in FIG. 1, the transfer of instructions is
`from the instruction cache-MMU 120 to the processor
`110. The transfer of data is bidirectional between the
`data cache-MMU 130 and processor 110. Instruction
`transfer is from the main memory 140 to the instruction
`cache-MMU 120. Instruction transfer occurs whenever
`an instruction is required which is not resident in the
`cache memory of instruction cache-MMU 120. The
`transfer of data between the data cache-MMU 130 and
`main memory 140 is bidirectional. The memory man-
`
`27
`
`27
`
`
`
`4,933,835
`
`7
`-continued
`
`Trap Code
`
`son TSTB active!
`
`FC < 3
`2
`l
`0
`>
`X
`0
`0
`1
`memory error (MSBE)
`X
`0
`l
`0
`memory error (MDBE)
`X
`0
`l
`l
`X
`l
`0
`0
`X
`l
`0
`1
`X
`l
`l
`O
`X
`l
`l
`l
`
`page fault
`protection fault (READ)
`protection fault (WRITE)
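
The function-code and trap-code tables above can be read as two lookup tables with don't-care bits. The following is a minimal sketch under that reading, not part of the disclosure: the names `decode_transfer` and `decode_trap` and the dictionaries are hypothetical, and returning `None` for codes that the tables leave without a description is a choice made for the example.

```python
# Transfer types driven by the CPU on FC<3:0> while ASF is active.
LOADS = {
    0b0000: "load singleword",
    0b0001: "load doubleword",
    0b0010: "load byte",
    0b0011: "load halfword",
    0b0100: "test and set",
}
# For stores FC<3> is 1, FC<2> is a don't-care, FC<1:0> selects the size.
STORE_SIZES = {0b00: "singleword", 0b01: "doubleword", 0b10: "byte", 0b11: "halfword"}

# Trap codes returned on FC<3:0> while TSTB is active; FC<3> is a don't-care.
TRAPS = {
    0b001: "memory error (MSBE)",
    0b010: "memory error (MDBE)",
    0b101: "page fault",
    0b110: "protection fault (READ)",
    0b111: "protection fault (WRITE)",
}


def decode_transfer(fc):
    """Decode a 4-bit function code into a transfer type, or None if unlisted."""
    if not 0 <= fc <= 0b1111:
        raise ValueError("FC is a 4-bit field")
    if fc & 0b1000:
        return "store " + STORE_SIZES[fc & 0b11]
    return LOADS.get(fc)


def decode_trap(fc):
    """Decode a 4-bit trap code, or None where the table gives no description."""
    if not 0 <= fc <= 0b1111:
        raise ValueError("FC is a 4-bit field")
    return TRAPS.get(fc & 0b0111)
```

For example, `decode_transfer(0b1010)` and `decode_transfer(0b1110)` both give "store byte", because FC<2> is ignored for stores.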
`
ASF: address strobe
ASF is activated by the CPU indicating that the 'address' and
'type of data transfer' are valid on the ADF<31:0> and FC<3:0>
lines, respectively. ASF is activated one half a clock cycle prior
to the address being activated on the ADF bus.
`RSP: response signal
`On load operations, the RSP signal is activated by the
`D-cache indicating that data is ready on the ADF bus.
`RSP is at the same timing as the data on the ADF bus.
`The D-cache sends data to CPU on a load operation.
`On store operations, RSP is activated when the data
`cache-MMU becomes ready to accept the next opera-
`tion.
`On load-double, RSP is sent back along with each
`data parcel transf