`
`ARM Ex. 1006
`IPR Petition - USP 5,463,750
`
`
`
U.S. Patent    Jun. 12, 1990    Sheet 1 of 23    4,933,835

[Drawing sheet 1 of 23: figure content not recoverable from the scan.]

ARM_VPT_IPR_00000168
`
`
`
`
`
`
`
[Drawing sheet 2 of 23: figure content not recoverable from the scan.]
`
`
`
`
`
[Drawing sheet 3 of 23. Legible labels: ISEND, CACHE ADVANCE LOGIC.]
`
`
`
[Drawing sheet 4 of 23. Legible labels: CPU, Vcc, D-BUS.]
`
`
`
[Drawing sheet 5 of 23. Legible labels: LOCK, RESET/, nRESET, nBERR, nICA, nTRM, BGI, BCLK.]
`
`
`
[Drawing sheet 6 of 23: figure content not recoverable from the scan.]
`
`
`
`
`
[Drawing sheet 7 of 23: figure content not recoverable from the scan.]
`
`
`
`
`
`
[Drawing sheet 8 of 23: FIG. 8A, TLB SUBSYSTEM. Legible labels: AIR, PGHIT.]
`
`
`
[Drawing sheet 9 of 23: figure content not recoverable from the scan.]
`
`
`
`
[Drawing sheet 10 of 23: FIG. 10A and related storage-structure figures. Legible field widths: RA: 21 bits, DT: 32 bits, E: 1 bit, LV: 1 bit, LD: 1 bit; ST: 5 bits, PL: 4 bits, SV: 1 bit, UV: 1 bit.]
`
`
`
[Drawing sheet 11 of 23. Legible labels: quad-word boundary; table columns VIRTUAL PAGE NO., VIRTUAL ADD, REAL ADD.]
`
`
`
[Drawing sheet 12 of 23: FIG. 14. Legible labels: Cache Memory Subsystem, Processor/Cache Bus, Quad-Word Line Boundary Register, Quad Boundary Comparator, Control Logic.]
`
`
`
[Drawing sheet 13 of 23: figure content not recoverable from the scan.]
`
`
`
[Drawing sheet 14 of 23: FIGS. 17A and 17B. Legible label: (Fast Write Operation).]
`
`
`
[Drawing sheet 15 of 23: FIG. 19. Legible labels: Table Entry Address Accumulator, TO TLB. Legend: PF = page fault, PL = access protection, ST = system tags, D = dirty flag, R = referenced flag.]
`
`
`
[Drawing sheet 16 of 23: figure content not recoverable from the scan.]
`
`
`
`
[Drawing sheet 17 of 23: figure content not recoverable from the scan.]
`
`
`
`
[Drawing sheet 18 of 23: FIG. 22. Legible labels: ROM, OR, OUTPUT Signals, PCinc.]
`
`
`
[Drawing sheet 19 of 23: FIG. 23, CAMMU CONTROL. Legible labels: CACHE CONTROL, SYSTEM BUS CONTROL, TLB CONTROL, MEMORY.]
`
`
`
[Drawing sheet 20 of 23: FIG. 24, CPU CONTROL. Legible labels: INPUT DECODER, SYSBUSY FROM SYSCTL, FROM TLBHIT, CPSTATE, TO ENGINE, MCLK, INST/nDATA.]
`
`
`
[Drawing sheet 21 of 23: FIG. 25, TLB CONTROL. Legible labels: LOAD/STORE/TAS, PROTECTION FAULT LEVEL DECODER, TO CPCTL, W/X SELECT LOGIC, WRITE STROBE GENERATOR, TLBWT FROM ENGINE, TLB HIT GENERATOR, MAPPED I/O DECODER, RESET, R/W REGISTERS, R/W TLB.]
`
`
`
[Drawing sheet 22 of 23: FIG. 26, CACHE CONTROL. Legible labels: CACHE MEMORY, CACHE REPLACEMENT LOGIC, W/X SELECT, CACHE HIT DETECTOR, QADR FROM SYSCTL, CAWT FROM ENGINE.]
`
`
`
[Drawing sheet 23 of 23: FIG. 27, SYSTEM BUS CONTROL. Legible labels: SYSTEM OUTPUT REG., SYSTEM INPUT REG., TO SYSTEM BUS, BCLK, MCLK, READ/WRITE LOGIC, TRANSFER END COUNTER, BUS CLOCK DETECTOR, BUS WATCH DETECTOR, MODE CONTROL REGISTER.]
`
`
`
`1
`
`4,933,835
`
`APPARATUS FOR MAINTAINING CONSISTENCY
`OF A CACHE MEMORY WITH A PRIMARY
`MEMORY
`
This application is a continuation of U.S. patent application
Ser. No. 915,272, which is a continuation-in-part of U.S. patent
application Ser. No. 704,568, both now abandoned.
`
`BACKGROUND
`
This invention relates to computer system architectures and more
particularly to a microprocessor system having a system bus for
coupling system elements, and having a dual bus microprocessor with
separate instruction and data cache interfaces coupled to
independently operable instruction and data caches which are coupled
to the system bus.
Prior microprocessor system architectures have provided a single
external cache subsystem for data and/or instructions. Such systems
have typically provided for direct microprocessor interface to both
the cache system and other system elements. In prior systems, a single
address/data/control bus provided for interfacing to the cache system
and to other system elements. Some newer microprocessor designs have
provided a separate interface to a single cache system for data and/or
instructions. Some have additionally provided a separate general bus
for coupling of all system elements to the microprocessor, including
main memory, peripheral controller chips, etc. Transfer of digital
information to and from the microprocessor in these prior art designs
could occur either between the microprocessor and the cache system or
between the microprocessor and peripheral controllers or main memory
directly. Furthermore, the cache system memory cycle required address
information from the processor to the cache system for each transfer
of digital information to the processor from the cache system. While
the cache system could return one or more words of data per cache
system data transfer, each cache system memory access cycle required
that a separate address be provided from the processor.
`SUMMARY
`
In accordance with the present invention, a microprocessor-based
computing system is provided, which has a system bus, a main memory
and instruction and data cache and memory management units
(cache-MMU) coupled to the system bus. The system bus provides for
communication of digital information. The main memory selectively
stores and outputs digital information from an addressable high speed
read-write memory. The instruction cache-MMU manages selective access
to the main memory via the system bus and provides for the selective
storage and output of digital instruction words to its mapped
addressable very high speed cache memory, and therefrom to the
processor via a very high speed processor/cache bus. A data cache-MMU
manages access to the main memory for selectively storing and
outputting digital data words to and from a mapped addressable very
high speed cache memory, to and from main memory via the system bus.
A processor is independently coupled to each of the instruction
cache-MMU and data cache-MMU via independent very high speed buses.
The processor provides means for processing data received from the
data cache-MMU responsive to instructions simultaneously received
from the instruction cache-MMU.
`
`
`2
The data cache and instruction cache each have separate dedicated
system bus interfaces for coupling to the main memory and to other
peripheral devices coupled to the system bus. Numerous other system
elements can be coupled to the system bus. These include an interrupt
controller, an I/O processor, a bus arbiter, an array processor, and
other peripheral interface or peripheral controller devices. The I/O
processor provides intelligent interface to various I/O devices and
other protocols and buses. The bus arbiter is coupled to the devices
coupled to the system bus, such as the instruction and data caches,
the I/O processor, etc. The bus arbiter provides means for selectively
resolving channel access conflicts between the various elements
coupled to the system bus so as to maintain the integrity of
communications on the system bus.
The data cache contains an address register which is loaded with an
address from the processor prior to each transfer of a defined number
of words of data between the data cache and the processor. The
instruction cache contains a program counter which is loaded with an
address from the processor, and which is advanced by a cache advance
signal from the microprocessor. The instruction cache program counter
is loaded with an address only during branch instructions and context
switches. This provides for continuous transfer of instructions from
the instruction cache to the processor responsive to a single initial
address until a branch or context switch occurs.
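
The program counter behavior just described can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class name, the event tuples, and the 4-byte step per advance are hypothetical choices made for illustration.

```python
class InstructionCachePC:
    """Toy model of the instruction cache program counter (hypothetical names)."""

    WORD_BYTES = 4  # assumption: each advance steps over one 32-bit word

    def __init__(self):
        self.pc = None  # no address has been loaded yet

    def load(self, address):
        """The processor supplies an address (branch or context switch)."""
        self.pc = address

    def advance(self):
        """Cache advance signal: return the current fetch address, then step."""
        if self.pc is None:
            raise RuntimeError("program counter not loaded")
        fetched = self.pc
        self.pc += self.WORD_BYTES
        return fetched


def fetch_trace(events):
    """Replay ('load', addr) and ('advance',) events; return fetched addresses."""
    icache = InstructionCachePC()
    trace = []
    for event in events:
        if event[0] == "load":
            icache.load(event[1])
        elif event[0] == "advance":
            trace.append(icache.advance())
        else:
            raise ValueError(f"unknown event {event[0]!r}")
    return trace
```

In this model the processor sends one address and then only advance events, so a straight-line run of instructions needs no further addresses until the next load.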
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
These and other features and advantages of the present invention
will become apparent from the following detailed description of the
drawings, wherein:
`FIG. 1 illustrates a block diagram of a microproces-
`sor-based dual cache/dual bus system architecture in
`accordance with the present invention;
`FIG. 2 shows CPU 110 of FIG. 1 in more detail;
`FIG. 3 shows the CPU instruction bus interface of
`FIG. 2 in more detail;
`FIG. 4 is an electrical diagram illustrating the instruc-
`tion cache/processor bus, the data cache/processor bus,
`and the system bus;
`FIG. 5 illustrates the system bus to cache interface of
`FIG. 4 in greater detail;
`FIG. 6 is an electrical diagram illustrating the dri-
`vers/receivers between the instruction cache-MMU
`and the system bus;
FIGS. 7A-C illustrate the virtual memory, real memory, and virtual
address concepts as utilized with the present invention;
FIG. 8 illustrates an electrical block diagram of a cache memory
management unit;
FIG. 8A shows the translation lookaside buffer subsystem (TLB) in
more detail;
FIG. 8B shows the hardwired translation lookaside buffer (HTLB) in
more detail;
FIG. 9 is a detailed block diagram of the cache memory management
unit of FIG. 8;
FIGS. 10A-B illustrate the storage structure within the cache memory
subsystem 320;
FIGS. 11A-B illustrate the TLB memory subsystem 350 storage
structure in greater detail;
`FIG. 12 illustrates the cache memory quadword
`boundary organization;
`FIG. 13 illustrates the hardwired virtual to real trans-
`lations provided by the TLB subsystem;
`
`
`
`4,933,835
`
`3
FIG. 14 illustrates the cache memory subsystem and affiliated
cache-MMU architecture which support the quadword boundary utilizing
line registers and line boundary registers;
FIG. 15 illustrates the load timing for the cache-MMU systems 120
and 130 of FIG. 1;
FIG. 16 illustrates the store operation for the cache-MMU systems
120 and 130 of FIG. 1, for storage from the CPU to the cache-MMU in
copy-back mode, and for storage from the CPU to the cache-MMU and the
main memory for the write-through mode of operation;
FIG. 17A illustrates the data flow of store operations in Copy-Back
mode, and FIG. 17B illustrates the data flow of operations in
Write-Thru mode;
FIG. 18 illustrates the data flow and state flow interaction of the
CPU, cache memory subsystem, and TLB memory subsystem;
FIG. 19 illustrates the data flow and operation of the DAT and TLB
subsystem in performing address translation;
FIG. 20 illustrates a block diagram of the cache-MMU system,
including bus interface structures internal to the cache-MMU;
FIG. 21 is a more detailed electrical block diagram of FIG. 20;
FIG. 22 is a detailed electrical block diagram of the control logic
microengine 650 of FIG. 21;
FIG. 23 illustrates an arrangement of the major control and timing
circuits for the cache-MMU;
FIG. 24 illustrates CPU control circuit 810 of FIG. 23 in greater
detail;
FIG. 25 illustrates TLB control circuitry 820, TLBCTL, of FIG. 23 in
greater detail;
FIG. 26 illustrates the cache control circuit 830, CACTL, of FIG. 23
in greater detail; and
FIG. 27 illustrates the System Bus control circuit 840, SYSCTL, of
FIG. 23 in greater detail.
`DETAILED DESCRIPTION OF THE
`DRAWINGS
`
Referring to FIG. 1, a system embodiment of the present invention is
illustrated. A central processing unit 110 is coupled via separate and
independent very high speed cache/processor buses, an instruction bus
121 and a data bus 131, coupling to an instruction cache-memory
management unit 120 and a data cache-memory management unit 130,
respectively, each having an interface to main memory 140 through
system bus 141. Main memory 140 contains the primary storage for the
system, and may be comprised of dynamic RAM, static RAM, or other
medium to high speed read-write memory. Additionally, a system status
bus 115 is coupled from the CPU 110 to each of the instruction
cache-memory management unit 120 and data cache-memory management
unit 130.
Additionally, as illustrated in FIG. 1, other system elements can be
coupled to the system bus 141, such as an I/O processing unit, IOP
150, which couples the system bus 141 to the I/O bus 151. The I/O bus
151 may be a standard bus interface, such as Ethernet, Unibus, VMEbus
or Multibus. I/O bus 151 can couple to the secondary storage or other
peripheral devices, such as hard disks, floppy disks, printers, etc.
Multiple IOPs can be coupled to the system bus 141 and thereby can
communicate with the main memory 140.
`The CPU 110 is also coupled via interrupt lines 111 to
`an interrupt controller 170. Each of the units contend-
`ing for interrupt priority to the CPU has separate inter-
`
`
`4
`rupt lines coupled into the interrupt controller 170. As
illustrated in FIG. 1, the array processor 183 has an interrupt
output 165 and the IOP 150 has an interrupt output 155. Controller
170 prioritizes and arbitrates priority of interrupt requests to the
CPU 110.
A system clock 160 provides a master clock MCLK to the CPU 110,
instruction cache-memory management unit 120 and data cache-memory
management unit 130 for synchronizing operations. In addition, a bus
clock BCLK output from the system clock 160 provides bus
synchronization signals for transfers via the system bus 141, and is
coupled to all system elements coupled to the system bus 141.
Where multiple devices request access to the system bus 141 at the
same time, a bus arbitration unit 180 is provided which prioritizes
access and avoids collisions.
`
FIG. 2 shows CPU (processor) 110 in greater detail. Instructions
from instruction cache-MMU 120 enter from instruction bus 121 to
instruction bus interface unit 1310 where they are held in prefetch
buffer 1311 until needed for execution by instruction control unit
1320. Instructions are also supplied as needed from macro instruction
unit 1330 which holds frequently used instruction sequences in read
only memory. Instructions first enter register 102 and then register
104 (instruction registers B and C, respectively) which form a two
stage instruction decoding pipeline. Control signals from instruction
decoder 103 are timed and gated to all parts of the processor for
instruction execution. For speed of execution, instruction decoder 103
is preferably implemented in the form of sequential state machine
logic circuitry rather than slower microcoded logic circuitry.
Program counter 1321 contains the address of the instruction currently
being executed in instruction register C. The execution unit 105,
comprising integer execution unit 1340 and floating point execution
unit 1350, executes data processing instructions. Data is received
from and transmitted to data cache-MMU 130 over data cache-MMU bus
131 through data bus interface 109.
Instruction interface 1310 of processor 110 includes a multi-stage
instruction buffer 1311 which provides means for storing, in
seriatim, a plurality of instruction parcels, one per stage. A cache
advance signal ISEND is sent by the instruction interface as it has
free space. This signals instruction cache-MMU 120 to provide an
additional 32-bit word containing two 16-bit instruction parcels via
instruction bus 121. This multi-stage instruction buffer increases the
average instruction throughput rate.
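
As a small illustration of the word format mentioned above, the following sketch splits one 32-bit instruction-bus word into its two 16-bit parcels. The function name is hypothetical, and treating the word as an unsigned Python integer is an assumption made only for this example.

```python
def split_parcels(word):
    """Split a 32-bit word into (high-order parcel, low-order parcel)."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit word")
    high = (word >> 16) & 0xFFFF  # upper 16 bits
    low = word & 0xFFFF           # lower 16 bits
    return high, low
```

For example, `split_parcels(0x12345678)` yields the parcels `0x1234` and `0x5678`.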
`
Responsive to the occurrence of a context switch or branch in the
operation of the microprocessor system, instruction interface 1310
selectively outputs an instruction address for storage in an
instruction cache-MMU 120 program counter. A context switch can
include a trap, an interrupt, or initialization. The cache advance
signal provides for selectively incrementing the instruction
cache-MMU program counter, except during a context switch or branch.
In FIG. 3, prefetch buffer 1311 is shown in detail, comprising the
four prefetch buffer register stages IH, IL, IA and IC. The IH
register stage holds a 16-bit instruction parcel in register 1312 plus
an additional bit of control information in register 1313, IHD, which
bit is set to indicate whether IH currently contains a parcel. Each of
the register stages is similarly equipped to contain an instruction
parcel and an associated control bit. Buffer advance logic circuit
1314 administers the parcel
`
`
`6
`
`agement units of the instruction cache-MMU 120 and
data cache-MMU 130 perform all memory management, protection, and
virtual to physical address translation.
As illustrated in FIGS. 1, 7A-C, and 8, the processor 110 provides
virtual address outputs which have a mapped relationship to a
corresponding physical address in main memory. The memory management
units of the instruction and data cache-MMUs 120 and 130 are
responsive to the respective virtual address outputs from the
instruction and data interfaces of the processor 110, such that the
memory management units selectively provide physical address and the
associated mapped digital information for the respective virtually
addressed location. When the requested information for the addressed
location is not stored in the respective cache-MMU memories (i.e., a
cache miss), the microengine of the cache-MMUs provides a translated
physical address for output to the main memory 140. The corresponding
information is thereafter transferred from the main memory 140 to the
respective instruction cache-MMU 120 or to or from the data cache-MMU
130, and as needed to the processor 110.
The two separate cache interface buses, the instruction bus 121 and
the data bus 131, are each comprised of multiple signals. As
illustrated in FIGS. 4 and 5, for one embodiment, the signals on both
the data cache bus 131 and the instruction cache bus 121 are as
follows:
`
`DATA CACHE BUS
`
ADF<31:0>: address/data bus
These lines are bidirectional and provide an address/data multiplexed
bus. The CPU puts an address on these lines for one clock cycle. On
store operations, the address is followed by the data. On load or TAS
(i.e., test and set) operations, these bus lines become idle
(floating) after the address cycle, so that these lines are ready to
receive data from the Data Cache-MMU. The Data Cache-MMU then puts the
addressed data on the lines.
FC<3:0>: function code/trap code
The CPU puts "the type of data transfer" on FC<3:0> lines for one
clock cycle at the address cycle. The D-CACHE or I-CACHE sends back
"the type of trap" on abnormal operations along with TSTB (i.e., Trap
Strobe Signal).
`
`
Transfer type (on ASF active)

FC<3 2 1 0>
   0 0 0 0    load singleword mode
   0 0 0 1    load doubleword mode
   0 0 1 0    load byte
   0 0 1 1    load halfword
   0 1 0 0    test and set
   1 X 0 0    store singleword
   1 X 0 1    store doubleword
   1 X 1 0    store byte
   1 X 1 1    store halfword
`
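
The transfer-type encoding above can be decoded in a few lines. This is a minimal sketch, assuming FC<3:0> is presented as a 4-bit integer with FC3 as the most significant bit; the function name is hypothetical, and codes that the table does not list are rejected rather than guessed.

```python
SIZES = ("singleword", "doubleword", "byte", "halfword")  # indexed by FC<1:0>


def decode_transfer(fc):
    """Return the transfer type named by a 4-bit function code FC<3:0>."""
    if not 0 <= fc <= 0b1111:
        raise ValueError("FC is a 4-bit code")
    fc3 = (fc >> 3) & 1
    fc2 = (fc >> 2) & 1
    size = SIZES[fc & 0b11]
    if fc3:
        # Stores: FC2 is a don't-care (X) in the table.
        return f"store {size}"
    if fc2:
        # Only 0100 (test and set) is listed with FC3=0, FC2=1.
        if fc & 0b11:
            raise ValueError(f"code {fc:04b} is not listed in the table")
        return "test and set"
    return f"load {size}"
```

Because FC2 is a don't-care for stores, `0b1010` and `0b1110` both decode to a byte store in this sketch.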
`
`
`
The D-cache puts the TRAP code on FC to respond to the CPU.

Trap Code (on TSTB active)

FC<3 2 1 0>
   X 0 0 0
`
`5
and control bit contents of the four register stages. In response to
the parcel advance control signal PADV from instruction decoder 103,
buffer advance logic circuit 1314 gates the next available instruction
parcel into instruction register 102 through multiplexor 1315, and
marks empty the control bit associated with the register stage from
which the parcel was obtained. In response to the control bits of the
four stages, circuit 1314 advances the parcels to fill empty register
stages. As space becomes available for new instruction parcels from
the instruction cache-MMU, cache advance logic circuit 1316 responds
to the control bits to issue the ISEND signal on instruction bus 121.
Instruction cache-MMU responds with a 32-bit word containing two
parcels. The high order parcel is received in IH, and the low order
parcel in IL through multiplexor 1319.
On each MCLK cycle both the buffer advance and cache advance
circuits attempt to keep the prefetch buffer stages full as conditions
permit. The buffer advance and cache advance circuits are implemented
in combinational logic in a manner that is evident to those skilled in
the art. For example, cache advance circuit 1316 produces ISEND in
response to the negation of the following boolean logic expression:
(ICD.IAD.ILD + ICD.IAD.IHD + ICD.ILD.IHD + IAD.ILD.IHD). The first two
terms indicate that IA and IC are full with either IL or IH full. The
last two terms indicate that IL and IH are full with either IC or IA
full. In all of these cases, there is no available register space in
the prefetch buffer, while in all other cases, there is space.
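
The boolean expression for the cache advance circuit can be checked by enumeration. The sketch below is a software illustration only, with hypothetical function names; it reads the four control bits ICD, IAD, ILD and IHD as truth values and evaluates the sum-of-products form with "." as AND and "+" as OR.

```python
from itertools import product


def prefetch_full(icd, iad, ild, ihd):
    """Sum-of-products from the text: true when any three stages are full."""
    return bool(
        (icd and iad and ild)
        or (icd and iad and ihd)
        or (icd and ild and ihd)
        or (iad and ild and ihd)
    )


def isend(icd, iad, ild, ihd):
    """ISEND is produced on the negation of the expression."""
    return not prefetch_full(icd, iad, ild, ihd)


def truth_table():
    """Map every (ICD, IAD, ILD, IHD) combination to the ISEND value."""
    return {bits: isend(*bits) for bits in product((False, True), repeat=4)}
```

Enumerating all sixteen combinations shows that the expression is true exactly when at least three of the four control bits are set, so in this model ISEND is asserted only while at least two stages are empty, which matches a response word that delivers two parcels.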
Instruction parcels stored in instruction register 102 are partially
decoded before being sent to instruction register 104 to complete the
decoding process. Decoding of branch instructions is done by branch
decoder 1317, a part of decoder 103, in response to instruction
register 102. In the case of a branch instruction, the branch address
is set into program counter 1321 from the processor 8 bus, cache
advance circuit 1316 is inhibited from sending ISEND and the prefetch
buffer is flushed (signal path 1318). Branch decoder 1317 instead sends
IASF to the instruction cache-MMU. This causes instruction cache-MMU
120 to take the new branch address from cache bus 121.
The MCLK is the clock to the entire main clock (e.g., 33 MHz) logic.
BCLK is the system bus clock, preferably at either 1/2 or 1/4 of the
MCLK.
For the system bus 141 synchronization, BCLK is delivered to all the
units on the system bus, i.e., IOPs, bus arbiter, caches, interrupt
controllers, the main memory and so forth. All signals must be
generated onto the bus and be sampled on the rising edge of BCLK. The
propagation delay of the signals must be within the one cycle of BCLK
in order to guarantee the synchronous mode of bus operation. The phase
relationships between BCLK and MCLK are strictly specified. In one
embodiment, BCLK is a 50% duty-cycle clock of twice or four times the
cycle time of MCLK, which depends upon the physical size and loads of
the system bus 141.
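
The clock relationship above reduces to simple arithmetic. The following sketch computes the BCLK period for a given MCLK frequency and checks a propagation delay against the one-cycle budget; the function names are hypothetical, and the check deliberately ignores setup time and clock skew, which the text does not quantify.

```python
def bclk_period_ns(mclk_mhz, ratio):
    """BCLK period in ns when BCLK has `ratio` times the MCLK cycle time."""
    if ratio not in (2, 4):
        raise ValueError("the text describes ratios of 2 or 4 only")
    mclk_period_ns = 1000.0 / mclk_mhz  # e.g. 33 MHz gives about 30.3 ns
    return mclk_period_ns * ratio


def meets_one_cycle_budget(propagation_delay_ns, mclk_mhz, ratio):
    """True if a signal's propagation delay fits within one BCLK cycle."""
    return propagation_delay_ns <= bclk_period_ns(mclk_mhz, ratio)
```

With the 33 MHz example, a ratio of 2 gives a BCLK period of roughly 60.6 ns and a ratio of 4 roughly 121.2 ns, so a longer or more heavily loaded bus would call for the larger ratio.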
As illustrated in FIG. 1, the transfer of instructions is from the
instruction cache-MMU 120 to the processor 110. The transfer of data is
bidirectional between the data cache-MMU 130 and processor 110.
Instruction transfer is from the main memory 140 to the instruction
cache-MMU 120. Instruction transfer occurs whenever an instruction is
required which is not resident in the cache memory of instruction
cache-MMU 120. The transfer of data between the data cache-MMU 130 and
`main memory 140 is bidirectional. The memory man-
`
`
7
-continued

Trap Code (on TSTB active)

FC<3 2 1 0>
   X 0 0 1    memory error (MSBE)
   X 0 1 0    memory error (MDBE)
   X 0 1 1
   X 1 0 0    page fault
   X 1 0 1    protection fault (READ)
   X 1 1 0    protection fault (WRITE)
   X 1 1 1
`
ASF: address strobe
ASF is activated by the CPU indicating that the 'address' and 'type
of data transfer' are valid on ADF<31:0> and FC<3:0> lines,
respectively. ASF is activated one half a clock cycle prior to the
address being activated on the ADF bus.
RSP: response signal
On load operations, the RSP signal is activated by the D-cache
indicating that data is ready on the ADF bus. RSP is at the same timing
as the data on the ADF bus. The D-cache sends data to CPU on a load
operation.
On store operations, RSP is activated when the data cache-MMU becomes
ready to accept the next operation.
On load-double, RSP is sent back along with each data parcel transfer.
On store-double, only one RSP is sent back after the second data
parcel is accepted.
TSTB: TRAP strobe
TSTB, along with the trap code on FC<2:0>, is sent out by the D-cache
indicating that an operation is abnormally terminated, and that the
TRAP code is available on FC<2:0> lines. On an already-corrected error
(MSBE), TSTB is followed by RSP after two clock intervals whereas on
any FAULTs or on a non-correctable ERROR (MDBE), only TSTB is sent out.
nDATA: D-cache
Low on this line defines the operation of this cache-MMU as a data
cache-MMU.
`
`INST bus
`
`
IADF<31:0>: address/instruction bus
These lines are bidirectional, and form an address/instruction
multiplexed bus. The CPU sends out a virtual or real address on these
lines when it changes the flow of the program such as Branch, RETURN,
Supervisor Call, etc., or when it changes SSW<30:26> value. The
instruction cache-MMU returns instructions on these lines.
IFC<3:0>: function code/response code
The I-cache puts the TRAP code on the FC lines to respond to the CPU.
`
`
`
`8
`half a clock cycle earlier than the address is on the
`IADF bus.
ISEND: send instruction (i.e., cache advance signal).
ISEND is activated by the CPU, indicating that the CPU is ready to
accept the next instruction (e.g., the instruction buffer in CPU is not
full).
At the trailing edge of RSP, ISEND must be off if the instruction
buffer is full, otherwise the next instructions will be sent from the
instruction cache-MMU. When the new address is generated, on Branch for
example, ISEND must be off at least one clock cycle earlier than IASF
becomes active.
IRSP: response signal
IRSP is activated by the I-cache, indicating an instruction is ready
on the IADF<31:0> lines. IRSP is at the same timing as the data on the
bus.
ITSTB: TRAP strobe
This is activated by the I-cache, indicating that the cache has
abnormally terminated its operation, and that a TRAP code is available
on IFC<3:0> lines. On an already-corrected error (MSBE), TSTB is
followed by RSP after two clock intervals, whereas on FAULTs or a
non-correctable ERROR (MDBE), only TSTB is sent out and becomes active.
INST: I-cache

A high on this line defines the operation of this cache-MMU as an
instruction cache-MMU.
`
`SYSTEM STATUS BUS
`
MPUO: SSW30, supervisor mode
MPK: SSW29, protection key
MPUOU: SSW28, selecting a user's data space on supervisor mode
MPKU: SSW27, protection key of a user's data space on supervisor mode
MPM: SSW26, virtual mapped
These signals represent the System Status Word (SSW<30:26>) in the
CPU and are provided to both the D-cache and I-cache.
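
The status bus can be pictured as five bits extracted from the System Status Word. This sketch assumes that the five signals correspond to SSW bits 30 down to 26 in the order listed, that the SSW is a 32-bit value, and that bit 0 is the least significant bit; the function and table names are hypothetical.

```python
# Signal name paired with its assumed SSW bit position.
SSW_SIGNALS = (
    ("MPUO", 30),   # supervisor mode
    ("MPK", 29),    # protection key
    ("MPUOU", 28),  # user's data space selected in supervisor mode
    ("MPKU", 27),   # protection key of that user's data space
    ("MPM", 26),    # virtual mapped
)


def status_bus(ssw):
    """Return the five system status bus signals for a 32-bit SSW value."""
    if not 0 <= ssw <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit SSW value")
    return {name: (ssw >> bit) & 1 for name, bit in SSW_SIGNALS}
```

For instance, an SSW with only bit 30 set yields MPUO = 1 and the other four signals 0 under these assumptions.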
`Each of the instruction cache-MMU 120 and data
`cache-MMU 130 has a second bus interface for coupling
`to the system bus 141. The system bus 141 communi-
`cates information between all elements coupled thereto.
`The bus clock signal BCLK of the system clock 160
`provides for synchronization of transfers between the
`elements coupled to the system bus 141.
`As shown in FIGS. 5 and 6, the system bus output
`from the instruction cache-MMU 120 and data cache-
`MMU 130 are coupled to a common intermediate bus
`133 which couples to TTL driver/buffer circuitry 135
`for buffering and driving interface to and from the sys-
`tem bus 141. This is particularly useful where the in-
`struction cache-MMU 120 and data cache-MMU 130
`are mounted on one module, and where it is desirable to
`reduce the number of signals and protect the monolithic
integrated circuits from bus interface hazards. The following bus
signals coordinate bus driver/receiver activity:
DIRout: direction of the AD bus is outward
`
`
`
IFC (at ITSTB active)

IFC<3 2 1 0>
   X 0 0 0
   X 0 0 1    memory error (MSBE)
   X 0 1 0    memory error (MDBE)
   X 0 1 1
   X 1 0 0    page fault
   X 1 0 1    protection fault (execution)
   X 1 1 0
   X 1 1 1
`
`IASF: address strobe
IASF is activated by the CPU, indicating that the address is valid
on IADF<31:0> lines. IASF is active
`
`
This signal is used to control off-chip drivers/receivers of the AD
lines. The master cache activates this signal on generating the
ADDRESS, and on sending out DATA on the write mode. The slave cache
activates this signal on sending out the DATA on the read mode.
ICA: I-cache access
`
`
`
`4,933,835
`
`10
`placed on bus delays which will in turn limit bus length
`and loading.
The system bus 141 is comprised of a plurality of signals. As
illustrated in FIG. 5, for one embodiment, the system bus 141 can be
comprised of the following signals, where "/" indicates a low true
signal.
AD<31:0>: address/data bus
This is the multiplexed address/data bus. During a valid bus cycle,
the bus master with the right of the bus puts an address on the bus.
Then that bus master either puts data on the bus for a write, or
three-states (floats) its AD bus outputs to a high impedance state to
prepare to receive data during a read.
CT<3:0>: Cycle Type
CT<3:2> indicates the type of master on the bus and whether a read or
write cycle is occurring.
`
CT<3 2>
   0 0    CPU write (write issued by a CPU type device)
   0 1    CPU read (read issued by a CPU type device)
   1 0    I/O write (write issued by an IOP type device)
   1 1    I/O read (read issued by an IOP type device)
CT<1:0> indicates the number of words to be
`transfe