throbber

`
`Eurapéfachas Paiwtam‘t
`
`Eumpaan Fatah? W568
`Ofiics exampéen das bimats
`
`® Publicatiannumber:
`
`@ 2%4 y? a
`A2
`
`'
`
`EURQPEAN ngm Apmmwam
`
`.Appiicatian number: 353045993
`&
`ling:
`‘
`.
`@an om Hams
`
`® inhCL‘; G as F 9M4
`G 86 F 121138,
`
`(3 08 F 13J’4U
`
`
`Pn’or‘ity: 22.:3185 US 257mg
`® Envemw: mandrawfiimard T.
`32,3185 US 7573§3
`Sevanhficnsdnmkfirfive
`22.61% US 753?“)
`WWOrdMnssachflmmfiisflfifUQ
`
`‘
`
`.
`
`Oataofpubiicaz‘vongfappsicazéon:
`“1.83.827 Sufism 82/12
`Sasignazed Cantractmg States:
`AT m9 CH DE FR (38 {I U LG N}. SE
`GE) AppiicBm: Mumrcompmsasvswems
`canwmuow
`$2 theg Path
`Mm maswcbumm m mums;
`® Inventm: Fwdsau, Ruban L
`28 Sawin Sass:
`.
`Arlington Massachusens 02:74:15)
`(52) invemoz: Brunet, Ronald H.
`”2&8 WestBerlin Raafi
`Bunch Massachuséfis {11632185}
`
`
`
`69 Inventor: Mundifl,Cr&igJ.
`fi‘éfifirasfifiam
`mewsMamachugefismflswfii
`GE) inventor: Myswmkiiwmhaw 4‘
`fipaflmsnificficz38§8 3:8!“st
`Adan Mazsachmafls 01 I’ZMUS}
`
`'
`
`invamm: Stewsrt‘ Wifl‘iam x,
`Na 1‘ 3'5 Gemini: 512%:
`Brfshkm Massathumrts fiwatUS)
`Q} Inventor: Vameraamaa E,
`32 (50211“!an
`Framingmm Wham:busatts 01701 (US)
`P ® {memos Zhghsr‘ Michaa: i“
`1885'}: Lane
`thiflwifla Massanhasetts m SBMUSI
`
`Representative: Dams, Méchsewahn Percy 91 a},
`How Wése, TragsarSaCO‘ f‘éormaa Hausa 1 055199 Strand
`Lvndcn WCZR OAEIG 8?
`
` fiigik‘al computer“
`
`technique far canwrremty execusing the
`@ An Efficient
`iterazians of an ilerative conszruct is described. A paralla-
`proces'séng csmputer is provided which '15 capable of success-
`:uflv processing computatienafiy intensivz applications typio
`:33; found in engineering and snienfific applications.
`A xechnique is a§so descvibed for Ema ria awn; the memuw
`eiamrmts of 3 parallehpmcesgmg computer, andin penicufar
`(me adapted far use in processa'ng campmationaiiy iniansiva
`appiications 5nvalving memory &CEIBSSE3 at fixed may; The
`memory ewmems Hm exampkz came seczians} are highly
`accessible to the processors"
`There is alse Gescfibeé a hackplaneaiesidem swath fez;
`seéectivaiy’ Connecting any Slammed giuraiéty a: first gut),
`buses.
`system busea to any seiecxed pfu wiity af second sub-system
`
`E?”a2%?“Fig£2
`
`Gordan; Pruning Compmy Lad
`
`ARR|S883|PR|0001374
`
`Petitioner ARRIS Group, InC.’S
`EXHIBIT
`
`
`
`ARRIS883IPRI0001374
`
`

`

`MSW? k509i)”:
`
`umm‘r Em"
`
`
` 0’14““! 1
`w-
`,.,-s _,_A
`Ll’ Cum" 0
`QM“?—
`

`
`
`
`C-DXJCMMBBCY CONTROL BUS
`
`NflLTkMS
`
`Fifi E
`
`[a
`
`ARR|S883|PR|0001375
`
`ARRIS883IPRI0001375
`
`

`

`@2’%4?‘§8
`
`5
`
`10
`
`15
`
`
`Bacfigraund of the Invenfiion
`This inventian relate5 to digital computers capabie
`of parallel processing. There has been same aaademic
`interest in the yessibility of concurrently executing
`the itarations of an iterative consttuct such as a DB
`
`leap in FORTRAN. 3.9", Ruck, DuJ., ”Paraliek FraseSSUz
`
`Architecture~*8 Survey“, 1975 Sagamore Comguter
`W
`Conference fig Parailel Processing; Kuck,
`
`D.J¢, “Autfimatic Program Restructaring for High~8peed
`Camputation", Prcceedings of CONFfiR £1983}; Ruck. 0.3.;
`Stracture of Computezs and gggflutationn
`A principal
`difficulty in designing a paxallel~prucessing
`architecture that will efficiently execute DO leaps
`and othar iterative conStructs is synchzcnizing the
`sewcallefi éependencies that exist in the 19095. Ruck
`has classified more than one txge cf dependency (lib)F
`but
`the dependency of practical interest is that
`
`wherein a first instructian cannet progerly be executeé
`
`in a given iteration until a seconé instruction {which
`
`20
`
`couid also be the first instructian) has been executed
`in a prior iteration.
`‘
`
`25
`
`30
`
`There has been little prflgxess in apglying
`garallel pracessing to the camputatienally intensive
`applications typically founé in engineering and
`sciéntific applications, yarticularly to parallel
`
`grocessing of the same jab {i,e.,
`and data).
`
`the same instructiOfis
`
`In such applicationsl it is typical :0
`
`find repetitive accessea t0 memory at fixed address
`intervals known as Striées.
`Each new access iaitiated
`by a gracessor is for a memory location separateé from
`the last access by the length 0f the stride.
`A stride
`
`of one means that the processar accesses every wozd
`
`ARR|S883|PR|0001376
`
`ARRIS883IPRI0001376
`
`

`

`1
`
`mea
`
`(whase length may very)
`
`in sequence‘
`
`A atride of
`
`two means that evety athez were is acceesed.
`
`If
`
`interleaved memory elements are acteased by thé
`
`processors? the Stride determines a unique sequence 0f
`
`5
`
`memory accesses known as the access pattern (8.9.,
`
`fox
`
`four memory elements,
`
`the access pattern might be ABCD3£
`
`Caches have long been aged in digital
`
`computergi and have baen apglied to parallel
`
`procesging, with one cache assigned to each procesaor.
`A cache is a highmspeed memory cantaining copies of
`
`10
`
`selected data from the main memory. Memory accegses
`
`from a processor come t0 the cache, which determines
`
`whether it currently has a copy of the accessed mamory
`
`iocatisn.
`
`If not; a cache ”miss“ hag occurred,
`
`find the
`
`15
`
`cache customarily Etsps accepting new accessea while it
`
`performs a main memory access tax the data neade§ by
`
`the pzoceasor,
`This invention relates t0 digital comgutezsl
`
`and particularly to techniques far iniexconnacting
`parallel processsts with memoriESx
`in & digital
`
`29
`
`Computer
`
`in which a plurality of pracessars muat
`
`be connected to a plurality of memofiesg it is
`
`convantional to provide a common bus connected to all
`
`grocessors anfi memories, and to have the pracessorg
`share the bus,
`
`25
`
`ARR|S883|PR|0001377
`
`ARRIS883IPRI0001377
`
`

`

`3
`
`fimww
`
`
`Summer: g; the invenfiion
`
`The invention has three aspects.
`
`In a first aspect of the inventian; we
`
`have discoverea and zeduced to gxactica an extremely
`
`fi
`
`efficient technique far cancurrentiy executing the
`
`iterations of an iterative construct.
`
`The invantian
`
`provides a parallelwpracessing computer capable 0f
`
`successfully processing computationally intensiVe
`
`applications typically inund in engineering and
`
`la
`
`scientific apgiications.
`
`IR genexal this first aspect of the
`
`invention features, firstly, a plurality of processors
`
`eacn adapted f0: cancurrentiy ptccesging different
`
`iiezatians of an iterative canstzuct, and each adapted
`
`15
`
`for serially processing portians Bf the grogram outsiae
`
`of the iterative construct; means are proviaed fez
`
`activating ifile yrocessors at the start of concurrent
`
`pr0035$ing of the itexative canstruct and far
`
`transferring sufficient state infarmatien to the
`
`23
`
`activated processszs so that thay cam begin concurrent
`piecessing,
`
`In a fizst set of preferred embediments 0f
`
`this first aspect,
`
`the iterative construct contains one
`
`or more dependencies.
`
`A processor encaunteting the
`
`25
`
`first instructicn (0f
`
`the dependenty defimeé abDVE)
`
`delays further precessing until it receives a go-ahead
`
`signal
`
`indicating that the secand instruction of the
`
`degendency has been executed in the prior iteratian.
`
`The gD—ahead signal is pzovideé by a synchzonizing
`
`30
`
`means that stares a numba: represeatative of the lowest
`
`iieration to have not executed the seccnfi instruction;
`
`the stored number
`
`is incremented each time the second
`
`instruction is executfid;
`
`The synchronizing means
`
`ARR|S883|PR|0001378
`
`ARRIS883IPRI0001378
`
`

`

`L?
`
`@§%fi?“¥3
`
`issues the go~ahead signal when a comyarison of the
`
`stored number
`
`to a number representative cf the prior
`
`iteratian shaws that the seconfi inshruction has been
`
`executed in the prior iteratian.
`
`The synchroniziag
`
`maans includes a synchronizatian register that is
`
`inexementefi each time it receives an advance-register
`
`signal, sent to it afte: the secané instruction is
`executed in each iteration; precasaing is delayed aftex
`
`execution of the second instruction until the processor
`
`10
`
`is informe& that all leef iterations than the one
`
`being executeé have causeé the synchtonization
`
`register to ba incremented; Spaciai await and advance
`instructions are inserted befiore an& after the first
`
`ané second instructions, respectively,
`
`t0 effect
`
`synchronizatian: a glurality of synchronization
`registers axe provided, each fqr synchronizing a
`diiferant fiegenfiency i0: group of depenfiencies);
`
`the
`
`await and advance instructions have an argument far
`
`ggecifying the synchzaniaatien register; the gQ-ahead
`Signal
`i3 issued hasa§ on a comparision of the contents
`0f the synchrcnization register ta the contents of a
`CUKtént iteratian registez minus a specified itexation
`
`offaet (representative of the number of iterations
`
`separating the current iteratinn fram the prior
`iteratiun in which the second instruction must have
`
`been executed}; the iteration affiset is specified as an
`
`argument to the await instruction; the synchronization
`
`zegister contains only the lgast significant bits 9f
`the lowest iteration to have mat executeé the secona
`
`instructien, enough hits tn exprass the maximum
`
`éifference in iterations ever being processed
`
`concurrantly; a further bit in the :egiaters keeps
`
`track of whether the register has been advanced during
`
`the current itEEatian.
`
`201
`
`25
`
`30
`
`ARR|S883|PR|0001379
`
`ARRIS883IPRI0001379
`
`

`

`6
`
`fi§?fi?“§3
`
`In a second Set of preferred embodimenta cf
`
`the invention features transfietxing
`this first aspect,
`state information to all 0f the 9rocessors during
`
`initiation of censurrent precessing so thateany one
`
`5
`
`of the grocessots can xesume Serial processing at
`
`the completion 0f concurrent processing.
`
`The processes
`
`to resume serial processing is the one to pzacess the
`
`last iteration of the iterative censtruct; and the
`
`transferxed state information incluées the value 0f tha
`
`10
`
`stack gainter jast befora cancurrent grocassing began.
`
`Allowing the processnr execmting the last iteration to
`
`resume serial pracassing, rather than returning serial
`
`processing t0 a piedetermined protease: has the
`
`advantage that program state does net naed to be
`
`15
`
`transfetted at the resumgtian 0f serial processing; a5
`
`the processor executing the laat iteratian necessaziky
`
`has the ccrzect program gtate fa: continuing serial
`
`processing.
`
`20
`
`25
`
`In a third set of preferreé embadiments of
`
`this first asgectf the inventien featuzes assigning a
`new iteratisn to the next processar t9 request an
`
`iteration, so that iterations are not assigned t0
`
`processors in any prearranged ordtrt
`
`Iteratians are
`
`assigned by determining the number of pEOCESSars
`simultaneously bidding for an iteratisn using ready
`
`lines that extend between the processczs, Qae line
`
`being assigned to each yroceasor;
`
`the tstal number of
`
`asseztea ready lines is used to increment a register
`
`that kteps track of which iteraticns have already been
`assigned (and which preferably contains the next
`
`30
`
`iteratian, i.e., the iteratian t0 be assigned when the
`
`next bid is made for an iteraticn}; a curzent iteratimn
`
`register is provided;
`
`the maximum iteratian cf the
`
`ARR|S883|PR|0001380
`
`ARRIS883IPRI0001380
`
`

`

`b
`
`@Eid?fifi
`
`iterative construct is transferied to all the
`
`processoss and campazed to the current iteration
`La astermine whether cancurrent precessing by that
`
`5
`
`the hazaware for making
`processor should be terminated;
`iteiatinn assignments is locatefl at each prCGSSGK anfi
`is adapted t9 permit each processo:
`:0 simultaneously
`and independently determine its iteration assignment
`baseé an the number of aaserted :eady lines; initial
`
`iteratisn asgignments are also made simultaneously
`
`10
`
`using the sama harawage.
`
`15
`
`20
`
`In a fourth set 9f preferred embediments 3f
`
`this first aspect;E the invention featur&s concuzxency
`
`cantrol lines connecting the proc2530rs and serving as
`
`a conduit for pasaing signals between the processozs
`
`lacal concurrency
`to central concurrent pzocesaing;
`central logic is previfiea at egch grocesgor for
`trangmitting ané xeceiving signals over
`the lines.
`A common data bus extenés between the processors
`
`for tranaferring state informatian such as the stack
`
`pointar between grocesgozs at the start ofi concuyrent
`pracessing;
`the cancurrency contzol line; extending
`betwean gracessars incluae the :eady lines, lines for
`passing signals such 35 the advance~register signais
`for the synchronizaticn registexs, and lines for
`
`25
`
`informing the processnrs ef the initiation and
`
`conclusion cf concurzent processing:
`
`In a fifth set of preferred embodimants 05
`
`the inventian featurea concaxzently
`this first asgect,
`executing an iterative construct that cantains within
`it a conditional branch to aa address Gutgide 0f the
`
`30
`
`canstruct; means are provided far infgrming processors
`
`ether than the one taking the branch that such has
`
`taken place and that cancurrent processing is
`
`ARR|S883|PR|0001381
`
`ARRIS883IPRI0001381
`
`

`

`7
`
`fi?‘%fi“§8
`
`A processar encauntezing a tzap is
`terminated.
`prevented fxom taking the trap until it receives an
`
`indication fram a tzapvserializing synchronizatifin
`
`regiatez that the itexation it ia pzoaaasing is the
`
`S
`
`lowefit sue being pracegsed fin which there remains a
`
`passibility that the conditional branch caulfi be taken;
`the synchranizatian register is incrementefl at the
`
`compieticn of each iteratimn {or at a paint in the
`
`code beyonfi which theze is any pagsibility af said
`
`10
`
`conditional branch occurring before campietien cf an
`
`iteration);
`
`the conditianal branch in the lowest
`
`iteratian is forceé to occur prior ta such a branch in
`
`a highar iteration by insexting an await instraction
`
`before the quit instruction;
`
`the await instruction
`
`15
`
`causes further processing as be delayed until the
`
`trapu5erializing synchronization register reaches a
`vaiue equal ta the current iteration.
`
`In a sixth set of preferred emboéiments of
`
`20
`
`this first aspect,
`the invantian features providing
`each pfnceasor with a private stack gointer for use
`flaring condurzent processing far staring data unique
`
`t0 a single iteraticn {e.g.;
`
`tempcrary variables,
`
`subrnutine arguments, anfi retuzn addresses).
`
`The
`
`giobal stack pointer in use befcre the star: of
`
`25
`
`cancurrent procesaing £5 savefi far reuse at tha
`
`completion 3f cancurrent processing; and the 910ba1
`
`stack pointer is transferred to all the processars so
`
`that it is available ta whichever piecessor resumes
`
`serial processing»
`
`30
`
`In a seventh set of preferseé embediments of
`
`this first aspect,
`
`the invgntinn features determining
`
`whether an iterative censtruct intended us be pEQCESsed
`
`concurrently is nested within another construct already
`
`ARR|S883|PR|0001382
`
`ARRIS883IPRI0001382
`
`

`

`<5
`
`'E‘fiéfi
`Q
`
`“i
`"Egg
`
`being concurzently processedw If such is the case,
`
`the
`
`nestea construct or lOQp is executed serially on tha
`
`processor that encounters it.
`
`The nested loop 13
`
`exacuted using the same iteration asaignment haréware,
`
`excegt that fihe current iteration is incremented by one
`
`for each new iteratiun; and the current iteration of
`
`the outer cancurrent loop is saved so that it can be
`
`reusefi after yrecessing of the nested loop has been
`completed.
`7
`~
`
`This first aspect of the invention also
`
`features concurrent processing oi vector instructiona;
`
`the elements cf the vectors operated an by the vector
`
`ingtructions are éiviéed between the procesaorsl
`
`eithe: horizontally 0r vertically, anfl each precassox
`
`takes fine pass (or “iteration";
`
`threugh the Vector
`
`instructions.
`
`In preferreé embcdiments, 9&ch processor
`
`camputeg,
`
`just prior ta concuzrently execating the
`
`vectar instzuctions, a length,
`
`inczement, ané cffset
`
`Ear U$e in selecting the vacter elements that have been
`
`assigned to the processmr; a startmvectcrwconcurrency
`
`instructfion is placed befiore the first of the vectox
`
`instructians and a xeyeat concurrency instructian is
`
`placefi after the instructions.
`
`The inventisn praviées a high performanae
`
`syatem capable of being constructed fzom moderately
`
`pricaé camponents, perfiarmance that can easily be
`
`expandeé by adding aéditionai gracessors {C33}, and
`
`fault tolerance resulting from continued aperatiqn
`
`after failure of One ox more grocessorsa
`
`In a Seccnd aspect of the iHVention, we have
`
`éiscavered an excellent technique for interleaving the
`
`memory elements of a parallel~pracessing computer: ane
`
`U1
`
`l0
`
`20
`
`25
`
`30
`
`ARR|S883|PR|0001383
`
`ARRIS883IPRI0001383
`
`

`

`(:3
`
`@Qifi‘?‘§3
`
`particulazly afiaptafi far use in processing
`
`camputationaliy intensive applications invalving
`
`It makes the memory
`memnry accesses at fixad strides.
`elements {mgH cache sections} highly accessible to
`
`5
`
`the processors”
`
`In ganeral this secand agpect of the invention
`
`featuresr firstly, a glurality of memazy elements
`
`{e.g., cache sections) connected to a glurality Df
`
`parallel pracessorsg with the memory Elements so
`
`10
`
`interleaved that the access yattern generated by
`saié prccessors when accessing flats at piedetezmined
`
`strides permits all of the ptocessars t0 reach a phase
`
`relationshig with the ether pracessorg in which each
`
`gracessor is able ta access a diff9£ant said memory
`
`15
`
`element simultaneously without access conflictg
`
`arising,
`
`In preferred embodiments,
`
`the processors
`
`are adapted for cancurtentiy processing the same
`
`instructians aha data (3.9., éiffiezent iteratians Cf
`
`tha interleaving is
`fihe same iterative construct};
`such that the access pattern generated by the
`pracessora fer strides OE fine and tw0 meets tha
`
`the pattern will tolerate being
`conditions that (l)
`affset with respect to an identical pattern by an
`
`ior any multiple thereof) equal to thfi iangth of
`offset
`the pattern divided by the number of memery elements,
`and (2)
`the pattern includes at least Gne conflict at
`
`ewery offset other than the tolegable offsat so that
`
`20
`
`25
`
`access caniliets force the processors ta assume a phase
`
`relationship with each other wherein their accessing
`yatterns are cffset by the talerable offsat;
`there are
`
`3O
`
`four memory elements W,X,Y,Z asd the interleaving
`
`praduces an accesg pattern af WXKWYZZY (or,
`
`lesa
`
`prefezably, WXXWYYZZ a; WWXZEXZX)
`
`for a stride of
`
`ARR|S883|PR|0001384
`
`ARRIS883IPRI0001384
`
`

`

`£0
`
`fimwm
`
`ane and WXYZ for 5 stzide of two: or
`
`the interleaving
`
`prcduces an access yattern of WWKXXXWWYYZZZZEY for a
`
`stride of ans, 5 pattern 0f WXXWEZZY for a sitide 0i
`
`twog and a pattern of WXYZ far a stride of four;
`
`5
`
`tha intarleaving for eight mamory eiaments iS
`
`ABCDDCBaEFGHHGFE for a stride of one and ACDBEGBF
`
`far a Stride of two;
`
`the memory elements choose among
`
`simultaneausly contending processora on the hagis of a
`
`fixed priority zanking“
`
`10
`
`This secgnd aspect of th& inventian also
`
`features a glabal cache accesseé by a plurality of
`
`the
`In prafarrea embodiments,
`parallel processors.
`processors are adapted far concurrently processing the
`same instructiens anfi data (e.g,, diiferent iterationg
`
`15
`
`of the same itexative canstruct3;
`
`the cache 13 éivided
`
`into interleaved sections; blacks Qf data accessefi £10m
`
`memory ate éivided betwean the cacha sections; each of
`twa cache sections arive a Gammon memary adéress bus
`
`with the black addzéss of the biock being accessed and
`
`20
`
`both sections cancurzently reaé the block adfiress fram
`
`the cammmn bus; separate buses connect each of the two
`
`cache sectians to main memory, and each cache sectimn‘s
`
`half of the flata block is transferreé over its memazy
`
`bus; and the two cache Bastiona share a summon circuit
`board.
`
`23
`
`In this second aspect 0f the EHVentien it is
`
`also mated that in a parallel proceasing environment,
`
`particulariy one dedicatefi to concurrently processing
`gartions of the same jab (ige.,
`same instruetions and
`data), a cache may be usefully givan the capability cf
`simultaneously accepting curzent accesses while warking
`
`33
`
`on completion oi pending accesses fihat it was uaable to
`
`camplete earlier {e.g.. because a main memory accegs
`
`ARR|S883|PR|0001385
`
`ARRIS883IPRI0001385
`
`

`

`is
`
`@3?fi?%3
`
`was required)& This makes possible far greater memary
`banéwidth for each processor‘
`In preferred
`
`embodiments,
`
`the black addresseshfier the gending
`
`accesses are stazed along with a status coée comprising
`a prescription 0f the steps necessary t0 comgleta the
`
`5
`
`the stored block addresses £05 yending accesses
`acgess;
`arg compared :0 the adéresses DE each new black 05 aata
`
`receive& from main memary and the status code for each
`
`is reéetezmined baseé on the prior state of the code
`
`10
`
`and the outcome of the addréss campaziaon;
`
`the block
`
`afiéresses sf entrant accesses are comgaied to the
`
`aéfiresses of main memory accesses still in pzegzess
`
`(1.2., nut yet available in the cache memory);
`
`the
`
`cache index of current ascesses are camgared ta the
`
`13
`
`cathe inéexes of main memary accesses in progress to
`
`determine whether a black of data fict€s3éd
`
`from memory,
`
`though not the data sought by the current access,
`
`is
`
`data that will be written into the cache memary
`
`location afldrgssefi by the currant access;
`
`the cache
`
`20
`
`15 fiivified into a plurality of sectians ané 15 capable
`
`of concurrently accepting a plurality 0f accesges to
`
`éiffezent sectians:
`
`the cache central logic is divided
`
`into quick—werk logic for campleting accegses capable
`
`of immeéiate completion, pending—status logic for
`
`25
`
`initially determining the status cade for a pending
`
`accesa and redetermining the statas codes as canéitions
`
`change, and access—cempletion logic for taking contzal
`
`af the data and adéress paths within the cache to
`
`camplete a pending access when the status cede
`
`30
`
`indicates that it can be campieted;
`
`logic is gravidea
`
`t0 assure that gending accesses frem the same pracesso:
`
`are completed in the drder they are received, rather
`
`than the orde:
`
`in which they could best be campleted;
`
`ARR|S883|PR|0001386
`
`ARRIS883IPRI0001386
`
`

`

`{3‘
`
`wwwa
`
`two cache sections sglit éata blacks accessed from
`
`memory and share a main memery address bus, an as
`
`thereby ta detect the adfirasses of blocks accessed by
`
`the ether cacha section« We
`
`incozpotate by reference
`
`5
`
`the cogending applicatien entitled "Digital Comguter
`
`with Cache Cayable 0f Csncurrently Eanaling Multiple
`Accesses from Parallal Procassors”, filed on even fiate
`
`herewith.
`
`A third aspect of the invention features a
`
`10
`
`backplane~resident switch fa: selectively connecting
`
`any selected piurality of first subsystem buses to any
`
`selectea plurality of seconé subsystem buses.
`
`In
`
`preferred embodiments,
`
`the firSt subaystemg are
`
`processars and the second subgystems are cache
`
`15
`
`sections;
`
`there are more processcrs than cache
`
`sections; separate processozwaccess lines and
`
`cashewacknowledge lines {3.gx, CBUSYL) are provided
`
`outside of the Switch; central 0f the switch resifies
`
`is the cache sections; 3&0h3*n3t*£eafiy lines (e¢g‘,
`
`20
`
`the
`CWAETL) axe providefi se§arate fram the Switch;
`cache sections are interleaved; and the switch is
`
`implementefi as a plurality of gate azzays mounted an
`
`a circuit board forming the backplane.
`
`The third aspecfi of the invention also
`
`25
`
`featuzes a bug‘switching means selectively cannecting a
`
`pluraiity 0f parallel gracessor buses t9 a plurality af
`
`mamary buses in a parallel gracessing system in which
`
`the pracessars are adapteé £0: cancurrently proces3ing
`
`portions of the same job {i,e.,
`
`the same instructions
`
`30
`
`and data).
`
`In preferrefi embadiments,
`
`the Processazs
`
`are adapted for ccncurrently yrocessing different
`iterations of the same iterative constzuct‘
`
`ARR|S883|PR|0001387
`
`ARRIS883IPRI0001387
`
`

`

`[,3
`
`fig”?
`fi??%
`
`Sthe: feafiures ana advamtagea of the inventian
`
`will be apparent Exam the descripfiion af a preferxed
`embafliment and fram the Claims“
`
`ARR|S883|PR|0001388
`
`ARRIS883IPRI0001388
`
`

`

`fifiéwfi??8
`
`
`Descrigtion of
`the Pzeierred Embeoiment
`
`1‘ Drawings
`
`$19.
`
`1 is a system black éiagram.
`
`Fig,
`
`2 is a black éiagram of an interactive
`
`5
`
`pracessor.
`
`Fig.
`
`3
`
`is a block diagram of a high—speed
`
`pracessar OE computatianal element (hereinafter a "CE”)!
`
`Fig“ 4
`
`is a diagxam showing the structure 0f
`
`the cancurrency status register (CSTAT).
`
`1g
`
`Fig. 5 is a black diagram showing the
`
`CQHCUftefiCY control bus that connects cemcurzency
`
`control units {hareinafter “CCBS“) and showing the
`
`connections between each CCU and its CE {only twe C35
`
`and CCUs are Shawn),
`
`15
`
`Figa. éulfl coliectivaiy are a black diagzam 0f
`
`the CCU.
`
`Figs. 11$ and 218 are twa halves 0f a block
`
`diagram cf the CC3 status register (CSTAT).
`
`Fig. 12 is a block fiiagxam of one of the éight
`
`20
`
`4~bit synchronization registers‘
`
`Fig. 13 is a timing diagram for the CSTRRT
`instruction.
`
`Fig. 14 is a diagram illustrating an example
`
`of concurrent processing*
`
`25
`
`Fig. 15 is a block diagram showing the CEs,
`
`backplane switch, and cache quadzants, anfi the
`connections therebetween.
`
`Fig; 16 is a block diagram of the
`
`backplane~switch logic simglified t0 Show the lagic for
`
`30
`
`Qua bit of the ninety»six bits switched between the CES
`
`and cache quadrants,
`
`ARR|S883|PR|0001389
`
`ARRIS883IPRI0001389
`
`

`

`@3?£¥:?‘E E
`
`Fig,
`
`l? is a block diagram showing one of
`
`twenty—four four~bit gate arrays forming the backplane
`switch.
`-
`
`Fig. 18 is a perspective View, somewhat
`
`5
`
`diagrammatic, 0f the circuit boarés farming the CES,
`
`cache quadrants, anfi backplane.
`
`Fig‘ 19 is a block fiiagram showing the address
`
`and data pathg in the cache quadrants.
`
`Fig- 20 is an cverall black diagram ofi
`
`the
`
`10
`
`control logic for the cache quadrants.
`
`Fig‘ 21 is a block fiiagram of the
`
`pending—status logic in a cache quadrant.
`
`IE,
`
`Summary
`
`A system black diagram is shown in Fig. l.
`
`15
`
`Eight high~speed precesaazs or computaticnal elements
`
`{CBS} 10 are connected tG twa central pzocegsing cache
`
`boards 12 {each comprising twa quaaxants of a fourwway
`
`interleaveé central processing cache {CF cache); by a
`
`switch 14 which resides on backplaue 192 {Fig. 13)
`
`into
`
`20
`
`which the CE and cache beazds are pluggefi.
`
`The switch
`
`permits any fOU£ £35 to be concurrently accessing the
`
`four cache qu&flrants.
`
`Tha CBS each baVe a Canurrency
`
`cantrol unit
`
`{CCG§ 24 for contzollifig concurrent
`
`processing‘
`
`The CCUS communicate with ofiher CCUS
`
`25
`
`acrmgs a concurrency centrol bus 25‘ Memury bus 18
`
`coanecte the cache quadrants fie eight memory modules 17
`
`{each 8 megabytes}. A153 cannected t0 the memery bus
`
`are twa interactive prccessaz caches 18, each of which
`
`is connected to three intezactive piecessoxs {SP5} 20.
`
`33
`
`Each I? serves a multibus 22,
`
`to which gE£ipheral
`
`fievices (not shown) are cannected.
`
`The System is a sharedwglcbalwmfimaryg
`
`symmetric (1.8., not master—slave) multiprocessing
`
`ARR|S883|PR|0001390
`
`ARRIS883IPRI0001390
`
`

`

`%
`
`@E?£?fi8
`
`Im
`
`10
`
`computer garticularly useful in: general scientific
`and enginfiering comgutationu
`The CES can execute
`vector, floating~pointt and concurrency instructians,
`as well as integer and lagical instructiensé
`The CBS
`concurrently process different iterations of the same
`iterative constzuct {but
`they may aisn aperate
`
`indepenfigntly ts provifle a highnpaxformance
`multiwtasking system).
`The 195 are mofiezate~speed
`interactive processors that can execute intege: and
`iogical operations only, ané are usefl for hanéling
`inputfoutput trafific, text editing, and similar
`operations. Data tyges supparted include 8; l6, and 32
`bit integer/logical éata as well as IEEE standard 32
`and 64 bit floating—goint data on the CBS unlyv Memory
`
`15
`
`is virfiually aadressed‘
`
`C33 accesg global memary
`
`through cache boazés 12, which the CBS cammunicate
`with via switch 14.
`Each CE has its own lSK byte
`
`The IPs access
`virtuallyvaaéressed inatrucfiimn cache‘
`globai memory through interactive procesaor caches 18.
`
`IIIm Comgutational Elements
`The computational elements {CBS} are intended
`fer high-sgeed computations.
`The CBS are ifieutical,
`an& as few as one may be installed in a system.
`
`Any number of CBS may participate in concurrent
`ptocessing. Those that do are said to be in the
`concurrency campiex {hexainafter "the complex“), Tfisse
`that do not are saié to be detached (CCU status bifi
`
`DETACHED is l to inflicata such detached Cgeraticn).
`
`CBS can be detached in the evant that a job mix being
`
`executed on the System includes fiobs that cannet make
`
`use of concurrent gracessing {8.g‘, comgilation and
`debugging) as well as jobs that can {8,9‘, praduction
`
`25
`
`30
`
`ARR|S883|PR|0001391
`
`ARRIS883IPRI0001391
`
`

`

`fi??$?%8
`
`£7
`
`jobsi,
`
`A detached CE acts as if it weze a system
`
`with only one CE gresant; it executes concuzrency
`
`instructions 88 ii it Nero a one~grocessor system,
`
`A block diagram at a CE is shown in Fig. 3c
`
`5
`
`A processor internal bus yxaus (32 bits wide} is use&
`
`to transfer data and addresses between fine CCU 24, an
`
`address firanslation unit, an instruction grocessor
`
`{containing both an adoross unit ané an integer Logic
`
`unit and comprising a motorola 68020 instruction set},
`
`10
`
`and a CR switch; which servea as a conduit between the
`
`PIBUS,
`
`the voctor registers, and the floating point
`
`units {which include moltipliezs: flividers, and an
`
`8666:).
`
`A contzoi section (which inclufies an
`
`instzuction parser, a microsequenoer, and a RAM-based
`
`15
`
`contzol store) provides instruction commands to the
`
`the instruction progessor,etho VECtOr registers,
`CC3,
`an instruction cache, and thE‘CE switch“
`The CE
`
`communicates with othe: CES across the concurrency
`
`control bus 26, and with the CP caches 12 using the
`
`30
`
`address and data gozts connectefl to the mgmory
`switch 14.
`
`A.
`
`instructiooo
`Each CE executes instructions stored in
`
`memory* Each is capable of interpreting an&
`
`25
`
`oxeouting four categorios of ingtructioos:
`instructions, which implement data movement,
`
`{l} EEEE
`logical,
`
`integer arithmetic, shift and {otate, bit manipulation,
`
`bit field, binary coéed oecima}, and program control
`
`{2} floating Egigg instructions, which
`operations;
`implemént arithmetic {including functions such as
`
`30
`
`square root and sine), test, and move operations on
`
`floating point data;
`
`{3) vector instructions, which
`
`ARR|S883|PR|0001392
`
`ARRIS883IPRI0001392
`
`

`

`E
`
`fifififi??fi
`
`implement integer and floating point aperations an up
`
`to 32 data eiementg at a time: and (Q) cancurrencg
`
`instructions, which implement the parallel exacutian mi
`
`instructions by multiple CBSw
`
`The processor contains
`
`5
`
`sevetal classes of registers {see below)
`
`for sungtting
`
`instruction execution. An instructimm consists of One
`
`or morg whale wards, whate a wax& is 16 bits.
`
`An instructien contains the information an
`
`the functicn and ogerands, Particular bits in the
`
`10
`
`instinction define thé function being requested of the
`processor;
`the number and loéation at the defining bits
`
`can vary dapending GU the instructisn. The remaining
`
`bits of the instruction fiefine the oparand~~the data to
`
`be acted upon.
`
`The flata can be in memory,
`
`in registers
`
`15 within the piecesfior, or in the instructicn itself
`
`(immediate aata); it can be specified as zegistar
`
`numbers. Values, and addresses,
`
`Instruction execution normally scents in the
`
`Ogder
`
`in which the instructions appear in memary, frem
`
`20
`
`low addresses to high aafitegsesa
`
`Instructians being
`
`executefi
`
`in the normal sequence must
`
`immediately fallaw
`
`one anothez.
`
`The processa: conttels the sequence of
`
`inStruction execution by maintaining the memory address
`
`25
`
`of the next instructian to be executed in a 32~bit
`
`register called the program counter {PC}. During the
`execution of instructions that do not alter the normal
`
`sequence,
`
`the pracessgr increments the PC so that it
`
`contains the address of the next sequential instruction
`
`3U
`
`1m memory (that is,
`
`the adéress Uf the wotd immeéiately
`
`follgwing the current instructioa).
`
`For example,
`
`the
`
`PC wauld be incremented by 4 foliawing execution of
`
`a 32 bit instructiont
`
`Instructians modifiy the
`
`ARR|S883|PR|0001393
`
`ARRIS883IPRI0001393
`
`

`

`FE
`
`flfi‘gefiy'fig
`
`contents cf the PC i0 pexmit branching from the aoxmal
`
`instruction sequenca. Other instructions alter the
`
`normal sequence by loading a specifiefl vaiue into
`the PC.
`
`5
`
`Most of the base instructions and those vectot
`
`instructions zhat praduce scalar integer results alter
`
`special bita in the pracessox tailed integer conéitian
`
`codes.
`
`Fax example, Subtracting two equal integar
`
`values sets the zero cenéitian codeg while aubtracting
`
`10
`
`a larger integer value frum a Smaller integer value
`sets the negative condition code‘
`Same instructiong
`
`contain a 4-bit confiition code fielé whose value
`
`specifies a condition such as equal, 1955 than,
`
`gxeate:
`
`than, aaé 59 an.
`
`The conditiun is true if the
`
`15
`
`conéitisn cades are set in a certain way.
`
`For example,
`
`if the zero condition caég is set,
`
`the equal conéition
`
`is true; if the negative condition code is set and the
`
`overflow condition code is cleareé, aha less than
`
`condition is true; if the negative and overflaw
`
`28
`
`condition codes are set ané the zero condition code
`
`is cleared,
`
`the greater than condition is true.
`
`The
`
`fioating point instructions have separate flcating
`
`point condition caées.
`
`l. Vactor Ingtrucfiions
`
`25
`
`a yector instructian can Qperate on up to

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket