US008898254B2

(12) United States Patent
     Watson, Jr. et al.

(10) Patent No.: US 8,898,254 B2
(45) Date of Patent: *Nov. 25, 2014
(54) TRANSACTION PROCESSING USING MULTIPLE PROTOCOL ENGINES

(71) Applicant: Memory Integrity, LLC, Wilmington, DE (US)

(72) Inventors: Charles Edward Watson, Jr., Austin, TX (US); Rajesh Kota, Austin, TX (US); David Brian Glasco, Austin, TX (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

    This patent is subject to a terminal disclaimer.
(21) Appl. No.: 14/021,984

(22) Filed: Sep. 9, 2013

(65) Prior Publication Data

     US 2014/0013079 A1    Jan. 9, 2014

     Related U.S. Application Data

(63) Continuation of application No. 13/327,483, filed on Dec. 15, 2011, now Pat. No. 8,572,206, which is a continuation of application No. 10/289,492, filed on Nov. 5, 2002, now Pat. No. 8,185,602.

(51) Int. Cl.
     G06F 15/16   (2006.01)
     G06F 15/80   (2006.01)
     G06F 15/173  (2006.01)
     H04L 29/06   (2006.01)

(52) U.S. Cl.
     CPC ....... G06F 15/80 (2013.01); G06F 15/17337 (2013.01); H04L 69/18 (2013.01); H04L 69/12 (2013.01)
     USPC ...... 709/217; 709/212; 709/238; 370/230; 370/474; 370/235; 370/338; 370/352

(58) Field of Classification Search
     USPC ...... 709/212, 238; 370/230, 474, 235, 338, 352
     See application file for complete search history.
(56) References Cited

     U.S. PATENT DOCUMENTS

     5,560,027 A     9/1996  Watson et al.
     5,634,043 A     5/1997  Self et al.
     5,682,512 A    10/1997  Tetrick
     5,859,975 A     1/1999  Brewer et al.
     5,950,226 A     9/1999  Hagersten et al.
     5,961,592 A *  10/1999  Hsin ................... 709/217

     (Continued)

     FOREIGN PATENT DOCUMENTS

     DE       10045915      5/2001
     WO    WO 99/26144      5/1999

     (Continued)
OTHER PUBLICATIONS

US Office Action mailed Apr. 13, 2006, issued in U.S. Appl. No. 10/289,492 [P018].

(Continued)

Primary Examiner — Tammy Nguyen

(57) ABSTRACT
A multi-processor computer system is described in which transaction processing is distributed among multiple protocol engines. The system includes a plurality of local nodes and an interconnection controller interconnected by a local point-to-point architecture. The interconnection controller comprises a plurality of protocol engines for processing transactions. Transactions are distributed among the protocol engines using destination information associated with the transactions.

8 Claims, 12 Drawing Sheets
[Front-page drawing: processing clusters coupled through point-to-point links (141a-d); exhibit stamp "INTEL 1001"]
(56) References Cited

U.S. PATENT DOCUMENTS

6,014,690 A       1/2000  VanDoren et al.
6,167,492 A      12/2000  Keller et al.
6,304,910 B1     10/2001  Roach et al.
6,370,585 B1*     4/2002  Hagersten et al. ...... 709/238
6,385,705 B1      5/2002  Keller et al.
6,490,661 B1     12/2002  Keller et al.
6,725,307 B1      4/2004  Alvarez, II et al.
6,751,710 B2      6/2004  Gharachorloo et al.
6,799,252 B1      9/2004  Bauman
6,810,467 B1     10/2004  Khare et al.
6,868,485 B1      3/2005  Conway
7,149,673 B2*    12/2006  Gurevich et al. ........ 703/22
7,213,106 B1      5/2007  Koster et al.
7,251,698 B2      7/2007  Glasco
7,395,349 B1*     7/2008  Szabo et al. .......... 709/238
7,395,993 B2*     7/2008  Tada .................... 84/645
7,522,581 B2*     4/2009  Acharya et al. ........ 370/352
7,577,123 B2*     8/2009  Hashimoto et al. ...... 370/338
7,698,509 B1      4/2010  Koster et al.
7,702,809 B1*     4/2010  Szabo et al. .......... 709/238
7,742,417 B2*     6/2010  Gilfix et al. ......... 370/235
7,843,968 B2*    11/2010  Okamoto et al. ........ 370/474
7,957,278 B2*     6/2011  Park .................. 370/230
8,005,916 B2*     8/2011  Pope et al. ........... 709/212
8,185,602 B2      5/2012  Watson, Jr. et al.
8,612,536 B2*    12/2013  Pope et al. ........... 709/212
2002/0053004 A1   5/2002  Pong
2006/0053258 A1   3/2006  Liu

FOREIGN PATENT DOCUMENTS

WO    WO 02/13020      2/2002
WO    WO 02/13945      2/2002
WO    WO 2004/029776  10/2004

OTHER PUBLICATIONS

US Office Action mailed Oct. 4, 2006, issued in U.S. Appl. No. 10/289,492 [P018].
US Final Office Action dated May 3, 2007, issued in U.S. Appl. No. 10/289,492 [P018].
US Office Action dated Nov. 27, 2007, issued in U.S. Appl. No. 10/289,492 [P018].
US Final Office Action dated Apr. 15, 2008, issued in U.S. Appl. No. 10/289,492 [P018].
US Notice of Allowance and Fees Due dated Mar. 12, 2012, issued in U.S. Appl. No. 10/289,492 [P018].
International Search Report and Written Opinion dated Dec. 27, 2005, issued in Application No. 2003/034833 [P018WO].
"HyperTransport™ I/O Link Specification," revision 1.03, HyperTransport Consortium, Oct. 10, 2001, Copyright © 2001 HyperTransport Technology Consortium (127 pages).
US Office Action dated Mar. 6, 2013, issued in U.S. Appl. No. 13/327,483 [P018C1].
US Notice of Allowance and Fees Due dated Aug. 14, 2013, issued in U.S. Appl. No. 13/327,483 [P018C1].
US Notice of Allowance and Fees Due (supplemental) dated Sep. 9, 2013, issued in U.S. Appl. No. 13/327,483 [P018C1].
Memory Integrity, LLC v. Amazon.com, Inc., District of Delaware 1:13-cv-01795, Complaint and Answer; 2013.
Memory Integrity, LLC v. Apple, Inc., District of Delaware 1:13-cv-01796, Complaint and Answer; 2013.
Memory Integrity, LLC v. Archos S.A. and Archos, Inc., District of Delaware 1:13-cv-01981, Complaint and Answer; 2013.
Memory Integrity, LLC v. Asustek Computer, Inc. and ASUS Computer International, District of Delaware 1:13-cv-01797, Complaint and Answer; 2013.
Memory Integrity, LLC v. BlackBerry Ltd. and BlackBerry Corporation, District of Delaware 1:13-cv-01798, Complaint and Answer; 2013.
Memory Integrity, LLC v. Fuhu, Inc., District of Delaware 1:13-cv-01799, Complaint and Answer; 2013.
Memory Integrity, LLC v. Fujitsu Limited and Fujitsu America, Inc., District of Delaware 1:13-cv-01800, Complaint and Answer; 2013.
Memory Integrity, LLC v. Google, Inc. and Motorola Mobility, LLC, District of Delaware 1:13-cv-01801, Complaint and Answer; 2013.
Memory Integrity, LLC v. Hisense International Co., Ltd. and Hisense USA Corporation, District of Delaware 1:13-cv-01983, Complaint and Answer; 2013.
Memory Integrity, LLC v. HTC Corporation and HTC America, Inc., District of Delaware 1:13-cv-01802, Complaint and Answer; 2013.
Memory Integrity, LLC v. Huawei Device USA, Inc. and Futurewei Technologies, Inc., District of Delaware 1:13-cv-01803, Complaint and Answer; 2013.
Memory Integrity, LLC v. Intel Corporation, District of Delaware 1:13-cv-01804, Complaint and Answer; 2013.
Memory Integrity, LLC v. Lenovo Group Ltd. and Lenovo (United States) Inc., District of Delaware 1:13-cv-01805, Complaint and Answer; 2013.
Memory Integrity, LLC v. LG Electronics, Inc.; LG Electronics, USA, Inc.; and LG Electronics Mobilecomm USA, Inc., District of Delaware 1:13-cv-01806, Complaint and Answer; 2013.
Memory Integrity, LLC v. Microsoft Corporation, District of Delaware 1:13-cv-01984, Complaint and Answer; 2013.
Memory Integrity, LLC v. Motorola Solutions, Inc., District of Delaware 1:13-cv-01807, Complaint and Answer; 2013.
Memory Integrity, LLC v. Samsung Electronics Co., Ltd.; Samsung Electronics America, LLC; and Samsung Telecommunications, District of Delaware 1:13-cv-01808, Complaint and Answer; 2013.
Memory Integrity, LLC v. Sony Corporation; Sony Electronics Inc.; Sony Mobile Communications (USA) Inc.; and Sony Mobile, District of Delaware 1:13-cv-01809, Complaint and Answer; 2013.
Memory Integrity, LLC v. Toshiba Corporation; Toshiba America, Inc.; and Toshiba America Information Systems, Inc., District of Delaware 1:13-cv-01810, Complaint and Answer; 2013.
Memory Integrity, LLC v. ZTE Corporation and ZTE (USA) Inc., District of Delaware 1:13-cv-01811, Complaint and Answer; 2013.
AMD Press Release, AMD Announces 8th Generation Architecture for Microprocessors (2001).
AMD Press Release, Broadcom, Cisco, Nvidia, Sun Among First Adopters of Amd's [sic] New HyperTransport® Technology (2001).
AMD White Paper, HyperTransport® Technology I/O Link, AMD (2001).
AMD Press Release, AMD Discloses New Technologies at Microprocessor Forum (1999).
AMD Press Release, AMD Discloses Next-Generation AMD-K8™ Processor Microarchitecture at Microprocessor Forum (1998).
Joon-Ho Ha and Timothy Mark Pinkston, Speed DMON: Cache Coherence on an Optical Multichannel Interconnect Architecture, J. of Parallel & Distributed Computing 41, 78-91 (1997).
David Chaiken et al., Directory-Based Cache Coherence in Large-Scale Multiprocessors, Computer 23:6, 49-58 (1990).
Hermann Hellwagner and Alexander Reinefeld, SCI: Scalable Coherent Interface, Springer (1999).
D. A. Patterson and J. Hennessy, Computer Organization and Design: The Hardware/Software Interface, Morgan Kaufmann (1994).
Milo Tomasevic and Veljko Milutinovic, The Cache-Coherence Problem in Shared-Memory Multiprocessors: Hardware Solutions, IEEE Computer Society Press (1993).

* cited by examiner
[Sheet 1 of 12: FIGS. 1A and 1B — diagrammatic representations of systems having multiple processing clusters]

[Sheet 2 of 12: FIG. 2 — exemplary multiple-processor cluster with processors, memory banks, I/O switch, service processor, and interconnection controller]

[Sheet 3 of 12: FIG. 3 — interconnection controller with protocol engine, pending buffer, and coherent and noncoherent interfaces]

[Sheet 4 of 12: FIG. 4 — local processor with ports, routing tables, and JTAG handshake registers]

[Sheet 5 of 12: FIG. 5 — memory mapping scheme: requesting quad with local map (501) and global map (505)]

[Sheet 6 of 12: FIGS. 6A and 6B — four-cluster system (L# = link number, N# = node number) and combined routing table with local and global entries keyed by source and destination cluster and node]

[Sheet 7 of 12: FIG. 7 — flowchart: receive locally generated transaction (702); allocate space in pending buffer (704); append global transaction tag and cluster ID and transmit transaction (706); receive incoming transmissions related to transaction (708); index incoming transmission in pending buffer using global tag (710); if a local tag is required (712), use local tag from pending buffer entry (714)]

[Sheet 8 of 12: FIG. 8 — flowchart: receive remotely generated transaction (802); assign local transaction tag (804); allocate space in pending buffer (806); insert entry with global and local tags in pending buffer (808); receive outgoing transmission (810); index outgoing transmission in pending buffer using local tag (812); use global tag for this and related subsequent transactions (814)]

[Sheet 9 of 12: FIG. 9 — communications relating to an exemplary transaction in a multi-cluster system]

[Sheet 10 of 12: FIG. 10 — another exemplary interconnection controller, showing coherent and noncoherent interfaces]

[Sheet 11 of 12: FIG. 11 — exemplary mapping of protocol engines in processor clusters to a global memory space]

[Sheet 12 of 12: FIG. 12 — flowchart: transaction packet generated by local node in request cluster (1202); directed or broadcast? (1204); if directed, packet is transmitted directly to one of a plurality of PEs; if broadcast, the interconnection controller maps the packet to one of a plurality of PEs; PE determines how to process packet; PE maps current destination node and unit IDs to target node and unit IDs (1208); PE transmits packet to remote cluster (1210); remote cluster interconnection controller receives packet (1212); interconnection controller in remote cluster maps packet to one of a plurality of PEs (1214); PE determines how to process packet (1216); PE transmits packet to local node (1218); end]

TRANSACTION PROCESSING USING MULTIPLE PROTOCOL ENGINES

RELATED APPLICATION DATA

The present application is a continuation of and claims priority under 35 U.S.C. 120 to U.S. patent application Ser. No. 13/327,483 for Transaction Processing Using Multiple Protocol Engines filed Dec. 15, 2011, which is a continuation of U.S. patent application Ser. No. 10/289,492 for Transaction Processing Using Multiple Protocol Engines in Systems Having Multi-Processor Clusters filed Nov. 5, 2002, now U.S. Pat. No. 8,185,602, the entire disclosures of both of which are incorporated herein by reference for all purposes.
BACKGROUND OF THE INVENTION

The present invention relates generally to multi-processor computer systems. More specifically, the present invention provides techniques for building computer systems having a plurality of multi-processor clusters.

A relatively new approach to the design of multi-processor systems replaces broadcast communication among processors with a point-to-point data transfer mechanism in which the processors communicate similarly to network nodes in a tightly-coupled computing system. That is, the processors are interconnected via a plurality of communication links, and requests are transferred among the processors over the links according to routing tables associated with each processor. The intent is to increase the amount of information transmitted within a multi-processor platform per unit time.

One limitation associated with such an architecture is that the node ID address space associated with the point-to-point infrastructure is fixed, therefore allowing only a limited number of nodes to be interconnected. The infrastructure is also flat, therefore allowing only a single level of mapping for address spaces and routing functions. In addition, the processing throughput for transactions in a computer system employing such a point-to-point data transfer mechanism may be limited by the capacity of the protocol engine responsible for processing those transactions.

It is therefore desirable to provide techniques by which computer systems employing such an infrastructure as a basic building block are not so limited.
SUMMARY OF THE INVENTION

According to the present invention, a multi-processor system is provided in which a plurality of multi-processor clusters, each employing a point-to-point communication infrastructure, are interconnected. The invention employs multiple protocol engines in each cluster to process transactions, thereby improving transaction processing throughput. According to a specific embodiment, transaction packets are mapped to the various protocol engines associated with a cluster according to the target address. According to a more specific embodiment, transaction packets which do not specify a target address are mapped to a protocol engine based on information in the packet which may be used to identify the cluster and cluster resource for which the packet is intended.
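As a minimal sketch only (the names packet_t, pe_index, and NUM_PROTOCOL_ENGINES are illustrative assumptions, not taken from the disclosure), such a destination-based mapping might look like the following in C: directed packets are steered by target address, and packets without one fall back to the cluster and unit identifiers they carry.

```c
#include <stdint.h>

#define NUM_PROTOCOL_ENGINES 4  /* assumed engine count */

typedef struct {
    int      has_target_addr;   /* nonzero for packets that carry an address */
    uint64_t target_addr;       /* memory address the packet targets */
    uint8_t  dest_cluster;      /* cluster for which the packet is intended */
    uint8_t  dest_unit;         /* resource (unit) within that cluster */
} packet_t;

/* Select a protocol engine. Interleaving on address bits keeps all
 * packets for a given address region on the same engine, preserving
 * per-address ordering while spreading load across engines. */
static unsigned pe_index(const packet_t *p)
{
    if (p->has_target_addr)
        return (unsigned)((p->target_addr >> 12) % NUM_PROTOCOL_ENGINES);
    /* No target address: derive the engine from the destination info. */
    return (unsigned)((p->dest_cluster ^ p->dest_unit) % NUM_PROTOCOL_ENGINES);
}
```

Keying on address bits above the block offset is one plausible balancing choice; the disclosure itself only requires that the mapping be a function of destination information.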
Thus, the present invention provides a computer system including a plurality of processor clusters. Each cluster includes a plurality of local nodes and an interconnection controller interconnected by a local point-to-point architecture. The interconnection controller in each cluster comprises a plurality of protocol engines for processing transactions.
At least one of the interconnection controller and the local nodes in each cluster is operable to map the transactions to the protocol engines according to destination information associated with the transactions. According to one embodiment, the interconnection controller effects the mapping with reference to target addresses associated with the transactions. According to another embodiment, the local nodes effect the mapping by mapping the target addresses to one of a plurality of nodes associated with the local interconnection controller, each of which corresponds to at least one of the protocol engines.

A further understanding of the nature and advantages of the present invention may be realized by reference to the remaining portions of the specification and the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A and 1B are diagrammatic representations depicting systems having multiple clusters.

FIG. 2 is a diagrammatic representation of an exemplary cluster having a plurality of processors for use with specific embodiments of the present invention.

FIG. 3 is a diagrammatic representation of an exemplary interconnection controller for facilitating various embodiments of the present invention.

FIG. 4 is a diagrammatic representation of a local processor for use with various embodiments of the present invention.

FIG. 5 is a diagrammatic representation of a memory mapping scheme for use with various embodiments of the invention.

FIG. 6A is a simplified block diagram of a four cluster system for use with various embodiments of the invention.

FIG. 6B is a combined routing table including routing information for the four cluster system of FIG. 6A.

FIGS. 7 and 8 are flowcharts illustrating transaction management in a multi-cluster system according to various embodiments of the invention.

FIG. 9 is a diagrammatic representation of communications relating to an exemplary transaction in a multi-cluster system.

FIG. 10 is another diagrammatic representation of an exemplary interconnection controller for facilitating various embodiments of the present invention.

FIG. 11 is an exemplary mapping of protocol engines in a processor cluster to a global memory space in a multi-cluster system.

FIG. 12 is a flowchart illustrating mapping of transactions to protocol engines according to a specific embodiment of the invention.
DETAILED DESCRIPTION OF SPECIFIC EMBODIMENTS

Reference will now be made in detail to some specific embodiments of the invention, including the best modes contemplated by the inventors for carrying out the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. While the invention is described in conjunction with these specific embodiments, it will be understood that it is not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. Multi-processor architectures having point-to-point communication among their processors are suitable for implementing specific embodiments
of the present invention. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be practiced without some or all of these specific details. Well known process operations have not been described in detail in order not to unnecessarily obscure the present invention. Furthermore, the present application's reference to a particular singular entity includes the possibility that the methods and apparatus of the present invention can be implemented using more than one entity, unless the context clearly dictates otherwise.

FIG. 1A is a diagrammatic representation of one example of a multiple cluster, multiple processor system which may employ the techniques of the present invention. Each processing cluster 101, 103, 105, and 107 includes a plurality of processors. The processing clusters 101, 103, 105, and 107 are connected to each other through point-to-point links 111a-f. The multiple processors in the multiple cluster architecture shown in FIG. 1A share a global memory space. In this example, the point-to-point links 111a-f are internal system connections that are used in place of a traditional front-side bus to connect the multiple processors in the multiple clusters 101, 103, 105, and 107. The point-to-point links may support any point-to-point coherence protocol.

FIG. 1B is a diagrammatic representation of another example of a multiple cluster, multiple processor system that may employ the techniques of the present invention. Each processing cluster 121, 123, 125, and 127 is coupled to a switch 131 through point-to-point links 141a-d. It should be noted that using a switch and point-to-point links allows implementation with fewer point-to-point links when connecting multiple clusters in the system. A switch 131 can include a general purpose processor with a coherence protocol interface. According to various implementations, a multi-cluster system shown in FIG. 1A may be expanded using a switch 131 as shown in FIG. 1B.
FIG. 2 is a diagrammatic representation of a multiple processor cluster such as, for example, cluster 101 shown in FIG. 1A. Cluster 200 includes processors 202a-202d, one or more Basic I/O systems (BIOS) 204, a memory subsystem comprising memory banks 206a-206d, point-to-point communication links 208a-208e, and a service processor 212. The point-to-point communication links are configured to allow interconnections between processors 202a-202d, I/O switch 210, and interconnection controller 230. The service processor 212 is configured to allow communications with processors 202a-202d, I/O switch 210, and interconnection controller 230 via a JTAG interface represented in FIG. 2 by links 214a-214f. It should be noted that other interfaces are supported. I/O switch 210 connects the rest of the system to I/O adapters 216 and 220, and to BIOS 204 for booting purposes.

According to specific embodiments, the service processor of the present invention has the intelligence to partition system resources according to a previously specified partitioning schema. The partitioning can be achieved through direct manipulation of routing tables associated with the system processors by the service processor, which is made possible by the point-to-point communication infrastructure. The routing tables can also be changed by execution of the BIOS code in one or more processors. The routing tables are used to control and isolate various system resources, the connections between which are defined therein.

The processors 202a-d are also coupled to an interconnection controller 230 through point-to-point links 232a-d. According to various embodiments and as will be described below in greater detail, interconnection controller 230 performs a variety of functions which enable the number of
interconnected processors in the system to exceed the node ID space and mapping table limitations associated with each of a plurality of processor clusters. According to some embodiments, interconnection controller 230 performs a variety of other functions including the maintaining of cache coherency across clusters. Interconnection controller 230 can be coupled to similar controllers associated with other multi-processor clusters. It should be noted that there can be more than one such interconnection controller in one cluster. Interconnection controller 230 communicates with both processors 202a-d as well as remote clusters using a point-to-point protocol.

More generally, it should be understood that the specific architecture shown in FIG. 2 is merely exemplary and that embodiments of the present invention are contemplated having different configurations and resource interconnections, and a variety of alternatives for each of the system resources shown. However, for purposes of illustration, specific details of cluster 200 will be assumed. For example, most of the resources shown in FIG. 2 are assumed to reside on a single electronic assembly. In addition, memory banks 206a-206d may comprise double data rate (DDR) memory which is physically provided as dual in-line memory modules (DIMMs). I/O adapter 216 may be, for example, an ultra direct memory access (UDMA) controller or a small computer system interface (SCSI) controller which provides access to a permanent storage device. I/O adapter 220 may be an Ethernet card adapted to provide communications with a network such as, for example, a local area network (LAN) or the Internet. BIOS 204 may be any persistent memory like flash memory.

According to one embodiment, service processor 212 is a Motorola MPC855T microprocessor which includes integrated chipset functions, and interconnection controller 230 is an Application Specific Integrated Circuit (ASIC) supporting the local point-to-point coherence protocol. Interconnection controller 230 can also be configured to handle a non-coherent protocol to allow communication with I/O devices. In one embodiment, interconnection controller 230 is a specially configured programmable chip such as a programmable logic device or a field programmable gate array. In another embodiment, the interconnect controller 230 is an Application Specific Integrated Circuit (ASIC). In yet another embodiment, the interconnect controller 230 is a general purpose processor augmented with an ability to access and process interconnect packet traffic.

FIG. 3 is a diagrammatic representation of one example of an interconnection controller 230 for facilitating various aspects of the present invention. According to various embodiments, the interconnection controller includes a protocol engine 305 configured to handle packets such as probes and requests received from processors in various clusters of a multiprocessor system. The functionality of the protocol engine 305 can be partitioned across several engines to improve performance. In one example, partitioning is done based on packet type (request, probe and response), direction (incoming and outgoing), or transaction flow (request flows, probe flows, etc.), as the sketch below illustrates.
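As a sketch of one such partitioning (assumed names and layout; the disclosure does not prescribe this), a distinct engine can be dedicated to each combination of packet type and direction:

```c
/* Illustrative assumption: one engine per (packet type, direction)
 * pair, giving 3 x 2 = 6 engines; a transaction-flow split would key
 * on the flow (request flow, probe flow) instead. */
typedef enum { PKT_REQUEST, PKT_PROBE, PKT_RESPONSE } pkt_type_t;
typedef enum { DIR_INCOMING, DIR_OUTGOING } pkt_dir_t;

typedef struct { int id; /* per-engine state would live here */ } engine_t;

static engine_t engines[3][2];  /* indexed by [type][direction] */

static engine_t *select_engine(pkt_type_t t, pkt_dir_t d)
{
    return &engines[t][d];
}
```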
The protocol engine 305 has access to a pending buffer 309 that allows the interconnection controller to track transactions such as recent requests and probes and associate the transactions with specific processors. Transaction information maintained in the pending buffer 309 can include transaction destination nodes, the addresses of requests for subsequent collision detection and protocol optimizations, response information, tags, and state information. As will become clear, this functionality is leveraged to enable particular aspects of the present invention.
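A pending-buffer entry of the kind just described might be sketched as follows (field names and widths are assumptions); the global/local tag pair mirrors the flows of FIGS. 7 and 8, in which incoming transmissions are indexed by global tag and outgoing ones by local tag:

```c
#include <stdint.h>

#define PENDING_SLOTS 64  /* assumed buffer depth */

typedef struct {
    uint16_t global_tag;  /* tag unique across the whole system */
    uint16_t local_tag;   /* tag meaningful only within this cluster */
    uint8_t  dest_node;   /* transaction destination node */
    uint64_t req_addr;    /* request address, kept for collision detection */
    uint8_t  state;       /* protocol state for the transaction */
} pending_entry_t;

static pending_entry_t pending_buffer[PENDING_SLOTS];

/* Incoming transmissions are matched by global tag; a linear scan keeps
 * the sketch simple (real hardware would use a CAM or hash structure). */
static pending_entry_t *lookup_by_global_tag(uint16_t tag)
{
    for (int i = 0; i < PENDING_SLOTS; i++)
        if (pending_buffer[i].global_tag == tag)
            return &pending_buffer[i];
    return 0;
}
```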
The interconnection controller has a coherent protocol interface 307 that allows the interconnection controller to communicate with other processors in the cluster as well as external processor clusters. The interconnection controller may also include other interfaces such as a non-coherent protocol interface 311 for communicating with I/O devices (e.g., as represented in FIG. 2 by links 208c and 208d). According to various embodiments, each interface 307 and 311 is implemented either as a full crossbar or as separate receive and transmit units using components such as multiplexers and buffers. It should be noted that the interconnection controller 230 does not necessarily need to provide both coherent and non-coherent interfaces. It should also be noted that an interconnection controller 230 in one cluster can communicate with an interconnection controller 230 in another cluster.
According to various embodiments of the invention, processors 202a-202d are substantially identical. FIG. 4 is a simplified block diagram of such a processor 202 which includes an interface 402 having a plurality of ports 404a-404c and routing tables 406a-406c associated therewith. Each port 404 allows communication with other resources, e.g., processors or I/O devices, in the computer system via associated links, e.g., links 208a-208e of FIG. 2.

The infrastructure shown in FIG. 4 can be generalized as a point-to-point, distributed routing mechanism which comprises a plurality of segments interconnecting the system's processors according to any of a variety of topologies, e.g., ring, mesh, etc. Each of the endpoints of each of the segments is associated with a connected processor which has a unique node ID and a plurality of associated resources which it "owns," e.g., the memory and I/O to which it's connected.

The routing tables associated with each of the nodes in the distributed routing mechanism collectively represent the current state of interconnection among the computer system resources. Each of the resources (e.g., a specific memory range or I/O device) owned by any given node (e.g., processor) is represented in the routing table(s) associated with the node as an address. When a request arrives at a node, the requested address is compared to a two-level entry in the node's routing table identifying the appropriate node and link, i.e., given a particular address within a range of addresses, go to node x; and for node x use link y.

As shown in FIG. 4, processor 202 can conduct point-to-point communication with three other processors according to the information in the associated routing tables. According to a specific embodiment, routing tables 406a-406c comprise two-level tables, a first level associating the unique addresses of system resources (e.g., a memory bank) with a corresponding node (e.g., one of the processors), and a second level associating each node with the link (e.g., 208a-208e) to be used to reach the node from the current node.
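A minimal sketch of this two-level lookup, under assumed table sizes and names, is:

```c
#include <stdint.h>

#define ADDR_MAP_ENTRIES 8
#define MAX_NODES        8  /* 3-bit node ID space */

typedef struct { uint64_t base, limit; uint8_t node; } addr_map_entry_t;

static addr_map_entry_t addr_map[ADDR_MAP_ENTRIES];  /* level 1: address -> node */
static uint8_t          link_for_node[MAX_NODES];    /* level 2: node -> link */

/* "Given a particular address within a range of addresses, go to node x;
 * and for node x use link y." Returns 0 on success, -1 if unmapped. */
static int route(uint64_t addr, uint8_t *node, uint8_t *link)
{
    for (int i = 0; i < ADDR_MAP_ENTRIES; i++) {
        if (addr >= addr_map[i].base && addr <= addr_map[i].limit) {
            *node = addr_map[i].node;
            *link = link_for_node[*node];
            return 0;
        }
    }
    return -1;
}
```

Splitting the tables this way means a topology change only rewrites the small node-to-link table, while address ownership stays untouched; that division of labor is the point of the two levels.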
Processor 202 also has a set of JTAG handshake registers 408 which, among other things, facilitate communication between the service processor (e.g., service processor 212 of FIG. 2) and processor 202. That is, the service processor can write routing table entries to handshake registers 408 for eventual storage in routing tables 406a-406c. It should be understood that the processor architecture depicted in FIG. 4 is merely exemplary for the purpose of describing a specific embodiment of the present invention. For example, a fewer or greater number of ports and/or routing tables may be used to implement other embodiments of the invention.

As mentioned above, the basic protocol upon which the clusters in specific embodiments of the invention are based
provides for a limited node ID space which, according to a particular implementation, is a 3-bit space, therefore allowing for the unique identification of only 8 nodes. That is, if this basic protocol is employed without the innovations represented by the present invention, only 8 nodes may be interconnected in a single cluster via the point-to-point infrastructure. To get around this limitation, the present invention introduces a hierarchical mechanism which preserves the single-layer identification scheme within particular clusters while enabling interconnection with and communication between other similarly situated clusters and processing nodes.
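One assumed way to picture such a hierarchy is a two-tier identifier in which the protocol's 3-bit node ID remains valid within a cluster and a separate cluster ID distinguishes clusters; the encoding below is purely illustrative:

```c
#include <stdint.h>

/* Two-tier identifier: local_node reuses the protocol's 3-bit node ID
 * unchanged, so nodes inside a cluster still see the flat scheme, while
 * cluster_id extends addressing across clusters. */
typedef struct {
    uint8_t  cluster_id;       /* assigned at the inter-cluster tier */
    unsigned local_node : 3;   /* 3-bit node ID, unique only within a cluster */
} global_node_id_t;

/* Example: with an 8-bit cluster ID, 256 clusters x 8 nodes = 2048
 * addressable nodes, versus 8 under the flat 3-bit scheme. */
```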
According to a specific embodiment, one of the nodes in each multi-processor cluster is an interconnection controller, e.g., interconnection controller 230 of FIG. 2, which manages the hierarchical mapping of information, thereby enabling multiple clusters to share a single memory address space while simultaneously allowing the processors within its cluster to operate and to interact with any processor in any cluster without "knowledge" of anything outside of their own cluster. The interconnection controller appears to its associated processor to be just another one of the processors or nodes in the cluster.

In the basic protocol, when a particular processor in a cluster generates a request, a set of address mapping tables are employed to map the request to one of the other nodes in the cluster. That is, each node in a cluster has a portion of a shared memory space with which it is associated. There are different types of address mapping tables for main memory, memory-mapped I/O, different types of I/O space, etc. These address mapping tables map the address identified in the request to a particular node in the cluster.
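Sketched with assumed names (none of which appear in the disclosure), these per-type tables might each resolve a request address to the owning node as follows:

```c
#include <stdint.h>

/* One mapping table per address-space type, as the text describes. */
typedef enum { SPACE_MAIN_MEMORY, SPACE_MMIO, SPACE_IO, SPACE_COUNT } space_t;

typedef struct { uint64_t base, limit; uint8_t node; } map_entry_t;

static map_entry_t address_maps[SPACE_COUNT][8];  /* assumed sizes */

static int node_for_request(space_t space, uint64_t addr, uint8_t *node)
{
    for (int i = 0; i < 8; i++) {
        const map_entry_t *e = &address_maps[space][i];
        if (addr >= e->base && addr <= e->limit) {
            *node = e->node;   /* request maps to this node in the cluster */
            return 0;
        }
    }
    return -1;                 /* address not owned by any node here */
}
```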
A set of routing tables are then employed to determine how to get from the requesting node to the node identified from the address mapping table. That is, as discussed above, each processor (i.e., cluster node) has associated routing tables which identify a particular link in the point-to-point infrastructure which m
