`US005170482A
`United States Patent [is]
`[ii] Patent Number:
`5,170,482
`US005170482A
`[45] Date of Patent:
`Sbu et al.
`Dec. 8, 1992
`United States Patent [191
`5,170,482
`[11] Patent Number:
`Dec. 8, 1992
`Shu et a1.
`[45] Date of Patent:
`TOPOLOGY FOR
`
`[54] IMPROVED HYPERCUBE
`Stephen Colley, et al., "A Microprocessor-based
`MULTIPROCESSOR COMPUTER SYSTEMS
`IEEE Micro, Oct. 1986.
`Hypercube Supercomputer,"
`"Combining Parallel and Sequential Sorting on a Bool
`Stephen Colley, et al., "A Microprocessor-based
`[54] IMPROVED HYPERCUBE TOPOLOGY FOR
`[75] Inventors: Renben Shu, St. Paul; David H. C.
`ean n-Cube",
`Johnsson;
`
`IEEE, copyright 1984.
`Hypercube Supercomputer,“ IEEE Micro, Oct. 1986.
`MULTIPROCESSOR COMPUTER SYSTEMS
`Du, New Brighton, both of Minn.
`"Dense Trivalent Graphs for Processor Interconnec
`"Combining Parallel and Sequential Sorting on a Bool
`tion", Leland and Solomon; IEEE Transactions on
`[75] Inventors: Renben Shu, St. Paul; David H. C.
`[73] Assignee: Regents of the University of
`ean n-Cube", Johnsson; IEEE, copyright 1984.
`Computers, vo. C-31, No. 3, Mar. 1982.
`Du, New Brighton, both of Minn.
`Minnesota, St. Paul, Minn.
`"Dense Trivalent Graphs for Processor Interconnec
`"A Survey of Interconnection Networks", Tse-yun
`[73] Assignee:
`tion”, Leland and Solomon; IEEE Transactions on
`Regents of the University of
`[21] Appl. No.: 655,258
`Feng, IEEE, Dec. 1981.
`Computers, v0. C-3l, No. 3, Mar. 1982.
`Minnesota, St. Paul, Minn.
`(List continued on next page.)
`Feb. 13, 1991
`[22] Filed:
`“A Survey of Interconnection Networks", Tse-yun
`[21] Appl.No.: 655,258
`Feng, IEEE, Dec. 1981.
`Primary Examiner—Thomas C. Lee
`Related U.S. Application Data
`(List continued on next page.)
`Assistant Examiner—Robert B. Harrell
`[22] Filed:
`Feb. 13, 1991
`Attorney, Agent, or Firm—Merchant, Gould, Smith,
`[63] Continuation of
`
`Ser. No. 85,980, Aug. 14, 1987, aban
`Primary Examiner-Thomas C. Lee
`Edell, Welter & Schmidt
`doned.
`Related 1.1.5. Application Data
`Assistant Examiner—Robert B. Harrell
`[51] Int. 0.5
`G06F 13/00
`[57]
`ABSTRACT
`Attorney. Agent, or Firm-Merchant, Gould, Smith,
`[63]
`Continuation of Ser. No. 85,980, Aug. 14, 1987, aban~
`[52] U.S. a.
`395/800; 364/DIG.
`1;
`doned.
`Edell, Welter & Schmidt
`A hypercube
`system which has been modified by add
`364/DIG.
`2; 364/931; 364/931.4;
`364/931.41;
`ing additional communication links between the most
`[57]
`ABSTRACT
`[51] Int. C1.5 ............................................ .. G06F 13/00
`364/229
`distant nodes of a classic hypercube topology is de
`[52] US. Cl. ........................... .. 395/800; 364/D1G. l;
`364/200, 900;
`[58] Field of Search
`A hypercube system which has been modi?ed by add
`scribed herein. This
`
`in improvement a hypercube topol
`
`364/D1G. 2; 364/931; 364/93l.4; 364/9314];
`1, DIG. 2, 800, 700, 325
`395/DIG.
`ing additional communication links between the most
`ogy is termed as a Modified Hypercube
`topology.
`364/229
`distant nodes of a classic hypercube topology is de
`a topology contains extra links which connects a node
`References Cited
`[56]
`[58] Field of Search ............................. .. 364/200, 900;
`scribed herein. This improvement in a hypercube topol
`to another node in the topology which requires the
`395/DIG. l, DIG. 2, 800, 700, 325
`U . S . PATENT DOCUMENTS
`ogy is termed as a Modi?ed Hypercube topology. Such
`greatest number of nodal hops over the shortest path.
`a topology contains extra links which connects a node
`References Cited
`[56]
`Also stated another way, that node having
`the greatest
`364/200
`4,247,892 1/1981 Lawrence
`to another node in the topology which requires the
`number of singly traversed or hopped nodes along the
`358/213
`4,574,311 3/1986 Resnikoff et al
`U.S. PATENT DOCUMENTS
`greatest number of nodal hops over the shortest path.
`. 370/60
`4.598,400 7/1986 Hillis
`shortest path from an originating node to that node
`Also stated another way, that node having the greatest
`4,247,892 l/1981 Lawrence ......................... .. 364/200
`4,691,291 9/1987 Wolfrom
`364/717
`makes that node the most distant processor node. If
`number of singly traversed or hopped nodes along the
`4,574,311 3/1986 ResnikolT et a1.
`358/213
`4.729.095 3/1988 Colley et al
`364/200
`hamming were
`to
`be
`implemented
`in
`4,598,400 7/1986 Hillis ........... ..
`370/60
`shortest path from an originating node to that node
`. 371/43
`4,730,322 3/1988 Pollara-Bizzola
`added an extra link between two nodes having the
`4,691,291 9/1987 Wolfrom ..... ..
`364/717
`364/200
`4,739,476 4/1988 Fiduccia
`makes that node the most distant processor node. If
`4,729,095 3/1988 Colley et a1.
`364/200
`greatest hamming
`distance.
`Such
`a
`(List continued on next page.)
`hamming were to be implemented in the system, there is
`4,730,322 3/1988 Pollara-Bizzola .
`371/43
`nological trade off to reduce the diameter of a classic
`added an extra link between two nodes having the
`4,739,476 4/1988 Fiduccia ............................ .. 364/200
`FOREIGN
`PATENT
`DOCUMENTS
`hypercube at the cost of incrementally increasing the
`greatest hamming distance. Such a system makes a tech
`(List continued on next page.)
`number of I/O ports at each node. This trade off has
`nological trade off to reduce the diameter 01' a classic
`30132926 5/1984 European Pat. Off. .
`FOREIGN PATENT DOCUMENTS
`been recognized in the industry
`as
`advantageous
`hypercube at the cost of incrementally increasing the
`0251594 1/1988 European Pat. OfT. .
`63-186365 8/1988 Japan .
`number of I/O ports at each node. This trade of!‘ has
`great gain
`in performance
`is achieved
`n
`exchange
`30132926 5/1984 European Pat. Off. .
`(List continued on next page.)
`been recognized in the industry as advantageous since a
`0251594 1/1988 European Pat. Off. .
`incremental
`impact to the hardware.
`Clearly
`the
`63-186365 8/1988 Japan .
`great gain in performance is achieved 11 exchange for an
`mance advantages
`of
`the
`present
`invention
`OTHER PUBLICATIONS
`(List continued on next page.)
`incremental impact to the hardware. Clearly the perfor
`the hypercube
`grows
`and the maxi
`
`number of nodes
`in
`Franco P. Preparata,
`Cube-Connected Cy
`
`et
`al.,
`"The
`mance advantages of the present invention grows as the
`mum distance between nodes increases.
`OTHER PUBLICATIONS
`cles; A Versatile Network For Parallel Computation,"
`number of nodes in the hypercube grows and the maxi
`Franco P. Preparata, et al., "The Cube-Connected Cy
`Comm. of ACM vol. 24, No. 5 May 1981.
`mum distance between nodes increases.
`cles: A Versatile Network For Parallel Computation,"
`Marshall C. Pease, "The
`Indirect Binary n-Cube Mi-
`Comm. of ACM v01. 24, No. 5 May 1981.
`IEEE Transaction on Computers,
`croprocess Array,"
`Marshall C. Pease, “The Indirect Binary n-Cube Mi
`May 1977.
`croprocess Array,” IEEE Transaction on Computers,
`May 1977.
`
`14 Claims, 6 Drawing Sheets
`14 Claims, 6 Drawing Sheets
`
`Such
`
`the
`
`system
`
`110
`
`III 4^3
`\ Y
`
`100
`
`nk \
`
`/''IDI
`
`010 "V
`
`•jr
`
`'^S
`
`000
`
`Oil
`
`®00l
`
`BUNGIE - EXHIBIT 1022
`
`
`
`5,170,482
`5,170,482
`Page 2
`Page 2
`
`Pamphlet: "N—Cube Ten-True Parallel Computing",
`U S PATENT DOCUMENTS
`US PATENT DOCUMENTS
`Pamphlet: “N-Cube Ten-True Parallel Computing",
`from N—Cube Corporation.
`4,766,534 &/198& DeBenedictis
`364/200
`from N—Cube Corporation.
`"Performance of the Direct Binary n-Cube Network
`4,713,038 9/\98& HiHis e\ al
`4,766,534 8/1988 DeBenedictis .................... .. 364/200
`364/900
`“Performance of the Direct Binary n-Cube Network
`4,773,038 9/1988 Hillis et a].
`364/900
`for Multiprocessors," Abraham and Padmanabhan;
`4,773,873 9/1988 Hillis
`439/260
`for Multiprocessors," Abraham and Padmanabhan;
`4,773,873 9/1988 Hillis ........ ..
`439/260
`IEEE, copyright 1986.
`4,791,641 12/1988 Hillis
`. 371/38
`4,791,641 12/1988 Hillis
`371/38
`IEEE, copyright 1986.
`4,805,091 2/1989 Thiel et al
`364/200
`"Distributed Routing Algorithms for Broadcasting and
`2/1989 Thiel et a.
`364/200
`“Distributed Routing Algorithms for Broadcasting and
`4,805,173 2/1989 Hillis et al.
`371/38
`4,805,173 2/1989 Hillis et al
`.. 371/38
`Personalized Communication in Hypercubes", Ho and
`Personalized Communication in Hypercubes", Ho and
`4,809,202 2/1989 Wolfram ......... ..
`364/578
`4,809,202 2/1989 Wolfram
`364/578
`Johnson; IEEE, copyright 1986.
`Johnson; IEEE, copyright 1986.
`4,814,980 3/1989 Peterson et a1. .... ..
`364/200
`4,814,980 3/1989 Peterson et al
`364/200
`“The Architecture of a Homogeneous Vector Super
`"The Architecture of a Homogeneous Vector Super
`4,827,403 5/1989 Steele, Jr. et a].
`364/200
`4,827,403 5/1989 Steele, Jr. et al
`364/200
`computer“, Gustafson, Hawkinson and Scott; IEEE,
`computer", Gustafson, Hawkinson and Scott; IEEE,
`copyright 1986.
`FOREIGN PATENT DOCUMENTS
`copyright 1986.
`“Scalability of a Binary Tree on a Hypercube", Desh
`FOREIGN PATENT DOCUMENTS
`“108808566 3/1988 PCT Int'l Appl. .
`"Scalability of a Binary Tree on a Hypercube", Desh-
`pande and Jenevein; IEEE, copyright 1986.
`WO8804077 6/1988 PCT Int‘] Appl. .
`W08808566 3/1988 PCT Int'l Appl. .
`"Multigrid Algorithms on the Hypercube Multiproces
`pande and Jenevein; IEEE, copyright 1986.
`WO8808652 11/1988 PCT Int’l App]. .
`W08804077 6/1988 PCT Int'l Appl. .
`sor", Chan and Saad; IEEE Transactions on Comput
`"Multigrid Algorithms on the Hypercube Multiproces
`WO8808652 11/1988 PCT Int'l Appl. ,
`ers, vol. C(?), No. 11, Nov. 1986.
`OTHER PUBLICATIONS
`sor", Chan and Saad; IEEE Transactions on Comput
`"How Robust is the n—Cube?”, Becker and Simon;
`ers, vol. C(?), No. 11, Nov. 1986.
`“Architecture of a Hypercube Supercomputer", Hayes,
`IEEE, copyright 1986.
`OTHER PUBLICATIONS
`"How Robust is the n—Cube?", Becker and Simon;
`Madge, Stout, et al., IEEE, copyright 19867
`"On Folded Hypercubers”, by Lati? and El—Amawy,
`"Architecture of a Hypercube Supercomputer", Hayes,
`IEEE, copyright 1986.
`Collection of slides taken fron an N-Cube Corporation
`published in the 1989 International Conference on Par
`allel Processing.
`presentation, identi?ed on ?rst page with the N-Cube
`"On Folded Hypercubers", by Latifi and El-Amawy,
`Madge, Stout, et al., IEEE, copyright 1986.
`founders.
`“Properties and Performance of Folded Hypercubes",
`Collection of slides taken fron an N-Cube Corporation
`published in the 1989 International Conference on Par
`Selected portion of book entitled The Connection Ma
`by El-Amawy and Latifi, published in the IEEE Trans
`presentation, identified on first page with the N-Cube
`allel Processing.
`chine, by Hills, MIT Press.
`actions on Parallel and Distributed Systems.
`"Properties and Performance of Folded Hypercubes",
`founders.
`by El-Amawy and Latifi, published in the IEEE Trans
`Selected portion of book entitled The Connection Ma
`actions on Parallel and Distributed Systems.
`chine, by Hills, MIT Press.
`
`
`
`Ki
`00
`
`-J o N*
`cn
`
`0\
`o
`&
`sr
`1/1
`
`K>
`SO
`VO
`
`oo
`rt> p
`©
`
`S
`o
`(0
`*0
`C/3
`c!
`
`FIG. IF
`
`901
`
`900
`FIG. IC
`
`FIG. IE
`
`Tir
`
`<>
`
`901
`
`FIG. IB
`
`900
`
`901
`
`901
`
`900
`
`(PRIOR ART)
`
`FIG. ID
`
`o
`
`O
`
`100
`
`901
`
`FIG. IA
`
`900
`
`901
`
`
`
`U.S. Patent
`US. Patent
`
`Dec. 8, 1992
`Dec. 8, 1992
`
`Sheet 2 of 6
`Sheet 2 of 6
`
`5,170,482
`5,170,482
`
`f
`
`r
`Vertex (Node)
`
`Vertex (Node) V
`
`Q I
`I
`
`(PRIOR ART)
`(PRIOR ART)
`
`-As.
`k
`
`900
`901
`900
`90l
`u-l
`<? u
`10
`I0
`ll
`
`Edge
`Edge
`
`6 0
`0
`FIG. 2 A
`FIG.2A
`
`00<!>
`00
`
`401
`0|
`FIG.2B
`FIG.2B
`
`\
`
`III
`
`oil
`
`110
`
`100
`
`101
`010
`
`000
`
`001
`FIG.2C
`
`1110
`
`1010
`
`mi
`
`0110
`
`0010
`
`Old
`1100! ^
`
`0100
`
`9
`
`0101
`
`v* j 1011
`
`0011
`
`v fXJOl
`
`0000
`
`1101
`
`900
`
`-901
`
`1001
`
`1000
`
`FIG. 2D
`
`
`
`U.S. Patent
`US. Patent
`
`Dec. 8, 1992
`Dec. 8, 1992
`
`Sheet 3 of 6
`Sheet 3 of 6
`
`5,170,482
`5,170,482
`
`ro
`n .07...
`o
`(i.
`
`>«
`
`a»:
`
`X
`
`3......
`
`K
`
`K
`
`i
`I
`Q
`
`i
`i
`
`X
`
`o
`ti.
`
`a2:
`
`x"
`
`X
`
`K)
`
`x
`
`<M
`x
`
`V
`6
`
`x"0
`
`o S O
`ox-x
`
`X
`
`
`
`U.S. Patent
`US. Patent
`
`Dec. 8, 1992
`Dec. 8, 1992
`
`Sheet 4 of 6
`Sheet 4 of 6
`
`5,170,482
`5,170,482
`
`FIG. 6
`
`no
`
`HI
`
`FIG. 5
`
`10
`
`o
`
`II
`
`*>
`
`>
`
`100
`
`oo 6^
`
`01
`
`000
`
`A, A
`
`101
`
`-k \
`oi9^>^
`V-
`•X
`
`/ /
`f 9
`
`J
`
`0,1
`
`FIG. 8
`
`To (101,10)
`
`TotllO, II)
`
`(001,10)
`
`(001,11)
`
`(001,00]
`
`(001,01)
`
`Node 001
`Node OOI
`Of MH3
`Of NH;
`Replaced
`Replaced
`With MH2
`Wiih MHz
`
`To (000,00)
`To (000,00)
`
`To (011,01)
`To (0".0l)
`
`
`
`U.S. Patent
`US. Patent
`
`Dec. 8, 1992
`Dec. 8, 1992
`
`Sheet 5 of 6
`Sheet 5 of 6
`
`5,170,482
`5,170,482
`
`3
`3
`
`2
`2
`
`5
`
`4
`4
`
`4
`4
`
`5
`
`7
`7
`
`6
`
`7
`
`6
`6
`
`2
`
`3
`3
`
`0
`O
`
`I
`l
`FIG. 7A
`FIG. 7A
`
`0
`O
`
`l
`I
`FI6.7B
`FIG. 7B
`
`6
`6
`
`|
`I
`
`6
`6
`
`7
`7
`
`4
`4
`
`3
`
`4
`4
`
`5
`
`2
`
`5
`5
`
`2
`
`3
`3
`
`0
`O
`
`7
`7
`
`0
`O
`
`I
`l
`
`FIG. 7C
`FIG. 7C
`
`~
`
`FIG.7D
`FIG. 7 D
`
`
`
`U.S. Patent
`US. Patent
`
`Dec. 8, 1992
`Dec. 8, 1992
`
`Sheet 6 of 6
`Sheet 6 of 6
`
`5,170,482
`5,170,482
`
`FIG. 9
`
`[6.0]
`
`[7.0] )
`
`[6.1]
`
`>1
`[6.3]
`[6,2] \
`[4.l]-^yp.0]
`t
`
`[4,2]-,
`
`P.Q
`
`-[7.2]
`
`[7.3]
`
`[5.0]^
`
`/'-'f
`//<"•' [3.3]
`
`[3.0
`
`[3.2]
`
`[4.3]
`
`[2.2]\
`
`[2.I]X
`
`'A-*
`
`[2.3]
`L'
`
`/''l^.O]
`
`[3,3]
`
`,^[3.2]
`
`\ [l,2] [3.0]/ '^C3.0
`
`[0.2]^<W?T^[0'33
`
`[•.3]
`
`[o.C
`
`[0.0]
`
`[M]
`
`\
`[Vfl
`
`
`
`1
`1
`
`IMPROVED HYPERCUBE TOPOLOGY FOR
`IMPROVED HYPERCUBE TOPOLOGY FOR
`MULTIPROCESSOR COMPUTER SYSTEMS
`MULTIPROCESSOR COMPUTER SYSTEMS
`
`5,170,482
`5,170,482
`2
`2
`The indirect interconnection topology is typically
`The indirect interconnection topology is typically
`described in terms of the physical layout of the inter
`described in terms of the physical layout of the inter
`connected nodes of the system. These interconnect
`connected nodes of the system. These interconnect
`schemes are usually described in terms of the degree
`schemes are usually described in terms of the degree
`This is a continuation of application Ser. No. 5 required for implementation, i.e., degree one, degree
`This is a continuation of application Ser. No.
`required for implementation, i.e., degree one, degree
`5
`two, degree three, hypercube, etc. The most promise
`07/085,980, filed Aug. 14, 1987, now abandoned.
`07/085,980, ?led Aug. 14, 1987, now abandoned.
`two, degree three, hypercube, etc. The most promise
`for very large multiprocessor system interconnect to
`for very large multiprocessor system interconnect to
`FIELD OF THE INVENTION
`FIELD OF THE INVENTION
`pologies has been found in the hypercube topology.
`pologies has been found in the hypercube topology.
`The present invention pertains to the field of high-
`The present invention pertains to the ?eld of high
`Several prior art multiprocessors have been designed
`Several prior art multiprocessors have been designed
`the hypercube topology. Exemplary of
`speed digital data processors, and more particularly to 10 based
`speed digital data processors, and more particularly to
`based upon the hypercube topology. Exemplary of
`interprocessor communication networks for multipro
`interprocessor communication networks for multipro
`these types of multiprocessors is the NCUBE/10 paral
`these types of multiprocessors is the NCUBE/lO paral
`cessor or multicomputer designs.
`cessor or multicomputer designs.
`lel processing computer manufactured by NCUBE Cor
`lel processing computer manufactured by NCUBE Cor
`poration of Beaverton, Oregon. This multiprocessor
`poration of Beaverton, Oregon. This multiprocessor
`BACKGROUND OF THE INVENTION
`BACKGROUND OF THE INVENTION
`system uses a 10-dimensional hypercube, or 10-cube,
`system uses a lO-dimensional hypercube, or lO-cube,
`The design of multiprocessor computers with a large 15
`The design of multiprocessor computers with a large
`when fully configured. The NCUBE/10 machine can
`when fully con?gured. The NCUBE/lO machine can
`number of parallel processing elements requires fast and
`number of parallel processing elements requires fast and
`operate using 1,024 microprocessors operating in paral
`operate using 1,024 microprocessors operating in paral
`efficient communication throughout the multiprocessor
`e?'icient communication throughout the multiprocessor
`lel providing an overall performance of upwards of 500
`lel providing an overall performance of upwards of 500
`network. The largest bottleneck in the processing abil
`network. The largest bottleneck in the processing abil
`million floating point operations per second (MLFOPS)
`million ?oating point operations per second (MLFOPS)
`ity of large multiprocessor systems is the interprocessor
`ity of large multiprocessor systems is the interprocessor
`or 2,000 million integer instructions per second (MIPS).
`or 2,000 million integer instructions per second (MIPS).
`communication. There exists in the prior art a wide 20
`communication. There exists in the prior art a wide
`20
`However, a multiprocessor of this capability is still
`However, a multiprocessor of this capability is still
`variety of interprocessor communication network to
`variety of interprocessor communication network to
`limited in its ability to grow to a larger size due to I/O
`limited in its ability to grow to a larger size due to I/O
`pologies which yield various levels of performance.
`pologies which yield various levels of performance.
`limitations between the microprocessors.
`limitations between the microprocessors.
`The trend is to build large system configurations of
`The trend is to build large system con?gurations of
`Several variations on multi-degree topologies from
`Several variations on multi-degree topologies from
`multiprocessors operating in parallel in the range of
`multiprocessors operating in parallel in the range of
`25 multiprocessors have been proposed in the prior art to
`multiprocessors have been proposed in the prior art to
`thousands of processors.
`thousands of processors.
`25
`reduce the longest path between microprocessors and
`reduce the longest path between microprocessors and
`In a multiprocessor or multicomputer design (the
`In a multiprocessor or multicomputer design (the
`the number of interconnections between microproces
`the number of interconnections between microproces
`present discussion centers on topology structure so no
`present discussion centers on topology structure so no
`sors in order to improve performance and to allow the
`sors in order to improve performance and to allow the
`differentiation is made between multiprocessors and
`differentiation is made between multiprocessors and
`number of microprocessors to grow larger. These pro-
`number of microprocessors to grow larger. These pro
`multicomputers) using thousands of parallel processing
`multicomputers) using thousands of parallel processing
`posals for multiprocessor interconnection topologies
`elements, the physical packaging of the hardware be- 30 posals for multiprocessor interconnection topologies
`elements, the physical packaging of the hardware be
`are typically shown in the form of a graph in which
`comes a major concern to interprocessor communica
`are typically shown in the form of a graph in which
`comes a major concern to interprocessor communica
`nodes represent switching points or processing elements
`tion. For example, the preferred cost-effective method
`nodes represent switching points or processing elements
`tion. For example, the preferred cost-effective method
`and edges represent communication links. Since the
`of implementing a large parallel processor is to use
`and edges represent communication links. Since the
`of implementing a large parallel processor is to use
`topologies tend to be regular, the descriptions lend
`microprocessor packaged in individual VLSI packages
`microprocessor packaged in individual VLSI packages
`topologies tend to be regular, the descriptions lend
`with a small number of memory and other support chips 35 themselves to graphical displays representing systems
`themselves to graphical displays representing systems
`with a small number of memory and other support chips
`35
`for each processor. The number of other microproces
`such as the types shown in the Figures attached to the
`for each processor. The number of other microproces
`such as the types shown in the Figures attached to the
`sor packages that a single package can communicate
`present patent application. Those skilled in the art
`sor packages that a single package can communicate
`present patent application. Those skilled in the art
`with depends on the number of available input/output
`readily recognize the conversion of graphical represen
`with depends on the number of available input/output
`readily recognize the conversion of graphical represen
`(I/O) ports available. For the fastest direct communica-
`tations of system topologies into hardware. Hence, this
`(I/O) ports available. For the fastest direct communica
`tations of system topologies into hardware. Hence, this
`tion between two microprocessors, dedicated pins on 40 shorthand notation is a convenient method of represent-
`tion between two microprocessors, dedicated pins on
`40
`shorthand notation is a convenient method of represent
`the LSI package are used to form the communication
`ing larger and more complex hardware multiprocessor
`the LS1 package are used to form the communication
`ing larger and more complex hardware multiprocessor
`interconnect. Thus, the number of communications
`interconnect. Thus, the number of communications
`systems without the associated complexity of unneces
`systems without the associated complexity of unneces
`ports that a microprocessor can support is directly de
`ports that a microprocessor can support is directly de
`sary details.
`sary details.
`pendent on the pin limitations of the packages. LSI
`pendent on the pin limitations of the packages. LS1
`To best understand the prior art of interconnection
`To best understand the prior art of interconnection
`packages of the prior art have been built with more than 45
`packages of the prior art have been built with more than
`topologies for multiprocessor systems, the present pa
`topologies for multiprocessor systems, the present pa
`300 pins, however, in multiprocessor designs using tens
`300 pins, however, in multiprocessor designs using tens
`tent application includes a detailed discussion of the
`tent application includes a detailed discussion of the
`of thousands of processors, pin limitations are still a
`of thousands of processors, pin limitations are still a
`prior art to more carefully place the present invention in
`prior art to more carefully place the present invention in
`factor limiting the size of the design even if the I/O
`factor limiting the size of the design even if the I/O
`light of its advancements over the prior art.
`light of its advancements over the prior art.
`ports are multiplexed on shared pins. This is a very
`ports are multiplexed on shared pins. This is a very
`important factor as to why a single processor element in 50
`SUMMARY OF THE INVENTION
`important factor as to why a single processor element in
`SUMMARY OF THE INVENTION
`a multiprocessor design cannot have a large number of
`a multiprocessor design cannot have a large number of
`An improved topology for multiprocessor systems is
`An improved topology for multiprocessor systems is
`ports.
`ports.
`described to facilitate interprocessor communication
`described to facilitate interprocessor communication
`Interprocessor communication can be divided into
`Interprocessor communication can be divided into
`for parallel processing computers. In one preferred
`for parallel processing computers. In one preferred
`two main categories of topologies, commonly entitled
`two main categories of topologies, commonly entitled
`embodiment of the present invention, an interprocessor
`embodiment of the present invention, an interprocessor
`direct and indirect. The direct network connection 55
`direct and indirect. The direct network connection
`55
`communication system for use in multiprocessor de
`communication system for use in multiprocessor de
`topology has been widely researched and considered
`topology has been widely researched and considered
`signs is constructed by modifying an n-dimensional
`signs is constructed by modifying an n-dimensional
`for telephone switching connections. These connec
`for telephone switching connections. These connec
`hypercube. This modified hypercube is constructed
`hypercube. This modi?ed hypercube is constructed
`tions are typically found to be tightly coupled, slow-
`tions are typically found to be tightly coupled, slow
`using 2''-nodes arranged as an n-dimensional hypercube
`using 2"-nodes arranged as an n-dimensional hypercube
`speed communications networks designed either as sin
`speed communications networks designed either as sin
`gle-stage or multi-stage designs. The most commonly 60 (n-cube). Additional communication paths are inserted
`(n-cube). Additional communication paths are inserted
`gle-stage or multi-stage designs. The most commonly
`60
`int0 the hypercube structure to reduce the diameter of
`used direct interconnect topology is the crossbar inter-
`into the hypercube structure to reduce the diameter of
`used direct interconnect topology is the crossbar inter
`the hypercube by increasing the number of ports (de-
`the hypercube by increasing the number of ports (de
`connect widely used in older telephone exchange cen-
`connect widely used in older telephone exchange cen
`gree) for each node. This is an acceptable trade-off
`gree) for each node. This is an acceptable trade-off
`tral offices. When the number of nodes increases in a
`tral offices. When the number of nodes increases in a
`when the number of nodes within the system increases.
`when the number of nodes within the system increases.
`direct interconnection topology, the number of possible
`direct interconnection topology, the number of possible
`In a second preferred embodiment of the present
`interconnections grow very quickly. These types of 65
`In a second preferred embodiment of the present
`interconnections grow very quickly. These types of
`65
`invention, the comers of the modified hypercube are
`prior art direct interconnection techniques are very
`invention, the corners of the modi?ed hypercube are
`prior art direct interconnection techniques are very
`substituted with topologies having cycle paths in order
`substituted with topologies having cycle paths in order
`expensive for loosely-coupled or distributed multipro
`expensive for loosely-coupled or distributed multipro
`to reduce the degree of the overall structure.
`cessor designs with a very large number of processors.
`to reduce the degree of the overall structure.
`cessor designs with a very large number of processors.
`
`
`
`5,170,482
`5,170,482
`4
`3
`4
`3
`for a multiprocessor system can be considered as a
`for a multiprocessor system can be considered as a
`BRIEF DESCRIPTION OF THE DRAWINGS
`BRIEF DESCRIPTION OF THE DRAWINGS
`graph with each processor as a-vertex (or node) and
`graph with each processor as a-vertex (or node) and
`with an edge between two vertices if their correspond
`In the drawings, where like numerals refer to like
`with an edge between two vertices if their correspond
`In the drawings, where like numerals refer to like
`ing processors are directly connected. The terms
`components throughout the several views:
`ing processors are directly connected. The terms
`components throughout the several views:
`interchangeably
`FIGS. 1A, IB, 1C, ID, IE and IF show a variety of 5 "graph" and "system" are used
`“graph“ and "system" are used interchangeably
`FIGS. 1A, 1B, 1C, ID, 115 and IF show a variety of
`throughout this application. In selecting a topology for
`prior art network topologies such as the chordal ring,
`throughout this application. In selecting a topology for
`prior an network topologies such as the chordal ring,
`a multiprocessor system, several factors are considered
`the cube-connected cycle (CCC), the ring (cycle), the
`the cube-connected cycle (CCC), the ring (cycle), the
`a multiprocessor system, several factors are considered
`to optimize the design and maximize performance.
`tree, the mesh (grid) and the completely connected
`to optimize the design and maximize performance.
`tree, the mesh (grid) and the completely connected
`The first criterion is that the degree of a graph should
`network topology.
`The ?rst criterion is that the degree of a graph should
`network topology.
`FIG. 2A, 2B, 2C, 2D shows prior art hypercubes of 10 be small. The degree of a graph is defined as the maxi-
`be small. The degree of a graph is de?ned as the maxi
`FIG. 2A, 2B, 2C, 2D shows prior art hypercubes of
`- 0
`mum number of vertices connected directly (through
`one-, two-, three-, and four-dimensional configurations.
`mum number of vertices connected directly (through
`one-, two-, three-, and four-dimensional con?gurations.
`I/O ports) to a vertex in the graph. This restriction
`FIG. 3 shows a path connecting x and x' in the basic
`I/O ports) to a vertex in the graph. This restriction
`FIG. 3 shows a path connecting x and x’ in the basic
`represents the fact that a processor with a large number
`topology 8;.
`topology Sj.
`represents the fact that a processor with a large number
`of I/O line interfaces is expensive and may not be feasi-
`FIG. 4 shows a path connecting (x, y) and (x', y') in
`of 1/0 line interfaces is expensive and may not be feasi
`FIG. 4 shows a path connecting (x, y) and (x', y’) in
`15 ble for realization due to pin-out limitations, multiplex
`a Substituted Topology ST(S2,SI).
`ble for realization due to pin-out limitations, multiplex
`a Substituted Topology ST(S;,S1).
`ing problems, etc.
`FIG. 5 is a Modified Hypercube MHj formed from a
`ing problems, etc.
`FIG. 5 is a Modi?ed I-Iypercube MHZ formed from a
`The second criterion is that the diameter of the graph
`2-cube.
`The second criterion is that the diameter of the graph
`2-cube.
`should be small relative to the number of the vertices.
`FIG. 6 is a Modified Hypercube MH3 formed from a
`should be small relative to the number of the vertices.
`FIG. 6 is a Modi?ed Hypercube MH; formed from a
`The diameter of a graph is defined as the maximum
`3-cube.
`The diameter of a graph is de?ned as the maximum
`3-cube.
`FIG. 7A, 7B, 7C, 7D exemplifies the robustness of the 20 shortest distance between any pair of vertices in the
`shortest distance between any pair of vertices in the
`FIG. 7A, 7B, 7C, 7D exempli?es the robustness of the
`graph. Processing elements not directly connected will
`Modified Hypercube by showing four remaining virtual
`graph. Processing elements not directly connected will
`Modi?ed Hypercube by showing four remaining virtual
`have to have messages relayed by intervening vertices,
`2-cubes in a MHj configuration when nodes 3 and 4 are
`have to have messages relayed by intervening vertices.
`Z-cubes in a MH; con?guration when nodes 3 and 4 are
`The number of relays in the worst case should be kept
`faulty.
`faulty.
`The number of relays in the worst case should be kept
`small to keep communication time delay between pro-
`FIG. 8 shows a substituted node of a Substituted
`small to keep communication time delay between pro
`FIG. 8 shows a substituted node of a Substituted
`Topology MH3 (MH2) where the basic topology is a 25 cessors to a minimum.
`Topology MH3 (MI-l1) where the basic topology is a
`cessors to a minimum.
`Third, there should be no congestion parts in the
`Modified Hypercube of the type shown in FIG. 6 and
`Third, there should be no congestion parts in the
`Modi?ed Hypercube of the type shown in FIG. 6 and
`system. The message relaying load should be uniformly
`the block topology is a Modified Hypercube of the type
`system. The message relaying load should be uniformly
`the block topology is a Modi?ed Hypercube of the type
`distributed throughout the network such that there is no
`shown in FIG. 5.
`distributed throughout the network such that there is no
`shown in FIG. 5.
`bottleneck in the message communication flow.
`FIG. 9 is a graphical representation of a Substituted
`bottleneck in the message communication ?ow.
`FIG. 9 is a graphical representation of a Substituted
`Fourth, the routing algorithm for any node in the
`Modified Hypercube SMH3 (MH2) where the basic 30
`Fourth, the routing algorithm for any node in the
`Modi?ed Hypercube SMI-Ig (MHZ) where the basic
`system should be easily implemented, so that the cost of
`topology is a Modified Hypercube of the type shown in
`system should be easily implemented, so that the cost of
`topology is a Modi?ed Hypercube of the type shown in
`time for communication is small. Also, the system
`FIG. 6 and the block topology is a Modified Hypercube
`time for communi