throbber
Architectural Tradeofi’s in
`Field-Programmable-Device-Based Computing Systems
`
`[Ch893]
`
`Pak K. Chau'and Martino DF. Schlagl
`
`Computer Engineering
`University of California, Santa Cruz
`Santa. Cruz, California 95064 USA
`
`Reprogrammable Field-Programmable Gate Ar-
`rays (FPGAs) have enabled the realization of high-
`performance and affordable reconfigurable comput-
`ing engines. We examine the architectural tradeofis
`involved in dsigning eneral purpose FPGA-based
`computin systems wit
`field-programmable ate ar-
`rays and old-programmable interconnects.
`be fact
`that FPGAs provide both programmable logic and
`programmable interconnects raises numerous design
`issues that need to be considered with care. Factors
`that influence the tradeofis are mutability, rearrange-
`ability, and speed
`
`I FPGA-hased computing systems
`Reprogrammahle FPGAs form the basis of high-
`performauce and afl'ordablc reconfigurable comput—
`ing‘ prototyping and emulation systems [1, 2, 3. It, 5.
`6, 1']. Interconnections schemes in these systems vary
`quite a, hit. In SPLASH, PAM, QUICKTURN GANGLIDN
`and ANYBOARD 1, 3, 4, 5, 7], the FPGAs are directly
`connected to es
`other in a fixed pattern. In Bonn
`and REALIZER [6,
`'2},
`logic and interconnection are
`separate. The REALIZER uses the partial crossbar
`interconnection architecture to connect the FPGAs,
`while the sons has complete interconnection.
`in
`this paper, We shall investigate the design of inter
`connection architectures using multiple crossbar-s and
`discuss the architectural
`tradeofi's in the design of
`such FPGA-based computing systems.
`We view FPGA—based computing systems as con-
`sisting of reprogrammeble FPGAs, and reprogram-
`mable interconnects. A number of commercial de-
`vices can serve as field-programmable interconnects:
`Aptix FPle have a place and route architecture
`which is not a crossbar As]. Routin delay is not
`entirely controllable an
`predictable as a result
`of its place-and-route architecture. Each FPlC
`chi has a maximum of 940 user programmable
`if s.
`
`IQlfifl Each FPID chip has a maximum of160 user
`programmable I/Os. The heart of the [(2160 is a
`”'6 x 176 crossbar. Every port can be configured
`to connect to any port, Routing delay is entirely
`'Supported in part by NSF Grant MIP-SIQSITG {c MICRO
`Isu ported in put by NSF FYI Grant MIP—sssszrs 5:
`Mining
`
`O‘SIEG-SEQO-ZBB $08.00 6 1993 [BEE
`
`152
`
`controllable and predictable as a result of this
`crossbar architecture [9].
`TI crossbar The Texas Instrument SNNACTBB“
`is a 16 x 16 crwebar with 4-bit word lcn th.
`The crossbar
`is or anized as
`16 ports I:
`4-
`bit each. Routing clay is entirely controllable
`and predictable as a result of this crossbar
`architecture. Each crossbar has a maximum of
`128 user programmable I/Os.
`
`FP GAS have a place and route architecture. Rout-
`ing delay is not entirely controllable and pro:
`dictabie as a result of thisGplawaud-route ar-
`chitecture [10].
`Each FP A typically has a.
`maximum of 200 user programmable l/Oa.
`Function-wise, the FPGAs and Aptix chips can be
`conceived of as devices capable of
`eri'orrnin
`as
`crossbars under many circumstances
`ignoring fiieir
`unpredictable deiaysg.
`This will be the view of
`this paper and we a all use switches and crossbar:
`interchangeably in our discussion.
`While all of the devices mentioned above can serve
`as programmable interconnects to provide flexible
`interconnection between computing elements in an
`FPGA-based computing system, the number of inter-
`connections required may easily exceed the number of
`user pro rammable 1/05 on asingle device. We need
`to considger building a larger interconnection structure
`from these basic device packages which wiIl be viewed
`3 smail crossbars.
`In this regard,
`the problem of
`building large crosabars from smaller crossbars is
`well studied in the context of telephone switching
`networks. Gina and Benes networks are representative
`of rearrangeable networks made up of small crossbars.
`We shall review the basic terminolo y, concepts and
`known results related to Cios networ s in Section II.
`The notion of folded-Clo: networks is introduced
`to facilitate the application of known rwults and
`a control algorithm to configure the crowbars in
`an FPGA-based system consisting of multiple small
`crocshars. This is one of the contributions of this
`paper. Then, we show that the fact that FPGAs have
`dual personality (both as programmable logic and
`programmable interconnects) which can be exploited
`to enable system tradeolfs
`in the design of an
`FPGA-based system using both FPGAs and field-
`programmable interconnect. The issues of speed,
`rearrangeability and mutability are considered in
`Section IV.
`
`|PFl2018-01594
`
`EXHIBIT
`
`2054
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 1
`
`IPR2018-01594
`
`EXHIBIT
`2054
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 1
`
`

`

`input switches
`
`middle switches
`
`output switches
`
`
`
`Figure l: A three-stage (1106 network C(3.3,4).
`
`II Rearrangeabie CLOS switching
`network:
`
`We review some basic terminology and concepts
`related to Clos networks in this section.
`A connecting network is an arrangement of switches
`and interconnects allowing a set of input terminals to
`be connected to a set of output terminals in various
`combinations [[1, 12}. All connections are point-
`to-poinl. connections. An assignment for a given
`network is a list of input/output pairs which are
`to be connected; eanh terminal appears in at most
`one pair. An assignment is renlr'zcfile if there exist
`disjomt paths in the network connecting all pairs
`of input/output
`terminals in the assignment.
`A
`network is rearranged“: if any assignment is realis-
`ahle. Nontlach‘ng networks have the property that in
`addition to being rearrangeable, any connection can
`he established between any idle pair of input/output
`terminals without rearrangements.
`Three-stage symmetrical Clot network consists of
`two symmetrical outer stagfi of r n x m switches and
`an middle sta e oi m r x r switches. The network
`is completely ciaracterized by three parameters in, m
`and r, or C(n,m,r] collectively. A 00.3.4) Clos
`network is shown in Figure l. The Slepian-Duguid
`theorem states that a 3-stage (3105 network C(n, m, r)
`is recrrcngeetle if and only if n: 2 n [11, 13]. For our
`purpose, we assume n = m, hence all smtches are
`square and our Clos networks are rearrangeahle.
`Cine shows that a 3-stage Clos network C(u, m, r?
`is nonblacting in the sir-tel sense if m 2 2n -
`[1-1].
`Clos also illustrates that
`it
`is possible to
`apply the principle recursively to form multi-stage
`reerrangeable or nonblocking networks. Clea assumes
`that all connections are point-to-point connections.
`In Section V, we shall consider an extension to multi-
`pin connections.
`Although rearrangeable networks generally use
`
`153
`
`require careful and sometimes
`fewer switches, the
`hard routing
`ntro algorithms. On the other hand,
`non-blocking; networks require less demanding or no
`control algorithm [15]. The next section presents
`a known control al orithrn for rearran able Clos
`networks based on bipartite matching [125i2
`III-A A control algorithm for rearrangeable
`Clos networks
`Given an assignment .4, a bipartite graph C(A) =
`V], V2. E) can be constructed, where the Vertex net
`| consists of the input switches, and the vertex set
`V; consists of the output switches. There is an edge
`between switch 3 E V. and T E V; for each pair of
`inputfoutput terminals in the assignment where the
`input terminal belon s to S and the output terminal
`belongs to ‘1". The
`ges of this bipartite graph can
`always be partitioned into In disjoint matchings. and
`then each matching provides the at most
`:- pairs
`of input/output terminals which can he realised by
`a single 1- x I" switch in the middle stage of the
`three-stage Clos network.
`Thus,
`the problem of
`setting up the switches in a three—stage rearrangeable
`Clos network C(n,m.r) can be formulated as edge-
`coloring with m colors (or equivalently partitioning
`the edges into m matchings] for the bipartite graph
`C(11) where edges incident to the same vertex must
`be of distinct colors [12]. This problem can be solved
`in polynomial time {16].
`III Architecture of FPGA-based
`systems
`With the basics defined in the previous sections, we
`are now in a position to discuss FPGAAbased systems
`consisting of the following components:
`FPGAa: which can serve as computing elements,
`programmable interconnect or both. Figure ‘2
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 2
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 2
`
`

`

`
`
`Routing chip
`
`
`
`
`Routing chip
`
`Figure 3: The Bone prototyping board viewed as as a three-stage C(2,2.4) Cios rearrangeahie network,
`
`illustrates the last case in which an FPGA is
`viewed as a device consisting of programmable
`logic and a number of 111 x n,
`“switches"
`connecting the logic to the programmable 1/0
`pads. The In x m “switches” are realized by
`reroutin or reassigning the nets among the n;
`pads.
`he size of these switches {the amount
`
`of routing resources used to im iement them)
`afiecm the mutability of the PP As and hence
`the amount of logic that can be implemented on
`the device.
`
`is that
`The software implication of this model
`we do not lock the pins to conform with board-
`Ievel constraints. This is advantageous since
`
`154
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 3
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 3
`
`

`

`outer stage
`
`middle stage
`
`cam pucing
`element
`
`element
`
`computing
`clement
`
`computing
`element
`
`computing
`
`Figure 4: A folded-Clos network with n = m = 3. and r = 4.
`
`reassigning or lacking} pins encrally affects the
`mutability of the FP As [10 . Instead. we allow
`the automatic placement and routing tool
`to
`choose freely the pin assignment. Afterward,
`one more routing step is required to use the
`“switches” to reroute the nets to conform with
`the board-level constraints.
`
`Memory elements serve as local memories to the
`computing elements.
`
`Host interface serves to communicate the FPGA-
`based computing system with the host machine
`to which the FPGAvbased system is attached.
`
`Programmable routin whose sole purpose is to
`provide programmab e interconnect amongst the
`computing elements and/or memoryr eiements.
`We assume that
`the number of interconnec-
`tions required exceeds the number of user pro
`grammable l/Os on a. single programmable routv
`1ng device. The major Issue is then the prob-
`lem of building a large switch from the smaller
`switches.
`
`We shall defer until Section VI the discussion of the
`impact of introducing other devices such as memoryr
`elements and the host interface to the system. So
`until
`then. we assume that there are only FFGAs
`and programmable interconnect in the system. As
`an example, consider the small programmable proto-
`typing board none [6] which has two routing chi
`s;
`it can be. vieWed as a threestage rearranges le Cos
`network. The FPGAs serve the dual purpose of
`programmable logic and programmable interconnccA
`tion. The switches inside the FPGAs, provided by
`
`155
`
`interchanging the nets among the programmable ifO
`pads,
`form the outer stages of the 0105 network.
`The middle stage is realized by the FPGAs R1 and
`R2 which serve as routing chi s, as illustrated in
`Figure 3. The fact that the l/
`pads of the FPGA
`are connected to the routing chips in an alternating
`fashion makes the system depicted in Figure 3 a. three-
`sta e rearrangeabie Clos network with n = m = 2
`r = 4.
`[ he actual aono uses Xilinx XC30305
`as routing chips resulting in a 612,127) Clo: rear-
`rangeahle network.)
`The scheme shown in Fi ure 3 only works for
`m = 2.
`in next section. we a all discuss how to use
`Clos networks When there are more than two FPGAs,
`for m > 2.
`
`III-A Folded-(2105 networks
`The Clo: network and control algorithm outlined
`in Section ll has a definite notion of input and
`output terminals. Therefore,
`in order to apply the
`control algorithm discussed in Section [I one must
`identify the two terminals of a. net either as input
`or output (irrespective of signal flow). This notion
`does not necessarily exist in an FPGA-baeed system.
`in which the computing elements have pads which
`can be reconfigured to be either input, output or
`bidirectional pads. The numbers and distribution of
`
`inputs/outputs change from application to applica-tron.
`in this case it
`is most convenient
`to describe
`the pragmmmnlrlc-rosting component of an FFGA-
`based computin
`system as a folded-Clea network.
`as depicted in
`igure 4. A three—stage folded-Clea
`network resembles a three-stage 6105 network. but
`there is no distinction between input and output
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 4
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 4
`
`

`

`outer stage
`
`middle stage
`
`FPGA:
`
`FFGA:
`
`Figure 5; A folded-Clue network using the FPGAs as both computing element and programmable interconnect.
`
`the stage which connects
`stages. We simply call
`directly to the computing elements the “outer“ sta e;
`it consists of :- palrs ofrn x m switches. The “midd e"
`stage consists of m r x r crowhars.
`
`III-B A control algorithm for folded-0105
`networks
`
`The control algorithm for the folded-Clos network
`depicted in Figure 4 can be adapted from the control
`algorithm for the three-stage Clos network depicted
`in Figure l as folltms. First, directions are assigned
`to each of the nets so that each pair ofrnx m switches
`has run iii-terminals and m out-terminals. To do this,
`we form a net-graph from the required connections,
`where each node come ends to a. group of 2m pins.
`and an edge is added or each net between the two
`nodes{groups) to which its two pins belong, We then
`traverse this net-graph visiting each edge once to
`partition the edges into edge-disjoint circuits (a node
`may appear more than once). We assign a direction
`to the edges as we traverse the net-graph. After the
`traversal, each node will have m in-edges and :71 out-
`edges.
`Once the 2911 ed es have been divided, we can
`unfolded the faded Jos network by separatin
`the
`2m terminals into two in: x m switches and o tain
`a network identical to the three-stage Clos network.
`The C105 network control algorithm can be used
`to configure the m x m switches to realize any
`assignment. Notice that
`the direction assigned to
`an ed e is purely for the urpose of using the edge-
`colora ility control algorithm described in Section Ii—
`A. the direction is unrelated to the actual signal flow.
`
`156
`
`IV Architectural tradeofi's: FPGAS as
`both computing element and
`programmable interconnect
`In this section, we examine the architectural tradeofis
`involved in the or anisation of an FPGA-hased
`computing systems.
`he issues are:
`l. rearrangenbility
`2. mutability, and
`3. speed.
`Any point—to—point connection realized by the
`architecture presented in Figure 4 has the delay
`of 3 switches
`It
`is possible to trade mutability
`with speed by exploiting the dual personality of
`FPGAs.
`Figure 5 shows a system resembling a
`folded-Clos network with the FPGAs serving as both
`programmable interconnect and computin elements.
`it is not difficult to show the network
`epicted in
`Figure 5 satisfies the SlepianADuguid rearrangenbility
`criterion (see Section II} providing the algorithm
`outlined in Section Ill—B has been used to identify
`the input and output stagts. We eliminate two stages
`of switch delay in Figure 5 but risk the danger of
`having a non-rearrangeahle network. This posstbility
`is illustrated as follows. Assume that we control
`the utilisation of the FPGAs such that they can be
`modeled as shown in Figure 2 with m = 3. r = 4_
`Now, sup one that the utilization: of the FPGAs
`increase an
`that n = m decreases from 3 to 2 while
`the external connections remain unchanged, as shown
`in Figure 5. Keep in mind that the interconnections
`between the outer stage and the middle stage are
`fixed.
`it is not hard to see that the network now is
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 5
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 5
`
`

`

`outer stage
`
`middle stage
`
`
`
`Figure 6: An example to illustrate the effect of increase in FPGA utilization on rearrangeability.
`
`a non-Clue oon-rearrangeable network. For example,
`the connections A:
`-—o A; and B; —' B; cannot be
`realized simultaneously.
`From an architectural stand oiat, keeping the size
`of the outer switches, m, small,(say 3) enhances the
`routahility (and hence utilization] ofthe FPGAs. 0n
`the number of switches in the mi die ate. e of the
`the other hand rearrangeability reauires that m be
`folded-0105 network. For a iven FPGA an crossbar
`device packages,
`this woucl determine the
`roper
`configuration .of the FPGA-based system
`there
`are a total of y user [/05 per FPGA. a total of .1:
`user I/Os per crossbar, and F FPGAs as computing
`elements in the system, then since the total number
`of 1/05 'm roughly conserved (in the absence of “other
`devices"), we have
`(1}
`Fy = Im
`(2}
`F = Mlzfvl‘
`lQlfiO FPID has 160 user-l/Os per
`For example,
`chip.
`i.e., e = 160. Given that m = 3‘
`if we
`are using SO-user-I/O FPGAS,
`then P can be at
`most 5 In order
`to maintain rearrangeability. As
`another example. Aptix has 940 userel/Os per chip.
`Given that m = 3. F can be at most 33 in order
`to maintain rearrangeability.
`Incorporating more
`FPGAs while maintaining rearran eahility requires
`increasing m beyond 3. As more P GAB are added.
`the additional resources consumed by the "outer"
`switches will result in adiminiahlngloglc capacity per
`FPGA. Conceptually, m should not reach the point
`where FPGA utilization is the same independent of
`whether the [/0 pins were locked or not.
`
`V Considerations for multi—pin nets.
`The development so far had assumed that all connec-
`tions are point-to-point connections. However in an
`FPGA-based system‘ ifa signal is the source of mul-
`tiple loads at different FPGAs, it is only reasonable
`to assume (with pin-limited packages that this si nal
`has onl one source pin originating tom one F? A.
`and it snout: in a middle crossbar to reach different
`destinations. Figure T illustrates a net N1 with 3 pins
`fanout inside a crossbar.
`Given this assumption. the problem of configuring
`the crossbar: to realize a given set of connections has
`no known solution. The clomt related work in this
`area is {or configuring non-blockin
`networks with
`definite signal direction [17]. Our fol ed-Clos network
`doesn’t have any definite signal flow direction and we
`only require rearrangeability.
`We can formulate the problem of confi urng
`the crossbars to realize a. given set of mu tI-pin
`connections in the following; manner, Let
`I P be the set of all pins (terminals)
`c 0,‘ be the set of
`ins (terminals) on outer
`switch 1‘ of the folde ~Clos network
`0 Ni be a BEL. Ni = {Pi.llFl-.=l ”apt-3.} Q P
`where each PM is a pinI we have (Lt-=1 N.- E P).
`Since no two nets can share a pin‘
`{3)
`Vu {NrnN§)=¢
`and only one pin per outer-stage switch for each
`net Ni is needed, we have
`Vt: We ”Oil 5 1-
`
`(4)
`
`151
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 6
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 6
`
`

`

`outer stage
`
`middle stage
`
`FPGA:
`
` FPGA.
`
`FFGA:
`
`Figure 1’: A multipin-net Ni with 3 terminals.
`
`We assume that the in middle switChe: are capable
`of realizing any interconnection pattern among their
`R = 2:- pins. The problem is solved by assigning
`the nets to the 1:11 switches (coloring the nets with :1:
`colors)
`
`I : Nets—v {I,2,...,m]
`while avoiding contention among nets
`
`f(M-J=I(Nr) =‘ VrlojnlN-UNrJlSI- (5)
`Vi}!
`Constraint (5) ensures that each outer switch has at
`most one net of each color, that isI at most one net
`per middle switch. This constraint also ensures that
`each middie switch will not be assigned too many
`connections. That is,
`
`v,- Z IMIsR.
`IENr)=.i
`
`(5}
`
`The multi- in control problem is more difficult
`than its poin to-‘poml. counterpart. To Illustrate,
`for exam l_e consi er the following seven nets to be
`connectedam the network as shown In Figure 8.
`N: = {Ph Pna,PII}.N3 = {PL Pr}. ”3 = {21,R,Pu}.
`N. = {P..P.,P..}.N. = {PhPmPu}. N. = {Paar}.
`Nr= {PHI-Pl?)
`
`A contention graph G can be constructed according
`to incompatibility among the nets. The vertex set V
`in G are the nets,
`there is an edge joining u. and
`u;
`if net N.- and NJ: cannot be in the same middle
`switch. The contention graph for our example is
`shown on the left in Figure 9. Since there are m = 3
`
`middle switches. a. feasible 3-coloring of the verticcs
`of the contention graph should provide a valid routing
`connection. This is, however, not possible in this
`example. Hence the foldechlas network is no longer
`rearrangeahle when multi-pin nets are allowed.
`Note that
`in this particular example allowing
`slightiy more flexibility in reassigning of the nets
`would admit a solution.
`If we swap the nets on
`pins P9 and Pm (which is
`the same as setting
`N3 = {P'3.‘nl:’]JJ:P|5%I and N5 = {35,139,}:13” the
`new contention grep shown on the right in Fi ure 9
`is 3-colorabie. Since rearrangeability is no ionger
`guaranteed,
`it mag!J be necessary to consider larger
`‘outer" switches w ile iimi ting the amount ofinternal
`reroutin of nets. Another possibility is to consider
`the m ifl'erent aii nrnents of the “outer” switches
`with respect to the PGA ins.
`On the other hand nomglockio networks ofi‘er a
`guarantee, but require a considerafile increase in the
`number of middle switches rn. For instance, the bust
`wide-sense nan-blachng network with broadcasting
`capability requires
`
`(T)
`m;>(n«-i)logF+‘2{n-l)
`for a system with F FPGAs where n is the width
`of each switch in the outer stage {1?}. Whereas a
`rearrangeable network only requires
`m; = r:
`
`(3)
`for point-topoint connections. Given n = :5, and
`F :16, 1'11] i512 but 1112 is 3
`
`VI How should other devices be
`connected in the system '3
`Finally We consider the impact of the placement of
`other elements (memory. host interface) in an FPGA-
`
`:58
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 7
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 7
`
`

`

`
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 8
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 8
`
`

`

`
`
`
`.“\;.4\\Q.2.».‘\.~.~\V”
`
`x4!3:5"fill
`fl.3?) fr
`\\ >11 “'9‘
`.-
`
`rll9 2‘5»'
`
`
`
`
`
`
`
`
`
`
`p14
`
`9";1\§
`
`Lt!-
`
`
`
`
`
`
`
`
`
`
`
`
`=»;'<Ei(
`x. //;
`Wm»;
`.93“:
`2:2::
`
`.A
`
`
`
`
`
`
`Figure l2: Connection between the RAMS and the FPGAS in a folded—Clog netwc-zrk.
`1M
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 9
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 9
`
`

`

`[5]
`
`“GANGLION—
`[5] C. E. Cox and W. E. Elam.
`A Fast Field-Programmable Gate Array Imple-
`mentation of a Connectionist Classifier." IEEE
`Journal on Solid-Sluts Circuits, vol. 27, pp. 9.88-
`299. March 1992,
`P. K. Chan, M. Schlag, and M. Martin, “BORG:
`A reconfigurable prototyping board using Field-
`Programmable Gate Arrays.”
`in Proceedings
`of 1"
`International ACM/SlGDA Workshop
`on Field-Programmable Gate Army's, (Berkeley.
`California. USA), pp. Ill-51. Feb. 1992.
`D. A. Thomas, T. A. Petersen. and D. E. Van
`den Bout. “The Anyboard rapid prototyping
`environment," in Advanced Research in VLSI.
`Proceedings 9
`Inc 1'99] UC Santa Cruz Confer-
`ence, (Santa
`rur. CA), pp. 356—370. 1991.
`
`[8} Aptix Corp. 225 Charcot Avenue, San Jose. CA
`95131.
`
`[9} I-Cube Design Systems. 2323-0 Walsh Avenue.
`Santa Clara. CA 951351.
`{10] XILINX: The Programmable Gate Array Dole
`Book. 2100 Logic Drive, San Jose, CA 95124.
`1992.
`
`[11] V. E. Beneé. Mathematical Theory of Connecting
`Networks and Telephone Tl'nfic.
`Academic
`Press. 1955.
`
`l-lwang. “Control Al orithrns for Bear-
`{12] F, K.
`rsngeable Cloe Networks," EEE “liposuction:
`on Communications. vol. COM-31. pp. 952—954.
`Aug. 1983.
`
`"Two Theorems on a Particu-
`[13] D. Slepian,
`lar Crossbar Switching Network." Unpublished
`Memo, 1952.
`
`[14] C. Clos, “A study of nonvbloclrin switching net-
`works." Bell System Technical
`our-rial. vol. 32.
`pp. 406—424. Mar. 1955.
`
`[15] A. Vanna and C. S. Raghavendra. Interconnec-
`tion Networks for Multiprocessor: and Multr'com-
`purer-s: Theory and Practice. IEEE Press. 1993.
`[16] H. N. Gebow and 0. Kariv, "Algorithms for
`edge coloring bipartite graphs and multigraphs."
`SIAM Journal of Corriputing. vol. 11. no. 1.
`pp. 117-129,1982.
`“Nonlilocking
`[17] Y. Yang and GA M. Masson,
`Broadcast Switching Networks." IEEE Transac—
`l'iarrs on Computers, vol. C-‘IU, pp. 1005—1015,
`Sept. 1991.
`
`based computing system connected as a [aided-Clo:
`network. We assume that each memory element
`serves as local memory to the corresponding FPGA.
`There are several possibilities:
`(such as memory
`1. Attach the other element
`element) directly to the FPGA. as illustrated in
`Figure 10. This involves locking some of the pads
`of the FPGA. and may im air the mutability of
`the FPGA. This scheme
`as the advantage of
`incurring no additional memory accesa time.
`
`‘2. Attach the other element to the middle stage of
`the folded~Clos network, as illustrated in F1 ure
`11. This scheme incurs higher (one switch dc ay
`accefi delay to the elements. and creates force -
`nets *6]. That is, some nets must
`a through
`speci c switches.
`in this case,
`I; e control
`problem is NP-complete. even for m = 2. Yet
`in our experience with our small prototyping
`system norm [6 We have not encountered an
`assignment whl
`could
`3. Treat the other elements as computing elements,
`as illustrated in Figure 12. This scheme incurs
`the largest additional access delay while enablio
`the use of a known polynomial
`time contra
`algorithm.
`It will also increase the size of
`outer switches to maintain rearrangeability of
`the network.
`
`VII Conclusions
`We have addressed architectural issues involved in the
`dsi n of an FPGA-based computing system. Some
`01' the principles discussed in this paper are being
`applied in the design of a FPGA-based computing
`system at University of California at Santa Cruz. We
`are using the Xilinx XC4D10s as the FPGAs and IDISO
`as programmable interconnects.
`
`Acknowledgment
`The authors would like to acknowledge Professor
`Anujan Varma for his discussions on Class networks.
`The authors are grateful for the generous support of
`Xilinx Inc. and IACube Design Systems.
`
`References
`[I] M. Gokhale. W. Holmes, A. Kopser. S. Lucas,
`R. Minnich. D. Sweely, and D. Lopresti. “Build-
`ing and using a highly parallel programmable
`logic array," Computer, vol. 24. DP. 31—89, Jan.
`1991.
`
`.l. Batcheller, and J. Varghese. An
`[2! M. Butts.
`Efficient Logic Emulation System!
`in Proceeds
`199 .
`ingsgICCD 92. pg.l$8-l4l. Cambridge MA. Oct.
`[3] P. Bertin, D. Rancin. and J. Vuillemin, “Pro-
`grammable Active Memories: 3 Performance
`Assessment,” in I" Inlemolionol’ACM/SIGDA
`Workshop on Field Programmable Gate Arrays,
`(Berkeley. California), pp. 57—59. February £992.
`[4] Quickturn Systems Inc.
`325 East Middlefield
`Road. Mountain View, CA 94043. 1991.
`
`161
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 10
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2071, p. 10
`
`

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket