`P. G. Krishnakumar
`Cüneyt M. Özveren
`Robert J. Simcoe
`Barry A. Spinney
`Robert E. Thomas
`Robert J. Walsh
`
`GIGAswitch System:
`A High-performance
`Packet-switching Platform
`
`The GIGAswitch system is a high-performance packet-switching platform built on
`a 36-port 100 Mb/s crossbar switching fabric. The crossbar is data link independent
`and is capable of making 6.25 million connections per second. Digital’s first
`GIGAswitch system product uses 2-port FDDI line cards to construct a 22-port IEEE
`802.1d FDDI bridge. The FDDI bridge implements distributed forwarding in hard-
`ware to yield forwarding rates in excess of 200,000 packets per second per port.
`The GIGAswitch system is highly available and provides robust operation in the
`presence of overload.
`
`The GIGAswitch system is a multiport packet-
`switching platform that combines distributed for-
`warding hardware and crossbar switching to attain
`very high network performance. When a packet
`is received, the receiving line card decides where
`to forward the packet autonomously. The ports on
`a GIGAswitch system are fully interconnected with
`a custom-designed, very large-scale integration
`(VLSI) crossbar that permits up to 36 simultaneous
`conversations. Data flows through 100 megabits
`per second (Mb/s) point-to-point connections,
`rather than through any shared media. Movement
`of unicast packets through the GIGAswitch system
`is accomplished completely by hardware.
`The GIGAswitch system can be used to eliminate
`network hierarchy and concomitant delay. It can
`aggregate traffic from local area networks (LANs)
`and be used to construct workstation farms. The
`use of LAN and wide area network (WAN) line cards
`makes the GIGAswitch system suitable for build-
`ing, campus, and metropolitan interconnects. The
`GIGAswitch system provides robustness and avail-
`ability features useful in high-availability applica-
`tions
`like financial networks and enterprise
`backbones.
`In this paper, we present an overview of the
`switch architecture and discuss the principles influ-
`encing its design. We then describe the implemen-
tation of an FDDI bridge on the GIGAswitch system
platform and conclude with the results of perfor-
`mance measurements made during system test.
`
`GIGAswitch System Architecture
`The GIGAswitch system
`implements Digital’s
`architecture for switched packet networks. The
`architecture allows fast, simple forwarding by map-
`ping 48-bit addresses to a short address when
`a packet enters the switch, and then forwarding
`packets based on the short address. A header con-
`taining the short address, the time the packet was
`received, where it entered the switch, and other
`information is prepended to a packet when it
`enters the switch. When a packet leaves the switch,
`the header is removed, leaving the original packet.
`The architecture also defines forwarding across
`multiple GIGAswitch systems and specifies an algo-
`rithm for rapidly and efficiently arbitrating for
`crossbar output ports. This arbitration algorithm
`is implemented in the VLSI, custom-designed
`GIGAswitch port interface (GPI) chip.
`
`Hardware Overview
`Digital’s first product to use the GIGAswitch plat-
`form is a modular IEEE 802.1d fiber distributed data
`interface (FDDI) bridge with up to 22 ports.1 The
`product consists of four module types: the FDDI
line card (FGL), the switch control processor (SCP),
the clock card, and the crossbar interconnection.
`The modules plug into a backplane in a 19-inch,
`rack-mountable cabinet, which is shown in Figure 1.
`The power and cooling systems provide N+1
`redundancy, with provision for battery operation.
The first line card implemented for the
GIGAswitch system is a two-port FDDI line card
`(FGL-2). A four-port version (FGL-4) is currently
`under design, as is a multifunction asynchronous
`transfer mode (ATM) line card. FGL-2 provides con-
`nection to a number of different FDDI physical
`media using media-specific daughter cards. Each
`port has a lookup table for network addresses and
`associated hardware lookup engine and queue man-
`ager. The SCP provides a number of centralized
`functions, including
`
- Implementation of protocols (Internet protocol
  [IP], simple network management protocol
  [SNMP], and IEEE 802.1d spanning tree) above
  the media access control (MAC) layer

- Learning addresses in cooperation with the line
  cards

- Maintaining loosely consistent line card address
  databases

- Forwarding multicast packets and packets to
  unknown destinations

- Switch configuration

- Network management through both the SNMP
  and the GIGAswitch system out-of-band
  management port

Figure 1 The GIGAswitch System
`
`The clock card provides system clocking and
`storage for management parameters, and the cross-
`bar switch module contains the crossbar proper.
`The power system controller in the power sub-
`system monitors the power supply front-end units,
`fans, and cabinet temperature.
`
`Design Issues
`Building a large high-performance system requires
`a seemingly endless series of design decisions and
`trade-offs. In this section, we discuss some of the
`major issues in the design and implementation of
`the GIGAswitch system.
`
`Multicasting
`Although very high packet-forwarding rates for
`unicast packets are required to prevent network
`bottlenecks, considerably lower rates achieve the
`same result for multicast packets in extended LANs.
`Processing multicast packets on a host is often
`done in software. Since a high rate of multicast traf-
`fic on a LAN can render the connected hosts use-
`less, network managers usually restrict the extent
of multicast packets in a LAN with filters.
Measurements of extended LAN backbones show little
multicast traffic.
`The GIGAswitch system forwards unicast traffic
`in a distributed fashion. Its multicast forwarding
`implementation, however, is centralized, and soft-
`ware forwards most of the multicast traffic. The
`GIGAswitch system can also limit the rate of multi-
`cast traffic emitted by the switch. The reduced rate
`of traffic prevents lower-speed LANs attached to the
`switch through bridges from being rendered inop-
`erable by high multicast rates.
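
The paper does not say how this rate limit is enforced. The C
sketch below shows one conventional way such a limit could be
expressed, a simple token bucket; the function name and the rate
and burst parameters are ours and are purely illustrative, not
part of the GIGAswitch implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative token-bucket limiter for multicast forwarding.
 * The text states only that the emitted multicast rate can be
 * limited; the mechanism below is a generic sketch. */
#define MCAST_RATE_PPS   1000u   /* hypothetical sustained rate  */
#define MCAST_BURST_PKTS  100u   /* hypothetical burst allowance */

struct token_bucket {
    uint32_t tokens;             /* packets that may still be sent */
    uint64_t last_refill_us;
};

/* Refill tokens for the elapsed time, then decide whether one
 * multicast packet may be forwarded now. */
static bool mcast_may_forward(struct token_bucket *tb, uint64_t now_us)
{
    uint64_t elapsed = now_us - tb->last_refill_us;
    uint64_t refill  = (elapsed * MCAST_RATE_PPS) / 1000000u;

    if (refill > 0) {
        uint64_t t = tb->tokens + refill;
        tb->tokens = (t > MCAST_BURST_PKTS) ? MCAST_BURST_PKTS : (uint32_t)t;
        tb->last_refill_us = now_us;
    }
    if (tb->tokens == 0)
        return false;            /* over the limit: drop or defer */
    tb->tokens--;
    return true;
}
```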
`Badly behaved algorithms using multicast proto-
`cols can render an extended LAN useless. Therefore,
`the GIGAswitch system allocates internal resources
`so that forward progress can be made in a LAN with
`badly behaved traffic.
`
`10
`
`Vol. 6 No. 1 Winter 1994 Digital Technical Journal
`
`CISCO Exhibit 1005
`Cisco v. Bockstar
`Trial IPR2014 - 2
`
`
`
`GIGAswitch System: A High-performance Packet-switching Platform
`
`Switch Fabric
`The core of the GIGAswitch system is a 100 Mb/s
`full-duplex crossbar with 36 input ports and 36 out-
put ports, each with a 6-bit data path (36 × 36 × 6).
The crossbar is formed from three 36 × 36 × 2 cus-
tom VLSI crossbar chips. Each crossbar input is
`paired with a corresponding output to form a dual-
`simplex data path. The GIGAswitch system line
`cards and SCP are fully interconnected through the
`crossbar. Data between modules and the crossbar
`can flow in both directions simultaneously.
`Using a crossbar as the switch connection (rather
`than, say, a high-speed bus) allows cut-through for-
`warding: a packet can be sent through the crossbar
`as soon as enough of it has been received to make
`a forwarding decision. The crossbar allows an input
`port to be connected to multiple output ports
`simultaneously; this property is used to implement
`multicast. The 6-bit data path through the crossbar
`provides a raw data-path speed of 150 Mb/s using
`a 25 megahertz (MHz) clock. (Five bits are used to
`encode each 4-bit symbol; an additional bit pro-
`vides parity.)
`Each crossbar chip has about 87,000 gates and
`is implemented using complementary metal-oxide
`semiconductor (CMOS) technology. The crossbar
`was designed to complement the FDDI data rate;
`higher data rates can be accommodated through
`the use of hunt groups, which are explained later
`in this section. The maximum connection rate for
`the crossbar depends on the switching overhead,
`i.e., the efficiency of the crossbar output port arbi-
`tration and the connection setup and tear-down
`mechanisms.
`Crossbar ports in the GIGAswitch system have
`both physical and logical addresses. Physical port
`addresses derive from the backplane wiring and are
`a function of the backplane slot in which a card
`resides. Logical port addresses are assigned by the
`SCP, which constructs a logical-to-physical address
`mapping when a line card is initialized. Some of
`the logical port number space is reserved; logical
`port 0, for example, is always associated with the
`current SCP.
`
`Arbitration Algorithm With the exception of
`some maintenance functions, crossbar output port
`arbitration uses logical addresses. The arbitration
`mechanism, called take-a-ticket, is similar to the
`system used in delicatessens. A line card that has
`a packet to send to a particular output port obtains
`a ticket from that port indicating its position in line.
`
`By observing the service of those before it, the line
`card can determine when its turn has arrived and
`instruct the crossbar to make a connection to the
`output port.
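
As a concrete illustration of the take-a-ticket idea, the C sketch
below models the arbitration state of a single output port. The
structure and field names are ours; in the switch this state is
kept in the GPI chips and communicated over the backplane bus
rather than through shared memory.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch of take-a-ticket output-port arbitration: a sender draws
 * a ticket and waits until the "now serving" value reaches it.
 * Field widths are illustrative, not those of the GPI chip. */
struct output_port_arb {
    uint16_t next_ticket;   /* value handed to the next requester   */
    uint16_t now_serving;   /* ticket currently allowed to transmit */
};

/* Called by a line card that has a packet for this output port. */
static uint16_t take_ticket(struct output_port_arb *arb)
{
    return arb->next_ticket++;          /* position in line */
}

/* By observing the service of earlier tickets, the card learns
 * when its turn has arrived and may set up the crossbar path. */
static bool my_turn(const struct output_port_arb *arb, uint16_t ticket)
{
    return arb->now_serving == ticket;
}

/* When a connection completes, the next ticket holder is served. */
static void advance(struct output_port_arb *arb)
{
    arb->now_serving++;
}
```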
`The distributed arbitration algorithm is imple-
`mented by GPI chips on the line cards and SCP. The
`GPI is a custom-designed CMOS VLSI chip with
`approximately 85,000 transistors. Ticket and con-
`nection information are communicated among
`GPIs over a bus in the switch backplane. Although it
`is necessary to use backplane bus cycles for cross-
`bar connection setup, an explicit connection tear
`down is not performed. This reduces the connec-
`tion setup overhead and doubles the connection
`rate. As a result, the GIGAswitch system is capable
`of making 6.25 million connections per second.
`
`Hunt Groups The GPI allows the same logical
`address to be assigned to many physical ports,
`which together form a hunt group. To a sender, a
`hunt group appears to be a single high-bandwidth
`port. There are no restrictions on the size and mem-
`bership of a hunt group; the members of a hunt
`group can be distributed across different line cards
`in the switch. When sending to a hunt group, the
`take-a-ticket arbitration mechanism dynamically dis-
`tributes traffic across the physical ports comprising
`the group, and connection is made to the first free
`port. No extra time is required to perform this arbi-
`tration and traffic distribution. A chain of packets
`traversing a hunt group may arrive out of order.
`Since some protocols are intolerant of out-of-order
`delivery, the arbitration mechanism has provisions
`to force all packets of a particular protocol type to
`take a single path through the hunt group.
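
The sketch below illustrates, in C, the two behaviors just
described: spreading order-insensitive traffic over whichever
member port is free, and pinning an order-sensitive protocol to a
single member. The function names and the hash-based pinning are
ours; in the switch the selection is performed by the take-a-ticket
arbitration in the GPI hardware.

```c
#include <stdint.h>

/* Illustrative hunt-group member selection (not the GPI mechanism). */
#define MAX_GROUP 16

struct hunt_group {
    uint8_t member[MAX_GROUP];  /* physical port numbers           */
    uint8_t size;               /* number of members in the group  */
};

/* Order-insensitive traffic: take the first member reported free. */
static int pick_first_free(const struct hunt_group *g,
                           const uint8_t *port_busy /* indexed by port */)
{
    for (int i = 0; i < g->size; i++)
        if (!port_busy[g->member[i]])
            return g->member[i];
    return -1;                  /* all members busy: keep waiting   */
}

/* Order-sensitive protocol: always map the same protocol type to
 * the same member so its packets take a single path. */
static int pick_pinned(const struct hunt_group *g, uint16_t protocol_type)
{
    return g->member[protocol_type % g->size];
}
```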
`Hunt groups are similar to the channel groups
`described by Pattavina, but without restrictions on
`group membership.2 Hunt groups in the GIGAswitch
`system also differ from channel groups in that their
`use introduces no additional switching overhead.
`Hardware support for hunt groups is included in
`the first version of the GIGAswitch system; software
`for hunt groups is in development at this writing.
`
`Address Lookup
`A properly operating bridge must be able to receive
`every packet on every port, look up several fields in
`the packet, and decide whether to forward or filter
`(drop) that packet. The worst-case packet arrival
`rate on FDDI is over 440,000 packets per second per
`port. Since three fields are looked up per packet,
the FDDI line card needs to perform approximately
1.3 million lookups per second per port; 880,000 of
`these are for 48-bit quantities. The 48-bit lookups
`must be done in a table containing 16K entries
`in order to accommodate large LANs. The lookup
`function is replicated per port, so the requisite per-
`formance must be obtained in a manner that mini-
`mizes cost and board area. The approach used to
`look up the fields in the received packet depends
`upon the number of values in the field.
`Content addressable memory (CAM) technology
`currently provides approximately 1K entries per
`CAM chip. This makes them impractical for imple-
`menting the 16K address lookup table but suitable
`for the smaller protocol field lookup. Earlier Digital
`bridge products use a hardware binary search
`engine to look up 48-bit addresses. Binary search
`requires on average 13 reads for a 16K address set;
`fast, expensive random access memory (RAM)
`would be needed for the lookup tables to minimize
`the forwarding latency.
`To meet our lookup performance goals at reason-
`able cost, the FDDI-to-GIGAswitch network con-
`troller (FGC) chip on the line cards implements
`a highly optimized hash algorithm to look up the
`destination and source address fields. This lookup
`makes at most four reads from the off-chip static
`RAM chips that are also used for packet buffering.
`The hash function treats each 48-bit address as
`a 47-degree polynomial in the Galois field of order
`2, GF(2).3 The hashed address is obtained by the
`equation:
`
M(X) × A(X) mod G(X)

where G(X) is the irreducible polynomial, X^48 +
X^36 + X^25 + X^10 + 1; M(X) is a nonzero, 47-degree
programmable hash multiplier with coefficients
in GF(2); and A(X) is the address expressed as a
47-degree polynomial with coefficients in GF(2).
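
The multiplication and reduction are carry-less operations on
48-bit polynomials over GF(2). The C sketch below computes the
hashed address in software; the bit-ordering convention (bit i of
a 64-bit word holds the coefficient of X^i), the function name,
and the example multiplier are our assumptions, and the FGC
performs the equivalent operation in hardware.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative software version of the hash:
 * result = M(X) * A(X) mod G(X) over GF(2), with
 * G(X) = X^48 + X^36 + X^25 + X^10 + 1. */
static uint64_t gf2_48_mulmod(uint64_t m, uint64_t a)
{
    const uint64_t G_LOW  = 0x1002000401ULL;  /* X^36 + X^25 + X^10 + 1 */
    const uint64_t MASK48 = 0xFFFFFFFFFFFFULL;
    uint64_t r = 0;

    for (int i = 47; i >= 0; i--) {
        int overflow = (int)((r >> 47) & 1);  /* coefficient of X^47    */
        r = (r << 1) & MASK48;                /* r = r * X              */
        if (overflow)
            r ^= G_LOW;                       /* reduce X^48 mod G(X)   */
        if ((m >> i) & 1)
            r ^= a;                           /* add A(X) for this bit  */
    }
    return r;                                 /* 48-bit hashed address  */
}

int main(void)
{
    uint64_t m = 0xAB54A98CEB1FULL;   /* hypothetical hash multiplier   */
    uint64_t a = 0x08002B123456ULL;   /* example 48-bit IEEE 802 address */
    uint64_t h = gf2_48_mulmod(m, a);

    /* The low 16 bits index the 64K-entry hash table; the upper
     * 32 bits are kept in the lookup record for comparison. */
    printf("bucket    = 0x%04x\n", (unsigned)(h & 0xFFFF));
    printf("remainder = 0x%08x\n", (unsigned)(h >> 16));
    return 0;
}
```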
The bottom 16 bits of the hashed address are then
used as an index into a 64K-entry hash table. Each
`hash table entry can be empty or can hold a pointer
to another table plus a size from 1 to 7, indicat-
`ing the number of addresses that collide in this hash
`table entry (i.e., addresses whose bottom 16 bits of
`their hash are equal). In the case of a size of 1, either
`the pointer points to the lookup record associated
`with this address, or the address is not in the tables
`but happens to collide with a known address. To
determine which is true, the remaining upper
32 bits of the hashed address are compared to the pre-
`viously computed upper 32 bits of the hash of the
`known address stored in the lookup record. One of
`
`the properties of this hash function is that it is a
`one-to-one and onto mapping from the set of 48-bit
`values to the same set. As long as the lookup table
`records are not shared by different hash buckets,
`comparing the upper 32 bits is sufficient and leaves
`an additional 16 bits of information to be associated
`with this known address.
In the case where 1 < size ≤ 7, the pointer stored
`in the hash bucket points to the first entry in a bal-
`anced binary tree of depth 1, 2, or 3. This binary
`tree is an array sorted by the upper 32 hash remain-
`der bits. No more than three memory reads are
`required to find the lookup record associated with
`this address, or to determine that the address is not
`in the database.
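
A sketch of this lookup path follows. The structure layouts and
field names are ours (the real tables live in the FGC's off-chip
SRAM, and the size-1 case points directly at a single record), but
the flow matches the description: index with the low 16 hash bits,
then resolve collisions by searching an array sorted on the 32-bit
hash remainder, using at most three probes.

```c
#include <stdint.h>
#include <stddef.h>

struct lookup_record {
    uint32_t hash_remainder;   /* upper 32 bits of the 48-bit hash    */
    uint16_t port_info;        /* 16 bits associated with the address */
};

struct hash_bucket {
    struct lookup_record *entries;  /* NULL if the bucket is empty    */
    uint8_t size;                   /* 1..7 colliding addresses       */
};

/* Return the record for a hashed address, or NULL if the address
 * is not in the database. */
static struct lookup_record *
lookup(const struct hash_bucket *tbl, uint64_t hashed_addr)
{
    const struct hash_bucket *b = &tbl[hashed_addr & 0xFFFF];
    uint32_t rem = (uint32_t)(hashed_addr >> 16);
    int lo = 0, hi;

    if (b->entries == NULL)
        return NULL;                    /* empty bucket               */

    hi = (int)b->size - 1;
    while (lo <= hi) {                  /* binary search on the       */
        int mid = (lo + hi) / 2;        /* 32-bit hash remainder      */
        if (b->entries[mid].hash_remainder == rem)
            return &b->entries[mid];
        if (b->entries[mid].hash_remainder < rem)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return NULL;    /* collides with known addresses but is unknown  */
}
```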
`When more than seven addresses collide in the
`same hash bucket—a very rare occurrence—the
`overflow addresses are stored in the GIGAswitch
`content-addressable memory (GCAM). If several
`dozen overflow addresses are added to the GCAM,
`the system determines that it has a poor choice of
`hash multipliers. It then initiates a re-hashing oper-
`ation, whereby the SCP module selects a better
`48-bit hash multiplier and distributes it to the FGLs.
`The FGLs then rebuild their hash table and lookup
`tables using this new hash multiplier value. The new
`hash multiplier is stored in nonvolatile memory.
`
`Packet Buffering
`The FDDI line card provides both input and output
`packet buffering for each FDDI port. Output buffer-
`ing stores packets when the outgoing FDDI link
`is busy. Input buffering stores packets during
`switch arbitration for the desired destination port.
`Both input and output buffers are divided into sep-
`arate first-in, first-out (FIFO) queues for different
`traffic types.
`Switches that have a single FIFO queue per input
`port are subject to the phenomenon known as
`head-of-line blocking. Head-of-line blocking occurs
`when the packet at the front of the queue is des-
`tined for a port that is busy, and packets deeper in
`the queue are destined for ports that are not busy.
`The effect of head-of-line blocking for fixed-size
`packets that have uniformly distributed output port
`destinations can be closely estimated by a simple
`probability model based on independent trials.
`This model gives a maximum achievable mean uti-
lization, U ≈ 1 − 1/e = 63.2 percent, for switches
`with more than 20 duplex ports. Utilization
`increases for smaller switches (or for smaller active
parts of larger switches) and is approximately
75 percent for 2 active ports. The independent trial
assumption has been removed, and the actual mean
utilization has been computed.4 It is approximately
60 percent for large numbers of active ports.
Hunt groups also affect utilization. The benefits of
hunt groups on head-of-line blocking can be seen by
extending the simple independent-trial analysis. The
estimated mean utilization is

U(n,g) = 1 − Σ (k = 0 to g) [(g − k)/g] C(ng, k) (1/n)^k ((n − 1)/n)^(ng − k)

where C(ng, k) is the binomial coefficient, n is the
number of groups, and g is the hunt group size. In
other words, all groups are the same size in this
model, and the total number of switch ports is n × g.
This result is plotted in Figure 2 along with
simulation results that remove the independent trial
assumption. The simulation results agree with the
analysis above for the case of only one link in each
hunt group. Note that adding a link to a hunt group
increases the efficiency of each member of the group
in addition to adding bandwidth. These analytical and
simulation results, documented in January 1988, also
agree with the simulation results reported by
Pattavina.2

Figure 2 Effect of Hunt Groups on Utilization
(expected maximum utilization versus hunt group size;
analytical and simulation results)

The most important factor in head-of-line blocking
is the distribution of traffic within the switch. When
all traffic is concentrated to a single output, there
is zero head-of-line blocking because traffic behind
the head of the line cannot move any more easily than
the head of the line can move. To study this effect,
we extended the simple independent-trial model. We
estimated the utilization when the traffic from a
larger set of inputs (for example, a larger set of
workstations) is uniformly distributed to a smaller
set of outputs (for example, a smaller set of file
servers). The result is

U(n,c) = 1 − ((n − 1)/n)^(cn), which approaches 1 − 1/e^c as n → ∞

where c is the mean concentration factor of input
ports to output ports, and n is the number of outputs.
This yields a utilization of 86 percent when an
average of two inputs send to each output, and a
utilization of 95 percent when three inputs send to
each output. Note that utilization increases further
for smaller numbers of active ports or if hunt groups
are used.
Other important factors in head-of-line blocking are
the nature of the links and the traffic distribution
on the links. Standard FDDI is a simplex link.
Simulation studies of a GIGAswitch system model were
conducted to determine the mean utilization of a set
of standard FDDI links. They have shown that
utilization reaches 100 percent, despite head-of-line
blocking, when approximately 50 Mb/s of fixed-size
packet traffic, uniformly distributed to all FDDI
links in the set, is sent into the switch from each
FDDI. The reason is that almost 50 percent of the FDDI
bandwidth is needed to sink data from the switch;
hence the switch data path is only at 50 percent of
capacity when the FDDI links are 100 percent utilized.
This result also applies to duplex T3 (45 Mb/s) and
all slower links. In these situations, the switch
operates at well below capacity, with little internal
queuing.
A number of techniques can be used to reduce the
effect of head-of-line blocking on link efficiency.
These include increasing the speed of the switching
fabric and using more complicated queuing mechanisms
such as per-port output queues or adding lookahead to
the queue service. All these techniques raise the cost
and complexity of the switch; some of them can
actually reduce performance for normal traffic. Since
our studies led us to believe that head-of-line
blocking occurs rarely in a GIGAswitch system, and
that when it does occur, hunt groups are an effective
means of reducing it, we chose not to implement more
costly and complex solutions.
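
The independent-trial estimates above can be checked numerically.
The short C program below evaluates U(n,g) and the
concentration-factor limit and reproduces the figures quoted in
the text; it is only a check of the formulas, not a model of the
switch itself.

```c
#include <stdio.h>
#include <math.h>

/* n = number of hunt groups (or outputs), g = group size; each of
 * the n*g inputs picks an output group uniformly at random. */
static double binom(int n, int k)
{
    double r = 1.0;
    for (int i = 1; i <= k; i++)
        r *= (double)(n - k + i) / (double)i;
    return r;
}

/* U(n,g) = 1 - sum_{k=0..g} ((g-k)/g) C(ng,k) (1/n)^k ((n-1)/n)^(ng-k) */
static double util(int n, int g)
{
    double idle = 0.0;
    for (int k = 0; k <= g; k++)
        idle += ((double)(g - k) / g) * binom(n * g, k)
              * pow(1.0 / n, k) * pow((n - 1.0) / n, n * g - k);
    return 1.0 - idle;
}

int main(void)
{
    printf("U(2,1)  = %.3f\n", util(2, 1));      /* about 0.75            */
    printf("U(36,1) = %.3f\n", util(36, 1));     /* near 1 - 1/e = 0.632  */
    printf("U(36,4) = %.3f\n", util(36, 4));     /* higher with hunt groups */
    printf("c = 2   : %.3f\n", 1.0 - exp(-2.0)); /* about 0.86            */
    printf("c = 3   : %.3f\n", 1.0 - exp(-3.0)); /* about 0.95            */
    return 0;
}
```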
`
Robustness under Overload
The network must remain stable even when the
GIGAswitch system is severely stressed. Stability
requires timely participation in the 802.1d spanning
tree when the packet forwarding loads approach
`the worst-case maximum. The techniques used to
`guarantee forward progress on activities like the
`spanning tree include preallocation of memory
`to databases and packets, queuing methods, operat-
`ing system design, and scheduling techniques.
`Solutions that provide only robustness are insuffi-
`cient; they must also preserve high throughput in
`the region of overload.
`
`Switch Control Processor Queuing and Quota
`Strategies The SCP is the focal point for many
`packets, including (1) packets to be flooded,
`(2) 802.1d control packets, (3) intrabox intercard
`command (ICC) packets, and (4) SNMP packets.5
`Some of these packets must be processed in a
`timely manner. The 802.1d control packets are part
`of the 802.1d algorithms and maintain a stable net-
`work topology. The ICCs ensure correct forwarding
`and filtering of packets by collecting and distribut-
`ing information to the various line cards. The SNMP
`packets provide monitoring and control of the
`GIGAswitch system.
`Important packets must be distinguished and
`processed even when the GIGAswitch system is
`heavily loaded. The aggregate forwarding rate for a
`GIGAswitch system fully populated with FGL-2 line
`cards is about 4 million packets per second. This is
`too great a load for the SCP CPU to handle on its
`own. The FDDI line cards place important packets
`in a separate queue for expedient processing.
`Special hardware on the SCP is used to avoid loss of
`important packets.
`The crossbar access control (XAC) hardware on
`the SCP is designed to avoid the loss of any impor-
`tant packet under overload. To distinguish the pack-
`ets, the XAC parses each incoming packet. By
`preallocating buffer memory to each packet type,
`and by having the hardware and software cooper-
`ate to maintain a strict accounting of the buffers
`used by each packet type, the SCP can guarantee
`reception of each packet type.
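
The C sketch below illustrates this kind of per-type buffer
accounting. The packet classes, quota sizes, and function names
are ours, and in the product the admission check is made by the
XAC hardware rather than by software.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative per-type buffer quotas (values are invented). */
enum pkt_type { PKT_FLOOD, PKT_802_1D, PKT_ICC, PKT_SNMP, PKT_NTYPES };

struct quota {
    uint32_t in_use;   /* buffers currently held by this packet type */
    uint32_t limit;    /* preallocated buffer quota for this type    */
};

static struct quota quotas[PKT_NTYPES] = {
    [PKT_FLOOD]  = { 0, 256 },   /* flooded packets may be dropped     */
    [PKT_802_1D] = { 0,  64 },   /* spanning tree control              */
    [PKT_ICC]    = { 0,  64 },   /* sized so ICCs are never dropped    */
    [PKT_SNMP]   = { 0,  32 },
};

/* Admission check: a packet whose quota is exhausted is dropped
 * before it can consume memory needed by other packet types. */
static bool admit(enum pkt_type t)
{
    if (quotas[t].in_use >= quotas[t].limit)
        return false;            /* quota exhausted: drop the packet  */
    quotas[t].in_use++;
    return true;
}

/* Software returns the buffer to its quota when processing is done. */
static void release(enum pkt_type t)
{
    quotas[t].in_use--;
}
```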
`
`Arriving packets allocated to an exhausted buffer
`quota are dropped by the XAC. For instance, pack-
`ets to be flooded arrive due to external events and
`are not rate limited before they reach the SCP.
`These packets may be dropped if the SCP is over-
`loaded. Some buffer quotas, such as those for ICC
`packets, can be sized so that packets are never
`dropped. Since software is not involved in the deci-
`sion to preserve important packets or to drop
`excessive loads, high throughput is maintained dur-
`ing periods of overload. In practice, when the net-
`work topology is stable, the SCP is not overloaded
`and packets passing through the SCP for bridging
`are not dropped, even on networks with thousands
`of stations. This feature is most important during
`power-up or topology-change transients, to ensure
`the network progresses to the stable state.
`If the SCP simply processed packets in FIFO order,
`reception of each packet type would be ensured, but
`timely processing of important packets might not.
`Therefore, the first step in any packet processing is
`to enqueue the packet for later processing. (Packets
`may be fully processed and the buffers reclaimed if
`the amount of work to do is no greater than the
`enqueue/dequeue overhead.) Since the operating
`system scheduler services each queue in turn, split-
`ting into multiple queues allows the important
`packets to bypass the less important packets.
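
A minimal sketch of this queue splitting is shown below, assuming
a scheduler that services each receive queue in turn with a
bounded slice; the queue names, the slice policy, and the helper
functions are ours and are not the SCP's actual scheduler
interface.

```c
#include <stddef.h>

/* Received packets are sorted into per-class queues so that
 * important traffic bypasses bulk traffic. */
struct pkt;                                   /* opaque packet buffer  */

enum rx_class { RX_CONTROL, RX_ICC, RX_FLOOD, RX_NCLASSES };

extern struct pkt *dequeue(enum rx_class q);  /* NULL if queue empty   */
extern void        process(struct pkt *p);

/* One scheduler pass: every class gets a bounded amount of service,
 * so a flood of low-priority packets cannot starve control work. */
static void service_receive_queues(int slice)
{
    for (int q = 0; q < RX_NCLASSES; q++)
        for (int i = 0; i < slice; i++) {
            struct pkt *p = dequeue((enum rx_class)q);
            if (p == NULL)
                break;                        /* queue empty, go on    */
            process(p);
        }
}
```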
`Multiple queues are also used on the output port
`of the SCP. These software output queues are ser-
`viced to produce a hardware output queue that is
`long enough to amortize device driver entry over-
`heads, yet short enough to bound the service time
`for the last packet inserted. Bounding the hardware
`queue service time ensures that the important
`802.1d control packets convey timely information
`for the distributed spanning tree algorithms. These
considerations yield the queuing diagram shown in
Figure 3.

Figure 3 Packet Queuing on the SCP (packets arrive
from the crossbar at time t1, pass through tasks and
processes on the SCP at times t2 through t4, and
leave on the crossbar at time t5)

At time t1, packets arriving in nonempty quotas
are transferred by direct memory access (DMA) into
`dynamic RAM. They enter the hardware-received
`packet queue. At time t2, software processes the
`received packet queue, limiting the per-packet
`processing to simple actions like the enqueuing
`of the packet to a task or process. At time t3, the
`packet contents are examined and the proper pro-
`tocol actions executed. This may involve the for-
`warding of the arriving packet or the generation of
`new packets. At time t4, packets are moved from
`the software output queues to the short hardware
`output queue. At time t5, the packet is transferred
`by DMA into the crossbar.
`
`Limiting Malicious Influences Using packet
`types and buffer quotas, the SCP can distinguish
`important traffic, like bridge control messages,
`when it is subjected to an overload of bridge con-
`trol, unknown destination addresses, and multicast
`messages. Such simple distinctions would not, how-
`ever, prevent a malicious station from consuming
`all the buffers for multicast packets and allowing
`starvation of multicast-based protocols. Some of
`these protocols, like the IP address resolution pro-
`tocol (ARP), become important when they are not
`allowed to function.6
`To address this problem, the SCP also uses the
`incoming port to classify packets. A malicious sta-
`tion can wreak havoc on its own LAN whether or
`not the GIGAswitch system is present. By classifying
`packets by incoming port, we guarantee some
`buffers for each of the other interfaces and thus
`ensure communication among them. The mali-
`cious station is reduced to increasing the load of
`nuisance background traffic. Region t4 of Figure 3
`contains the layer of flooding output queues that
`sort flooded packets by source port. When for-
`warding is done by the SCP bridge code, packets
`from well-behaved networks can bypass those from
`poorly behaved networks.
`Fragmentation of resources introduced by the
`fine-grained packet classification could lead to small
`buffer quotas and unnecessary packet loss. To com-
`pensate for these possibilities, we provided shared
`resource pools of buffers and high-throughput,
`low-latency packet forwarding in the SCP.
`
Guaranteeing Forward Progress If an interrupt-
driven activity is offered unlimited load and is
allowed to attempt to process the unlimited load,
a “livelock” condition, where only that activity
executes, can result. Limiting the rate of interrupts
allows the operating system scheduler access to the
CPU, so that all parts of the system can make for-
`ward progress in a timely manner.
`On the SCP, limiting the interrupt rate is accom-
`plished in two ways. One is to mask the propa-
`gation of an interrupt by combining it with a
`software-specified pulse. After an interrupt is
`serviced, it is inhibited for the specified time by
`triggering the start of the pulse. At the cost of hard-
`ware complexity, software is given a method for
`quick, single-instruction, fine-grained rate limiting.
`Another method, suitable for less frequently exe-
`cuted code paths like error handling, is to use soft-
`ware timers and interrupt mask registers to limit
`the frequency of an interrupt. Limiting the inter-
`rupt rate also has the beneficial effect of amortizing
`interrupt overhead across the events aggregated
`behind each interrupt.
`Noninterrupt software inhibits interrupts as part
`of critical section processing. If software inhibits
`interrupts for too long, interrupt service code can-
`not make forward progress. By convention, inter-
`rupts are inhibited for a limited time.
`Interrupt servicing can be divided into two
`types. In the first type, a fixed sequence of actions
`is taken, and limiting the interrupt rate is sufficient
`to limit interrupt execution time. Most error pro-
`cessing falls into this category. In the second type,
`for all practical purposes, an unbounded response
`is required. For example, if packets arrive faster
`than driver software can process them, then inter-
`rupt execution time can easily become unaccept-
`able. Therefore, we need a mechanism to bound the
`service time. In the packet I/O interrupt example,
`the device driver polls the microsecond clock to
`measure service time and thereby terminate device
`driver processing when a bound is reached. If ser-
`vice is prematurely terminated, then the hardware
`continues to post the interrupt, and service is
`renewed when the rate-limiting mechanism allows
`the next service period to begin.
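
The sketch below shows the shape of such a bounded service routine
in C. The budget value and the helper names (read_usec_clock,
rx_packet_pending, start_interrupt_holdoff) are placeholders for
the SCP's actual clock, device, and hold-off pulse interfaces.

```c
#include <stdint.h>
#include <stdbool.h>

#define SERVICE_BUDGET_US 500u     /* hypothetical per-entry bound     */

extern uint64_t read_usec_clock(void);      /* free-running us counter */
extern bool     rx_packet_pending(void);
extern void     process_one_packet(void);
extern void     start_interrupt_holdoff(void); /* software-specified   */
                                               /* masking pulse        */

/* Called from the packet I/O interrupt. */
static void packet_interrupt_service(void)
{
    uint64_t deadline = read_usec_clock() + SERVICE_BUDGET_US;

    while (rx_packet_pending()) {
        process_one_packet();
        if (read_usec_clock() >= deadline)
            break;       /* bound reached; interrupt stays asserted    */
    }
    /* Arm the hold-off pulse so this interrupt cannot fire again
     * until the rate limit expires. */
    start_interrupt_holdoff();
}
```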
`Interrupt rate limiting can lead to lower system
`throughput if the CPU is sometimes idle. This can
`be avoided by augmenting interrupt processing
`with polled processing when idle cycles remain
`after all activities have had some minimal fair share
`of the CPU.
`
`Reliability and Availability
`Network downtime due to switch failures, repairs,
`or upgrades of the GIGAswitch system is low.
`Components in the GIGAswitch system that could
be single points of failure are simple and thus more
reliable. Complex functions (logic and firmware)
`are placed on modules that were made redundant.
`A second SCP, for example, takes control if the first
`SCP in the GIGAswitch system fails. If a LAN is con-
`nected to ports on two different FDDI line cards,
`the 802.1d spanning tree places one of the ports in
`backup state; failure of the operational port causes
`the backup port to come on-line.
`The GIGAswitch system allows one to “hot-swap”
`(i.e., insert or remove without turning off power)
`the line cards, SCP, power supply front-ends, and
`fans. The GIGAswitch system may be powered from
`station batteries, as is telephone equipment, to
`remove dependency on the AC mains.
`
`Module Details
In the following sections, we describe the functions of
`the clock card, the switch control processor, and
`the FDDI line card.
`
`Clock Card
`The clock card generates the system clocks for the
`modules and contains a number of centralized sys-
`tem functions. These functions were placed on the
`clock card, rather than the backplane, to ensure
`that the backplane, which is difficult to replace, is
`passive and thus more reliable. These functions
`include storing the set of 48-bit IEEE 802 addresses
`used by the switch and arbitration of the backplane
`bus that is used for connection setup.
`Management parameters are placed in a stable
`store in flash electrically erasable programmable
read-only memory (EEPROM) on the clock card,
rather than on SCP modules. This simplifies the task
`of coordinating updates to the management param-
`eters among the SCP modules. The SCP module con-
`trolling the box is selected by the clock card, rather
`than by a distributed (and more complex) election
`algorithm run by the SCPs.
`The clock card maintains and distributes the
`system-wide time, communicated to the SCP and
`line cards through shared-memory mailboxes on
`their GPI chips. Module insertion and removal are
`discovered by the clock card, which polls the slots
`in the backplane over the module identification bus
`interconnecting the slots. The clock card controls
`whether power is applied to or removed from a
`given slot, usually under command of the SCP. The
`clock card also provides a location for the out-of-
`band management RS-232 port, although the out-
`of-band management code executes on the SCP.
`Placing this set of functionality on the clock card
`does not dramatically increase complexity of that
`module or reduce its reliability. It does, however,
`significantly reduce the complexity and increase
`the reliability of the system as a whole.
`
`S