(12) United States Patent
Bhaskaran

(10) Patent No.: US 6,266,335 B1
(45) Date of Patent: Jul. 24, 2001

US006266335B1

(54) CROSS-PLATFORM SERVER CLUSTERING USING A NETWORK FLOW SWITCH

(75) Inventor: Sajit Bhaskaran, Sunnyvale, CA (US)

(73) Assignee: CyberIQ Systems, San Jose, CA (US)
5,815,668       9/1998   Hashimoto ............... 395/200.68
5,835,696      11/1998   Hess .................... 395/182.08
5,835,710  *   11/1998   Nagami et al. ........... 709/250
5,862,338  *    1/1999   Walker et al. ........... 395/200.54
5,920,699  *    7/1999   Bare .................... 709/225
5,936,936  *    8/1999   Alexander, Jr. et al. ... 370/395
5,949,753  *    9/1999   Alexander, Jr. et al. ... 370/216
5,999,536  *   12/1999   Kawafuji et al. ......... 370/401
6,006,264  *   12/1999   Colby et al. ............ 709/226
6,047,319  *    4/2000   Olson ................... 709/223
6,097,882  *    8/2000   Mogul ................... 709/201
6,101,616  *    8/2000   Joubert et al. .......... 714/11
FOREIGN PATENT DOCUMENTS

9-321789      12/1997  (JP) .................. H04L/12/46
WO 99/32956    7/1999  (WO) .................. G06F/0/00
OTHER PUBLICATIONS

Internet, "Quasi-Dynamic Load-Balancing (QDBL) Methods," Apr. 25, 1995, pp. 2 and 5.

* Cited by examiner

Primary Examiner—Douglas Olms
Assistant Examiner—Phirin Sam
(74) Attorney, Agent, or Firm—Skjerven Morrill Macpherson LLP; Alan H. Macpherson; Pablo E. Marine
(57)                  ABSTRACT

A network flow switch is provided for connecting a pool of IP routers to a cluster of IP servers sharing a single IP address without requiring translation of the IP address. Rather, all IP servers have the same IP address. The network flow switch routes packets to individual servers by writing the Data Link Layer address of the destination IP server in the destination Data Link Layer address field of the packet. However, no Data Link Layer address translation is required for packets transmitted from the IP servers to the IP routers. Since in a typical client-server environment the number of packets sent from the server to the client is much greater than the number of packets sent from the client to the server, the Data Link Layer address translation requires very little overall processing time.

35 Claims, 11 Drawing Sheets
( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 08/994,709

(22) Filed: Dec. 19, 1997

(51) Int. Cl.7 .................... H04L 12/28; H04L 12/56
(52) U.S. Cl. ..................... 370/399; 370/389
(58) Field of Search .............. 370/399, 397,
        370/402, 360, 372, 353, 389, 396, 400,
        401, 409, 419, 420, 421, 423, 901, 902,
        903, 908, 910, 912, 911, 392, 422; 395/115,
        182.07, 200.3, 200.31, 200.32, 200.33,
        200.48, 200.49, 200.57, 200.68
(56)                References Cited

          U.S. PATENT DOCUMENTS

5,283,897       2/1994   Georgiadis et al. ....... 395/650
5,301,226  *    4/1994   Olson et al. ............ 379/88.18
5,473,599      12/1995   Li et al. ............... 370/16
5,513,314       4/1996   Kandasamy et al. ........ 395/182.04
5,583,940      12/1996   Vidrascu et al. ......... 380/49
5,586,121      12/1996   Moura et al. ............ 370/404
5,608,447  *    3/1997   Parry et al. ............ 348/7
5,612,865  *    3/1997   Dasgupta ................ 364/184
5,652,892  *    7/1997   U...
5,655,140       8/1997   Haddock ................. 395/200.76
5,666,487  *    9/1997   Goodman et al. .......... 709/246
5,687,369      11/1997   Li ...................... 395/619
5,740,375  *    4/1998   Dunne et al. ............ 709/238
5,754,752  *    5/1998   Sheh et al. ............. 709/227
5,764,895  *    6/1998   Chung ................... 370/402
5,774,660  *    6/1998   Brendel et al. .......... 395/200.31
5,774,668  *    6/1998   Choquier et al. ......... 370/480
5,796,941       8/1998   Lita .................... 395/187.01
5,805,804  *    9/1998
5,812,819  *    9/1998
IBM / Softlayer v. ZitoVault
Ex. 1006 / Page 1 of 19
U.S. Patent     Jul. 24, 2001     Sheet 1 of 11     US 6,266,335 B1

[Drawing: FIG. 1 — prior-art cluster of IP servers with a server load balancer and network routers]

U.S. Patent     Jul. 24, 2001     Sheet 2 of 11     US 6,266,335 B1

[Drawing: FIG. 2 — cluster of IP servers sharing a single IP address, network flow switch, and network routers]

U.S. Patent     Jul. 24, 2001     Sheet 3 of 11     US 6,266,335 B1

[Drawing: FIGS. 3A and 3B — packet format, including the source and destination Data Link Layer (MAC) address fields]

U.S. Patent     Jul. 24, 2001     Sheet 4 of 11     US 6,266,335 B1

[Drawing: FIG. 4A — network flow switch with CPU board, memory, and Ethernet controllers]

U.S. Patent     Jul. 24, 2001     Sheet 5 of 11     US 6,266,335 B1

[Drawing: FIG. 4B — Start; Ethernet card receives packet (420); Copy packet to Memory (430); Perform Load Balancing (OPTIONAL) (435); Re-write MAC destination address (440); Route packet to Server (445); End]

[Drawing: FIG. 4C — Ethernet card receives packet (450); Re-write MAC destination address (OPTIONAL); Route packet to Network Router (465); End]

U.S. Patent     Jul. 24, 2001     Sheet 6 of 11     US 6,266,335 B1

[Drawing: FIG. 5A — network flow switch built from multiple general-purpose circuit boards]

U.S. Patent     Jul. 24, 2001     Sheet 7 of 11     US 6,266,335 B1

[Drawing: FIG. 5B — network flow switch built from a general-purpose CPU board and a special-purpose network board]

U.S. Patent     Jul. 24, 2001     Sheet 8 of 11     US 6,266,335 B1

[Drawing: FIG. 5C — network flow switch built from two special-purpose circuit boards]

U.S. Patent     Jul. 24, 2001     Sheet 9 of 11     US 6,266,335 B1

[Drawing: FIG. 5D — network flow switch built from a single special-purpose circuit board]

U.S. Patent     Jul. 24, 2001     Sheet 10 of 11     US 6,266,335 B1

[Drawing: FIG. 5E — network flow switch built from a combination of special-purpose and general-purpose circuit boards]

U.S. Patent     Jul. 24, 2001     Sheet 11 of 11     US 6,266,335 B1

[Drawing: FIG. 5F — network flow switch built around a crossbar switch, with processor cards and a management card]

`
CROSS-PLATFORM SERVER CLUSTERING USING A NETWORK FLOW SWITCH

CROSS REFERENCE TO APPENDIX

Appendix A, which is part of the present application, is a set of architectural specifications for a network flow switch, according to one embodiment of the invention.
BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to computer networks and, more specifically, to high-bandwidth network switches.

2. Description of the Related Art

The increasing traffic over computer networks such as the Internet, as well as corporate intranets, WANs and LANs, often requires the use of multiple servers to accommodate the needs of a single service provider or MIS department. For example, a company that provides a search engine for the Internet may handle over 80 million hits (i.e., accesses to the company's Web page) every day. A single server cannot handle such a large volume of service requests within an acceptable response time. Therefore, it is desirable for high-volume service providers to be able to use multiple servers to satisfy service requests.

For example, the Internet Protocol (IP), which is used to identify computers connected to the Internet and other global, wide or local area networks, assigns a unique IP address to each computer connected to the network. Thus, when multiple servers are used, each server must be accessed using the server's own IP address.

On the other hand, it is desirable for users to be able to access all servers of a service provider using a unique IP address. Otherwise, the users would have to keep track of the servers maintained by the service provider and their relative workloads in order to obtain faster response times. By using a single "virtual" IP address (i.e., an IP address that does not correspond to any one of the IP servers, but rather designates the entire group of IP servers), service providers are able to divide service requests among the servers. By using this scheme, IP servers may even be added to or removed from the group of IP servers corresponding to the virtual IP address to compensate for varying traffic volumes. Multiple servers used in this fashion are sometimes referred to as a "cluster."
FIG. 1 illustrates a prior art cluster of IP servers. A server load balancer 100 routes packets among IP servers 110, 120, 130, 140 and 150 and network routers 160, 170 and 180. Each of IP servers 110, 120, 130, 140 and 150 and network routers 160, 170 and 180 has a distinct IP address; however, any of IP servers 110, 120, 130, 140 and 150 can be accessed via a virtual IP address (not shown) from networks connected to network routers 160, 170 and 180. When a packet addressed to the virtual IP address is received by server load balancer 100, the virtual IP address is translated into the individual IP address of one of the IP servers and the packet is routed to that IP server. The translation, however, involves generating a new checksum for the packet and re-writing the source/destination IP address and the checksum fields of the IP header field, as well as of the TCP and UDP header fields. Both the IP header checksum, which is the ISO Layer 3 or Network Layer header checksum, and the TCP or UDP header checksums, which are the ISO Layer 4 or Transport Layer header checksums, need to be recalculated for each packet. Typically, these operations require intervention by a processor of the server load balancer.
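The translation cost described above can be made concrete. The sketch below is ours, not the patent's: it performs a NAT-style destination-address rewrite on an IPv4 header and then recomputes the RFC 1071 one's-complement checksum, which is exactly the per-packet work this kind of translation forces on the load balancer's processor.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of the header's 16-bit words."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def rewrite_dst_ip(header: bytearray, new_dst: bytes) -> None:
    """NAT-style rewrite: change the destination IP, then recompute the checksum."""
    header[16:20] = new_dst                 # destination address, bytes 16-19
    header[10:12] = b"\x00\x00"             # checksum field is zeroed before summing
    header[10:12] = struct.pack("!H", ipv4_checksum(bytes(header)))
```

A correctly checksummed header sums (with its checksum word included) to zero, which gives a cheap validity check after each rewrite.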
When a high volume of requests is processed, the overhead imposed by the translation has a significant impact on the response time of the IP servers. In addition, if a large number of IP servers are used, the time required to perform the translation creates a bottleneck in the performance of the server load balancer, since the IP address of each packet transmitted to and from the IP servers must be translated by the switch. Therefore, there is a need for a faster method for sharing a single IP address among multiple IP servers.

In other cases, when multiple IP addresses are used, a client typically tries to access a primary IP server. If the primary IP server does not respond within a fixed time period, the client tries to access backup IP servers, until a response is received. Thus, when the primary IP server is unavailable, the client experiences poor response time. Current server replication systems such as those used in DNS and RADIUS servers are affected by this problem. There is thus a need for a method of accessing multiple IP servers which does not experience poor response time when the primary IP server is unavailable.

Another potential drawback of the prior art is that each replicated server requires a unique IP address physically configured on the server. Since all IP networks are subject to subnet masking rules (which are often determined by an external administrator), the scalability of the replication is severely limited. For example, if the subnet prefix is 28 bits of a 32-bit IP address, the maximum number of replicated servers is 16 (2^(32-28)). There is a need for a method of replicating servers that allows replication of IP servers independent of subnet masking rules.
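The /28 arithmetic above can be checked directly. This helper is ours, written only to illustrate the patent's example:

```python
def max_replicated_servers(prefix_len: int, addr_bits: int = 32) -> int:
    """Addresses available under a given subnet prefix: 2^(addr_bits - prefix_len)."""
    return 2 ** (addr_bits - prefix_len)

# The patent's example: a 28-bit prefix on a 32-bit IPv4 address caps the
# cluster at 2^(32-28) = 16 replicated servers.
```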
IP version 4 addresses are currently scarce on the Internet, so any method of IP server replication that requires a proportional consumption of these scarce IP addresses is inherently wasteful. One example of such prior art is Domain Name Service (DNS) based load balancing. DNS servers are used for resolving a server name (e.g., www.companyname.com) to a globally unique IP address (e.g., 192.45.54.23). In DNS based server load balancing, many unique IP addresses per server name are kept and doled out to allow load balancing. However, this reduces the number of available IP version 4 addresses. There is thus a need for a method of clustering IP servers that minimizes consumption of the scarce IP address space.

Furthermore, when the IP payload of a packet is encrypted to provide secure transmissions over the Internet, IP address translation cannot be performed without first decrypting the IP payload (which contains the TCP or UDP header checksums). In the current framework for IP Security, referred to as IPSEC, the transport layer is part of the network layer payload, which will be completely encrypted in a network application that implements IPSEC. IPSEC is described in RFCs 1825-1827 published by the Internet Engineering Taskforce. Encryption is performed by the client, and decryption is performed by the server, using secret crypto-keys which are unique to each client-server link. Therefore, when such encryption is performed in client-server communications, as in IPSEC, prior art server load balancers will not be able to perform load balancing operations without violating IPSEC rules. This is because server load balancers cannot access the transport layer information (encrypted as part of the IP payload) without first decrypting the IP payload. Since the crypto-keys set up between client and server are by definition not public, the IP payload cannot be decrypted by the server load balancer in compliance with IPSEC (indeed, for all practical purposes, the server load balancer will not work at all for encrypted packets).

There is thus a need for a system that not only allows for transmissions of encrypted data packets according to the

IPSEC model, but also allows network administrators to perform both server load balancing and IPSEC in their networks. Furthermore, current server load balancers typically operate on TCP packets only. By contrast, IP headers have an 8-bit protocol field, theoretically supporting up to 256 transport protocols at ISO layer 4. There is thus a need for a server load balancing system that supports transport protocols at ISO layer 4 other than TCP (e.g., UDP, IP-in-IP, etc.).

Prior art systems allow for load balancing and, sometimes, fault tolerance of network traffic only in the inbound direction (i.e., client-router-server). Load balancing and fault tolerance in the reverse (outbound) direction (i.e., server-router-client) is not supported. Specifically, if multiple router links are provided for the server to return information to clients, no attempt is made to load balance traffic flow through the router links. Also, when a specific IP server is configured to use a specific default router IP address in the outbound transmissions, no fault tolerance or transparent re-routing of packets is performed when the router fails. There is thus a need for a system that allows for traffic flow clustering services in both the inbound and the outbound directions.

The prior art solutions are hardware devices configured to appear as IP routers to the cluster of servers being load balanced. As a result, one or more classes of IP router devices are added to the router administrator's domain of managed IP routers. This constrains future evolution of the router network, both in terms of adding new vendors' routers in the future and adding new and more sophisticated routing features. Debugging and troubleshooting of routing problems also become more difficult. It would thus be preferable to employ a completely transparent piece of hardware, such as a LAN switch or hub, as a load balancing device. In the related art, the servers and any external routers are connected to the load balancing device using shared media Ethernet (i.e., a broadcast media network). There is a need for a better solution that allows use of switched circuits (e.g., switched Ethernet, SONET), as switched circuits inherently provide (a) dedicated bandwidth and (b) full-duplex operation (i.e., simultaneous transmit and receive) to all connected devices.
SUMMARY OF THE INVENTION

The present invention provides a network flow switch (and a method of operation thereof) for connecting a pool of IP routers to a cluster of IP servers sharing a single IP address, without requiring translation of the IP address, and providing bi-directional clustering. The network flow switch, by operating transparently at the ISO layers 2 and 3, enables cross-platform clustering of servers and routers, these routers being the so-called "first-hop" routers used by the servers to communicate with the outside world. This means the servers within any single cluster can come from any manufacturer of computer hardware and run any operating system (e.g., Microsoft WINDOWS NT, Unix, MACOS). WINDOWS NT is a registered trademark of Microsoft Corp. of Redmond, Wash.; MACOS is a registered trademark of Apple Computer, Inc. of Cupertino, Calif. It also means the routers can come from any vendor of routing equipment. The network flow switch, therefore, allows customers freedom of choice in server operating systems as well as router systems in designing their server clustering schemes. The only requirement on these servers and routers is that they all implement standard TCP/IP communications protocols, or some other protocol stack in conformance with the ISO/OSI 7-layer model for computer
communications. The network flow switch routes packets to individual servers by writing the Data Link Layer address of the destination IP server in the destination Data Link Layer address field of the packet. Packets transmitted from the IP servers to the IP routers, on the other hand, do not require modification of the Data Link Layer address field.

Since in a typical client-server environment the majority of the packets flowing through the network flow switch are transferred from the server to the client, eliminating processor intervention in routing outbound packets allows for significant performance enhancements. As a result, the likelihood of the network flow switch becoming a bottleneck is greatly reduced.

Multiple clusters (one or more IP servers sharing a single IP address) are supported in a single network flow switch. On any single link attached to each of the IP servers, multiple clusters can be supported if the IP server's operating system supports multiple IP addresses on a physical link.

In some embodiments, the network flow switch, in addition to routing of the packets, performs load balancing and fault tolerance functions. In these embodiments, a processor of the network flow switch periodically executes a load balancing routine to determine the relative workload of each of the IP servers. When the network flow switch receives a packet destined to the cluster of IP servers, the packet is routed to the IP server with an optimal workload, so as to ensure that the workload is evenly distributed among the IP servers. In addition, if a failure of a network router is detected, a packet addressed to that network router is re-routed to a different network router by re-writing the Data Link Layer destination address of the packet. Since the network flow switch continuously monitors the status of the IP servers, no lengthy time delay is introduced in client-server communications when an IP server is disabled.

Since the IP header is not modified, the network flow switch of the present invention operates on packets encoded according to any ISO layer 4 protocol and, unlike prior art server load balancers, is not limited to TCP encoded packets. In addition, the network flow switch can also handle re-routing, load balancing and fault tolerance of encrypted packets transparently to both server and client.

In some embodiments, load balancing is also performed for outbound packets, so as to route packets to the router with an optimal workload.

Thus, a method and apparatus are provided to allow bi-directional clustering for load balancing and fault tolerance in the inbound direction (i.e., client-router-server), as well as in the outbound direction (i.e., server-router-client).
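The outbound fault-tolerance behavior summarized above reduces to substituting a healthy router's Data Link Layer address when the server's configured default router has failed. This sketch uses our own data model (a health map keyed by router MAC); nothing in it is taken from the patent or Appendix A:

```python
def outbound_router_mac(default_mac: str, healthy: dict) -> str:
    """Keep the server's configured default router while it is up; on a
    detected router failure, transparently re-route outbound frames to
    any healthy router's MAC instead."""
    if healthy.get(default_mac):
        return default_mac
    for mac, up in healthy.items():
        if up:
            return mac
    raise RuntimeError("no healthy router available")
```

Because only the frame's Data Link Layer destination changes, the re-route is invisible to both the server and the client.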
BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a prior art cluster of IP servers, each having a distinct IP address, and a prior art network flow switch for translating a virtual IP address shared by all IP servers in the cluster into the individual IP addresses of the IP servers.

FIG. 2 illustrates a cluster of IP servers and a network flow switch, according to an embodiment of the invention. Each IP server has a same IP address. A Data Link Layer address is used to identify each IP server within the cluster.

FIG. 3A illustrates the format of a packet routed to/from the cluster of IP servers by the network flow switch 205 of FIG. 2.

FIG. 3B shows the format of link field 320 of FIG. 3A.

FIG. 4A illustrates the structure of the network flow switch 205 of FIG. 2.

FIG. 4B is a flow diagram of the process of routing packets from one of the network clients to one of the IP servers of FIG. 2 via the network flow switch 205 of FIG. 4A, according to an embodiment of the invention.

FIG. 4C is a flow diagram of the process of routing packets from one of the IP servers to one of the network clients of FIG. 2 via the network flow switch 205 of FIG. 4A, according to an embodiment of the invention.

FIG. 5A is a block diagram of a network flow switch implemented using multiple general-purpose circuit boards, according to an embodiment of the invention.

FIG. 5B is a block diagram of a network flow switch implemented using a general-purpose CPU board and a special-purpose network board, according to an embodiment of the invention.

FIG. 5C is a block diagram of a network flow switch implemented using two special-purpose circuit boards, according to an embodiment of the invention.

FIG. 5D is a block diagram of a network flow switch implemented using a single special-purpose circuit board, according to an embodiment of the invention.

FIG. 5E is a block diagram of a network flow switch implemented using a combination of special-purpose and general-purpose circuit boards, according to an embodiment of the invention.

FIG. 5F is a block diagram of a network flow switch implemented using a crossbar switch, according to an embodiment of the invention.
DETAILED DESCRIPTION OF THE INVENTION

The method and apparatus of the present invention allow multiple IP servers to share a same IP address and use a network flow switch to route packets among the IP servers based on the Data Link Layer address of the IP servers (e.g., the destination address of the packets is translated into the Data Link Layer address of one of the IP servers). Since IP networks ignore the source Data Link Layer address field of packets transmitted over the network, Data Link Layer address translation is performed only for packets flowing from an IP client to an IP server. In the reverse flow direction, that is, from an IP server to an IP client, no Data Link Layer address translation is required, thus allowing for very fast throughput through the network flow switch.
A cluster of IP servers 200 and a network flow switch 205, according to an embodiment of the invention, are shown in FIG. 2. Network flow switch 205 routes packets among IP servers 210, 220, 230, 240 and 250 and network routers 260, 270 and 280. IP servers 210, 220, 230, 240 and 250 are configured identically and have a virtual IP address 290. In addition, each of IP servers 210, 220, 230, 240 and 250 has a distinct Data Link Layer address and a distinct link name. The link name is used to identify the unique server within the cluster of servers sharing a same IP address. As explained below, the Data Link Layer address is used to translate a virtual Data Link Layer address to a physical Data Link Layer address, after an IP server is selected by network flow switch 205 to receive the packet. IP address 290 is visible to devices communicating with the cluster 200, while the individual Data Link Layer addresses of each of the IP servers are not. Network flow switch 205, in fact, performs a proxy Address Resolution Protocol (ARP) function that returns a "virtual" Data Link Layer address (not shown) to a network connected device in response to a standard ARP query. As a result, network connected devices see the cluster 200 as having a single IP address 290 and a single Data Link Layer address (not shown).

Network routers 260, 270 and 280, on the other hand, each have a distinct IP address and a distinct Data Link Layer address. The routers are used to connect cluster 200 to external networks (not shown) via network flow switch 205. Thus, in order to transmit packets of information to cluster 200, a device connected to one of the external networks (e.g., a router) issues a standard ARP query to network flow switch 205 to obtain the virtual Data Link Layer address of cluster 200; network flow switch 205 returns a Data Link Layer address of the selected receiving device (e.g., one of the IP servers) to the requesting device (e.g., the router). The network connected device then transmits a series of packets to network flow switch 205 (e.g., through one of network routers 260, 270 or 280 connected to the external network). The packets are then re-routed by network flow switch 205 to exactly one of IP servers 210, 220, 230, 240 and 250.

Since all embodiments of the network flowswitch ensure that no two servers in the same cluster are on the same flowswitch port, broadcast isolation of the replicated servers is enabled. Therefore, IP address conflicts are avoided by the active intervention of the flowswitch in the event of ARP query packets being received by the network flowswitch, as described above.
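The proxy-ARP behavior just described can be sketched as follows. This is an illustrative model only: the class name, addresses, and return format are invented here, not taken from the patent or Appendix A.

```python
# Illustrative model of the flow switch's proxy ARP: the whole cluster is
# advertised under one virtual IP address and one virtual Data Link Layer
# (MAC) address, so outside devices never learn the servers' physical MACs.
class ProxyArp:
    def __init__(self, virtual_ip: str, virtual_mac: str):
        self.virtual_ip = virtual_ip
        self.virtual_mac = virtual_mac

    def answer(self, arp_target_ip: str):
        """Return an ARP reply for the cluster's virtual IP, else None."""
        if arp_target_ip == self.virtual_ip:
            return (self.virtual_ip, self.virtual_mac)
        return None  # not our address; let the query pass through
```

A querying router thus sees the cluster as one IP address bound to one MAC address, regardless of which server eventually receives the traffic.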
The format of a packet 300 transmitted over the external network is illustrated in FIG. 3A. Packet 300 has a header field 310, a link field 320, an IP header 330, a TCP header 340, a data payload 350, a CRC field 360 and a trailer 370. Header 310 and trailer 370 are 8-bit wide private tag-fields: these are not transmitted over the external network but used only inside the network flow switch. IP header 330 and TCP header 340 are standard IP and TCP headers. IP header 330 includes, among other information, a destination IP address and a source IP address for packet 300. CRC field 360 contains a checksum correction code used to verify that packet 300 has been transmitted without error. If IP header 330 were modified, as required by prior art methods for sharing a single IP address among multiple IP servers, the checksum for CRC field 360 would have to be recalculated, an operation requiring processor intervention. In addition, if encrypted information is transmitted according to the IPSEC security framework, decryption of the IP payload is required. Thus, by eliminating the need to recompute the checksum for each packet, the network flow switch of the present invention achieves better throughput than prior art devices. Network owners can further deploy IPSEC security mechanisms transparently and without fear of communications being broken.
FIG. 3B illustrates the format of link field 320. Link field 320 has a Data Link Layer source address field 380, a Data Link Layer destination address field 390 and a type field 395. Since link field 320 is not part of the IP protocol, there is no need to recalculate the checksum for CRC field 360 when link field 320 is modified. Accordingly, re-routing of packets according to the present invention is accomplished by re-writing the Data Link Layer destination address in Data Link Layer destination address field 390 of packet 300. Neither IP header 330 nor CRC field 360 is modified, reducing the processing time required to route packets to and from the cluster of IP servers.
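Because only Data Link Layer destination address field 390 changes, the rewrite amounts to a fixed-offset byte copy. A minimal sketch, ours rather than the patent's, assuming a standard Ethernet layout with the destination MAC in the first six bytes of the frame:

```python
def reroute_to_server(frame: bytearray, server_mac: bytes) -> None:
    """Re-write only the destination MAC (bytes 0-5 of an Ethernet frame).
    The IP header and its checksum deeper in the frame are untouched, so
    no checksum recomputation (and no IPSEC decryption) is required."""
    if len(server_mac) != 6:
        raise ValueError("MAC addresses are 6 bytes")
    frame[0:6] = server_mac
```

Leaving everything past the Ethernet header byte-for-byte identical is the whole point of the technique: the operation is cheap enough to avoid processor intervention on the fast path.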
An embodiment of network flow switch 205 (FIG. 2) is illustrated by the block diagram of FIG. 4A. Network flow switch 205 has a CPU board 400 and four ethernet cards 415, 416, 417 and 418 connected by a PCI bus 410. CPU board 400, in turn, has a CPU 402, a memory 404, and a memory
controller 406 for controlling access to the memory 404. Each of ethernet cards 415, 416, 417 and 418 has an ethernet controller and two input/output ports 411 and 413.
A network flow switch according to one embodiment of the invention can be constructed entirely from off-the-shelf ASICs (Application Specific Integrated Circuits), controlled by a general purpose CPU executing a software program. Since many commercially available Ethernet switches provide general purpose CPUs for switch management (e.g., for executing SNMP and IEEE 802.1D Spanning Tree Protocols), a network switch according to an embodiment of the invention can be easily implemented on such hardware platforms. The only requirement is that the ASIC be able to support some form of "CPU intervention" triggered when a packet with a particular destination Data Link Layer address is routed through the network flow switch. ASICs that support this form of CPU intervention are available from, among others, Galileo Technology Ltd. of Karmiel, Israel, MMC Networks, Inc. of Sunnyvale, Calif. and I-Cube, Inc. of Campbell, Calif.
The process of routing a packet 300 (FIG. 3A) received by one of network routers 260, 270 or 280 to one of IP servers 210, 220, 230, 240 or 250 of FIG. 2 is illustrated by the flow diagram of FIG. 4B. Initially, a packet is received on a port of one of ethernet cards 415, 416, 417 or 418, in stage 420. In stage 425, ethernet controller 412 then checks a CPU intervention bit to determine whether the packet needs to be sent to the CPU board 400 for further processing. In such a case the packet is transferred to CPU board 400 over PCI bus 410 and stored in memory 404 by memory controller 406, in stage 430. If the CPU intervention bit is not set, however, the processing proceeds to stage 445. Stage 435 performs an optional load balancing operation to determine which of IP servers 210, 220, 230, 240 or 250 packet 300 is to be routed to. The load balancing operation of stage 435 attempts to divide packets to be processed among the IP servers according to the capacity and the current utilization of each server. A load balancing scheme suitable for use in the present invention is described in a related application titled "DYNAMIC LOAD BALANCER FOR MULTIPLE NETWORK SERVERS" by Sajit Bhaskaran and Abraham Matthews, having Ser. No. 08/992,038, which is herein incorporated by reference in its entirety. Stage 440 then rewrites the Data Link Layer destination address field of packet 300 to indicate which of IP servers 210, 220, 230, 240 or 250 packet 300 is to be routed to. Finally, the packet is transferred to the one of ethernet cards 415, 416, 417 or 418 to which the IP server specified by the Data Link Layer destination address field of packet 300 is connected, in stage 445.
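Stages 435-445 can be condensed into a few lines. The dictionaries and the least-loaded selection below are our own simplifications for illustration; the patent itself defers the actual load-balancing policy to the referenced application Ser. No. 08/992,038.

```python
def route_inbound(packet: dict, servers: list, load: dict) -> int:
    """Pick a server (optional load balancing, stage 435), rewrite the
    destination MAC (stage 440), and return the egress port (stage 445)."""
    target = min(servers, key=lambda s: load[s["mac"]])
    packet["dst_mac"] = target["mac"]  # only the Data Link Layer changes
    return target["port"]
```

Note that the packet's IP fields never appear in this path; the selection result is expressed entirely as a Data Link Layer rewrite plus a port choice.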
The process of routing a packet 300 (FIG. 3A) from one of IP servers 210, 220, 230, 240 or 250 to one of network routers 260, 270 or 280 (FIG. 2) is illustrated by the flow diagram of FIG. 4C. Initially, a packet is received
