throbber
Ulllted States Patent [19]
`Adelman et al.
`
`US006078957A
`[11] Patent Number:
`[45] Date of Patent:
`
`6,078,957
`Jun. 20, 2000
`
`5,832,222 11/1998 DZiadosZ et al. ..................... .. 709/216
`5,848,233 12/1998 Radia et al. .......................... .. 713/201
`$151991 :1 ‘111- -
`8/1999 Chung '
`
`oura e a .
`
`.
`
`,
`
`,
`
`579407597
`
`[54] METHOD AND APPARATUS FOR A TCP/IP
`LOAD BALANCING AND FAILOVER
`PROCESS IN AN INTERNET PROTOCOL (IP)
`NETWORK CLUSTERING SYSTEM
`
`[75] Inventors: Kenneth Allen Adelman, Corralitos;
`David Lyon Kashtan, La Selva Beach;
`William L_ Palter; Derrel] I)_ Piper,
`II, both of Santa Cruz, all of Calif
`
`[73] Assignee: Network Alchemy, Inc., Santa Cruz,
`Calif
`
`[21] APPL No. 09/197 018
`’
`NOV- 20, 1998
`[22] Filed:
`[51] Int. c1.7 .................................................... .. G06F 13/00
`5
`U S C]
`09 224
`I:
`.-
`-
`- ............................................................ .. 7 /
`[58] Fleld 0f Search ................................... .. 709/200, 201,
`709/209> 221’ 223’ 224’ 226’ 238
`C, d
`R f
`e erences lte
`
`56
`I
`I
`
`OTHER PUBLICATIONS
`
`Andresen, D., et al., “Toward a Scalable Distributed WWW
`Server on Workstation Clusters,” Journal of Parallel and
`Distributed Computing, Article No. PC971305, vol. 42,
`1997 at 91—100
`Cardellini, V., et al., “Efficient State Estimators for Load
`Control Policies in Scalable Web Server Clusters,” 22nd
`Annual International Computer Software & Applications
`Conference (Compsac ’98), Aug. 19—21, 1998 at 449—455.
`CardoZa, W., et al., “Overview of Digital UNIX cluster
`System Archltecmre’ Proceedmgs of COMPCON 96’ Feb'
`25—28, 1996 at 254—259.
`Chun,
`et 81', “Virtual Network Transport Protocols for
`Myrinet,” IEEE Micro, Jan/Feb' 1998 at 53_63~
`Cisco Systems Inc., “LocalDirector Hot—Standby Faiilover:
`Product Overview,” www.cisco.com/univercd/cc/td/doc/
`product/iaabu/localdir/ld20rns/ldicgd/ld3ich5.htm, 1998.
`
`Us‘ PATENT DOCUMENTS
`
`(List continued on neXt page.)
`
`571387615
`5,301,337
`
`ggrzman et al. .......................
`8/1992 Lampgggtzlimj
`4/1994 Wells et al. .... ..
`
`370/400
`709/104
`
`Primary Examiner I loustafa M‘ Meky
`Attorney, Agent, or Firm—l.\/Iorrison & Foerster LLP; Erwin
`J - Basinski
`
`5,448,723
`
`9/1995 Rowett . . . . . . . . .
`
`. . . . . . .. 714/4
`
`5,553,239
`9/1996 Heath et al. .......................... .. 713/201
`5,586,121 12/1996 Moura et al. ......................... .. 370/404
`
`[57]
`ABSTRACT
`The present invention is a method and apparatus for moni_
`
`5,612,865
`
`3/1997 Dasgupta . . . . .
`
`. . . . .. 700/79
`
`709/217
`5 666 486 9/1997 Al?eri et al
`709024
`576947547 12/1997 Subramania'n
`5j699:500 12/1997 Dasgupta _______________________ __ 714/1
`5,708,659
`1/1998 Rostoker et a1,
`370/392
`5,712,981
`1/1998 McKee et al.
`709/241
`370/227
`5,737,311
`4/1998 Wyld ------------------ -
`5,778,185
`7/1998 Gfegerson et ‘11-
`~
`709/226
`5’793’968
`8/1998 Gfegerson et a1‘
`709/209
`giiscgcil'et' ' ' ' ' ' ' '
`370/236
`5js2sjs55 10/1998 Moura et al.
`5,828,876 10/1998 Fish et al. ................................. .. 707/1
`
`' ' '
`
`.
`
`.
`
`.
`
`.
`
`toring packet loss activity in an Internet Protocol (IP)
`network clustering system which can provide a useful,
`discrete and tangible mechanism for controlled failover of
`the TCP/IP network cluster system. An adaptive interval
`value is determined as a function of the average packet loss
`in the system, and this adaptive interval value used to
`determine when a cluster member must send a neXt kee
`palive message to all other cluster members, and wherein the
`keepalive message is used to determine network packet loss.
`
`25 Claims, 12 Drawing Sheets
`
`MASTER GOT
`CLIENT KEEPALIVE
`
`0
`83
`
`YES
`
`THIS
`CLIENT IN MY
`CLUSTER
`'?
`
`K 833
`.1
`SEND EX'I
`CLUSTER
`
`]
`EXIT
`
`834
`
`{-835
`CALCULATE AND STORE
`PACKET LOSS AVERAGE
`(USING SEQUENCE NUMBER
`OF KEEPALIVE AND ADAPTIVE
`KEEPALIVE INTERVAL)
`
`837
`[
`RESET WATCHDOG
`FOR THIS CLIENT
`
`y
`
`834
`
`EXIT
`
`IBM / Softlayer v. ZitoVault
`Ex. 1004 / Page 1 of 22
`
`

`
`6,078,957
`Page 2
`
`OTHER PUBLICATIONS
`
`Damani, O., et al., “ONE—IP: techniques for hosting a
`serviceon a cluster of machines,” Computer Networks and
`ISDN Systems, vol. 29, 1997 at 1019—1027.
`Fox, A., et al., “Cluster—Based Scalable Network Services,”
`16th ACM Symposium on Operating systems Principles,
`Oct. 5—8, 1997 at 78—91.
`Ghormley, D.P., et al. “GLUnix: A Global Layer Unix for a
`Network of Workstations,” Software—Practice and Experi
`ence, vol. 28, No. 9, Jul. 25, 1998 at 929—961.
`Huang, Y. et al., “Software Implemented Fault Tolerance:
`Technologies and Experience,” 23rd International Sympo
`sium on Fault—Tolerant Computing, Jun. 22—24, 1993 at
`2—9.
`Hunt, G.D.H., et al., “Network Dispatcher: a connection
`router for scalable Internet services,” Computer Networks
`and ISDN Systems, vol. 30, 1998 at 347—357.
`Hwang, K., et al., Scalable Parallel Computing, WCB
`McGraw—Hill, 1998 at 366—377, 453—564.
`KurcewicZ, M., et al., “A Distributed WWW Cache,” Com
`puter Networks and ISDN Systems, vol. 30, 1998 at
`2261—2267.
`
`Mendiratta, V.B., “Reliability Analysis of Clustered Com
`puting Systems,” 9th International Symposium on Software
`Reliability Engineering, Nov. 4—7, 1998 at 268—272.
`Parker, T., “QualixHA+,” Unix Review, Mar. 1998 at 59—61.
`Short, R., et al., “Windows NT Clusters for Availability and
`Scalabilty,” Proceedings, IEEE COMPCON ’97, Feb.
`23—26, 1997 at 8—13.
`Taschek, J., “ZD Internet Lab; A Well—Balanced Web,
`”www.Zdnet.com/icom/Zdlabs/load.balance/, Feb. 23, 1998.
`Thaler, D.G., et al., “Using Name—Based Mappings to
`Increase Hit Rates,” IEEE/ACM Transactions on Network
`ing, vol.6, No.1, Feb. 1998 at 1—14.
`Valence Research, Inc., “Convoy Cluster Software; Product
`Overview,” www.valence.com/convoy/main.html.
`Venkataraman, S., et al., “Memory Management for Scal
`able Web Data Servers,” 13th International Conference on
`Data Engineering, Apr. 7—11, 1997 at 510—519.
`Vogels, W., et al., “The Design and Architecture of the
`Microsoft Cluster Service—A Practical Approach to
`High—Availability and Scalability,” 28th annual Interna
`tional Symposium on Fault—Tolerant Computing, Jun.
`23—25, 1998 at 422—431.
`
`Ex. 1004 / Page 2 of 22
`
`

`
`U.S. Patent
`
`Jun. 20,2000
`
`Sheet 1 0f 12
`
`6,078,957
`
`TYPICAL
`100
`NTERNET NETWORK
`CONFIGURATION
`
`BRANCH
`OFFICE
`
`I105
`
`a
`
`GATEWAY/
`TUNNEL
`SERVER
`
`CLIENTS
`
`'
`
`107
`
`GATEWAY /
`TU N N EL
`SERVER
`/111
`113
`F
`
`112
`F
`
`114
`F
`
`SERVER
`1
`
`SERVER
`2
`
`SERVER
`3
`
`>/115
`
`r116
`
`r117
`
`r118
`LOCAL CLIENTS
`
`Ex. 1004 / Page 3 of 22
`
`

`
`U.S. Patent
`
`Jun. 20,2000
`
`Sheet 2 0f 12
`
`6,078,957
`
`TYPICAL GENERAL 200
`PURPOSE COMPUTER/ \
`CLUSTER-MEMBER
`CONFIGURATION
`
`21s
`F
`DISPLAY
`
`M
`
`K201
`K203
`K205
`1/0
`5
`CPU I207
`I209
`
`MEMORY
`
`\
`
`[211
`MFELGSEY
`CARD
`
`K215
`(EJTHER
`
`NITS
`
`V CDROM I217
`DRIvE
`
`219
`
`221
`
`PROGRAM
`
`223
`
`DISK
`STORAGE
`
`7
`
`KEYBOARD
`K225
`
`OTHER
`UNITS
`
`FIG--2
`
`f303
`
`FLASH MEMORY - 300
`CONTENTS \
`CRYPTOGRAPHICALLY f301
`SIGNED KERNEL
`CONFIGURATION
`FILES
`WHERE TO LOG
`ERROR MESSAGES
`AUTHENTICATION
`CERTIFICATE
`
`f305
`
`f307
`
`Ex. 1004 / Page 4 of 22
`
`

`
`U.S. Patent
`
`Jun. 20,2000
`
`Sheet 3 0f 12
`
`6,078,957
`
`400 TYPICALIP
`/ NETWORK
`CLUSTER
`
`107
`
`INTERNET
`
`[-401
`
`4
`
`r03
`
`4
`
`r05
`
`4 7
`
`f0 r“Huang
`
`i
`
`40
`
`UNIT1
`
`UN|T2
`
`
`
`UNITB 411
`
`UNITN
`
`4
`
`._
`
`OTHER
`NETWORK
`
`UNITS
`
`Ex. 1004 / Page 5 of 22
`
`

`
`U.S. Patent
`
`Jun. 20, 2000
`
`Sheet 4 0f 12
`
`6,078,957
`
`GENERAL MEMORY MAP
`TYPICAL IP NETWORK
`CLUSTER MEMBER
`
`500 \
`OPERATING
`SYSTEM KERNEL
`
`f501
`
`503
`f
`
`TCP / IP
`STACK
`CLUSTER MANAGEMENT [505
`ROuTINES
`APPLICATION
`#1
`APPLICATION
`#2
`
`f507
`
`f509
`
`f 511
`
`f513
`
`APPLICATION
`#3
`APPLICATION
`#4
`WORK ASSIGNMENT f515
`TABLE (MASTER)
`THIS uNIT APPLICATION f5 17
`STATE TABLE
`OTHER UNITS
`APPLICATION
`STATE TABLE
`INCOMING
`MSG STORE
`
`f519
`
`f521
`
`DATA HANDLERS - I523
`OTHER UNITS
`
`FIG._5
`
`Ex. 1004 / Page 6 of 22
`
`

`
`U.S. Patent
`
`Jun. 20,2000
`
`Sheet 5 0f 12
`
`6,078,957
`
`601
`
`CLUSTER
`ESTABLISHMENT
`
`START
`
`7
`609
`[60 GOT “EXIT
`REQUEST” / :
`CL'ENT A SUCCESS)
`
`,
`JOINING
`SEND JOIN REQUESTS =
`
`(603
`
`I
`
`611j
`
`MASTER KEEPALIVE
`
`WATCHDOG
`615\
`
`1
`
`MAST
`
`617/‘
`613/"
`OFFER MASTER:
`SEND "OFFER MASTER” PACKETS
`
`621\
`
`619 SUCCESS
`\
`"
`ARPS:
`SENDARPS
`
`T "
`
`642/
`
`623\
`7
`
`"
`MASTER
`
`GOT “EXIT”
`RE UEST”
`O
`
`641
`f V
`
`625 GOT “MASTER”
`KEEPALIVE"
`
`WIN
`TIE BREAKER
`'?
`YES
`
`635
`
`637 \
`
`SEND ”OTHER MASTER EXISTS”
`
`A
`
`FIG__ 6
`
`SEND ARPS
`
`Ex. 1004 / Page 7 of 22
`
`

`
`U.S. Patent
`
`Jun. 20,2000
`
`Sheet 6 0f 12
`
`6,078,957
`
`FIG... 7
`
`FIG._ 7A
`
`j 700
`
`TCP FAILOVER STATE
`
`/ 702
`Initial State (Only need to send once)
`
`Source IP Address + Port
`Destination IP Address + Port
`Maximum Segment Size
`MSS + Options Size
`
`/ 701
`Essential State (Send on each state change)
`
`Flags: No Delay, No Options, Request Window Scaling,
`Receive Window Scaling, Request Timestamp,
`Receive Timestamp, Permit Selective ACK
`Send “Next” Sequence Number
`Window Update Segment Sequence Number
`Window Update Segment Acknowledgement Number
`Send Window
`Receive “Next" Sequence Number
`Receive “Advertized“ Window
`Send Window Scaling
`Receive Window Scaling
`Recent Timestamp Echo Data
`
`Ex. 1004 / Page 8 of 22
`
`

`
`U.S. Patent
`
`Jun. 20, 2000
`
`Sheet 7 0f 12
`
`6,078,957
`
`/ 703
`Calculable State (Don’t Send)
`
`State = ESTABLISHED
`Retransmit Time = None
`Probe Time = Now
`TCP Keepalive Time : Now
`ZMSL‘Fme = None
`Retransmit Time Shift = 0
`Current Retransmit = lnitial Value
`Consecutive Duplicate Acks Received = 0
`Force Output: 0;
`Send “Unacknowledged" Sequence Number: Send “Next” Sequence
`Number
`Send “Urgent Pointer" : Send “Unacknowledged‘ Sequence Number
`Highest Sequence Number Sent : Send “Next” Sequence Number
`Send Initial Segment Sequence Number: 0
`Receive Window : Amount of space left in socket receive buffer
`Receive “Urgent Pointer” : Receive “Next” Sequence Number
`Receive Initial Segment Sequence Number = O
`Congestion Control Window = Initial Value
`Congestion Control Window Linear/Exponential Threshold : Initial Value
`Inactivity Time = 0
`Estimated Round Trip Time = 0
`Sequence Number Being Timed = O
`Smoothed Round Trip Time : Initial Value
`Variance In Round Trip Time = Initial Value
`Minimum Round Trip Time Allowed = initial Value
`Largest Window Offered by Peer = 0
`Out Of Band Data = None
`Send Pending Window Scaling = Send Window Scaling
`Receive Pending Window Scaling : Receive Window Scaling
`Timestamp Echo Data Update Time = 0
`Last Ack Sent Sequence Number: Receive “Next” Sequence Number
`Send Connection Count: 0
`Receive Connection Count = 0;
`Connection Duration = 0;
`Number of Round Trip Time Samples : 0;
`Number of TCP Keepalive Probes = Initial Value
`Interval Between TCP Keepalive Probes = Initial Value
`'Fme Before First TCP Keepalive Probe = Initial Value
`Maximum Idle Time : Initial Value
`
`FIG._ 7B
`
`Ex. 1004 / Page 9 of 22
`
`

`
`U.S. Patent
`
`Jun. 20,2000
`
`Sheet 8 0f 12
`
`6,078,957
`
`Rpm
`
`l/mim.
`
`Sm
`A,
`
`
`
`EmEIwDJO .EXm
`
`
`
`"NEIL-0., 02mm,
`
`new
`
`mom
`
`Ex. 1004 / Page 10 of 22
`
`

`
`U.S. Patent
`
`Jun. 20, 2000
`
`Sheet 9 0f 12
`
`6,078,957
`
`833
`/
`SEND “EXIT
`CLUSTER”
`
`I
`EXIT
`
`834
`
`MASTER GOT
`CLIENT KEEPALIVE
`
`830
`
`THIS
`CLIENT IN MY
`CLUSTER
`'2
`
`FIG._8B
`
`835
`K
`CALCULATE AND STORE
`PACKET LOSS AVERAGE
`(USING SEQUENCE NUMBER
`OF KEEPALIVE AND ADAPTIVE
`KEEPALIVE INTERVAL)
`
`837
`" r
`RESET WATCHDOG
`FOR THIS CLIENT
`
`‘
`
`834
`
`EXIT
`
`850
`
`MASTER
`EVENTS
`
`PERIODIC TIMER (ADAPTIVE
`TO NETWORK PACKET LOSS)
`
`"
`
`K 851
`
`BROADCAST MASTER
`KEEPALIVE CONTAINING
`CLUSTER MEMBER LIST AND
`ADAPTIVE KEEPALIVE
`INTERVAL
`
`FIG._8C
`
`Ex. 1004 / Page 11 of 22
`
`

`
`U.S. Patent
`
`Jun. 20, 2000
`
`Sheet 10 0f 12
`
`6,078,957
`
`FIG 8 D
`. _
`
`PERIODIC TIMER
`(2 SECONDS)
`
`855
`
`K 857
`‘
`CALCULATE LOAD
`DIFFERENCE BETWEEN
`MOST LOADED AND LEAST
`LOADED MEMBER
`
`859
`
`WOULD
`MOVING ONE
`WORK UNIT LEAVE
`THE LEAST LOADED
`MEMBER WITH A LOAD s
`THE MOST LOADED
`MEMBER
`7
`
`865
`‘L
`
`863
`D
`SEND “WORK DE-ASSIGN"
`REQUEST TO MOST LOADED
`MEMBER WITH THE LEAST
`LOADED MEMBER AS TARGET
`
`/ 861
`
`IGNORE
`
`|
`
`v
`EXIT
`
`I
`
`WATCHDOG TIMER FOR )[870
`FIG-_8E C ACLIENT EXPIRES
`
`871
`f
`I
`DELETE CLIENT FROM
`CLUSTER DATA STRUCTURE
`
`FIG 8 H PERIODIC TIMER (ADAPTIVE
`._
`TO NETWORK PACKET LOSS)
`
`910
`
`SEND CLIENT KEEP ALIVE TO
`MASTER CONTAINING
`MONOTONICALLY
`INCREASING SEQUENCE #
`(FOR MEASURING NETWORK
`PACKET LOSS)
`
`‘
`
`Ex. 1004 / Page 12 of 22
`
`

`
`U.S. Patent
`
`Jun. 20, 2000
`
`Sheet 11 0f 12
`
`6,078,957
`
`875
`
`MASTER GETS CLIENT
`JOIN REQUEST
`877 K
`RESPOND WITH
`NAK REASON
`“OPERATION IN
`PROGRESS”
`
`NOTIFY
`APPLICATIONS
`OF JOIN
`REQUEST
`
`881
`
`APPLICATIONS FINISH
`WITH JOIN REQUEST
`
`JOIN
`ALLOWED
`?
`YES
`K887
`
`SEND
`
`ACKNOWLEDGMENT
`
`885 r
`SEND
`NAK
`AND
`REASON
`
`(
`
`V
`
`EXIT
`
`)
`
`WATCHING TIMER FOR 9'20
`MASTER EXPIRES
`
`K922
`‘
`EXIT CLUSTER AND
`START JOINING
`
`EXIT
`
`FIG._8I
`
`CLIENT
`EVENTS
`890
`
`CLIENT GOT
`MASTER KEEPALIVE
`
`UPDATE
`ADAPTIVE
`KEEPALIVE
`NTERvAL
`(CALCULATED
`BY MASTER)
`
`893
`
`HAVE
`WE LOST ANY
`MEM'SERS
`YES
`895 K
`NOTIFY
`APPLICATIONS
`
`897
`
`HAVE
`WE ADDED NEW
`MEMBERS
`?
`
`NOTIFY
`
`RESET
`WATCHDOG
`FOR MASTER
`
`/899
`
`II
`
`EXIT
`
`I
`
`I
`
`FIG._8G
`
`Ex. 1004 / Page 13 of 22
`
`

`
`U.S. Patent
`
`Jun. 20,2000
`
`Sheet 12 0f 12
`
`6,078,957
`
`RECEIVE AN
`IP PACKET
`
`901
`
`IP PACKET
`INPUT HANDLING
`
`CLUSTER
`IP ADDRESS
`
`907
`
`SHOULD
`PACKET BE
`FORWQRDED
`
`'
`
`YES f 913
`
`APPLY FILTER
`TO CLASSIFY
`WORK
`
`APPLY FORWARDING
`FILTER TO CLASSIFY
`WORK
`
`0
`
`915
`
`DO WE
`OWN TH|S
`WORK SET
`7
`
`YES
`917
`
`919 f
`PROCESS
`LOCALLY
`
`/939
`
`N
`
`O 921
`
`ARE
`WE MASTER
`
`NO
`923
`
`YES
`
`926
`
`927
`
`925 f
`DROP
`PACKET
`
`K- 909
`
`PROCEss
`LOCALLY
`
`‘V
`
`945
`
`EXIT
`
`EXIT
`
`I
`
`936
`
`931
`
`YES
`
`935
`
`IS THIS
`WORK sET
`ASSIGNED
`'2
`
`YES
`929
`
`ACCEPTED THIS YES
`WORK SET
`933
`v
`
`928
`
`NO
`937
`F
`ASSIGN WORK SET
`_ TO LEAST LOADED
`MEMBER
`
`940 NO
`941
`F
`SAVE PACKET UNTIL
`WORK SETACCEPTED
`
`" r 943
`
`FORWARD PACKET
`TO MEMBER
`
`945
`
`‘
`EXIT
`
`COULD
`MEMBER SEE
`PAQKET
`-
`
`N0
`
`FIG._ 9
`
`Ex. 1004 / Page 14 of 22
`
`

`
`1
`METHOD AND APPARATUS FOR A TCP/IP
`LOAD BALANCING AND FAILOVER
`PROCESS IN AN INTERNET PROTOCOL (IP)
`NETWORK CLUSTERING SYSTEM
`
`CROSS-REFERENCE TO RELATED
`APPLICATIONS
`
`This application is related to application Ser. No. 09/196,
`941 entitled “Method and Apparatus for an Internet Protocol
`(IP) NetWork Clustering System,” ?led Nov. 20, 1998, still
`pending.
`
`TECHNICAL FIELD
`
`This invention relates to the ?eld of Computer Systems in
`the general NetWork Communications sector. More
`speci?cally, the invention is a method and apparatus for a
`TCP/IP Load Balancing & Failover system in an Internet
`Protocol (IP) NetWork clustering system.
`
`10
`
`15
`
`BACKGROUND ART
`
`As more and more businesses develop electronic com
`merce applications using the Internet in order to market and
`to manage the ordering and delivery of their products, these
`businesses are searching for cost-effective Internet links that
`provide both security and high availability. Such mission
`critical applications need to run all day, every day With the
`netWork components being highly reliable and easily scal
`able as the message traf?c groWs. National carriers and local
`Internet Service Providers (ISPs) are noW offering Virtual
`Private NetWorks (VPN)—enhanced Internet-based back
`bones tying together corporate Workgroups on far-?ung
`Local Area NetWorks (LANs)—as the solution to these
`requirements. Similarly other companies are offering clus
`tered systems as a means to provide netWork scalability, and
`adaptive load balancing innetWork systems. HoWever, TCP/
`IP based systems present problems of message traf?c loss
`and netWork session reinitiation if ef?cient failover pro
`cesses are not implemented.
`A number of companies have recently announced current
`or proposed VPN products and/or systems Which variously
`support IPSec, IKE (ISAKMP/Oakley) encryption-key
`management, as Well as draft protocols for Point-to-Point
`Tunneling protocol (PPTP), and Layer 2 Tunneling protocol
`(L2TP) in order to provide secure traffic to users. Some of
`these products include IBM’s NWays Multiprotocol Routing
`ServicesTM 2.2, Bay Networks OptivityTM and CentillionTM
`products, Ascend Communication’s MultiVPNTM package,
`Digital Equipment’s ADI VPN product family, and Indus
`River’s RiverWorksTM VPN planned products. HoWever,
`none of these products are knoWn to offer capabilities Which
`minimiZes delay and session by a controlled fail-over pro
`cess.
`These VPNs place enormous demands on the enterprise
`netWork infrastructure. Single points of failure components
`such as gateWays, ?reWalls, tunnel servers and other choke
`points that need to be made highly reliable and scaleable are
`being addressed With redundant equipment such as “hot
`standbys” and various types of clustering systems.
`For eXample, CISCOTM Inc. noW offers a neW product
`called LocalDirectorTM Which functions as a front-end to a
`group of servers, dynamically load balances TCP traf?c
`betWeen servers to ensure timely access and response to
`requests. The LocalDirector provides the appearance, to end
`users, of a “virtual” server. For purposes of providing
`continuous access if the LocalDirector fails, users are
`
`45
`
`55
`
`65
`
`6,078,957
`
`2
`required to purchase a redundant LocalDirector system
`Which is directly attached to the primary unit, the redundant
`unit acting as a “hot” standby. The standby unit does no
`processing Work itself until the master unit fails. The
`standby unit uses the failover IP address and the secondary
`Media Access Control (MAC) address (Which are the same
`as the primary unit), thus no Address Resolution Protocol
`(ARP) is required to sWitch to the standby unit. HoWever,
`because the standby unit does not keep state information on
`each connection, all active connections are dropped and
`must be re-established by the clients. Moreover, because the
`“hot standby” does no concurrent processing it offers no
`processing load relief nor scaling ability.
`ValenceTM Research Inc. (recently purchased by
`Microsoft® Corporation) offers a softWare product called
`Convoy ClusterTM (Convoy). Convoy installs as a standard
`WindoWs NT netWorking driver and runs on an eXisting
`LAN. It operates in a transparent manner to both server
`applications and TCP/IP clients. These clients can access the
`cluster as if it is a single computer by using one IP address.
`Convoy automatically balances the netWorking traf?c
`betWeen the clustered computers and can rebalance the load
`Whenever a cluster member comes on-line or goes off-line.
`HoWever this system appears to use a compute intensive and
`memory Wasteful method for determining Which message
`type is to be processed by Which cluster member in that the
`message source port address and destination port address
`combination is used as an indeX key Which must be stored
`and compared against the similar combination of each
`incoming message to determine Which member is to process
`the message. Moreover, this system does not do failover.
`There is a need in the art for an IP netWork cluster system
`Which can easily scale to handle the exploding bandWidth
`requirements of users. There is a further need to maXimiZe
`netWork availability, reliability and performance in terms of
`throughput, delay and packet loss by making the cluster
`overhead as ef?cient as possible, because more and more
`people are getting on the Internet and staying on it longer. A
`still further need eXists to provide a reliable failover system
`for TCP based systems by ef?ciently saving the state infor
`mation on all connections so as to minimiZe packet loss and
`the need for reconnections.
`Computer cluster systems including “single-system
`image” clusters are knoWn in the art. See for eXample,
`“Scalable Parallel Computing” by Kai HWang & ZhiWei Xu,
`McGraW-Hill, 1998, ISBN 0-07-031798-4, Chapters 9 & 10,
`Pages 453—564, Which are hereby incorporated fully herein
`by reference. Various Commercial Cluster System products
`are described therein, including DEC’s TruClustersTM
`system, IBM’s SPTM system, Microsoft’s WolfpackTM sys
`tem and The Berkeley NOW Project. None of these systems
`are knoWn to provide ef?cient IP NetWork cluster capability
`along With combined scalability, load-balancing and con
`trolled TCP fail-over.
`
`SUMMARY OF THE INVENTION
`
`The present invention overcomes the disadvantages of the
`above described systems by providing an economical, high
`performance, adaptable system and method for netWork
`traffic monitoring and controlled failover in an IP NetWork
`cluster.
`The present invention is a method and apparatus for
`monitoring packet loss activity in an Internet Protocol (IP)
`netWork clustering system Which can provide a useful,
`discrete and tangible mechanism for controlled failover of
`the TCP/IP netWork cluster system. An adaptive interval
`
`Ex. 1004 / Page 15 of 22
`
`

`
`6,078,957
`
`3
`value is determined as a function of the average packet loss
`in the system, and this adaptive interval value used to
`determine When a cluster member must send a neXt kee
`palive message to all other cluster members, and Wherein the
`keepalive message is used to determine network packet loss.
`The present invention includes a cluster apparatus com
`prising a plurality of cluster members, With all cluster
`members having the same internet machine name and IP
`address, and each member having a general purpose
`processor, a memory unit, a program in the memory unit, a
`display and an input/output unit; and the apparatus having a
`?rst mechanism Which determines an adaptive interval value
`representing a time interval as a function of average packet
`loss in the system and a second mechanism for using the
`adaptive interval value to determine When a cluster member
`is to send a neXt keepalive message Wherein the keepalive
`message is used to determine netWork packet loss.
`The present invention further includes a method for
`monitoring netWork activity and for controlled failover in a
`TCP/IP Network cluster, the cluster having a plurality of
`cluster members, each cluster member having a processor, a
`memory, an input/output section, a display screen, and
`programs in the memory for operating the cluster member,
`the method comprising steps of determining an adaptive
`interval value representing a time interval as a function of
`average packet loss in the cluster, and using the adaptive
`interval value to determine When a cluster member is to send
`a neXt keepalive message, the keepalive message being used
`to determine a netWork packet loss estimate.
`Other embodiments of the present invention Will become
`readily apparent to those skilled in these arts from the
`folloWing detailed description, Wherein is shoWn and
`described only the embodiments of the invention by Way of
`illustration of the best mode knoWn at this time for carrying
`out the invention. The invention is capable of other and
`different embodiments some of Which may be described for
`illustrative purposes, and several of the details are capable of
`modi?cation in various obvious respects, all Without depart
`ing from the spirit and scope of the present invention.
`
`15
`
`25
`
`4
`tion for purposes of explanation, speci?c data and con?gu
`rations are set forth in order to provide a thorough under
`standing of the present invention. In the presently preferred
`embodiment the IP NetWork cluster is described in terms of
`a VPN tunnel-server cluster. HoWever, it Will be apparent to
`one skilled in these arts that the present invention may be
`practiced Without the speci?c details, in various applications
`such as a ?reWall cluster, a gateWay or router cluster, etc. In
`other instances, Well-knoWn systems and protocols are
`shoWn and described in diagrammatical or block diagram
`form in order not to obscure the present invention unnec
`essarily.
`
`OPERATING ENVIRONMENT
`The environment in Which the present invention is used
`encompasses the general distributed computing scene Which
`includes generally local area netWorks With hubs, routers,
`gateWays, tunnel-servers, applications servers, etc. con
`nected to other clients and other netWorks via the Internet,
`Wherein programs and data are made available by various
`members of the system for eXecution and access by other
`members of the system. Some of the elements of a typical
`internet netWork con?guration are shoWn in FIG. 1, Wherein
`a number of client machines 105 possibly in a branch office
`of an enterprise, are shoWn connected to a GateWay/hub/
`tunnel-server/etc. 106 Which is itself connected to the inter
`net 107 via some internet service provider (ISP) connection
`108. Also shoWn are other possible clients 101, 103 similarly
`connected to the internet 107 via an ISP connection 104,
`With these units communicating to possibly a home of?ce via
`an ISP connection 109 to a gateWay/tunnel-server 110 Which
`is connected 111 to various enterprise application servers
`112, 113, 114 Which could be connected through another
`hub/router 115 to various local clients 116, 117, 118.
`The present IP NetWork cluster is made up of a number of
`general purpose computer units each of Which includes
`generally the elements shoWn in FIG. 2, Wherein the general
`purpose system 201 includes a motherboard 203 having
`thereon an input/output (“I/O”) section 205, one or more
`central processing units (“CPU”) 207, and a memory section
`209 Which may have a ?ash memory card 211 related to it.
`The U0 section 205 is connected to a keyboard 226, other
`similar general purpose computer units 225, 215, a disk
`storage unit 223 and a CD-ROM drive unit 217. The
`CD-ROM drive unit 217 can read a CD-ROM medium 219
`Which typically contains programs 221 and other data. Logic
`circuits or other components of these programmed comput
`ers Will perform series of speci?cally identi?ed operations
`dictated by computer programs as described more fully
`beloW.
`Flash memory units typically contain additional data used
`for various purposes in such computer systems. In the
`preferred embodiment of the present invention, the ?ash
`memory card is used to contain certain unit “personality”
`information Which is shoWn in FIG. 3. Generally the ?ash
`card used in the current embodiment contains the folloWing
`type of information:
`Cryptographically signed kernel—(301)
`Con?guration ?les (such as cluster name, speci?c unit IP
`address, cluster address, routing information con?guration,
`etc.)—(303)
`Pointer to error message logs—(305)
`Authentication certi?cate—(307).
`Security policies (for eXample, encryption needed or not,
`etc.)—(309)
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`The features and advantages of the system and method of
`the present invention Will be apparent from the folloWing
`description in Which:
`FIG. 1 illustrates a typical Internet netWork con?guration.
`FIG. 2 illustrates a representative general purpose
`computer/cluster-member con?guration.
`FIG. 3 illustrates a representative memory map of data
`contained on a related Flash Memory card.
`FIG. 4 illustrates a typical IP NetWork cluster
`FIG. 5 illustrates a general memory map of the preferred
`embodiment of a cluster member acting as a tunnel-server.
`FIG. 6 illustrates a ?oW-chart of the general operation of
`the cluster indicating the cluster establishment process.
`FIG. 7 illustrates an exemplary TCP state data structure.
`FIGS. 8A—8I illustrate ?oW-charts depicting the events
`Which the master processes and the events Which the non
`master cluster members (clients) must process.
`FIGS. 9 illustrates a ?oW-chart depicting the normal
`packet handling process after establishing the cluster.
`
`45
`
`55
`
`BEST MODE FOR CARRYING OUT THE
`INVENTION
`Amethod and apparatus for operating an Internet Protocol
`(IP) NetWork cluster is disclosed. In the folloWing descrip
`
`65
`
`THE INVENTION
`The present invention is an Internet Protocol (IP) clus
`tering system Which can provide a highly scalable system
`
`Ex. 1004 / Page 16 of 22
`
`

`
`6,078,957
`
`5
`Which optimiZes-throughput by adaptively load balancing its
`components, and Which minimizes delay and session loss by
`a controlled fail-over process. A typical IP cluster system of
`the preferred embodiment is shoWn in FIG. 4 Wherein the
`internet 107 is shoWn connected to a typical IP cluster 401
`Which contains programmed general purpose computer units
`403, 405, 407, 409 Which act as protocol stack processors for
`message packets received. The IP cluster 401 is typically
`connected to application servers or other similar type units
`411 in the netWork. In this ?gure it is shoWn that there may
`be any number of units in a cluster (e.g. member “n” 409)
`hoWever for purposes of further illustration the cluster Will
`be depicted as having three units, understanding that the
`cluster of the present invention is not limited to only three
`units. Also for purposes of illustration the preferred embodi
`ment Will be described as a cluster Whose applications may
`be VPN tunnel protocols hoWever it should be understood
`that this cluster invention may be used as a cluster Whose
`application is to act as a FireWall, or to act as a gateWay, or
`to act as a security device, etc.
`In the preferred embodiment of the present invention,
`each of the cluster members is a computer system having an
`Intel motherboard, tWo Intel PentiumTM processors, a 64
`megabyte memory and tWo Intel Ethernet controllers, and
`tWo HiFn cryptographic processors. The functions per
`formed by each processor are generally shoWn by reference
`to the general memory map of each processor as depicted in
`FIG. 5. Each cluster member has an Operating System
`kernel 501, TCP/IP stack routines 503 and various cluster
`management routines (described in more detail beloW) 505,
`program code for processing application #1 507, Which in
`the preferred embodiment is code for processing the IPSec
`protocol, program code for processing application #2 509,
`Which in the preferred embodiment is code for processing
`the PPTP protocol, program code for processing application
`#3 511, Which in the preferred embodiment is code for
`processing the L2TP protocol, and program code for pro
`cessing application #4 513, Which in the preferred embodi
`ment is code space for processing an additional protocol
`such as perhaps a “Mobile IP” protocol. Detailed informa
`tion on these protocols can be found through the home page
`of the IETF at “http://WWW.ietf.org”. The folloWing speci?c
`protocol descriptions are hereby incorporated fully herein by
`reference:
`“Point-to-Point Tunneling Protocol—PPTP”, Glen Zorn,
`G. Pall, K. HamZeh, W. Verthein, J. Taarud, W. Little, Jul.
`28, 1998;
`“Layer TWo Tunneling Protocol”, Allan Rubens, William
`Palter, T. Kolar, G. Pall, M. LittleWood, A. Valencia, K.
`HamZeh, W. Verthein, J. Taarud, W. Mark ToWnsley, May
`22, 1998;
`Kent, S., Atkinson, R., “IP Authentication Header,” draft
`ietf-ipsec-auth-header-07.tXt.
`Kent, S., Atkinson, R., “Security Architecture for the
`Internet Protocol,” draft-ietf-ipsec-arch-sec-07.tXt.
`Kent, S., Atkinson, R., “IP Encapsulating Security Pay
`load (ESP),” draft-ietf-ipsec-esp-v2-06.tXt.
`Pereira, R., Adams, R., “The ESP CBC-Mode Cipher
`Algorithms,” draft-ietf-ipsec-ciph-cbc-04.tXt.
`Glenn, R., Kent, S., “The NULL Encryption Algorithm
`and Its Use With IPsec,” draft-ietf-ipsec-ciph-null-O1.tXt.
`Madson, C., DorasWamy, N., “The ESP DES-CBC Cipher
`Algorithm With EXplicit IV,” draft-ietf-ipsec-ciph-des
`eXpiv-02.tXt.
`Madson, C., Glenn, R., “The Use of HMAC-MD5 Within
`ESP and AH,” draft-ietf-ipsec-auth-hmac-md5-96-03.tXt.
`
`1O
`
`15
`
`25
`
`35
`
`45
`
`55
`
`65
`
`6
`Madson, C., Glenn, R., “The Use of HMAC-SHA-1-96
`Within ESP and AH,” draft-ietf-ipsec-auth-hmac-sha196
`03.tXt.
`Harkins, D., Carrel, D., “The Internet Key EXchange
`(IKE),” draft-ietf-ipsec-isakmp-oakley-08.tXt.
`Maughan, D., Schertler, M., Schneider, M., and Turner, J .,
`“Internet Security Association and Key Management Pro
`tocol (ISAKMP),” draft-ietf-ipsec-isakmp-10.{ps,tXt}.
`H. K. Orman, “The OAKLEY Key Determination
`Protocol,” draft-ietf-ipsec-oakley-02.tXt.
`Piper, D. “The Internet IP Security Domain of Interpre
`tation for ISAKMP,” draft-ietf-ipsec-ipsec-doi-10.tXt.
`Tunneling protocols such as the Point-to-Point Tunneling
`Protocol (PPTP) and Layer 2 Tunneling Protocol (L2TP)
`although currently only “drft” standards, are eXpected to be
`con?rmed as of?cial standards by the Internet Engineering
`Task Force (IETF) in the very near future, and these proto
`cols together With the Internet Security Protocol (IPSec),
`provide the basis for

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket