(12) United States Patent
Deitz et al.

(10) Patent No.: US 6,578,158 B1
(45) Date of Patent: Jun. 10, 2003

(54) METHOD AND APPARATUS FOR PROVIDING A RAID CONTROLLER HAVING TRANSPARENT FAILOVER AND FAILBACK
`
(75) Inventors: William G. Deitz, [city illegible], CO (US); Keith Short, Lafayette, CO (US)

(73) Assignee: [illegible]
`
( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/429,523
(22) Filed: Oct. 28, 1999

(51) Int. Cl.: [illegible]
(52) U.S. Cl.: 714/11; 714/5
(58) Field of Search: [illegible]
`
(56) References Cited

U.S. PATENT DOCUMENTS

[The cited-reference table is largely illegible in this copy. Legible cited names: Walker et al.; Holland et al.; Jones; Marks et al.; Espy et al.; Richardson; Griffith.]
Primary Examiner: Robert Beausoliel
Assistant Examiner: Marc Duncan
(74) Attorney, Agent, or Firm: Dorsey & Whitney LLP

(57) ABSTRACT

A method and apparatus for controlling a memory system 100 comprising a plurality of controllers 105 connected by a fibre channel arbitrated loop 145 to provide transparent failover and failback mechanisms for failed controllers. The controllers 105 are adapted to transfer data between a data storage system 120 and at least one host computer 110 in response to instructions therefrom. In the method, a unique identifier is provided to each controller 105. The operation of the controllers 105 is then monitored and, when a failed controller is detected, a failover procedure is performed on a surviving controller. The failover procedure includes disabling the failed controller and instructing the surviving controller to assume the identity of the failed controller. Thus, the surviving controller is capable of responding to instructions addressed to it and instructions addressed to the failed controller, and the failure of the failed controller is transparent to the host computer 110. A computer program and a computer program product for implementing the method are also provided.

25 Claims, 4 Drawing Sheets
`
`
`VMWARE-1008
`
`Page 1 of 12
`

U.S. Patent    Jun. 10, 2003    Sheet 1 of 4    US 6,578,158 B1

[Drawing sheet 1: block diagram. Legible labels: HOST COMPUTER; 155b.]

U.S. Patent    Jun. 10, 2003    Sheet 2 of 4    US 6,578,158 B1

[Drawing sheet 2: block diagram. Legible labels: HOST COMPUTER 110a; HOST COMPUTER 110b; INACTIVE; 200a; 200b; FAILOVER PORT.]

U.S. Patent    Jun. 10, 2003    Sheet 3 of 4    US 6,578,158 B1

[FIG. 3: flowchart. Box labels, in reading order: PROVIDE UNIQUE IDENTIFIER; COMMUNICATE UNIQUE IDENTIFIER; BEGIN DUAL-ACTIVE OPERATION; EXCHANGE PINGS; FAILED CONTROLLER DETECTED? (YES); DISABLE FAILED CONTROLLER; ASSUME IDENTITY OF FAILED CONTROLLER; RESPOND TO I/O INSTRUCTIONS FOR SURVIVING AND FAILED CONTROLLERS; POLL FAILED CONTROLLER; FAILED CONTROLLER REPLACED? (YES); COMMUNICATE UNIQUE IDENTIFIER; REPLACEMENT CONTROLLER ASSUMES IDENTITY OF FAILED CONTROLLER; RESUME DUAL-ACTIVE OPERATION; MEMORY SYSTEM REBOOTED? (YES). Reference numerals on the sheet: 210, 215, 225, 230, 235, 240, 250, 255, 270.]

U.S. Patent    Jun. 10, 2003    Sheet 4 of 4    US 6,578,158 B1

[FIG. 4: block diagram. Block labels: CONTROLLER; INITIALIZATION UNIT; REPLACEMENT DETECTION UNIT; FAILURE DETECTION UNIT; FAILBACK UNIT; FAILOVER UNIT; LOOP REINITIALIZATION; DISABLING; LOOP INITIALIZATION. Reference numerals as extracted from the sheet: 230, 235, 290, 295, 300, 310, 315, 320, 325.]

METHOD AND APPARATUS FOR PROVIDING A RAID CONTROLLER HAVING TRANSPARENT FAILOVER AND FAILBACK
`
FIELD OF THE INVENTION
`
`This invention pertains generally to the field of computer
`memory systems, and more particularly to a method and
`apparatus for controlling redundant arrays of independent
`disks.
`
`BACKGROUND OF THE INVENTION
`
Modern computers frequently require large, fault-tolerant memory systems. One approach to meeting this need is to provide a Redundant Array of Independent Disk drives (RAID), usually including a plurality of hard disk drives operated by a disk array controller that is coupled to a host computer. The controller provides the brains of the memory system, servicing all host requests, storing data to or retrieving it from the RAID, caching data to provide faster access, and handling drive failures without interrupting host requests. Given the importance of the controller, numerous solutions have been suggested to minimize the potential for interrupted service due to controller malfunction. One such solution calls for providing dual-active controllers having failover and failback capabilities. Dual-active controllers are a pair of controllers that are connected to each other and to all the disk drives in a RAID. In normal operation, input/output (I/O) requests from the host computer are divided between the dual-active controllers to increase the rate at which information can be transferred to or from the RAID, commonly referred to as the bandwidth of the memory system. However, in the event that one of the controllers fails, the surviving controller takes over the functions of the failed controller and begins servicing host requests addressed to the failed controller in addition to those addressed to it. The mechanism that allows this is commonly known as a failover mechanism. If the surviving controller is able to assume the functions of the failed controller without any action on the part of the host computer, for example redirecting I/O requests to the surviving controller, the failover mechanism is said to be transparent. If the failed controller can subsequently be replaced and normal operation resumed without de-energizing or reinitializing the controllers, the memory system is said to have a failback mechanism.
`
One example of the use of such dual-active controllers is described in U.S. Pat. No. 5,790,775, to Marks et al., which uses dual-active controllers connected to the host computer by a Small Computer System Interface (SCSI) bus. Typically, the controllers are also connected to a RAID comprising multiple disk drives through a number of additional SCSI buses. Each SCSI device on a bus, such as a controller or a disk drive, is assigned one bit as an identifier (SCSI ID) to permit the host computer to select a particular controller, and the controller to select a particular disk drive. Thus, the method permits a maximum of eight devices to be identified on a standard 8-bit SCSI bus. In addition, the controllers are connected to one another by a separate communications link, and each has access to a cache memory in the other. Although both controllers are connected to every disk drive in the RAID, to permit dual-active operation each disk drive is typically under primary control of one of the controllers. This is accomplished by dividing the RAID into groups of disk drives that appear to the host computer as a logical drive or unit identified by a logical unit number (LUN) and, during initialization, associating each LUN with the SCSI ID of a particular controller. In normal operation, a controller responds only to I/O requests which are addressed to it and which refer to LUNs over which it has primary control. However, if a controller fails, the remaining controller of the pair obtains configuration information, including the SCSI ID and the LUNs of the failed controller, over the communications link and begins servicing requests addressed by the host to the failed controller as well as those addressed to itself.

While the above approach has been effective in reducing interruptions in service for memory systems having dual-active controllers, it is limited by the architecture of the SCSI bus. Traditionally, SCSI buses have from eight to sixteen signal lines, which allows a maximum of from eight to sixteen SCSI devices to be interconnected by the SCSI bus at any one time. Thus, systems which use a 16-bit wide SCSI bus on the host side and 8-bit wide SCSI buses on the device side typically provide for at most six device side SCSI buses having six disk drives each. Moreover, the above approach, which relies on SCSI IDs, has not been implemented using fibre interface type controllers.

Fibre interface type controllers are coupled to a host computer through one or more fibre channels. Fibre channel is the general name of a technology using an integrated set of standards developed by the American National Standards Institute (ANSI) for high speed, serial communication between computer devices. (See for example the ANSI standard X3T11, "Fibre Channel Physical and Signaling Interface (FC-PH)," Rev 4.3 (1994), hereby incorporated by reference.) Manufacturers of RAID systems have been moving to fibre channel technology because it allows transmitting of data between computer devices at rates of over 1 Gbps (one billion bits per second), and at distances of several hundred meters and more. Also, fibre channel arbitrated loop (FC-AL) allows for 127 unique loop identifiers, one of which is reserved for a fabric loop port.

The widely accepted approach to providing failover/failback capability in RAID systems comprising fibre interface controllers has been to use dual-active controllers coupled by a redirecting driver. In the event of a controller failure, the redirecting driver shifts host requests from the failed controller to a surviving controller. The failed controller can then be replaced and the memory system reinitialized to return to normal, dual-active controller operation. The redirecting driver can be implemented using a software or hardware protocol. One exemplary redirecting driver is disclosed in U.S. Pat. No. 5,237,658, to Walker et al., hereby incorporated by reference. However, one problem associated with this type of solution is that it is achieved at the expense of added memory system complexity that increases cost and decreases bandwidth. In addition, when, as is common, the redirecting driver is implemented using software in the host computer, this approach is not independent of the host computer, and typically requires a special driver for each host computer system on which it is to be utilized. This further adds to the cost and complexity, and increases the difficulty of installing and maintaining the memory system.

Accordingly, there is a need for a memory system comprising a number of fibre interface controllers and having a failover mechanism that is transparent to a host computer. There is a further need for such a memory system having a failback mechanism that is also transparent to the host computer. The present invention provides a solution to these and other problems, and offers additional advantages over the prior art.
`
SUMMARY OF THE INVENTION
`
The present invention provides a memory system and method of operating a memory system. In one embodiment, the memory system includes a number of controllers connected by a fibre channel arbitrated loop to provide transparent failover and failback for failed controllers. The controllers are adapted to transfer data between a data storage system and at least one host computer in response to instructions therefrom. In the inventive method, a unique identifier is provided to each controller to permit the host computer to address instructions to a specific controller. Then, operation of the controllers is monitored and, when a failed controller is detected, a failover procedure is performed on a surviving controller. In one embodiment, the failover procedure disables the failed controller and assumes the identity of the failed controller. Thus, the surviving controller becomes capable of responding to instructions addressed to it and instructions addressed to the failed controller, and the failure of the failed controller is transparent to the host computer. In one particular embodiment, the step of providing a unique identifier to each controller preferably includes the step of providing a world wide name to each controller, and more preferably the step further includes providing a loop identifier to each controller.

In another aspect the invention provides a memory system for transferring data between a data storage system and at least one host computer in response to instructions therefrom. The memory system includes a pair of dual-active controllers connected by a fibre channel arbitrated loop. Each controller has a unique identifier and is adapted to assume the identity of a failed controller and to respond to instructions addressed to it, thereby rendering failure of the failed controller transparent to the host computer. In one embodiment, the memory system further includes a communication path coupling the controllers, the communication path being adapted to enable each controller to detect failure of the other controller. The present invention is particularly useful for data storage systems comprising multiple disk drives coupled to the controllers by disk channels, in which at least one disk channel also serves as the communication path.

In yet another aspect the invention provides a computer program and a computer program product for operating a memory system comprising a plurality of controllers, each controller having a unique identifier, and the controllers adapted to transfer data between a data storage system and at least one host computer in response to instructions therefrom. The computer program product includes a computer readable medium with a computer program stored therein. The computer program has a failure detection unit adapted to detect a failed controller. A failover unit is adapted to enable a surviving controller to respond to instructions addressed to it and to instructions addressed to the failed controller. The failover unit includes a disabling unit adapted to disable the failed controller. The failover unit also includes a loop initialization unit, which is adapted to instruct a surviving controller to assume the identity of the failed controller and to respond to instructions addressed to the failed controller as well as instructions addressed to the surviving controller itself. Thus, failure of the failed controller is transparent to the host computer. In one embodiment, each controller has an active port and a failover port, and the failover unit is adapted to activate the failover port of the surviving controller. In another embodiment, the computer program product further includes a replacement detection unit adapted to instruct a replacement controller to assume the identity of the failed controller and respond to instructions to the failed controller, thereby rendering replacement of the failed controller transparent to the host computer.

In still another aspect the invention provides a memory system for transferring data between a data storage system and at least one host computer in response to instructions therefrom. The memory system comprises a pair of dual-active controllers connected by a fibre channel arbitrated loop, each controller having a unique identifier, and a means for providing a failover mode from a failed controller to a surviving controller that is substantially transparent to the host computer. In one embodiment, the means for providing a failover mode is a computer program product having a computer program including a loop initialization unit adapted to instruct the surviving controller to assume the identity of the failed controller and to instruct the surviving controller to respond to instructions addressed to it and to the failed controller.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
Additional objects and features of the invention will be more readily apparent from the following detailed description and appended claims when taken in conjunction with the drawings, in which:
FIG. 1 is a block diagram of an embodiment of a memory system comprising a pair of controllers having a transparent failover and failback mechanism according to the present invention;
FIG. 2 is a block diagram of another embodiment of a memory system according to the present invention in an environment comprising a pair of host computer systems;
FIG. 3 is a flowchart showing an embodiment of a method of operating the memory system shown in FIG. 1 or FIG. 2 to provide a transparent failover and failback mechanism according to the present invention; and
FIG. 4 is a block diagram illustrating the hierarchical structure of an embodiment of a computer program according to an embodiment of the present invention.
`
`DETAILED DESCRIPTION
`
The present invention is directed to a memory system having a number of controllers adapted to transfer data between at least one host computer and a data storage system, such as one or more Redundant Array of Independent Disks (RAID) storage systems. The controllers are coupled to the host computer and one another through a host-side loop to provide a failover and a failback mechanism for a failed controller that is transparent to the host computer. Advantageously, the controllers are connected by a fibre channel arbitrated loop (FC-AL). While the invention is described using examples of a data storage system comprising a RAID having multiple magnetic disk drives, the present invention can be used with other data storage systems, as apparent to those skilled in the art, including arrays and individual disk drives in which the disk drives are optical, magnetic, or magneto-optical disk drives.

FIG. 1 shows a block diagram of an exemplary embodiment of a memory system 100 according to the present invention having a pair of controllers 105 (singularly 105a and 105b) coupled to a host computer 110 through a pair of host-side loops 115 (singularly 115a and 115b). It is to be understood that by host-side loop 115 it is meant a communication path which connects the controllers 105 to the host computer 110, and that the host-side loop can also connect other devices or systems (not shown) to the host computer. The controllers 105 are in turn coupled to a data storage system 120, shown here as a RAID 130 comprising multiple disk drives 135, via several device-side loops 140 (singularly 140a to 140c), also known as disk channels. Alternatively, the controllers 105 could also be coupled to the data storage system 120 via SCSI buses (not shown). Although FIG. 1 shows a single pair of controllers 105 coupled by three device-side loops 140 to a RAID 130 comprising only twelve disk drives 135, the illustrated architecture is extendable to memory systems having any number of controllers, disk drives, and device-side loops. For example, the memory system 100 can include a number, n, of n-way controllers, using operational primitives in a message passing multi-controller non-uniform workload environment, as described in commonly assigned co-pending U.S. patent application Ser. No. 09826397, which is hereby incorporated by reference.

The host-side loops 115 are made up of several fibre channels 145 and a hub 150a, 150b. The term fibre channel as used here refers to any physical medium that can be used to transmit data at high speed, for example to serially transmit data at high speed in accordance with standards developed by the American National Standards Institute (ANSI), such as for example optical fibre, co-axial cable, or twisted pair telephone line. Each of the host-side loops 115 connects to three nodes or ports, including a single server port known as a host bus adapter (HBA) 155a, 155b, on the host computer 110 and to two controller ports 160a, 160b, on each of the controllers 105. The host-side loops 115 are adapted to enable data and input/output (I/O) requests from the host computer 110 to be transferred between any port on the loop 115.

The controllers 105 can be any suitable fibre channel compatible controller that can be modified to operate according to the present invention, such as for example the DAC960SF™, commercially available from Mylex, Inc., Boulder, Colo. Such controllers 105 include, or can be modified to include, an active port 165a, 165b, and a failover port 166a, 166b, on each controller, and a register (not shown) adapted to support the failover and failback mechanism of the present invention. A pair of the controllers 105 can be configured to operate as dual-active controllers as described above, or as dual-redundant controllers wherein one controller serves as an installed spare for the other, which in normal operation handles all I/O requests from the host computer 110. Preferably, the controllers 105 operate as dual-active controllers to increase the bandwidth of the memory system 100. Generally, each of the controllers 105 has a computer readable medium, such as a read only memory (ROM) 170, in which is embedded computer or machine readable code, commonly known as firmware, with instructions for configuring and operating the controller; a cache 180a, 180b, for temporarily storing I/O requests and data from the host computer 110; and a local processor 185a, 185b, for executing the instructions and requests. The firmware of each controller is modified to support the failover and failback mechanism of the present invention.

To enable the controllers 105 to be operated in dual-active mode, the controllers on host-side loops 115a, 115b, are identified by a unique identifier to permit the host computer 110 to address an I/O request to a specific controller. In one embodiment, the unique identifier includes a non-volatile, 64-bit World Wide Name (WWN). A WWN is an identifying code that is hardwired, embedded in the firmware, or otherwise encoded in a fibre channel compatible device, such as the HBA 155a, 155b, or the controllers 105, at the time of manufacture. Additionally, the unique identifier includes a loop identifier (LOOP ID) which is assigned to each port in a host-side loop 115a, 115b, during a system initialization of the memory system 100. This LOOP ID can be acquired during a Loop Initialization Hard Address (LIHA) phase of the system initialization, or during a Loop Initialization Software Address (LISA) phase. Because not all host computers have operating systems that support addressing schemes using WWNs, for example some legacy host computer systems, in a preferred embodiment the unique identifier includes both a WWN and a LOOP ID to enable the memory system 100 of the present invention to be used with any host computer 110 independent of the operating system. During system initialization, each of the controllers 105 registers the unique identifier of the other controller. This enables a surviving controller, for example controller 105a, to accept and process I/O requests addressed to a failed controller, for example controller 105b, by assuming the identity of the failed controller.
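
The registration of identifiers described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the patented firmware: the class names `UniqueId` and `Controller`, the method names, and the sample WWN strings are all hypothetical. It shows a controller that records its peer's identifier (WWN plus LOOP ID) at initialization and later answers for that identifier as well as its own.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class UniqueId:
    """Hypothetical pairing of a WWN (fixed at manufacture) with a LOOP ID."""
    wwn: str
    loop_id: Optional[int]  # assigned during loop initialization, if any


class Controller:
    """Illustrative controller model; not taken from the patent text."""

    def __init__(self, own_id: UniqueId):
        self.own_id = own_id
        self.peer_id: Optional[UniqueId] = None
        self.answers_for: Set[UniqueId] = {own_id}

    def register_peer(self, peer: "Controller") -> None:
        # During system initialization each controller records the other's identifier.
        self.peer_id = peer.own_id

    def assume_peer_identity(self) -> None:
        # Only possible because the peer's identifier was registered beforehand.
        if self.peer_id is None:
            raise RuntimeError("peer identifier was never registered")
        self.answers_for.add(self.peer_id)

    def accepts(self, target: UniqueId) -> bool:
        return target in self.answers_for
```

For example, after `a.register_peer(b)` and `a.assume_peer_identity()`, controller `a` accepts requests addressed to either identifier, while the host keeps addressing the same two identifiers as before.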

The RAID is comprised of multiple virtual or logical volumes. Although the controllers 105 share the same RAID 130, that is, both controllers are connected to every disk drive 135 in the RAID, preferably each logical volume is under the primary control of one of the controllers so that coherency need not be maintained between the caches 180a, 180b, of the controllers when they are operated in dual-active mode. By primary control it is meant that during normal operation each logical volume in the RAID 130 is controlled solely by one of the controllers 105. Each logical volume is represented by a logical unit number (LUN) to the host computer 110. Each LUN in turn is associated with the unique identifier of one of the controllers 105 so that when data needs to be stored in or retrieved from a particular LUN, the I/O request is automatically directed to the correct controller.
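
The LUN-to-identifier association can be sketched as a simple lookup. The following Python fragment is a hedged illustration only; the function names, the dictionary layout, and the sample identifiers are assumptions introduced here. It shows that the host-side table from LUN to controller identifier never changes, while the set of identifiers each controller answers for determines which physical controller services the request.

```python
from typing import Dict, Set


def owning_identifier(lun_map: Dict[int, str], lun: int) -> str:
    """Return the controller identifier that the host associates with a LUN."""
    if lun not in lun_map:
        raise LookupError(f"LUN {lun} is not associated with any controller")
    return lun_map[lun]


def servicing_controller(lun_map: Dict[int, str],
                         identities: Dict[str, Set[str]],
                         lun: int) -> str:
    """Return the name of the controller currently answering for the LUN's owner.

    `identities` maps a controller name to the identifiers its ports respond to.
    """
    target = owning_identifier(lun_map, lun)
    for name, ids in identities.items():
        if target in ids:
            return name
    raise LookupError(f"no controller is answering for identifier {target}")
```

With `lun_map = {0: "WWN-A", 1: "WWN-B"}`, a request for LUN 1 reaches controller 105b in normal operation; if 105a later answers for both identifiers, the same unchanged table leads the request to 105a.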
`
In a preferred embodiment, shown in FIG. 2, reliability is further enhanced by providing a clustered environment in which two host computers 110 (singularly 110a and 110b) each have direct access to both controllers 105 through a number of HBAs 155a-d. Thus, the failure of a single host computer 110a, 110b, will not result in the failure of an entire network of client computers (not shown). In addition, as shown in FIG. 2, each of the controllers 105 has at least one active port 195a, 195b and one inactive port 200a, 200b. The active ports 195a, 195b receive and process I/O requests sent by the host computers 110 on the host-side loops 115. The inactive ports 200a, 200b, also known as failover ports, can process I/O requests only when the active port 195a, 195b on the same host-side loop 115a, 115b, has failed. For example, in case of failure of controller 105a, inactive port 200b on surviving controller 105b assumes the identity of the active port 195a on failed controller 105a and begins accepting and processing I/O requests directed to the failed controller 105a.
`
In accordance with the present invention, the memory system further includes a communication path 205 adapted to transmit a signal from one controller 105 to another in the event of a controller failure. The communication path 205 can be a Small Computer System Interface (SCSI) bus or a fibre channel as described above. It can take the form of a dedicated high speed path extending directly between the controllers 105, as shown in FIG. 1, or one of the device-side channels 140a-c (disk channels) can also serve as the communication path 205, as shown in FIG. 2. The signal passed between the controllers 105 to indicate controller failure can be a passive signal, such as for example the lack of a proper response to a polling or pinging scheme in which each controller interrogates the other at regular, frequent intervals to ensure the other controller is operating correctly. Alternatively, the signal can be a dynamic signal transmitted directly from a failed or failing controller 105a, 105b, to the surviving controller 105b, 105a, instructing it to initiate a failover process or mechanism. Optionally, the communication path 205 is also adapted to enable the controllers 105 to achieve cache coherency in case of controller failure.
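
The two failure signals described above, a passive one (missing ping replies) and a dynamic one (an explicit notice from the failing controller), can be sketched as follows. This is a minimal Python illustration under assumptions: the class name `HeartbeatMonitor`, the ping interval, and the missed-reply limit are hypothetical, since the text specifies only "regular, frequent intervals".

```python
class HeartbeatMonitor:
    """Illustrative peer-failure detector; thresholds are assumed, not from the patent."""

    def __init__(self, interval_s: float, missed_limit: int, start: float = 0.0):
        if interval_s <= 0 or missed_limit < 1:
            raise ValueError("interval must be positive and missed_limit at least 1")
        self.interval_s = interval_s
        self.missed_limit = missed_limit
        self.last_reply = start
        self.failure_notice = False

    def on_reply(self, now: float) -> None:
        # Passive signal: a proper ping reply shows the peer is still operating.
        self.last_reply = now

    def on_failure_notice(self) -> None:
        # Dynamic signal: the failed or failing peer announces the failure itself.
        self.failure_notice = True

    def peer_failed(self, now: float) -> bool:
        silent_for = now - self.last_reply
        return self.failure_notice or silent_for >= self.interval_s * self.missed_limit
```

Timestamps are passed in explicitly so the logic can be exercised without a real clock; a firmware implementation would presumably drive it from a timer.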

An exemplary method of operating the memory system 100 shown in FIG. 2 to provide a failover process that is substantially transparent to the host computers 110a, 110b, will now be described with reference to FIG. 3. The following initial actions or steps are required to make the failover operation transparent to the host computer. First, in a system initialization step 210, each of the controllers 105 is provided with a unique identifier which is communicated to the host computers 110. This step 210 generally merely involves querying the controllers 105 to obtain their WWN, but it may also include assigning a LOOP ID to each controller in a LIHA phase or a LISA phase, as described above. The unique identifiers are then registered by the host computers 110 and one or more of the LUNs are associated with each unique identifier. Next, in a communication step 215, the unique identifiers and their associated LUNs are communicated between the controllers 105 via the communication path 205. Each of the controllers 105 assigns the unique identifier and the associated LUNs of the other controller to its failover port 200a, 200b. This enables a surviving controller 105a, 105b to assume the identity of a failed controller 105b, 105a, and to accept and process I/O requests addressed to it by activating the normally inactive or failover port 200a, 200b.
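
The communication step 215 can be sketched as a mutual preloading of failover ports. The Python below is a hedged illustration: `Port`, `Controller`, and `communication_step` are hypothetical names, and identifiers are reduced to plain strings. It shows each controller copying the other's identifier and LUN list into its own failover port, which stays inactive until a failover occurs.

```python
from typing import List, Optional


class Port:
    """Illustrative host-side port; it answers requests only while active."""

    def __init__(self, identity: Optional[str] = None,
                 luns: Optional[List[int]] = None, active: bool = False):
        self.identity = identity
        self.luns = list(luns or [])
        self.active = active


class Controller:
    def __init__(self, identity: str, luns: List[int]):
        self.active_port = Port(identity, luns, active=True)
        self.failover_port = Port()  # empty and inactive until the communication step

    def preload_failover_port(self, peer: "Controller") -> None:
        # Copy the peer's identifier and LUNs, but leave the port inactive.
        self.failover_port = Port(peer.active_port.identity,
                                  peer.active_port.luns, active=False)


def communication_step(a: Controller, b: Controller) -> None:
    """Exchange identifiers and LUNs between the two controllers (step 215)."""
    a.preload_failover_port(b)
    b.preload_failover_port(a)
```

Preloading means that a later failover reduces to flipping the `active` flag on a port that already carries the right identity.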

The memory system 100 is then ready to begin regular operations in a dual-active operation step 225 in which the controllers 105 both simultaneously receive and process I/O requests from the host computers 110. During normal operations a fault detection step 230 is executed in which the controllers 105 exchange a series of "pings," also referred to as a heartbeat signal, the response to which, as described above, signals to each controller that the other has not failed. This step 230 may also involve a scheme in which a failed or failing controller 105a, 105b dynamically signals a surviving controller 105b, 105a, that a failure has occurred or is about to occur.

On detection of a controller failure, a failover procedure is performed on the surviving controller 105a, 105b. The failover procedure involves the steps of disabling the failed controller (step 235) and assuming the identity of the failed controller (step 240). In the disabling step 235, the surviving controller 105a, 105b asserts a reset signal, which disables the failed controller 105b, 105a by resetting its local processor 185a, 185b, and the active port 195a, 195b, fibre protocol chip (not shown). Resetting the fibre protocol chip causes the hub 150a, 150b to automatically bypass the primary port 195a, 195b, on the failed controller 105a, 105b. In the assuming identity step 240, the failover port 200a, 200b of the surviving controller 105a, 105b, begins accepting and processing I/O requests addressed by the host computers 110a, 110b, to the failed controller 105b, 105a. Preferably, to speed up the failover process, the unique identifier for the failed controller 105a, 105b, was previously assigned to the failover port 200a, 200b, during the communication step 215, and the surviving controller 105 merely activates the failover port 200a, 200b, to enable it to begin accepting and processing I/O requests.
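
The two failover steps, disabling (235) and assuming identity (240), can be sketched in a few lines of Python. This is an illustration under assumptions rather than the actual firmware: the `Controller` class, its attribute names, and the `fail_over` function are hypothetical, and the hub bypass is modeled as a simple flag.

```python
class Controller:
    """Illustrative model of one controller's host-side state."""

    def __init__(self, identity: str):
        self.identity = identity
        self.running = True
        self.active_port_on_loop = True
        self.failover_identity = None      # preloaded with the peer's identifier
        self.failover_port_active = False

    def reset(self) -> None:
        # Models the reset signal: processor and fibre protocol chip are reset,
        # so the hub bypasses this controller's active port.
        self.running = False
        self.active_port_on_loop = False

    def accepts(self, target: str) -> bool:
        if not self.running:
            return False
        if target == self.identity:
            return self.active_port_on_loop
        return self.failover_port_active and target == self.failover_identity


def fail_over(survivor: Controller, failed: Controller) -> None:
    if survivor.failover_identity != failed.identity:
        raise ValueError("failover port was not preloaded with the failed identity")
    failed.reset()                         # step 235: disable the failed controller
    survivor.failover_port_active = True   # step 240: assume its identity
```

Disabling first matters in this sketch: once `reset()` has run, the failed controller can no longer answer for its identifier, so only one port on the loop ever responds to it.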

After the failover process is completed, the surviving controller 105a, 105b, in a resume operation (step 245), resumes operations by responding to I/O requests addressed to itself and to the failed controller. The surviving controller 105a, 105b, responds to requests to store or retrieve data addressed to the failed controller without any additional support from the host computers 110 or the HBAs 155. Because there is no need to alter the registered unique identifiers or the associated LUNs, the failover process is transparent to the host computers 110. To the host computers 110, the delay, if any, caused by the time it takes to detect the failed controller 105a, 105b and to perform the loop initialization procedure appears to be no more than a momentary loss of power to the memory system 100, which requires the host computers to retransmit the last several commands sent to the failed controller.
`
`Optionally, when the controllers 105 include caches 180a,
`180b, the failover process can also include a cache flush step
`(not shown) and a conservative cache mode enable step (not
`shown). The cache flush step prevents the loss of data that
`was presented with good status to the host computers 110
`because the data has been written to both caches 180a, 180b,
`but has not actually been written to the data storage system
`120 before the controller failure. The cache flush step
`commits this data to the data storage system 120. Enabling
`conservative cache mode minimizes the chance of data
`being lost while operating with a single controller 105a,
`105b, in failover mode, by ensuring that all data is written
`to the data storage system 120 prior to a good status signal
`being sent.
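The two optional cache steps can be illustrated with a short sketch. It assumes, for illustration only, that the cache and the data storage system are plain dictionaries and that "good status" is a `True` return value; the class name `CachedController` and its methods are hypothetical and do not come from the patent.

```python
class CachedController:
    """Toy controller with a write cache in front of backing storage."""

    def __init__(self, storage):
        self.storage = storage     # dict standing in for data storage system 120
        self.cache = {}            # blocks acknowledged but not yet committed
        self.conservative = False  # False: cached writes, True: write-through

    def write(self, block, data):
        """Accept a write and return True (good status) to the host."""
        if self.conservative:
            # Conservative mode: commit to storage before good status.
            self.storage[block] = data
        else:
            # Normal mode: good status is returned once the data is cached.
            self.cache[block] = data
        return True

    def flush(self):
        """Cache flush step: commit every cached block to storage."""
        self.storage.update(self.cache)
        self.cache.clear()

    def enter_failover_mode(self):
        """Flush first, then acknowledge only committed writes."""
        self.flush()
        self.conservative = True


storage = {}
survivor = CachedController(storage)
survivor.write(7, b"acknowledged before failure")  # held in cache only
survivor.enter_failover_mode()                     # now committed to storage
survivor.write(8, b"written during failover")      # committed immediately
```

In this sketch the flush runs before conservative mode is enabled, so no acknowledged block is left only in the cache once the controller is running without a partner.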
`In another aspect, the present invention is directed to a
`memory system 100 having a failover mechanism, such as
`the one described above, that further includes a failback
`process or mechanism that is substantially transparent to the
`host computers 110a, 110b. To be transparent to the host
`computers 110a, 110b, the failback mechanism should
`support a hot swap of a failed controller 105a, 105b. By hot
`swap it is meant the failed controller 105a, 105b is removed
`and a replacement controller (not shown) put in service
`without de-energizing or re-booting the memory system 100
`and/or the host co
