(12) United States Patent
Deitz et al.

(10) Patent No.: US 6,578,158 B1
(45) Date of Patent: Jun. 10, 2003

US006578158B1

(54) METHOD AND APPARATUS FOR PROVIDING A RAID CONTROLLER
     HAVING TRANSPARENT FAILOVER AND FAILBACK

(75) Inventors: William G. Deitz, Niwot, CO (US);
     Keith Short, LaFayette, CO (US)

(73) Assignee: International Business Machines
     Corporation, Armonk, NY (US)

(*) Notice: Subject to any disclaimer, the term of this
     patent is extended or adjusted under 35
     U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/429,523

(22) Filed: Oct. 28, 1999

(51) Int. Cl.7 .......... G06F 11/00
(52) U.S. Cl. ........... 714/11; 714/5
(58) Field of Search .... 714/6, 7, 8, 11, 714/710, 5; 711/114
(56) References Cited

U.S. PATENT DOCUMENTS

5,237,658 A       8/1993  Walker et al. ........ 395/200
5,274,645 A   *  12/1993  Idleman et al. ....... 371/10.1
5,367,669 A      11/1994  Holland et al. ....... 395/575
5,553,230 A   *   9/1996  Petersen et al. ...... 395/180
5,757,642 A       5/1998  Jones ................ 364/134
5,790,775 A   *   8/1998  Marks et al. ......... 395/182.07
5,812,754 A   *   9/1998  Lui et al. ........... 395/182.04
5,922,077 A       7/1999  Espy et al. .......... 714/7
6,129,027 A1  *   2/2001  El-Batal ............. 370/222
6,219,753 B1  *   4/2001  Richardson ........... 711/114
6,330,687 B1  *  12/2001  Griffith ............. 714/6
`
`* cited by examiner
`
Primary Examiner: Robert Beausoliel
Assistant Examiner: Marc Duncan
(74) Attorney, Agent, or Firm: Dorsey & Whitney LLP
`
`(57)
`
`ABSTRACT
`
A method and apparatus for controlling a memory system
100 comprising a plurality of controllers 105 connected by
a fibre channel arbitrated loop 145 to provide transparent
failover and failback mechanisms for failed controllers. The
controllers 105 are adapted to transfer data between a data
storage system 120 and at least one host computer 110 in
response to instructions therefrom. In the method, a unique
identifier is provided to each controller 105. The operation
of the controllers 105 is then monitored and, when a failed
controller is detected, a failover procedure is performed on
a surviving controller. The failover procedure includes
disabling the failed controller and instructing the surviving
controller to assume the identity of the failed controller.
Thus, the surviving controller is capable of responding to
instructions addressed to it and instructions addressed to the
failed controller, and the failure of the failed controller is
transparent to the host computer 110. A computer program
and a computer program product for implementing the
method are also provided.
`
`25 Claims, 4 Drawing Sheets
`
[Representative drawing: block diagram with host computers 110a
and 110b, memory system 100, active ports 195a and 195b,
failover ports 200a and 200b, and controller 105b.]
`IBM-Oracle 1008
`Page 1 of 12
`
`

`

[Sheet 1 of 4. FIG. 1: block diagram of memory system 100,
showing host computer 110 with HBAs 155a and 155b, a hub,
controllers 105a and 105b (each with ROM, CPU, and active port),
caches 180a and 180b, local processor 185, communication path
205, data storage system 120, RAID 130, and device-side loops
140b and 140c.]
`
`

`

[Sheet 2 of 4. FIG. 2: block diagram showing host computers 110a
and 110b with HBAs 155a and 155b, host-side loop 115a, hubs 150a
and 150b, controllers 105a and 105b each with an active port
(195a, 195b) and an inactive failover port, data storage system
120, RAID 130, and device-side loops 140b and 140c.]
`
`

`

[Sheet 3 of 4. FIG. 3: flowchart. Boxes, in the order they
appear in the scan: START; PROVIDE UNIQUE IDENTIFIER; RESPOND TO
I/O INSTRUCTIONS FOR SURVIVING AND FAILED CONTROLLERS; POLL
FAILED CONTROLLER; COMMUNICATE UNIQUE IDENTIFIER (twice); BEGIN
DUAL-ACTIVE OPERATION; EXCHANGE PINGS; DISABLE FAILED
CONTROLLER; ASSUME IDENTITY OF FAILED CONTROLLER; REPLACEMENT
CONTROLLER ASSUMES IDENTITY OF FAILED CONTROLLER; RESUME
DUAL-ACTIVE OPERATION. Decision branches are labeled YES and NO.
Reference numerals visible: 210, 225, 230, 235, 245, 250, 255,
270.]
`
`

`

[Sheet 4 of 4. FIG. 4: block diagram of program units:
CONTROLLER INITIALIZATION UNIT, REPLACEMENT DETECTION UNIT,
FAILURE DETECTION UNIT, FAILBACK UNIT, FAILOVER UNIT, LOOP
REINITIALIZATION UNIT, DISABLING UNIT, LOOP INITIALIZATION UNIT.
Reference numerals visible: 295, 310.]
`
`

`

`METHOD AND APPARATUS FOR
`PROVIDING A RAID CONTROLLER
`HAVING TRANSPARENT FAILOVER AND
`FAILBACK
`
`FIELD OF THE INVENTION
`
`This invention pertains generally to the field of computer
`memory systems, and more particularly to a method and
`apparatus for controlling redundant arrays of independent
`disks.
`
`BACKGROUND OF THE INVENTION
`
Modern computers frequently require large, fault-tolerant
memory systems. One approach to meeting this need is to
provide a Redundant Array of Independent Disk drives
(RAID) usually including a plurality of hard disk drives
operated by a disk array controller that is coupled to a host
computer. The controller provides the brains of the memory
system, servicing all host requests, storing data to or
retrieving it from the RAID, caching data to provide faster
access, and handling drive failures without interrupting host
requests. Given the importance of the controller, numerous
solutions have been suggested to minimize the potential for
interrupted service due to controller malfunction. One such
solution calls for providing dual-active controllers having
failover and failback capabilities. Dual-active controllers are
a pair of controllers that are connected to each other and to
all the disk drives in a RAID. In normal operation,
input/output (I/O) requests from the host computer are divided
between the dual-active controllers to increase the rate at
which information can be transferred to or from the RAID,
commonly referred to as the bandwidth of the memory
system. However, in the event that one of the controllers
fails, the surviving controller takes over the functions of the
failed controller and begins servicing host requests
addressed to the failed controller in addition to those
addressed to it. The mechanism that allows this is commonly
known as a failover mechanism. If the surviving controller
is able to assume the functions of the failed controller
without any actions on the part of the host computer, for
example redirecting I/O requests to the surviving controller,
the failover mechanism is said to be transparent. If the failed
controller can be subsequently replaced and normal operation
resumed without de-energizing or reinitializing the
controllers, the memory system is said to have a failback
mechanism.
One example of the use of such dual-active controllers is
described in U.S. Pat. No. 5,790,775, to Marks et al., which
uses dual-active controllers connected to the host
computer by a Small Computer System Interface (SCSI)
bus. Typically, the controllers are also connected to a RAID
comprising multiple disk drives through a number of
additional SCSI buses. Each SCSI device on a bus, such as a
controller or a disk drive, is assigned one bit as an identifier
(SCSI ID) to permit the host computer to select a particular
controller, and the controller to select a particular disk drive.
Thus, the method permits a maximum of eight devices to be
identified on a standard 8-bit SCSI bus. In addition, the
controllers are connected to one another by a separate
communications link, and each has access to a cache
memory in the other. Although both controllers are
connected to every disk drive in the RAID, to permit dual-active
operation each disk drive is typically under primary control
of one of the controllers. This is accomplished by dividing
the RAID into groups of disk drives that appear to the host
computer as a logical drive or unit identified by a logical unit
number (LUN) and, during initialization, associating each
LUN with the SCSI ID of a particular controller. In normal
operation, a controller responds only to I/O requests which
are addressed to it and which refer to LUNs over which it has
primary control. However, if a controller fails the remaining
controller of the pair obtains configuration information,
including the SCSI ID and the LUNs of the failed controller,
over the communications link and begins servicing requests
addressed by the host to the failed controller as well as those
addressed to itself.
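By way of illustration only, the addressing limit described above can be sketched in Python. The sketch assumes that each device identifier is a mask with exactly one bit set, so the number of data lines bounds the number of devices. The names `scsi_id_mask` and `max_devices` are hypothetical and do not come from the cited reference.

```python
def scsi_id_mask(device_id: int, bus_width: int = 8) -> int:
    """Return a one-bit identifier: a mask with only bit `device_id` set."""
    if not 0 <= device_id < bus_width:
        raise ValueError(f"device_id must be in 0..{bus_width - 1}")
    return 1 << device_id


def max_devices(bus_width: int) -> int:
    """With one data line per device, the bus width bounds the device count."""
    return bus_width


# Every device on an 8-bit bus gets a distinct single-bit mask.
masks = [scsi_id_mask(i) for i in range(8)]
```

A ninth device has no free bit on an 8-bit bus, which is the architectural limit the following paragraph discusses.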
While the above approach has been effective in reducing
interruptions in service for memory systems having
dual-active controllers, it is limited by the architecture of the
SCSI bus. Traditionally, SCSI buses have from eight to
sixteen signal lines which allows a maximum of from eight
to sixteen SCSI devices to be interconnected by the SCSI
bus at any one time. Thus, systems which use a 16-bit wide
SCSI bus on the host side and 8-bit wide SCSI buses on the
device side, typically provide for at most six device side
SCSI buses having six disk drives each. Moreover, the above
approach, which relies on SCSI IDs, has not been
implemented using fibre interface type controllers.
Fibre interface type controllers are coupled to a host
computer through one or more fibre channels. Fibre channel
is the general name of a technology using an integrated set
of standards developed by the American National Standards
Institute (ANSI) for high speed, serial communication
between computer devices. (See for example the ANSI
standard X3T11, "Fibre Channel Physical and Signaling
Interface (FC-PH)," Rev 4.3 (1994), hereby incorporated by
reference.) Manufacturers of RAID systems have been
moving to fibre channel technology because it allows
transmitting of data between computer devices at rates of over 1
Gbps (one billion bits per second), and at distances
exceeding several hundred meters and more. Also, fibre channel
arbitrated loop (FC-AL) allows for 127 unique loop
identifiers, one of which unique identities is reserved for a
fabric loop port.
The widely accepted approach to providing
failover/failback capability in RAID systems comprising fibre
interface controllers has been to use dual-active controllers
coupled by a redirecting driver. In the event of a controller
failure the redirecting driver shifts host requests from the
failed controller to a surviving controller. The failed
controller can then be replaced and the memory system
reinitialized to return to normal, dual-active controller operation.
The redirecting driver can be implemented using a software
or hardware protocol. One exemplary redirecting driver is
disclosed in U.S. Pat. No. 5,237,658, to Walker et al., hereby
incorporated by reference. However, one problem associated
with this type of solution is that it is achieved at the expense
of added memory system complexity that increases cost and
decreases bandwidth. In addition, when, as is common, the
redirecting driver is implemented using software in the host
computer, this approach is not independent of the host
computer, and typically requires a special driver for each
host computer system on which it is to be utilized. This
further adds to the cost and complexity, and increases the
difficulty of installing and maintaining the memory system.
Accordingly, there is a need for a memory system
comprising a number of fibre interface controllers and having a
failover mechanism that is transparent to a host computer.
There is a further need for such a memory system having a
failback mechanism that is also transparent to the host
computer. The present invention provides a solution to these
and other problems, and offers additional advantages over
the prior art.
`
`SUMMARY OF THE INVENTION
`
The present invention provides a memory system and
method of operating a memory system. In one embodiment,
the memory system includes a number of controllers
connected by a fibre channel arbitrated loop to provide
transparent failover and failback for failed controllers. The
controllers are adapted to transfer data between a data storage
system and at least one host computer in response to
instructions therefrom. In the inventive method, a unique
identifier is provided to each controller to permit the host
computer to address instructions to a specific controller.
Then, operation of the controllers is monitored and when a
failed controller is detected, a failover procedure is
performed on a surviving controller. In one embodiment, the
failover procedure disables the failed controller and assumes
the identity of the failed controller. Thus, the surviving
controller becomes capable of responding to instructions
addressed to it and instructions addressed to the failed
controller, and the failure of the failed controller is
transparent to the host computer. In one particular embodiment,
the step of providing a unique identifier to each controller
preferably includes the step of providing a world wide name
to each controller, and more preferably the step further
includes providing a loop identifier to each controller.
In another aspect the invention provides a memory system
for transferring data between a data storage system and at
least one host computer in response to instructions
therefrom. The memory system includes a pair of dual-active
controllers connected by a fibre channel arbitrated loop.
Each controller has a unique identifier and is adapted to
assume the identity of a failed controller and to respond to
instructions addressed to it, thereby rendering failure of the
failed controller transparent to the host computer. In one
embodiment, the memory system further includes a
communication path coupling the controllers, the
communication path being adapted to enable each controller to detect
failure of the other controller. The present invention is
particularly useful for data storage systems comprising
multiple disk drives coupled to the controllers by disk
channels, in which at least one disk channel also serves as
the communication path.
In yet another aspect the invention provides a computer
program and a computer program product for operating a
memory system comprising a plurality of controllers, each
controller having a unique identifier, and the controllers
adapted to transfer data between a data storage system and
at least one host computer in response to instructions
therefrom. The computer program product includes a computer
readable medium with a computer program stored therein.
The computer program has a failure detection unit adapted
to detect a failed controller. A failover unit is adapted to
enable a surviving controller to respond to instructions
addressed to it and to instructions addressed to the failed
controller. The failover unit includes a disabling unit adapted
to disable the failed controller. The failover unit also
includes a loop initialization unit, which is adapted to
instruct a surviving controller to assume the identity of the
failed controller and to instruct the surviving controller to
respond to instructions addressed to it and to the failed
controller as well as instructions addressed to the surviving
controller. Thus, failure of the failed controller is transparent
to the host computer. In one embodiment, each controller
has an active port and a failover port, and the failover unit
is adapted to activate the failover port of the surviving
controller. In another embodiment, the computer program
product further includes a replacement detection unit
adapted to instruct a replacement controller to assume the
identity of the failed controller and respond to instructions to
the failed controller, thereby rendering replacement of the
failed controller transparent to the host computer.
In still another aspect the invention provides a memory
system for transferring data between a data storage system
and at least one host computer in response to instructions
therefrom. The memory system comprises a pair of
dual-active controllers connected by a fibre channel arbitrated
loop, each controller having a unique identifier, and a means
for providing a failover mode from a failed controller to a
surviving controller that is substantially transparent to the
host computer. In one embodiment, the means for providing
a failover mode is a computer program product having a
computer program including a loop initialization unit
adapted to instruct the surviving controller to assume the
identity of the failed controller and to instruct the surviving
controller to respond to instructions addressed to it and to the
failed controller.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
Additional objects and features of the invention will be
more readily apparent from the following detailed
description and appended claims when taken in conjunction with
the drawings, in which:
FIG. 1 is a block diagram of an embodiment of a memory
system comprising a pair of controllers having a transparent
failover and failback mechanism according to the present
invention;
FIG. 2 is a block diagram of another embodiment of a
memory system according to the present invention in an
environment comprising a pair of host computer systems;
FIG. 3 is a flowchart showing an embodiment of a method
of operating the memory system shown in FIG. 1 or FIG. 2
to provide a transparent failover and failback mechanism
according to the present invention; and
FIG. 4 is a block diagram illustrating the hierarchical
structure of an embodiment of a computer program
according to an embodiment of the present invention.
`
`DETAILED DESCRIPTION
`
The present invention is directed to a memory system
having a number of controllers adapted to transfer data
between at least one host computer and a data storage
system, such as one or more Redundant Array of
Independent Disks (RAID) storage systems. The controllers are
coupled to the host computer and one another through a
host-side loop to provide a failover and a failback
mechanism for a failed controller that is transparent to the host
computer. Advantageously, the controllers are connected by
a fibre channel arbitrated loop (FC-AL). While the invention
is described using examples of a data storage system
comprising a RAID having multiple magnetic disk drives, the
present invention can be used with other data storage
systems, as apparent to those skilled in the art, including
arrays and individual disk drives in which the disk drives are
optical, magnetic, or magneto-optical disk drives.
FIG. 1 shows a block diagram of an exemplary
embodiment of a memory system 100 according to the present
invention having a pair of controllers 105 (singularly 105a
and 105b) coupled to a host computer 110 through a pair of
host-side loops 115 (singularly 115a and 115b). It is to be
understood that by host-side loop 115 it is meant a
communication path which connects the controllers 105 to the host
computer 110, and that the host-side loop can also connect
other devices or systems (not shown) to the host computer.
The controllers 105 are in turn coupled to a data storage system
120, shown here as a RAID 130 comprising multiple disk
drives 135, via several device-side loops 140 (singularly
140a to 140c) also known as disk channels. Alternatively,
the controllers 105 could also be coupled to the data storage
system 120 via SCSI buses (not shown). Although FIG. 1
shows a single pair of controllers 105 coupled by three
device-side loops 140 to a RAID 130 comprising only
twelve disk drives 135, the illustrated architecture is
extendable to memory systems having any number of controllers,
disk drives, and device-side loops. For example, the memory
system 100 can include a number, n, of n-way controllers using
operational primitives in a message passing multi-controller
non-uniform workload environment, as described in
commonly assigned co-pending U.S. patent application Ser. No.
09/326,497, which is hereby incorporated by reference.
The host-side loops 115 are made up of several fibre
channels 145 and a hub 150a, 150b. The term fibre channel
as used here refers to any physical medium that can be used
to transmit data at high speed, for example to serially
transmit data at high speed in accordance with standards
developed by the American National Standards Institute
(ANSI), such as for example optical fibre, co-axial cable, or
twisted pair telephone line. Each of the host-side loops 115
connects to three nodes or ports, including a single server port
known as a host bus adapter HBA 155a, 155b, on the host
computer 110 and to two controller ports 160a, 160b, on
each of the controllers 105. The host-side loops 115 are
adapted to enable data and input/output (I/O) requests from
the host computer 110 to be transferred between any port on
the loop 115.
The controllers 105 can be any suitable fibre channel
compatible controller that can be modified to operate
according to the present invention, such as for example the
DAC960SF, commercially available from Mylex, Inc.,
Boulder, Colo. Such controllers 105 include, or can be
modified to include, an active port 165a, 165b, and a failover
port 166a, 166b, on each controller, and a register (not
shown) adapted to support the failover and failback
mechanism of the present invention. A pair of the controllers
105 can be configured to operate as dual-active controllers
as described above, or as dual-redundant controllers wherein
one controller serves as an installed spare for the other,
which in normal operation handles all I/O requests from the
host computer 110. Preferably, the controllers 105 operate as
dual-active controllers to increase the bandwidth of the
memory system 100. Generally, each of the controllers 105
has a computer readable medium, such as a read only
memory (ROM) 170, in which is embedded a computer or
machine readable code, commonly known as firmware, with
instructions for configuring and operating the controller, a
cache 180a, 180b, for temporarily storing I/O requests and
data from the host computer 110, and a local processor 185a,
185b, for executing the instructions and requests. The
firmware of each controller is modified to support the failover
and failback mechanism of the present invention.
To enable the controllers 105 to be operated in dual-active
mode, the controllers on host-side loops 115a, 115b, are
identified by a unique identifier to permit the host computer
110 to address an I/O request to a specific controller. In one
embodiment, the unique identifier includes a non-volatile,
64 bit World Wide Name (WWN). A WWN is an identifying
code that is hardwired, embedded in the firmware, or
otherwise encoded in a fibre channel compatible device, such as
the HBA 155a, 155b, or the controllers 105, at the time of
manufacture. Additionally, the unique identifier includes a
loop identifier (LOOP ID) which is assigned to each port in
a host-side loop 115a, 115b, during a system initialization of
the memory system 100. This LOOP ID can be acquired
during a Loop Initialization Hard Address (LIHA) phase of
the system initialization, or during a Loop Initialization
Software Address (LISA) phase. Because not all host
computers have operating systems that support addressing
schemes using WWNs, for example some legacy host
computer systems, in a preferred embodiment, the unique
identifier includes both a WWN and a LOOP ID to enable the
memory system 100 of the present invention to be used with
any host computer 110 independent of the operating system.
During system initialization, each of the controllers 105
registers the unique identifier of the other controller. This
enables a surviving controller, for example controller 105a,
to accept and process I/O requests addressed to a failed
controller, for example controller 105b, by assuming the
identity of the failed controller.
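By way of illustration only, the identifier scheme and the mutual registration described above can be sketched in Python. The class and function names (`UniqueIdentifier`, `Controller`, `initialize_pair`) and the WWN and LOOP ID values are hypothetical; the sketch models only the bookkeeping, not the fibre channel loop initialization itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UniqueIdentifier:
    """WWN plus LOOP ID, as in the preferred embodiment."""
    wwn: int      # 64-bit World Wide Name, fixed at manufacture
    loop_id: int  # assigned during loop initialization (LIHA or LISA phase)

    def __post_init__(self) -> None:
        if not 0 <= self.wwn < 2 ** 64:
            raise ValueError("a WWN must fit in 64 bits")


@dataclass
class Controller:
    name: str
    identity: UniqueIdentifier
    partner_identity: Optional[UniqueIdentifier] = None

    def register_partner(self, partner: "Controller") -> None:
        """Remember the partner's identifier so it can be assumed on failover."""
        self.partner_identity = partner.identity


def initialize_pair(first: Controller, second: Controller) -> None:
    """Each controller registers the unique identifier of the other."""
    first.register_partner(second)
    second.register_partner(first)


# Hypothetical WWN and LOOP ID values, chosen only for the example.
ctrl_a = Controller("105a", UniqueIdentifier(wwn=0x2000000000000001, loop_id=1))
ctrl_b = Controller("105b", UniqueIdentifier(wwn=0x2000000000000002, loop_id=2))
initialize_pair(ctrl_a, ctrl_b)
```

After `initialize_pair`, each controller already holds the identifier it would need to answer for its partner, so nothing has to be fetched from a controller that has already failed.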
The RAID is comprised of multiple virtual or logical
volumes. Although the controllers 105 share the same RAID
130, that is both controllers are connected to every disk drive
135 in the RAID, preferably each logical volume is under
the primary control of one of the controllers so that
coherency need not be maintained between the caches 180a, 180b,
of the controllers when they are operated in dual-active
mode. By primary control it is meant that during normal
operation each logical volume 135 in the RAID 130 is
controlled solely by one of the controllers 105. Each logical
volume is represented by a logical unit number (LUN) to the
host computer 110. Each LUN in turn is associated with the
unique identifier of one of the controllers 105 so that when
data needs to be stored in or retrieved from a particular LUN,
the I/O request is automatically directed to the correct
controller.
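By way of illustration only, the association between LUNs and controllers can be sketched as a lookup table in Python. The table contents and the names `LUN_OWNER`, `route_request`, and `luns_of` are hypothetical; the text does not specify how the association is stored.

```python
from typing import Dict, List

# Hypothetical LUN-to-controller assignment made during initialization.
LUN_OWNER: Dict[int, str] = {0: "105a", 1: "105a", 2: "105b", 3: "105b"}


def route_request(lun: int, owners: Dict[int, str] = LUN_OWNER) -> str:
    """Return the controller that has primary control of `lun`."""
    if lun not in owners:
        raise LookupError(f"LUN {lun} is not configured")
    return owners[lun]


def luns_of(controller: str, owners: Dict[int, str] = LUN_OWNER) -> List[int]:
    """List the LUNs under primary control of `controller`, in ascending order."""
    return sorted(lun for lun, owner in owners.items() if owner == controller)
```

Because a dictionary maps each LUN to exactly one controller, no logical volume in this sketch can have two primary owners, which mirrors the condition under which cache coherency need not be maintained.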
In a preferred embodiment, shown in FIG. 2, reliability is
further enhanced by providing a clustered environment in
which two host computers 110 (singularly 110a and 110b)
each have direct access to both controllers 105 through a
number of HBAs 155a-d. Thus, the failure of a single host
computer 110a, 110b, will not result in the failure of an
entire network of client computers (not shown). In addition,
as shown in FIG. 2, each of the controllers 105 has at least
one active port 195a, 195b and one inactive port 200a, 200b.
The active ports 195a, 195b receive and process I/O requests
sent by the host computers 110 on the host-side loops 115.
The inactive ports 200a, 200b, also known as failover
ports, can process I/O requests only when the active port
195a, 195b on the same host-side loop 115a, 115b, has
failed. For example, in case of failure of controller 105a,
inactive port 200b on surviving controller 105b assumes the
identity of the active port 195a on failed controller 105a and
begins accepting and processing I/O requests directed to the
failed controller 105a.
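By way of illustration only, the behavior of the active and failover ports can be sketched in Python. The `Port` class, the identifier strings `"ID-105a"` and `"ID-105b"`, and the helper `controller_105b_accepts` are hypothetical stand-ins for the port hardware and the unique identifiers.

```python
from dataclasses import dataclass


@dataclass
class Port:
    identity: str         # identifier this port answers to
    active: bool = False  # an inactive port ignores every request

    def accepts(self, destination: str) -> bool:
        """Accept a request only when active and addressed to this identity."""
        return self.active and destination == self.identity


# Controller 105b: its own active port 195b, plus failover port 200b
# holding the identity used by active port 195a on controller 105a.
port_195b = Port(identity="ID-105b", active=True)
port_200b = Port(identity="ID-105a", active=False)


def controller_105b_accepts(destination: str) -> bool:
    """True if either port on controller 105b would take the request."""
    return port_195b.accepts(destination) or port_200b.accepts(destination)
```

In this sketch, setting `port_200b.active = True` is all that is needed for controller 105b to begin taking requests addressed to controller 105a; the host keeps using the same destination identifier.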
In accordance with the present invention, the memory
system further includes a communication path 205 adapted
to transmit a signal from one controller 105 to another in the
event of a controller failure. The communication path 205
can be a Small Computer System Interface (SCSI) bus or a
fibre channel as described above. It can take the form of a
dedicated high speed path extending directly between the
controllers 105, as shown in FIG. 1, or one of the device-side
channels 140a-c (disk channels) which can also serve as the
communication path 205, as shown in FIG. 2. The signal
passed between the controllers 105 to indicate controller
failure can be a passive signal, such as for example the lack
of a proper response to a polling or pinging scheme in which
each controller interrogates the other at regular, frequent
intervals to ensure the other controller is operating correctly.
Alternatively, the signal can be a dynamic signal transmitted
directly from a failed or failing controller 105a, 105b, to the
surviving controller 105b, 105a, instructing it to initiate a
failover process or mechanism. Optionally, the
communication path 205 is also adapted to enable the controllers 105 to
achieve cache coherency in case of controller failure.
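By way of illustration only, the passive signal described above (the lack of a proper response to a ping) can be sketched in Python as a timeout check. The `HeartbeatMonitor` class and its `timeout` parameter are hypothetical; the text says only that the controllers interrogate each other at regular, frequent intervals and gives no interval or threshold. Time is passed in explicitly so the sketch does not depend on a real clock.

```python
class HeartbeatMonitor:
    """Tracks ping replies received from the partner controller."""

    def __init__(self, timeout: float, started_at: float = 0.0) -> None:
        self.timeout = timeout        # assumed threshold; not given in the text
        self.last_reply = started_at  # time of the most recent proper response

    def record_reply(self, now: float) -> None:
        """Call whenever the partner answers a ping correctly."""
        self.last_reply = now

    def partner_failed(self, now: float) -> bool:
        """Treat silence longer than `timeout` as a controller failure."""
        return (now - self.last_reply) > self.timeout
```

For example, with `timeout=3.0` and a last reply at time 1.0, `partner_failed(3.5)` is false and `partner_failed(4.5)` is true. A dynamic signal sent by a failing controller would bypass this check and trigger failover directly.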
An exemplary method of operating the memory system
100 shown in FIG. 2 to provide a failover process that is
substantially transparent to the host computers 110a, 110b,
will now be described with reference to FIG. 3. The
following initial actions or steps are required to make the
failover operation transparent to the host computer. First, in
a system initialization step 210 each of the controllers 105
is provided with a unique identifier which is communicated
to the host computers 110. This step 210 generally merely
involves querying the controllers 105 to obtain their WWN,
but it may also include assigning a LOOP ID to each
controller in a LIHA phase or a LISA phase, as described
above. The unique identifiers are then registered by the host
computers 110 and one or more of the LUNs are associated
with each unique identifier. Next, in a communication step
215, the unique identifiers and their associated LUNs are
communicated between the controllers 105 via the
communication path 205. Each of the controllers 105 assigns the
unique identifier and the associated LUNs of the other
controller to its failover port 200a, 200b. This enables a
surviving controller 105a, 105b to assume the identity of a
failed controller 105b, 105a, and to accept and process I/O
requests addressed to it by activating the normally inactive
or failover port 200a, 200b.
The memory system 100 is then ready to begin regular
operations in a dual-active operation step 225 in which the
controllers 105 both simultaneously receive and process I/O
requests from the host computers 110. During normal
operations a fault detection step 230 is executed in which the
controllers 105 exchange a series of "pings," also referred to
as a heart beat signal, the response to which, as described
above, signals to each controller that the other has not failed.
This step 230 may also involve a scheme in which a failed
or failing controller 105a, 105b dynamically signals a
surviving controller 105b, 105a, that a failure has occurred or
is about to occur.
On detection of a controller failure, a failover procedure
is performed on the surviving controller 105a, 105b. The
failover procedure involves the steps of disabling the failed
controller (step 235) and assuming the identity of the failed
controller (step 240). In the disabling step 235, the surviving
controller 105a, 105b asserts a reset signal, which disables
the failed controller 105b, 105a by resetting its local
processor 185a, 185b, and the active port 195a, 195b, fibre
protocol chip (not shown). Resetting the fibre protocol chip
causes the hub 150a, 150b to automatically bypass the
primary port 195a, 195b, on the failed controller 105a, 105b.
In the assuming identity step 240, the failover port 200a,
200b of the surviving controller 105a, 105b, begins
accepting and processing I/O requests addressed by the host
computers 110a, 110b, to the failed controller 105b, 105a.
Preferably, to speed up the failover process the unique
identifier for the failed controller 105a, 105b, was
previously assigned to the failover port 200a, 200b, during the
communication step 215, and the surviving controller 105
merely activates the failover port 200a, 200b, to enable it to
begin accepting and processing I/O requests.
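By way of illustration only, the two steps of the failover procedure (disabling step 235 and assuming identity step 240) can be sketched in Python. The `Controller` class, its fields, and the `fail_over` function are hypothetical; the `reset` method stands in for the hardware reset signal and does not model the hub bypass.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Controller:
    own_id: str
    partner_id: str                     # registered during communication step 215
    running: bool = True
    failover_port_active: bool = False

    def reset(self) -> None:
        """Stand-in for the reset signal: the controller stops answering."""
        self.running = False
        self.failover_port_active = False

    def answered_ids(self) -> List[str]:
        """Identifiers this controller currently responds to."""
        if not self.running:
            return []
        ids = [self.own_id]
        if self.failover_port_active:
            ids.append(self.partner_id)
        return ids


def fail_over(survivor: Controller, failed: Controller) -> None:
    failed.reset()                        # step 235: disable the failed controller
    survivor.failover_port_active = True  # step 240: assume its identity
```

Because the partner's identifier was stored ahead of time, `fail_over` in this sketch only flips a flag on the survivor, which corresponds to the preferred case in which the failover port is merely activated.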
After the failover process is completed, the surviving
controller 105a, 105b, in a resume operation (step 245)
resumes operations by responding to I/O requests addressed
to itself and to the failed controller. The surviving controller
105a, 105b, responds to requests to store or retrieve data
addressed to the failed controller, without any additional
support from the host computers 110 or the HBAs 155.
Because there is no need to alter the registered unique
identifiers or the associated LUNs, the failover process is
transparent to the host computers 110. To the host computers
110, the delay, if any, caused by the time it takes to detect the
failed controller 105a, 105b and to perform the loop
initialization procedure appears to be no more than a momentary
loss of power to the memory system 100, which requires the
host computers to re-transmit the last several commands sent
to the failed controller.
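By way of illustration only, the host's view of the failover can be sketched in Python as a plain retry to an unchanged destination. The function `send_with_retry`, the class `FailingOverSystem`, and the `attempts` and `outage` parameters are hypothetical; the text states only that the host re-transmits the last several commands.

```python
from typing import Callable, List, Optional


def send_with_retry(send: Callable[[str, str], Optional[str]],
                    destination: str, command: str, attempts: int = 3) -> str:
    """Resend `command` to the same `destination` until a reply arrives."""
    for _ in range(attempts):
        reply = send(destination, command)
        if reply is not None:
            return reply
    raise TimeoutError(f"no reply from {destination} after {attempts} attempts")


class FailingOverSystem:
    """Toy memory system that is silent for the first `outage` requests."""

    def __init__(self, outage: int) -> None:
        self.outage = outage
        self.seen: List[str] = []  # destinations of every request received

    def send(self, destination: str, command: str) -> Optional[str]:
        self.seen.append(destination)
        if len(self.seen) <= self.outage:
            return None  # command lost while failover is in progress
        return f"done:{command}"
```

The host in this sketch never changes `destination`, which is the sense in which the failover is transparent: the only visible effect is a short period in which commands must be sent again.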
Optionally, when the controllers 105 include caches 180a,
180b, the failover process can also include a cache flush step
(not shown) and a conservative cache mode enable step (not
shown). The cache flush step prevents the loss of
