US006173374B1

(12) United States Patent
     Heil et al.

(10) Patent No.:     US 6,173,374 B1
(45) Date of Patent: *Jan. 9, 2001

(54) SYSTEM AND METHOD FOR PEER-TO-PEER ACCELERATED I/O SHIPPING BETWEEN
     HOST BUS ADAPTERS IN CLUSTERED COMPUTER NETWORK

(75) Inventors: Thomas F. Heil, Fort Collins, CO (US); Martin H. Francis,
     Wichita, KS (US); Rodney A. DeKoning, Wichita, KS (US); Bret S. Weber,
     Wichita, KS (US)

(73) Assignee: LSI Logic Corporation, Milpitas, CA (US)

(*) Notice: This patent issued on a continued prosecution application filed
     under 37 CFR 1.53(d), and is subject to the twenty year patent term
     provisions of 35 U.S.C. 154(a)(2).

     Under 35 U.S.C. 154(b), the term of this patent shall be extended for
     0 days.

(21) Appl. No.: 09/022,350

(22) Filed: Feb. 11, 1998

(51) Int. Cl.7 .................. G06F 12/00

(52) U.S. Cl. .................. 711/148; 711/147; 711/130; 711/153; 709/213;
     709/214; 709/217; 709/212

(58) Field of Search .......... 711/147, 148, 114; 709/213, 214, 215, 232,
     231, 201, 206, 212, 244, 249

(56) References Cited

     U.S. PATENT DOCUMENTS

     3,794,983 *  2/1974  Sahin ................ 340/172.5
     4,539,655 *  9/1985  Trussell et al. ...... 364/900
     4,698,753 * 10/1987  Hubbins et al. ....... 364/200
     4,805,137 *  2/1989  Grant et al. ......... 364/900
     4,935,894 *  6/1990  Ternes et al. ........ 364/900
     5,210,828 *  5/1993  Bolan et al. ......... 395/200
     5,410,654 *  4/1995  Foster et al. ........ 395/275
     5,471,638 * 11/1995  Keeley ............... 395/800
     5,499,384 *  3/1996  Lentz et al. ......... 395/821
     5,522,050 *  5/1996  Amini et al. ......... 395/306
     5,675,791 * 10/1997  Bhide et al. ......... 395/621

     * cited by examiner

Primary Examiner-John W. Cabeca
Assistant Examiner-Kimberly McLean

(57) ABSTRACT

The present invention retrieves data across independent computer nodes of a server cluster by providing for I/O shipping of block level requests to peer intelligent host bus adapters (hereinafter referred to as HBA). This peer-to-peer distribution of block I/O requests is transparent to the host. The HBA has the intelligence to decide whether to satisfy a block I/O request locally or remotely. Each HBA driver utilizes the I2O protocol, which allows peer-to-peer communication independent of the operating system or hardware of the underlying network. In a first embodiment of the present invention, local and remote storage channels, within a node, are supported by a single HBA. In a second embodiment of the present invention, local storage channels, within a node, are supported by one HBA, and the remote storage channel, within a node, is supported by a separate HBA.

31 Claims, 7 Drawing Sheets
[Representative drawing: block diagram of Node 1 (150) and Node N (151), each containing host CPUs, an HBA, and local drives, interconnected through a Fibre Channel backbone network.]
[FIG. 1: Block diagram of Node 1 (150) and Node N (151). Each node contains host CPUs (CPU 1 and CPU 2 in Node 1; CPU N+1 and CPU N+2 in Node N) with caches on a processor bus, a host-to-PCI bus bridge, and a PCI bus carrying an HBA. Each HBA includes an embedded CPU, memory, a front end interface, its own host-to-PCI bus bridge, SCSI or FCAL channels to the node's local drives, and a Fibre Channel chip connected to the Fibre Channel backbone network (121).]
[FIG. 2: Software layers in Node 1 (150) and Node N (151). In each node the host runs higher software layers and a host driver (OSM) above the PCI bus; the HBA runs a host interface, I/O redirector software, an I/O ship ISM alongside local RAID and cache ISMs, and an I/O ship HDM alongside local storage HDMs. The HBAs of the two nodes communicate over the Fibre Channel backbone network (121). Reference numerals 200-280 label the Node 1 layers and 300-370 the Node N layers.]
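The split of responsibilities sketched in FIG. 2 (a host-resident driver, labelled OSM in I2O terms, above the PCI bus; an I/O redirector with I/O-ship and local RAID/cache ISMs and storage HDMs inside the HBA) can be illustrated with a minimal dispatch routine. Everything below is a hypothetical sketch in C: the type and function names are invented for illustration and are not taken from the patent or from the I2O specification.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical block request as delivered by the host driver (OSM). */
typedef struct {
    uint64_t lba;      /* starting logical block address */
    uint32_t count;    /* number of blocks requested     */
} block_request;

/* Illustrative stand-ins for the modules named in FIG. 2. */
static void local_raid_ism(const block_request *r) {
    printf("local RAID/cache ISM: service LBA %llu (+%u) via local storage HDM\n",
           (unsigned long long)r->lba, r->count);
}

static void io_ship_ism(const block_request *r) {
    printf("I/O ship ISM: forward LBA %llu (+%u) to a peer HBA via the I/O ship HDM\n",
           (unsigned long long)r->lba, r->count);
}

/* Placeholder for the directory lookup the patent describes later. */
static bool stored_locally(const block_request *r) {
    return r->lba < 1000;   /* assumption: the first 1000 blocks are local */
}

/* The I/O redirector sits between the host interface and the ISMs. */
static void io_redirector(const block_request *r) {
    if (stored_locally(r))
        local_raid_ism(r);
    else
        io_ship_ism(r);
}

int main(void) {
    block_request local  = { .lba = 10,   .count = 8 };
    block_request remote = { .lba = 5000, .count = 8 };
    io_redirector(&local);
    io_redirector(&remote);
    return 0;
}
```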
[FIG. 3: Flow chart of block retrieval by an HBA. An I/O block request arrives (400); if the requested blocks are found locally, they are retrieved from the local disks (420); if not, the I/O request is shipped to the remote HBA (450), the remote HBA returns the requested I/O blocks to the local HBA (460), and the local HBA sends the I/O blocks to the host (470).]
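Read as pseudocode, the FIG. 3 flow (steps 400-470) amounts to: look the request up, satisfy it from local disks if possible, otherwise ship it to the remote HBA and relay the returned blocks to the host. The C below is a hedged, in-memory illustration of that sequence; the ownership rule and the stubbed read/ship functions are assumptions, not the patent's implementation.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed ownership rule: this HBA holds blocks below 1000. */
static bool found_locally(unsigned block) { return block < 1000; }

static unsigned read_local_disks(unsigned block) {            /* step 420 */
    printf("retrieve block %u from local disks\n", block);
    return block ^ 0x5A5Au;                                    /* fake contents */
}

static unsigned ship_to_remote_hba(unsigned block) {           /* steps 450-460 */
    printf("ship I/O request for block %u to remote HBA\n", block);
    return block ^ 0xA5A5u;    /* remote HBA returns the requested blocks */
}

static void handle_block_request(unsigned block) {             /* step 400 */
    unsigned data = found_locally(block) ? read_local_disks(block)
                                         : ship_to_remote_hba(block);
    printf("local HBA sends block %u (0x%X) to host\n", block, data); /* step 470 */
}

int main(void) {
    handle_block_request(42);     /* local path  */
    handle_block_request(4242);   /* remote path */
    return 0;
}
```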
[FIG. 4A: Hosts and HBAs in the cluster initialize (500); the HBAs determine the contents of their respective local storage (502); each HBA builds a local directory containing the location of data blocks for its local storage (504); each HBA broadcasts the contents of its local directory to peer HBAs (506); and each HBA updates its directory with the broadcast information (508).]

[FIG. 4B: Hosts and HBAs in the cluster initialize (510); a central host broadcasts a message to peer hosts to determine the contents of local storage (512); the central host builds a directory containing the location of data blocks of the entire cluster (514); and the directory is downloaded to each HBA during HBA initialization (516).]
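FIG. 4A's initialization sequence (steps 500-508) has each HBA scan its local storage, build a local directory, broadcast that directory to its peers, and merge what the peers broadcast in return. The C sketch below simulates that exchange in memory under assumed data structures; the patent does not specify a directory format, so the types here are purely illustrative.

```c
#include <stdio.h>

#define MAX_ENTRIES 16
#define NUM_HBAS    3

/* Assumed directory entry: which HBA owns a given block range. */
typedef struct { unsigned first_block, last_block, owner_hba; } dir_entry;

typedef struct {
    unsigned  id;
    dir_entry dir[MAX_ENTRIES];
    int       n;
} hba;

/* Steps 502/504: each HBA scans its local storage and records its own range. */
static void build_local_directory(hba *h) {
    dir_entry e = { h->id * 1000, h->id * 1000 + 999, h->id };
    h->dir[h->n++] = e;
}

/* Steps 506/508: broadcast locally owned entries; every peer merges them. */
static void broadcast_directory(hba cluster[], int sender) {
    for (int i = 0; i < NUM_HBAS; i++) {
        if (i == sender) continue;
        for (int j = 0; j < cluster[sender].n; j++)
            if (cluster[sender].dir[j].owner_hba == cluster[sender].id)
                cluster[i].dir[cluster[i].n++] = cluster[sender].dir[j];
    }
}

int main(void) {
    hba cluster[NUM_HBAS] = {{0}, {1}, {2}};
    for (int i = 0; i < NUM_HBAS; i++) build_local_directory(&cluster[i]);
    for (int i = 0; i < NUM_HBAS; i++) broadcast_directory(cluster, i);
    for (int i = 0; i < NUM_HBAS; i++)
        printf("HBA %u knows %d block ranges after the exchange\n",
               cluster[i].id, cluster[i].n);
    return 0;
}
```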
[FIG. 4C: Hosts and HBAs in the cluster initialize (518); the HBAs determine the contents of their respective local storage (520); one HBA is initialized as directory manager (522); the HBA acting as directory manager requests and receives the peer HBAs' local directory information (524), coalesces the local directory information into a comprehensive directory of the cluster's entire storage (526), and broadcasts the newly created directory of the cluster's network storage (528).]

[FIG. 4D: Hosts and HBAs in the cluster initialize (529); the HBAs determine the contents of their respective local storage (530); each HBA builds a local directory containing the location of data blocks within its local storage (532); an HBA demands directory information from peer HBAs using peer-to-peer communication when a requested block is not found locally (534); and the HBA updates its directory with the directory information obtained from peer HBAs (536).]
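FIG. 4C centralizes the work in one HBA acting as directory manager, while FIG. 4D defers it entirely: an HBA asks its peers for directory information only when a requested block is not found locally, then caches the answer (steps 534-536). The sketch below illustrates the FIG. 4D variant; the ownership rule, the cache shape, and the stand-in for the peer query are assumptions made for the example.

```c
#include <stdio.h>

#define NUM_HBAS 3

/* Assumed cluster-wide truth: HBA i owns blocks [i*1000, i*1000+999]. */
static int block_owner(unsigned block) { return (int)(block / 1000) % NUM_HBAS; }

/* Per-HBA cached directory of previously resolved blocks (tiny, illustrative). */
typedef struct { unsigned block; int owner; } cache_entry;
static cache_entry cache[NUM_HBAS][8];
static int cache_len[NUM_HBAS];

/* Peer query (step 534): in a real system this would be a peer-to-peer
   message exchange; here a plain function call stands in for it. */
static int query_peers(unsigned block) { return block_owner(block); }

static int resolve(int self, unsigned block) {
    if (block_owner(block) == self) return self;            /* found locally      */
    for (int i = 0; i < cache_len[self]; i++)               /* already learned?   */
        if (cache[self][i].block == block) return cache[self][i].owner;
    int owner = query_peers(block);                         /* step 534           */
    if (cache_len[self] < 8)                                /* step 536: remember */
        cache[self][cache_len[self]++] = (cache_entry){ block, owner };
    return owner;
}

int main(void) {
    printf("HBA 0 resolves block 2500 -> HBA %d (first lookup, peer query)\n",
           resolve(0, 2500));
    printf("HBA 0 resolves block 2500 -> HBA %d (second lookup, cached)\n",
           resolve(0, 2500));
    return 0;
}
```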
[FIG. 5A: Block diagram of Node 1 (150) in the second embodiment. The node's PCI bus carries two HBAs (180, 181): one supports the local SCSI or FCAL storage channels to the node's local drives (118), and the other contains a Fibre Channel chip connected to the Fibre Channel backbone network (121) leading to Node N (continued in FIG. 5B).]
[FIG. 5B: Block diagram of Node N (151) in the second embodiment. The node contains two HBAs (190, 191): one with a Fibre Channel chip (122) connected to the Fibre Channel backbone arriving from Node 1 (FIG. 5A), and one supporting the local SCSI or FCAL storage channels (122.3, 122.4) to the node's local drives (123).]
SYSTEM AND METHOD FOR PEER-TO-PEER ACCELERATED I/O SHIPPING BETWEEN HOST BUS ADAPTERS IN CLUSTERED COMPUTER NETWORK

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to distribution of storage data within computer nodes of a clustered computing environment, and more particularly, to an apparatus and method for I/O request shipping operable using peer-to-peer communications within host bus adapters of the clustered systems.

2. Discussion of Related Art

A cluster is, in general, a collection of interconnected whole computers utilized as a single computing resource, whereby a communication network is used to interconnect the computers within the cluster. A cluster typically contains several computers. From the viewpoint of a computer within this collection of computers, the rest of the computers and their respective attached resources are deemed remote, whereas its own attached resources are deemed to be local.

Resource sharing is one benefit of a computing cluster. A computer within the cluster can access the resources of another computer within the cluster, and the computers of the cluster can thereby share any resource in the cluster. Combining the processing power and storage resources of the cluster into one virtual machine increases the availability and capacity of resources within the cluster. For example, if one resource, such as a processor, in the cluster were to fail, another processor within the cluster could take over the load of the failed processor. To the requester, the failure of the processor is transparent because another peer processor services its request load.

A common application of such a clustered environment is for the sharing of disk storage resources. For example, in high volume transaction processing applications (e.g., a database transaction system), a large number of processors may be added to a computing environment, all of which share access to common storage devices containing the shared database. The transaction processing load may therefore be distributed over a large number of processors operating in parallel to perform the requisite transactions. Problems arise where multiple computers, operating in parallel, share data and storage devices. Clearly, a level of coordination is required to assure that each of the computers is aware of updates in the storage devices made by others of the computers in the cluster. In environments that share common storage resources, two fundamental architectures have arisen to coordinate the shared access to storage devices: file level shared access control and block level shared access control. That is, information may be distributed between disks at the file level or at the block level.

Entire files may be distributed throughout a cluster's storage subsystem by storing the files on a local disk or by storing the files on remote storage subsystems. Software executed by the host coordinates the communications between the host computer requesting the file and the local or remote storage subsystem containing the file. This software executed by the host is implemented within each host's operating system or can be a software layer operating on each host to coordinate access to the files.

As entire files may be distributed throughout a cluster's storage subsystem, a file can be partitioned into a plurality of individual blocks that can similarly be distributed throughout a cluster's storage subsystems. This allows the parts of the file to be concurrently accessed locally and/or remotely. Software executed by the host coordinates the communications between the host computer requesting the blocks and the local or remote storage subsystem containing the blocks. As previously stated, this software is presently implemented within each host's operating system or as a software layer operating on each host in a cooperative distributed manner.

Both block level distribution and file level distribution can be performed with physically shared disks. In a physically shared disk architecture, each node (computer) in the cluster has direct access to all the disks within the cluster to thereby provide "any-to-any" connectivity among hosts and disks. A layer of host software provides the coordination to allow hosts to access data from any disk within the cluster. One such example is the Oracle Parallel Database Server running in a DEC VAX/Alpha cluster.

The Oracle Parallel Database Server maintains the consistency of the database by utilizing a proprietary protocol, the distributed lock manager (DLM), to allow nodes to access the shared storage concurrently. Utilizing DLM software in a cluster of physically shared disks allows all computer nodes to have access to all disks directly through their own I/O subsystem so that each disk appears to be physically local. Each computer node can cache and/or lock shared disk-based structures utilizing the DLM software.

For example, if one node wants blocks X, Y, and Z within disks A and B, it must first ask the DLM software for permission. The DLM will grant permission only after it has ensured that blocks X, Y, and Z are current. The DLM ensures that if another node has made recent changes to blocks X, Y, and Z and locally cached the modifications, the DLM will ask it to flush the modifications to disks A and B first.
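The DLM interaction just described is a lock-then-read ordering: the requesting node asks the DLM for permission on blocks X, Y, and Z, and the DLM grants it only after any node holding newer cached copies has flushed them to disks A and B. The sketch below only illustrates that ordering; the function names are invented and do not represent Oracle's or any real DLM interface.

```c
#include <stdbool.h>
#include <stdio.h>

/* All names here are invented stand-ins for a DLM-style interface. */
static bool peer_has_dirty_copy(const char *blocks) {
    (void)blocks;
    return true;            /* pretend another node cached modifications */
}

static void ask_peer_to_flush(const char *blocks, const char *disks) {
    printf("DLM: node 2 flushes its cached %s to %s\n", blocks, disks);
}

/* Permission is granted only once the requested blocks are known to be current. */
static bool dlm_request(const char *blocks, const char *disks) {
    if (peer_has_dirty_copy(blocks))
        ask_peer_to_flush(blocks, disks);
    return true;
}

int main(void) {
    if (dlm_request("blocks X, Y, Z", "disks A and B"))
        printf("node 1: permission granted, reading blocks X, Y, Z\n");
    return 0;
}
```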
Physically shared disks are simple to manage, provide fast data access, and are the dominant approach in the market today. However, in large configurations, expensive switches and multiplexing devices are required to maintain any-to-any connectivity between nodes. Due to the expensive switches and interconnects, this architecture is expensive to scale. In particular, each computer or disk added to such a physically shared disks architecture may require, in turn, addition of larger, more complex, more costly switching or multiplexing devices.

Both block level distribution and file level distribution can also be performed with logically shared disks. In a logically shared disk architecture, disks are not shared physically, but distributed across the nodes of the cluster, each node owning a subset of the total disks. File level shared access control or block level shared access control at the host level retrieves data on a cluster where there is primarily networked connectivity between the computer nodes of the cluster. That is, in this type of cluster, the application data are partitioned within the nodes in the cluster so that each node has direct access to physically local disks but must establish network connectivity with other nodes to retrieve files or blocks from remote disks on another node in the cluster.

To retrieve files from a remote resource, software in the host intercepts file or block level I/O requests and determines whether the particular file or block is stored locally or remotely. If local, the software passes the request down to the local file system or block I/O driver. If remote, the software passes the request to the node owning the remote disk via the inter-node communication network.

The key benefit of logically shared disk architecture is the ability to scale the number of nodes by simple replication of
the required subsystems. Unlike the physically shared disk architecture, which requires complex switches and multiplexing devices, the logically shared disk architecture enables simple and inexpensive scaling of the cluster capacity and size, since any-to-any connectivity between computer nodes and disks need not be maintained. Additional storage devices are accessed by non-local computers of the cluster via existing network interfaces interconnecting the computers of the cluster. In like manner, each additional computer has access to all storage in the cluster either locally or via existing network connections among the computers of the cluster.

Physically shared disk architectures are most prevalent in spite of the higher costs in view of their higher performance as compared to logically shared disk architectures. Many environments therefore have significant investments in application programs and associated "middleware" (intermediate layers of software) which are designed presuming the simple, flexible, any-to-any connectivity of physically shared disks.

"I/O shipping" is a technique that has evolved to allow such application programs and middleware to operate in an environment that in fact does not provide physically shared disks. Rather, I/O shipping methods are used to emulate the physically shared disk architecture using a low-level layer of host software. In essence, I/O shipping is a technique to implement logically shared disks even though any-to-any connectivity does not exist.

I/O shipping is presently performed at a block driver layer of the host software to preserve the simplicity of management of physically shared disks while enjoying the economic and scalability benefits of the logically shared disk architecture. I/O shipping receives block level requests from higher layers of software, which presume an any-to-any connection architecture underlies their operation. The I/O shipping layer processes I/O requests locally if the local disks are appropriate for the requested action and passes the I/O request to other host systems if the requested blocks are not stored on the local disk. I/O shipping thus allows continued use of existing software that presumes physically shared disks to work in a cluster with logically shared disks. That is, I/O shipping determines whether the block request made by a higher level application, which assumes all disks are physically shared, can be retrieved locally or must be retrieved remotely and therefore requires the I/O request be "shipped" to another host computer. To higher level software layers, I/O shipping in essence emulates physically shared disks using logically shared disks.

All the above known cluster configurations suffer from a common problem in that the disk sharing control and coordination is performed within the host systems and therefore imposes an overhead load on the host systems. The degree of overhead processing varies somewhat depending upon the specific architecture employed. Nevertheless, all the above noted prior techniques impose a significant overhead-processing load on the computers of the cluster. Consequently, a need exists for an improved apparatus and method to provide cluster computing disk sharing (or more generally resource sharing) with high I/O throughput performance, low host system processing overhead, and lower cost/complexity as compared to prior host-based techniques.

SUMMARY OF THE INVENTION

The present invention solves the above and other problems, thereby advancing the useful arts, by providing apparatus and methods for I/O shipping of block level requests to peer intelligent host bus adapters (hereinafter referred to as "HBA"). An HBA in general is a device which adapts (connects) a host computer system to an I/O device. Signals associated with the bus of the host computer system (e.g., PCI, ISA, etc.) are adapted for exchange with a bus specific to the I/O device (e.g., SCSI, Fibre Channel, LAN, etc.). The HBA of the present invention contains a directory within memory for storing location information regarding blocks of data stored within the plurality of storage devices in the cluster, and circuits and software for searching the directory to determine whether to locally or remotely retrieve blocks of data. Independent of the host, the HBA distributes I/O block requests to the appropriate HBA in response to the directory search. The HBA is operable to establish and maintain communications with at least one other host bus adapter to query and request another host bus adapter to retrieve and transfer I/O requested data blocks from a storage subsystem within said clustered computing network.

In accordance with the preferred embodiment, intelligent HBA(s) in each node communicate among themselves as peers. HBAs in the same system can communicate as peers over the system's PCI bus in accordance with the intelligent I/O standard (hereinafter referred to as the I2O standard). Similarly, HBAs in different nodes can communicate as peers via, for example, a Fibre Channel backbone that interconnects the HBAs.
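As described above, peer HBAs have two communication paths: HBAs in the same node exchange messages across the system's PCI bus under the I2O peer-to-peer model, while HBAs in different nodes use the Fibre Channel backbone. A minimal transport-selection sketch follows; the address type and send functions are assumed stand-ins, not real I2O or Fibre Channel driver calls.

```c
#include <stdio.h>

/* Assumed peer address: which cluster node and which HBA within it. */
typedef struct { int node_id; int hba_id; } hba_addr;

/* Stand-ins for the two physical paths; not real driver interfaces. */
static void send_over_pci(hba_addr to) {
    printf("I2O peer message over local PCI bus to HBA %d\n", to.hba_id);
}
static void send_over_fibre_channel(hba_addr to) {
    printf("message over Fibre Channel backbone to node %d, HBA %d\n",
           to.node_id, to.hba_id);
}

/* Same node: use the PCI bus; different node: use the Fibre Channel backbone. */
static void send_peer_message(hba_addr self, hba_addr peer) {
    if (peer.node_id == self.node_id)
        send_over_pci(peer);
    else
        send_over_fibre_channel(peer);
}

int main(void) {
    hba_addr self = { 1, 0 }, same_node_peer = { 1, 1 }, remote_peer = { 2, 0 };
    send_peer_message(self, same_node_peer);
    send_peer_message(self, remote_peer);
    return 0;
}
```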
In the preferred embodiment, an HBA is connected to a peer HBA via a Fibre Channel backbone. The Fibre Channel backbone is a high-speed communication medium and is used to "ship" block level I/O requests for blocks of stored data among the HBAs and to exchange blocks of stored data associated with the shipped I/O requests. In essence, HBA intelligence and peer-to-peer communications enable I/O shipping functionality to be removed from the host and executed by the HBA. The I/O shipping occurs over the Fibre Channel backbone, thereby relieving congestion on any other network used for inter-processor communication.

The intelligent HBA(s) of the present invention process I/O requests received from the host system. In the preferred embodiment, one HBA is utilized in each node to support the local and remote storage channels of a node. In a second embodiment, one HBA supports the local storage channels and a second HBA supports the remote storage channels within a node. In this second embodiment, the I/O request processing is distributed between the HBAs within the node to thereby reduce the workload on both HBAs within the node.

In the preferred embodiment, an HBA receives a block I/O request from the host and searches through the directory in HBA memory to determine whether the block can be found locally or remotely. If the search through the directory does not find a particular data block, the HBA may poll (e.g., query) peer HBAs to determine which HBA can satisfy a particular block I/O request.

If blocks are available remotely, then the initiating HBA establishes communications with its peer HBA residing in the node containing the requested data blocks (e.g., peer-to-peer communications among HBAs). The I/O request is "shipped" to the peer HBA, which performs the requisite processing. Data returned from the HBA performing the remote processing (i.e., a read request) is passed by the initiating HBA to the requesting host. Otherwise, the initiating HBA retrieves the requested data blocks from the local disks.
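The two preceding paragraphs describe the full remote path: on a directory miss the initiating HBA may poll its peers, then ships the block I/O request to the peer HBA that owns the blocks and relays the returned data to the requesting host. The sketch below models that round trip with plain function calls standing in for the Fibre Channel exchange; the ownership rule and message shapes are assumptions for illustration only.

```c
#include <stdio.h>

#define NUM_PEERS 3

/* Assumed peer poll: returns the peer index that holds the block, or -1. */
static int poll_peer(int peer, unsigned lba) {
    return (lba / 1000 == (unsigned)peer) ? peer : -1;   /* toy ownership rule */
}

/* Peer HBA side: service a shipped read (stubbed disk access). */
static unsigned peer_service_read(int peer, unsigned lba) {
    printf("peer HBA %d reads LBA %u from its local drives\n", peer, lba);
    return lba ^ 0xABCDu;    /* pretend block contents */
}

/* Initiating HBA: directory miss -> poll peers -> ship request -> relay to host. */
static void hba_read(unsigned lba) {
    int owner = -1;
    for (int p = 0; p < NUM_PEERS && owner < 0; p++)   /* poll (query) peers */
        owner = poll_peer(p, lba);
    if (owner < 0) { printf("LBA %u not found in cluster\n", lba); return; }
    unsigned data = peer_service_read(owner, lba);      /* shipped I/O request */
    printf("initiating HBA returns block %u (0x%X) to host\n", lba, data);
}

int main(void) {
    hba_read(2042);   /* owned by peer 2 under this toy mapping */
    return 0;
}
```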
According to the present invention, a mapping of the location of data, within every storage subsystem in the cluster, is performed by each HBA. Each HBA maps the location of data within its local storage system, thereby building a directory containing the location of all locally stored data blocks. In the preferred embodiment, each HBA uses peer-to-peer capabilities to communicate the contents of its directory to peer HBAs. Each HBA updates its directory to include the communicated information.
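The paragraph above has each HBA building a directory that maps the location of data blocks across every storage subsystem in the cluster, and later passages search that directory per request. The patent does not define the directory's layout; the struct and lookup below are one plausible, purely illustrative shape for it (a table of block ranges keyed to the owning node and HBA, with the numbers 117 and 125 borrowed from the figures only as sample identifiers).

```c
#include <stdio.h>

/* Hypothetical directory entry: a contiguous block range and where it lives. */
typedef struct {
    unsigned long long first_lba;
    unsigned long long last_lba;
    int node_id;        /* cluster node holding the blocks */
    int hba_id;         /* HBA within that node            */
    int is_local;       /* 1 if this HBA owns the range    */
} dir_entry;

/* Linear lookup; a real HBA would likely use a sorted or hashed structure. */
static const dir_entry *dir_lookup(const dir_entry *dir, int n,
                                   unsigned long long lba) {
    for (int i = 0; i < n; i++)
        if (lba >= dir[i].first_lba && lba <= dir[i].last_lba)
            return &dir[i];
    return NULL;
}

int main(void) {
    const dir_entry dir[] = {
        { 0,    4095, 1, 117, 1 },   /* this node's local drives */
        { 4096, 8191, 2, 125, 0 },   /* a peer node's drives     */
    };
    const dir_entry *e = dir_lookup(dir, 2, 6000);
    if (e)
        printf("LBA 6000 -> node %d, HBA %d (%s)\n",
               e->node_id, e->hba_id, e->is_local ? "local" : "remote");
    return 0;
}
```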
As distinguished from known techniques, the present invention virtually eliminates host system overhead while enabling host applications to assume there is full "any-to-any" connectivity to all disks in the cluster (e.g., a logically shared disks architecture). The present invention provides such flexibility through I/O shipping emulation of physically shared disks but implements the I/O shipping layer within the HBAs of the clustered computers, rather than in the host. Implementing I/O shipping within the HBA virtually eliminates the overhead, previously imposed when sharing storage resources, that results from processing I/O requests within the host systems. That is, the host software no longer has to handle the I/O shipping function, and inter-processor communication networks no longer need to carry I/O traffic in addition to communication traffic.

Performance is further enhanced in that each intelligent HBA can provide disk caching of the local disks under its control, and can service hits out of this cache regardless of whether the requesting host is local or remote. This is in contrast to physically shared disks, where the cache must reside out in the subsystem to avoid the cache flushes or complex cache synchronization issues associated with "distributed" cache architectures.

Furthermore, I/O shipping emulation of physically shared disks reduces cost and complexity as compared to actual physically shared disks. Costly, complex multiplexors and switches are not required to implement any-to-any connectivity.

It is therefore an object of the present invention to provide apparatus and associated methods of operation for performing I/O shipping within HBAs of computers in a clustered computing environment.

It is a further object of the present invention to provide apparatus and associated methods of operation for performing I/O shipping within HBAs of computers in a clustered computing environment to emulate physically shared disks on logically shared disks.

It is still a further object of the present invention to provide apparatus and associated methods of operation for performing I/O shipping to distribute low level block I/O requests within HBAs to process both local and remote block I/O requests.

It is yet a further object of the present invention to provide apparatus and associated methods of operation for establishing peer-to-peer communications that is transparent to the host by providing an intelligent HBA to establish peer-to-peer connections to distribute block I/O requests.

It is another object of the present invention to provide apparatus and associated methods of operations for establishing peer-to-peer communications that is transparent to the host by providing a single HBA to support a node's local

The above and other objects, aspects, features, and advantages of the present invention will become apparent from the following description and the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a host system in which the present invention may be advantageously applied.

FIG. 2 is a block diagram of the software layers operable within cooperating nodes in accordance with the present invention as shown in FIG. 1.

FIG. 3 is a flow chart depicting the process operable within an HBA in accordance with the present invention to retrieve a block locally or remotely.

FIGS. 4A-4D are flow charts depicting the alternative, equivalent methods to build the directory for locating requested blocks within an HBA in accordance with the present invention.

FIGS. 5A and 5B, in combination, are a block diagram of a second embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

While the invention is susceptible to various modifications and alternative forms, a specific embodiment thereof has been shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that it is not intended to limit the invention to the particular form disclosed, but on the contrary, the invention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the appended claims.

FIG. 1 depicts a host system in which the present invention may be advantageously applied. Node 1 150 includes a host system connected to an HBA 117 via a Host-to-Peripheral Component Interconnect (hereinafter referred to as "PCI") bus bridge 115 and PCI bus 116.5. The host system includes one or more Central Processing Units (hereinafter referred to as "CPU"). Host system CPU 1 and CPU 2 100 are connected to respective cache 1 and cache 2 105 as well as to other memory components not shown. The host processors, CPU 1 and CPU 2 100, are connected to local processor bus 110. Those skilled in the art will recognize that local processor bus 110 may be any of several busses depending upon the choice of components for host CPU 1 and CPU 2 100.

In the present invention, the Host-to-PCI Bus bridge 115 adapts the processor bus 110 signals and the PCI bus 116.5 signals to allow communications with the embedded HBA intelligence. An exemplary Host-to-PCI Bus bridge 115 is the "Saturn II" chip set manufactured as part number 82420 by Intel Corporation. Those skilled in the art will recognize other processor/PCI interface chips well known in the art.

The PCI bus 116.5 is but one example of a presently available, commercially popular bus for peripheral device interconnection in host systems. A PCI bus is commonly used as the host system interface bus due to faster transfer rates as distinguished from older backplane busses such as ISA or EISA. The host system interface bus can be selected based on system performance requirements. Although PCI is the preferred I/O bus stan
