Brady et al.

US005758050A
[11] Patent Number: 5,758,050
[45] Date of Patent: May 26, 1998

[54] RECONFIGURABLE DATA STORAGE SYSTEM

[75] Inventors: James Thomas Brady; Paul T. [...]; [...]
Moothedath Menon; Steven Gerdt, both of San Jose, all of Calif.

[73] Assignee: International Business Machines
Corporation, Armonk, N.Y.

[21] Appl. No.: 614460
[22] Filed: Mar. 12, 1996
[51] Int. Cl. .......... G06F 12/00
[52] U.S. Cl. .......... 395/180; 711/173
[58] Field of Search .......... 395/180, 181, 182.03, 182.06,
410, 412, 413, 419, 183.05, 182.07, 651; 711/100, 153, 147,
170, 171, 172, 173, 202, 203

[56] References Cited

U.S. PATENT DOCUMENTS
4,435,752   3/1984   Winkelman
4,601,012   7/1986   Aiken, Jr.
5,018,060   5/1991   Galb et al.
5,148,432   9/1992   Gordon et al. ........ 371/10.1
5,301,297   4/1994   Menon et al. ......... 395/425
5,303,244   4/1994   Watson ............... 371/10.
5,333,277   7/1994   Searls ............... 395/325
[...]       [...]    [...] et al. ......... 365/200
5,465,343   [...]/1995  Henson et al. ..... 395/439
5,491,810   2/1996   Allen ................ 395/444
5,542,065   7/1996   Burkes et al. ........ 395/441
5,546,558   8/1996   Jacobson et al. ...... 395/441
5,553,285   9/1996   Krakauer et al. ...... 395/60
5,588,138   12/1996  Bai et al. ........... 711/173
5,592,638   1/1997   Onodera .............. 711/173
5,602,995   2/1997   Hendel et al. ........ 395/250

Primary Examiner: Albert Decady
Attorney, Agent, or Firm: Baker Maxham Jester & Meador

[57] ABSTRACT
A system for managing data storage devices of a data storage
subsystem. A data storage system includes a controller
coupled to multiple data storage devices. In response to a
request, the controller allocates the devices' storage space
into a number of storage partitions or "virtual devices." The
request specifies the size and function mode for each storage
partition. Each storage partition, having the requested size,
is operated according to its requested function mode. This
involves mapping access commands, which specify virtual
addresses, to the proper physical addresses of the
appropriate data storage devices.

61 Claims, 7 Drawing Sheets

[Front-page drawing: data storage system 200 with host 209,
user interface, controller with command distributor, and
partitioned storage including a non-RAID partition.]
`
`Petitioners Microsoft Corporation and HP Inc. - Ex. 1032, p. 1
`
`
`
[Sheet 1 of 7. FIG. 1 (PRIOR ART): block diagram with a
controller 102 and three storage systems. FIG. 3: data
storage medium.]

[Sheet 2 of 7. FIG. 2A: system 200 with user interface 210,
host 209, controller 202 (command distributor 212, directory
213, array management units 214-216), I/O bus and adapter
204, and storage subsystem 206 with devices 208.]

[Sheet 3 of 7. FIG. 2B: system 200 of FIG. 2A with LSA, RAID,
and non-RAID array management units, and storage partitioned
into LSA, RAID-5, non-RAID, RAID-1, and RAID-3 virtual
devices 220-224.]

[Sheet 4 of 7. FIG. 4: flowchart with steps "Allocate, create
map of logical subsystem", "Operate as requested" (412),
"Another request?", and "Continue to operate as requested".]

[Sheet 5 of 7. FIG. 5: preliminary array 500, with axis 504
marked at 25 MB, 50 MB, 75 MB, and 100 MB.]

[Sheet 6 of 7. FIG. 6: array 500 marked at 25 MB, 50 MB,
75 MB, and 100 MB, with reference numerals 608, 610, and 611.]

[Sheet 7 of 7. FIG. 7: system 700 with controller 702,
storage interfaces 712-715, and storage devices 708 and 710.]

`RECONFIGURABLE DATA STORAGE
`SYSTEM
`BACKGROUND OF THE INVENTION
`1. Field of the Invention
`The present invention relates to a system for managing
data storage devices in a digital computer. More particularly,
`the invention concerns a method and apparatus to operate a
`plurality of partition slices provided by multiple data storage
`devices. After allocating the partition slices into a number of
`storage partitions according to a user request that specifies
`the size and function mode for each storage partition, each
`storage partition is operated according to its function mode.
`2. Description of the Related Art
`In the past, a number of data storage systems have
`employed the architecture of FIG. 1, where a data storage
`system 100 includes a controller 102 having a single node
`104 connected to one or more storage subsystems 108-110.
`The node 104 is typically embodied in a device such as a
`microprocessor. Each storage subsystem 108-110 includes
multiple data storage devices, also called "head disk
assemblies" or "HDAs". The data storage devices may employ
`data storage media such as magnetic storage disks as found
`in a computer "hard drive."
The storage subsystems 108-110 may be identical or
diverse. For example, the first storage subsystem 108 may
include hardware particular to a RAID-5 system, the second
subsystem 109 may include hardware of a non-RAID
system, and the third subsystem 110 may comprise a
log-structured array ("LSA"). In the data storage system 100,
each of the storage subsystems 108-110 includes its own
controller to operate the accompanying storage devices in
the desired manner. Thus, upon system startup, the
subsystem 108 is operable as a RAID-5 system; the subsystem
109 is operable as a non-RAID system; and the subsystem
110 is operable as an LSA.
`The data storage system 100 provides an effective way of
`storing data in different types of storage subsystems, within
`the constraints of a single architecture. Nonetheless, some
`users may require greater flexibility in their data storage. A
`user's data storage needs may eventually require, for
`example, more RAID-5 storage space and less non-RAID
`storage space. In the data storage system 100, however, such
`changes require a potentially expensive purchase of new
`hardware or a possibly complicated modification of existing
`hardware.
`Another potential problem is that, when one of the storage
`subsystems 108-110 is filled with data, it cannot store any
`more data even though another one of the subsystems may
`be partially full or even empty.
`SUMMARY OF THE INVENTION
Broadly, one aspect of the present invention concerns a
data storage system, including a storage subsystem with a
multiplicity of data storage devices, each data storage device
including storage media containing portions of multiple
partition slices. Each partition slice, for example, may
comprise a selected number of bytes. The data storage
devices together may provide a homogeneous or
heterogeneous grouping of magnetic disk storage units, magnetic
tape units, optical cartridges, or other digital data storage
apparatuses. An interface couples the data storage devices to
a controller, which receives input from a user input device.
The controller includes one or more nodes programmed to
manage allocation of the partition slices. The nodes may, for
instance, comprise microprocessors.
`The controller creates a map comprising an ordered
`representation of the partition slices. The controller receives
`requests to allocate the partition slices into a number of
`storage partitions or "virtual devices." These requests may
`originate from the user input device, or another source such
`as a host computer, application program, etc. Each request
`includes a size and a function mode for each storage
`partition. Based on the request, the controller operates each
`storage partition as a virtual device, according to its
`requested function mode. This involves receiving input data
`access commands, including virtual addresses compatible
`with virtual devices, and mapping the data access command
`to the appropriate physical storage locations of the data
`storage devices.
In an illustrative embodiment, the function modes may
include non-RAID, LSA, and various types of RAID, such
as RAID-0, RAID-1, RAID-3, RAID-4, and RAID-5.
Therefore, if the user submits a size and function mode
request specifying 30 Mb of RAID-5, the controller
allocates partition slices sufficient to provide a 30 Mb storage
partition, and operates these partition slices as a RAID-5
virtual device.
`Another aspect of the invention involves a method for
`operating a data storage subsystem. Still another aspect of
`the invention involves a data storage device, tangibly
`embodying a program of instructions to manage operation of
`a data storage subsystem.
The invention affords its users a number of distinct
advantages. First, the invention provides flexibility in data
management, because the user can selectively store data in
different storage partitions having different operating
characteristics. The storage subsystem can therefore be
optimized based on cost, performance, and availability of its
components. Moreover, the user saves money by forgoing
the purchase of different storage subsystems to implement
different memory storage devices. With the present
invention, a pool of data storage devices can be selectively
apportioned to effectively provide different storage
subsystems with different operating characteristics, e.g.
RAID-5, LSA, non-RAID, etc. Advantageously, unused partition
slices may be reassigned from one storage partition to a
different storage partition. Another benefit of the invention
is achieved by using multiple nodes, which provides
redundancy in case of a node failure and also permits balancing of
workload among the nodes. The present invention also
provides a number of other benefits, as described below.
`BRIEF DESCRIPTION OF THE DRAWINGS
The nature, objects, and advantages of the invention will
`become more apparent to those skilled in the art after
`considering the following detailed description in connection
`with the accompanying drawings, in which like reference
`numerals designate like parts throughout, wherein:
`FIG. 1 is a block diagram of a known data storage system;
`FIG. 2A is a block diagram of the hardware components
`and interconnections of a data storage system pursuant to the
`invention;
`FIG. 2B is a block diagram of the hardware components
`and interconnections of the data storage system of FIG. 2A,
configured and implemented pursuant to the invention;
FIG. 3 is an illustrative data storage medium for use by the
`controller pursuant to the invention;
`FIG. 4 is an illustrative sequence of steps for managing a
`data storage subsystem pursuant to the invention;
`FIG. 5 is a preliminary mapped storage array pursuant to
`the invention;
`FIG. 6 is a fully mapped storage array pursuant to the
`invention; and
`FIG. 7 is a block diagram of the hardware components
`and interconnections of a multi-node data storage system
`pursuant to the invention.
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENTS
Broadly, the present invention concerns a system for
managing data storage devices in a digital computing
system. An apparatus aspect of the invention involves a data
storage system with a multiplicity of data storage devices
that provide a reconfigurable data storage subsystem. A
process aspect of the invention concerns a method for
operating a reconfigurable data storage subsystem.
`STRUCTURE
As shown in FIG. 2A, the hardware components and
interconnections of the invention provide a data storage
system 200 that basically includes a controller 202, an I/O
bus and adapter 204 coupled to the controller 202, and a
storage subsystem 206 coupled to the I/O bus and adapter 204.
The I/O bus and adapter 204 is also called an "interface."
The system 200 may also include a host 209, which
exchanges commands and data with the controller 202 as
discussed below. Additionally, a user interface 210 is
connected to the controller 202, to aid a user (not shown) in
providing input to the controller 202. The user interface 210
may comprise, for instance, a keyboard, mouse, voice
recognition apparatus, or a number of other means for
providing human input to the controller 202.
The storage subsystem 206 includes a multiplicity of data
storage devices 208, each of which is accessible via the
interface 204. The data storage devices 208 may comprise a
number of different devices, such as magnetic storage disks,
optical storage disks, optical or magnetic tape media, RAM,
etc. The data storage devices 208 may together provide a
homogeneous or a heterogeneous group of storage devices,
depending upon the application's requirement. If desired,
even devices 208 of the same type may vary in various
operating characteristics, such as throughput, storage
capacity, etc.
The controller 202 preferably comprises a microprocessor
such as the INTEL model i960™. The interface 204 may
comprise, for example, an apparatus employing serial
storage architecture ("SSA"). The controller 202 manages the
data storage devices 208 of the storage subsystem 206 using
a number of components, an example of which is described
as follows. As shown in FIG. 2A, the controller 202 includes
a command distributor 212, a directory 213, and multiple
array management units 214-216. Broadly, these features
enable the controller 202 to manage the storage subsystem
206 as a number of independent user-selected storage
partitions, such as a RAID-5 partition, an LSA partition, a
non-RAID partition, etc.
More particularly, the command distributor 212 first
receives memory access commands from the host 209 or
user interface 210, each command identifying a location of
data within the storage subsystem 206 and a type of storage
operation, such as a Read or Write operation. In the
illustrated embodiment, the memory access command identifies
the location of data by identifying a "volume" of data stored
in the subsystem 206 and an address within the volume. In
this example, each storage partition completely contains one
or more volumes of data. The volume information therefore
constitutes a "virtual address" to storage locations on one of
the "virtual devices" provided by the storage partitions.
After receiving a memory access command, the command
distributor 212 consults the directory 213 to identify the
storage partition corresponding to the volume specified in
the memory access command. Having identified the desired
storage partition, the command distributor 212 forwards the
command to an appropriate one of the array management
units 214-216 corresponding to the proper storage partition.
Further understanding of array management may be
obtained with reference to the RAIDbook-A Source Book
for Disk Array Technology (4th ed.), available from the
RAID Advisory Board, St. Peter, Minn., this reference being
incorporated herein by reference in its entirety.
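By way of illustration only, the dispatch path just described (volume lookup in the directory, then forwarding to the matching array management unit) might be sketched in Python as follows. The names `AccessCommand`, `CommandDistributor`, and the partition labels are hypothetical and are not part of the disclosure; each array management unit is assumed here to be a plain callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class AccessCommand:
    """Hypothetical memory access command carrying a virtual address."""
    volume: str   # volume identifier (first part of the virtual address)
    offset: int   # address within the volume
    op: str       # type of storage operation, e.g. "read" or "write"


class CommandDistributor:
    """Sketch of distributor 212: directory lookup, then forwarding."""

    def __init__(self,
                 directory: Dict[str, str],
                 units: Dict[str, Callable[[AccessCommand], str]]):
        self.directory = directory  # assumed layout: volume -> partition label
        self.units = units          # assumed layout: partition label -> unit

    def dispatch(self, cmd: AccessCommand) -> str:
        # Consult the directory to find the partition holding the volume.
        if cmd.volume not in self.directory:
            raise LookupError(f"unknown volume: {cmd.volume}")
        partition = self.directory[cmd.volume]
        # Forward the command to the unit managing that partition.
        return self.units[partition](cmd)


# Illustrative wiring with two partitions (labels are assumptions).
directory = {"VOL_A": "LSA-220", "VOL_B": "non-RAID-222"}
units = {
    "LSA-220": lambda c: f"LSA unit: {c.op} {c.volume}@{c.offset}",
    "non-RAID-222": lambda c: f"non-RAID unit: {c.op} {c.volume}@{c.offset}",
}
distributor = CommandDistributor(directory, units)
print(distributor.dispatch(AccessCommand("VOL_A", 4096, "read")))
```

Keeping the directory separate from the units mirrors the text: re-allocation only needs to rewrite the directory, while the units themselves stay unchanged.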
FIG. 2B depicts a more specific example of the data
storage system 200. In this example, the array management
units 214-216 include units designed to manage LSA type
storage (unit 214), RAID type storage (unit 215), and
non-RAID storage (unit 216). Without requiring any
alteration to the data storage devices 208, the devices 208 are
partitioned to provide an LSA storage device 220, a RAID-5
storage device 221, a non-RAID storage device 222, a
RAID-1 storage device 223, and a RAID-3 storage device
224. Thus, the array management units 214-216 in effect
manage a number of virtual storage devices 220-224.
When the command distributor 212 receives a memory
access command identifying a volume contained in the LSA
partition 220, the distributor 212 forwards the memory
access command to the LSA array management unit 214,
which maps the virtual address specified by the memory
access command into the appropriate physical address
within the partition slices allocated for use as the LSA
partition 220. Likewise, the distributor 212 forwards memory
access commands having virtual addresses within the
non-RAID partition 222 to the non-RAID array management unit
216 for mapping to the appropriate physical address within
the partition slices allocated for use as the non-RAID
partition 222.
In one embodiment, the controller 202 may manage
operation of the storage subsystem 206 by executing array
management software comprising a series of
computer-readable programming instructions. Alternatively, these
functions may be performed individually by the array
management units 214-216 executing separate respective
sequences of computer-readable programming instructions.
In either case, the programming instructions may be
contained on a data storage medium such as an EPROM, PLA,
ECL, or another medium fixedly coupled to the controller
202. Alternatively, the programming instructions may be
contained on a data storage medium detachably coupled to the
controller 202, such as a computer diskette 300 (FIG. 3), a
DASD array, a magnetic tape, a conventional "hard disk
drive", electronic read-only memory, an optical storage
device, a set of paper "punch" cards, or another suitable data
storage medium. As a further option, the programming
instructions may be contained in a reserved space of the storage
subsystem 206, such as in a private file system space, discussed
below. The programming instructions may, for example,
comprise lines of compiled C++ code.
`Operation of the system 200 may be further understood
`with reference to the detailed description of the invention's
`operation, set forth below.
`OPERATION
`General Description
`FIG. 4 depicts a sequence of tasks 400 that illustrate one
`embodiment of the invention's operation. The sequence 400
serves to manage a reconfigurable data storage subsystem,
such as the subsystem 206. For ease of illustration, the
`sequence 400 is discussed in the context of the exemplary
`hardware configuration of FIG. 2B.
`Request for Allocation
The sequence 400 begins in task 402. In task 406 the
controller 202 receives a request to allocate the subsystem
206 into a number of storage partitions, also called "virtual
devices." For each storage partition, the request includes a
"storage capacity" and a "function mode". The storage
capacity refers to the total amount of data storage provided
by a storage partition, e.g. 30 Mb. The function mode
pertains to different operating characteristics of a storage
partition and includes at least a "storage type", which
identifies the way in which a particular storage partition
operates, e.g. RAID-5, RAID-0, non-RAID, LSA, etc. The
function mode may also include a number of other operating
characteristics, such as parity level, parity rotation, sparing
level, sparing rotation, data width, and data width rotation.
"Parity level" concerns the number of physical devices in
a storage partition that are devoted to the storage of parity
information. The parity level preferably comprises a number
such as zero or one. In a storage partition having 8 devices,
with a parity level of one, there would be one physical
device for parity information storage and 7 physical devices
for data storage. Although certain portions of each storage
device are actually allocated to store parity bits
corresponding to data stored in the other storage devices, for ease of
discussion storage systems are often described as having a
particular "parity device" dedicated to the storage of parity
information. Identically sized blocks of parity information
may be distributed across the storage devices of a storage
partition in a desired pattern; in this respect, the requested
parity level may also include a "parity rotation", specifying
the size of each parity block.
"Sparing level" concerns the number of logical devices
in a storage partition that are set aside as "spare" logical
devices, useful in the event of a device failure, for example.
Analogous to the parity level in many respects, the sparing
level preferably comprises an integer, and may also include
a "sparing rotation" to determine the size of each spare
block. "Data width" concerns the number of physical
devices in each storage partition that are devoted to data
storage, i.e. not parity or spare devices. The data width may
also encompass a specification of "data rotation", analogous
to parity and sparing rotation.
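As a rough sketch only, an allocation request of the kind described above could be modeled as a small record. The class and field names below are hypothetical. The device count is computed as data width plus parity level plus sparing level, which is an assumption inferred from the 8-device example (7 data devices and 1 parity device) rather than a formula stated in the text; rotation parameters are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionMode:
    """Hypothetical record of the operating characteristics in a request."""
    storage_type: str        # e.g. "RAID-5", "RAID-0", "non-RAID", "LSA"
    parity_level: int = 0    # devices devoted to parity information
    sparing_level: int = 0   # devices set aside as spares
    data_width: int = 1      # devices devoted to data storage

    def devices_spanned(self) -> int:
        # Assumption: a partition spans its data, parity, and spare devices.
        return self.data_width + self.parity_level + self.sparing_level


@dataclass(frozen=True)
class PartitionRequest:
    """Hypothetical request: a storage capacity plus a function mode."""
    capacity_mb: int
    mode: FunctionMode


# The 8-device example from the text: parity level one, seven data devices.
request = PartitionRequest(
    capacity_mb=30,
    mode=FunctionMode("RAID-5", parity_level=1, data_width=7),
)
print(request.mode.devices_spanned())
```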
`Allocation
`After task 406, the controller 202 in task 408 carries out
`the requested allocation and stores a representative map in
`the directory 213. This map may comprise a lookup table,
`for example. This map translates between the user-specified
`volume information in the memory access command and the
`physical storage space of the storage devices 208. If each
`volume is coterminous with a storage partition, the map
`translates between the user-specified volume information
`and the virtual devices provided by the user-requested
`storage partitions.
In performing task 408, the controller 202 first maps the
aggregate storage space contained on all of the devices 208;
this involves determining the total amount of storage space
available in the subsystem 206, and then dividing this
storage space into a multiplicity of partition slices of one or
more appropriate sizes. To illustrate this with an example,
FIG. 5 depicts the storage space of all devices 208 mapped
into a preliminary array 500, where one axis 502 represents
different devices 208 and the other axis 504 represents the
storage space of each device 208. In the preliminary array
500, each column represents the storage space of one
physical storage device 208. Each row represents corresponding
addresses, either logically or physically, of all illustrated
devices 208 together.
Next in task 408, the storage space is divided into
partition slices, and the partition slices are allocated to the
different storage partitions. Each partition slice preferably
resides across multiple storage devices 208. Therefore, each
storage device preferably includes parts of multiple partition
slices. Each partition slice is preferably sized appropriately
to serve the user's allocation request. The use of partition
slices provides great flexibility in allocating and
re-allocating the virtual devices. The division into partition
slices may be performed according to various schemes. For
example, one scheme is a time-based priority scheme and
another is a "best fit" scheme. In the time-based priority
scheme, the storage space of the subsystem 206 is allocated
on a "first come, first served" basis. Here, the user's first
requested storage partition is positioned in the lowest row
and column position of the array 500, i.e. top-left. The other
storage partitions are placed in ensuing rows and columns,
proceeding right-to-left, top-to-bottom, or in another
designated order. In contrast, under the "best fit" scheme the
controller 202 considers all of the user's requested storage
partitions together before allocating any, and then allocates
the storage space of the subsystem 206 to best accommodate
the different sizes and types of requested storage partitions.
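The time-based priority scheme can be pictured with a minimal sketch, given here under stated assumptions rather than as the disclosed implementation. The array is modeled as a grid whose columns are devices and whose rows are equal units of per-device storage; each request is a rectangle (number of devices wide, number of rows deep) placed at the first free position found. The function names are hypothetical, and the scan order (top-to-bottom, then left-to-right) is just one choice of "designated order".

```python
from typing import Dict, List, Optional, Tuple

Grid = List[List[Optional[str]]]  # grid[row][col] holds a partition name or None


def make_grid(rows: int, cols: int) -> Grid:
    """Create an empty array: one column per device, one row per storage unit."""
    return [[None] * cols for _ in range(rows)]


def fits(grid: Grid, top: int, left: int, depth: int, width: int) -> bool:
    """Check that a depth-by-width rectangle at (top, left) is in bounds and free."""
    if top + depth > len(grid) or left + width > len(grid[0]):
        return False
    return all(grid[r][c] is None
               for r in range(top, top + depth)
               for c in range(left, left + width))


def allocate_first_come(grid: Grid,
                        requests: List[Tuple[str, int, int]]
                        ) -> Dict[str, Tuple[int, int]]:
    """Place (name, width, depth) requests strictly in arrival order."""
    placements: Dict[str, Tuple[int, int]] = {}
    for name, width, depth in requests:
        # Scan rows top-to-bottom and columns left-to-right for the first fit.
        spot = next(((r, c)
                     for r in range(len(grid))
                     for c in range(len(grid[0]))
                     if fits(grid, r, c, depth, width)), None)
        if spot is None:
            raise ValueError(f"no room for partition {name}")
        top, left = spot
        for r in range(top, top + depth):
            for c in range(left, left + width):
                grid[r][c] = name
        placements[name] = spot
    return placements


grid = make_grid(rows=4, cols=6)
placements = allocate_first_come(grid, [("A", 4, 2), ("B", 2, 3), ("C", 3, 1)])
print(placements)
```

A "best fit" variant would differ mainly in sorting or otherwise examining the full request list before placing anything, instead of honoring arrival order.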
`Ongoing Operation
`After the allocation of task 408 is completed, the now
`defined storage partitions are operated in task 412 according
to their requested function mode. Namely, the distributor 212
`directs memory access commands to the array management
`units 214-216, each of which exchanges data with the
`appropriate physical storage address of the corresponding
`partition 220-224.
`Query 414 recognizes when the controller 202 receives
`another request for storage partition allocation. When this
`happens, control is returned to task 408 for reallocation.
`Otherwise, the controller 202 in task 416 continues to
`operate the defined storage partitions according to their
`requested function mode and parity level, as in task 412.
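The control flow of tasks 406 through 416 might be summarized by the following sketch. It is offered only as an illustration: the event tuples, the `allocate` and `operate` callables, and the log format are assumptions introduced here, and the task numbers in the comments simply follow the description of FIG. 4.

```python
from typing import Any, Callable, Iterable, List, Tuple


def run_sequence(events: Iterable[Tuple[str, Any]],
                 allocate: Callable[[Any], Any],
                 operate: Callable[[Any, Any], Any]) -> List[Tuple[str, Any]]:
    """Process ("allocate", request) and ("access", command) events in order."""
    log: List[Tuple[str, Any]] = []
    current_map = None
    for kind, payload in events:
        if kind == "allocate":
            # Task 406 (or query 414 answering "yes"): a request arrived,
            # so task 408 performs the allocation and stores a new map.
            current_map = allocate(payload)
            log.append(("allocated", payload))
        elif kind == "access":
            # Tasks 412/416: operate the partitions under the current map.
            if current_map is None:
                raise RuntimeError("access received before any allocation")
            log.append(("operated", operate(current_map, payload)))
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return log


# Toy stand-ins: the "map" is a dict from volume name to partition label.
log = run_sequence(
    [("allocate", {"VOL_A": "RAID-5"}),
     ("access", "VOL_A"),
     ("allocate", {"VOL_A": "LSA"}),
     ("access", "VOL_A")],
    allocate=lambda request: dict(request),
    operate=lambda current_map, volume: current_map[volume],
)
print(log)
```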
`Example
`To better understand the sequence 400, Table 1 provides
`an exemplary user request, whose processing is discussed
`below. In this example, the subsystem 206 includes 16
`storage devices 208, each device including 100 Mb of
`storage, as shown in FIG. 5.
TABLE 1
Exemplary Storage Partition Requests

REQUEST    STORAGE CAPACITY AVAILABLE  PARITY  STORAGE   SPARING  DATA
           FOR DATA STORAGE            LEVEL   TYPE               WIDTH
1st (602)  175 Mb storage partition    1       RAID-5    0        7
2nd (603)  150 Mb storage partition    1       LSA       1        3
3rd (604)  50 Mb storage partition     0       non-RAID  0        X
4th (605)  175 Mb storage partition    1       RAID-5    0        7
5th (606)  100 Mb storage partition    0       RAID-0    0        2

First, the preliminary array 500 is created. The array 500
represents the aggregate storage space contained on all of
the 16 devices 208 together. The array 500 includes one
column for each device, and one row for each 5 Mb of
storage. These dimensions are merely exemplary, however,
and they may be changed by an ordinarily skilled artisan
(with the benefit of this disclosure) to suit a particular
application.
Having created the preliminary array 500, the controller
202 divides the storage space into partition slices as needed,
performs the requested allocation, and stores a
representative map in the directory 213. The illustrated example uses
the time-priority allocation method. Accordingly, the user's
first request concerns a 175 Mb storage partition, with a
parity level of one, a RAID-5 type, sparing level of zero, and
data width of 7. Since each partition slice preferably resides
across multiple devices, and the partition slices are allocated
beginning with the top-left of the array 500, the first
requested storage partition is given by 602 (FIG. 6). Each
partition slice of the storage partition 602 spans the 1st
through 8th devices. With this allocation, devices 1-7 are
dedicated to data storage, with the 8th device serving to store
parity information.
Since the widths of the partition slices of each storage
partition are determined by the number of devices in that
storage partition, the depth of the partition slices determines
the slices' size. For instance, if each partition slice of the
storage partition 602 is to occupy 8 Mb, then each partition
slice has a depth of 1 Mb, i.e. occupies 1 Mb on each device.
The size of the partition slices may be predetermined or
[...] depending upon the particular needs of the [...]
Continuing this process, the user's 2nd through 5th
requests are carried out by creating the storage partitions
603-606. The region 607 represents unallocated storage
space of the subsystem 206. If the operating instructions of
the controller 202 are to be stored in the subsystem 206, then
a private file system space 612 may be allocated for that
purpose.
If desired, these and future allocations may be made to
observe "device group partitions", which serve to
completely divide certain physical storage devices into partition
slices of one size. For instance, the 1st-8th devices, 9th-13th
devices, and 14th-16th devices may be designated as device
group partitions 608-610, respectively.
OTHER EMBODIMENTS
While there have been shown what are presently
considered to be preferred embodiments of the invention, it will be
apparent to those skilled in the art that various changes and
modifications can be made herein without departing from
the scope of the invention as defined by the appended
claims.
For example, in an alternative embodiment the system
200 may be implemented to provide a multi-node data
storage system 700 (FIG. 7) with systems redundancy for
greater reliability. The system 700 includes a controller 702
with multiple nodes 702a-702d. Each of the nodes
702a-702d, for example, may comprise an INTEL model
i960 microprocessor. The controller 702 is coupled to a user
interface 703 and first and second storage subsystems 704,
706. The subsystems 704, 706 each include a multiplicity of
storage devices 708, 710. The subsystem 704 is coupled to
the nodes 702c-702d via a pair of storage interfaces
712-713, respectively. Likewise, the subsystem 706 is
coupled to the nodes 702a-702b via a pair of storage
interfaces 714-715, respectively.
The system 700 preferably provides shared management,
where all nodes 702a-702d have access to all devices
708-710. However, each node is preferably assigned
exclusive control over a selected group of storage devices,
irrespective of the storage partition boundaries. For example, in
the subsystem 704, the node 702c may operate devices 1-8
with the node 702d operating the devices 9-16. This
provides redundancy in case one of the nodes fails, since
another node can then operate the storage devices previously
managed by the failed node. Additionally, this permits
reassignment of storage devices from one node to another, to
balance the nodes' workload.
What is claimed is:
1. A data storage system, comprising:
a multiplicity of data storage devices together providing
an aggregate amount of storage space;
a storage interface coupled to the data storage devices;
an input device;
a controller coupled to the interface and the input device
and including a node programmed to manage allocation
of storage partitions by performing steps comprising:
identifying the aggregate amount of storage space
provided by the data storage devices;
receiving from the input device a request to allocate the
storage space into a number of storage partitions,
said request including for each storage partition a
storage capacity and a device-feature emulation
mode;
in response to the request, creating a map of the storage
space, said map allocating for each storage partition
sufficient storage space from at least one of the data
storage devices to provide the requested storage
capacity; and
operating each storage partition according to its
device-feature emulation mode.
2. The system of claim 1, the node being programmed
such that the creating step comprises the steps of creating a
map of storage space dividing the storage space into a
multiplicity of partition slices having at least one size, said
map allocating for each storage partition a number of
partition slices of like size to provide the requested storage
capacity.
`3. The system of claim 2, each data storage device
`containing portions of multiple partition slices.
`4. The system of claim 1, wherein the controller is
`programmed such that the device-feature emulation mode
`includes for each storage partition a storage type.
`5. The system of claim 4, wherein the device-feature
`emulation mode includes selection of a speed of operation.
`6. The system of claim 1, wherein the controller is
`programmed such that the device-feature emulation mode
`includes for each storage partition a parity level.
`7. The system of claim 6, wherein the parity level includes
`a number of parity devices and a parity rotation.
`8. The system of claim 1, wherein the controller is
`programmed such that the device-feature emulation mode
`includes for each storage partition a data width.
9. The system of claim 8, wherein the data width includes
`a number of data devices and a data rotation.
`10. The system of claim 1, wherein the controller is
`programmed such that the device-feature emulation mode
`includes for each storage parti