submitted to ACM Computing Surveys
RAID: High-Performance, Reliable Secondary Storage

Peter M. Chen
Computer Science and Engineering Division
Department of Electrical Engineering and Computer Science
1301 Beal Avenue
University of Michigan
Ann Arbor, MI 48109-2122

Edward K. Lee
DEC Systems Research Center
130 Lytton Avenue
Palo Alto, CA 94301-1044

Garth A. Gibson
School of Computer Science
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213-3891

Randy H. Katz
Computer Science Division
Department of Electrical Engineering and Computer Science
571 Evans Hall
University of California
Berkeley, CA 94720

David A. Patterson
Computer Science Division
Department of Electrical Engineering and Computer Science
571 Evans Hall
University of California
Berkeley, CA 94720

Abstract: Disk arrays were proposed in the 1980s as a way to use parallelism between multiple disks to improve aggregate I/O performance. Today they appear in the product lines of most major computer manufacturers. This paper gives a comprehensive overview of disk arrays and provides a framework in which to organize current and future work. The paper first introduces disk technology and reviews the driving forces that have popularized disk arrays: performance and reliability. It then discusses the two architectural techniques used in disk arrays: striping across multiple disks to improve performance and redundancy to improve reliability. Next, the paper describes seven disk array architectures, called RAID (Redundant Arrays of Inexpensive Disks) levels 0-6, and compares their performance, cost, and reliability. It goes on to discuss advanced research and implementation topics such as refining the basic RAID levels to improve performance and designing algorithms to maintain data consistency. Last, the paper describes five disk array prototypes or products and discusses future opportunities for research. The paper includes an annotated bibliography of disk array-related literature.

Content indicators: disk array, RAID, parallel I/O, storage, striping, redundancy
CONTENTS

1  INTRODUCTION
2  BACKGROUND
   2.1  Disk Terminology
   2.2  Data Paths
   2.3  Technology Trends
3  DISK ARRAY BASICS
   3.1  Data Striping and Redundancy
   3.2  Basic RAID Organizations
        3.2.1  Non-Redundant (RAID Level 0)
        3.2.2  Mirrored (RAID Level 1)
        3.2.3  Memory-Style ECC (RAID Level 2)
        3.2.4  Bit-Interleaved Parity (RAID Level 3)
        3.2.5  Block-Interleaved Parity (RAID Level 4)
        3.2.6  Block-Interleaved Distributed-Parity (RAID Level 5)
        3.2.7  P+Q Redundancy (RAID Level 6)
   3.3  Performance and Cost Comparisons
        3.3.1  Ground Rules and Observations
        3.3.2  Comparisons
   3.4  Reliability
        3.4.1  Basic Reliability
        3.4.2  System Crashes and Parity Inconsistency
        3.4.3  Uncorrectable Bit-Errors
        3.4.4  Correlated Disk Failures
        3.4.5  Reliability Revisited
        3.4.6  Summary and Conclusions
   3.5  Implementation Considerations
        3.5.1  Avoiding Stale Data
        3.5.2  Regenerating Parity after a System Crash
        3.5.3  Operating with a Failed Disk
        3.5.4  Orthogonal RAID
4  ADVANCED TOPICS
   4.1  Improving Small Write Performance for RAID Level 5
        4.1.1  Buffering and Caching
        4.1.2  Floating Parity
        4.1.3  Parity Logging
   4.2  Declustered Parity
   4.3  Exploiting On-Line Spare Disks
   4.4  Data Striping in Disk Arrays
   4.5  Performance and Reliability Modeling
5  CASE STUDIES
   5.1  Thinking Machines Corporation ScaleArray
   5.2  StorageTek Iceberg 9200 Disk Array Subsystem
   5.3  TickerTAIP/DataMesh
   5.4  The RAID-II Storage Server
   5.5  IBM Hagar Disk Array Controller
6  OPPORTUNITIES FOR FUTURE RESEARCH
   6.1  Experience with Disk Arrays
   6.2  Interaction among New Technologies
   6.3  Scalability, Massively Parallel Computers, and Small Disks
   6.4  Latency
7  CONCLUSIONS
8  ACKNOWLEDGEMENTS
9  ANNOTATED BIBLIOGRAPHY
1 INTRODUCTION

In recent years, interest in RAID, Redundant Arrays of Inexpensive Disks*, has grown explosively. The driving force behind this phenomenon is the sustained exponential improvement in the performance and density of semiconductor technology. Improvements in semiconductor technology make possible faster microprocessors and larger primary memory systems, which in turn require larger, higher-performance secondary storage systems. More specifically, these improvements have both quantitative and qualitative consequences for secondary storage systems.

On the quantitative side, Amdahl's Law [Amdahl67] predicts that large improvements in microprocessors will result in only marginal improvements in overall system performance unless accompanied by corresponding improvements in secondary storage systems. Unfortunately, while RISC microprocessor performance has been improving 50% or more per year [Patterson94, pg. 27], disk access times, which depend on improvements of mechanical systems, have been improving less than 10% per year. Disk transfer rates, which track improvements in both mechanical systems and magnetic media densities, have improved at the faster rate of approximately 20% per year. Assuming that semiconductor and disk technologies continue their current trends, we must conclude that the performance gap between microprocessors and magnetic disks will continue to widen.
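
To make the quantitative argument concrete, the short calculation below evaluates Amdahl's Law for a hypothetical workload; the 10% I/O fraction and the tenfold processor speedup are illustrative assumptions, not figures from this paper.

    # Amdahl's Law: overall speedup when only part of the work is accelerated.
    # The workload split and speedup factor below are hypothetical.

    def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
        """Overall speedup when accelerated_fraction of run time is sped up by factor."""
        return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)

    # Suppose 90% of run time is CPU-bound and 10% is disk I/O.
    # A 10x faster CPU with unchanged disks yields only about a 5.3x overall speedup ...
    print(amdahl_speedup(0.9, 10.0))   # ~5.26

    # ... and no processor improvement alone can push the speedup past 1 / 0.1 = 10x.
    print(amdahl_speedup(0.9, 1e9))    # ~10.0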
In addition to the quantitative effect, a second, perhaps more important, qualitative effect is driving the need for higher-performance secondary storage systems. As microprocessors become faster, they make possible new applications and greatly expand the scope of existing applications. In particular, applications such as video, hypertext, and multi-media are becoming common. Even in existing application areas such as computer-aided design and scientific computing, faster microprocessors make it possible to tackle new problems requiring larger datasets. This shift in applications, along with a trend toward large, shared, high-performance, network-based storage systems, is causing us to reevaluate the way we design and use secondary storage systems.

* Because of the restrictiveness of "Inexpensive", RAID is sometimes said to stand for "Redundant Arrays of Independent Disks".
Disk arrays, which organize multiple independent disks into a large, high-performance logical disk, are a natural solution to the problem. Disk arrays stripe data across multiple disks and access them in parallel to achieve both higher data transfer rates on large data accesses and higher I/O rates on small data accesses. Data striping also results in uniform load balancing across all of the disks, eliminating hot spots that would otherwise saturate a small number of disks while the majority of disks sit idle.
Large disk arrays, however, are highly vulnerable to disk failures; a disk array with a hundred disks is a hundred times more likely to fail than a single disk. An MTTF (mean-time-to-failure) of 200,000 hours, or approximately twenty-three years, for a single disk implies an MTTF of 2,000 hours, or approximately three months, for a disk array with a hundred disks. The obvious solution is to employ redundancy in the form of error-correcting codes to tolerate disk failures. This allows a redundant disk array to avoid losing data for much longer than an unprotected single disk.
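
These array figures follow from the usual approximation for independent, exponentially distributed failures, MTTF(array) = MTTF(disk) / N; the sketch below simply evaluates it for the numbers quoted above.

    # Array MTTF under independent, exponentially distributed disk failures:
    # the array loses data as soon as its first disk fails, so MTTF_array = MTTF_disk / N.

    HOURS_PER_YEAR = 24 * 365

    disk_mttf_hours = 200_000
    print(disk_mttf_hours / HOURS_PER_YEAR)      # ~22.8 years for one disk

    array_mttf_hours = disk_mttf_hours / 100     # 100-disk array
    print(array_mttf_hours)                      # 2000.0 hours
    print(array_mttf_hours / (24 * 30))          # ~2.8 months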
Redundancy, however, has negative consequences. Since all write operations must update the redundant information, the performance of writes in redundant disk arrays can be significantly worse than the performance of writes in non-redundant disk arrays. Also, keeping the redundant information consistent in the face of concurrent I/O operations and system crashes can be difficult.

A number of different data striping and redundancy schemes have been developed. The combinations and arrangements of these schemes lead to a bewildering set of options for users and designers of disk arrays. Each option presents subtle tradeoffs between reliability, performance, and cost that are difficult to evaluate without understanding the alternatives. To address this problem, this paper presents a systematic tutorial and survey of disk arrays. We describe seven basic disk-array organizations along with their advantages and disadvantages and compare their reliability, performance, and cost. We draw attention to the general principles governing the design and configuration of disk arrays as well as practical issues that must be addressed in the implementation of disk arrays. A later section of the paper describes optimizations and variations to the seven basic disk-array organizations. Finally, we discuss existing research in the modeling of disk arrays and fruitful avenues for future research. This paper should be of value to anyone interested in disk arrays, including students, researchers, designers, and users of disk arrays.
2 BACKGROUND

This section provides basic background material on disks, I/O datapaths, and disk technology trends for readers who are unfamiliar with secondary storage systems.

2.1 Disk Terminology

Figure 1 illustrates the basic components of a simplified magnetic disk drive. A disk principally consists of a set of platters coated with a magnetic medium rotating at a constant angular velocity and a set of disk arms with magnetic read/write heads that are moved radially across the platters' surfaces by an actuator. Once the heads are correctly positioned, data is read and written in small arcs called sectors on the platters' surfaces as the platters rotate relative to the heads. Although all heads are moved collectively, in almost every disk drive only a single head can read or write data at any given time. A complete circular swath of data is referred to as a track, and each platter's surface consists of concentric rings of tracks. A vertical collection of tracks at the same radial position is logically referred to as a cylinder. Sectors are numbered so that a sequential scan of all sectors traverses the entire disk in the minimal possible time.
[Figure 1: Disk Terminology. Heads reside on arms, which are positioned by actuators. Tracks are concentric rings on a platter. A sector is the basic unit of reads and writes. A cylinder is a stack of tracks at one actuator position. An HDA (head-disk assembly) is everything in the figure plus the airtight casing. In some devices it is possible to transfer data from multiple surfaces simultaneously, but this is both rare and expensive. The collection of heads that participate in a single logical transfer spread over multiple surfaces is called a head group.]

Given the simplified disk described above, disk service times can be broken into three primary components: seek time, rotational latency, and data transfer time. Seek time is the amount of time needed to move a head to the correct radial position and typically ranges from one to thirty milliseconds, depending on the seek distance and the particular disk. Rotational latency is the amount of time needed for the desired sector to rotate under the disk head; full rotation times for current disks vary from eight to twenty-eight milliseconds. The data transfer time depends on the rate at which data can be transferred to or from a platter's surface and is a function of the platter's rate of rotation, the density of the magnetic media, and the radial distance of the head from the center of the platter (some disks use a technique called zone-bit recording to store more data on the longer outside tracks than on the shorter inside tracks). Typical data transfer rates range from one to five megabytes per second. The seek time and rotational latency are sometimes collectively referred to as the head positioning time. Table 1 tabulates the statistics for a typical high-end disk available in 1993.
Table 1: Specifications for the Seagate ST43401N Elite-3 SCSI Disk Drive.

    Form Factor/Disk Diameter                      5.25 inch
    Capacity                                       2.8 GB
    Cylinders                                      2627
    Tracks Per Cylinder                            21
    Sectors Per Track                              ~99
    Bytes Per Sector                               512
    Full Rotation Time                             11.1 ms
    Minimum Seek (single cylinder)                 1.7 ms
    Average Seek (random cylinder to cylinder)     11.0 ms
    Maximum Seek (full stroke seek)                22.5 ms
    Data Transfer Rate                             ≈ 4.6 MB/s

Average seek in this table is calculated assuming a uniform distribution of accesses. This is the standard way manufacturers report average seek times. In reality, measurements of production systems show that spatial locality significantly lowers the effective average seek distance [Hennessy90, pg. 559].

The slow head positioning time and fast data transfer rate of disks lead to very different performance for a sequence of accesses, depending on the size and relative location of each access. Suppose we need to transfer 1 MB from the disk in Table 1, with the data laid out in one of two ways: sequentially within a single cylinder or randomly placed in 8 KB blocks. In either case the time for the actual data transfer of 1 MB is about 200 ms, but the time for positioning the head grows from about 16 ms in the sequential layout to about 2000 ms in the random layout. This sensitivity to the workload is why applications are categorized as high data rate, meaning minimal head positioning via large, sequential accesses, or high I/O rate, meaning lots of head positioning via small, more random accesses.
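
These rough figures can be reproduced from the parameters in Table 1. The sketch below assumes each head positioning costs one average seek plus half a rotation; that breakdown is our estimate of the arithmetic, not a formula given in the text.

    # Reproduces the rough service-time figures from the Table 1 parameters,
    # assuming each head positioning costs one average seek plus half a rotation.

    TRANSFER_RATE_MB_PER_S = 4.6                 # Table 1: data transfer rate
    AVG_SEEK_MS = 11.0                           # Table 1: average seek
    HALF_ROTATION_MS = 11.1 / 2                  # Table 1: full rotation time / 2
    POSITIONING_MS = AVG_SEEK_MS + HALF_ROTATION_MS

    transfer_ms = 1.0 / TRANSFER_RATE_MB_PER_S * 1000
    print(round(transfer_ms))                    # ~217 ms: the "about 200 ms" transfer time

    print(round(POSITIONING_MS, 1))              # ~16.6 ms: sequential layout positions once

    blocks = 1024 // 8                           # 1 MB as 128 separate 8 KB blocks
    print(round(blocks * POSITIONING_MS))        # ~2118 ms: random layout positions per block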
2.2 Data Paths

A hierarchy of industry-standard interfaces has been defined for transferring data recorded on a disk platter's surface to or from a host computer. In this section we review the complete datapath, from the disk to a user's application (Figure 2). We assume a read operation for the purposes of this discussion.

On the disk platter's surface, information is represented as reversals in the direction of stored magnetic fields. These "flux reversals" are sensed, amplified, and digitized into pulses by the lowest-level read electronics. The protocol ST506/412 is one standard that defines an interface to disk systems at this lowest, most inflexible, and technology-dependent level. Above this level of the read-electronics path, pulses are decoded to separate data bits from timing-related flux reversals. The bit-level ESDI and SMD standards define an interface at this more flexible, encoding-independent level. At the higher, most flexible packet level, these bits are aligned into bytes, error-correcting codes are applied, and the extracted data is delivered to the host as data blocks over a peripheral bus interface such as SCSI (Small Computer Standard Interface) or IPI-3 (the third level of the Intelligent Peripheral Interface). These steps are performed today by intelligent on-disk controllers, which often include speed-matching and caching "track buffers". SCSI and IPI-3 also include a level of data mapping: the computer specifies a logical block number, and the controller embedded on the disk maps that block number to a physical cylinder, track, and sector. This mapping allows the embedded disk controller to avoid bad areas of the disk by remapping the affected logical blocks to new areas of the disk.
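
To illustrate the kind of translation an embedded controller performs, here is a minimal sketch of logical-block-to-physical-address mapping for an idealized geometry; it ignores zone-bit recording and defect remapping, and the constants and function name are ours rather than part of SCSI or IPI-3.

    # Idealized logical-block-number to (cylinder, track, sector) translation,
    # assuming a fixed number of sectors per track and no remapped defects
    # (real embedded controllers handle zoned recording and defect lists as well).

    TRACKS_PER_CYLINDER = 21    # from Table 1
    SECTORS_PER_TRACK = 99      # Table 1 lists ~99; zoned disks vary this per zone

    def block_to_chs(logical_block: int) -> tuple[int, int, int]:
        sector = logical_block % SECTORS_PER_TRACK
        track = (logical_block // SECTORS_PER_TRACK) % TRACKS_PER_CYLINDER
        cylinder = logical_block // (SECTORS_PER_TRACK * TRACKS_PER_CYLINDER)
        return cylinder, track, sector

    print(block_to_chs(0))         # (0, 0, 0)
    print(block_to_chs(123456))    # a block somewhere in the middle of the disk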
The topology and devices on the data path between disk and host computer vary widely depending on the size and type of I/O system. Mainframes have the richest I/O systems, with many devices and complex interconnection schemes to access them. An IBM channel path, the set of cables and associated electronics that transfer data and control information between an I/O device and main memory, consists of a channel, a storage director, and a head of string. The collection of disks that share the same pathway to the head of string is called a string. In the workstation/file server world, the channel processor is usually called an I/O controller or host-bus adaptor (HBA), and the functionality of the storage director and head of string is contained in an embedded controller on the disk drive. As in the mainframe world, the use of high-level peripheral interfaces such as SCSI and IPI-3 allows multiple disks to share a single peripheral bus or string.

[Figure 2: Host-to-Device Pathways. Data that is read from a magnetic disk must pass through many layers on its way to the requesting processor. Each dashed line marks a standard interface. The lower interfaces, such as ST506, deal more closely with the raw magnetic fields and are highly technology dependent. Higher layers, such as SCSI, deal in packets or blocks of data and are more technology independent. A string connects multiple disks to a single I/O controller. The figure shows the path from the CPU through DMA and the I/O controller (host-bus adaptor or channel processor), across packet-level interfaces (IPI-3, SCSI-1, SCSI-2, DEC CI/MSCP) and block-level interfaces (IPI-2, SCSI-1, DEC SDI, IBM Channel Path), to the disk controller/storage director with its track buffers, formatter, and clocking, and finally down through the bit-level (SMD, ESDI) and pulse-level (ST506, ST412) interfaces to the magnetic media.]

From the host-bus adaptor, the data is transferred via direct memory access, over a system bus such as VME, S-Bus, MicroChannel, EISA, or PCI, to the host operating system's buffers. In most operating systems, the CPU then performs a memory-to-memory copy over a high-speed memory bus from the operating system buffers to buffers in the application's address space.
2.3 Technology Trends

Much of the motivation for disk arrays comes from the current trends in disk technology. As Table 2 shows, magnetic disk drives have been improving rapidly by some metrics and hardly at all by other metrics. Smaller distances between the magnetic read/write head and the disk surface, more accurate positioning electronics, and more advanced magnetic media have dramatically increased the recording density on the disks. This increased density has improved disks in two ways. First, it has allowed disk capacities to stay constant or increase, even while disk sizes have decreased from 5.25" in 1983 to 3.5" in 1985 to 2.5" in 1991 to 1.8" in 1992 to 1.3" in 1993. Second, the increased density, along with an increase in the rotational speed of the disk, has made possible a substantial increase in the transfer rate of disk drives. Seek times, however, have improved very little, decreasing from approximately 20 ms in 1980 to 10 ms today. Rotational speeds have increased at a similar rate, from 3600 revolutions per minute in 1980 to 5400-7200 today.

Table 2: Trends in Disk Technology.

    Metric                           1993                        Historical Rate of Improvement
    Areal Density                    50-150 Mbits/sq. inch       27% per year
    Linear Density                   40,000-60,000 bits/inch     13% per year
    Inter-Track Density              1,500-3,000 tracks/inch     10% per year
    Capacity (3.5" form factor)      100-2000 MB                 27% per year
    Transfer Rate                    3-4 MB/s                    22% per year
    Seek Time                        7-20 ms                     8% per year

Magnetic disks are improving rapidly in density and capacity, but more slowly in performance. Areal density is the recording density per square inch of magnetic media. In 1989, IBM demonstrated a 1 Gbit/sq. inch density in a laboratory environment. Linear density is the number of bits written along a track. Inter-track density refers to the number of concentric tracks on a single platter.
3 DISK ARRAY BASICS

This section examines basic issues in the design and implementation of disk arrays. In particular, we examine the concepts of data striping and redundancy; basic RAID organizations; performance and cost comparisons between the basic RAID organizations; the reliability of RAID-based systems in the face of system crashes, uncorrectable bit-errors, and correlated disk failures; and finally, issues in the implementation of block-interleaved, redundant disk arrays.
3.1 Data Striping and Redundancy

Redundant disk arrays employ two orthogonal concepts: data striping for improved performance and redundancy for improved reliability. Data striping transparently distributes data over multiple disks to make them appear as a single fast, large disk. Striping improves aggregate I/O performance by allowing multiple I/Os to be serviced in parallel. There are two aspects to this parallelism. First, multiple independent requests can be serviced in parallel by separate disks. This decreases the queueing time seen by I/O requests. Second, single multiple-block requests can be serviced by multiple disks acting in coordination. This increases the effective transfer rate seen by a single request. The more disks in the disk array, the larger the potential performance benefits. Unfortunately, a large number of disks lowers the overall reliability of the disk array, as mentioned before. Assuming independent failures, 100 disks collectively have only 1/100th the reliability of a single disk. Thus, redundancy is necessary to tolerate disk failures and allow continuous operation without data loss.
We will see that the majority of redundant disk array organizations can be distinguished based on the granularity of data interleaving and the method and pattern in which the redundant information is computed and distributed across the disk array. Data interleaving can be characterized as either fine-grained or coarse-grained. Fine-grained disk arrays conceptually interleave data in relatively small units such that all I/O requests, regardless of their size, access all of the disks in the disk array. This results in very high data transfer rates for all I/O requests but has the disadvantage that only one logical I/O request can be in service at any given time and all disks must waste time positioning for every request. Coarse-grained disk arrays interleave data in relatively large units so that small I/O requests need access only a small number of disks while large requests can access all the disks in the disk array. This allows multiple small requests to be serviced simultaneously while still allowing large requests the benefits of using all the disks in the disk array.
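
As a concrete example of coarse-grained (block-interleaved) data striping, the sketch below maps a logical block to a disk and an offset within that disk using a simple round-robin layout; the layout and names are a simplification of ours, not a scheme prescribed here.

    # Round-robin block-interleaved (coarse-grained) striping across N disks:
    # consecutive logical blocks land on consecutive disks, so a large request
    # touches every disk while a single-block request touches only one.

    NUM_DISKS = 4

    def stripe_map(logical_block: int) -> tuple[int, int]:
        """Return (disk index, block offset within that disk)."""
        return logical_block % NUM_DISKS, logical_block // NUM_DISKS

    for block in range(8):
        print(block, stripe_map(block))
    # Blocks 0-3 map to disks 0-3 at offset 0; blocks 4-7 map to disks 0-3 at offset 1.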
The incorporation of redundancy in disk arrays brings up two somewhat orthogonal problems. The first problem is selecting the method for computing the redundant information. Most redundant disk arrays today use parity, but some use Hamming codes or Reed-Solomon codes. The second problem is that of selecting a method for distributing the redundant information across the disk array. Although there are an unlimited number of patterns in which redundant information can be distributed, we roughly classify these patterns into two different distribution schemes: those that concentrate redundant information on a small number of disks and those that distribute redundant information uniformly across all of the disks. Schemes that uniformly distribute redundant information are generally more desirable because they avoid the hot spots and other load balancing problems suffered by schemes that do not. Although data striping and redundancy are conceptually simple, selecting between the many possible data striping and redundancy schemes involves complex tradeoffs among reliability, performance, and cost.
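
Because parity is the most common choice, the following sketch shows the core mechanism for one parity-protected stripe: the parity block is the bitwise XOR of the data blocks, and any single lost block can be regenerated by XORing the survivors. The block size and helper names are ours; real arrays operate on full disk sectors.

    # Parity for one stripe: parity = d0 XOR d1 XOR ... XOR d(n-1).
    # Any single lost block (data or parity) is regenerated by XORing the survivors,
    # which is how a parity-protected array survives one disk failure.

    from functools import reduce

    def xor_blocks(blocks):
        """Bitwise XOR of equal-length byte blocks."""
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    data_blocks = [bytes([i] * 8) for i in (1, 2, 3, 4)]   # four toy 8-byte blocks
    parity = xor_blocks(data_blocks)

    # Simulate losing data block 2 and rebuilding it from the other blocks plus parity.
    survivors = [b for i, b in enumerate(data_blocks) if i != 2] + [parity]
    assert xor_blocks(survivors) == data_blocks[2]
    print("reconstruction succeeded")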
3.2 Basic RAID Organizations

This section describes the basic RAID (Redundant Arrays of Inexpensive Disks) organizations, which will be used as the basis for further examinations of the performance, cost, and reliability of disk arrays. In addition to presenting RAID levels 1 through 5, which first appeared in the landmark paper by Patterson, Gibson, and Katz [Patterson88], we present two other RAID organizations, RAID levels 0 and 6, which have since become generally accepted**. For the benefit of those unfamiliar with the original numerical classification of RAID, we will use English phrases in preference to the numerical classifications. It should come as no surprise to the reader that even the original authors have sometimes been confused as to the disk array organization referred to by a particular RAID level! Figure 3 schematically illustrates the seven RAID organizations.

** Strictly speaking, RAID level 0 is not a type of redundant array of inexpensive disks since it stores no error-correcting codes.
3.2.1 Non-Redundant (RAID Level 0)

The non-redundant disk array, or RAID level 0, has the lowest cost of any redundancy scheme because it does not employ redundancy at all. This scheme offers the best write performance since it never needs to update redundant information. Surprisingly, it does not have the best read performance. Redundancy schemes such as mirroring, which duplicate data, can perform better on reads by selectively scheduling requests on the disk with the shortest expected seek and rotational delays [Bitton88]. Without redundancy, any single disk failure will result in data loss. Non-redundant disk arrays are widely used in supercomputing environments where performance and capacity, rather than reliability, are the primary concerns.
3.2.2 Mirrored (RAID Level 1)

The traditional solution, called mirroring or shadowing, uses twice as many disks as a non-redundant disk array [Bitton88]. Whenever data is written to a disk, the same data is also written to a redundant disk, so that there are always two copies of the information. When data is read, it can be retrieved from the disk with the shorter queueing, seek, and rotational delays [Chen90a]. If a disk fails, the second copy is used to service requests. Mirroring is frequently used in database applications where availability and transaction rate are more important than storage efficiency [Gray90].
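
A minimal sketch of the scheduling idea behind mirroring is shown below: every write updates both copies, and each read is routed to whichever copy currently looks cheaper. Using queue length as the cost estimate is our simplification; [Chen90a] also considers seek and rotational position.

    # Mirrored (RAID level 1) scheduling sketch: every write updates both copies;
    # each read goes to the replica expected to respond sooner. Queue length is
    # used here as a crude stand-in for the queueing, seek, and rotational estimate.

    class MirroredPair:
        def __init__(self):
            self.queues = [[], []]            # pending operations per copy

        def write(self, block):
            for queue in self.queues:         # both copies must be updated
                queue.append(("write", block))

        def read(self, block):
            target = min((0, 1), key=lambda i: len(self.queues[i]))
            self.queues[target].append(("read", block))
            return target                     # which copy services the read

    pair = MirroredPair()
    pair.write(7)
    print(pair.read(7))                       # whichever copy is currently less loaded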
[Figure 3: RAID Levels 0 Through 6. All RAID levels are illustrated at a user capacity of four disks. Disks with multiple platters indicate block-level striping, while disks without multiple platters indicate bit-level striping. The shaded platters represent redundant information. The seven panels show, in order: Non-Redundant (RAID Level 0), Mirrored (RAID Level 1), Memory-Style ECC (RAID Level 2), Bit-Interleaved Parity (RAID Level 3), Block-Interleaved Parity (RAID Level 4), Block-Interleaved Distributed-Parity (RAID Level 5), and P+Q Redundancy (RAID Level 6).]
3.2.3 Memory-Style ECC (RAID Level 2)

Memory systems have provided recovery from failed components with much less cost than mirroring by using Hamming codes [Peterson72]. Hamming codes contain parity for distinct overlapping subsets of components. In one version of this sc
