SnapMirror: File System Based Asynchronous Mirroring for Disaster Recovery

Hugo Patterson, Stephen Manley, Mike Federwisch, Dave Hitz, Steve Kleiman, Shane Owara

Network Appliance Inc.
Sunnyvale, CA
{hugo, stephen, mikef, hitz, srk, owara}@netapp.com

Abstract

Computerized data has become critical to the survival of an enterprise. Companies must have a strategy for recovering their data should a disaster such as a fire destroy the primary data center. Current mechanisms offer data managers a stark choice: rely on affordable tape but risk the loss of a full day of data and face many hours or even days to recover, or have the benefits of a fully synchronized on-line remote mirror, but pay steep costs in both write latency and network bandwidth to maintain the mirror. In this paper, we argue that asynchronous mirroring, in which batches of updates are periodically sent to the remote mirror, can let data managers find a balance between these extremes. First, by eliminating the write latency issue, asynchrony greatly reduces the performance cost of a remote mirror. Second, by storing up batches of writes, asynchronous mirroring can avoid sending deleted or overwritten data and thereby reduce network bandwidth requirements. Data managers can tune the update frequency to trade network bandwidth against the potential loss of more data. We present SnapMirror, an asynchronous mirroring technology that leverages file system snapshots to ensure the consistency of the remote mirror and optimize data transfer. We use traces of production filers to show that even updating an asynchronous mirror every 15 minutes can reduce data transferred by 30% to 80%. We find that exploiting file system knowledge of deletions is critical to achieving any reduction for no-overwrite file systems such as WAFL and LFS. Experiments on a running system show that using file system metadata can reduce the time to identify changed blocks from minutes to seconds compared to purely logical approaches. Finally, we show that using SnapMirror to update every 30 minutes increases the response time of a heavily loaded system by only 22%.

1 Introduction

As reliance on computerized data storage has grown, so too has the cost of data unavailability. A few hours of downtime can cost from thousands to millions of dollars, depending on the size of the enterprise and the role of the data. With increasing frequency, companies are instituting disaster recovery plans to ensure appropriate data availability in the event of a catastrophic failure or disaster that destroys a site (e.g. flood, fire, or earthquake). It is relatively easy to provide redundant server and storage hardware to protect against the loss of physical resources. Without the data, however, the redundant hardware is of little use.

[SnapMirror, NetApp, and WAFL are registered trademarks of Network Appliance, Inc.]

The problem is that current strategies for data protection and recovery either offer inadequate protection or are too expensive in performance and/or network bandwidth. Tape backup and restore is the traditional approach. Although favored for its low cost, restoring from a nightly backup is too slow and the restored data is up to a day old. Remote synchronous and semi-synchronous mirroring are more recent alternatives. Mirrors keep backup data on-line and fully synchronized with the primary store, but they do so at a high cost in performance (write latency) and network bandwidth. Semi-synchronous mirrors can reduce the write-latency penalty, but can result in inconsistent, unusable data unless write ordering across the entire data set, not just within one storage device, is guaranteed. Data managers are forced to choose between two extremes: synchronized at great expense, or affordable with a day of data loss.

In this paper, we show that by letting a mirror volume lag behind the primary volume it is possible to substantially reduce the performance and network costs of maintaining a mirror while bounding the amount of data loss. The greater the lag, the greater the data loss, but the cheaper the cost of maintaining the mirror. Such asynchronous mirrors let data managers tune their systems to strike the right balance between potential data loss and cost.

We present SnapMirror, a technology which implements asynchronous mirrors on Network Appliance filers. SnapMirror periodically transfers self-consistent snapshots of the data from a source volume to the destination volume. The mirror is on-line, so disaster recovery can be instantaneous.
Users set the update frequency. If the update frequency is high, the mirror will be nearly current with the source and very little data will be lost when disaster strikes. But by lowering the update frequency, data managers can reduce the performance and network cost of maintaining the mirror, at the risk of increased data loss.

There are three main problems in maintaining an asynchronous mirror. First, for each periodic transfer, the system must determine which blocks need to be transferred to the mirror. To obtain the bandwidth reduction benefits of asynchrony, the system must avoid transferring data which is overwritten or deleted. Second, if the source volume fails at any time, the destination must be ready to come on line. In particular, a half-completed transfer cannot leave the destination in an unusable state. Effectively, this means that the destination must be in, or at least recoverable to, a self-consistent state at all times. Finally, for performance, disk reads on the source and writes on the destination must be efficient.

In this paper, we show how SnapMirror leverages the internal data structures of NetApp's WAFL® file system [Hitz94] to solve these problems. SnapMirror leverages the active block maps in WAFL's snapshots to quickly identify changed blocks and avoid transferring deleted blocks. Because SnapMirror transfers self-consistent snapshots of the file system, the remote mirror is always guaranteed to be in a consistent state. New updates appear atomically. Finally, because it operates at the block level, SnapMirror is able to optimize its data reads and writes.

We show that SnapMirror's periodic updates transfer much less data than synchronous block-level mirrors. Update intervals as short as 1 minute are sufficient to reduce data transfers by 30% to 80%. The longer the period between updates, the less data needs to be transferred. SnapMirror allows data managers to optimize the tradeoff of data currency against cost for each volume.

In this paper, we explore the interaction between asynchronous mirroring and no-overwrite file systems such as LFS [Rosenblum92] and WAFL. We find that asynchronous block-level mirroring of these file systems does not transfer less data than synchronous mirroring. Because these file systems do not update in place, logical overwrites become writes to new storage blocks. To gain the data reduction benefits of asynchrony for these file systems, it is necessary to have knowledge of which blocks are active and which have been deallocated and are no longer needed. This is an important observation since many commercial mirroring products are implemented at the block level.

1.1 Outline for remainder of paper

We start, in Section 1.2, with a discussion of the requirements for disaster recovery. We go on in Sections 1.3 and 1.4 to discuss the shortcomings of tape-based recovery and synchronous remote mirroring. In Section 2, we review related work. We present the design and implementation of SnapMirror in Section 3. In Section 4, we use system traces to study the data reduction benefits of asynchronous mirroring with file system knowledge. Then, in Section 5, we compare SnapMirror to asynchronous mirroring at the logical file level. Section 6 presents experiments measuring the performance of our SnapMirror implementation running on a loaded system. Conclusions, acknowledgments, and references are in Sections 7, 8, and 9.

1.2 Requirements for Disaster Recovery

Disaster recovery is the process of restoring access to a data set after the original was destroyed or became unavailable. Disasters should be rare, but data unavailability must be minimized. Large enterprises are asking for disaster recovery techniques that meet the following requirements:

Recover quickly. The data should be accessible within a few minutes after a failure.

Recover consistently. The data must be in a consistent state so that the application does not fail during the recovery attempt because of a corrupt data set.

Minimal impact on normal operations. The performance impact of a disaster recovery technique should be minimal during normal operations.

Up to date. If a disaster occurs, the recovered data should reflect the state of the original system as closely as possible. Loss of a day or more worth of updates is not acceptable in many applications.

Unlimited distance. The physical separation between the original and recovered data should not be limited. Companies may have widely separated sites, and the scope of disasters such as earthquakes or hurricanes may require hundreds of miles of separation.

Reasonable cost. The solution should not require excessive cost, such as many high-speed, long-distance links (e.g. direct fiber optic cable). Preferably, the link should be compatible with WAN technology.

1.3 Recovering from Off-line Data

Traditional disaster recovery strategies involve loading a saved copy of the data from tape onto a new server in a different location. After a disaster, the most recent full backup tapes are loaded onto the new server. A series of nightly incremental backups may follow the full backup to bring the recovered volume as up-to-date as possible. This worked well when file systems were of moderate size and when the cost of a few hours of downtime was acceptable, provided such events were rare.

Today, companies are taking advantage of the 60% compound annual growth rate in disk drive capacity [Grochowski96], and file system size is growing rapidly. Terabyte storage systems are becoming commonplace. Even with the latest image dump technologies [Hutchinson99], data can only be restored at a rate of 100-200 GB/hour. If disaster strikes a terabyte file system, it will be offline for at least 5-10 hours if tape-based recovery technologies are used. This is unacceptable in many environments.

Will technology trends solve this problem over time? Unfortunately, the trends are against us. Although disk capacities are growing 60% per year, disk transfer rates are growing at only 40% per year [Grochowski96]. It is taking more, not less, time to fill a disk drive even in the best case of a purely sequential data stream. In practice, even image restores are not purely sequential and achieved disk bandwidth is less than the sequential ideal. To achieve timely disaster recovery, data must be kept on-line and ready to go.

1.4 Remote Mirroring

Synchronous remote mirroring immediately copies all writes to the primary volume to a remote mirror volume. The original transfer is not acknowledged until the data is written to both volumes. The mirror gives the user a second identical copy of the data to fall back on if the primary file system fails. In many cases, both copies of the data are also locally protected by RAID.

The down side of synchronous remote mirroring is that it can add a lot of latency to I/O write operations. Slower I/O writes slow down the server writing the data. The extra latency results first from serialization and transmission delays in the network link to the remote mirror. Longer distances can bloat response time to unacceptable levels. Second, unless there is a dedicated high-speed line to the remote mirror, network congestion and bandwidth limitations will further reduce performance. For these reasons, most synchronous mirroring implementations limit the distance to the remote mirror to 40 kilometers or less.

Because of its performance limitations, synchronous mirroring implementations sometimes slightly relax strict synchrony, allowing a limited number of source I/O operations to proceed before waiting for acknowledgment of receipt from the remote site [1]. Although this approach can reduce I/O latency, it does not reduce the link bandwidth needed to keep up with the writes. Further, the improved performance comes at the cost of some potential data loss in the event of a disaster.

A major challenge for non-synchronous mirroring is ensuring the consistency of the remote data. If writes arrive out-of-order at the remote site, the remote copy of the data may appear corrupted to an application trying to use the data after a disaster. If this occurs, the remote mirroring will have been useless, since a full restore from tape will probably be required to bring the application back on line. The problem is especially difficult when a single data set is spread over multiple devices and the mirroring is done at the device level. Although each device guarantees in-order delivery of its data, there may be no ordering guarantees among the devices. In a rolling disaster, one in which devices fail over a period of time (imagine fire spreading from one side of the data center to the other), the remote site may receive data from some devices but not others. Therefore, whenever synchrony is relaxed, it is important that it be coordinated at a high enough level to ensure data consistency at the remote site.

Another important issue is keeping track of the updates required on the remote mirror should it or the link between the two systems become unavailable. Once the modification log on the primary system is filled, the primary system usually abandons keeping track of individual modifications and instead keeps track of updated regions. When the destination again becomes available, the regions are transferred. Of course, the destination file system may be inconsistent while this transfer is taking place, since file system ordering rules may be violated, but it is better than starting from scratch.

2 Related Work

There are other ways to provide disaster recovery besides restore from tape and synchronous mirroring. One is server replication.

Server replication is another approach to providing high availability. Coda is one example of a replicated file system [Kistler93]. In Coda, the clients of a file server are responsible for writing to multiple servers. This approach is essentially synchronous logical-level mirroring. By putting the responsibility for replication on the clients, Coda effectively off-loads the servers. And, because clients are aware of the multiple servers, recovery from the loss of a server is essentially instantaneous. However, Coda is not designed for replication over a WAN. If the WAN connecting a client to a remote server is slow or congested, the client will feel a significant performance impact. Another difference is that where Coda leverages client-side software, SnapMirror's goal is to provide disaster recovery for the file servers without client-side modifications.

[1] EMC's SRDF™ in semi-synchronous mode or Storage Computer's Omniforce® in log synchronous mode.

Earlier, we mentioned that SnapMirror leverages file system metadata to detect new data since the last update of the mirror. But there are many other approaches.

At the logical file system level, the most common approach is to walk the directory structure checking the time that files were last updated. For example, the UNIX dump utility compares file modify times to the time of the last dump to determine which files it should write to an incremental dump tape. Other examples of detecting new data at the logical level include programs like rdist and rsync [Tridgell96]. These programs traverse both the source and destination file systems, looking for files that have been more recently modified on the source than the destination. The rdist program will only transfer whole files. If one byte is changed in a large database file, the entire file will be transferred. The rsync program works to compute a minimal range of bytes that need be transferred by comparing checksums of byte ranges. It uses CPU resources on the source server to reduce network traffic. Compared to these programs, SnapMirror does not need to traverse the entire file system or compute checksums to determine the block differences between the source and destination. On the other hand, SnapMirror needs to be tightly integrated with the file system, whereas approaches which operate at the logical level are more general.

Another approach to mirroring, adopted by databases such as Oracle, is to write a time-stamp in a header in each on-disk data block. The time-stamp enables Oracle to determine if a block needs to be backed up by looking only at the relatively small header. This can save a lot of time compared to approaches which must perform checksums on the contents of each block. But it still requires each block to be scanned. In contrast, SnapMirror uses file system data structures as an index to detect updates. The total amount of data examined is similar in the two cases, but the file system structures are stored more densely, and consequently the number of blocks that must be read from disk is much smaller.

3 SnapMirror Design and Implementation

SnapMirror is an asynchronous mirroring package currently available on Network Appliance file servers. Its design goal was to meet the data protection needs of large-scale systems. It provides a read-only, on-line replica of a source file system. In the event of disaster, the replica can be made writable, replacing the original source file system.

Periodically, SnapMirror reflects changes in the source volume to the destination volume. It replicates the source at the block level, but uses file system knowledge to limit transfers to blocks that are new or modified and that are still allocated in the file system. SnapMirror does not transfer blocks which were written but have since been overwritten or deallocated.

Each time SnapMirror updates the destination, it takes a new snapshot of the source volume. To determine which blocks need to be sent to the destination, it compares the new snapshot to the snapshot from the previous update. The destination jumps forward from one snapshot to the next when each transfer is completed. Effectively, the entire update is atomically applied to the destination volume. Because the source snapshots always contain a self-consistent, point-in-time image of the entire volume or file system, and these snapshots are applied atomically to the destination, the destination always contains a self-consistent, point-in-time image of the volume. SnapMirror thus solves the problem of ensuring destination data consistency even though updates are asynchronous, not all writes are transferred, and ordering among individual writes cannot be maintained.

The system administrator sets SnapMirror's update frequency to balance the impact on system performance against the lag time of the mirror.

3.1 Snapshots and the Active Map File

SnapMirror's advantages lie in its knowledge of the Write Anywhere File Layout (WAFL) file system and its snapshot feature [Hitz94], which runs on Network Appliance's file servers. WAFL is designed to have many of the same advantages as the Log-Structured File System (LFS) [Rosenblum92]. It collects file system block modification requests and then writes them to an unused group of blocks. WAFL's block allocation policy is able to fit new writes in among previously allocated blocks, and thus it avoids the need for segment cleaning. WAFL also stores all metadata in files, like the Episode file system [Chutani92]. This allows updates to write metadata anywhere on disk, in the same manner as regular file blocks.

WAFL's on-disk data structure is a tree that points to all data and metadata. The root of the tree is called the fsinfo block. A complete and consistent version of the file system can be reached from the information in this block. The fsinfo block is the only exception to the no-overwrite policy. Its update protocol is essentially a database-like transaction; the rest of the file system image must be consistent whenever a new fsinfo block overwrites the old. This ensures that partial writes will never corrupt the file system.

It is easy to preserve a consistent image of a file system, called a snapshot, at any point in time, by simply saving a copy of the information in the fsinfo block and then making sure the blocks that comprise the file system image are not reallocated. Snapshots share the block data that remains unmodified with the active file system; modified data are written out to unallocated blocks. A snapshot image can be accessed through a pointer to the saved fsinfo block.

WAFL maintains the block allocations for each snapshot in its own active map file. The active map file is an array with one allocation bit for every block in the volume. When a snapshot is taken, the current state of the active file system's active map file is frozen in the snapshot just like any other file. WAFL will not reallocate a block unless the allocation bit for the block is cleared in every snapshot's active map file. To speed block allocations, a summary active map file maintains, for each block, the logical-OR of the allocation bits in all the snapshot active map files.
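
To make the snapshot and active-map machinery concrete, the following is a minimal sketch in Python. The Snapshot class and the take_snapshot, summary_map, and may_reallocate helpers are illustrative names of our own; this models the rules described above, not WAFL's actual implementation.

    class Snapshot:
        def __init__(self, name, active_map):
            self.name = name
            self.active_map = active_map  # one allocation bit per volume block

    def take_snapshot(name, active_fs_map):
        # A snapshot freezes the current active map (alongside the saved
        # fsinfo block); unmodified blocks stay shared with the live file system.
        return Snapshot(name, list(active_fs_map))

    def summary_map(snapshots, nblocks):
        """Logical-OR of all snapshot active maps, as in the summary
        active map file described above."""
        summary = [0] * nblocks
        for snap in snapshots:
            for blk, bit in enumerate(snap.active_map):
                summary[blk] |= bit
        return summary

    def may_reallocate(blk, active_fs_map, summary):
        # A block is reusable only if neither the active file system
        # nor any snapshot still has its allocation bit set.
        return not active_fs_map[blk] and not summary[blk]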

3.2 SnapMirror Implementation

Snapshots and the active map file provide a natural way to find block-level differences between two instances of a file system image. SnapMirror also uses such block-level information to perform efficient block-level transfers. Because the mirror is a block-by-block replica of the source, it is easy to turn it into a primary file server for users, should disaster befall the source.

3.2.1 Initializing the Mirror

The destination triggers SnapMirror updates. The destination initiates the mirror relationship by requesting an initial transfer from the source. The source responds by taking a base reference snapshot and then transferring all the blocks that are allocated in that or any earlier snapshot, as specified in the snapshots' active map files. Thus, after initialization, the destination will have the same set of snapshots as the source. The base snapshot serves two purposes: first, it provides a reference point for the first update; second, it provides a static, self-consistent image which is unaffected by writes to the active file system during the transfer. A sketch of this block selection follows.
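
As a rough illustration, and reusing the toy Snapshot class above (our names, not SnapMirror's internals), the initial transfer's block selection might look like this:

    def initial_transfer_blocks(base_and_earlier_snaps):
        # Select every block allocated in the base reference snapshot
        # or any earlier snapshot, per their active map files.
        send = set()
        for snap in base_and_earlier_snaps:
            for blk, bit in enumerate(snap.active_map):
                if bit:
                    send.add(blk)
        # Sorted order lets the source read, and the destination write,
        # blocks in a near-sequential, disk-friendly order.
        return sorted(send)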

The destination system writes the blocks to the same logical location in its storage array. All the blocks in the array are logically numbered from 1 to N on both the source and the destination, so the source and destination array geometries need not be identical. However, because WAFL optimizes block layout for the underlying array geometry, SnapMirror performance is best when the source and destination geometries match and the optimizations apply equally well to both systems. When the block transfers complete, the destination writes its new fsinfo block.

3.2.2 Block-Level Differences and Update Transfers

Part of the work involved in any asynchronous mirroring technique is to find the changes that have occurred in the primary file system and make the same changes in another file system. Not surprisingly, SnapMirror uses WAFL's active map file and reference snapshots to do this, as shown in Figure 1.

When a mirror has an update scheduled, it sends a message to the source. The source takes an incremental reference snapshot and compares the allocation bits in the active map files of the base and incremental reference snapshots. The active map file comparison follows these rules (a code sketch appears after the list):

If the block is not allocated in either active map, it is unused. The block is not transferred. It did not exist in the old file system image, and is not in use in the new one. Note that it could have been allocated and deallocated between the last update and the current one.

If the block is allocated in both active maps, it is unchanged. The block is not transferred. By the file system's no-overwrite policy, this block's data has not changed. It could not have been overwritten, since the old reference snapshot keeps the block from being reallocated.

If the block is only allocated in the base active map, it has been deleted. The block is not transferred. The data it contained has either been deleted or changed.

If the block is only allocated in the incremental active map, it has been added. The block is transferred. This means that the data in this block is either new or an updated version of an old block.
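
In code, the four rules reduce to a two-bit case analysis per block. This is a minimal sketch under the same toy bit-list representation as above; classify_block and update_transfer_blocks are our illustrative names, not SnapMirror internals.

    def classify_block(blk, base_map, incr_map):
        """Apply the four comparison rules to one block; return the
        case name and whether the block must be transferred."""
        in_base, in_incr = base_map[blk], incr_map[blk]
        if not in_base and not in_incr:
            return "unused", False     # never existed, or came and went between updates
        if in_base and in_incr:
            return "unchanged", False  # no-overwrite policy: same data, still allocated
        if in_base:
            return "deleted", False    # deallocated since the base snapshot
        return "added", True           # new or updated data: send it

    def update_transfer_blocks(base_map, incr_map):
        # The two active map files are the only inputs needed to plan a transfer.
        return [blk for blk in range(len(base_map))
                if classify_block(blk, base_map, incr_map)[1]]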

Note that SnapMirror does not need to understand whether a transferred block is user data or file system metadata. All it has to know is that the block is new to the file system since the last transfer and therefore should be transferred. In particular, block de-allocations automatically get propagated to the mirror, because the updated blocks of the active map file are transferred along with all the other blocks.

In practice, SnapMirror transfers the blocks for all existing snapshots that were created between the base and incremental reference snapshots. If a block is newly allocated in the active maps of any of these snapshots, then it is transferred. Otherwise, it is not. Thus, the destination has a copy of all of the source's snapshots.

[Figure 1 (diagram): three columns of volume blocks 100-106 illustrating the Initial Transfer, the File System Changes relating the base reference snapshot, incremental reference snapshot, and active file system, and the Update Transfer, in which the active map comparison labels each block added (transferred), deleted (not transferred), unchanged (not transferred), or unused (not transferred).]

Figure 1. SnapMirror's use of snapshots to identify blocks for transfer. SnapMirror uses a base reference snapshot as a point of comparison on the source and destination filers. The first such snapshot is used for the initial transfer. File system changes cause the base snapshot and the active file system to diverge (C is overwritten with C', A is deleted, E is added). Snapshots and the active file system share unchanged blocks. When it is time for an update transfer, SnapMirror takes a new incremental reference snapshot and then compares the snapshot active maps according to the rules in the text to determine which blocks need to be transferred to the destination. After a successful update, SnapMirror deletes the old base snapshot and the incremental becomes the new base.

At the end of each transfer the fsinfo block is updated, which brings the user's view of the file system up to date with the latest transfer. The base reference snapshot is deleted from the source, and the incremental reference snapshot becomes the new base. Essentially, the file system updates are written into unused blocks on the destination, and then the fsinfo block is updated to refer to this new version of the file system, which is already in place. A sketch of this commit sequence follows.
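
The sketch below illustrates that commit sequence with hypothetical destination-side helpers (write_block, write_fsinfo, and delete_snapshot are our stand-ins, not actual NetApp interfaces); the point is the ordering: every data block lands before the single root switch.

    def commit_update(dest, transferred_blocks, new_fsinfo, base_snap, incr_snap):
        # 1. Write every transferred block into its (unused) logical location.
        for blk, data in transferred_blocks:
            dest.write_block(blk, data)
        # 2. Only now overwrite the fsinfo root: readers atomically see the
        #    new, fully consistent file system image.
        dest.write_fsinfo(new_fsinfo)
        # 3. Rotate reference snapshots: the incremental becomes the new base.
        dest.delete_snapshot(base_snap)
        return incr_snap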

3.2.3 Disaster Recovery and Aborted Transfers

Because a new fsinfo block (the root of the file system tree structure) is not written until all blocks are transferred, SnapMirror guarantees a consistent file system on the mirror at any time. The destination file system is accessible in a read-only state throughout the whole SnapMirror process. At any point, its active file system replicates the active map and fsinfo block of the last reference snapshot generated by the source. Should a disaster occur, the destination can be brought immediately into a writable state.

The destination can abandon any transfer in progress in response to a failure at the source end or a network partition. The mirror is left in the same state as it was before the transfer started, since the new fsinfo block is never written. Because all data is consistent with the last completed round of transfers, the mirror can be reestablished when both systems are available again by finding the most recent common SnapMirror snapshot on both systems and using that as the base reference snapshot.
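
Re-establishing the mirror thus reduces to a search for the newest shared snapshot. A minimal sketch, assuming each side can list its SnapMirror snapshots newest-first by name (find_common_base is our name for this step):

    def find_common_base(source_snaps, dest_snaps):
        """Return the newest snapshot present on both sides, or None if
        the mirror must be re-initialized from scratch."""
        dest_names = {s.name for s in dest_snaps}
        for snap in source_snaps:        # assumed ordered newest to oldest
            if snap.name in dest_names:
                return snap              # resume update transfers from this base
        return None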

3.2.4 Update Scheduling and Transfer Rate Throttling

The destination file server controls the frequency of updates through how often it requests a transfer from the source. System administrators set the frequency through a cron-like schedule. If a transfer is in progress when another scheduled time is reached, the next transfer will start when the current transfer is complete. SnapMirror also allows the system administrator to throttle the rate at which a transfer is done. This prevents a flood of data transfers from overwhelming the disks, CPU, or network during an update.
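
As one way to picture such a throttle (purely illustrative; the paper does not describe SnapMirror's actual pacing mechanism), a sender might space block transmissions to cap the average transfer rate:

    import time

    def send_throttled(blocks, send_fn, max_bytes_per_sec, block_size=4096):
        """Pace calls to send_fn so the transfer averages at most
        max_bytes_per_sec; a simple sleep-based throttle sketch."""
        interval = block_size / max_bytes_per_sec   # target seconds per block
        for blk in blocks:
            start = time.monotonic()
            send_fn(blk)
            elapsed = time.monotonic() - start
            if elapsed < interval:
                time.sleep(interval - elapsed)      # absorb leftover time budget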
3.3 SnapMirror Advantages and Limitations

SnapMirror meets the emerging requirements for data recovery by using asynchrony and combining file system knowledge with block-level transfers.

Because the mirror is on-line and in a consistent state at all phases of the relationship, the data is available during the mirrored relationship in a read-only capacity. Clients of the destination file system will see new updates atomically appear. If they prefer to access a stable image of the data, they can access one of the snapshots on the destination. The mirror can be brought into a writable state immediately, making disaster recovery extremely quick.

The schedule-based updates mean that SnapMirror has as much or as little impact on operations as the system administrator allows. The tunable lag also means that the administrator controls how up to date the mirror is. Under most loads, SnapMirror can reasonably transmit to the mirror many times in one hour.

SnapMirror works over a TCP/IP connection that uses standard network links. Thus