throbber
(12) United States Patent
`Gold et al.
`
`111111
`
`1111111111111111111111111111111111111111111111111111111111111
`US006785786Bl
`
`(10) Patent No.:
`(45) Date of Patent:
`
`US 6,785,786 Bl
`Aug. 31,2004
`
`(54) DATA BACKUP AND RECOVERY SYSTEMS
`
`(75)
`
`Inventors: Stephen Gold, Bristol (GB); Jon
`Bathie, Bristol (GB); Peter King,
`Bristol (GB); Ian Peter Crighton,
`Bristol (GB)
`
`(73) Assignee: Hewlett Packard Development
`Company, L.P., Houston, TX (US)
`
`( *) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 0 days.
`
`(21) Appl. No.:
`
`09/486,439
`
`(22) PCT Filed:
`
`Aug. 28, 1998
`
`(86) PCT No.:
`
`PCT/GB98/02603
`
`§ 371 (c)(l),
`(2), ( 4) Date:
`
`Feb. 28, 2000
`
`(87) PCT Pub. No.: W099/12098
`
`PCT Pub. Date: Mar. 11, 1999
`
`(30)
`
`Foreign Application Priority Data
`
`Aug. 29, 1997
`Aug. 29, 1997
`
`(EP) ............................................ 97306628
`(EP) ............................................ 97306629
`
`Int. Cl? ................................................ G06F 12/16
`(51)
`(52) U.S. Cl. ................................ 711/162; 714/6; 714/4
`(58) Field of Search ................................. 711/162, 104,
`711!114, 161; 714/6
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`5,644,698 A * 7/1997 Cannon ......................... 714/6
`
`5,649,152 A * 7/1997 Ohran eta!. ................ 711!114
`5,673,381 A * 9/1997 Huai eta!. ..................... 714/1
`5,673,382 A * 9/1997 Cannon eta!. ................ 714/6
`5,758,359 A * 5/1998 Saxon ........................ 707/204
`5,799,147 A * 8/1998 Shannon ........................ 714/6
`5,966,730 A * 10/1999 Zulch ......................... 711!162
`6,035,412 A * 3/2000 Tamer eta!. .................. 714/6
`6,148,412 A * 11/2000 Cannon eta!. ................ 714/6
`
`FOREIGN PATENT DOCUMENTS
`
`0 410 630 A2
`EP
`0 541 281 A2
`EP
`0 769 741 A1
`EP
`* cited by examiner
`
`1!1991
`5/1993
`4/1997
`
`Primary Examiner-Matthew Kim
`Assistant Examiner-Matthew D. Anderson
`
`(57)
`
`ABSTRACT
`
`Clients are connected via a LAN to backup apparatus
`including hard and tape drives, respectively primary and
`secondary backups. Each client schedules backup operations
`on a time basis since last backup, the amount of information
`generated since last backup, etc. Prior to a backup, each
`client sends to the backup apparatus a request including
`information representing the files to be backed up. The
`backup apparatus receives backup requests from the clients
`and accepts or rejects backup requests based on backup and
`network loading. The backup apparatus eliminates redun(cid:173)
`dant files and indicates to the clients, prior to the file back
`up, that some files requested to be backed up are already
`stored and are not to be sent. The backup apparatus enables
`a client to restore any of its 'lost' data by copying it directly
`from the disk, without the need for backup administrator
`assistance.
`
`29 Claims, 9 Drawing Sheets
`
`CLIENT210a
`
`CLIENT
`FILE SYSTEM
`322
`
`I
`DYNAMIC
`SCHEDULER I
`310
`
`AGENT
`215a
`
`I
`
`347
`
`l
`FILE
`DIFFERENCING I
`330
`
`-1
`
`BACKUP
`_,. APPARATUS
`240
`
`LOCAL BACKUP
`CONFIGURATION~
`FILE312 ~
`~ ACTIVE FILE I
`I
`MANAGER
`-1
`P1
`32o
`DTF ~
`CACHE
`332
`I FDALOG h
`H BLOCK .~ -
`-1
`I.
`~
`I
`~ RESTORE
`I
`I
`
`CLIENT
`OPERATING ~
`SYSTEM
`312
`
`DIFFERENCING
`340
`
`FDA345
`
`DATA
`TRANSFER
`350
`
`360
`
`Apple Inc. Exhibit 1009 Page 1
`
`

`

`PRIOR ART
`
`ServerM
`
`Figure 1
`
`130m
`
`100
`
`LAN 120
`
`Server I
`("' 0000000 ......, ~======
`
`uuuuuuu
`
`Tape drive
`
`ID~+---1• -+>I~ D_
`
`~
`
`130a
`
`,,,
`
`140
`
`Client N
`
`•
`\Jl
`d •
`
`~0:1
`
`D liOn
`
`-
`
`f-
`
`---------
`
`~o:l
`
`~-
`
`-
`
`I lOb D
`
`Client 2
`
`. -~o:l
`11 Oa D
`
`-
`
`Client 1
`
`Apple Inc. Exhibit 1009 Page 2
`
`

`

`Figure 2
`
`200
`
`Server M
`
`~~~ It, . )_I
`
`~ 130m
`
`("" 0000000 t'-.,
`
`Server 1
`
`JaiJL,. )-1
`
`~~ 230a
`
`Backup Agent
`
`Backup Agent
`
`~~1-244
`====:5
`[Q]av
`
`242
`
`240~
`
`~--------~--------~~------------~----------~--------~
`
`220
`
`r-~01
`D 210n
`
`-
`
`~ Backup Agent )
`
`•
`\Jl
`d •
`
`Client N
`
`( Backup Agent J
`~-
`~o:!
`
`2l0b D
`
`Client 2
`
`215a
`
`( Backup Agent J
`
`Client 1
`
`Apple Inc. Exhibit 1009 Page 3
`
`

`

`347 ~ BLOCK
`
`FDA LOG
`
`330
`
`~ DIFFERENCING
`[\.
`
`FILE
`
`CACHE ~
`
`332
`
`322
`
`FILE SYSTEM
`
`CLIENT
`
`320
`
`_., MANAGER
`:~ ACTIVE FILE
`
`\
`
`DTF
`
`FILE312 ~
`
`215a
`AGENT
`
`310
`
`I~
`
`SCHEDULER
`
`DYNAMIC
`
`.....
`
`CONFIGURATION~
`LOCAL BACKUP
`
`CLIENT 210a
`
`•
`\Jl
`•
`
`240
`
`APPARATUS
`
`BACKUP
`
`FDA 345
`
`SFP 342
`
`e3
`
`Figur
`
`I~
`
`I'
`
`360
`
`RESTORE
`
`~ TRANSFER
`
`350
`
`DATA
`
`DIFFERENCING
`
`~ 340
`~
`
`OPERATING ~
`
`14-
`
`312
`
`SYSTEM
`
`CLIENT
`
`Apple Inc. Exhibit 1009 Page 4
`
`

`

`405
`
`•
`\Jl
`d •
`
`~
`
`~
`
`J
`
`Pointer!
`
`File2
`
`J
`Directory Tree File
`
`Filet
`
`Pointer2
`Pointer!
`
`l
`
`l
`
`410
`
`Figure 4
`
`I
`Backup Directory Filet
`
`Filet
`
`J
`
`Chunk2a
`Chunkla
`
`File2
`
`400
`
`~
`
`It,
`Chunk3a
`Chunk2a
`~ Chunkla
`I
`
`J
`Backup Directory File2
`
`Chunklb
`
`Filet
`
`~.
`
`r--1---_ l
`
`Apple Inc. Exhibit 1009 Page 5
`
`

`

`Figure
`
`210a
`
`CLIENT
`
`•
`\Jl
`d •
`
`550
`
`RECOVERY
`DISASTER
`
`242
`TAPE BACKUP ~ DRIVE
`TAPE
`
`540
`
`530
`
`CONTROL
`MERGE
`
`520
`
`STORAGE
`BACKUP
`
`510
`
`ELIMINATION
`
`FILE
`
`REDUNDANT
`
`SCHEDULER 500
`
`DYNAMIC
`
`FILE 504
`CONFIG
`BACKUP
`
`IMERG£5281
`
`DELTA 526
`
`524
`
`BASELINE
`
`512
`
`RFEINDEX
`
`FILE 502
`PRIORITY
`
`HARD DISK 244
`
`BACKUP APPARATUS 240
`
`Apple Inc. Exhibit 1009 Page 6
`
`

`

`U.S. Patent
`
`Aug. 31,2004
`
`Sheet 6 of 9
`
`US 6,785,786 Bl
`
`CLIENT SIDE
`r - - - -
`SCHEDULE BACKUP
`L-------~----
`
`600
`
`605
`
`REQUEST SLOT
`L-------~----~
`
`BACKUP SERVER SIDE
`
`NO
`
`610
`
`REQUEST
`~GRANTED
`~?
`,--------~--------YES
`
`FILE DIFFERENCING
`
`BUILD NEW FILE LIST
`
`615
`
`620
`
`TRANSMIT NEW FILE LIST
`
`623
`
`RECEIVE LIST & CALCULATE
`SIGNATURES
`
`635
`
`625
`
`630
`
`648
`
`TRANSMIT DATA FOR 650
`BACKUP
`
`Figure 6
`
`655
`
`Apple Inc. Exhibit 1009 Page 7
`
`

`

`~
`c
`N c
`'"""' ~
`~
`
`> = ~
`
`= ......
`~
`~ ......
`~
`•
`\Jl
`d •
`
`I nl Signature I
`
`1 2131415 61718191
`
`Date I Time
`
`Filename
`
`File 5
`
`Pointer
`
`Signature
`
`Date I Time
`
`Filename
`
`6 Signature /
`
`Date I Time
`
`Filename
`
`File 4
`Figure 7d File 2
`
`3 7/ Signature J
`
`Date I Time
`
`Filename
`
`File 1
`
`Signature I Pointer I
`
`Filename
`
`File 4
`
`Figure 7c
`
`CRC
`
`CRC
`
`File 5
`Figure 7b File4
`
`New (Unique)
`
`n Signature
`
`1 2 3 4 5 6 7 8 9
`
`Date I Time
`
`Filename
`
`n Signature New ( Common )
`
`1 2 3 4 5 6 7 8 9
`
`Date I Time
`
`Filename
`
`UnModified
`
`n Signature
`
`1 2 3 4 5 6 7 8 9
`
`Date /Time
`
`Filename
`
`Modified
`
`n Signature
`
`1 2 3 4 5 6 7 8 9
`
`Date I Time
`
`Filename
`
`n Signature Modified
`
`1 2 3 4 5 6 7 8 9
`
`Date I Time
`
`Filename
`
`File 5
`
`File 4
`
`File 3
`
`File 2
`
`File 1
`
`Figure 7a
`
`Apple Inc. Exhibit 1009 Page 8
`
`

`

`Figure 8
`
`~ File C
`
`File B
`
`~
`
`~
`
`File A
`
`~
`
`File 5
`
`File 4
`
`1-A
`
`l!
`
`~ File 3
`
`File 2
`
`File 1
`
`___.
`
`r;
`
`•
`\Jl
`d •
`
`242 BACKED UP DATA
`
`SignatureC 7
`SignatureA v
`
`~
`
`Signature4
`
`SignatureS
`
`Signature!
`
`SignatureS ~
`
`------
`
`Signature3
`
`Signature2 ~
`
`CRC
`
`CRC
`
`CRC
`
`CRC
`
`CRC
`
`CRC
`
`CRC
`
`CRC
`
`Size
`
`Size
`
`Size
`
`Size
`
`Size
`
`Size
`
`Size
`
`Size
`
`File 4 Client 11 On
`
`File C Client 11 On
`
`File B Client 11 Ob
`
`File 1 Client 11 Ob
`
`File A Client 11 Ob
`
`File 5 Client llOa
`
`File 3 Client 11 Oa
`
`File 2 Client 11 Oa
`
`RFEINDEX
`
`512
`
`Apple Inc. Exhibit 1009 Page 9
`
`

`

`.~ 0
`~HEADS IQE
`
`930
`
`v
`
`Figure 9
`
`'i
`
`A
`
`925
`R/W
`
`v
`
`tt
`
`'i
`
`A
`
`v
`J... HARD DISK
`
`920
`
`905 v
`I/F
`l;t.
`
`r
`--
`
`II.
`
`'i
`
`A
`
`~
`
`....
`
`.... ~
`
`.... ~
`
`v MECHANISM 935
`tt
`
`SYSTEM BUS 912
`
`MEMORY
`
`MAIN
`
`::.
`..:.
`913
`
`•
`\Jl
`d •
`
`915
`ROM
`
`r
`
`tt
`
`APPARATUS 900
`TAPE BACKUP
`
`J
`
`r--
`
`..:.
`
`'i
`
`A
`
`CONTROLLER
`
`910
`
`Apple Inc. Exhibit 1009 Page 10
`
`

`

`US 6,785,786 Bl
`
`1
`DATA BACKUP AND RECOVERY SYSTEMS
`
`TECHNICAL FIELD
`
`The present invention relates to computer data backup and
`recovery and particularly, but not exclusively, to apparatus,
`methods and systems for enacting data backup and recovery.
`
`BACKGROUND ART
`
`5
`
`25
`
`2
`and to restore to said clients data from the primary
`storage means; and
`to backup to the tape storage means, in accordance with
`pre-defined criteria, at least some of the data stored in
`the primary storage means and to restore to the primary
`storage means, in accordance with a respective restore
`message received from a client, at least some data
`stored in the tape storage means.
`Preferably, the controller means is programmed to main-
`10 tain stored on the primary storage means at least the most
`current version of all data received from the clients. In this
`way, a data restore operation can be enacted using data
`stored in primary storage, without needing to find and install
`any particular backup tape.
`Preferably, the controller means is programmed to backup
`data stored in the primary storage means to the tape storage
`means independently of any messages from the clients.
`In preferred embodiments, the primary storage means
`comprises a random access storage means such as a hard
`20 disk drive. Alternatively, the primary storage means com(cid:173)
`prises non-volatile random access memory (NV-RAM). For
`the latter case, however, the applicants believe that currently
`NV-RAM would be very prohibitively expensive compared
`to a hard disk drive.
`In preferred embodiments, the tape storage apparatus
`comprises a housing configured specifically to house the
`controller, the interface means, the primary storage means
`and the tape storage means. Thus, the tape storage apparatus
`provides a dedicated and integrated solution to data storage
`and recovery. In other, less preferred embodiments, the
`components of the apparatus may be distributed, for
`example, with some of the components residing on or in
`other apparatus, such as a computer.
`In accordance with a second aspect, the present invention
`provides a method of backing up to a data backup and
`restore apparatus attached to a network data stored in one or
`more clients also attached to the network, the method
`comprising the data backup and restore apparatus storing in
`primary data storage a most recent version of all data
`received from the clients and, from time to time, in accor(cid:173)
`dance with pre-determined criteria, storing in secondary data
`storage at least some of the data stored in the primary data
`storage.
`In accordance with a third aspect, the present invention
`provides a data storage system comprising:
`a network;
`tape storage apparatus; and
`at least one client connected to the apparatus, the (or the
`at least one) client comprising client storage means and
`client processing means, the client processing means being
`programmed in accordance with pre-determined criteria to
`determine when data stored in the client storage means
`should be backed up to the tape backup apparatus.
`Other aspects and embodiments of the invention will
`55 become apparent from the following description and claims.
`
`Typically computer networks in use today have the basic
`topology shown in the diagram in FIG. 1. In FIG. 1, several
`client workstations are connected via a network to several
`servers.
`Users perform work on the client workstations that can 15
`communicate via a network link with the servers. Clients
`typically store unique user data whereas servers typically
`provide a common central point for some function such as
`shared hard disk storage, data backup storage, software
`applications or printer control.
`There are a number of current schemes used for backing
`up data on the network. A first scheme is for each client to
`individually store data to a device such as a tape drive. Tape
`storage has become the preferred method of backing up
`information on computers due to its relatively low cost, high
`capacity, durability and portability. This first scheme
`requires each person, or each administrator, responsible for
`a client to be responsible for backing up the data. Also, each
`client requires its own backup device. A second scheme is
`for each client to store important data on a remote file server,
`where the server data is backed up at regular intervals, for
`example on a daily basis. Thus, if a client fails, potentially
`only less important information is lost. A third scheme,
`which is available, for example, to systems which operate
`under Windows NT4 and are part of an NT Domain, is that
`all 'specified' information on a client is backed up, typically
`overnight, to a tape drive connected to an NT server.
`Known backup schemes offer relatively good protection
`for client data, but with a very high administration overhead. 40
`For example, if data needs to be recovered by a client, for
`example as a result of one or more files being 'lost' or
`destroyed, then the client's owner typically needs to contact
`an administrator of the backup system and request a restore
`procedure. Typically, a restore procedure involves the 45
`backup system administrator tracking down and mounting a
`respective tape, on which the last backup of the lost file(s)
`was made, and initiating the restore procedure. While such
`a procedure is typically very reliable, it can be perceived as
`onerous on client users and backup system administrators 50
`alike.
`
`30
`
`35
`
`DISCLOSURE OF THE INVENTION
`
`In accordance with a first aspect, the present invention
`provides tape storage apparatus, comprising:
`interface means for connecting the apparatus to one or
`more clients;
`controller means for controlling the apparatus and for
`processing messages received from the one or more
`clients;
`primary storage means; and
`tape storage means, wherein the controller is pro(cid:173)
`grammed:
`to process backup and restore messages received from the 65
`one or more clients respectively to backup to the
`primary storage means data received from the clients
`
`60
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`An embodiment of the present invention will now be
`described in more detail, by way of example only, with
`reference to the following drawings, of which:
`FIG. 1 is a diagram of a conventional computer network;
`FIG. 2 is a diagram of a computer network modified to
`operate in accordance with the present exemplary embodi(cid:173)
`ment;
`FIG. 3 is a diagram illustrating the main functional
`features of a client according to the present exemplary
`embodiment;
`
`Apple Inc. Exhibit 1009 Page 11
`
`

`

`US 6,785,786 Bl
`
`3
`FIG. 4 is a diagram which illustrates the main file struc(cid:173)
`tures on the tape backup apparatus to facilitate a backup
`operation according to the present exemplary embodiment;
`FIG. 5 is a diagram illustrating the main functional
`features of a backup apparatus according to the present 5
`exemplary embodiment;
`FIG. 6 is a flow diagram representing a backup operation
`according to the present exemplary embodiment;
`FIG. 7 is a diagram which illustrates file information
`gathered during the procedure described in relation to FIG.
`6;
`
`FIG. 8 is a diagram which represents a redundant file
`elimination index; and
`FIG. 9 is a block diagram of backup apparatus according
`to an embodiment of the present invention.
`
`BEST MODE FOR CARRYING OUT THE
`INVENTION, & INDUSTRIAL APPLICABILITY
`
`4
`server 130a. The backup software schedules the backup to
`happen at, for example, 11:00 pm each night, and controls
`the transfer of data from the file server 130a to the backup
`apparatus 900.
`FIG. 2 is a diagram, which illustrates a networked com-
`puter environment 200 modified for operation according to
`the present embodiment. As shown, a plurality of clients
`(designated 210a, 210b, etc) are connected to a computer
`network 220. In this case also, the network is a LAN (local
`10 area network), such as an Ethernet, which supports the
`TCP liP data communications protocol. A number of servers
`(designated 230a, 230b, etc) are connected to the LAN 220.
`The servers, as above, may be file servers, print servers,
`email servers, or any combination thereof. Also connected to
`15 the LAN 220 is a backup apparatus 240 according to the
`present embodiment. The backup apparatus 240 is shown to
`include a hard disk drive 244 in addition to a tape drive 242.
`The operation of the backup apparatus 240 will be described
`in more detail below.
`In general terms, servers can be thought of as more
`powerful clients, or clients with large amounts of disk
`storage. Thus, for the sake of convenience and unless
`otherwise stated, the term "client" when used hereafter shall
`be taken to mean either a server or a client in a network
`environment. Further, for ease of description only, the term
`"client" shall also be taken to include any device or appa(cid:173)
`ratus which stores data locally which can be backed up
`remotely.
`Further, unless otherwise indicated, only one client sys-
`30 tern 210a will be considered, although it will be understood
`that the other clients operate in the same manner.
`The client 210a includes a backup agent 215a, which
`comprises one or more software routines. The main func(cid:173)
`tional modules of the backup agent 215a are illustrated in the
`35 diagram in FIG. 3. Each module comprises one or more
`software routines, written for example in the C++ program(cid:173)
`ming language, which control the client 210a to process data
`and communicate with the backup apparatus 240 as
`described in detail below. The software routines are stored
`on a hard disk drive device (not shown) in the client and are
`loaded into main random access memory (RAM) when they
`are required to operate, and a central processor in the client
`processes the instructions to control the client to operate in
`accordance with the present embodiment. The lines illus-
`45 trated interconnecting the various modules and diagram
`blocks represent communications channels which are open
`some or all of the time, depending on requirement. The
`client 210a is a general purpose computer, such as a PC
`running the Windows NT 4.0 operating system.
`50 Dynamic Scheduler Module
`In FIG. 3, a dynamic scheduler 310 is a module respon(cid:173)
`sible for dynamically initiating a backup cycle from the
`client 210a, based on time since the last backup and the local
`system resources. A local backup configuration file 312
`55 contains details on a general network-wide policy set by the
`network administrator and a local user-defined policy, in
`terms of a target time delay before data is protected. For
`example, one default policy would be to attempt a backup
`once an hour. The dynamic scheduler 310 is a background
`60 process which runs permanently on the client 210a.
`After the target time delay (e.g. 1-hour) has passed, the
`scheduler 310 assesses the local client system 210a
`resources to ensure that the backup can run without seriously
`impacting the system performance. If the local client system
`65 210a is heavily loaded, for example at 95% capacity, the
`scheduler 310 will retry a short period later (e.g. after 5
`minutes) and continue retrying until the system has enough
`
`As has already been discussed, hitherto known backup 20
`schemes and systems offer relatively good protection for
`network data, but with a potentially very high administration
`overhead, particularly when data restoration from tape is
`required. Administrators have the added problems of back(cid:173)
`ups not being run, or failing due to lack of training or 25
`knowledge on the part of the client users. This is especially
`true of backup scheduling, media handling and maintenance
`schemes, such as tape rotation or drive cleaning. Failed
`backups can result in data loss, which, at best, causes time
`to be lost in recreating the data.
`Also, as the amount of networked data increases, backup
`capacity and network bandwidth become significant limiting
`factors (takes longer to backup and thus increases network
`down time during backup), as does the increasing amount of
`backup administration overhead.
`Consider a 10-client and 2-server network where each
`client has 2 Gbytes of disk storage and each server has 20
`Gbytes of disk space. A tape storage device would poten(cid:173)
`tially need to have a capacity of 60 Gbytes to guarantee to
`completely back up the network.
`Further, a typical10-Base-T network can transfer at most
`1 Mbyte of data per second. A full backup of the above
`network would take 60,000 seconds or 17 hours. During this
`time, the network would be unavailable for use for any other
`means. This would be unacceptable to most users.
`The embodiments described herein aims to address at
`least some of these issues.
`FIG. 1 is a diagram, which illustrates a general prior art
`networked computer environment 100. As shown, a plurality
`of computer systems, or clients, (designated llOa, HOb, etc)
`are connected to a computer network 120. In this example,
`the network is a LAN (local area network), such as an
`Ethernet, which supports the TCP/IP data communications
`protocol. Also connected to the LAN are a number of servers
`(designated 130a, 130b, etc). The servers may be, for
`example, file servers, print servers, email servers, or any
`combination thereof. In the diagram, a tape drive 140 is
`illustrated connected to the server 130a, where the server
`130a is a file server. The data stored on the file server 130a,
`in on-line storage (where on-line indicates that the storage is
`accessible at a given instance) such as a hard disk drive, is
`backed up to the tape drive 140 on a daily basis, typically to
`a new tape each day. Tape media is termed off-line storage,
`since when a tape is removed it is no longer accessible. A
`backup operation is scheduled and controlled by backup
`software, which runs as a background process on the file
`
`40
`
`Apple Inc. Exhibit 1009 Page 12
`
`

`

`US 6,785,786 Bl
`
`5
`
`15
`
`5
`free resources to run the backup. There is an upper time limit
`to the retry, for example 30-minutes, after which time a
`backup is forced irrespective of system loading. The upper
`time limit for retrying is another general policy variable
`stored locally in the backup configuration file 312.
`Once the local client system resources allow, the sched(cid:173)
`uler 310 communicates with the backup apparatus 240 to
`request a backup slot. The backup apparatus 240 will allow
`the backup job to start if the network bandwidth can support
`it. If there are already other backups running from other 10
`clients, which are using up all of the network bandwidth
`allocated to backups, the backup apparatus 240 communi(cid:173)
`cates with the client 210a to refuse the request. As a result,
`the scheduler 310 returns to the retry cycle and waits until
`the backup apparatus 240 gives permission to progress.
`Because the backup agent 215a dynamically initiates the
`backup jobs, the overall network backup scheme is configu(cid:173)
`ration independent. Thus, a client can power-save, or a
`portable client can be detached from the network, and on
`return the backups can continue seamlessly. Thus, there is no 20
`requirement for the backup apparatus 240 to have any
`'knowledge' of which clients are connected, or are not
`connected, to the network at any time.
`The dynamic scheduler 310 also has responsibility for
`initiating the operation of the other modules when required. 25
`Active File Manager Module
`An active file manager module (AFM) 320 monitors
`which files are to be opened by the backup agent 215a, for
`the purpose of backing up. Before a file is opened, the AFM
`320 checks the client's file system 322 to see if the file is 30
`already in use by another program running on the client
`210a. If the file is already in use, the AFM 320 waits until
`the file is in a "safe" state for the backup agent 215a to back
`it up. Only when the file is "safe" does the AFM 320 allow
`the backup agent 215a to open the file. A file can be "safe" 35
`even if the file is locked and in use. If the file is written to
`during the backup operation, the AFM 320 automatically
`preserves the data to be backed up by sending the respective
`blocks directly to the backup apparatus 240 over the net(cid:173)
`work. The backup apparatus 240 can then manage or 40
`re-order these out of order backup blocks, thus preserving
`the original state of the file from when the backup started.
`For example, consider a database of customer addresses
`(not shown), which is being backed up. During the backup,
`a user changes one of the entries in a part of the database that 45
`has not yet been backed up. The AFM 320 immediately
`sends the old address to the backup server, and when the
`backup reaches this point it skips the updated address in the
`database. This method means that when a database applica(cid:173)
`tion thinks it has written data to disk, it has indeed been 50
`written to disk, and not cached somewhere else. Thus, there
`is no possibility of data loss or corruption if the server 240
`were to crash during the backup.
`The AFM 320 can be user-configured to determine when
`a file is "safe" to backup. For example, the AFM 320 can use 55
`a write inactivity period to decide this, which would be one
`of the general policy values stored in the backup configu(cid:173)
`ration file 312. In order to ensure that a backup copy of a file
`does not contain a partial transaction, the AFM 320 monitors
`the period of time that passes without a write taking place. 60
`For example, if the time is set to 5 seconds, the file is not
`"safe" until there is a 5 second period when there are no
`writes active, and at this point the file can be backed up.
`There is also a value for the period of time after which the
`AFM 320 gives up trying to find a safe state. For example, 65
`if this time is set to 60 seconds, then the AFM 320 will try
`for one minute to find a 5 second period with no writes.
`
`6
`Some applications, notably databases, operate a number
`of files simultaneously (e.g. data files and index files) and to
`assure the overall integrity of such files they must be
`configured as a "group". A group defines a number of files
`that must be backed up from a collective "safe" state, and
`only when every file in the group is simultaneously in a
`"safe" state can each file be backed up. This grouping is
`performed automatically by the AFM 320 when it detects
`one of the major database types (e.g. Exchange, Notes, SQL
`Server, and Oracle). Further, the AFM 320 may be config(cid:173)
`ured to treat user-defined list of files as groups, with the
`group definitions being stored in the backup configuration
`file 312.
`File Differencing Module
`A file differencing module (FDM) 330 is a module in the
`backup agent 215a that selects the files to be backed up by
`determining which files have changed or been added since
`the last backup. The module achieves this by reading the
`current directory tree of the local file system 322 and
`checking each file's modified time/date against the entries in
`a cached Directory Tree File (DTF) 332 generated from the
`last backup. Modified files will have different times and
`dates, and new files will have no corresponding entry.
`Modified files are marked as "Modified" and new files are
`marked as "New". Note that for the first backup after
`installation all files will be new files.
`Before the list of modified or new files is further
`processed, the list is filtered for excluded files (such as
`temporary files, Internet cache files, swap files, etc). The
`policy for excluding files is held in the local backup con(cid:173)
`figuration file 312, and is generated from a general network
`policy set by the network administrator and also from a
`user-defined set of excluded files or directories.
`The next stage is to determine which of the new files are
`already held on the backup apparatus 240, and are thus
`redundant. For example, if there has already been a backup
`of a Windows95 workstation, then subsequent backups will
`determine that the Windows95 operating system files are
`redundant. The FDM 330 first sends the list of the selected
`new files to the backup apparatus 240. The list contains for
`each file a 32-bit CRC code (calculated for the respective
`name, date/time stamp and file size information). The
`backup apparatus 240 returns a list of the files that match its
`existing backup file contents, and for each file it also returns
`a signature (in this case a 32-bit CRC checksum calculated
`over the actual file data) and an indication of the location of
`the file on the backup server. For each of the potentially
`redundant files in the list, the FDM 330 generates a respec(cid:173)
`tive signature value and compares it with the value returned
`by the backup apparatus 240. Where the signatures match,
`the file marking is changed from "New" to "Redundant".
`Thus, the output of the FDM 330 is a list of all files which
`are new or modified since the last backup, marked as:
`"Redundant", copy already held on backup server;
`"New", new file, thus no need for block differencing; or
`"Modified", use block differencing to determine which
`blocks changed.
`As well as files, the FDM 330 identifies any modifications
`to the system information used to rebuild the system in
`disaster recovery. This covers areas such as NetWare NDS
`partitions, disk partition information, file system types and
`details (e.g. compressed), bootstrap partitions (e.g. MER,
`NetWare DOS partition).
`Block Differencing
`A block-differencing module (BDM) 340 determines
`which blocks in each file have changed since the last backup.
`The process of identifying the changed portions (deltas) of
`
`Apple Inc. Exhibit 1009 Page 13
`
`

`

`US 6,785,786 Bl
`
`7
`files is performed by two basic processes. The first process
`is a sliding fingerprint (SFP) process 342. In general, a
`fingerprinting process is a probabilistic algorithm, where the
`probability of a failure is made far less than the probability
`of an undetected failure of the underlying storage and
`communication media (for further, detailed information on
`fingerprinting, the reader is referred to the book by Richard
`Karp and Michael Rabin, "Efficient randomised pattern
`matching algorithms", Harvard University Centre for
`Research in Computing Technology, TR-31-81, December
`1981). The second process involves active detection of
`writes to regions of files; this technique requires a process
`called a file delta accelerator (FDA) process 345. The FDA
`process 345 is a background process which operates all the
`time to monitor the client's operating system 312 write calls
`and maintain a log 347 of which logical regions of which 15
`files have been modified.
`The FDA process 345 is more efficient for files that are
`updated in place (e.g. databases), while the SFP process 342
`is far more efficient for document files that are entirely (or
`largely) rewritten with each update-although, only a small 20
`portion of the file may have been modified. As will be
`described, the present embodiment makes use of a combi(cid:173)
`nation of an SFP process 342 and a FDA process 345. As
`each modified file is opened for backup, the FDA process log
`347 is checked to see how much of the file has been 25
`modified. If more than a threshold percentage, for example
`5-10 percent, has been modified, and if the absolute size of
`the changes is smaller than a given size (e.g. 2MB), then the
`SFP process 342 is selected as the appropriate process to use.
`Otherwise, the FDA-detected regions are used. Note that if
`the local client 210a crashes without a 'clean' FDA
`shutdown, all FDA log 347 information is totally
`invalidated, so the BDM 340 must temporarily revert to the
`SFP process (or a conventional incremental backup) when
`the next backup is performed.
`The SFP process 342 divides an updated file into equal(cid:173)
`sized "chunks", the size of which varies depending on the
`file size. Each chunk has a 12-byte fingerprint calculated for
`it, and the fingerprints are sent with the backup data for the
`file to be stored by the backup apparatus 240. When a file is
`to be checked with the SFP process 342, the BDM 340
`communicates with the backup apparatus 240 to download
`the fingerprint set for the file in question. It is also possible
`to locally cache fingerprint sets for files that are frequently
`accessed. The SFP process 342 then calculates the finger- 45
`print function for the updated version of the file, starting
`from the first byte and using a chunk size the same size as
`for the last backup of the file. Then the SFP process 342
`compares the resulting new first fingerprint with the previ(cid:173)
`ous fingerprint set to a find a match. If there is a match, then 50
`the chunk starting at that byte is

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket