`Woodhill et al.
`[45] Date of Patent: Jul. 15, 1997
`
`[191
`
`[11] Patent Number:
`
`5,649,196
`
`
`
`U8005649196A
`
`[54] SYSTEM AND METHOD FOR DISTRIBUTED
`STORAGE MANAGEMENT ON
`NETWORKED COMPUTER SYSTEMS
`USING BINARY OBJECT mENTfl'IERS
`
`["15]
`
`Inventors: James R. Woodhfll. Houston; Louis R.
`Woodhill, Richmond; William Russell
`More, .111; Jay Harris Berlin, both of
`Houston, all of Tax.
`
`[73} Assignec; Legem Corporafim, Pittsburgh, Pa.
`
`[211 App]. No: 555,376
`
`[22] Filed:
`
`Nov. 9, 1995
`
`Related US. Application Data
`
`[63] Continuation of Set: No. 85,596, Jul. 1. 1993, abandoned.
`
`Int. CL" ..................................................... G861? 11100
`[51]
`[52] vs. C]. ...........
`....... 3951620;3951610; 3951618;
`395120003; 3951489; 395149104; 395113561;
`395118317; 395118204; 3951180; 3641DIG. 1:
`3641285; 36412851; 36412344
`[58] Field of Search ............. 3951610, 618.
`3951620. 200.03, 489, 49004, 185.01, 183.17,
`182.04, 180; 3641'DIG. 1
`
`[56]
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`4,616,315 1011986 Logsdonetal. ....
`711992 Chefi’elzetal.
`5,133,065
`........
`............ 3951575
`
`411993 Gramlich etal.
`........... 3951600
`5,202,982
`.............
`5,235,601
`811993 Stallmo et a1.
`....... 3711401
`
`811993 Davis et a1. ......
`5,239,633I
`3951425
`
`811993 Rudeseal at 91.
`......
`5,239,659
`3951800
`
`.... 3951600
`5,274,802 1211993 Alline
`
`.. 3711104
`5,273,sss
`111994 Ng etal.
`
`3951575
`5,295,253
`311994 Iewettetal.
`3951800
`5,367,698 1111994 Webberetal.
`
`OTHER PUBLICKI‘IONS
`
`The 8th International Conference on Distributed Computing
`Systems, 13 Jun. 1988. San Jose. California pp. 471—479
`(Relevant to claim No. I. 2, 5, 8, 9, 18, 19) Daniel Barbara
`et a1.
`'Exploiu‘ng symmetries for low—cost comparison of
`file copies’ seep. 4'1 1, right coL,line 29—p. 472. left coL, line
`10 (Relevant to claim No. 3, 4, 6, 10—14. 20.
`Computer Technology Review. vol. 12, No. 10, Aug. 1992.
`Los Angeles US pp. 55-60 Daniel Masters 'Distributed
`Network Processing Speeds Up Network Backup' see p. 60,
`left. col., line 18-line 31 (Relevant claim No. 10).
`Primn' Examiner—Paul V. Kult‘k
`Assistant Exaninermlean R. Holncre
`Attorney; Agent, or Firm—Kirkpatrick & lockhart LLP
`[5'1]
`ABSTRACT
`
`The present invention is directed to a system and method for
`the distributed management of the storage space and data on
`a networked computer system wherein the networked com-
`puter system includes at least two storage devices for storing
`data files comprised of one or more binary objects. The
`distributed storage management system includes a device for
`selectively copying the binary objects sun-ed on one of the
`storage devices to another of the storage devices and another
`device for calculating a wn-ent value for a binary object
`identifier for selected binary objects stored on the storage
`devices wherein the calculation of the binary object identi-
`fier is based upon the actual data contents of the associated
`binary object. The distributed storage management system
`further includes a device for storing the current value of the
`binary object identifier as a previous value of the binary
`object identifier, another device for comparing the current
`value of the binary object identifier associated with a par-
`ticular binary object to one or more previous values of the
`binary object identifier aswciated with that particular binary
`object and a device for commanding the device for selec»
`tively copying binary objects in response to the device for
`comparing.
`
`18 Claims, 14 Drawing Sheets
`
`tea
`i
`59455»
`on?
`..-
`m —.' so =
`we!
`Lym'
`a
`I
`II-
`1
`
`'1th I
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`RACK-1003
`RACK-1003
`Page 1 of 27
`Page 1 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 1 of 14
`
`5,649,196
`
`1 8
`
`User
`ork Station
`
`User .
`Work Stanon
`
`18
`
`17
`
`17
`
`16
`
`Local
`
`NetWOI‘
`
`1 7
`
`Local
`Compute
`
`20
`
`19
`
`/ 15
`
`Wide
`
`Am
`
`Network
`
`13
`
`Remote
`
`Fileserver
`
`Backup
`
`16
`
`Local
`
`Netw01'
`
`17
`
`/
`
`Local
`Compute
`
`20
`
`1 9
`
`\ 15
`
`14
`
`12
`
`FIG. 1
`
`RACK-1003
`RACK-1003
`Page 2 of 27
`Page 2 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 2 of 14
`
`5,649,196
`
`Compressed Storage Files
`
`Free D13 Space
`
`' k
`
`.12
`
`Local Computer Data Flles
`
`
`
`
`Backup Queue Database
`
`File Dat base
`a
`
`Distributed Storage
`
`Management Program
`
`
`
`
`Operating System Files
`
`32
`
`30
`
`28
`
`26
`
`25
`
`24
`
`22
`
`FIG. 2
`
`RACK-1003
`RACK-1003
`Page 3 of 27
`Page 3 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1991r
`
`Sheet 3 of 14
`
`5,649,196
`
`2;
`
`
`
`
`
`40
`
`36
`
`43
`
`
`
`
`
`
`
`
`48
`
`
`62
`
`66
`
`41
`
`46
`
`60
`
`64
`
`67
`
`62
`
`66
`
`70
`
`666
`Idem.‘
`
`Record
`
`34
`
`B k
`44 up
`161321686
`
`42
`
`Object
`166m.
`
`Record
`
`68
`\
`
`74
`BM
`
`Identifier
`
`
`
`
`
`
`
`
`
`
`
`
`66
`
`
`
`
`
`
`'22
`
`
`66
`
`66
`
`64
`
`FIG. 3
`
`RACK-1003
`RACK-1003
`Page 4 of 27
`Page 4 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 4 of 14
`
`5,649,196
`
`'
`
`
`
`Backup
`Queue
`Record
`
`75
`
`File Name
`
`File Status
`
`File Size
`
`Last Modified Date/Time
`
`2.6.
`
`Last Acess Date/Time
`
`FIG. 4
`
`76
`
`78
`
`80
`
`82
`
`84
`
`86
`
`88
`
`90
`
`92
`
`RACK-1003
`RACK-1003
`Page 5 of 27
`Page 5 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 5 of 14
`
`5,649,196
`
`
`'
`
`100
`B25155;
`
`
`Queue
`Database
`
`
`1 0 2
`
`
`
`ex
`File
`Found
`
`Y
`
`.
`
`Asmgn
`User-
`Defined
`Pricri
`
`1 08
`
`
`
`
`Backup
`Queue
`
`Rerd Backup” File“ Update . Delete
`
`
`N
`120
`.
`Create
`
`F116
`File _ —126
`Identificatton
`
`Record
`
`
`
`Separate
`Fill; Into
`ate
`Streams
`
`Segment Data
`
`
`Stream Into
`Multiple
`
`
`Binary Objects
`
`t 36
`
`140
`
`To St
`
`116
`
`ep
`
`RACK-1003
`RACK-1003
`Page 6 of 27
`Page 6 of 27
`
`Queue Recor
`Found
`
`Status =
`
`Delete
`
`Queue
`
`Locate
`File
`Identification
`Record
`
`
`
`END
`
`l 19/
`
`128
`
`130
`
`132
`
`.
`
`Backup
`Queue
`Record
`I
`122
`
`133
`
`
`Create
`Binary Object
`
`
`Identification
`Records
`
`
`
`
`Identify
`Binary
`Objects for
`Backu o
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 6 of 14
`
`5,649,196
`
`
`
`Compile list
`Of Binaly
`Objects To
`Be Backed Up
`
`
`
`
`
`Transmit
`‘
`
`
`Storage File
`Sufficien
`To Remote
`Sstorage
`
`
`
`Backup
`pac
`
`
`Delete Binary
`Object Frorn
`Local
`
`Storage
`
` Locate Highest
` Priority Binary
`
`
`
`Object
`
`300
`
`For Backup
`
`
`
`304
`
`
`
`
`
`
`Send Message
`To Resource
`Allocation
`Routme
`
`FIG. 5C
`
`
`
`
`
`
`Send List 01"
`Wait For
`
`Message From
`Binary
`
`
`Objects
`Resource
`
`
`For Backu -
`Allocation
`
` Routine
`
`308
`
`RACK-1003
`RACK-1003
`Page 7 of 27
`Page 7 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 7 of 14
`
`5,649,196
`
`Send Message
`
`To Resource
`
`
`
`
`Allocation Routine
`
`
`Wait For Compress
`
`
`Message From Backup!
`
`Restore Routine
`
`
`Send Message
`
`To Resource
`
`Allocation Routine
`
`
`Allocated
`
`Storage
`Filled
`
`
`Request Allocation
`
`
`
`Of Compressed
`
`Storage File
`
`
`
`FIG. SD
`
`320
`
`Storage File
`
`Send Message
`To Resource
`
`Allocation
`
`Routine
`
`Wait For
`
`Allocation
`Request
`
`Allocate
`
`Compressed
`
`FIG. 5B
`
`RACK-1003
`RACK-1003
`Page 8 of 27
`Page 8 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997r
`
`Sheet 3 of 14
`
`5,649,196
`
`Wait For
`
`
`
`Message Fro
`Prioritization
`
`N 350
`
`
`
`
`
`
`
`Mess-
`Mess-
`
`
`
`Mess-
`3&6 From
`age From
`
`
`L053] Storage
`Compression
`age From Back-
`
`
`
`u . Restore
`Routlne
`Routine
`
`
`outln '
`
`
`Mess-
`age From
`Y
`Y
`
`
`
`CompressEOn
`
`
`.
`Routin-
`
`
`
`
`Save Storage
`Cid Cgmptes-
`
`outtne
`sron
`Y
`
`
`
`Space.
`To Space
`
`Information
`.. _
`.
`-
`
`
`
`352
`
`
`
`i
`i
`
`344
`
`354
`
`.
`
`Routines
`Waitin _
`
`
`
`C Marksion
`ompres
`Routine As
`
`
`
`Available 0 N Compresswn
`
`336
`
`
`CompreSsion
`Routine
`' vailabl
`
`
`Send Message
`To Delete
`
`Lower Priority
`
`Compression
`Routine
`
`
`Send Message
`To
`
`358
`
`362
`
`
`Routine
`
`Trasrnit Mess-
`
`age To Backup!
`Restore
`
`Storage Files
`
`Send Message
`To Delete
`
`Storage Files
`After Transfer
`
`RACK-1003
`RACK-1003
`Page 9 of 27
`Page 9 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 9 of 14
`
`5,649,196
`
`Identity Whether
`
`Binal')’ Object Is
`
`Segment Of
`Database File
`
`
`Calculate
`Contents
`
`
`
`
`Identifiers
`
` Calcualte
`
`Contents
`
`Identifiers
`
`Contents
`
`Identifiers
`
`
`Equal
`
`
`
`
`RACK-1003
`RACK-1003
`Page 10 of 27
`Page 10 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997r
`
`Sheet 10 of 14
`
`5,649,196
`
`Create Work
`
`Area On
`
`Remote File Server
`
`Locate Most
`
`
`Recent Complete
`
`
`
`
`
`Copy Of
`
`1
`
`Binary Object
`
`420
`
`422
`
`424
`
`430
`
`Bitmap
`
`Create
`
`Locate Most
`
`
`
`
`
`
`
`Binary Object
`
`
`426
`
`
`
`
`
`Binary
`
`.
`Reconstltuted
`
`438
`
`
`Object
`
`RACK-1003
`RACK-1003
`Page 11 of 27
`Page 1 1 of 27
`
`
`
`
`
`Recent
`
`Granularized
`
`Copy Of
`
`
`
`
`
`
`
`
`Granular-
`ized Copy
`Foun .
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 11 of 14
`
`5,649,196
`
`Obtain
`
`Of File
`
`Identity
`
`443
`
`
`
`Compile List Of
`Binary Objects In
`
`
`Previous Version
`
`
`OfFile
`
`
`Calculate
`
`Contents
`
`Identifiers
`
`Transmit
`
`Update
`
`Request
`
`
`
`Reconstitute
`Binary
`
`‘
`
`Object
`
`FIG 51
`
`54
`
`56
`
` ‘ 50 452
`
`
`
`
`
`
`
`
`Transmit
`Granules
`Compare
`
`
`
`To Local
`Identifiers
`contents
`
`
`
`Match
`Computer
`Identifiers
`
`
`
`
`RACK-1003
`RACK-1003
`Page 12 of 27
`Page 12 of 27
`
`
`
`U.S. Patent
`
`Jul. 15, 1997
`
`Sheet 12 of 14
`
`5,649,196
`
`502
`
`500
`
`
`
`
`Initiate Restore
`
`Of Randomly
`
`
`
`
`
`Binaxy Object
`
`Selected
`
`Identifier
`
`Binaty Object
`
`Calculate
`
`O4
`
`Obj ect Identifiers
`
`
`
`
`
`
`
`Log
`
`Successful
`
`Audit Restore
`
`506
`
`510
`
`
`
`Generate
`
` Failure
`
`Audit
`
`FIG. 5J
`
`508
`
`RACK-1003
`RACK-1003
`Page 13 of 27
`Page 13 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 13 of 14
`
`5,649,196
`
`Obtain
`
` 600
`
`Last Access
`
`Date
` Restore File
`
`602
`
`
`
`
`Database if
`
`
`Necessary
`
`'
`
`FIG. 5K
`
` o4
`
`
`
`
`
`Read File
`
`
`Identification
`
`Record
`
`
`
`Disk Drive
`
`Put
`
`Online
`
`
`Locate Most
`
`
`Recent Backup
`
`Instance Record
`
`
` Initiate Restoratiou
`
`0f File; Set
`
`
`
`Migration Status
`
`To "Normal"
`
`
`
` "Migrated"
`
`RACK-1003
`RACK-1003
`Page 14 of 27
`Page 14 of 27
`
`
`
`US. Patent
`
`Jul. 15, 1997
`
`Sheet 14 of 14
`
`5,649,196
`
`Locate Each
`
`File
`
`
`
`
`Identification
`Record
`
`
`
` Create
`
`Retention
`
`
`Additional
`
`
`Working
`
`Backup Instance
`
`List
`
`
`Records
`
`
`710
`
`Locate Most
`
`Mark Retention
`
`Working List
`Entries As
`
`
`
`
`
`Insert
`
`Date Matches
`Recent Backup
`
`
`
`Unused Date
`Instance
`
`
`
`Ranges
`
`
`Record
`Used
`
`
`
`
`RACK-1003
`RACK-1003
`Page 15 of 27
`Page 15 of 27
`
`
`
`5,649,196
`
`1
`SYSTEM AND METHOD FOR DISTRIBUTED
`STORAGE MANAGEMENT 0N
`NETWORKED COMPUTER SYSTEMS
`USING BINARY OBJECT IIJEN‘I‘IFIERS
`
`This application is a continuation application Ser. No.
`081085596 filed on Jul. 1, 1993, now abandoned.
`BACKGROUND OF THE INVENTION
`1. Field of the Invention
`
`The present invention is directed generally to a system
`and method for distributed storage management on a net«
`worked computer system and, more specifically, to a system
`and method for distributed storage management on a net—
`worked computer system including a remote backup file
`server and one or more local area networks in communica-
`
`tion with the remote baclmp file server.
`2. Description of the Background of the Invention
`Backup copies of information stored on a computer
`system must be made so that if a failure occurs which causes
`the original copies of the data to be lost, the lost data can be
`recova'ed as it existed at the time when the last backup copy
`was made. Bachlp/restore systems have a long history on all
`types of computer systems from mainframes to
`minicomputers, local area network file servers and desktop
`workstations.
`
`Historically. backup systems have operated by making
`copies of a computer system’s files on a special backup
`inputtoutput device such as a magnetic tape drive, floppy
`diskette drive, or optical disk drive. Most systems allow full
`backup, partial backup (e.g., specified drives, directories, or
`files), or incremental backups based on files changed after a
`certain date or time. Copies of files made during a backup
`procedure are stored on these special backup devices and are
`then later retrieved during a restore operation either under
`file names derived from the original file, from the datet’time
`of the backup operation or from a serially-incremented
`number. The backup procedure is typically accomplished on
`an individual ccmputen’file server basis. rather than through
`a single coordinated approach encompassing multiple sys-
`tems. That is. the computer resources of two computers at
`most (the one processing the files to be backed up and the
`one with the backup device attached) are employed to effect
`the backup process, regardless of the actual number of
`computers networked together.
`Today,
`the absolute numbers of computers networked
`together by organizations are increasing rapidly as is the
`number of different types of computers and operating sys~
`tems in use. At the same time. the numbm' of storage devices
`and the capacities incorporated into each of these units is
`growing even more rapidly. In this environment, the backuptI
`restore approaches which have been traditionally used have
`become less reliable, more expensive, and more consump-
`tive of human time and attention.
`Thus, the need exists for a system designed to overcome
`the limitations of the existing baclmplrestore systems that
`have the following characteristics: (1) is capable of operat-
`ing on a networked computer system incorporating various
`types of computers and operating systems; (2) is capable of
`accommodating a large array of large capacity storage
`devices; (3) is reliable; (4) is mpable of operating with a
`minimum amount of human intervention; and (5) is rela-
`tively inexpensive.
`SUMMARY OF THE INVENTION
`
`The present invention is directed to a system for the
`distributed management of the storage space and data on a
`
`5
`
`10
`
`15
`
`25
`
`35
`
`45
`
`55
`
`2
`neuvorked computer system wherein the networked com-
`puter system includes at least uvo storage devices for storing
`data files comprised of one or more binary objects. The
`distributed storage management system includes means for
`selectively copying the binary objects stored on one of the
`storage devices to another of the storage devices and means
`for calculating a current value for a binary object identifier
`for selected binary objects stored on the storage devices
`wherein the calculation of the binary object identifier is
`based upon the actual data contents of the associated binary
`object. The distributed storage management system further
`includes means for storing the current value of the binary
`object identifier as a previous value of the binary object
`identifier. means for comparing the current value of the
`binary object identifier associated with a particular binary
`object to one or more previous values of the binary object
`identifier associated with that particular binary object and
`means for commanding the means for selectively copying
`binary objects in response to the means for comparing.
`The present invention is further directed to a method for
`the management of the storage space and data on a computer
`system wherein the computer system includes at least two
`storage area for storing data files comprised of one or more
`binary objects. The storage space management method
`includes the following steps: (1) selectively copying the
`binary objects stored in one of the storage areas to another
`of the storage areas; (2) calculating a current value for a
`binary object identifier for selected binary objects stored in
`the storage areas wherein the calculation of the binary object
`identifier is based upon the actual data contents of the
`associated binary object: (3) storing the current value of the
`binary object identifier as a previous value of the binary
`object identifier; (4) comparing the current value of the
`binary object identifier associated with a particular binary
`object to one or more previous values of the binary object
`identifier associated with that particular binary object; and
`(5) controlling the step for selectively copying binary
`objects in response to the step for comparing.
`The system and method, of the present invention for the
`management of the storage space on a computer system
`provide a backupirestore system that is capable of operating
`on a networked computer system incorporating various
`types of computers and operating systems, is capable of
`accommodating a large array of large capacity storage
`devices, is reliable, is capable of operating with a minimum
`amount of human intervention and is relatively inexpensive.
`These and other advantages and benefits of the present
`invention will become apparent from the description of a
`preferred embodiment hereinbelow.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`For the present invention to be clearly understood and
`readily practiced, a preferred embodiment will now be
`described, by way of example only, with reference to the
`accompanying figures wherein:
`FIG. 1 illustrates a simplified representation of a net-
`worked computer systemic which the system and method of
`the present invention may be employed;
`FIG. 2 illustrates the manner in which the Distributed
`Sta-age Manager program of the present invention allocates
`the storage space on each of the strange devices illustrated
`in FIG. 1;
`FIG. 3 illustrates the File Database utilized by the Dis~
`tributed Storage Manager program of the present invention;
`FIG. 4 illustrates the Backup Queue Database utilized by
`the Distributed Storage Manager program of the present
`invention: and
`
`RACK-1003
`RACK-1003
`Page 16 of 27
`Page 16 of 27
`
`
`
`5,649,196
`
`3
`FIGS. 5a~5l illusu'ate flow charts explaining the operation
`of the Disn't'buted Storage Manager program of the present
`invention.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENT
`
`FIG. 1 illustrates a simplified representation of a typical
`networked computer system 10 in which the system and
`method of the present invention for distributed storage
`management on networked computer systems may be
`employed. A remote backup file server 12 is in
`communication, via data path 13, with a wide area network
`14. The wide area network 14 is, in turn, in communication
`with a pltnallty of local area networks 16 via data paths 15.
`Those of ordinary skill in the art will recognize that any
`number of wide area networks 14 may be in communication
`with remote backup file server 12 and that any number of
`local area networks 16 (from 1 to more than 100) may be in
`communication with each wide area network 14. Those of
`ordinary skill in the art will also recognize that the means for
`communication between remote backup file server 12, wide
`area network 14 and local area networks 16 over data paths
`13 and 15 is well known.
`
`Each local area network 16 includes multiple user work-
`stations 18 and local computers 20 each in communication
`with their respective local area netWork 16 via data paths 1'7.
`Again, those of ordinary skill in the art will recognize that
`the means for communication between user workstations 18,
`local computers 20 and local area networks 16 via data paths
`17 is well known. The storage space on each disk drive 19
`on each local computer 20 in the networked computer
`system 10 is allocated as follows and as is showu in FIG. 2:
`(1) operating system files 22; (2) a Distributed Storage
`Manager program 24 which embodies the system and
`method of the present invention (the operation of which is
`described in detail hereinbelow); (3) a File Database 25 (the
`structure of which is described in detail hereinbelow}; (4) a
`Backup Queue Database 26 (the structure of which is
`described in detail hereinbelow); (5) local computer data
`files 28; (6) free disk space 30 and (7) compressed storage
`files 32 (created by the Distributed Storage Manager pro-
`gram 24 of the present invention as is explained more fully
`hereinbelow).
`The Distributed Storage Manager program 24 of the
`present invention builds and maintains the File Database 25
`on one of the disk drives 19 on each local computer 20 in the
`networked computer system 10 according to the structure
`illustrated in FIG. 3. The File Database 25 stores information
`relating to each file that has been backed up by the Distrib-
`uted Storage Manager program 24 since the initialization of
`that program on each local computer 20. The File Database
`25 is comprised of three levels of records organized accord
`ing to a predefined hierarchy. The top level record, File
`Identification Record 34, includes identification information
`for each file that has been backed up by Distributed Storage
`Manager program 24. File Identification Record 34 contains
`the following elements: (1) Record Type 36 (identifies the
`file as either a directory file or a regular file); (2) File
`Location 33 (name of the directory in which the file resides);
`(3) File Name 40 (name of the file); (4) Migration Status 41
`(explained more fully hereinbelow); and (5) Management
`Class 43 (explained more fully hereinbelow).
`For each File Identification Record 34 in File Database
`25, one or more Backup Instance Records 42 are created that
`contain information about the file (identified by File Iden-
`tification Record 34) at the time that the file is backed up.
`
`10
`
`1.5
`
`20
`
`3D
`
`35
`
`45
`
`50
`
`55
`
`65
`
`4
`
`Each time that a file is backed up. 3. Backup Instance Record
`42 is created for that file. Each Backup Instance Record 42
`consists of the following elements: (1) Link to File Identi-
`fication Record 44; (2) Backup Cycle Identifier 46 (identifies
`the particular backup cycle during which the Backup
`Instance Record 42 is created); (3) File Size 48; (4) Last
`Modified Datet'I'ime 50; (5) Last Access DatefI‘ime 52; (6)
`File Attributes 54 (e.g., read-only, system, hidden); (7)
`Delete Date 56 (date on which the file was deleted); and (8)
`Insert Date 57 (date on which the Backup Instance Record
`42 was created).
`Associated with each Backup Instance Record 42 is one
`or more Binary Object Identification Records 58. The Dis-
`n-ibnted Storage Manager program 24 views a file as a
`collection of data streams. A data stream is defined as a
`distinct collection of data within the file that may be changed
`independently from other distinct collections of data within
`the file. For example. a file may contain its normal data and
`may also contain extended attribute data. A user may change
`the extended attribute data without modifying any of the
`normal data or vice versa. The Distributed Storage Manager
`program 24 further divides each data stream into one or
`more binary objects. If the size of the data stream is equal
`to or less than a previously defined convenient maximum
`binary object size (currently one (1) megabyte), then a single
`binary object represents the data stream. If the data stream
`is larger than the maximum binary object size, then the
`Distributed Storage Manager program 24 divides the data
`stream into multiple binary objects, all but the last of which
`are equal in size to the maximum binary object size. A
`Binary Object Identification Record 58 is created for each
`binary object that comprises the file which was backed up
`during the backup cycle identified by the Backup Cycle
`Identifier 46 of a particular Backup Instance Record 42.
`Each Binary Object Identification Record 58 includes the
`following components: (1) Link to Backup Instance Record
`60; (2) Binary Object Stream'l'ype 62 (e.g., data, extended
`attributes, security); (3) Binary Object Size 64; (4) Binary
`Object CRCfl2 66 (explained more fully hereinhelow); (5)
`Binary Object [RC 68 (explained more fully hereinbelow);
`(6) Binary Object Hash 70 (explained more fully
`hereinbelow); and (7) Binary Object Ofi’set 72 (explained
`more fully hereinbelow). The Binary Object Size 64, Binary
`Object CRC32 66, Binary Object LRC 68 and Binary Object
`Hash 70 comprise the Binary Object Identifier 74 which is
`a unique identifier for each binary object to be backed up and
`is discussed in more detail below.
`
`The Distributed Storage Manager program 24 also builds
`and maintains the Backup Queue Database 26 on one of the
`disk drives 19 on each local computer 20 in the networked
`computer system 10 according to the structure illustrated in
`FIG. 4. Each entry (Backup Queue Record 75) in the Backup
`Queue Database 26 is comprised of the following compo—
`nents: (1) Record Type 76 (identifies the file as either a
`directory file or a regular file); (2) File Location 78 (name of
`the directory in which the file resides); (3) File Name 80
`(name of the file): (4) File Status 82 (“new". “modifi
`” or
`“deleted”); (5) File Size 84; (6} Last Modified Datefl'ime 86;
`(7) Last Access DatelTime 88; (3) File Attributes 90 (e.g.,
`read—only. system, hidden); and (9) File Priority 92
`(explained more fully hereinbelow).
`The operation of the Distributed Storage Manager pro~
`grain 24 may be illustrated by way of the flow charts
`depicted in FIGS. 5a through St. For explanation purposes,
`the Distributed Storage Manager program 24 is divided into
`several distinct functions which will he discussed in turn.
`Those of ordinary skill in the art will recognize, however.
`
`RACK-1003
`RACK-1003
`Page 17 of 27
`Page 17 of 27
`
`
`
`5 ,649, 196
`
`5
`
`that each of the distinct functions operates in cooperation
`with the otherfunctions to form a unitary computer program.
`Those of ordinary skill in the art will also recognize that the
`following discussion illustrates the operation of the Distrib-
`uted Storage Manager program 24 on a single local com—
`puter 20, although it should be understood that the Distrib-
`uted Storage Manager program 24 operates in the same
`fashion on each local computer 20 on the networked com-
`puter system 10. The Distributed Storage Manager program
`24 can either be executed on user demand or can be set to
`execute periodically on a user-defined schedule.
`1. Identification of Binary Objeds to be Backed Up
`In the flow chart of FIG. 5a, execution of the Distributed
`Storage Manager program 24 begins at step 100 where the
`Backup Queue Database 26 is built by creating a Backup
`Queue Record 75 for each File Identification Record 34
`found in File Database 25. In this way, a list of files that were
`backed up dining the previous backup cycle is established so
`that it can be determined which files need to be backed up
`during the current backup cycle. To create each Backup
`Queue Record 75, the Backup Instance Record 42 repre-
`senting the most recent backup of the file represented by
`each File Identification Record 34 is located. This determi-
`nation is made by examining the Backup Cycle Identifier46
`in each Backup Instance Record 42. The Backup Cycle
`Identifier 46 may represent either a date (monthldayfyear) or
`numerical value assigned to a particular backup cycle. The
`Backup Queue Record 75 is comprised of certain of the data
`fields of both the File Identification Record 34 and the
`Backup Instance Record 42. Dming the process of creating
`each Backup Queue Record 75, the File Status field 82is set
`to “ ELEI‘ED“. However, if the Delete Date field 56 of the
`most recent Backup Instance Record 42 associated with the
`File Identification Record 34 currently being processed is
`non-mo, indicating that the file has been previously deleted,
`then no Backup Queue Record 75 is created for that File
`Identification Record 34. If the backup that is currently
`being processed for the local computer 20 is not a full
`backup (i.e., all files on all disk drives 19 on the local
`computer 20), then the Distributed Storage Manager pro-_
`gram 24 will only create Backup Queue Records 75 for those
`files that match the backup specifications. For example, if
`only those files that have a file extension of “.EXE" are to
`be backed up. then only File Identification Records 34 that
`correspond to “.EXE” files will be processed
`Program control then continues with step 102 where the
`Distributed Storage Manager program 24 of the present
`invention scans all disk drives 19 on the local computer 20
`that are to be backed up. This operation consists of scanning
`the directory hierarchy on each disk drive 19 on the local
`computer 20 and returning to the Distributed Storage Man-
`ager program 24 certain file block information for each of
`the directory files and regular files that are stored on the disk
`drives 19 to be backed up. A typical computer operating
`system maintains a file block for each file stored on the
`system which includes information such as file location, file
`type. user—assigned file name, file size, creation date and
`time, modify date and time, access date and time and file
`attributes. This operation may be controlled by some param-
`eters that indicate which drives, directories and files are to
`be backed up during a backup operation. However, the
`default operation is to back up all files on all disk drives 19
`on the local computer 20. Program control then continues
`with step 103 where the Distributed Storage Manager pro-
`gram 24 determines whether the file block infonnation for
`an additional file has been located on the disk drives 19. If
`an additional file has been located, program control confine
`
`10
`
`15
`
`35
`
`as
`
`50
`
`55
`
`55
`
`6
`ues with step 104. If an additional file has not been located.
`program control continues with step 116.
`In step 104, the Distributed Storage Manager program 24
`determines whether a Backup Queue Record 75 exists for
`the located file by comparing the file’s file block information
`to the information stored in Backup Queue Database 26. If
`such a Backup Queue Record ‘75 does not exist (i.e., this is
`the first time this file will be backed up), program control
`continues with step 106 where a Backup Queue Record '75
`for the file is created using the information contained within
`the file’s file block. The File Status field 82 for the newly
`created Backup Queue Record 75 is setto “NEW”. Program
`control then continues with step 108 where a user-defined
`priority is assigned to the file and stored in the File Priority
`field 92 of the Backup Queue Record ‘75. This user-defined
`priority may be assigned to the file by methods that are
`well-known to those of ordinary skill in the art. The use of
`the File Priority field 92 by the Disn-ibuted Storage Manager
`program 24 is discussed in more detail hereinbelow. Pro-
`gram control is then returned to step 102.
`If the Distributed Storage Manager program 24
`determines, in step 104, that a Backup Queue Round 75
`exists in the Backup Queue Database 26 for the located file,
`program control continues with step 110 where it is deter~
`mined whether any change has been made to the file. This
`determination is made by comparing the information in the
`file’s file block with the information stored in the file‘s
`Baclnrp Queue Record '75. If any of the values have changed.
`program control continues with step 112 where File Status
`field 82 is set to ‘MODIFIED” and the fields in the Backup
`Queue Record 75 are updated firms the file’s file block
`information. Program control then continues with step 108
`where a user-defined priority is assigned to the file and
`stored in File Priority field 92; program control is then
`returned to step 102. Ifthe determination is made in step 110
`that no change has been made to the file, then, in step 114,
`the Backup Queue Record 75 is deleted from the Backup
`Queue Database 26 since the file does not need to be backed
`up. Following step 114, program control is refined to step
`102.
`
`If the Distributed Storage Manager program 24
`determines, in step 103, that an additional file has not been
`located, program control continues with step 116. In step
`116, the Distributed Storage Manager program 24reads each
`Backup Queue Record 75 in Backup Queue Database 26,
`one at a time. The Backup Queue Records 75 in Backup
`Queue Database 26 represent all of the files that must be
`backed up by the Distributed Storage Manager program 24
`during the present backup cycle. Program control continues
`with step 117 where the Distributed Storage Manager pro-
`gram 24 determines whether a next Backup Queue Record
`75 has been located in Backup Queue Database 26. Ifa next
`Backup Queue Record 75 has been located, program control
`continues with step 118; otherwise, program control contin-
`ues with step 119, where the routine illustrated by the flow
`chart of FIG. 5a is terminated In step 118, the Distributed
`Storage Manager program 24 determines whether the File
`Status field 82 in the Backup Queue Record 75 currently
`being prooessed is set to ‘DELEI'ED". If the File Status
`field 82 is set to ‘ EIEI‘ED“, program control continues
`with step 120 where the Delete Date field 56 in the most
`recent Baclnrp Instance Record 42 associated with the file
`identified by the Backup Queue Record 15 ctn'rently being
`processed is setto the cm'rent date. Alist of all Binary Object
`Identification Records 58 associated with the Backup
`Instance Record 42 for the file identified by the Backup
`Queue Record 75 currently being processed is placed in a
`
`RACK-1003
`RACK-1003
`Page 18 of 27
`Page 18 of 27
`
`
`
`5,649,196
`
`7
`
`delete queue (not shown) that will be used by Distributed
`Storage Manager program 24 to delete all Binary Object
`Identification Records 58 for binary objects that have been
`deleted fi'om the disk drives 19 of local computer 20.
`Program control then continues with step 122 where the
`Backup Queue Record 75 unrently being processed is
`deleted from the Backup Queue Database 26. Program
`control is then returned to step 116.
`If the Distributed S