throbber
||||||||||||l|||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||||
`
`USllt1641528tlB1
`
`(12) United States Patent
`US 6,415,280 B1
`(1(1) Patent N0.:
`Farber et a1.
`(45) Date of Patent:
`*Jui. 2, 2002
`
`(54)
`
`(75)
`
`t 73)
`
`IDEN'I‘IFYING AND REQUESTING DATA IN
`NETWORK USING IDENTIFIERS WHICH
`ARE BASED ON CONTENTS OF DATA
`
`Inventors: David A. Farlier, Ojai, CA (US);
`Ronald D. Laehman. Northhrook, [1-
`(US)
`
`Assignees: Kineteeh, Ine.. Northbrook. IL (US);
`Digital Island, lnc., San Francisco, CA
`(US)
`
`(‘l
`
`Notice:
`
`Subject to any disclaimer, the [em] 01' this
`patent is extended or adjusted under 35
`use 154th) by 0 days.
`
`This patent is subject to a tenninal dis-
`claimer.
`
`(21)
`
`(22
`
`Appl. No;
`liiled:
`
`097283,160
`
`Apr. 1, 1999
`
`Related U.S. Application Data
`
`(63)
`
`(51)
`(52
`
`(58)
`
`(56)
`
`Division of application No. [WJGtJ,tJ?9. filed on Oct. 24,
`1997, now Pat. No. 5,978,791. which is a continuation of
`application No. 087425,16I'J.
`tiled on Apr. 11, 1995. now
`abandoned.
`
`Int. Cl.7
`U.S. Cl.
`
`.thtSF 17730
`270772370773; 707710;
`707’1017097203; 7097219; 7097229
`Field of Search
`"7078,10, 101,
`70772;7097203, 219, 229
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`4,922,417 A
`5,202,982 A *
`5,287,499 A
`5,141,477 A
`5,452,442 A *
`5,542.08? A
`
`571990
`47 199 3
`271994
`871994
`971995
`771996
`
`70771
`Churni et at.
`70772
`Gramlich et all
`70772
`Ncmcs
`
`71,197226
`l’itkin Ct 31.
`
`Nelson et al.
`7077205
`
`Neimal et al.
`707710
`
`..
`
`671997 Stcfik et al.
`5,638.443 A
`{371997 Hamilton el al.
`5,040,564 A "
`971998 Balick et a1.
`5,802,291 A
`971998 Nguyen
`5,897,494 A *
`57199)
`(iudmundson el 31
`5,907,704 A
`(1,006,018 A " 1271999 Burnett etal.
`6,134,6fl3 A * 1072011.!
`Jones et a1.
`
`
`
`785754
`7097303
`.. 7097202
`,. 70771
`
`. 395731149
`7037330
`
`OTHER PUBIJCATIONS
`
`Gwertzman, James, et al. "The Case for Geographical Push-
`Caching." Technical Report I-IU ‘I'R 34—94 (excerpt), I-larv
`vard University, DAS, Cambridge, MA (12138, 1994, 2 pgs.
`Grigni, Michelangelo, et a1. “'l‘igltt Bounds on Minimum
`Broadcasts Networks.” SIAM Journal of Discrete Math-
`ematics, vol. 4, No. 2, May 199], pp. 207—222.
`
`{List continued on next page.)
`
`Primary E'xnrrtiner—Jean R. Homere
`('74) Attorney, Agent, or Firm—Pillsbury Winthrop LLP
`Intellectual Property
`
`(57)
`
`ABSTRACT
`
`In a system in which a set ol‘data items are distributed across
`a network of servers, at least some of the data items being
`cached versions ofdata items from a source server, a content
`delivery method includes determining a data identifier for a
`particular data item.
`the data identifier being determined
`using a given function of the data comprising the particular
`data item; and responsive to a request [or the particular data
`item, the request including at least the data identifier of the
`particular data item. providing the particular data item from
`a given one of the servers of the network of servers. The
`request for the particular data item may be resolved based on
`a measure of availability of at least one of the servers, where
`the measure of availability may he a measurement of band-
`width to the server; a measurement of a cost ofa connection
`to the server, and7or a measurement of a reliability of a
`connection to the server. The function used to determine the
`identifier may be a message digest
`function or
`a hash
`function.
`
`55 Claims, 31 Drawing Sheets
`
`,..,._----____..__ .
`PROCESSOR
`
`.
`
`_ .,_
`
`. ____J:.—.______ _______-.,
`102
`i
`
`STORAGE
`DEWE
`
`"-
`
`
`
`02
`PRO GES‘SOR
`
`PROCESSOR
`
`1m:
`
`
`
`
`
`
`STORAGE
`
`RACK-1001
`RACK-1001
`Page 1 of 56
`Page 1 of 56
`
`

`

`US 6,415,280 B1
`Page 2
`
`OTHER PUBLICATIONS
`
`Devine, Robert. “Design and Implementation of DDH: A
`Distributed Dynamic Hashing Algorithm." In Proceedings
`of 4th International Conference on Foundations of Data
`Organizations and Algorithms, 1993, pp. 101—114.
`Deering, Stephen, et a1. “Multicast Routing in Datagram
`Internetworks and Extended LANs." ACM Transactions on
`Computer Systems, vol. 8, No. 2, May 1990, pp. 85—110.
`Cormen. Thomas H.. et al. Introduction to Algorithms, The
`MIT Press, Cambridge, Massachusetts, 1994, pp. 219—243,
`991—993.
`Naor, Moni, et a]. "The Load, Capacity and Availability of
`Quorum Systems." In Proceedings of the 35th IEEE Sym-
`posium on Foundations of Computer Science, Nov. 1994,
`pp. 214—225.
`for Space—
`"Psuedorandom Generators
`Nisan, Noam.
`Bounded Computation." In Proceedings of the Twenty—
`Sccond Annual ACM Symposium on Theory of Computing,
`May [990, pp. 204-2l2.
`Palmer, Mark et al. “Fido: ACache that Leams to Fetch.” In
`Proceedings of the 17111 International Conference on Very
`Large Data Bases, Sep. 1991, pp. 255—264.
`Peleg, David, et al. "The Availability oi" Quorum Systems."
`Information and Computation 123, 1995, 210—223.
`Rabin, Michael. “Elficient Dispersal of Information for
`Security, Load Balancing, and Fault Tolerance.” Journal of
`the ACM, vol. 36, No. 2, Apr. 1989, pp. 335—348.
`Ravi. IL. “Rapid Rumor Ramification: Approximating the
`Minimum Broadcast Time." In Proceedings of the 35111
`IEEE Symposium on Foundation of Computer Science, Nov.
`1994, pp. 202—213.
`Schmidt. Jeanette, et aI. “Chernotf—Hoctfding Bounds for
`Applications with limited Independence." In Proceedings
`ot‘ the 4th ACS—SLAM Symposium on Discrete Algorithms,
`1993. pp. 331—340.
`Tarjan, Robert Entire, el al. “Storing a Sparse Table."
`Communications of the ACM, vol. 22, No. 1], Nov. 1979,
`pp. 606—611.
`
`Wegrnan, Mark, et al. " New Hash Functions and Their Use
`in Authentication and Set Equality.” Journal of Computer
`and System Sciences vol. 22, Jun. 1981, pp. 265-279.
`
`Vitter. Jelfrey Scott, et a1. "Optimal Prefetching via Data
`Compression. ” In Proceedings 0132nd IEEE Symposium on
`Foundations of Computer Science, Nov. 1991, pp. 121—130.
`
`Fredman, Michael, el al. "Storing a Sparse Table with 0(1)
`Worst Case Access Time.” Journal of the Association for
`Computing Machinery, vol. 31, No. 3, Jul. 1984, pp.
`538—544.
`
`Yao, Andrew Chi—Chih. “Should Tables be Sorted?” Journal
`of the Association for Computing Machinery, vol. 28. No. 3,
`Jul. 1931, pp. 615—628.
`
`Floyd, Sally, et al. “A reliable Multicast Framework for
`Light—Weight Sessions and Application level Framing." In
`Proceedings of ACM SIGCOMM ‘95, pp. 342—356.
`
`Feeley, Michael, et :1]. “Implementing Global Memory Man-
`agement in a Workstation Cluster." In Proceedings of the
`15th ACM Symposium on Operating Systems Principles,
`1995, pp. 201—212.
`Caner, J. Lawrence, et al. “Universal Classes of Hash
`Functions." Journal of Computer and System Sciences, vol.
`18, No. 2, Apr. 1979, pp. 143—154.
`
`Patent Abstracts of Japan, “Electronic Mail Multiplexing
`System and Communication Control Method in The Sys-
`tem.” Jun. 30, 1993, JP 051625293.
`
`Kim ct al., ”Experiencess with Tripwire: Usinglntergrity
`Checkers For Intruosion Detection”, Coast Labs. Dept. of
`Computer Sciences Purdue University. Feb. 22, 1995, pp.
`1—12.
`
`Kim et al., “The Design and Implementation of Tripwire: A
`file System Integrity Checker“, Coast Labs. Dept. of Com-
`puter Sciences Purdue University, Nov. 19, 1993, pp. 1—21.
`
`‘ cited by examiner
`
`RACK-1001
`RACK-1001
`Page 2 of 56
`Page 2 of 56
`
`

`

`US. Patent
`
`
`
`Ac:.07.
`
`m
`
`m2:
`
`
`.memwmoomn.....memwm...00mg
`
`
`m.mosmomoSma
`
`5555moéofi
`
`m
`
`US 6,415,280 B1
`
`
`
`
`
`mmOmmmooEmomwmoomn.53305mNSNS«2
`
`c_.
`
`RACK-1001
`RACK-1001
`Page 3 of 56
`Page 3 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 2 0f 31
`
`US 6,415,280 B1
`
`Hagar—.0
`
`wot/mo
`
`3:.
`GE
`
`RACK-1001
`RACK-1001
`Page 4 of 56
`Page 4 of 56
`
`

`

`US. Patent
`
`.m._
`
`J
`
`m2,
`
`f03m
`
`US 6,415,280 B1
`
`Mon...
`
`ouv02
`
`
`
`MAE.....mn=u_m3."—
`
`
`
`S.EoBmmE...EchommaEoSmmE
`
`ma:a:m:
`
`NNr
`
`may«3
`
`2...m|=m
`
`
`
`
`
`pzmsumw....Emsommpzmzwmw
`
`RACK-1001
`RACK-1001
`Page 5 of 56
`Page 5 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 4 0f 31
`
`US 6,415,280 B1
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`FIG. 3
`o
`
`F1112 ID
`
`
`
`-
`
`_ o
`
`I
`
`a
`
`.40
`
`E42
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`FIG. 4
`
`E oiration
`
`Groomin - delete count
`
`FIG. 5
`
`RACK-1001
`RACK-1001
`Page 6 of 56
`Page 6 of 56
`
`

`

`US. Patent
`
`Jul.2,2002
`
`Sheet 5 of 31
`
`US 6,415,280 B1
`
`Hocfi.fiho :DHUMUDH
`flUHd—Dm
`
`.flmumufiws
`
`mfimnnuum
`
`wfimzmans
`
`RACK-1001
`RACK-1001
`Page 7 of 56
`Page 7 of 56
`
`wfimz
`
`DHHommuooum
`
`fioflumnm.o
`
`.Q
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 6 0f 31
`
`US 6,415,280 B1
`
`FIG. IO(a)
`
`SIMPJ_E
`
`DA TA ITEM
`
`‘F-finfifiu-nfihfian—n—nF--‘
`
`APPEND LENGTH MODULO 32 OF
`
`DATA ITEM
`
`COMPUTE MD FUNCTION ON
`
`DATA ITEM
`
`8214
`
`4"
`
`-—.—_-._-_.-—-_-_--—-—--"
`
`TRUE NAME
`
`l
`
`RACK-1001
`RACK-1001
`Page 8 of 56
`Page 8 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 7 0f 31
`
`US 6,415,280 B1
`
`YES
`
`
`
` $216
`
`DATA ITEM
`
`SIMPLE?
`
`
`$220
`
`
`
`PARTITION DATA ITEM INTO
`SEGMENTS
`
`
`FIG.
`
`|O('b)
`
`3222
`
`ASSIMILATE EACH SEGMENT
`
`(COMPUTING ITS TRUE NAME)
`
`SEGMENT TRUE NAMES
`
`8224
`
`CREATE INDIRECT BLOCK OF
`
`II
`
`lII II
`
`COMPUTE TRUE
`
`NAME OF SIMPLE .
`DATA ITEM
`t
`————————————— l
`
`3226
`ASSIMILATE INDIRECT BLOCK
`
`(COMPUTING ITS TRUE NAME)
`
`ITEM
`
`8228
`REPLACE FINAL 32 BITS 0F TRUE
`NAME WITH LENGHT MOD 32 OF DATA
`
`RACK-1001
`RACK-1001
`Page 9 of 56
`Page 9 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 8 0f 31
`
`US 6,415,280 B1
`
`>m._.zmmeD
`
`
`
`we.m3."—m><I
`
`mm:
`
`mm>
`
`mmmw
`
`
`
`0—m|=n_mmOhm
`
`mmmw
`
`
`
`n:we".wkmAmo
`
`RACK-1001
`RACK-1001
`Page 10 of 56
`Page 10 of 56
`
`
`
`
`
`v0....FZDOOmm:.55,.
`
`n=MJEmmOFwc
`
`
`
`
`
`whim—n.EMT—POhum..
`
`
`
`
`
`mE<zmam...wmoo
`
`w.=n_mDmhz_kwsfl
`
`m>mhm5wm
`
`wmmm
`
`ommw
`
`mz_5mm_._.mn_
`
`m=2<zmam...
`
`__.o_u_
`
`
`
`>m._.zm_gmzwkdmmo..
`
`
`
`
`
`
`
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 9 0f 31
`
`US 6,415,280 B1
`
`FIG. l2
`
`
`
`FILE
`
`3240
`
`UPDATE
`
`
`
`
`
`
`
`3238
`
`
`
`LOCKED?
`
`DEPENDENCY
`
`LIST
`
`
`
`8242
`
`
`
`
`
` 8244
`UPDATE CACHE
`
`COMPRESS
`
`(IF DESIRED)
`
`SEND MESSAGE TO
`
`CACHE SERVER TO
`
`MIRROR
`
`
`
` 3246
`
`
`(IF DESIRED)
`
`RACK-1001
`RACK-1001
`Page 11 of 56
`Page 11 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 10 0f 31
`
`US 6,415,280 B1
`
`5250
`SEARCH FOR
`THE
`
`PATHNAME
`
`. .
`
`.
`
`. '
`
`FAIL
`
`
`
`LDE INCLUDES
`
`TRUE NAME?
`
`
`
`8258
`
`
`
`
`
`ASSIMILATE
`LDE IDENTIFIES
`FILE ID
`DIRECTORY?
`
`
`
`
`
`Y8
`
`8256
`
`FREEZE
`
`DIRECTORY
`
`RACK-1001
`RACK-1001
`Page 12 of 56
`Page 12 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 11 MB]
`
`US 6,415,280 B1
`
`CONFIRM THAT
`
`TRUE NAME
`
`EXISTS LOCALLY
`
`
` 8260
`
`
`
`
`
`
`
`
`8262
`SEARCH FOR
`
`
`PATH NAME IN
`
`
`
`LDE TABLE
`
`FIG. l4
`
` 8264
`
`
`
`CONFIRM THAT
`
`DIRECTORY
`
`EXISTS
`
`
`
`3266
`
`
`
`
`
`
`NAMED FILE
`
`EXISTS?
`
`
`
`
`8268
`
`
`
`DELETE
`
`TRUE FILE
`
`
`
`CREATE
`
`
`
`ENTRY IN LDE
`
` $270
`
`
`& UPDATE
`
`RACK-1001
`RACK-1001
`Page 13 of 56
`Page 13 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 12 0f 31
`
`US 6,415,280 B1
`
`mmZOmwmm
`
`Mahmom
`
`vhmw
`
`”rpm02mm
`
`waOmmmm
`
`2mm
`
`
`
`m5."—mnmkMMHZm
`
`owmw
`
`:5592:05.a9.3635.hmwncmm
`mo".
`
`am;02
`
`
`
`“memmwoom.4.29.5604.0..—
`
`mmZOmwwm
`
`mahdwmz
`
`
`
`93w
`
`mu:
`
`
`
`0P2.omzmnhmm
`
`Nwmw
`
`
`
`mom...>n=mm>
`
`"aw.=n_
`
`3%:me
`
`
`
`m|=..._02—"—
`
`9.0.“.
`
`RACK-1001
`RACK-1001
`Page 14 of 56
`Page 14 of 56
`
`
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 13 0f 31
`
`US 6,415,280 B1
`
`239.91..
`
`HID
`
`vmmm
`
`.Fzmno
`
`305mm
`
`AmEOmmmoomn.
`
`mm;
`
`mmmw
`
`>z<
`
`nwhOme
`
`mmOmmmoomm
`
`mmmw
`
`Pzwfio
`
`mwzommwm
`
`m3aon
`
`
`
`.....dqlu-lli-
`
`wkw<oo<0mm
`m5:.0&2
`
`mMZOlmmm
`
`b3what.
`
`to
`
`RACK-1001
`RACK-1001
`Page 15 of 56
`Page 15 of 56
`
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 14 0f 31
`
`.4
`
`Emmw8me
`
`
`
`02mm6.,mpwflflmmmwmwxmmm>wqumwm%%om...“mackm>mmmmmm-ohmw<wmmz
`
`momnom20
`
`
`
`
`
`
`max...”.0NON—30w
`
`
`
`EON."—wmmufiamE<z
`
`«ZO_._.<Z_._.wmo
`
`momww
`
`mon—mnt.n5KOO..—
`
`00¢aMES...may:
`
`
`
`9.29.5601.momnow
`
`
`
`mon—wo.mom=0mO...
`
`
`
`mE<zmam...
`
`ommw
`
`mmOhm
`
`n:MOmwMUOm—n.
`
`ormmm
`
`lB08.1
`
`2.,>53805H.5:0h03nz<.«55$
`
`35—.0E
`
`RACK-1001
`RACK-1001
`Page 16 of 56
`Page 16 of 56
`
`
`
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 15 0f 31
`
`US 6,415,280 B1
`
`Emm>
`
`MO”.n:mJE
`
`~>mhzm9:...
`
`02
`
`wmmmmm;
`
`omum
`
`mmmmmfioomo
`
`ommwmmmEOU
`
`we.MAE
`
`mm;
`
`.=<n_
`
`Oz
`
`
`
`>N—hzmm...=n_may.
`
`
`
`max...:0".mu...2.
`
`«MES;
`
`3E
`of.
`
`RACK-1001
`RACK-1001
`Page 17 of 56
`Page 17 of 56
`
`
`
`
`
`
`
`
`

`

`US. Patent
`
`m.
`
`2
`
`{0m%
`
`US 6,415,280 B1
`
`ammomw
`
`Nomm
`
`>n=._.OZ
`
`
`
`mhowl—mmNEOEOZm._.<00._
`
`Q~mfifikzum
`
`
`
`
`
`
`
`mm:momDOmm.=n_who—2mm.
`
`MEOmeAEMmzmhmflqdwm
`
`avmomaom
`
`:32.07..
`
`mm>
`
`RACK-1001
`RACK-1001
`Page 18 of 56
`Page 18 of 56
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 17 0f 31
`
`US 6,415,280 B1
`
`
`
`“Sm:.o_n_
`
`m0".n:m...“—
`
`«MAEm3”;
`
`Nwmw
`
`
`
`me...9,—5.2
`
`.200.—m..."—
`
`aha—n.
`
`NDE".0>moomm
`
`whmfimomwmw
`
`mud—mmam...
`
`9mm
`
`
`
`mmEFzmQmo;—
`
`
`
`mnm...02:.me
`
`hm..."—
`
`oumm
`
`
`
`2&2Edam—0
`
`
`
`MAEIUPgom
`
`RACK-1001
`RACK-1001
`Page 19 of 56
`Page 19 of 56
`
`
`
`
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 18 0f 31
`
`US 6,415,280 B1
`
`wmmw
`
`m>OEmmwn:w.__u_m><w
`mm...
`
`>m._.Zm_
`
`wm>
`
`ommw
`
`
`
`EmzO._.m.=n_>m00
`
`a.man.#65.m.__"_
`
`.m._mEm32_
`
`
`
`mm:hzmsmmoma
`
`#2300
`
`
`
`3V?.9“.
`
`RACK-1001
`RACK-1001
`Page 20 of 56
`Page 20 of 56
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 19 0f 31
`
`US 6,415,280 B1
`
`
`
`853.2332:fiwwmmmwwoz<we“.
`
`55.5593%mEzamomam
`88:93mo“.
`m3:at2.>55me
`
`
`
`amp—bumszmzw
`
`A32.oE
`
`Nmmw
`
`HZMEMKOZ.
`
`
`
`x004mNmmmn—
`
`hmmw
`
`
`
`gm:whdmmo
`
`Sub<._.<n
`
`RACK-1001
`RACK-1001
`Page 21 of 56
`Page 21 of 56
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 20 0f 31
`
`US 6,415,280 B1
`
`03w
`
`omoomx
`
`
`
` mmmm
`
`:Ofim0".
`
`4<ZOFED<
`
`cum—mun
`
`ZO_H<EmOn_Z_
`
`2:2.o_n_
`
`OH>mhzm004
`
`«Fadgmz
`
`Emt
`
`m._.<z_nmom:w
`
`DZ<m..=n_
`
`m1...2—>mOPUmw=D
`
`
`
`>m0h0wm~otha
`
`mvmm
`
`NIHw._.<..=s=mm<
`
`Em...—<._.<n_E2
`
`3mm
`
`szEmMOm—D
`
`mNmmEm3
`
`go0.. ,
`
`RACK-1001
`RACK-1001
`Page 22 of 56
`Page 22 of 56
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 21 0f 31
`
`US 6,415,280 B1
`
`wvmw
`
`gamm—
`
`>m0h0mm5
`
`ommm
`
`
`
`4...Dn.mbqwmo
`
`m5<zzh<m
`
`Nmmm
`
`OhFEE$2.:—
`
`mE<zmay:
`
`mvmm
`
`mam...msz
`
`1200..—mgr...
`
`mmmw
`
`zo<mm0".
`
`>m0kowm5
`
`>m._.2m_
`
`macs.Oz
`
`wmahzm
`
`vmmw
`
`mZOD
`
`ON
`0E
`
`RACK-1001
`RACK-1001
`Page 23 of 56
`Page 23 of 56
`
`
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 22 0f 31
`
`US 6,415,280 B1
`
`
`
`WAIT FOR
`
`FREEZE LOCK
`T0 TURN OFF
`
` S354
`
`
` S356
`ENTRY
`
`FIND TFR
`
`FIG.2|
`
`3358
`
`
`
`
`
`
`DECREMENT
`
`REFERENCE
`COUNT
`
`
`
`
`
`S362
`
`
`REFERENCE COUNT IS
`DELEI'E
`
`ZERO 8: N0 DEPENDENT
`
`
`TRUE FILE
`
`SYSTEMS IN TFR?
`
`
`
`S364
`
`
`REMOVE FILE ID .
`
`AND COMPRESSED
`
`FILE ID
`
`
`
`RACK-1001
`RACK-1001
`Page 24 of 56
`Page 24 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 23 0f 31
`
`US 6,415,280 B1
`
`
`
`GET
`
` 8365
`
`OPERATION
`
`
`
`YES
`
`FIG. 22
`
`$368
`
`
`ASSIMILATE
`
`
`
`YES
`
`
`
`
`
`COPY OR DELETE
`COMPOUND?
`
`
`
`
`
`
`
`
`COUNT OF EACH
`COMPONENT
`
`NAME IN AUDIT
`
`
`
`S3YQ
`
`
`
`
`
`FOR EACH PARENT
`
`DIRECTORY 0R FILE,
`UPDATE USE COUNT.
`LAST ACCESS AND
`MODIFY TIMES
`
`
`
`RACK-1001
`RACK-1001
`Page 25 of 56
`Page 25 of 56
`
`
`
`S378
`
`MODIFY USE
`
`
`
`
`S370
`RECORD TRUE
`
`
`FILE
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 24 0f 31
`
`US 6,415,280 B1
`
`FIG. 23
`
`
`
`
`
`
` S382
`\RHUFY
`
`
`GROOMING
`
`LOCK OFF
`
`
`
` 3386
`
`SET GROOM
`
` 8384
`
`SET
`
`GROOMING
`
`LOCK
`
`
`COUNTS
`
`RACK-1001
`RACK-1001
`Page 26 of 56
`Page 26 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 25 0f 31
`
`US 6,415,280 B1
`
`
`
`
`FIND LDE
`
`S388
`
`FIG. 24
`
`RECORD
`
`
`
` S392
`
`INCREMENT
`
`
`
`
`GROOMING
`
`DELETE COUNT
`
`
`SIZES
`
` 3394
`
`ADJUST FILE
`
`RACK-1001
`RACK-1001
`Page 27 of 56
`Page 27 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 26 0f 31
`
`US 6,415,280 B1
`
`FIG. 25
`
`LOCK
`
`S398
`
`UNLOCK
`
`GROOMING
`
`RACK-1001
`RACK-1001
`Page 28 of 56
`Page 28 of 56
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 2‘? 0f 31
`
`US 6,415,280 B1
`
`3va.9“.
`
`mun—himmu=n_
`
`«>I_I_<OO.._
`
`vowm
`
`tmiOflm
`
`meO
`
`Nova
`
`MHz—um—
`
`«awkdmmo
`
`mmvm >.._z0.0dwm
`
`tmiOfla
`
`RACK-1001
`RACK-1001
`Page 29 of 56
`Page 29 of 56
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 28 of 31
`
`US 6,415,280 B1
`
`owvm
`
`.2004mxgz
`
`
`
`a...205mm)
`
`n:w..=u_2m.4th
`
`NEH50m".
`
`BENGE
`
`#02n:300..
`
`hum—:00."
`
`2.5
`
`02—mm
`
`>4MEJ¢EO
`
`vNVM
`
`zmwfimm
`
`D.
`
`
`
`mud—u.IOhEOm
`
`m5."—mwdamm
`
`movw
`
`whom—mo
`
`mn=m$8.ng
`
`RACK-1001
`RACK-1001
`Page 30 of 56
`Page 30 of 56
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 29 0f 31
`
`US 6,415,280 B1
`
`.2910”?—
`
`ZOFmJMD
`
`mm.»
`
`..OomOOmmmo...0.
`
`
`
`2.m09.5.004Md...—
`
`>420.D<m_m
`
`w>MOhommE
`
`
`
`ABNN.o_n_
`
`vmvm
`
`
`
`mnm...>n=._.zwn=
`
`
`
`mat.EOMEMd...—
`
`mE<Z
`
`Nva
`
`
`
` >~Fzmkma.man.MES—ammo
`
`mOmmom—00mm
`
`m3."—
`
`RACK-1001
`RACK-1001
`Page 31 of 56
`Page 31 of 56
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 30 0f 31
`
`US 6,415,280 B1
`
`nmvw
`
`mhmfimn
`
`m.=n_".0
`
`
`
`>n_00IUPdw—Uw
`
`wm>
`
`why—Emath
`
`
`
` szm..5500mwa
`
`«midimay;
`
`0202m<ImJE
`
`wm>
`
`omvw
`
`mhmnfio
`
`m.=n_m3”:
`
`mmvw
`
`
`
`Oh>m._.zmn_n_<
`
`
`
`MAE.5034
`
`
`SKN.07..
`
`wmvw
`
`
`
`mm:museum
`
`mzoE:58
`
`RACK-1001
`RACK-1001
`Page 32 of 56
`Page 32 of 56
`
`
`
`
`
`
`

`

`US. Patent
`
`Jul. 2, 2002
`
`Sheet 31 0f 31
`
`US 6,415,280 B1
`
`mid:mam...wN.m:u—mean...
`
`mmwm
`
`Ozmmr
`
`vmfiw
`
`«QZDOu
`
`mmOHhmmDOm—m
`
`hownm¢ngL
`
`mmvw
`
`m>_._.<0m2
`
`mmZOmwmm
`
`vaw
`
`hwy—(EEO...—
`
`ham30mm
`
`
`
`0.m.=n_mun—3402
`
`nmwmmmn—EOOm0
`
`fin:Md...
`
`Elam
`
`mZCwOm
`
`mmZOmwmm
`
`RACK-1001
`RACK-1001
`Page 33 of 56
`Page 33 of 56
`
`
`
`
`
`
`

`

`US 6,415,280 B1
`
`1
`IDENTIFYING ANI) REQUESTING DATA IN
`NETWORK USING IDENTIHERS WHICH
`ARE BASED ON CONTENTS 01"” DATA
`
`This is a division of application Ser. No. (t8t96f),079,
`filed Oct. 24, 1997, now US. Pat. No. 5,9?8,791 filed Oct.
`24, 2001 which is a continuation of Ser. No. (l8t425,160,
`filed Apr. 11, 1995, now abandoned.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`
`This invention relates to data processing systems and,
`more particularly, to data processing systems wherein data
`items are identified by substantially unique identifiers which
`depend on all of the data in the data items and only on the
`data in the data items.
`
`ll]
`
`15
`
`2. Background of the Invention
`Data processing (DP) systems, computers, networks of
`computers, or the like, typically offer users and programs
`various ways to identify the data in the systems.
`Users typically identify data in the data processing system
`by giving the data some form of name. For example, a
`typical operating system (OS) on a computer provides a file
`system in which data items are named by alphanumeric -
`identifiers. Programs typically identify data in the data
`processing system using a location or address. For example,
`a program may identify a record in a file or database by using
`a record number which serves to locate that record.
`
`3o
`
`In all but the most primitive operating systems, users and
`programs are able to create and use collections of named
`data items, these collections themselves being named by
`identifiers. These named collections can then, themselves,
`be made part of other named collections. For example, an
`OS may provide mechanisms to group files {data items) into
`directories (collections). These directories can then, them-
`selves be made part of other directories. A data item may
`thus be identified relative to these nested directories using a
`sequence of names, or a so-called pathname, which defines
`a path through the directories to a particular data item (file
`or directory).
`As another example, a database management system may
`group data records (data items) into tables and then group
`these tables into database files (collections). The complete
`address of any data record can then be specified using the
`database file name, the table name, and the record numberof
`that data record.
`
`Other examples of identifying data items include: identi-
`fying files in a network file system, identifying objects in an
`object-oriented database,
`identifying images in an image
`database, and identifying articles in a text database.
`In general, the terms "data" and "data item” as used herein
`refer to sequences of bits. Thus a data item may be the
`contents of a file, a portion of a file, a page in memory, an
`object in an object-oriented program, a digital message, a
`digital scanned image, a part of a video or audio signal, or
`any other entity which can be represented by a sequence of
`bits. The term "data processing" herein refers to the pro-
`cessing of data items, and is sometimes dependent on the
`type of data item being processed. For example, a data
`processor for a digital image may differ from a data pro-
`cessor for an audio signal.
`In all of the prior data processing systems the names or
`identifiers provided to identify data items (the data items
`being tiles, directories, records in the database, objects in
`object-oriented programming, locations in memory or on a
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`physical device, or the like) are always defined relative to a
`specific context. For instance, the file identified by a par-
`ticular file name can only be determined when the directory
`containing the file (the context) is known. The file identified
`by a pathname can be determined only when the file system
`(context) is known. Similarly,
`the addresses in a process
`address space, the keys in a database table. or domain names
`on a global computer network such as the Internet are
`meaningful only becaum they are specified relative to a
`COHIEXI.
`
`In prior art systems for identifying data items there is no
`direct relationship between the data names and the data item.
`The same data name in two different contexts may refer to
`different data items, and two different data names in the
`same context may refer to the same data item.
`In addition, because there is no correlation between a data
`name and the data it refers to, there is no a priori way to
`confirm that a given data item is in fact the one named by a
`data name. For instance, in a DP system, if one processor
`requests that another processor deliver a data item with a
`given data name.
`the requesting processor cannot,
`in
`general, verify that
`the data delivered is the correct data
`(given only the name}. Therefore it may require further
`processing, typically on the part of the requestor, to verify
`that
`the data item it has obtained is,
`in fact,
`the item it
`requested.
`A common operation in a DP system is adding a new data
`item to the system. When a new data item is added to the
`system, a name can be assigned to it only by updating the
`context
`in which names are defined. Thus such systems
`require a centralized mechanism for the management of
`names. Such a mechanism is required even in a multi-
`processing systcm when data items are created and identified
`at separate processors in distinct locations, and in which
`there is no other need for communication when data items
`are added.
`
`In many data processing systems or environments, data
`items are transferred between different
`locations in the
`system. These locations may be processors in the data
`processing system, storage devices, memory, or the like. For
`example, one processor may obtain a data item from another
`processor or from an external storage device, such as a
`floppy disk, and may incorporate that data item into its
`system (using the name provided with that data item).
`However, when a processor (or some location) obtains a
`data item from another location in the DP system,
`it
`is
`possible that this obtained data item is already present in the
`system (either at the location of the processor or at some
`other location accessible by the processor) and therefore a
`duplicate of the data item is created. This situation is
`common in a network data processing environment where
`proprietary software products are installed from floppy disks
`onto several processors sharing a common file server. In
`these systems, it is often the case that the same product will
`be installed on several systems, so that several copies of
`each file will reside on the common file server.
`
`In some data processing systems in which several pro-
`cessors are connected in a network, one system is designated
`as a cache server to maintain master copies of data items,
`and other systems are designated as cache clients to copy
`local copies of the master data items into a local cache on an
`as—nceded basis. Before using a cached item, a cache client
`must either reload the cached item, be informed of changes
`to the cached item, or confirm that the master item corre-
`sponding to the cached item has not changed. In other words,
`a cache client must synchronize its data items with those on
`
`RACK-1001
`RACK-1001
`Page 34 of 56
`Page 34 of 56
`
`

`

`US 6,415,280 B1
`
`3
`the cache server. This synchronization may involve reload-
`ing data items onto the cache client. The need to keep the
`cache synchronimd or reload it adds significant overhead to
`existing caching mechanisms.
`In view of the above and other problems with prior art
`systems, it is therefore desirable to have a mechanism which
`allows each processor in a multiprocessor system to deter-
`mine a common and substantially unique identifier for a data
`item, using only the data in the data item and not relying on
`any sort of context.
`It is further desirable to have a mechanism for reducing
`multiple copies ofdata items in a data processing system and
`to have a mechanism which enables the identification of
`identical data items so as to reduce multiple copies.
`It is
`further desirable to determine whether two instances of a
`data item are in fact the same data item, and to perform
`various other systems’ functions and applications on data
`items without relying on any context information or prop-
`erties of the data item.
`
`It is also desirable to provide such a mechanism in such
`a way as to make it
`transparent
`to users of the data
`processing system, and it is desirable that a single mecha~
`nism be used to address each of the problems described
`above.
`
`SUMMARY OF THE INVENTION
`
`This invention provides, in a data processing system, a
`method and apparatus for identifying a data item in the
`system. where the identity of the data item depends on all of
`the data in the data item and only on the data in the data item.
`Thus the identity of a data item is independent of its name,
`origin, location, address, or other information not derivable
`directly from the data, and depends only on the data itself.
`This invention further provides an apparatus and a method
`for determining whether a particular data item is present in
`the system or at a location in the system, by examining only
`the data identities of a plurality of data items.
`Using the method or apparatus of the present invention,
`the efliciency and integrity of a data processing system can
`be improved. The present invention improves the design and
`operation of a data storage system. file system, relational
`database, object-oriented database, or the like that stores a
`plurality ofdata items, by making possible or improving the
`design and operation of at least some or all of the following
`features:
`
`the system stores at most one copy of any data item at a
`given location, even when multiple data names in the
`system refer to the same contents;
`the system avoids copying data from source to destination
`locations when the destination locations already have
`the data;
`the system provides transparent access to any data item by
`reference only to its identity and independent of its
`present location, whether it be local, remote, or offline;
`the system caches data items from a server, so that only
`the most recently accessed data items need be retained;
`when the system is being used to cache data items,
`problems of maintaining cache consistency are
`avoided;
`the system maintains a desired level of redundancy ofdata
`items in a network of servers, to protect against failure
`by ensuring that multiple copies of the data items are
`present at dilIerent locations in the system;
`the system automatically archives data items as they are
`created or modified;
`
`in
`
`15
`
`so
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`4
`the system provides the size, age, and location of groups
`of data items in order to decide whether they can be
`safely removed from a local file system;
`the system can efficiently record and preserve any col-
`lection of data items;
`the system can efficiently make a copy of any collection
`of data items, to support a version control mechanism
`for groups of the data items;
`the system can publish data items, allowing other, possi-
`bly anonymous, systems in a network to gain access to
`the data items and to rely on the availability ofthe data
`items;
`the system can maintain a local inventory of all the data
`items located on a given removable medium, such as a
`diskette or (TD-ROM, the inventory is independent of
`other properties of the data items such as their name,
`location, and date of creation;
`the system allows closely related sets of data items, such
`as matching or corresponding directories on discon-
`nected computers,
`to be periodically resynchronized
`with one another;
`the system can verify that data retrieved from another
`location is the desired or requested data, using only the
`data identifier used to retrieve the data;
`the system can prove possession of specific data items by
`content without disclosing the content ofthe data items,
`for purposes of later legal verification and to provide
`anonymity;
`the system tracks possession of specific data items accord-
`ing to content by owner, independent ofthc name, date,
`or other properties of the data item, and tracks the uses
`of specific data items and files by content for account-
`ing purposes.
`Other objects, features, and characteristics of the present
`invention as well as the methods of operation and functions
`of the related elements of structure, and the combination of
`parts and economies of manufacture, will become more
`apparent upon consideration of the following description
`and the appended claims with reference to the accompany-
`ing drawings, all of which form a part of this specification.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIGS. 1(a) and 1(b) depict a typical data processing
`system in which a preferred embodiment of the present
`invention operates;
`FIG. 2 depicts a hierarchy of data items stored at any
`location in such a data processing system;
`FIGS. 3—9 depict data structures used to implement an
`embodiment of the present invention; and
`FIGS. 10(a)~28 are flow charts depicting operation of
`various aspects of the present invention.
`DETAILED DESCRIP’I'ION OF THE
`PRESENTIX PREFERRED EXEMPLARY
`EMBODIMENTS
`
`An embodiment of the present invention is now described
`with reference to a typical data processing system 100,
`which, with reference to FIGS. 1(a) and 1(b). includes one
`or more proLchors (or computers) 102 and various storage
`devices 104 connected in some way, for example by a bus
`106.
`
`Each processor 102 includes a CPU 108, a memory 110
`and one or more local storage devices 112. The CPU 108,
`memory 110, and local storage device 112 may be internally
`
`RACK-1001
`RACK-1001
`Page 35 of 56
`Page 35 of 56
`
`

`

`US 6,415,280 B1
`
`ll]
`
`15
`
`5
`connected, for example by a bus 114. Each processor 102
`may also include other devices (not shown), such as a
`keyboard. a display, a printer. and the like.
`In a data processing system 100, wherein more than one
`processor 102 is used, that is, in a multiprocessor system, the
`processors may he in one of various relationships. For
`example,
`two processors 102 may be in a clienti'server,
`clienlt’client, or a serverfserver relationship. These inter-
`processor relationships may be dynamic, changing depend-
`ing on particular situations and functions. Thus, a particular
`processor 102 may change its relationship to other proces-
`sors as needed, essentially setting up a peer-to-peer relation-
`ship with other processors. In a peer—to—peer relationship,
`sometimes a particular processor 102 acts as a client
`processor, whereas at other times the same processor acts as
`a server processor.
`In other words, there is no hierarchy
`imposed on or required of processors 102.
`In a multiprocessor system, the processors 102 may be
`homogeneous or heterogeneous. Further, in a multiprocessor
`data processing system 100, some or all of the processors
`102 may be disconnected from the network of processors for
`periods of time. Such disconnection may be part of the
`normal operation of the system 100 or it may be because a
`particular processor 102 is in need of repair.
`Within a data processing system 100, the data may be *
`organized to form a hierarchy of data storage elements,
`wherein lower level data storage elements are combined to
`form higher level elements. This hierarchy can consist of, for
`exa mple, processors, file systems, regions, directories, data
`files, segments, and the like. For example, with reference to
`FIG. 2, the data items on a particular processor 102 may be
`organized or structured as a file system 116 which comprises
`regions 117, each of which comprises directories 118, each
`ot‘which can contain other directories 118 or files 120. Each
`
`an
`
`35
`
`file 120 being made up of one or more data segments 122.
`In a typical data processing system, some or all of these
`elements can be named by users given certain implementa—
`tion specific naming conventions, the name (or pathname) of
`an element being relative to a context. In the context of a
`data processing system 100, a pathname is fully specified by
`a processor name, a fitesystem name, a sequence of zero or
`more directory names identifying nested directories, and a
`final file name. (Usually the lowest level elements, in this
`case segments 122, cannot be named by users.)
`In other words, a
`file system 116 is a collection of
`directories 118.

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket