`Case 5:18-md-02834-BLF Document 406-10 Filed 04/12/19 Page 1 of 7
`
`EXHIBIT9
`EXHIBIT9
`
`
`
`Case 5:18-md-02834-BLF Document 406-10 Filed 04/12/19 Page 2 of 7
`
`I 1111111111111111 11111 111111111111111 111111111111111 IIIII IIIIII IIII IIII IIII
`US007945544B2
`
`c12) United States Patent
`Farber et al.
`
`(IO) Patent No.:
`(45) Date of Patent:
`
`US 7,945,544 B2
`May 17, 2011
`
`(54) SIMILARITY-BASED ACCESS CONTROL"OF
`DATA IN A DATA PROCESSING SYSTEM
`
`(75)
`
`Inventors: David A. Farber, Ojai, CA (US);
`Honald D. Luchmun, Northbrook, IL
`(US)
`
`(73) Assignees: Kinetcch, Inc., Studio City, CA (US);
`Level 3 Communications, LLC,
`Broomfield, CO (US)
`
`( *) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 624 days.
`
`(21) Appl. No.: 11/980,688
`
`(22) Filed:
`
`Oct. 31, 2007
`
`(65)
`
`Prior Publication Datu
`
`US 2008/0065635 A 1
`
`Mar. 13, 2008
`
`Related U.S. Application Data
`
`(60) Continuation of application No. 1 J/724,232, filed on
`Mar. 15, 2007, which is a continuation of application
`No. 11/017,650, tiled on Dec. 22, 2004, which is a
`continuation of application No. 09/987,723, filed on
`Nov. 15, 2001, now Pat. No. 6,928,442, which is a
`continuation of application No. 09/283,160, filed on
`Apr. 1, 1999, now Pal. No. 6,415,280, which is a
`division of application No. 08/960,079, filed on Oct.
`24, 1997, now Pat. No. 5,978,791, which is a
`continuation of application No. 08/425,160, filed on
`Apr. ll, 1995, now abandoned, application No.
`11/980,688, which is a continuation of application No.
`10/742,972, filed on Dec. 23, 2003, which is a division
`of application No. 09/987, 723, which is a continuation
`of application No. 09/283,160, which is a division of
`application No. 08/960,079, which is a continuation or
`application No. 08/425,160.
`
`(51)
`
`Int. Cl.
`G06F 17/00
`(2006.01)
`(52) U.S. Cl .
`.................. ....... 707/698; 707/821; 707/822
`I•icld of Classification Search .................. 707/609,
`(58)
`707/821-828,697-698
`See application file for complete search history.
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`6i 1972 Evangelisti el al
`3,668,647 A
`3,835,260 A
`9i 1974 Prescher el al.
`(Continued)
`
`EP
`
`PORDIGN PATENT DOCUMENTS
`511988
`0 268 069 A2
`(Continued)
`
`CTrHER PUBLICATIONS
`
`Fowler, et al. "A User-Level Replicated File System," AT&T Bell
`Laboratories Technical Memorandum 0 l 12670-930414-05, Apr.
`1993, and USEI\IX 1993 Summer Conference Proceedings, Cincin(cid:173)
`nati, OH, Jun. 1993.
`
`(Conlinuecl)
`
`Primary Examiner - Khanh B Pham
`(74) Attorney, Agent, or Firm - Davidson Berquist Jackson
`& Gowdey, LLP; Brian Siritzky
`
`ABSTRACT
`(57)
`Similarity of data items is determined by analyzing corre(cid:173)
`sponding segments of the data items. A function is applied to
`each segment of a data item and the output of that function is
`compared to the output of the same function applied to a
`corresponding segment of another data item. A fimction may
`he Hpplied lo the oulput of the functitm8. The runclions muy
`be hash or message digest ti.mctions.
`
`56 Claims, 31 Drawing Sheets
`
`~1:PN,1,\.NamiOr~
`14-. .. ,~Mti.111 Dff'1'1,10aUO,· b,l,f,t,
`
`...
`
`
`
`Case 5:18-md-02834-BLF Document 406-10 Filed 04/12/19 Page 3 of 7
`
`US 7,945 ,544 B2
`
`1
`SIMILARITY-BASED ACCESS CONTROL OF
`DATA IN A DATA PROCESSING SYSTEM
`
`RELATED APPLICATIONS
`
`1l1is application is a continuation of and claims priority lo
`co-pending U .S. patent application Ser. No. 11/724,232 filed
`Mar. 15, 2007, which is a continuation of co-pending appli(cid:173)
`c11tion Ser. No. 11/017,650, tiled Dec. 22, 2004, which is a
`continuation of pending application Ser. No. 10/742,972,
`filed Dec. 23, 2003 , which is a continuation of Ser. No.
`09/987,723, filed Nov. 15, 2001, patented as U.S. Pat. No.
`6,928,442; which is a which is a cnnlinualion or application
`Ser. No. 09/283,160, filed Apr. 1, 1999, now U.S. Pat. No .
`6,415,280, which is a division ofapplication Ser. No. 08/960,
`079, filed Oct. 24, 1997, now U.S. Pat. No. 5.978,791, which
`is a continuation of Ser. No. 08/425,160, filed Apr. 11, 1995,
`now abandoned, the contents of which each of these npplica(cid:173)
`tions arc hereby incorporntcd herein by reference. TI1is appli(cid:173)
`cation is a continuation of and claims priority to co-pending
`application Ser. No.11/017,650, filed Dec. 22, 2004, which is
`a c<intinuation of upplicalion Ser. No. 09/987,723, filed Nov.
`15, 2001 , now U.S. Pat. No. 6,928,442, which is a continua(cid:173)
`tion of application Ser. No. 09/283,160, filed Apr. ! , 1999,
`now U.S. Pat. No. 6,415 ,280, which is a division of applica(cid:173)
`tion Ser. No. 08/960,079, filed Oct. 24, 1997, now U.S. Pal.
`No. 5,978,791, which is a continuation of Ser. No. 08/425,
`160, filed Apr. 11, 1995, now abandoned, the contents of
`which each of these applic,11ions are hereby incorporated
`herein by reference. This is also a continuation of and claims
`priority lo co-pending application Ser. No. 10/742,972, liled
`Dec. 23, 2003, which is a division of application Ser. No .
`09/987,723, filed Nov. 15, 2001, now U.S. Pat. No. 6,928.
`442, which is a continuation of application Ser. No. 09/283,
`160, liledApr. 1, 1999, now lJ.S. Pal. No. 6,415,2X0, which is
`a division of application Ser. No. 08/960,079, filed Oct. 24,
`1997, now U.S. Pal. No. 5,978,791, which is a continuation of
`Ser. No. 08/425,160, filed Apr. 11, 1995. now abandoned, the
`contents of which each of these applications are hereby incor(cid:173)
`porated herein by reference.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`T11is invention relates to data processing systems and, more 45
`particularly, to data processing systems wherein data items
`are identified by substantially unique identifiers which
`depend on a)] of the data in the data items and only on the data
`in 1he data items.
`. 2. Background of the Iuvention
`Data processing (DP) systems, compulers, networks of
`computers, or the like, typically offer users and programs
`various ways to identify the data in the systems.
`Users typically identify data in the data processing system
`by giving the data some form of name. For example, a typical ss
`operating system (OS) on a computer provides a file system in
`which data items are named by alphanumeric identifiers. Pro(cid:173)
`grams typically identify data in the data processing system
`using a location or address. For example, a program may
`idenlily a record in a file l>rdalahase by using a record numher
`which serves to locate that record.
`In all but the most primitive operating systems. users and
`progrmns are able to create and use collections of named data
`items, these collections themselves being named by identifi(cid:173)
`ers. These nmued collections can then, themselves, be made 65
`purl of other numed l:Otleclions. For exumple, ,111 OS may
`provide mechanisms to group files (data items) into dirccto-
`
`5
`
`40
`
`2
`ries (collections). These directories can then, themselves be
`made part of other directories. A data item may thus be
`identified relative to these nested directories using a sequence
`of names, or a so-called patlmame, which defines a path
`through the directories lo a panicular data item (file or direc(cid:173)
`lnry).
`As another example, a database management system may
`group data records (data items) into tables and then group
`these tables into database files (collections). ]11e complete
`10 address of any data record can then be specified using the
`database file name, the table name, and the record number of
`that data record.
`Other examples of identifying data items include: identi(cid:173)
`fying files in a network file system, identifying objects in an
`15 object-oriented database, identifying images in an image
`database, and identifying articles in a rexl database.
`Ju general, the tenns "data" and "data item" as used herein
`refer to sequences of bits. Tirns a data item may be the con(cid:173)
`tents of a file, a por1ion of a file, a pnge in memory, an object
`20 in an object-oriented program, a digital message, a digital
`scanned image, a part of fl video or audio signal. or any other
`entity which can be represented by a sequence of bits. The
`term "data processing" herein refers lo the processing of data
`items, am! is srnnclimes dependent on the type of dula item
`25 being processed. For example, a data processor for a digital
`image may di!Ter from a data processor for an audio signal.
`In all of the prior data processing systems the names or
`identifiers provided to identify data items (the data items
`being files, directories, records in the dntabase, objects in
`30 object-oriented programming, locations in memory or on a
`physical device, or the like) are always defined relative to a
`specific context. For instance, the file identified by a particu(cid:173)
`lar nle name can only be detem1ined when the directury
`containing the file (the context) is known. The file identified
`35 by a pathname can be detennined only when the file system
`(context) is known. Similarly, the addresses in a process
`address space, the keys in a database table, or domain names
`on a global computer network such as lhe lnlemet are mean(cid:173)
`ingfol only because they arc speciJied relative to a context.
`In prior art systems for identifying data items there is no
`direct relationship between the data names and the dala item.
`The same data name in two different contexts may refer to
`different data items. and two different data names in the same
`context may reler to the same data item.
`In addition, because there is no correlation between a data
`name and the data it refers to, there is no a priori way to
`confin11 that a given dala item is in fact the one named by a
`data name. For instance, in a DP system, if one processor
`requests that another processor deliver a data item with a
`so given dat<1 name, the requesting processor cannot. in genernl,
`verify that the data delivered is the correct data (given only the
`name). Therefore it may require fi.1rther processing. typically
`on the prn1 of the requestor, to verify that the data item it has
`obtained is, in fact, the item it requested.
`A common operntion in a DP system is adding a new data
`item to the system. When a new data item is added to the
`system, a name can be assigned to i1 only by updating the
`context in which names are defined . Tims such systems
`require a centralized mechanism for the management of
`60 names. Such a mechanism is required even in a multi-pro(cid:173)
`cessing system when data items ::ire created and identified at
`separate processors in distinct locations. and in which there is
`no other need for com1m111ication when data items are added.
`In many data processing systems or environments, data
`items are transferred belween different locations in the sys(cid:173)
`tem. These localimrn nrny be prm;essor.. in the data processing
`system, storngc devices, memory, or the like. For exmuplc,
`
`
`
`Case 5:18-md-02834-BLF Document 406-10 Filed 04/12/19 Page 4 of 7
`
`US 7,945,544 B2
`
`25
`
`37
`exact contents of the directory tree at the time it was frozen.
`The frozen directory can be copied with its components pre(cid:173)
`served.
`The Acquire Trne File remote mechanism (used in mirror(cid:173)
`ing and archiving) preserves the directory tree stmctme by
`ensuring that all ol'thecomponent segments mid True Files in
`a compound data item are actually copied to a remote system.
`Of course, no transfer is necessary for data items already in
`the registry of the remote system.
`In operation, the system can efficiently make a copy of any 10
`collection of data items, lo support a version conlrnl mecha(cid:173)
`nism for groups of the data items.
`TI1e freeze Directory primitive mechanism is used to cre-
`ate a collection of data items. The constituent files and seg(cid:173)
`ments referred to by the frozen directory are maintained in the t 5
`registry, without any need to make copies of the constituents
`each time the directory is frozen.
`Whenever a pathname is traversed, the Gel Files in Direc(cid:173)
`tory operating system mechanism is used, and when it
`encounters a frozen directory, it uses the Expand Frozen 20
`Directory primitive mechanism.
`A frozen directory can be copied from one patlmame to
`another efficiently, merely by copying its Tme Name. The
`Copy File opernling system mechanism is used lo copy a
`frozen dircctoty.
`Tirns it is possible to elTiciently create copies of different
`versions of a directory, thereby creating a record of its history
`(hence a version control system).
`In operntion, the system can maintain a local inventory or
`all the data items located on a given removable medium, such 3u
`as a diskette or CD-ROM. The inventory is independent of
`other properties of the data items such as their name, location,
`and date nf creation.
`TI1e lnvcntory Existing Directory extended mechanism
`provides a way to create True File Registry entries for all of 35
`the files in a directory. One use of this inventoty is as a way to
`pre-load a True rile registry with backup record information.
`Those mes in the registry (such <1s previously installed soft(cid:173)
`ware) which are on the volumes inventoried need not be
`backed up onto other volumes.
`The Inventory Removable, Read-only riles extended
`mechanism not only detennines the Trne Names for the files
`on the medium, but also records directory entries for each file
`in a frozen directory stmcture. By copying mid modifying this
`directory, it is possible to create an on line patch, or snrnll 45
`modification of an existing read-only file. for example, it is
`possible to create an online representation ofa modified CD(cid:173)
`ROM, such that the munodified files are actually on the CD(cid:173)
`ROM, and only the modified files are online.
`In operation, the system tracks possession of specilic data 50
`items according to content by owner, independent of the
`name, date, or other properties of the data item, and tracks the
`uses of specific data items and files by content for accounting
`purposes. Using the Track for Accounting Purposes extended
`meclrnnism provides a way to know reliably which files have 55
`been stored on a system or transmitted from one system to
`another.
`Trne Names in Relational and Object-Oriented Databases
`Although the prefen-ed embodiment of this invention has
`been presented in the cnntexl of a Ille system, the inventinn of 60
`Tme Names would be equnlly valuable in a relational or
`object-oriented database.A relational orobject-oriented data(cid:173)
`base system using Trne Names would have similar benefits to
`those of the file system employing the invention. For instance,
`such a database would permit efficient elimination of dupli- 65
`cale records, support a cache for records, simplify the process
`of maintaining cache consistency, provide location-indepen-
`
`4IJ
`
`38
`dent access to records, nrninlain archives and histories of
`records, and synchronize with distant or disc01mected sys(cid:173)
`tems or databases.
`ll1e mechanisms described above can be easily modified to
`serve in such a database environment. The Tme N.:ime regis(cid:173)
`try would be used as a repository of database records. All
`references to records would be via the Trne Name of the
`record. (The Local Directory Extensions table is an example
`oJ a primary index that uses the True Name as the unique
`ide 111ilier of the desired records.)
`In such a database, the operations of inserting, updating,
`and deleting records would be implemented by first assimi(cid:173)
`lating records into the registry, and then updating a ptimmy
`key index to map the key nf the record In its rnntents hy using
`the Trne Name as a pointer to the contents.
`The mechanisms described in the prelerred embodiment,
`or similar meclrnnisms, would be employed in such a system.
`These mechanisms could include, for example, the mecha(cid:173)
`nisms for calculating true names, assimilating, locating, real(cid:173)
`izing, deleting, copying, and moving Trne Files, for mirroring
`Tme Files. for maintaining a cache of True Files, for groom(cid:173)
`ing True Files, and other mechanisms based on the use of
`substantially unique identifiers.
`While the invention bas been described in connection with
`what is presently considered to be the most practical and
`preferred embodiments, it is to be understood that the inven(cid:173)
`tion is not to be limited to the disclosed embodiment, but on
`the contrnry, is intended to cover vmious modificntions and
`equivalent arrangements included within the spirit and scope
`of the appended claims.
`
`We claim:
`1. A computer-implemented method, the method compris(cid:173)
`ing:
`(A) for a first data item comprising a first plurality of parts,
`(al) applying a first function to each part of said first
`plurality or parts to obtain a corresponding part value
`for ench part of said first plurality of parts, wherein
`each part of said first plurality of pm1s comprises a
`corresponding sequence of bits, and wherein the part
`value for each particular part of said first plurality of
`parts is based, at least in part, on the corresponding
`bits in the pnrticulnr part, and wherein two identical
`parts will have the same part vnlue ns determined
`using said first function, wherein said first function
`comprises a first hash fi.tnction; and
`(a2)obtaininga first value fort he first data item, said first
`value obtained by applying a second function to the
`part values of said first plurnlity of parts of said first
`data item, said second function comprising a second
`hash function;
`(B) for a second data item comprising a second plurality of
`parts,
`(bl) applying said first ti.mction to each pnrt of said
`second plurality of parts to obtain a corresponding
`part value for each part of said second plurality of
`parts, wherein each part of said second plurality of
`pai1s consists of a corresponding sequence of bits, and
`wherein the part value for tmch particubr part of st1id
`second plurality of parts is based, at least in part, on
`the corresponding bits in the particular part of the
`second plurality of parts; and
`(b2) obtaining a second value for the second data item by
`applying said second function lo the parl values of
`suid second plurality orparls nf said second data item;
`and
`
`
`
`Case 5:18-md-02834-BLF Document 406-10 Filed 04/12/19 Page 5 of 7
`
`us 7,945,544 82
`
`IO
`
`\5
`
`25
`
`39
`(C) ,JScertaining whether or not said lirst data item corre(cid:173)
`sponds to said second data item based, at least in part, on
`said fJrst value and said second value.
`2. The method of claim .1 wherein said first data item
`corresponds to said second data item when said first value is 5
`identical to said second value.
`3. 1l1e method of claim I wherein said ascertaining in (C)
`is used to determine whether the first data item matches !he
`second data item.
`4. 1l1e method of claim 1 wherein said ascertaining in (C)
`is used to detennine whether the first data item is a copy of the
`second data item.
`5. The metl10d of claim l wherein the first function is the
`same as the second runction.
`6. The method of claim 1 wherein the first plurality of parts
`of the first data item arc non-overlapping, and wherein the
`second plurality of parts of the second data item are non(cid:173)
`overlapping.
`7. TI1e method of claim 1 wherein each of the parts in the 20
`fJrst plurality or parts and each or the parts in the sernnd
`plurality of parts is the same size.
`8. A comptiler-implemenled method comprising:
`(A) maintaining a datahast! of values, al least one value li1r
`each data item of a plurality of data items,
`wherein each data item of the plurality of data items
`comprises a corresponding one or more parts, and
`wherein each of the one or more parts of each data item
`comprises a corresponding sequence of bits, and
`wherein each of the one or more parts of each data item 30
`has a corresponding part value, the part val11e [or each
`particular part being based on a first given function of
`the corresponding sequence hits for lhal particular
`part ,
`wherein two identical parts will have the same part value JS
`as determined using the first given function, and
`the value for each particular data item being based, at
`least in part, on a second given function of the part
`values of the one or more parts oftlmt particular data
`item, and
`wherein the first given function comprises a fJrst hash
`function, and the second given function comprises a
`second hash function:
`(B) obtaining a second value, the second value correspond(cid:173)
`ing to a second data item, the second data item compris- 45
`ing a corresponding one or more parts,
`each of the one or more parts of the second data item
`comprising a corresponding sequence of bits,
`each of the one or more parts of the second data item
`having a corresponding part value,
`wherein the part value for each particular part of the
`second data item is based on the first given function of
`the corresponding sequence of bits in that particular
`part of the second data item; ,md
`wherein the second value is based on the second fonction 55
`of the one or more part values of the second data item;
`and
`(C) ascertaining whether or not the second data item cor(cid:173)
`responds to any of the plurality of data items, based, at
`least in part, on whether or nnl the second value corre- 60
`sponds to nny value in the dntabase ofvalnes.
`9. The method of claim 8 wherein the second value corre(cid:173)
`sponds to a particular value in the d,itabase of values when the
`second value is equal to the particular value in the database of
`values.
`JO. The method or claim 8 wherein the f1rsl hash runclion
`is selected from the functions MDS and SHA.
`
`40
`
`so
`
`65
`
`40
`I.I. The method of claim 10 wherein the second liash fonc(cid:173)
`tion is selected from the functions MDS and SIIA.
`12. The mellrnd of claim 8 whernin the first given runclion
`is the same as the second given function.
`13. The method of claim 8 wherein the database comprises
`a mapping from data item values to corresponding data items.
`14. The method of claim 8 wherein !he second value is
`obtained as pm1 of a search for !he second data item.
`15. The method of claim 8 wherein the second value is
`obtained as part of a search for data items matching the
`second data item.
`16. The method orclnim 8 further comprising:
`(D) when the second value corresponds to a particular
`value in the th1lahase, providing inronnation ahoul a
`particular data item corresponding lo the particular
`entry.
`17. The method of claim 8 wherein the step (13) of obtaining
`lhe second value comprises calculating lhe second value.
`18. The method of claim 8 wherein al least some or the data
`items are f11es.
`J 9. The method of claim 8 wherein the database comprises
`a mapping from values to corresponding data items, and
`wlwrcin, when the second value corresponds to the fJrsl value,
`infonnMion about the corresponding data item is provided.
`20. The method of claim 8 wherein the parts are segments.
`21. The method of claim 8 further comprising:
`(D) adding a new value lo the database, wherein the new
`value corresponds to a new data item distinct from the
`plurality of data items.
`22. A computer-implemented method comprising:
`(A) obtaining a particular data item value corresponding to
`a parliculnr data item, the p,1rlicular data ile111 compris(cid:173)
`ing a corresponding one or more parts,
`each of the one or more parts of the particular data item
`comprising a co1Tesponding sequence of bits,
`each of the one or more pa11s of the particular data item
`having a corresponding part v~lue,
`wherein the part value tor each speci lie part of the one or
`more parts of the particular data item is based, at least in
`part, on a first given function of !he corresponding
`sequence of bits in tlrnt specific part of the particular data
`item;
`wherein two identical parts will have the same pm-I value as
`determined using the first given Ji.mction, and
`wherein the partic11lar data item value is based, at least in
`part, on a second given function of the one or more part
`values o[the particular data item,
`wherein the first given function comprises a first hash fi.mc(cid:173)
`tion, and the second given fimetion comprises a second
`hash fonction; and
`(13) ascertaining whether or not the particular data item
`corresponds lo any ofa pluralily of data items, based, at
`le.isl in part, on whether or not the particular data item
`value corresponds to any value in a database of data item
`values,
`wherein the database of data item values comprises at least
`one data item value for each data item of the plurality of
`data items,
`wherein each data ilem of the plurality or data items com(cid:173)
`prises a corresponding one or more pm1s, and
`wherein each of the one or more parts of each data item of
`the plurality of data items comprises a corresponding
`sequence of bits, and
`wherein each of the one or more parts of each data item of
`the plurnlily nf data items has a corresponding parl
`value, the part value for each particular part of the one or
`
`
`
`Case 5:18-md-02834-BLF Document 406-10 Filed 04/12/19 Page 6 of 7
`
`US 7,945,544 B2
`
`41
`more parts of each data item being based on the Jirsl
`given function of the corresponding sequence bits for
`that particular part,
`the data item value for each particular data item being
`based, al leas I in part, on lhe second given function of the
`par! values oflhe one or more parts of that particulardal,1
`item.
`23. The method of claim 22 wherein step (D) of adding is
`a background process.
`24. The method of claim 22 wherein the first given funclion 10
`is selected from lhe functions Ml)S and SH/\.
`25. The method of claim 22 wherein the second given
`function is selected from the functions MOS and SilA.
`26. The method of claim 22 wherein the database com(cid:173)
`prises a mapping from data item values to corresponding data 15
`items.
`27. The method of claim 22 wherein the parlicular data
`item value is obtained as part of a search for the particular data
`item.
`28. The method of claim 22 wherein the particular data 20
`item value is obtained as part of a search for dala items
`matching the particular data item.
`29. The method of claim 22 wherein the particular data
`item value cnrresponc.ls lo a value in the database when the
`particular data item value is equal to the value in the database. 25
`30.111e method of claim 22 wherein the data items are files.
`31. The method of claim 22 wherein the first given function
`is the same as the seconc.l given function.
`32. A computer-implemented method comprising:
`(A) maintaining a database comprising a mapping of data 3U
`item keys to corresponding data item information ror
`each of a plurality of data items, wherein each data item
`or lhe plurality or dala items has al least one data item
`key,
`wherein each data item of the plurality of dala items com- 35
`prises a co1Tesponding one or more portions, and
`wherein each of the one or more po1tio11s of each data item
`comprises a correspunc.ling sequence ofbils, and
`wherein each of the one or more portions of e11ch dnla item
`has a corresponding portion value, the portion value for 40
`each particular portion being based on a first given func(cid:173)
`tion of the corresponding sequence bits for that particu-
`lar portion, wherein the first given funclion comprises a
`first lmsh ti.mction, and wherein two identical portions
`will have the smne portion value as determined using lhe 45
`first given function,
`the particular data item key for each particular data item
`being based on a second given function of the portion
`values of the one or more portions of that particular data
`item, wherein tbe second given function comprises a 50
`second hash function;
`(□) obtaining a particular value, the particular value having
`been determined from a corresponding one or more par(cid:173)
`ticular porlions,
`each of the one or more particular portions comprising a 55
`corresponding sequence of bits,
`each of the one or more particular portions having a corre(cid:173)
`sponding portion value, wherein the portion value for
`each specific portion of the one or more particular por(cid:173)
`tions is based on lhe first given function of lhe cnrre- 60
`sponding bits in that specific porlion; and
`wherein the particular value is based on the second func(cid:173)
`tion of the portion values of the one or more parlicular
`portions; and
`(CJ using the particular value and the database to asce11ain 65
`whether or not lhe one or more particular portions cnr(cid:173)
`rcspond to any of the plurality of data items.
`
`42
`33. The melhnd of claim 32 wherein the one or more
`particular portions arc obtained as part of a search.
`34. l11e method of claim 32 further comprising:
`(D) when the one or more pm1icular po11ions correspond to
`a specific data item, providing informalion about that
`specific dall1 item.
`35. The method of claim 34 wherein the inforurntion pro(cid:173)
`vided in (D) includes information about a location ofa copy
`of the specific data item .
`36. A compuler-implemented method comprising:
`(/\) fnreach parliculardata item ora pluralily of data items :
`(al) detenuining a corresponding particular data item
`key;and
`(a2) adding an entry to a database to map said particular
`data item key lo infrmuation about the particular data
`item,
`wherein each data item of the plurality of data items com(cid:173)
`prises a corresponding one or more pilrts, and
`wherein each of the one or more parts of each data item
`comprises 11 corresponding sequence of bits, and
`wherein each of the one or more parts of each data item has
`a corresponding part value, the part value for each par(cid:173)
`ticular parl being based on a first given function of lhe
`corresponding sequence hits li)r lhal particular parl ,
`wherein the first given fonction comprises a first hash
`fimction, wherein two idenlical parts will have the same
`part value as determined using the fu-st given function,
`lhe data item key for each particular data item being based
`011 a second given function of the part values of the one
`or more ports of that data item, wherein the second given
`function comprises a second hash function;
`(B) determining a second key value, the second key value
`being based on ont:! or more particular p:1rts,
`each of the one or more particular parts comprising a cor(cid:173)
`responding sequence of bits,
`each of the one or more particular pai1s having a corre(cid:173)
`sponding part value, wherein 1he part value for each
`specific part of the om: or more pa11icular parts is based
`on the Jirst given Ji.mction of the corresponding bits in
`that specific part; and
`wherein the second key value is based on the second func(cid:173)
`tion of the part values of the one or more particular parts;
`and
`(C)11Sing lhesecond keyvalueand tbedntabaseto ascertain
`whether or not the one or more particular parts corre(cid:173)
`spond lo any of the plurality of data items.
`37. The method of claim 36 wherein the one or more
`particu Jar parts are obtained as part of a search.
`38 . The method of claim 36 wherein the one or more
`particular parts are obtained as part of a search for dnta items
`compris ing the one or more particular parts.
`39. The method of claim 36 wherein the one or more
`particular parts coJTespond to a specific data ilem of lhe
`plurality of data items when the second key value is equal to
`a data item key of the specific data item.
`40. The method ofcl<1im 36whercin the parts arc segments.
`41. The method of claim 36 wherein the step (C) of using
`comprises:
`looking up the second key value in the database.
`42. Th<! method of claim % wherein slep (/\) is a hack(cid:173)
`ground process.
`43. The method of cl<1im 36 wherein 1he information about
`a particular dala item in the database includes location infor(cid:173)
`mation about the particular data item.
`44. The method of claim 43 wherein the location infonna(cid:173)
`lion includes a list ortmt:! or more locations ol'the particular
`data item.
`
`
`
`Case 5:18-md-02834-BLF Document 406-10 Filed 04/12/19 Page 7 of 7
`
`US 7,945,544 B2
`
`10
`
`43
`45. The method of claim 36 wherein the i1iformalion about
`a particular data item in the database includes a copy of the
`data item.
`46. A computer-implemented method comprising:
`(A) for each particular file of a plurality of files:
`(a2) determining a particular digital key ror the particu(cid:173)
`lar file, wherein the particulm tile comprises a first o