`WORLD INTELLECTUAL PROPERTY ORGANIZATION
`International Bureau
`INTERNATIONAL APPLICATION PUBLISHED UNDER TIIE PATENT COOPERATION TREATY (PCT)
`WO 96/25801
`
`(11) International Publication Number:
`
`(51) International Patent Classification 6 :
`H03M 7130, H04L 23/00, G06F 7/00,
`7/06, 1n.2
`
`Al
`
`(43) International Publication Date:
`
`22 August 1996 (22.08.96)
`
`(21) International Application Number:
`
`PCT/ AU96/00081
`
`(22) International Filing Date:
`
`15 February 1996 (15.02.96)
`
`(30) Priority Data:
`PN 1232
`PN 2392
`
`17 February 1995 (17.02.95)
`12 April 1995 (12.04.95)
`
`AU
`AU
`
`(71) Applicant (for all designated States except US): TRUSTUS
`PTY. LTD. [AU/AU]; 200 East Terrace, Adelaide, S.A.
`5000 (AU).
`
`(72) Inventor; and
`(75) Inventor/Applicant (for US only): WILLIAMS, Ross, Neil
`[AU/AU]; 16 Lerwick Avenue, Hazelwood Park, S.A. 5066
`(AU).
`
`(74) Agent: MADDERNS; 1st floor, 64 Hindmarsh Square, Ade(cid:173)
`laide, S.A. 5000 (AU).
`
`(81) Designated States: AL, AM, AT, AU, AZ, BB, BG, BR, BY,
`CA, CH, CN, CZ, DE, DK, EE, ES, Fl, GB, GE, HU, IS,
`JP,KE,KG,KP,KR,KZ,LK,LR,LS,LT,LU,LV,MD,
`MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD,
`SE, SG, SI, SK, TJ, TM, TR, TT, UA, UG, US, UZ, VN,
`ARIPO patent (KE, LS, MW, SD, SZ, UG), Eurasian patent
`(AZ, BY, KG, KZ, RU, TJ, TM), European patent (AT, BE,
`CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL, PT,
`SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, ML,
`MR, NE, SN, TD, TG).
`
`Published
`With international search report.
`With amended claims and statement.
`
`(54) Title: METHOD FOR PARTITIONING A BLOCK OF DATA INTO SUBBLOCKS AND FOR STORING AND COMMUNICAT(cid:173)
`ING SUCH SUBBLOCKS
`
`(57) Abstract
`
`This invention provides a method and appa(cid:173)
`ratus for detecting common spans within one or
`more data blocks by partitioning the blocks (figure
`4) into subblocks and searching the group of sub(cid:173)
`blocks (figure 12) (or their corresponding hashes
`(figure 13)) for duplicates. Blocks can be parti(cid:173)
`tioned into subblocks using a variety of methods,
`including methods that place subblock boundaries
`at fixed positions (figure 3), methods that place sub(cid:173)
`block boundaries at data-dependent positions (figure
`3), and methods that yield multiple overlapping sub(cid:173)
`blocks (figure 6). By comparing the hashes of sub(cid:173)
`blocks, common spans of one or more blocks can
`be identified without ever having to compare the
`blocks or subblocks themselves (figure 13). This
`leads to several applications including an incremen(cid:173)
`tal backup system that backs up changes rather than
`changed files (figure 25), a utility that determines
`the similarities and differences between two files
`(figure 13), a file system that stores each unique
`subblock at most once (figure 26), and a communi(cid:173)
`cations system that eliminates the need to transmit
`subblocks already possessed by the receiver (figure
`19).
`
`... b
`F(b
`)
`... b
`,b
`k-A
`k
`k+l
`k+B
`A B
`
`A B
`
`B
`
`... ................. ..
`B
`... b
`F(b
`k-A k
`
`)
`... b
`,b
`k+I
`k+B
`
`Page 1
`
`
`
`FOR THE PURPOSES OF INFORMATION ONLY
`
`Codes used to identify States party to the PCT on the front pages of pamphlets publishing international
`applications under the PCT.
`
`AM
`AT
`AU
`BB
`BE
`BF
`BG
`BJ
`BR
`BY
`CA
`CF
`CG
`CH
`Cl
`CM
`CN
`cs
`CZ
`DE
`DK
`EE
`ES
`Fl
`FR
`GA
`
`Annenia
`Austria
`Australia
`Barbados
`Belgium
`Burkina Faso
`Bulgaria
`Benin
`Brazil
`Belarus
`Canada
`Central African Republic
`Congo
`Switzerland
`COte d'Ivoire
`Cameroon
`China
`Czechoslovakia
`Czech Republic
`Gennany
`Denmark
`Estonia
`Spain
`Finland
`France
`Gabon
`
`GB
`GE
`GN
`GR
`HU
`IE
`IT
`JP
`KE
`KG
`KP
`
`KR
`KZ
`LI
`LK
`LR
`LT
`LU
`LV
`MC
`MD
`MG
`ML
`MN
`MR
`
`United Kingdom
`Georgia
`Guinea
`Greece
`Hungll}'
`Ireland
`Italy
`Japan
`Kenya
`Kyrgystan
`Democratic People's Republic
`of Korea
`Republic of Korea
`Kazakhstan
`Liechtenstein
`Sri Lanka
`Liberia
`Lithuania
`Luxembourg
`Latvia
`Monaco
`Republic of Moldova
`Madagascar
`Mali
`Mongolia
`Mauritania
`
`MW
`MX
`NE
`NL
`NO
`NZ
`PL
`PT
`RO
`RU
`SD
`SE
`SG
`SI
`SK
`SN
`sz
`TD
`TG
`TJ
`TI'
`UA
`VG
`us
`uz
`VN
`
`Malawi
`Mexico
`Niger
`Netherlands
`Norway
`New Zealand
`Poland
`Portugal
`Romania
`Russian Federation
`Sudan
`Sweden
`Singapore
`Slovenia
`Slovakia
`Senegal
`Swaziland
`Chad
`Togo
`Tajikistan
`Trinidad and Tobago
`Ukraine
`Uganda
`United Stales of America
`Uzbekistan
`Viet Nam
`
`Page 2
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`Method For Partitioning
`A Block of Data
`Into Subblocks And For
`Storing And Communicating
`Such Subblocks
`
`INTRODUCTION
`
`The present invention provides a method and apparatus for identif.ving identical
`
`subblocks of data within one or more blocks of data and of communicating and
`
`storing such subblocks in an efficient manner.
`
`BACKGROUND
`
`Much the massive amount of information stored. communicated. and manipulated
`
`by modern computer s>'stems is duplicated within tlw same or a related computer
`
`s>'stem. It is commonplace. for example. for computers to store rn<rn>· slightly dif(cid:173)
`
`fering versions of the same document. It is also commonplace for data transmitted
`
`during a backup operation to be almost identical to the data transmitted dming
`
`the previous backup operatiou. Computer networks also must repeatedly carry the
`
`same or similar data i11 accordance the requirements of their users.
`
`Despite the obvious benefits that would flow from a rPduction i11 tlw redundancy of
`
`communicated and stored data. few computer systems perform any sucl1 optimiza(cid:173)
`
`tion. Some instances can he found at thC' applicatiou level. one example being the
`
`class of incremental backup utilities that save onl>· those files that haH' changed
`
`since tlw most recent hi1ckup. Ho\\"e\·er. even these utilities do not at.tempt to f'X(cid:173)
`
`ploi t t lie signi fie ant similarities bet ween old all(! ne,,· versions of files. and bet \\'Pen
`
`SUBSTITUTE SHEET (Rule 26)
`
`Page 3
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`files sharing other close semantic ties. This kind of redundancy can be approached
`
`only by analysing the contents of the files.
`
`The present invention addresses the potential for reducing redundancy by providing
`
`an efficient method for identifying identical portions of data within a group of blocks
`
`of data. and for using this identification to increase the efficiency of systems that
`
`store and communicate data.
`
`SUMMARY OF THE INVENTION
`
`To identify identical portions of data within a group of blocks of data. the blocks
`
`must be analysed. In a simple aspect of the invention. the blocks are divided into
`
`fixed-length (e.g. -512-byte) subblocks and these subblocks are compared with each
`
`other so as to identify all identical subblocks. This knowledge cau then be used to
`
`manage the blocks in more efficient ways.
`
`Unfortunately. the partitioning of blocks into fixed-length subblocks does not always
`
`provide a suitable framework for th<' recognition of duplicated portions of data. as
`
`identical portions of data can occur in different sizes and places within a group of
`
`blocks of data. Figure 1 shows how division into fixed-size subblocks fails to generate
`
`identical subblocks in two blocks whose only difference is the insertion of a single
`
`byte ( 'X' ). A comparison of the t \\'O groups of sub blocks would reveal no identical
`
`pairs of subhlocks.
`
`In a more sophisticated aspect oft h<' invention. the blocks are partitioned at bound(cid:173)
`
`aries determined by the content oft lw data itself. For example. the block could lw
`
`divided at each point at \Yhich t!te preceding three h:."tes hash to a particular cor1-
`
`stant ,·alue. Figure 2 shows hm\' such a partitioning could turn out. and contrasts
`
`it \\'ith a fixf'd-length partitioning.
`
`SUBSTITUTE SHEET (RULE 26)
`
`Page 4
`
`
`
`WO 96/25801
`
`PCT/AU96/00081
`
`The fact that a partitioning is data dependent does not imply that it must incorpo(cid:173)
`
`rate any knowledge of the syntax or semantics of the data. So long as the boundaries
`
`are positioned in a manner dependent on the local data content, identical subblocks
`
`are likely to be formed from identical portions of data. even if the two portions are
`
`not identically aligned relative to the start of their enclosing blocks (Figure 3).
`
`Once the group of blocks has been partitioned into subblocks, the resulting group of
`
`subblocks can be manipulated in a manner that exploits the occurrence of duplicate
`
`subblocks. This leads to a variety of applications. some of which are listed below.
`
`However. the application of a further aspect of the invention leads to even greater
`
`benefits.
`
`In a further aspect of the invention. the hash of one or more subblocks is calcu(cid:173)
`
`lated. The hash function can be an ordinary hash function or one providing cryp(cid:173)
`
`tographic strength. The hash function maps each subblock into a small tractable
`
`value (e.g. 128 bits) that provides an identity of the subblock. These hashes can
`
`usually be manipulated more efficiently than their corresponding subblocks.
`
`Some applications of aspects of this invention are:
`
`Fine-grained
`
`incremental backups: Conventional
`
`i11cremental
`
`backup technology uses the file as the unit of backup. However. in
`
`practice many large files change only slightly. resulting in a wasteful
`
`rf'-transmission of changed files. By storing the hashes of subblocks of
`
`the previous versions of files. the transmission of unchanged sub blocks
`
`can lw eliminated.
`
`Con11nunications: By providing a framework for communicating the
`
`hashes of subblocks. the invention can eliminate the transmission of sub(cid:173)
`
`hlocks already possessf'd by the receiver.
`
`3
`
`SUBSTITUTE SHEET (Rule 26)
`
`Page 5
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`Differences: The invention could be used as the basis of a program that
`
`determines the areas of similarity and difference between two blocks.
`
`Low-redundancy file system: Data stored in a file system can be par(cid:173)
`
`titioned into subblocks whose hashes can be compared so as to eliminate
`
`the redundant storage of identical subblocks.
`
`Virtual memory: Virtual memory could be organized by subblock us(cid:173)
`
`ing a table of hashes to determine if a subblock is somewhere in memory.
`
`Clarification Of Tern1s
`
`The term block and subblock both refer, without limitation. to finite blocks or
`
`infinite blocks (sometimes called streams) of zero or more bits or bytes of digital data.
`
`Although the two different terms ("block" and '·subblock"") essentially describe the
`
`same substance (digital data), the two different terms have been employed so as
`
`to indicate the role that a particular piece of data is pla.ving. The term ·'block'" is
`
`usually used to refer to raw data to be manipulated by aspects of the invention. The
`
`term ··subblock'" is usuall~· used to refer to a part of a block.
`
`The term partition has its usual meaning of exhaustively dividing an entity into
`
`mutually exclusive parts. However. within this patent specification. the term also
`
`includes:
`
`• Analyses in which only one or more parts are analysed.
`
`• Anal~·ses in which multiple overlapping subblocks are formed.
`
`:\natural number is a non-negati\'e integer (0. I. 2. :3. 4 .. j .... ).
`
`4
`
`SUBSTITUTS SHE£T (Rule 26)
`
`Page 6
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`Where the phrase zero or more is used. this phrase is intended to encompass the
`
`degenerate case where the objects being enumerated are not considered at all, as
`
`well as the case where zero or more objects are used.
`
`BRIEF DESCRIPTION
`
`The following aspects of this invention are numbered for reference purposes. The
`
`terms "block'' and '·subblock" refer to blocks and subblocks of digital data.
`
`1. In an aspect of the invention, the invention provides a method for partitioning a
`
`block b into one or more subblocks. the method using the component:
`
`(i) a deterministic or non-deterministic function F that returns one of at least two
`
`values, and vvhose arguments include at least a block of A bits and a block of B
`
`bits. where A and B are natural numbers:
`
`and comprising the step of:
`
`a. Basing the positions of subblock boundaries on the positions /, in the block for
`
`·which F( bk-A ... bb bk+I ... bk+B) falls within a predetermined subclass of the set of
`
`possible function result values.
`
`l\.otc The specification of this aspect (am! rnch othH specification of 011 asput i11POl11i11g
`
`a function F) rncompasses the degenernt( CllH( in 1chich eitha A or B i.~ ::rro and the
`
`function taltrs (without limitation) just onr of tin tll'u a1·guments dcscribul. Such s1ncifi(cid:173)
`
`cotio11s also includr tl1F case of functions F that do not use somr bits of tltf..ir argmm.nts.
`
`A f1111ctio11 F that ba8fs its calrulatio11 solrly 011 (8ay) bk-3 and bk+ 2 1rn11ld fall under the
`
`classc.- of F constmirnd by tltf condition .-1 2'. :{ mid B 2'. 2.
`
`2. In a further aspect of the invention. the inw'ntion provides a method for locating
`
`the nearest subblock boundar~· on a particular side of a particular position Jl within
`
`5
`
`SUBSTITUTE SHEET (Rule 26)
`
`Page 7
`
`
`
`W096/l5801
`
`PCT/AU96/00081
`
`a block. the method using the same components as aspect 1. but replacing step (a)
`
`with:
`
`a. EYaluating F( bp-A ... bp, bp+l ... bp+B) for increasing (or decreasing) p until the
`
`result of F falls \Vi thin a predetermined subclass of the set of possible function result
`
`values. the position of the resultant boundary being based on this position.
`
`3. In a further aspect of the invention, the invention provides a method for parti(cid:173)
`
`tioning a block into one or more subblocks. the method being identical to one of
`
`those above. \\'herein boundaries may be added and removed in accordance with a
`
`further method.
`
`4. In a further aspect of the invention, the invention provides a method for parti(cid:173)
`
`tioning a block into one or more subblocks, the method bei11g identical to one of
`
`those above. wherein an upperbound U on the subblock size is imposed.
`
`5. In a further aspect of the invention. the invention proYides a method for parti(cid:173)
`
`tioning a block into one or more subblocks. the method being identical to one of
`
`those above. wherein a lowerbound L on the subblock size is imposed.
`
`G. In a further aspect of the invention. the inYention prO\·ides a method for partition(cid:173)
`
`ing a block into one or more subblocks. the method being identical to one of those
`
`abo,·c. wlwrein an upperbound Con the subblock size is imposed and a lo\\'erbound
`
`L on the subblock size is also imposed.
`
`i. In a further aspect of the invention. the invention uses 01w or more of the methods
`
`a hove. but a ppliPs more than one partitioning function (e.g. F1• F2 • •.. ) a11cl nwthod
`independently to the block b so as to form more than one group of suhblocks.
`
`Sole The $Ubbloc/..·."- of the t 1ario11s gmup.s aT"C
`
`l'CT",11 likely to 01•f1"lap. Tiff grnups pmduccd
`
`by tlii.~ a.<:pccl can bt: usu/ indtptndn1tly or com bi nu/ in rnrio11s trays to fonn lorgr 1" group.<.
`
`of .<:ubblocf..:s.
`
`6
`
`SUBSTITUTE SHEET (Rule 26)
`
`Page 8
`
`
`
`WO 96/25801
`
`PCT/AU96/00081
`
`S. In a further aspect of the invention, the invention provides a method for parti(cid:173)
`
`tioning a block into one or more subblocks by dividing the block into subblocks of
`
`equal size.
`
`f\"otc: This aspect is not noPel and has been included solely so that later aspects can refer
`
`to it.
`
`9. In a further aspect of the invention, the invention provides a method for parti(cid:173)
`
`tioning a block into one or more subblocks by dividing the block into subblocks of
`
`a small number of different sizes.
`
`Note This aspect is not nord and has been included solely so that later aspects can refer
`
`to it.
`
`10. In a further aspect of the invention, the invention uses one of the methods above.
`
`and additionally forms subblocks from one or more groups of subblocks.
`
`11. In a further aspect of the invention, the invention employs one of the met hods
`
`above, and additionally forms a hierarchy of sub blocks from one or more contiguous
`
`groups of subblocks.
`
`1\"otc The flS]J(Cts abot•t tcill bl nfcrred lo as the partitioning aspects.
`
`12. In a further aspect of the invention. the invention proYides a method for par(cid:173)
`
`titioning a block into suhblocks and forming a corresponding collection of hashes.
`
`comprising the steps of:
`
`a. Partitioning the block into one or more subhlocks in accordance with any parti(cid:173)
`
`tioning aspect:
`
`b. Calculating the hash of one or more subblocks using a hash function Ii .
`
`.\"otr: Thr collcctio11 of has/I(·' is particularly 11s€ful if H is a slmng oru-trny lw.,J, fu11r-tin11.
`
`SUBSTITUTE SHSET (Rule 26)
`
`Page 9
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`13. In a further aspect of the invention, the invention provides a method for con(cid:173)
`
`structing a projection of a block, comprising the steps of:
`
`a. Partitioning the block into one or more subblocks in accordance with any parti(cid:173)
`
`tioning aspect;
`
`b. Forming a projection which is an ordered or unordered list containing identities
`
`(e.g. sub blocks or hashes of sub blocks) of, or references to, one or more of the
`
`sub blocks.
`
`!\°ate: Thl specification of this aspect is intended to admit lists that contain a mixture of
`
`various kinds of identities and references.
`
`Soil: In most applications the output of this asplcl ll'ill bt an ordered list of hashls of tin
`
`suublocks of the block.
`
`14. In a further aspect of the invention, the invention prO\·ides a method for finding
`
`identical portions within a group of one or more blocks comprising the steps of:
`
`a. Partitioning one or more of said blocks into one or more suhblocks in accordance
`
`with an aspect above;
`
`b. Comparing the subblocks or the identities (e.g. bashes) of tlie subblocks.
`
`1.S. In a further aspect of the invention, the il1\·ention provides a method for repre(cid:173)
`
`senting one or more blocks. im·olving the following rnmponents:
`
`(i) :\ method for storing and retrieYing subblocks:
`
`(ii) :\ mapping from block representatives (e.g. filenames) to lists of entries that
`
`identif~· subblocks:
`
`whereby the modification of data in a stored block invoh·es the following steps:
`
`8
`
`SUBSTITUTi SHEET (Rule 26)
`
`Page 10
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`a. Partitioning the new data into subblocks m accordance with any partitioning
`
`aspect:
`
`b. Adding subblocks in the new data that are not already in the collection of
`
`stored subblocks to the collection of stored subblocks, and updating the subblock
`
`list associated with the block being modified;
`
`16. In a further aspect of the invention, the invention provides a method for an
`
`entity El to communicate a group X of one or more subblocks X 1 ••• Xn to E2
`where El possesses the knowledge that E2 possesses a group ) · of zero or more
`
`sub blocks ) ·1 ••• ) ~ comprising the following step:
`
`a. Transmitting from El to E2 the contents of a subset of zero or more subblocks in
`
`X, and the remaining subblocks as references which may take (but are not limited
`
`to) the follo\\"ing forms:
`
`(i) a hash of a subblock;
`
`(ii) a reference to a subblock in ) ·;
`
`(iii) a reference to a range of sub blocks in ) ·:
`
`(iv) a reference to a subblock already transmitted:
`
`( v) a reference to a range of sub blocks already transmitted.
`
`Nott: Jn most implementations of this aspect. the subb/oc/.;s t1 1hos( contcnf.r;; ar·< tran.omiittul
`
`tt'ill be thosl in X
`
`that an 11ot in}". and for which 1w idu1tical subblock has pnl'iously
`
`becn trawm1ittul .
`
`. !\"otc To posstss l.:nou·l<:dgr that £2 posscssts }"1 ••. }",,.. El nffd not actually possfs.'
`
`}"1 ••• }·m its1lf. El nad only possess thl idrntitics of }"1 ••• }~ (e.g. the lrnshu. of rnch
`
`9
`
`SUBSTITIITE SHEET (Rule 26)
`
`Page 11
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`subblocl.: }'1 ... }~). This specification is intended to admit any other representation in
`
`which El may have the knowledgf: that E2 possesses (or has access to) } ·1 ... } ~. In
`
`particular. the knowledge may tal.:t tlll form of a projection of}'.
`
`/l:otc: It is implicit in this aspect that El will be able to use comparison (or other methods)
`
`to use its knowledge of E2 's possession of Y
`
`to determine th£ set of subblocks that aff
`
`common to both X and }'. For example if EI possessed the hashes of tht subblocks of
`
`}'. it could compare them to the hashes of the subblocks of.\' to determine the subblocks
`
`common to both X and}'. Subblocks that are not common can b( transmitted explicitly.
`
`Subblocks that are common to both X and}' can be transmitt£d by transmitting a reference
`
`to thl subblock.
`
`1 i. In a further aspect of the invent ion, the invention provides a method for an
`
`entity £1 to communicate a block X to £2 where El possesses the knowledge that
`
`E2 possesses a group }. of subblocks }·, ... }·~ comprising step (a) of aspect 16
`
`preceded by the step:
`
`s. Partitioning X into subblocks X 1 ••• Xn 111 accordance with any partitioning
`
`aspect.
`
`18. In a further aspect of the invention, the invention provides a method for an entity
`
`El to communicate one or more subblocks of a group X of subblocks X 1 •.• Xn to
`
`£2 where El possesses the knowledge that £2 possesses the block y·. comprising
`
`step (a) of aspect 16 preceded h~· the step:
`
`s. Partitioning } · into subblocks } ·, ... } ~' in accordance with any partitioning as(cid:173)
`
`pect.
`
`19.
`
`In a further aspect of the in vent ion. the in vent ion proYides a met hod for an
`
`entity El to communicate a block X to £2 where El possesses the knowledge that
`
`£2 possesses block L comprising step (a) of aspect 16 preceded by the steps:
`
`10
`
`SUBSTITUTE SHEET (Rule 26)
`
`Page 12
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`sl. Partitioning X into subblocks X 1 ..• Xn 111 accordance with any partitioning
`
`aspect.
`
`s2. Partitioning Y into subblocks Y1 ••• }~ 111 accordance with any partitioning
`
`aspect.
`
`iVotc Steps (sl} and (s2) could be performed in any order. or concurrently. as could many
`
`other subsets of steps in this patrnt specification.
`
`20. In a further aspect of the invention, the invention provides a method for con(cid:173)
`
`structing a block D from a group X of one or more sub blocks X 1 •.. Xn and a group
`
`) · of zero or more sub blocks Y1 ••• ) ~ such that X can be constructed from ) · and
`
`D. comprising the step:
`
`a. Co11structing D from at least one of the following components:
`
`(1) the contents of one or more subblocks in X:
`
`(2) references to subblocks in Y or to subblocks included m D. or to a range of
`
`subblocks from either Dor Y.
`
`;Vote Component 2 abore is inteiulcd to encompass tin ca.~( u:hcr·i: a mirtun~ of the d((cid:173)
`
`menf.'- it de8Cribes is used.
`
`21. 111 a further aspect of the invention. the invention provides a method for con(cid:173)
`
`structing a block D from a block X and a group )' of subblocks ) ·1 ••• ) ~ such that
`X can be constructed from ) · and D. comprising step (a) of aspect 20 preceded b.\·
`
`thf' step:
`
`s. Partitioning X into subblocks X 1 ••. X 11 m accordance with any partitioning
`
`aspect.
`
`11
`
`SUBSTITUTE SHEST (Rule 26)
`
`Page 13
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`22. In a further aspect of the invention, the invention provides a method for con(cid:173)
`structing a block D from a group X of suhblocks X 1 ... X,, and a block } ·such that
`X can be constructed from } · and D. comprising step (a) of aspect 20 preceded by
`
`the step:
`
`s. Partitioning Y into subblocks Yi ... } ~ in accordance with any partitioning as(cid:173)
`
`pect.
`
`23. In a further aspect of the invention. the invention provides a method for con(cid:173)
`
`structing a block D from a block X and a block } · such that X can be constructed
`
`from } · and D. com prising step (a) of aspect 20 preceded by the steps:
`
`sl. Partitioning X into subblocks X 1 ••• X,, in accordance "''ith any partitioning
`
`aspect.
`
`s2. Partitioning } · into subblocks } i ... Ln 111 accordance with any partitioning
`aspect.
`
`l\'otc: Steps (sl) and (s2) could be paformfd in any order. 01· concurrently. as could many
`
`other subsets of sltps in thi8 pallnl spcriftcation.
`
`24. In a further aspect of the invention. the invention pro,·ides a method for con(cid:173)
`
`structing a block D from a group X of subblocks X 1 •.• X,, and a projc>ction of a
`block } · (or a projection of a group } · of sub blocks }-, ... } ~,, ). such that X can lw
`
`constructed from } · and D. comprising the step:
`
`a. Constructing D from at least 01w of tlw following components:
`
`( 1) the contents of one or more subblocks in X:
`
`(2) references to subblocks in } · or to subblocks included lll D. or to a range of
`
`subblocks from either D or L
`
`12
`
`SUBSTITUTE SHEET (Rule 26)
`
`Page 14
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`Note: The projection of}· will usually have been calculated in accordance with aspect 13.
`
`The projection of a group of subblocks will usually have been calculaffd in accordance with
`
`step (b) of aspect 13.
`
`Note: An implementation will usually be able to use the projection of Y to determine if a
`
`subblod· in X is also in Y.
`
`Note: Component 2 above is intended to encompass the case where a mixturt of the ele(cid:173)
`
`ments it describes is used.
`
`25. In a further aspect of the invention. the invention provides a method for con(cid:173)
`
`structing a block D from a block X and a projection of } " such that X can be
`
`constructed from Y and D, comprising the step of aspect 24 with the following step
`
`inserted before step (a):
`
`s. Partitioning X into subblocks X 1 ... Xn
`
`Ill accordance with any partitioning
`
`aspect:
`
`26. In a further aspect of the invention, the invention provides a method for con(cid:173)
`
`structing a block X (or group X of subblocks X 1 ... X 11 ) from a group y· of subblocks
`} ~ ... } ~ and a block D, where D was constructf'd in accordance with one of aspects
`
`20 to ~:j above. comprising the step of:
`
`a. Constructing X from D and } · by constructing the subblocks of X based on one
`
`or more of:
`
`( i) references in D to sub blocks in } ·:
`
`(ii) references in D to subblocks in D:
`
`(iii) ref<'re11ccs in D that specify a rang<' of suhblocks in}":
`
`(iY) references in D that specify a range of subblocks in D:
`
`13
`
`SUBSTITUTE SHEET (Rule 26)
`
`Page 15
`
`
`
`W096/25801
`
`PCT/AU96/00081
`
`( v) sub blocks contained within D;
`
`( \·i) other data elements in D.
`
`21. In a further aspect of the invention, the invention provides a method for con(cid:173)
`
`structing a block X (or group X of sub blocks X 1 .•• Xn) from a block ) · and a block
`D. where D was constructed in accordance with one of aspects 20 to 2.j above.
`
`comprising the step of aspect 2G with the following step inserted before step (a):
`
`s. Partitioning Y into subblocks )-1 ... )~ in accordance with any partitioning as(cid:173)
`pect.
`
`28. In a further aspect of the invention, the invention provides a method for trans(cid:173)
`
`mitting a group X of subblocks X 1 ... X 11 from one entit~· £1 to another entity £2.
`
`comprising the steps of:
`
`a. Transmitting from £1 to £2 an identity of one or more subblocks:
`
`b. Transmitting from £2 to El information communicating the presence or absence
`
`of subblocks at £2;
`
`c. El transmitting to £2 at least the subblocks identified in step (h) as not being
`
`presP11t at £2:
`
`/Vole: The i11fo1"fnatio11 communicated in step (b) could take fllf for111 of a bitmap (or o
`
`co1111>rrs.~<d bitmap) cornsponding to t/1< subblocks referred to in .<.lrp (o). It could also
`
`take 111f111y othrr fonns.
`
`i\'oil: rr (/ gmup of subblocb an to br tra11.s111ittul. thr obol'f sflp.'i could be performld
`complrtd.IJ for rncli .rnbblock before 11101·i11g onto the nci.·t subblod. The steps could b£
`
`appliul lo any subgroup of subblocks.
`
`14
`
`SUBSTlTU'l'S SHEET (Rule 26)
`
`Page 16
`
`
`
`WO 96/25801
`
`PCT/A U96/00081
`
`29. In a further aspect of the invention. the invention provides a method for com(cid:173)
`municating a data block X from one entity El to another entity £2. comprising the
`
`steps of aspect 28 but with the following step inserted before step (a):
`
`s. Partitioning X into subblocks X 1 .•. Xn in accordance with any partitioning
`
`aspect.
`
`30. In a further aspect of the invention. the invention provides a method for com(cid:173)
`
`paring the contents of two or more blocks comprising the steps:
`
`a. Constructing a projection of each block as described in aspect 13;
`
`b. Comparing the projections of the blocks.
`
`:\"otc: The phrasc
`
`··comparing fh( pmjrctions"' is intu1dcd to include not just thf cos(
`
`when the tu•o projections an tfstul to see if they arc the same, but also thf casf when
`
`the subblocks (or 11rojections of .<;ubblocl..:s) within the projections an compared so <1s to
`
`determine thl subblocks that an common to the tu•o original blocl.·s.
`
`31. In a further aspect of the invention, the invention provides a method for trans(cid:173)
`
`mitting a group X of sub blocks X 1 ••• X n from one entity El to another entity £2.
`
`comprising the steps of:
`
`a. Transmitting from £2 to El information communicating the presence or absewe
`
`at £2 of members of a group). of subblocks 1·1 ••• )~,:
`
`b. Transmitting from El to £2 the contents of zero or more subblocks in X. and
`
`tlw remaining su hblocks as reference~ \\' h ich may take (but are not Jim it e<l to) tlw
`
`following forms:
`
`(i) a hash of a subblock:
`
`(ii) a reference to a sub block in L
`
`I :j
`
`.SUBSTITUTE SHEET (Rule 26) .
`
`Page 17
`
`
`
`W096/l5801
`
`PCT/AU96/00081
`
`(iii) a reference to a range of subblocks in F;
`
`(i\') a reference to a subblock already transmitted:
`
`( \') a reference to a range of sub blocks already transmitted.
`
`Note The information communicated in step (a) could tal.:e the form of subblock identities
`
`such as hashes. It could also take many other forms.
`
`32. In a further aspect of the invention, the invention provides a method for trans(cid:173)
`
`mitting a block X from one entity El to another entity £2. comprising the steps of
`
`aspect 31 with the following step inserted before step (a):
`
`s. El partitioning X into subblocks X 1 ••• Xn in accordance with any partitioning
`
`aspect;
`
`33. In a further aspect of the invention. the invention provides a method for an
`
`entity E2 to communicate to an entity El the fact that £2 possesses a group ) · of
`
`subblocks )"1 ••• )~.comprising the step of:
`
`a. £2 transmitting to El identities or references of the subblocks )"1 ... )~.
`
`34.
`
`In a further aspect of the invention. the invention provides a method for a11
`
`entity £2 to communicate to an entity £1 the fact that £2 possesses a block ) ·.
`
`comprising the step of aspect 3:3 with the following step inserted hefore step (a):
`
`s. £2 partitioning )- into subblocks Y1 ••. ) ~" iu accordance with an:• partitioning
`
`aspect:
`
`:J.5.
`
`In a further aspect of the invention. the iuventiou provides a method for au
`
`entity El to communicate a subblock X; to an entit~· £"!..comprising tl1c steps:
`
`a. E"2 sending El an identity of X1.
`
`SUBSTITUTE SHEET (Rule 26)
`
`Page 18
`
`
`
`W096/25801
`
`PCT/A U96/00081
`
`b. El sending X; to E2.
`
`Note This aspect applies (among other applications) to thf cast of a 11dwor/,· serrer El
`
`that serves subblocks to clients such as E2, given the identities (e.g. hashes) of the requested
`
`subblocks.
`
`36. In a further aspect of the invention, the invention provides a method for an
`
`entity El to communicate a subblock X; to an entity £2. comprising the steps of
`
`aspect 35 with the following step inserted before step (a):
`
`s. El partitioning a block X into subblocks X 1 ... Xn m accordance with any
`
`partitioning aspect;
`
`31.
`
`lu a further aspect of the invention. the invention provides a method similar
`
`to any of those above, but where one or more of the comparisons of subblocks are
`
`performed by comparing the hashes of the subblocks. using hashes already a\"ailable
`
`(e.g. as a byproduct of other steps), or calculated for the purpos<" of performing one
`
`or more said comparisons.
`
`38.
`
`In a furtlwr aspect of the invention the invention pro,·ides a nwthod similar
`
`to any of those above. but where subsets of identical subblocks within a group of
`
`one or more subblocks are identified by inserting each subblock. a11 identit>' of each
`
`subblock. a reference of each subblock. or a hash of each subblock. into a data
`
`structure.
`
`'.39. In a further aspect of the invention thf' invention provides a met hod identical
`
`to any of those above. but with various subsets of steps execu t eel co11cu rrently.
`
`40. In an aspect of the invention. the invention provides an apparatus for partitiou(cid:173)
`
`ing a block b into one or more subblocks. the a