throbber
111111111111111111111111111111111111111111111111111111111111111111111111111
`US005990810A
`Patent Number:
`Date of Patent:
`
`[11]
`
`[45]
`
`5,990,810
`Nov. 23, 1999
`
`United States Patent
`Williams
`
`[19]
`
`[54]
`
`[76]
`
`[21]
`
`[22]
`
`[86]
`
`METHOD FOR PARTITIONING A BLOCK
`OF DATA INTO SUBBLOCKS AND FOR
`STORING AND COMMUNCATING SUCH
`SUBBLOCKS
`
`Inventor: Ross Neil Williams, 3/305 N. Terrace,
`Adelaide SASOOO, Australia
`
`Appl. No.:
`
`08/894,091
`
`PCT Filed:
`
`Feb. 15, 1996
`
`PCTNo.:
`
`PCT/AU96/00081
`
`§ 371 Date:
`
`Aug. 15, 1997
`
`§ 102(e) Date: Aug. 15, 1997
`
`[87]
`
`PCT Pub. No.: W096/25801
`
`PCT Pub. Date: Aug. 22, 1996
`
`[30]
`
`Foreign Application Priority Data
`
`Feb. 17, 1995
`Apr. 12, 1995
`
`[AU]
`[AU]
`
`Australia
`Australia
`
`Int. C1.6
`[51]
`[52] U.S. Cl.
`Field of Search
`[58]
`
`PN1232
`PN2392
`
`H03M 7/00
`341/51; 341/67
`341/51,50,67;
`375/241; 704/203
`
`[56]
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`4,698,628 10/1987 Herkert et al.
`8/1993 Sugiyama et al.
`5,235,623
`5,479,654 12/1995 Squibb
`
`340/825.02
`341/67
`395/600
`
`OlliER PUBLICATIONS
`
`Williams, Ross, "An algorithm for matching text (possibly
`original)", Newsgroup posting, comp.compression, Jan. 27,
`1992.
`Williams, Ross, "Parallel data compression", Newsgroup
`posting, comp.conmpression.research, Jun. 30, 1992.
`
`Knuth, Donald E., "The Art of Computer Programming, vol.
`3: Sorting and Searching", pp. 508-513, Addison-Wesley
`Publishing Company, 1973.
`Williams, Ross N., "An Introduction to Digest Algorithms"
`Proceedings of the Digital Equipment Computer Users Soci(cid:173)
`ety, pp. 9-18, Aug. 1994.
`Williams, Ross N., "An Extremely Fast ZIV-Lempel Data
`Compression Algorithm", Proceedings of Data Compression
`Conference, pp. 362-371, Apr. 1991.
`Knuth, Donald E., The Art of Computer Programming, vol.
`1: Fundamental Algorithms, pp. 435-451, Addison Wesley
`Publishing Company, 1973.
`
`Primary Examiner-Brian Young
`Attorney, Agent, or Firm---Greenberg Traurig; Robert P.
`Bell
`
`[57]
`
`ABSTRACT
`
`This invention provides a method and apparatus for detect(cid:173)
`ing common spans within one or more data blocks by
`partitioning the blocks (FIG. 4) into subblocks and searching
`the group of subblocks (FIG. 12) (or their corresponding
`hashes (FIG. 13)) for duplicates. Blocks can be partitioned
`into subblocks using a variety of methods, including meth(cid:173)
`ods that place subblock boundaries at fixed positions (FIG.
`3), methods that place subblock boundaries at data(cid:173)
`dependent positions (FIG. 3), and methods that yield mul(cid:173)
`tiple overlapping subblocks (FIG. 6). By comparing the
`hashes of subblocks, common spans of one or more blocks
`can be identified without ever having to compare the blocks
`or subblocks themselves (FIG. 13). This leads to several
`applications including an incremental backup system that
`backs up changes rather than changed files (FIG. 25), a
`utility that determines the similarities and differences
`between two files (FIG. 13), a file system that stores each
`unique subblock at most once (FIG. 26), and a communi(cid:173)
`cations system that eliminates the need to transmit subblocks
`already possessed by the receiver (FIG. 19).
`
`30 Claims, 26 Drawing Sheets
`
`Hash of subbblock
`I
`The subbblock itself
`I
`Hash of subbblock
`I
`
`Pointer to subbblock
`
`A Projection Of b
`
`Oracle Exhibit 1006, pg 1
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 1 of 26
`
`5,990,810
`
`MADD 0003
`Sheet 1 of 26
`
`Demonstr ates con tent mis alignmen t.
`XDemonst rates co ntent mi salignme nt.
`
`Figure 1
`
`Oracle Exhibit 1006, pg 2
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 2 of 26
`
`5,990,810
`
`Fixed an Id variab lie width Ipartiti loning.
`Fixe I d an Id variable wid Ith part Iitioning.
`
`Figure 2
`
`Oracle Exhibit 1006, pg 3
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 3 of 26
`
`5,990,810
`
`partitioning.
`Data-indep endent
`XData-inde pendent pa rtitioning . I
`
`Data-dep I edent I partiti Ioning.1
`XData-dep Iendent partiti I oning·1
`
`Figure 3
`
`Oracle Exhibit 1006, pg 4
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 4 of 26
`
`5,990,810
`
`..--- ----.-
`A
`B
`
`..--- ----.-
`A
`B
`
`..--- ----.-
`A
`B
`
`b
`
`k
`
`k
`
`k
`
`Function result space
`
`Figure 4
`
`Oracle Exhibit 1006, pg 5
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 5 of 26
`
`5,990,810
`
`A B
`..--- ----.-
`
`b
`
`p
`
`..
`Increase p
`
`F( bp-A+1···bp,b p+1···bp+B) -----:·~I (SUbclass)I
`
`Function result space
`
`Figure 5
`
`Oracle Exhibit 1006, pg 6
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 6 of 26
`
`5,990,810
`
`I
`
`I
`I
`I
`I
`I
`I
`~----_---~--------,------I---L-----------~-----,------------------~--~--------~
`I
`I
`I
`I :
`I
`I
`I
`I
`f
`
`f
`
`Figure 6
`
`r-----'-----~---------------------,------------------~-----T------------I------.
`I
`I
`I
`1
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`- ~ -
`- ~ -
`- ~
`
`-
`
`;
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`- -;- -
`
`-
`
`-
`
`-
`
`- L -
`
`-
`
`-
`
`- - ; -
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`- -'- -
`
`-
`
`-
`
`- _L. -
`
`-
`
`-
`
`- - ; -
`
`-
`
`-
`
`-
`
`-
`
`-
`
`-
`
`F3 1
`F2 ~ -----L -
`F1 f
`
`Oracle Exhibit 1006, pg 7
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 7 of 26
`
`5,990,810
`
`I-----'------------------~
`I
`I
`I
`I
`L
`.1
`
`I
`'-
`
`~--------------------~---------I
`,
`I
`I
`,
`I
`---------------~--------~
`I
`
`r---------------~-----------------~------------------------~------------------I
`I
`I
`I
`I
`.L.
`'-
`L
`I
`I
`I
`I
`~
`
`J
`
`I
`
`F,
`
`1
`
`r-----·--------~--~------~--------r-----T-----~--~---------~--------~--------~
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`
`Figure 7
`
`Oracle Exhibit 1006, pg 8
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 8 of 26
`
`5,990,810
`
`~
`
`I
`I
`
`- - ~ - - ~ - -
`
`-
`
`-
`
`- - -;- -
`
`I
`
`I
`
`I
`I
`- L -
`
`-
`
`-
`
`-
`
`- ~ -
`
`-
`
`-
`
`-
`
`I
`
`I
`I
`- ~
`:
`
`- ~ -
`
`I
`
`I
`
`-
`
`-
`
`-
`
`- ~ -
`I
`
`F3 ;---------------------------------~------------------------~------------------:
`L
`~------------------~-----L-----~------------- __ I L ~-------------~
`F2:
`I
`I
`I
`I
`I
`I
`F1f - - -- - ~ - - 1 - -
`- J - - ~ - - ~ - - - - - -;- - - '- - - - - - - L - ~ -
`
`I
`
`I
`
`I
`
`I
`
`I
`
`I
`
`I
`
`I
`
`I
`
`I
`
`I
`
`Figure 8
`
`Oracle Exhibit 1006, pg 9
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 9 of 26
`
`5,990,810
`
`.----------------------------------------------------------------------------.
`
`I
`I
`I
`I
`~----------------------------------------------------------------------------~
`I
`I
`I
`I
`~
`1
`I
`I
`I
`
`~--------------,---------------------i--------------~---------------T--------~
`
`I
`I
`I
`I
`I
`I
`I
`I
`I
`f
`r-----r--------~--~------I------------~--------,------I--------- r -----.-
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I :
`I
`I
`I
`I
`I
`I
`I
`
`F,
`
`I
`
`I
`I
`
`, __ ,
`I
`I
`
`I
`I
`
`bL..-.-1.-.l...-.l-.....l..-....L--.L...-...l-----l...----L----l...-...J.-..I...--JL.....J..----l....--L.---l.-..l..--L-J.----L.----L..-.....l..-...L-J
`
`Figure 9
`
`Oracle Exhibit 1006, pg 10
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 10 of 26
`
`5,990,810
`
`.-- ----~
`
`A B
`
`k
`
`.-- ----~
`
`A B
`
`k
`
`.-- ----~
`
`A B
`
`k
`
`b
`
`Hashes of
`subblocks
`
`Figure 10
`
`Oracle Exhibit 1006, pg 11
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 11 of 26
`
`5,990,810
`
`.-- ----~
`
`A B
`
`k
`
`.-- ----~
`
`A B
`
`k
`
`.-- ----~
`
`A B
`
`k
`
`b
`
`Hash of subbblock
`I
`The subbblock itself
`I
`Hash of subbblock
`I
`
`Pointer to subbblock
`
`A Projection Of b
`
`Figure 11
`
`Oracle Exhibit 1006, pg 12
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 12 of 26
`
`5,990,810
`
`B
`
`A
`
`k
`
`..--- ----.-
`--
`
`....
`
`....
`
`....
`
`....
`
`,
`
`,
`
`,
`
`,
`
`B
`
`..--- ----.-
`
`A
`
`k
`
`B
`
`..--- ----.-
`
`A
`
`k
`
`b1
`
`....
`
`....
`
`--
`
`--
`
`--
`
`--
`
`---
`
`Compare
`
`..--- ----.(cid:173)
`k
`A B
`
`..--- ----.(cid:173)
`k
`A B
`
`Figure 12
`
`Oracle Exhibit 1006, pg 13
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 13 of 26
`
`5,990,810
`
`A B
`.--- ----.(cid:173)
`
`k
`
`A B
`.--- ----.(cid:173)
`
`k
`
`.--- ----.-
`k
`A
`B
`
`.--- ----.(cid:173)
`k
`A B
`
`Figure 13
`
`Oracle Exhibit 1006, pg 14
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 14 of 26
`
`5,990,810
`
`aardvark.txt -----..C3---f----------=~.. L...---'
`
`walrus.txt -------;~~)-
`
`sloth.dat ---~~~
`
`Entry lists
`
`Subblock storage
`
`Figure 14
`
`Oracle Exhibit 1006, pg 15
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 15 of 26
`
`5,990,810
`
`E>-··
`,....
`>-..
`>-
`
`C\J
`W
`
`-"'0
`
`~(
`
`~C
`
`Q)
`)
`::J
`10(cid:173)
`/)e
`
`o(
`
`)
`
`Q)a:--
`
`r------
`
`eX
`
`.
`,....
`:::::::: X
`........
`=:::=:
`..
`:.:.:. X
`~I~
`
`~
`
`~
`
`e
`"'0 0
`Q) .-
`~ .
`~ ctS
`E ex
`.....
`Q)
`C/)
`C/) 0
`e
`ctS Q)
`10-
`10- a.
`I- Q)
`
`10-
`
`It')
`~
`Q)
`~
`
`::s
`bJj
`."""
`
`~
`
`e
`X
`,....
`X
`..
`X
`
`E
`>-
`··
`,....
`>-
`>=
`
`,....
`W
`
`Oracle Exhibit 1006, pg 16
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 16 of 26
`
`5,990,810
`
`~...•.............,...~::::::::::::::::::=:: :::::::::: ::::::::::::
`• t. ..
`......
`
`X: X1 ..Xn
`
`••• « • t
`
`•
`
`t
`
`Y: Y1 .. Ym
`
`Figure 16
`
`Oracle Exhibit 1006, pg 17
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 17 of 26
`
`5,990,810
`
`-.. .. .. .. ..... ....
`~ ~...... .. .....
`.
`.
`......................
`.
`.
`.
`.
`•••••••.•.•.
`X: X1 .. Xn
`
`0.
`
`. . . . .t
`
`,
`
`1 - - - - - - -1 Projection of Y
`
`Figure 17
`
`Y: Y1 .. Ym
`
`I
`I
`I
`I
`
`I,
`
`I
`
`I
`I
`I
`
`I I
`
`I
`I
`I
`I
`I
`I
`I
`I
`I
`I
`
`Oracle Exhibit 1006, pg 18
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 18 of 26
`
`5,990,810
`
`.- ----------------------.
`
`•
`•
`•
`•
`•
`•
`•
`•
`:
`X: X1 .. Xn
`:
`·- -- ------------------ - -'
`
`Y: Y1 ..Ym
`
`Figure 18
`
`Oracle Exhibit 1006, pg 19
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 19 of 26
`
`5,990,810
`
`Q)
`
`C Q
`
`)E
`
`i-
`
`E
`>-
`.,...
`>-
`>-
`
`-o
`
`en c
`Q)X
`:
`:~
`+-'
`.,...
`a5 X"'0
`
`~""""""L-II
`
`N
`W
`
`.,...
`W
`
`E
`>-
`.,...
`>-
`>-
`
`c
`X
`.
`.,...
`~~{~ X
`.
`~jjj~j
`X
`~:~:~:
`
`N
`W
`
`.,...
`w
`
`Oracle Exhibit 1006, pg 20
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 20 of 26
`
`5,990,810
`
`(J)
`
`c (
`
`J)
`
`E~
`
`-T"""
`W
`
`E>
`
`-
`T""">-
`>-
`
`C\J
`W
`
`T"""
`W
`
`c>
`
`<
`or-><
`><
`
`II
`
`Oracle Exhibit 1006, pg 21
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 21 of 26
`
`5,990,810
`
`C\I
`W
`
`E>
`
`-..
`~>-. .
`>-
`
`'+-1-0o
`cncnE
`Q)~>-
`.-t= C
`.
`+-'Q)""':
`a5Q5>-
`-0 '+-
`-Q )
`1-
`
`.............-....
`
`~
`
`C><
`><. .
`><
`
`~
`
`w
`
`Oracle Exhibit 1006, pg 22
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 22 of 26
`
`5,990,810
`
`C'I
`C'I
`QJ
`~
`=
`on
`."""
`~
`
`N
`W
`
`Q)
`C
`
`Q)
`E
`~
`
`.,...
`w
`
`.
`'-
`X
`......
`0
`~I
`~ j!j~j! X
`"'0
`Q)
`.c
`I-
`
`cX.
`
`,...
`X. .
`X
`
`N
`W
`
`.,...
`w
`
`.
`.-
`X
`......
`0
`~
`+-'
`+-'
`
`c
`Q)
`.-.
`"'0
`Q)
`.c
`l-
`
`......
`0
`Q) ~ •
`.c -- --
`I-EX
`Q)
`"'0
`
`cX.
`
`,...
`X..
`X
`
`Oracle Exhibit 1006, pg 23
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 23 of 26
`
`5,990,810
`
`..0
`
`l-LL
`00>
`"t- C
`(J)~
`c etS
`etS
`::J
`0 ) -
`~ ~
`0)
`
`(J)
`::J
`C\il-
`etSa.
`a.«
`
`".........
`
`(J) CO .-.
`"t-C ......... CO
`00'r-.....
`.- ~ CD
`+-,+-,"""
`0) . - \.""
`.........
`. ~
`en L
`etS 0> ('I)
`a.a.>,.-
`
`~
`
`Oracle Exhibit 1006, pg 24
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 24 of 26
`
`5,990,810
`
`-.-- ----..
`
`A B
`
`k
`
`-.-- ----..
`
`A
`
`k
`
`B
`
`-.-- ----..
`
`A B
`
`k
`
`b
`
`Hashes of
`subblocks
`
`Table of hashes
`
`Figure 24
`
`Oracle Exhibit 1006, pg 25
`
`

`

`d•
`'JJ.
`
`• ~~.
`
`....
`~=.....
`
`zo
`
`~ N~
`
`~
`
`'""'"'0
`'0
`'0
`
`'JJ.=(cid:173)~
`~....
`
`NU
`
`l
`o....,
`'1
`
`N0
`
`Ul
`....
`\C
`
`00
`
`\C=....
`~=
`
`X: X1 ..Xn: Current
`version (to be
`backed up).
`
`H(Y1 )... H(Ym)
`(Hashes of subblocks
`in previously backed
`up version (Y)).
`
`E1
`
`..
`.~~
`1
`.......:~I
`.........
`:.:.>:.:
`
`- 6
`-
`
`2
`
`I
`I
`I
`
`I
`I
`
`..
`
`D: Incremental
`backup
`consists of
`subblocks in X
`and references
`to subblocks in
`Y (and D).
`
`Figure 25
`
`-----------------
`
`I
`I
`I
`
`X: X1 ..Xn
`
`I
`----------~
`
`'AM«' ft • • L'
`
`X can be
`reconstructed
`from Yand D.
`~~~""ih~~ ~ J m
`:;::::::::::::::::::: ~j
`Y: Y1 ..Ym
`(Previously
`backed up
`version).
`E2
`
`Oracle Exhibit 1006, pg 26
`
`

`

`u.s. Patent
`
`Nov. 23, 1999
`
`Sheet 26 of 26
`
`5,990,810
`
`Filename
`
`Blocks
`
`aardvark.txt
`
`walrus.txt
`
`sloth.dat
`
`File Table
`
`I
`I
`I
`
`Hash
`830..
`9F8..
`AA6..
`092..
`3EB..
`
`Refs
`1
`2
`1
`1
`3
`
`421
`
`013
`
`I
`
`Ptr /
`
`- - /
`~
`
`I I
`
`I
`I
`
`o1234
`
`I
`I
`I
`
`544 VCJ D
`/
`--
`" ~
`-"""-
`..
`...
`
`I
`
`/1
`
`/-Pi
`
`..,l
`
`",l
`
`-
`
`~
`
`I
`
`I
`
`I
`
`0
`
`c=J
`
`I
`
`I
`
`0
`
`I
`I
`I
`
`I
`I
`I
`
`I
`I
`I
`
`Hash Table
`
`Figure 26
`
`Subblock storage system
`(Note: Subblocks are all
`different).
`
`Oracle Exhibit 1006, pg 27
`
`

`

`5,990,810
`
`1
`METHOD FOR PARTITIONING A BLOCK
`OF DATA INTO SUBBLOCKS AND FOR
`STORING AND COMMUNCATING SUCH
`SUBBLOCKS
`
`INTRODUCTION
`
`The present invention provides a method and apparatus
`for partitioning one or more blocks of data into subblocks for
`the purpose of communicating and storing such subblocks in
`an efficient manner.
`
`BACKGROUND
`
`5
`
`20
`
`Much of the voluminous amount of information stored,
`communicated, and manipulated by modern computer sys(cid:173)
`tems is duplicated within the same or a related computer
`system. It is commonplace, for example, for computers to
`store many slightly differing versions of the same document.
`It is also commonplace for data transmitted during a backup
`operation to be almost
`identical
`to the data transmitted
`during the previous backup operation. Computer networks
`also must repeatedly carry the same or similar data in
`accordance the requirements of their users.
`Despite the obvious benefits that would flow from a
`reduction in the redundancy of communicated and stored
`data, few computer systems perform any such optimization.
`Some instances can be found at the application level, one
`example being the class of incremental backup utilities that
`save only those files that have changed since the most recent
`backup. However, even these utilities do not attempt to
`exploit
`the significant similarities between old and new
`versions of files, and between files sharing other close
`semantic ties. This kind of redundancy can be approached
`only by analysing the contents of the files.
`The present invention addresses the potential for reducing 35
`redundancy by providing an efficient method for identifying
`identical portions of data within a group of blocks of data,
`and for using this identification to increase the efficiency of
`systems that store and communicate data.
`
`2
`subblocks whereas the data-dependent partitioning gener(cid:173)
`ates just four, allowing much of the similarity between the
`two blocks to be detected.
`The fact that a partitioning is data dependent does not
`imply that it must incorporate any knowledge of the syntax
`or semantics of the data. So long as the boundaries are
`positioned in a manner dependent on the local data content,
`identical subblocks are likely to be formed from identical
`portions of data, even if the two portions are not identically
`10 aligned relative to the start of their enclosing blocks (FIG.
`3).
`Once the group of blocks has blocks has been partitioned
`the resulting group of subblocks can be
`into subblocks,
`manipulated in a manner that exploits the occurrence of
`15 duplicate subblocks. This leads to a variety of applications,
`some of which are described below. However, the applica(cid:173)
`tion of a further aspect of the invention leads to even greater
`benefits.
`In a further aspect of the invention, the hash of one or
`more subblocks is calculated. The hash function can be an
`ordinary hash function or one providing cryptographic
`strength. The hash function maps each subblock into a small
`tractable value (e.g. 128 bits) that provides an identity of the
`25 subblock. These hashes can usually be manipulated more
`efficiently than their corresponding subblocks.
`Some applications of aspects of this invention are:
`Fine-grained incremental backups: Conventional incre(cid:173)
`mental backup technology uses the file as the unit of backup.
`30 However, in practice many large files change only slightly,
`resulting in a wasteful re-transmission of changed files. By
`storing the hashes of subblocks of the previous versions of
`files, the transmission of unchanged subblocks can be elimi(cid:173)
`nated.
`Communications: By providing a framework for commu(cid:173)
`nicating the hashes of subblocks, the invention can eliminate
`the ~ransmission of subblocks already possessed by the
`receIver.
`Differences: The invention could be used as the basis of
`40 a program that determines the areas of similarity and dif(cid:173)
`ference between two blocks.
`Low-redundancy file system: Data stored in a file system
`can be partitioned into subblocks whose hashes can be
`compared so as to eliminate the redundant storage of iden(cid:173)
`45 tical subblocks.
`Virtual memory: Virtual memory could be organized by
`subblock using a table of hashes to determine if a subblock
`is somewhere in memory.
`
`SUMMARY OF THE INVENTION
`
`To identify identical portions of data within a group of
`blocks of data, the blocks must be analysed. One simple
`approach is to divide the blocks into fixed-length (e.g.
`512-byte) subblocks and compare these with each other so
`as to identify all identical subblocks. This knowledge can the
`be used to manage the blocks in more efficient ways.
`Unfortunately, the partitioning of blocks into fixed-length
`subblocks does not always provide a suitable framework for 50
`the recognition of duplicated portions of data, as identical
`portions of data can occur in different sizes and places within
`a group of blocks of data. FIG. 1 shows how division into
`fixed-size subblocks of two blocks (whose only difference is
`the insertion of a single byte eX')) fails to generate identical 55
`subblocks. A comparison of the two groups of subblocks
`would reveal no identical pairs of subblocks even thought
`the two original blocks differ by just one character.
`A better approach is to partition each block using the data
`in the block itself to determine the position of the partitions. 60
`In an aspect of the invention, the blocks are partitioned at
`boundaries determined by the content of the data itself. For
`example, a block could be partitioned at each point at which
`the preceding three bytes has to a particular constant value.
`FIG. 2 shows how such a data dependent partitioning could 65
`turn out, and contrasts it with a fixed-length partitioning. In
`FIG. 3 data independent partitioning generates seven distinct
`
`Clarification of Terms
`
`The term block and subblock both refer, without
`limitation,
`to finite blocks or infinite blocks (sometimes
`called streams) of zero or more bits or bytes of digital data.
`Although the two different terms ("blocks" and "subblock")
`essentially describe the same substance (digital data), the
`two different terms have been employed in this specification
`to indicate the role that a particular piece of data is playing.
`The term "block" is usually used to refer to raw data to be
`manipulated by aspects of the invention. The term "sub(cid:173)
`block" is usually used to refer to a part of a block. "Blocks"
`are "partitioned" into "subblocks".
`The term partition has its usual meaning of exhaustively
`dividing an entity into mutually exclusive parts. However,
`within this specification, the term also includes cases where:
`Not all of the block is subdivided.
`Multiple overlapping subblocks are formed.
`
`Oracle Exhibit 1006, pg 28
`
`

`

`5,990,810
`
`5
`
`3
`A natural number is a non-negative integer (0, 1, 2, 3, 4,
`5, ... ).
`Where the phrase zero or more is used, this phrase is
`intended to encompass the degenerate case where the objects
`being enumerated are not considered at all, as well as the
`case where zero or more objects are used.
`BRIEF DESCRIPTION
`The following aspects of this invention are numbered for
`reference purposes. The terms "block" and "subblock" refer
`to blocks and subblocks of digital data.
`1. In an aspect of the invention, the invention provides a
`method for organizing a block b of digital data for the
`purpose of storage, communication, or comparison, by par(cid:173)
`titioning said block into subblocks at one or more positions
`klk+l within said block for which b[k-A+l ... k+B]
`satisfies a predetermine constraint, where A and Bare
`natural numbers.
`Note: The specification of this aspect encompasses the
`degenerate case in which either A or B is zero. The speci(cid:173)
`fication also includes the case where the constraint does not
`pay attention to some parts of b[k-A+l ... k+B]. For
`example, a constraint that pays attention only to (say) b[k-3]
`and b[k+2] would fall under the classes of constraint cor(cid:173)
`responding to A~4 and B~2.
`the invention
`2. In a further aspect of the invention,
`provides a method according to aspect 1 in which the
`constraint comprises the hash of some or all of b[k-A+1 ...
`k+B].
`3. In a further aspect of the invention,
`the invention
`~~~~~~~ss:b~~~~o~oa~~~~~~n~ntoaa:fct~c~i'af~ro~ft~~~n~ptz~ 30
`within a said block, comprising the step of:
`a. Evaluating whether said predetermined constraint is
`satisfied at each position klk+l,
`for
`increasing (or
`decreasing) k, where k starts with the value p.
`4. In a further aspect of the invention,
`the invention 35
`provides a method according to aspect 1, wherein one or
`more bounds are imposed on the size of one or more
`subblocks.
`the invention
`5. In a further aspect of the invention,
`provides a method according to aspect 1, wherein additional 40
`subblocks are formed from one or more groups of subblocks.
`6. In a further aspect of the invention,
`the invention
`provides a method according to aspect 1, wherein an addi(cid:173)
`tional hierarchy of subblocks is formed from one or more
`groups of contiguous subblocks.
`the invention
`7. In a further aspect of the invention,
`provides a method according to one of aspects 1 to 6,
`comprising the further step of:
`b. Calculating the hash of each of one or more of said
`subblocks.
`Note: The resulting collection of hashes is particularly
`useful if H is a strong one-way hash function.
`8. In a further aspect of the invention,
`the invention
`provides a method according to one of aspects 1 to 6,
`comprising the further step of:
`b. Forming a projection of said block, being an ordered or
`unordered collection of elements, wherein each element
`consists of a subblock, an identity of a subblock, or a
`reference of a subblock.
`Note: The specification of this aspect is intended to admit
`collections that contain a mixture of various kinds of iden(cid:173)
`tities and references.
`Note: In most applications, the output of this aspect will be
`an ordered list of hashes of the subblocks of the block.
`9. In a further aspect of the invention,
`the invention
`provides a method for comparing one or more blocks,
`comprising the steps of:
`
`4
`a. Partitioning one or more of said blocks into one or more
`subblocks in accordance with one of aspects 1 to 6.
`b. Forming a projection of each said block, being an
`ordered or unordered collection of elements, wherein each
`element consists of a subblock, an identity of a subblock, or
`a reference of a subblock.
`c. Comparing the elements of said projections of said
`blocks.
`10. In a further aspect of the invention, the invention
`10 provides a method for representing one or more blocks,
`comprising:
`(i) A collection of subblocks;
`filenames) which are
`(ii) Block representatives (e.g.
`mapped to lists of entries that identify subblocks;
`whereby the modification of one of said blocks involves the
`15 following steps:
`a. Partitioning some or all of said modified block into
`subblocks in accordance with one of aspects 1 to 6;
`b. Adding to said collection of subblocks zero or more
`subblocks that are not already in said collection, and updat(cid:173)
`20 ing said subblock list associated with said modified block.
`11. In a further aspect of the invention, the invention
`provides a method according to aspect 10, in which step b
`is replaced by:
`b. Removing from said collection of subblocks zero or
`25 more subblocks, and updating said subblock list associated
`with said modified block.
`12. In a further aspect of the invention, the invention
`provides a method according to aspect 10, in which step b
`is replaced by:
`b. Adding to said collection of subblocks zero or more
`subblocks that are not already in the collection, removing
`from said collection of subblocks zero or more subblocks,
`and updating said subblock list associated with said modi(cid:173)
`fied block.
`13. In a further aspect of the invention, the invention
`provides a method for an entity El to communicate a block
`X to El where El possesses the knowledge that E2 pos(cid:173)
`sesses a group Y of subblocks Y 1 . . . Y m' comprising the
`following steps:
`a. Partitioning X into subblocks Xl ... X n in accordance
`with one of aspects 1 to 6;
`b. Transmitting from El to E2 the contents of zero or more
`subblocks in X, and the remaining subblocks as references
`to subblocks in Y 1
`. Y m and to subblocks already
`.
`.
`45 transmitted.
`Note: In most implementations of this aspect, the subblocks
`whose contents are transmitted will be those in X that are not
`in Y, and for which no identical subblock has previously
`been transmitted.
`50 Note: To posses knowledge that E2 possesses Y 1 . . . Y m' El
`. Y m itself. El need only
`need not actually posses Y 1 .
`.
`posses the identities of Y l . . . Ym (e.g. the hashes of each
`subblock Y 1 . . . Y m)' This specification is intended to admit
`any other representation in which El may have the knowl-
`55 edge that E2 possesses (or has access to) Y 1 . . . Y m' In
`particular, the knowledge may take the form of a projection
`ofY.
`Note: It is implicit in this aspect the El will be able to use
`comparison (or other methods) to use its knowledge of E2's
`60 possession of Y to determine the set of subblocks that are
`common to both X and Y. For example, if El possessed the
`hashes of the subblocks of Y, it could compare them to the
`hashes of the subblocks of X to determine the subblocks
`common to both X and Y. Subblocks that are not common
`65 can be transmitted explicitly. Subblocks that are common to
`both X and Y can be transmitted by transmitting a reference
`to the subblock.
`
`Oracle Exhibit 1006, pg 29
`
`

`

`5,990,810
`
`5
`
`5
`14. In a further aspect of the invention, the invention
`provides a method for an entity E1 to communicate one or
`more subblocks of a group X of subblocks Xl ... X n to E2
`where E1 possesses the knowledge that E2 possesses the
`blocks Y, comprising the following steps:
`a. Partitioning Y into subblocks Y 1 . . . Ym in accordance
`with one or aspects 1 to 6;
`b. Transmitting from E1 to E2 the contents of zero or more
`subblocks in X, and the remaining subblocks as references
`to subblocks in Y and to subblocks already transmitted.
`15. In a further aspect of the invention, the invention
`provides a method for an entity E1 to communicate a block
`X to E2 where E1 possesses the knowledge that E2 pos(cid:173)
`sesses block Y, comprising the following steps:
`a. Partitioning in accordance with one of aspects 1 to 6, X 15
`into subblocks Xl ... X n and Y into subblocks Y 1 . . . Ym;
`b. Transmitting from E1 to E2 the contents of subblocks
`in X, and the remaining subblocks as references to subblocks
`in Y and to subblocks already transmitted.
`16. In a further aspect of the invention, the invention
`provides a method for constructing a block D from a block
`X and a group Y of subblocks Y 1 . . . Ym such that X can be
`constructed from Y and D, comprising the following steps:
`a. Partitioning X into subblocks Xl ... X n in accordance
`with one of aspects 1 to 6;
`b. Constructing D from one or more of the following: the
`contents of zero or more subblocks in X, references to zero
`or more subblocks in Y, and references to zero or more
`subblocks in D.
`Note: Step b above is intended to encompass the case where
`a mixture of the elements it describes is used.
`17. In a further aspect of the invention, the invention
`provides a method for constructing a block D from a group
`X of subblocks Xl ... X n and a block Y such that X can be
`constructed from Y and D, comprising the following steps:
`a. Partitioning Y into subblocks Y 1 . . . Ym in accordance
`with one of aspects 1 to 6;
`b. Constructing D from one or more of the following: the
`contents of zero or more subblocks in X, references to zero
`or more subblocks in Y, and references to zero or more
`subblocks in D.
`18. In a further aspect of the invention, the invention
`provides a method for constructing a block D from a block
`X and a block Y such that X can be constructed from Y and
`D, comprising the following steps:
`a. Partitioning in accordance with one of aspects 1 to 6, X
`into subblocks Xl ... X n and Y into subblocks Y 1 . . . Ym;
`b. Constructing D from one or more of the following: the
`contents of zero or more subblocks in X, references to zero
`or more subblocks in Y, and references to zero or more
`subblocks in D.
`19. In a further aspect of the invention, the invention
`provides a method for constructing a block D from a block
`X and a projection of Y, said projection comprising an
`ordered or unordered collection of elements wherein each 55
`element consists of a subblock in Y, an identity of a subblock
`in Y, or a reference of a subblock in Y, such that X can be
`constructed from Y and D, comprising the following steps:
`a. Partitioning X into subblocks Xl ... X n in accordance
`with one of aspects 1 to 6;
`b. Constructing D from one or more of the following: the
`contents of zero or more subblocks in X, references to zero
`or more subblocks in Y, and references to zero or more
`subblocks in D.
`20. In a further aspect of the invention, the invention
`provides a method for constructing a block X from a block
`Y and a block D, comprising the following steps:
`
`6
`a. Partitioning Y into subblocks Y 1 . . . Ym in accordance
`with one or aspects 1 to 6;
`b. Constructing X from D and Y by constructing the
`subblocks of X based on one or more of:
`(i) subblocks contained within D;
`(iii) references in D to subblocks in Y;
`(iii) references in D to subblocks in D;
`21. In a further aspect of the invention, the invention
`provides a method for constructing a group X of subblocks
`10 Xl ... X n from a block Y and a block D, comprising the
`following steps:
`a. Partitioning Y into subblocks Y 1 . . . Ym in accordance
`with one of aspects 1 to 6;
`b. Constructing Xl ... X n from D and Y based on one or
`more of:
`(i) subblocks contained within D;
`(iii) references in D to subblocks in Y;
`(iii) references in D to subblocks in D;
`22. In a further aspect of the invention, the invention
`provides a method for communicating a data block X from
`20 one entity E1 to another entity E2 comprising the following
`steps:
`a. Partitioning X into subblocks Xl ... X n in accordance
`with one of aspects 1 to 6;
`b. Transmitting from E1 to E2 an identity of one or more
`25 subblocks;
`c. Transmitting from E2 to E1 information communicat(cid:173)
`ing the presence or absence of subblocks at E2;
`d. Transmitting from E1 to E2 at least the subblocks
`identified in step (c) as not being present at E2.
`30 Note: The information communicated in step (c) could take
`the form of a bitmap (or a compressed bitmap) correspond(cid:173)
`ing to the subblocks referred to in step (a). It could also take
`many other forms.
`Note: If a group of subblocks are to be transmitted, the above
`steps could be performed completely for each subblock
`35 before moving onto the next subblock. The steps could be
`applied to any subgroup of subblocks.
`23. In a further aspect of the invention, the invention
`provides a method for communicating a block X from one
`entity E1 to another entity E2, comprising the following
`40 steps:
`a. Partitioning X into subblocks Xl ... X n in accordance
`with one of aspects 1 to 6;
`b. Transmitting from E2 to E1 information communicat(cid:173)
`ing the presence or absence at E2 of members of a group Y
`45 or subblocks Y 1 . . . Ym;
`c. Transmitting from E1 to E2 the contents of zero or more
`subblocks in X, and the remaining subblocks as references
`to subblocks in Y 1 . . . Ym and to subblocks transmitted.
`24. In a further aspect of the invention, the invention
`50 provides a method for an entity E2 to communicate to an
`entity E1 the fact that E2 possesses a block Y, comprising the
`following steps:
`a. Partitioning Y into subblocks Y 1 . . . Ym in accordance
`with one or aspects 1 to 6;
`b. Transmitting from E2 to E1 references of the subblocks
`Y l · · · Y m ·
`25. In a further aspect of the invention, the invention
`provides a method for an entity E1 to communicate a
`subblock Xi to an entity E2, comprising the following steps:
`a. Partitioning X into subblocks Xl ... X n in accordance
`with one of aspects 1 to 6;
`b. Transmitting from E2 to E1 an identity of Xi;
`c. Transmitting Xi from E1 to E2.
`Note: This aspect applies (among other applications) to the
`65 case of a network server E1 that serves subblocks to clients
`such as E2, given the identities (e.g. hashes) of the requested
`subblocks.
`
`60
`
`Oracle Exhibit 1006, pg 30
`
`

`

`5,990,810
`
`5
`
`7
`26. In a further aspect of the invention, the invention
`provides a method according to one of aspects 1 to 6,
`wherein said subblocks are compared by comparing the
`hashes of said subblocks.
`27. In a further aspect of the invention, the invention
`provides a method according to one of aspects 1 to 6,
`wherein subsets of identical subblocks within a group of one
`or more subblocks are found, by inserting each subblock, an
`identity of each subblock, a reference of each subblock, or
`a hash of each subblock, into a data structure.
`28.
`In further aspect of the invention,
`the invention
`provides an apparatus for organizing a block b of digital data
`for the purpose of storage, communication, or comparison,
`by partitioning said block into subblocks at one or more
`positions klk+1 within said block for which b[k-A+1 ...
`k+B] satisfies a predetermined constraint, where A and Bare
`natural numbers.
`Note: The specification of this aspect encompasses the
`degenerate case in which either A or B is zero. The speci(cid:173)
`fication also includes the case where the constraint does not
`pay attention

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket