`
`
`
`
`UNIFIED PATENTS
`
`EXHIBIT 1015
`
`Unfied Patents Exhibit 1015
`Pg. 1
`
`
`
`TR-CS-96-05
`
`The rsync algorithm
`
`Andrew Tridgell and Paul Mackerras
`
`June 1996
`
`Joint Computer Science Technical Report Series
`Department of Computer Science
`Faculty of Engineering and Information Technology
`Computer Sciences Laboratory
`Research School of Information Sciences and Engineering
`
`Unfied Patents Exhibit 1015
`Pg. 2
`
`
`
`This technical report series is published jointly by the Department of
`Computer Science, Faculty of Engineering and Information Technology,
`and the Computer Sciences Laboratory, Research School of Information
`Sciences and Engineering, The Australian National University.
`Please direct correspondence regarding this series to:
`
`Technical Reports
`Department of Computer Science
`Faculty of Engineering and Information Technology
`The Australian National University
`Canberra ACT 0200
`Australia
`
`or send email to:
`
`Technical.Reports@cs.anu.edu.au
`
`A list of technical reports, including some abstracts and copies of some full
`reports may be found at:
`
`http://cs.anu.edu.au/techreports/
`
`Recent reports in this series:
`
`TR-CS-96-04 Peter Bailey and David Hawking. A parallel architecture for
`query processing over a terabyte of text. June 1996.
`TR-CS-96-03 Brendan D. McKay. autoson – a distributed batch system for
`unix workstation networks (version 1.3). March 1996.
`
`TR-CS-96-02 Richard P. Brent. Factorization of the tenth and eleventh Fermat
`numbers. February 1996.
`
`TR-CS-96-01 Weifa Liang and Richard P. Brent. Constructing the spanners
`of graphs in parallel. January 1996.
`
`TR-CS-95-08 David Hawking. The design and implementation of a parallel
`document retrieval engine. December 1995.
`
`TR-CS-95-07 Raymond H. Chan and Michael K. Ng. Conjugate gradient
`methods for Toeplitz systems. September 1995.
`
`Unfied Patents Exhibit 1015
`Pg. 3
`
`
`
`The rsync algorithm
`
`Paul Mackerras
`Andrew Tridgell
`Department of Computer Science
`Australian National University
`Canberra(cid:2) ACT (cid:2) Australia
`
`June (cid:2)
`
`Abstract
`
`This report presents an algorithm for updating a le on one machine
`to be identical to a le on another machine(cid:3) We assume that the two
`machines are connected by a low(cid:4)bandwidth high(cid:4)latency bi(cid:4)directional
`communications link(cid:3) The algorithm identi es parts of the source le
`which are identical to some part of the destination le(cid:5) and only sends
`those parts which cannot be matched in this way(cid:3) E ectively(cid:5) the algo(cid:4)
`rithm computes a set of di erences without having both les on the same
`machine(cid:3) The algorithm works best when the les are similar(cid:5) but will
`also function correctly and reasonably e ciently when the les are quite
`di erent(cid:3)
`
` The problem
`
`Imagine you have two les(cid:3) A and B(cid:3) and you wish to update B to be the same
`as A(cid:4) The obvious method is to copy A onto B(cid:4)
`Now imagine that the two les are on machines connected by a slow com(cid:5)
`munications link(cid:3) for example a dial up IP link(cid:4) If A is large(cid:3) copying A onto
`B will be slow(cid:4) To make it faster you could compress A before sending it(cid:3) but
`that will usually only gain a factor of to (cid:4)
`Now assume that A and B are quite similar(cid:3) perhaps both derived from the
`same original le(cid:4) To really speed things up you would need to take advantage
`of this similarity(cid:4) A common method is to send just the di erences between A
`and B down the link and then use this list of di erences to reconstruct the le(cid:4)
`The problem is that the normal methods for creating a set of di erences
`between two les rely on being able to read both les(cid:4) Thus they require that
`both les are available beforehand at one end of the link(cid:4) If they are not both
`available on the same machine(cid:3) these algorithms cannot be used once you had
`copied the le over(cid:3) you wouldn(cid:10)t need the di erences(cid:4) This is the problem
`that rsync addresses(cid:4)
`The rsync algorithm e ciently computes which parts of a source le match
`some part of an existing destination le(cid:4) These parts need not be sent across
`the link(cid:13) all that is needed is a reference to the part of the destination le(cid:4)
`Only parts of the source le which are not matched in this way need to be sent
`verbatim(cid:4) The receiver can then construct a copy of the source le using the
`references to parts of the existing destination le and the verbatim material(cid:4)
`
`
`
`Unfied Patents Exhibit 1015
`Pg. 4
`
`
`
`Trivially(cid:2) the data sent to the receiver can be compressed using any of a
`range of common compression algorithms(cid:2) for further speed improvements(cid:3)
`
` The rsync algorithm
`
`Suppose we have two general purpose computers and (cid:3) Computer has
`access to a le A and has access to le B(cid:2) where A and B are similar(cid:6)(cid:3)
`There is a slow communications link between and (cid:3)
`The rsync algorithm consists of the following steps(cid:7)
`
` (cid:3) splits the le B into a series of non(cid:9)overlapping xed(cid:9)sized blocks of size
`S bytes (cid:3) The last block may be shorter than S bytes(cid:3)
`
`(cid:3) For each of these blocks calculates two checksums(cid:7) a weak rolling(cid:6)
` (cid:9)bit checksum described below and a strong (cid:9)bit MD checksum(cid:3)
`
` (cid:3) sends these checksums to (cid:3)
`
`(cid:3) searches through A to nd all blocks of length S bytes at any o set(cid:2)
`not just multiples of S that have the same weak and strong checksum
`as one of the blocks of B(cid:3) This can be done in a single pass very quickly
`using a special property of the rolling checksum described below(cid:3)
`
`(cid:3) sends a sequence of instructions for constructing a copy of A(cid:3) Each
`instruction is either a reference to a block of B(cid:2) or literal data(cid:3) Literal
`data is sent only for those sections of A which did not match any of the
`blocks of B(cid:3)
`
`The end result is that gets a copy of A(cid:2) but only the pieces of A that are
`not found in B plus a small amount of data for checksums and block indexes
`are sent over the link(cid:3) The algorithm also only requires one round trip(cid:2) which
`minimises the impact of the link latency(cid:3)
`The most important details of the algorithm are the rolling checksum and
`the associated multi(cid:9)alternate search mechanism which allows the all(cid:9)o sets
`checksum search to proceed very quickly(cid:3) These will be discussed in greater
`detail below(cid:3)
`
` Rolling checksum
`
`The weak rolling checksum used in the rsync algorithm needs to have the prop(cid:9)
`erty that it is very cheap to calculate the checksum of a bu er X(cid:3)(cid:3)Xn(cid:3) given
`the checksum of bu er X (cid:3)(cid:3)Xn and the values of the bytes X and Xn(cid:3) (cid:3)
`The weak checksum algorithm we used in our implementation was inspired
`by Mark Adler(cid:18)s adler(cid:9) checksum(cid:3) Our checksum is de ned by
`
`Xi modM
`
`l X i
`
`(cid:4)k
`
`ak(cid:4) l (cid:19)
`
` We have found that values of S between and are quite good for most purposes
`
`
`
`Unfied Patents Exhibit 1015
`Pg. 5
`
`
`
`l (cid:0) i (cid:4) Xi modM
`
`l X i
`
`(cid:0)k
`
`bk(cid:2) l (cid:3)
`
`sk(cid:2) l (cid:3)ak(cid:2) l (cid:4) bk(cid:2) l
`
`where sk(cid:2) l is the rolling checksum of the bytes Xk (cid:3) (cid:3) (cid:3) Xl(cid:7) For simplicity
`and speed(cid:8) we use M (cid:3) (cid:7)
`The important property of this checksum is that successive values can be
`computed very e ciently using the recurrence relations
`
`ak (cid:4) (cid:2) l (cid:4) (cid:3) ak(cid:2) l (cid:0) Xk (cid:4) Xl(cid:4) mod M
`
`bk (cid:4) (cid:2) l (cid:4) (cid:3) bk(cid:2) l (cid:0) l (cid:0) k (cid:4) Xk (cid:4) ak (cid:4) (cid:2) l (cid:4) mod M
`
`Thus the checksum can be calculated for blocks of length S atall possible
`o sets within a le in a rolling(cid:13) fashion(cid:8) with very little computation at each
`point(cid:7)
`Despite its simplicity(cid:8) this checksum was found to be quite adequate as a
` rst level check for a match of two le blocks(cid:7) We have found in practice that
`the probability of this checksum matching when the blocks are not equal is quite
`low(cid:7) This is important because the much more expensive strong checksum must
`be calculated for each block where the weak checksum matches(cid:7)
`
` Checksum searching
`
`Once has received the list of checksums of the blocks of B(cid:8) it must search A
`for any blocks at any o set that match the checksum of some block of B(cid:7) The
`basic strategy is to compute the (cid:15)bit rolling checksum for a block of length S
`starting at each byte of A in turn(cid:8) and for each checksum(cid:8) search the list for a
`match(cid:7) To do this our implementation uses a simple level searching scheme(cid:7)
`The rst level uses a (cid:15)bit hash of the (cid:15)bit rolling checksum and a
`entry hash table(cid:7) The list of checksum values i(cid:7)e(cid:7)(cid:8) the checksums from the blocks
`of B is sorted according to the (cid:15)bit hash of the (cid:15)bit rolling checksum(cid:7) Each
`entry in the hash table points to the rst element of the list for that hash value(cid:8)
`or contains a null value if no element of the list has that hash value(cid:7)
`At each o set in the le the (cid:15)bit rolling checksum and its (cid:15)bit hash are
`calculated(cid:7) If the hash table entry for that hash value is not a null value(cid:8) the
`second level check is invoked(cid:7)
`The second level check involves scanning the sorted checksum list starting
`with the entry pointed to by the hash table entry(cid:8) looking for an entry whose
` (cid:15)bit rolling checksum matches the current value(cid:7) The scan terminates when
`it reaches an entry whose (cid:15)bit hash di ers(cid:7) If this search nds a match(cid:8) the
`third level check is invoked(cid:7)
`The third level check involves calculating the strong checksum for the current
`o set in the le and comparing it with the strong checksum value in the current
`list entry(cid:7) If the two strong checksums match(cid:8) we assume that we have found
`a block of A which matches a block of B(cid:7) In fact the blocks could be di erent(cid:8)
`but the probability of this is microscopic(cid:8) and in practice this is a reasonable
`assumption(cid:7)
`When a match is found(cid:8) sends the data in A between the current le
`o set and the end of the previous match(cid:8) followed by the index of the block in
`
`
`
`Unfied Patents Exhibit 1015
`Pg. 6
`
`
`
`B that matched(cid:2) This data is sent immediately a match is found(cid:3) which allows
`us to overlap the communication with further computation(cid:2)
`If no match is found at a given o set in the le(cid:3) the rolling checksum is
`updated to the next o set and the search proceeds(cid:2) If a match is found(cid:3) the
`search is restarted at the end of the matched block(cid:2) This strategy saves a
`considerable amount of computation for the common case where the two les
`are nearly identical(cid:2)
`In addition(cid:3) it would be a simple matter to encode the
`block indexes as runs(cid:3) for the common case where a portion of A matches a
`series of blocks of B in order(cid:2)
`
` Pipelining
`
`The above sections describe the process for constructing a copy of one le on
`a remote system(cid:2) If we have a several les to copy(cid:3) we can gain a considerable
`latency advantage by pipelining the process(cid:2)
`This involves initiating two independent processes(cid:2) One of the processes
`generates and sends the checksums to while the other receives the di erence
`information from and reconstructs the les(cid:2)
`If the communications link is bu ered then these two processes can proceed
`independently and the link should be kept fully utilised in both directions for
`most of the time(cid:2)
`
` Results
`
`To test the algorithm(cid:3) tar les were created of the Linux kernel sources for two
`versions of the kernel(cid:2) The two kernel versions were (cid:2) (cid:2) and (cid:2) (cid:2) (cid:2) These
`tar les are approximately MB in size and are separated by released patch
`levels(cid:2)
`Out of the les in the (cid:2) (cid:2) release(cid:3) les had changed in the (cid:2) (cid:2)
`release(cid:3) les had been removed and les had been added(cid:2)
`A di (cid:13) of the two tar les using the standard GNU di utility produced
`over thousand lines of output totalling (cid:2) MB(cid:2)
`The following table shows the results for rsync between the two les with a
`varying block size(cid:2)
`
`block matches
`size
`
`tag
`hits
`
`false
`alarms
`
`data
`
`written read
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`In each case(cid:3) the CPU time taken was less than the time it takes to run
`di (cid:13) on the two les(cid:2)
`
`All the tests in this section were carried out using rsync version (cid:3)
` The wall clock time was approximately minutes per run on a MHz SPARC running
`SunOS(cid:7) using rsh over loopback for communication(cid:3) GNU di took about minutes(cid:3)
`
`
`
`Unfied Patents Exhibit 1015
`Pg. 7
`
`
`
`The columns in the table are as follows(cid:2)
`
`block size The size in bytes of the checksummed blocks(cid:3)
`
`matches The number of times a block of B was found in A(cid:3)
`
`tag hits The number of times the bit hash of the rolling checksum matched
`a hash of one of the checksums from B(cid:3)
`
`false alarms The number of times the bit rolling checksum matched but
`the strong checksum didn(cid:8)t(cid:3)
`
`data The amount of le data transferred verbatim(cid:10) in bytes(cid:3)
`
`written The total number of bytes written by including protocol overheads(cid:3)
`This is almost all le data(cid:3)
`
`read The total number of bytes read by including protocol overheads(cid:3) This
`is almost all checksum information(cid:3)
`
`The results demonstrate that for block sizes above bytes(cid:10) only a small
`fraction around of the le was transferred(cid:3) The amount transferred was
`also considerably less than the size of the di le that would have been trans(cid:17)
`ferred if the di patch method of updating a remote le was used(cid:3)
`The checksums themselves took up a considerable amount of space(cid:10) although
`much less than the size of the data transferred in each case(cid:3) Each pair of
`checksums consumes bytes(cid:2) bytes for the rolling checksum plus bytes
`for the (cid:17)bit MD checksum(cid:3)
`The number of false alarms was less than (cid:3) of the number of true
`matches(cid:10) indicating that the bit rolling checksum is quite good at screening
`out false matches(cid:3)
`The number of tag hits indicates that the second level of the checksum
`search algorithm was invoked about once every characters(cid:3) This is quite high
`because the total number of blocks in the le is a large fraction of the size of the
`tag hash table(cid:3) For smaller les we would expect the tag hit rate to be much
`closer to the number of matches(cid:3) For extremely large les(cid:10) we should probably
`increase the size of the hash table(cid:3)
`The next table shows similar results for a much smaller set of les(cid:3) In this
`case the les were not packed into a tar le rst(cid:3) Rather(cid:10) rsync was invoked
`with an option to recursively descend the directory tree(cid:3) The les used were
`from two source releases of another software package called Samba(cid:3) The total
`source code size is (cid:3) MB and the di between the two releases is lines
`long totalling kB(cid:3)
`
`block matches
`size
`
`
`
`
`
`
`
`
`
`
`
`
`tag
`hits
`
`
`
`
`
`
`false
`alarms
`
`
`
`
`
`
`data
`
`written read
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Unfied Patents Exhibit 1015
`Pg. 8
`
`
`
` Availability
`
`An implementation of rsync which provides a convenient interface similar to the
`common UNIX command rcp has been written and is available for download
`from ftp(cid:2)samba(cid:4)anu(cid:4)edu(cid:4)aupubrsync(cid:4)
`
`
`
`Unfied Patents Exhibit 1015
`Pg. 9
`
`