throbber
1
`
`2
`
`3
`
`4
`
`5
`
`6
`
`7
`
`8
`
`9
`
`10
`
`11
`
`12
`
`13
`
`14
`
`15
`
`16
`
`17
`
`18
`
`19
`
`20
`
`IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
`
`FILED VIA THE PATENT REVIEW PROCESSING SYSTEM
`
`In re Inter Partes Review of:
`U.S. Patent No. 6,757,717
`
`Issued: June 29, 2004
`
`Applicant: Leonid Goldstein
`
`Application No. 09/398,007
`
`Filed: September 16, 1999
`
`Title: System And Method For Data
`
`Access
`
`Currently in Litigation Styled:
`Proxyconn Inc. v. Microsoft
`Corporation, et al., Central District
`of California, Case No. SA CV11-
`1681 DOC (ANx) [Consolidated
`With Case Nos. SA CV11-1682
`DOC (ANx), SA CV11-1683 DOC
`(ANx), and SA CV11-1684 DOC
`(ANx)]
`
`
`
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`)
`
`Declaration of Professor Darrell D. E. Long
`Regarding U.S. Patent No. 6,757,717
`
`I.
`
`QUALIFICATIONS
`
`I am a Professor of Computer Science and have served as Associate
`
`Dean for Research and Graduate Studies in the Jack Baskin School of
`
`Engineering at the University of California at Santa Cruz. I hold the Kumar
`
`Malavalli Endowed Chair of Storage Systems Research and I am the Director
`
`1
`
`MICROSOFT
`
`EXHIBIT 1007
`
`

`
`
`
`
`
`1
`
`of the Storage Systems Research Center, an internationally recognized center
`
`2
`
`of excellence in data storage. I am also the Director of the Working-group on
`
`3
`
`Applied Security and Privacy (WASP), the laboratory at the University of
`
`4
`
`California at Santa Cruz that studies computer security. I teach graduate and
`
`5
`
`undergraduate courses in computer security, operating systems, data storage
`
`6
`
`and have taught courses in networking and distributed systems. I received my
`
`7
`
`B.S. degree in Computer Science from San Diego State University, and my
`
`8
`
`M.S. and Ph.D. from the University of California, San Diego. I am a Fellow
`
`9
`
`of the Institute of Electrical and Electronics Engineers and of the American
`
`10
`
`Association for the Advancement of Science. My research interests include
`
`11
`
`data storage systems, operating systems, computer security, distributed
`
`12
`
`systems and networking. My qualifications are further described in my
`
`13
`
`appended Curriculum Vitae.
`
`14
`
`
`
`I have published numerous papers including in the ACM Transactions
`
`15
`
`on Storage, and various IEEE journals, and I am the co-author of two books.
`
`16
`
`These publications are listed in Exhibit A. I am the founder of the premier
`
`17
`
`conference in the data storage field known as the Symposium on File Storage
`
`18
`
`Technologies (“FAST”). I have participated in organizing numerous
`
`19
`
`academic conferences including:
`
`20
`
`2
`
`

`
`
`
`
`
`1
`
`2
`
`2012:
`
`Steering Committee: Petascale Data Storage Workshop (PDSW),
`
`3
`
`Symposium on Modeling, Analysis and Simulation of Computer and
`
`4
`
`Telecommunication Systems (MASCOTS), Symposium on File and Storage
`
`5
`
`Systems Technology (FAST).
`
`6
`
`Program Committee: Symposium on File and Storage Systems
`
`7
`
`Technology (FAST).
`
`8
`
`9
`
`2011:
`
`Steering Committee: Petascale Data Storage Workshop (PDSW),
`
`10
`
`Symposium on Modeling, Analysis and Simulation of Computer and
`
`11
`
`Telecommunication Systems (MASCOTS), Symposium on File and Storage
`
`12
`
`Systems Technology (FAST).
`
`13
`
`Program Committee: Symposium on Modeling, Analysis and
`
`14
`
`Simulation of Computer and Telecommunication Systems (MASCOTS).
`
`15
`
`16
`
`2010:
`
`Program Chair: Symposium on Modeling, Analysis and Simulation of
`
`17
`
`Computer and Telecommunication Systems (MASCOTS).
`
`18
`
`Steering Committee: Petascale Data Storage Workshop (PDSW),
`
`19
`
`Symposium on Modeling, Analysis and Simulation of Computer and
`
`20
`
`3
`
`

`
`
`
`
`
`1
`
`Telecommunication Systems (MASCOTS), Symposium on File and Storage
`
`2
`
`Systems Technology (FAST).
`
`3
`
`4
`
`2009:
`
`Program Committee: International Workshop on Software Support for
`
`5
`
`Portable Storage (IWSSPS), Inaugural International Conference on
`
`6
`
`Virtualization and Cloud Computing, Symposium on Modeling, Analysis and
`
`7
`
`Simulation of Computer and Telecommunication Systems (MASCOTS),
`
`8
`
`Petascale Data Storage Workshop (PDSW).
`
`9
`
`10
`
`11
`
`Program Chair: Web Information Systems Engineering (WISE).
`
`General Chair: Symposium on Applications and the Internet (SAINT).
`
`Steering Committee: Symposium on Modeling, Analysis and
`
`12
`
`Simulation of Computer and Telecommunication Systems (MASCOTS),
`
`13
`
`Symposium on File and Storage Systems Technology (FAST).
`
`14
`
`I have also consulted for industry in the area of storage systems
`
`15
`
`including for Hewlett-Packard Laboratories and IBM. I have also been a
`
`16
`
`consultant to numerous government agencies.
`
`17
`
`II. COMPENSATION
`
`18
`
`I am being compensated by counsel for Microsoft at my compensation
`
`19
`
`rate of $500/hour for consulting and $600/hour for testimony in deposition or
`
`20
`
`4
`
`

`
`
`
`
`
`1
`
`trial, plus reimbursement for reasonably incurred expenses. I have no interest
`
`2
`
`in the outcome of the related litigation or this proceeding.
`
`3
`
`4
`
`III. SUMMARY OF MY STUDY AND CONCLUSIONS
`
`I have read U.S. Patent No. 6,757,717. The patent concerns
`
`5
`
`technology within my areas of expertise. I have considered the patent’s
`
`6
`
`disclosures from the perspective of a person of ordinary skill in the art in
`
`7
`
`1998-99.
`
`8
`
`I have studied the following references and considered them from the
`
`9
`
`perspective of the person of ordinary skill in the art in 1998-99.
`
`10
`
`Perlman: Radia J. Perlman et al., U.S. Patent No. 5,742,820,
`
`11
`
`“Mechanism for Efficiently Synchronizing Information Over a Network,”
`
`12
`
`issued Apr. 21, 1998 (“’820” or “Perlman”).
`
`13
`
`Yohe: Thomas Patrick Yohe et al., U.S. Patent No. 5,835,943,
`
`14
`
`“Apparatus and Method for Increased Data Access in a Network File
`
`15
`
`Oriented Caching System,” issued Nov. 10, 1998 on an application filed July
`
`16
`
`3, 1997 (“’943” or “Yohe”).
`
`17
`
`Santos: Jonathan Santos et al., “USENIX, Increasing Effective Link
`
`18
`
`Bandwidth by Suppressing Replicated Date,” Proceedings of the USENIX
`
`19
`
`Annual Technical Conference (NO 98) New Orleans, Louisiana, June 1998
`
`20
`
`(“Santos”).
`
`5
`
`

`
`
`
`
`
`1
`
`I have compared these references to claims 1, 3, 10-12, 14 and 22-24
`
`2
`
`of the ’717 patent. I also have reviewed Proxyconn’s arguments (in an
`
`3
`
`“interrogatory” response) attempting to distinguish the claims from these
`
`4
`
`references.
`
`5
`
`I have considered the perspective of the person of ordinary skill in the
`
`6
`
`art in 1998-99 (defined below) who was designing a system in which
`
`7
`
`dynamic data is sent over a network from some source and is stored by a
`
`8
`
`receiver computer for possible later reuse, and having a design goal of
`
`9
`
`reducing the transmission of redundant data over the network. I have
`
`10
`
`considered such a person having read Perlman with an eye toward using and
`
`11
`
`possibly expanding on its teachings. I have also considered such a person
`
`12
`
`having read Yohe and looking to use and possibly expand on its teachings.
`
`13
`
`These nine ’717 patent claims recite nothing innovative compared to
`
`14
`
`these references. Perlman discloses to the person of ordinary skill in the art
`
`15
`
`in 1998-99 everything required by these claims. Yohe likewise discloses
`
`16
`
`everything, except for bundling multiple digital fingerprints in the same
`
`17
`
`message, which Perlman does disclose. Also, the natural combination of
`
`18
`
`Perlman and Yohe taught the person of skill in the art the combinations
`
`19
`
`claimed in these claims. In other words, the person of skill in the art would
`
`20
`
`have already possessed the claimed subject matter upon reading Perlman and
`
`6
`
`

`
`
`
`
`
`1
`
`Yohe, combined with their knowledge of the conventional technology in the
`
`2
`
`field, as explained below.
`
`3
`
`Also, the patent is internally inconsistent and unclear on the meaning
`
`4
`
`of “digital digest,” as explained below. For purposes of my comparison of
`
`5
`
`the claims to the prior art references, however, I have assumed that this term
`
`6
`
`includes a fixed-size digital fingerprint (e.g., hash, message digest, signature
`
`7
`
`or identifier), of 32 or more bits, calculated using an MD5 and/or CRC
`
`8
`
`algorithm and calculated on arbitrary-size data, such that it represents and
`
`9
`
`depends only on the content of that data.
`
`10
`
`IV. FIELD OF THE INVENTION
`
`11
`
`The ’717 patent defines its “field of the invention” as accessing data in
`
`12
`
`communication networks. (’717, 1:10-15). The field also includes the areas
`
`13
`
`of distributed data storage systems and networking, coding theory including
`
`14
`
`error detection and correction codes, and cryptographic hash functions
`
`15
`
`commonly called message digest functions. These were all mature fields in
`
`16
`
`1998-99.
`
`17
`
`V. LEVEL OF SKILL IN THE ART IN 1998-99
`
`18
`
`A person of ordinary skill in this art in 1998-99 would hold a B.S.
`
`19
`
`degree in computer science and would have as part of his study courses in
`
`20
`
`operating systems, networking, data compression and computer security.
`
`7
`
`

`
`
`
`
`
`1
`
`These studies would include the storage subsystem of computer operating
`
`2
`
`systems which is covered briefly in most undergraduate operating systems
`
`3
`
`courses, but few require the student to examine actual source code. In
`
`4
`
`addition he would have several years of practical experience working in
`
`5
`
`operating systems, in particular the data storage subsystem.
`
`6
`
`As a result, actual experience in working with this operating system
`
`7
`
`subsystem would normally occur after several years of experience working
`
`8
`
`for a company with a focus on systems software.
`
`9
`
`Alternatively, a person would develop the level of ordinary skill in the
`
`10
`
`art in 1998-99 by obtaining an M.S. in computer science and by writing his or
`
`11
`
`her thesis in an area related to data storage and/or computer security.
`
`12
`
`A person of ordinary skill in the art would understand network
`
`13
`
`protocols. This was normally part of undergraduate programs in computer
`
`14
`
`science in 1998-99. A person of ordinary skill in the art would also
`
`15
`
`understand coding theory; in particular error detection and correction codes,
`
`16
`
`as well as cryptographic hash functions and message digest functions.
`
`17
`
`Introduction to basic hash functions is a normal part of most undergraduate
`
`18
`
`curricula, but coding theory is normally part of specialized courses (although
`
`19
`
`it is commonly part of electrical engineering programs), and cryptographic
`
`20
`
`8
`
`

`
`
`
`
`
`1
`
`hash functions would normally be taught only in courses in computer
`
`2
`
`security.
`
`3
`
`I have first-hand experience teaching and working with such persons of
`
`4
`
`ordinary skill in the art. For example, I have taught students having about that
`
`5
`
`level of skill in this art since at least as early as 1990.
`
`VI. PERLMAN AND YOHE TEACH
`
`THE CLAIMED SUBJECT MATTER
`
`A.
`
`Perlman and Yohe Are A Natural Combination
`
`There are multiple reasons why a PHOSITA looking to use Perlman
`
`would have looked to the teachings of Yohe and, conversely, one looking to
`
`use Yohe would have looked to the teachings of Perlman.
`
`Same Problem: First, each reference addresses the same specific
`
`communications problem addressed in the ’717 patent: the desire to reduce
`
`redundancy in network data transmissions where dynamic data previously
`
`sent over the network has been stored by the receiver for possible later reuse.
`
`Each addresses how to validate that stored (cached) copy to confirm that it
`
`remains synchronized with the source data at the sender. “Another object of
`
`the present invention is to decrease data traffic throughout the network.”
`
`(’717, 1:61-65). “[I]t is among the objects of the present invention to reduce
`
`the bandwidth consumed by transmission of database summary information
`
`6
`
`7
`
`8
`
`9
`
`10
`
`11
`
`12
`
`13
`
`14
`
`15
`
`16
`
`17
`
`18
`
`19
`
`20
`
`9
`
`

`
`
`
`
`
`1
`
`packets over a computer network.” (’820, 3:52-55). “If a copy of the data
`
`2
`
`can be stored in the permanent storage memory of the remote client computer
`
`3
`
`and also verified to be current when it is subsequently retrieved, this will
`
`4
`
`improve performance significantly.” (’943, 4:32-40). This common problem
`
`5
`
`would have motivated a PHOSITA to consider these references together.
`
`6
`
`Same Algorithm: Second, each reference proposes the same
`
`7
`
`algorithmic solution to this common problem. Each uses the identity-check
`
`8
`
`property of message digest fingerprints or CRC values. Each transmits those
`
`9
`
`fingerprints from sender to receiver for the receiver to compare to its own
`
`10
`
`digital fingerprint values. Each returns a negative indication from receiver to
`
`11
`
`sender if there is no match, upon which the sender returns the underlying
`
`12
`
`data. This common algorithmic solution would have motivated a PHOSITA
`
`13
`
`to consider these references together.
`
`14
`
`Overlapping Applications: Third, each reference teaches that its
`
`15
`
`techniques can be used with network routers. Yohe discloses that its cache
`
`16
`
`verifying computer and communications server may be formed integrally
`
`17
`
`together (’943, claim 7) and that its communication server could be a Cisco
`
`18
`
`or other router (id., 4:52-56). Perlman’s illustrative embodiment is a router.
`
`19
`
`This overlap of a common application would have motivated a PHOSITA to
`
`20
`
`consider these references together.
`
`10
`
`

`
`
`
`
`
`1
`
`Perlman’s Express Direction: Fourth, Perlman expressly discloses and
`
`2
`
`claims that its mechanisms are not limited to routers but apply to any network
`
`3
`
`nodes (which includes Perlman’s source nodes and destination nodes (’820,
`
`4
`
`5:41-46, Fig. 2)) needing such synchronization between their database
`
`5
`
`contents. (’820, Abstract, 1:1-9, 3:61-4:4, 7:12-22, 8:52-9:2, claims 1, 4-6, 8,
`
`6
`
`10.) This broad teaching would have motivated a PHOSITA to consider
`
`7
`
`these references together.
`
`8
`
`Nothing Incompatible: Finally, nothing in either of these references
`
`9
`
`teaches away from their combination. In particular, as explained below, the
`
`10
`
`few implementation differences between these references are mere design
`
`11
`
`choices arising from the particular applications being illustrated. There is no
`
`12
`
`essential feature in Perlman in conflict with an essential feature of Yohe.
`
`B.
`
`
`The Designer Naturally Would Have
`Combined Perlman And Yohe In The Claimed Manner
`
`Perlman discloses everything recited in ’717 patent claims 1, 3, 10-12
`
`and 22-24. Perlman discloses the only element in any challenged claim
`
`arguably missing from Yohe (bundling plural “digital digests” in the same
`
`message, per claim 24). Yohe discloses the only element in any challenged
`
`claim missing from Perlman’s illustrative embodiment (searching for “digital
`
`digests” in the receiver’s permanent memory, per claim 23). As noted, this
`
`13
`
`14
`
`15
`
`16
`
`17
`
`18
`
`19
`
`20
`
`11
`
`

`
`
`
`
`
`1
`
`permanent-memory element also is disclosed in Perlman’s broader
`
`2
`
`disclosure. I discuss below four claim features that Proxyconn might argue
`
`3
`
`would be missing from the combination of Perlman and Yohe.
`
`4
`
`5
`
`Use of Receiver’s Permanent Memory:
`
`Perlman discloses that the receiver computer has a hard disk or the like
`
`6
`
`for persistent storage of data, which the ’717 patent refers to as permanent
`
`7
`
`memory. Specifically, Perlman discloses that its illustrative-embodiment
`
`8
`
`router is a “general-purpose” computer with an operating system, only
`
`9
`
`“portions of which” are resident in RAM (’820, 5:41-52, Fig. 2). A person of
`
`10
`
`ordinary skill in the art in 1998-99 would have understood this term “general-
`
`11
`
`purpose computer” to refer to a computer having, among other things, a hard
`
`12
`
`disk or the like for storing persistently data and application programs. Any
`
`13
`
`“general-purpose computer” in 1998-99 was capable of storing data (e.g., a
`
`14
`
`digital fingerprint) in its permanent memory, e.g., as part of its operating
`
`15
`
`system or an application program or data files, etc. And, the person of
`
`16
`
`ordinary skill would have understood that the portion of the operating system
`
`17
`
`in Perlman not resident in RAM would necessarily have been stored on disk.
`
`18
`
`Further, Perlman expressly discloses that its mechanism applies to
`
`19
`
`“any type” of distributed network system needing to synchronize the database
`
`20
`
`on one node with that of another. (’820, 8:52-9:2). The person of skill in the
`
`12
`
`

`
`
`
`
`
`1
`
`art in 1998-99 would have understood that this “any type” includes the Yohe
`
`2
`
`system and other such systems whose computers have a hard disk or other
`
`3
`
`persistent storage.
`
`4
`
`Perlman’s illustrative-embodiment general-purpose computer is
`
`5
`
`running a link state routing protocol requiring rapid propagation of routing
`
`6
`
`table updates, and thus naturally stores its digital fingerprints in RAM, not in
`
`7
`
`the receiver’s permanent memory (see ’717, claim 10), and searches for data
`
`8
`
`with a particular digital fingerprint in the receiver’s RAM not its permanent
`
`9
`
`memory (see ’717, claim 23). But, as discussed, Perlman expressly discloses
`
`10
`
`that its mechanism is not restricted to this network routing application but
`
`11
`
`instead applies to “any type” of distributed system requiring synchronization
`
`12
`
`of databases on nodes of a network. This “any type” includes systems using
`
`13
`
`a cache stored on disk or other permanent-storage device.
`
`14
`
`While Perlman’s illustrative router embodiment stores its cached data
`
`15
`
`in RAM, Yohe’s receiver stores its cache data in part on disk. This difference
`
`16
`
`naturally follows from their different illustrative applications. A skilled
`
`17
`
`designer knew in 1998-99 that a cache could be stored in volatile (non-
`
`18
`
`persistent) memory, or persistent memory, or in both types of memory.
`
`19
`
`Those are the only choices. Often, a general-purpose computer had both
`
`20
`
`persistent and non-persistent memory, as in Yohe, for example. Each choice
`
`13
`
`

`
`
`
`
`
`1
`
`has tradeoffs versus the other, and the particular application typically
`
`2
`
`recommends use of one or the other, or both. For example, read/write speed
`
`3
`
`is of utmost importance in the routing table synchronization application of
`
`4
`
`Perlman’s illustrative embodiment, not storage capacity, and routers typically
`
`5
`
`are rebooted only rarely, so its cache naturally was in volatile RAM. On the
`
`6
`
`other hand, Yohe’s illustrative application of synchronizing entire file
`
`7
`
`systems (on client computers that might be rebooted frequently),
`
`8
`
`recommended the greater storage capacity and persistence of storing the
`
`9
`
`cache on disk not solely in RAM.
`
`10
`
`As noted, Perlman expressly discloses that its mechanism is not limited
`
`11
`
`to routers but instead applies to any network node in a distributed system that
`
`12
`
`needs to synchronize databases on the various nodes. (’820, 8:52-9:2).
`
`13
`
`(Perlman uses “database” in a broad sense. The illustrative “database” in
`
`14
`
`Perlman is a routing table.) Consistent with this teaching, at least five claims
`
`15
`
`of Perlman (claims 1 and 4-7) broadly apply to any nodes of a computer
`
`16
`
`network. Therefore, it not only would be natural and commonsense to use
`
`17
`
`Perlman’s mechanisms with nodes using a persistent-memory (as in Yohe),
`
`18
`
`Perlman itself teaches doing so.
`
`19
`
`20
`
`This conclusion that a person of skill in the art would naturally have
`
`applied Perlman’s teachings to Yohe and similar applications using persistent
`
`14
`
`

`
`
`
`
`
`1
`
`memory for cache, is bolstered by the ’717 patent itself. The patent notes that
`
`2
`
`the receiver uses its permanent memory or its first memory or both for its
`
`3
`
`cache (’717, 7:27-29) and contains no disclosure of any particular functional
`
`4
`
`or structural differences or benefits between these options. In other words,
`
`5
`
`the ’717 patent does not suggest some unexpected results from using the
`
`6
`
`identity-check algorithm with data stored in a persistent cache vs. non-
`
`7
`
`persistent cache. The mathematical algorithms and network transmission
`
`8
`
`mechanisms are the same whether the cached data survives a reboot (i.e., is
`
`9
`
`on disk) or not (i.e., is only in RAM).
`
`10
`
`11
`
`Bundling Plural Fingerprints In Single Message:
`
`Perlman discloses bundling two or more levels of signatures in a single
`
`12
`
`message—e.g., a high-level signature of a data set and lower-level signatures
`
`13
`
`of fragments of the data set. This is done so that the receiver can check for
`
`14
`
`matches for each fragment and request only the unmatched fragments. (’820,
`
`15
`
`8:7-42, 8:52-9:2, Fig. 7). Yohe does not describe this element, but Perlman
`
`16
`
`does.
`
`17
`
`Just as Perlman saw a benefit of bundling multiple digital fingerprints
`
`18
`
`for multiple data items in the same message, so too would have the person
`
`19
`
`having ordinary skill in the art in 1998-99 starting with the Yohe design.
`
`20
`
`This would have been a natural and commonsense variation on Yohe, for a
`
`15
`
`

`
`
`
`
`
`1
`
`person having ordinary skill in the art in 1998-99, for at least the following
`
`2
`
`reasons.
`
`3
`
`First, both Yohe’s receiver and sender already have the capability of
`
`4
`
`performing this step of calculating and bundling multiple signatures together.
`
`5
`
`Specifically, each has a block signature generator capable of calculating such
`
`6
`
`fragment signatures. (’943, Fig. 2).
`
`7
`
`Second, Yohe’s sender already calculates CRC or MD5 signatures of
`
`8
`
`each block of a directory. (’943, Fig. 16).
`
`9
`
`Third, Yohe’s receiver has a comparator capable of comparing CRC or
`
`10
`
`MD5 signatures, whatever data may have been represented by those
`
`11
`
`signatures. (’943, Fig. 2). Thus, the only change to Yohe would be for its
`
`12
`
`sender to transmit the block signatures it calculates, together with the
`
`13
`
`directory signatures, something it was already capable of doing.
`
`14
`
`Fourth, the motivation and teaching to make this minor change is
`
`15
`
`expressed in Perlman, which teaches that sending multiple levels of
`
`16
`
`identifiers in the same message “conserves processing resources.” (’820,
`
`17
`
`8:43-51).
`
`18
`
`Fifth, as noted, Perlman expressly teaches that its mechanism may be
`
`19
`
`used in “any type” of distributed system requiring synchronization of
`
`20
`
`16
`
`

`
`
`
`
`
`1
`
`databases stored on the nodes of a network (’820, 8:52-9:2), which disclosure
`
`2
`
`encompasses Yohe’s distributed file system application.
`
`3
`
`Thus, a person having ordinary skill in the art in 1998-99 would have
`
`4
`
`combined this element of Perlman with Yohe in the way recited in the
`
`5
`
`challenged claims, would have had a reasonable expectation of success from
`
`6
`
`that combination, and that combination would have yielded the predictable
`
`7
`
`result explained in Perlman. For the above reasons, this combination of
`
`8
`
`Perlman and Yohe in this manner is no more than a predictable use of prior
`
`9
`
`art elements according to their established functions. A system designer of
`
`10
`
`ordinary skill wishing to improve the efficiency of network communications
`
`11
`
`would have seen a benefit to sending and comparing multiple levels of digital
`
`12
`
`fingerprints in the same message.
`
`13
`
`14
`
`Length of the Digital Fingerprint:
`
`Yohe’s illustrative embodiment uses 32-bit and 64-bit signatures. A
`
`15
`
`person of skill in the art, whose anticipated applications called for use of
`
`16
`
`longer fingerprints, would naturally have used a 128-bit fingerprint instead,
`
`17
`
`as in Perlman. First, Yohe itself states that a 128-bit MD5 signature could be
`
`18
`
`used. Second, if a particular application required a greater length of signature
`
`19
`
`to avoid collisions, then the skilled designer would have had every reason to
`
`20
`
`do that to avoid collisions, again as taught in Perlman and Yohe (and Santos).
`
`17
`
`

`
`
`
`
`
`1
`
`2
`
`Network Cache Memory:
`
`Although Perlman does not use the word “cache,” both Perlman and
`
`3
`
`Yohe disclose this element. Each discloses storing data in a logical part of
`
`4
`
`memory or storage storing copies of data received from the network, for
`
`5
`
`possible re-use. Further, a person of skill in the art would have been familiar
`
`6
`
`with the local and proxy caching prior art discussed in the ’717 patent (’717,
`
`7
`
`Figs. 1-2, 1:18-26), which required that the cached data be “verified to be
`
`8
`
`valid.” Perlman and Yohe each teaches a technique for verifying that saved
`
`9
`
`data previously received over the network is still valid, i.e, still the same as
`
`10
`
`the counterpart data at the data’s source. A skilled designer in 1998-99
`
`11
`
`looking to verify such data validity while reducing the amount of network
`
`12
`
`traffic naturally would have looked to Yohe and Perlman—each of which is
`
`13
`
`directed to the same design objective. This combination was particularly
`
`14
`
`natural given that some prior art proxy caches already used MD5 hashes on
`
`15
`
`the contents of a data object as its index for the cache’s contents, as disclosed
`
`16
`
`in, e.g., USP 6,128,623.
`
`17
`
`18
`
`19
`
`20
`
`18
`
`

`
`
`
`
`
`1
`
`2
`
`3
`
`4
`
`5
`
`6
`
`7
`
`8
`
`9
`
`10
`
`11
`
`12
`
`13
`
`14
`
`15
`
`16
`
`17
`
`18
`
`19
`
`20
`
`VII. ’717 PATENT IS UNCLEAR ON
`
`MEANING OF “DIGITAL DIGEST”
`
`A. Hash Function And Message Digest
`
`A function is a computation that takes a value, say x, and maps it onto
`
`another value y. The value x is chosen from a set called the domain and y is
`
`chosen from a set called the range. A function may be one-to-one where for
`
`each y there is exactly one x, or many-to-one where many x1, x2, …, may all
`
`map to the same y. A hash function f is one that takes a key k, and maps it to
`
`exactly one of at most n values. That is, Ͳ ൑ ݂ሺ݇ሻ ൏ ݊.
`
`A message digest H is a form of one-way hash function that takes a
`
`message M of arbitrary length and produces a hash value of a fixed number of
`
`bits (“bit” stands for binary digit and has a value of 0 or 1.) If ܪሺܯሻ ൌ ݄
`then ݄ ൏ ʹ௠ where m is the length in bits of the hash value of this hash
`
`function. Widely used message digest functions in 1998-99 included MD5,
`
`which produces 128 bits and SHA-1 that produces 160 bits. CRC-32 and
`
`CRC-64 can be thought of as particularly small message digest functions,
`
`although their original purpose was error detection and, as discussed below,
`
`they are not collision resistant.
`
`19
`
`

`
`
`
`
`
`1
`
`2
`
`B. Mathematical Properties Of A Message Digest
`
`Just as human fingerprints are expected to be unique to individuals, the
`
`3
`
`message digest of a given message (e.g., a string or block of data) ideally is
`
`4
`
`unique to that message. The ability of a hash function to produce a unique
`
`5
`
`fingerprint of the message depends in part on the number of bits that the
`
`6
`
`message digest produces and on the number of messages that may be hashed.
`
`7
`
`If the number of messages is sufficiently large, and the number of bits
`
`8
`
`produced by the message digest function insufficiently low, then a collision
`
`9
`
`will occur and multiple messages will produce the same hash value.
`
`10
`
`Message digest functions have the following additional properties that
`
`11
`
`make them one-way and thus desirable for a designed purpose (see Bruce
`
`12
`
`Schneier, Applied Cryptography (2nd ed. 1996) at p. 429 (Exhibit C)):
`
`13
`
`14
`
`15
`
`16
`
`17
`
`18
`
`19
`
`20
`
`1.
`
`Given a message M, it is easy to compute h. The computing of a
`
`message digest must be efficient, since they are typically applied to large
`
`amounts of data.
`
`2.
`
`Given h, it is hard to compute M such that H(M) = h. It must be
`
`computationally hard, that is nothing short of exhaustive search, to find
`
`the original message M given h. In fact, since M contains more bits than
`
`h, information is lost and so there are many messages ܯԢ that hash to the
`
`same h.
`
`20
`
`

`
`
`
`
`
`1
`
`2
`
`3
`
`4
`
`5
`
`6
`
`7
`
`8
`
`9
`
`10
`
`11
`
`12
`
`13
`
`14
`
`15
`
`16
`
`17
`
`18
`
`19
`
`20
`
`Given M, it is hard to find another ܯԢ such that ܪሺܯሻ ൌ
`3.
` ܪሺܯᇱሻ. Even though there are many ܯԢ that hash to the same h, m (the
`
`fixed length in bits of the hash value) should be sufficiently large that it is
`
`difficult to find such an ܯԢ and they are extraordinarily unlikely to occur
`
`accidentally.
`
`One of the uses of a message digest function is to determine if any bits
`
`in a message M have been changed. For a well-designed message digest, even
`
`changing a single bit will result in a vastly different value being returned by
`
`the message digest computation. It is not the case that two messages that are
`
`close (e.g., a small Hamming or Levenshtein (also called edit) distance apart)
`
`will have message digests that are also close. While a poorly designed hash
`
`function may result in clustering in the range, there were no known message
`
`digest functions in 1998-99 where two messages that are close in terms of
`
`edit distance will result in hash values that are close either numerically or in
`
`terms of Hamming distance. As a result, none of CRC-32, CRC-64, MD4,
`
`MD5, or any of the SHA family can be used to determine similarity between
`
`two messages. A well-designed hash function is flat, that is all values
`
`Ͳ ൑ ݄ ൏ ݊ are equally likely to occur, regardless of the distribution of input
`
`messages.
`
`21
`
`

`
`
`
`
`
`1
`
`2
`
`C. Collision Resistance
`
`One of the properties that typically we desire in a message digest
`
`3
`
`function is called collision resistance. A message digest function is collision
`
`4
`
`resistant if two unrelated messages are unlikely to result in the same hash
`
`5
`
`value. A requirement that the hash function produces a nearly flat distribution
`
`6
`
`is a requirement, but it also requires that the function produce a relatively
`
`7
`
`large number of bits. By modern standards, even MD5’s 128 bits are
`
`8
`
`considered insufficient to be deemed collision resistant. The 32 bits of CRC-
`
`9
`
`32 would not have been considered sufficient for collision resistance in 1998-
`
`10
`
`99 for practical data storage systems, nor would the 64 bits of CRC-64 except
`
`11
`
`for the smallest of data storage systems.
`
`12
`
`13
`
`D. Calculating Collision Probabilities
`
`Hash collisions often are the result of a well-known problem in
`
`14
`
`probability theory known as the birthday problem, and cryptographic attacks
`
`15
`
`on these functions typically take the form of birthday attacks. The birthday
`
`16
`
`problem is: Given a room full of people, how many people need to be at the
`
`17
`
`party before the likelihood of two people sharing a birthday is larger than ½?
`
`18
`
`Assuming that birthdays are evenly spread over the course of a year, the
`
`19
`
`answer is surprisingly small: it is only about 23. This applies to hash
`
`20
`
`collisions as well as birthdays, and we ask the question: How many messages
`
`22
`
`

`
`
`
`
`
`1
`
`must be hashed before the likelihood of a collision is greater than some value.
`
`2
`
`The next question is how high of a chance of collision is considered
`
`3
`
`acceptable? A typical acceptable value might be 1 in 1015 (1 in
`
`4
`
`1,000,000,000,000,000) for a small data storage system, while for a large data
`
`5
`
`center the value would be much lower (since there are many more files that
`
`6
`
`might collide). One might ask where 1 in 1015 comes from: it is related to the
`
`7
`
`undetected error rate in modern hard drives and reflects the capability of a
`
`8
`
`typical error correction code that they employ such as Reed-Solomon.
`
`9
`
`10
`
`11
`
`12
`
`13
`
`14
`
`15
`
`16
`
`17
`
`18
`
`19
`
`20
`
`The expected number of messages before a collision occurs can be
`
`approximated using the formula ݊ ൌ ටʹ ݀ Ž ଵ
`ଵି௣ where Ě is the range of the
`
`hash function, and p is the desired probability. For example, if we let Ě = 365
`
`and Ɖ = ½, then we get Ŷ = 22.4944. If we consider CRC-32, and assume that
`
`the hash values produced are uniformly random then ݀ ൌ ʹଷଶ and if Ɖ = ½,
`
`then we get Ŷ = 77,162. In fact, CRC-32 is not uniformly random and so the
`
`value of n will be significantly lower. In other words, we can expect to hash
`
`fewer than 77,162 random and different messages with a CRC-32 function, in
`
`a benign environment, to have a 50-50 chance that two different messages
`
`will produce the same hash value. This is a very small number of messages
`
`given that the average personal computer may have more than a million files.
`
`23
`
`

`
`
`
`
`
`1
`
`2
`
`3
`
`4
`

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket