throbber
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
`
`
`In re Patent of: Ingemar J. Cox
`U.S. Patent No.: 8,205,237
`Issue Date: Jun. 19, 2012
`Appl. No.: 11/977,202
`Filing Date: Oct. 23, 2007
`Title: IDENTIFYING WORKS, USING A SUB-LINEAR TIME SEARCH,
`SUCH AS AN APPROXIMATE NEAREST NEIGHBOR SEARCH, FOR
`INITIATING A WORK-BASED ACTION, SUCH AS AN ACTION ON THE
`INTERNET
`
`Submitted via Electronic Filing
`Mail Stop PATENT BOARD
`Patent Trial and Appeal Board
`Commissioner for Patents
`P.O. Box 1450
`Alexandria, VA 22313-1450
`
`
`PETITIONER'S EXHIBIT 1004
`
`
`
`DECLARATION OF DR. PIERRE MOULIN IN SUPPORT OF
`PETITION FOR INTER PARTES REVIEW OF U.S. PATENT NO. 8,205,237
`UNDER 35 U.S.C. §§ 311-319
`
`
`
`
`
`Google Ex. 1004
`
`

`
`
`
`I.
`
`I, Dr. Pierre Moulin, D. Sc., declare as follows:
`
`INTRODUCTION
`1. My name is Pierre Moulin, D. Sc. I am a Professor of Electrical and
`
`Computer Engineering ("ECE") and Statistics at the University of Illinois. My
`
`business address is University of Illinois, Beckman Institute, 405 N. Matthews
`
`Ave., Urbana, IL 61801. I have been retained by counsel for Google Inc.
`
`("Petitioner"), as a technical expert witness with respect to the inter partes review
`
`petition filed by Petitioner concerning U.S. Patent No. 8,205,237 (the "'237
`
`patent").
`
`2.
`
`For purposes of this Declaration, I have been asked to provide an
`
`expert technical analysis and opinion, including whether a single prior art reference
`
`or combination of references discloses the elements of the claims of the '237
`
`patent. In addition, I have been asked to provide opinions on the knowledge of skill
`
`in the art, the scope and content of the prior art, the motivations to combine certain
`
`prior art references, and other matters as set forth in this Declaration.
`
`3. My investigation into the matters addressed herein is ongoing and I
`
`reserve the right to supplement this Declaration as my investigation continues.
`
`II. EDUCATIONAL AND PROFESSIONAL BACKGROUND
`4. My qualifications, including my publications, are generally
`
`summarized in my Curriculum Vitae, attached hereto as Exhibit 1005.
`
`
`
`Google Ex. 1004
`
`

`
`
`
`III. MATERIALS REVIEWED
`5.
`I have considered a number of references in connection with the
`
`preparation of this Declaration, including the '237 patent and file history. In
`
`addition, I have reviewed all references cited in this Declaration and the references
`
`identified in the Exhibit List included in Petitioner's petition. Further, I have
`
`conducted an independent review of available prior art. Finally, I have relied on
`
`my own expertise when formulating the opinions contained in this Declaration.
`
`IV. PERSON OF ORDINARY SKILL IN THE ART
`6.
`I understand that the present owner of the '237 patent, Network-1
`
`Technologies, Inc., has represented that the inventor conceived of the purported
`
`inventions embodied in the '237 patent no earlier than July 1, 2000 (the "critical
`
`date"). Ex. 1019 at 30.
`
`7.
`
`In my opinion, the relevant field of art for the '237 patent is that of
`
`automatic content recognition algorithms. It is my opinion that a person of ordinary
`
`skill in the art as of the critical date of July 1, 2000 would have been highly skilled,
`
`and typically would have possessed at least an M.S. in computer science, electrical
`
`engineering, or mathematics; knowledge of video and audio processing techniques;
`
`and 1-2 years of experience in audio, video, or image processing. I personally
`
`possessed this level of experience in 2000, and I worked closely with numerous
`
`other individuals who also possessed this level of experience.
`
`
`
`2
`
`Google Ex. 1004
`
`

`
`
`
`V. BACKGROUND IN THE FIELD OF AUTOMATED CONTENT
`RECOGNITION
`8.
`
`I provide the following background that relates to any opinions on the
`
`claims of the '237 patent.
`
`9.
`
`In the 1990s, the emergence of affordable computing power sufficient
`
`to process electronic media spurred researchers and entrepreneurs to automate
`
`various tasks within the field of content publication and consumption. Ex. 1001 at
`
`1:41-44. Three exemplary problems were (1) television viewers often disliked
`
`watching commercials; (2) television advertisers had no means to verify that
`
`commercials they paid to air were in fact aired, and aired in the correct time slot;
`
`and (3) radio listeners often failed to purchase music albums because they did not
`
`know the titles of songs that they enjoyed. Before 2000, numerous individuals
`
`concurrently developed the same two solutions to all three problems: (1)
`
`embedding hidden "watermarks" in electronic works, which could later be
`
`interpreted to identify the work; and (2) using computer-automated systems to
`
`recognize audio, video, and/or image content by analyzing the intrinsic features of
`
`a video work. The second technology is at issue in the present petition.
`
`10. Content recognition schemes universally relied on two widely known
`
`technologies: feature extraction and neighbor searching in a database.
`
`11. Feature extraction refers to quantifying a media work in a form that—
`
`unlike a raw video feed—is easily parsed by a computer. Typically, extracted
`
`
`
`3
`
`Google Ex. 1004
`
`

`
`
`
`features were compact, meaning they occupied less memory on a computer than
`
`the corresponding video file did. Furthermore, extracted features were typically
`
`structured in a format that facilitated efficient search. In the context of a content
`
`identification system, each set of extracted features corresponding to a given media
`
`work was stored as an entry in a database. Within such a database, entries were
`
`typically organized to facilitate efficient search. Such organization was known as
`
`"preprocessing."
`
`12. Neighbor searching refers to algorithms for comparing a first set of
`
`extracted features with one or more additional sets of extracted features to locate a
`
`close, but not necessarily exact, match. Because neighbor searching is
`
`computationally intensive for large feature sets, content recognition schemes
`
`typically employed search algorithms that increased efficiency by intelligently
`
`searching only a subset of potential matches (i.e., "non-exhaustive" algorithms).
`
`VI. OVERVIEW OF THE '237 PATENT
`13.
`It is my understanding that Petitioner is requesting inter partes review
`
`on the '237 patent, entitled "Identifying Works, Using A Sub-Linear Time Search,
`
`Such As An Approximate Nearest Neighbor Search, For Initiating A Work-Based
`
`Action, Such As An Action On The Internet" The '237 patent lists Ingemar J. Cox
`
`as inventor. It issued on Jun. 19, 2012, with 40 claims, of which claims 1, 5, 9, 13,
`
`25, and 33 are independent. The patent application that led to the '237 patent
`
`
`
`4
`
`Google Ex. 1004
`
`

`
`
`
`(Application No. 11/977,202) was filed on October 23, 2007, and was a
`
`continuation-in-part of Application No. 09/950,972, filed September 13, 2001,
`
`which issued as U.S. Patent No. 7,058,223. That application, in turn, claims
`
`priority to Provisional Application No. 60/232,618, filed September 14, 2000.
`
`14. The claims of the '237 patent relate to identifying an unknown media
`
`work and, based on its determined identity, performing an action.
`
`15. The independent claims of the '237 patent recite the following three
`
`elements: (1) receiving or obtaining "features . . . extracted from a media work";
`
`(2) "determining . . . an identification of the media work using the . . . features . . .
`
`to perform" a "search of . . . media works"; (3) either "transmitting . . . information
`
`about the identified media work to the client device" or "determining . . . an action
`
`based on the determined identification of the media work." Ex. 1001 at Claims 1,
`
`5, 9, 13, 25, 33.
`
`16. The specification of the '237 patent concedes that these three elements
`
`were well known prior to the filing of the '237 patent.
`
`17. First, the specification of the '237 patent explains that "[t]he
`
`recognition literature contains many different representations" of extraction
`
`algorithms, including "a pseudo-random sample of pixels from the frame or a low-
`
`resolution copy of the frame or the average intensities of nxn blocks of pixels," "a
`
`frequency-based decomposition of the signal, such as produced by the Fourier,
`
`
`
`5
`
`Google Ex. 1004
`
`

`
`
`
`wavelet and or discrete cosine transforms," "principal component analysis," "a
`
`temporal sequence of feature vectors," or "Fourier frequency decompositions." Ex.
`
`1001 at 6:57-7:11. The '237 specification does not purport to present any novel
`
`means of receiving or obtaining "features . . . extracted from a media work."
`
`18. Second, the '237 specification explains that the purported invention
`
`employs "common statistical measures" and "[o]ther forms of matching" already
`
`known in the art, such as "correlation coefficient," "Euclidean distance," "Lp-
`
`norms," "kd-trees, vantage point trees and excluded middle vantage point forests"
`
`(Ex. 1001 at 8:63-9:38) for "electronically determining an identification of the
`
`electronic work based on the extracted features" by performing "a non-exhaustive
`
`search identifying a neighbor." . The '237 specification explicitly credits these
`
`algorithms to various third party sources. Id. The '237 specification does not
`
`purport to present any novel means of "determining . . . an identification of the
`
`media work using the . . . features . . . to perform" a "search of . . . media works."
`
`See also, Ex. 1018 at 1-3, 10-11 (explaining, in 1999, that "there are many
`
`applications of nearest neighbor search where an approximate answer is good
`
`enough" and that "approximate similarity search can be performed much faster
`
`than the exact one").
`
`19. Third and fourth, the '237 specification explains that there already
`
`existed "[k]nown techniques of linking works delivered via traditional media
`
`
`
`6
`
`Google Ex. 1004
`
`

`
`
`
`channels to a more interactive system" for Ex. 1001 at 2:18-40 "electronically
`
`determining an action based on the identification of the electronic work." For
`
`instance, conceded prior art systems used an identification of a work to "access[] a
`
`database . . . and match[] the ID with the associated web address" and subsequently
`
`navigate to the web address. Id at 2:27-40. The '237 specification does not purport
`
`to present any novel means of "transmitting . . . information about the identified
`
`media work to the client device" or "determining . . . an action based on the
`
`determined identification of the media work."
`
`20. Claims 1, 5, 9, and 13 of the '237 patent additionally require that the
`
`extraction of features must occur "by a client device." Ex. 1001 at Claim 1, 5, 9,
`
`13. It is my opinion that this element was also well known in the art prior to the
`
`filing of the '237 patent. E.g., Ex. 1010 at 3:8-22; 2:42-49; 4:31-5:59; Claims 4-7,
`
`12-15; Ex. 1015 at 5:50-55, 2:34-46; 7:34-8:13; Ex. 1013 at 2:46-47; 9:42-61; Ex.
`
`1011 at 4:51-53; 9:55-67; 18:6-65; Abstract; Claims 1-3.
`
`21. As expressed herein, it is further my opinion that persons skilled in
`
`the art had already combined the foregoing known elements as of September 14,
`
`2000. Even if they had not, it is my opinion that a person skilled in the art would
`
`have recognized a reason to combine these elements (to create a content
`
`recognition system), and the combination would have yielded predictable results.
`
`
`
`7
`
`Google Ex. 1004
`
`

`
`
`
`22. The dependent claims modify these elements by further specifying the
`
`characteristics of (a) the electronic work (Ex. 1001 at Claims 2, 6, 10, 14, 17, 18,
`
`19, 20); (b) the means of transmitting or receiving information about the identified
`
`media work (id. at 21, 22, 23, 24); (c) the process of identifying the electronic
`
`work (id. at Claims 2-6, 16-20); (d) the determined identification of the electronic
`
`work (id. at Claims 3, 7, 11, 15); (e) the action (id. at Claim 13); and/or (f) the
`
`performed action (id. at Claims 4, 8, 12, 16, 26-32, 34-40).
`
`23. Although the specification highlights the use of the claimed features
`
`to identify a television commercial (e.g., Ex. 1001 at 9:53-10:20), the claims are
`
`not limited to the identification of television commercials and the specification
`
`explicitly envisions the identification of "content" generally (e.g., id. at 1:31-35
`
`("e.g., content or an advertisement")). Further, while the specification highlights
`
`the use of certain exemplary algorithms for extracting features from an electronic
`
`work (e.g., id. at 19:63-20:30, 6:51-7:37) and determining an identification of an
`
`electronic work by performing a non-exhaustive search identifying a neighbor
`
`(e.g., id. at 8:64-9:7, 10:63-11:6, 21:7-23; 21:52-22:21), a person skilled in the art
`
`would have understood that extraction can be performed using a wide variety of
`
`algorithms.
`
`24. The claimed steps are illustrated in several figures of the '237 patent,
`
`including Figure 1, which demonstrates (1) "feature (vector) extraction
`
`
`
`8
`
`Google Ex. 1004
`
`

`
`
`
`operation(s)" (140), which "extract features from [an electronic] work" (Ex. 1001
`
`at 6:31-35); (2) "feature (vector) lookup operation(s)" (150), which identify a work
`
`by "search[ing] for a matching feature vector" (id. at 6:36-40); (3) "work-
`
`associated information lookup operation(s)" (160), which "retrieve associated
`
`information, such as an action" (id. at 6:41-44); and (4) "action initiation
`
`operation(s)" (170), which "perform some action based on the associated
`
`information. Figures 2-4 clarify that, in the absence of claim language to the
`
`contrary, the foregoing claimed steps can be performed on any combination of a
`
`local device and a remote server. E.g., Fig. 2 (depicting all steps (140a, 150a, 160a,
`
`and 170a) performed on "user computer, set-top-box or equivalent" (210)); Fig. 3
`
`(depicting extraction, identification, and action (140b, 150b, and 170b) performed
`
`on "user computer, set-top-box or equivalent" (310) and information lookup (160b)
`
`performed on "monitoring center" (340)); Fig. 4 (depicting extraction and action
`
`(140c and 170c) performed on "user computer, set-top-box or equivalent" (410),
`
`and identification and information lookup (150a and 160c) performed on
`
`"monitoring center" (440)).
`
`VII. UNDERSTANDING OF THE LEGAL PRINCIPLES RELEVANT TO
`MY DECLARATION
`25.
`
`In performing my analysis, I was given a basic understanding of
`
`certain applicable patent law concepts, which I have applied herein.
`
`
`
`9
`
`Google Ex. 1004
`
`

`
`
`
`26.
`
`I understand that the subject matter sought to be patented is
`
`anticipated if each and every limitation of a claim is found, expressly or inherently,
`
`in a single prior art reference. To the extent that I refer to a particular reference as
`
`"anticipating" a particular claim, I am using this phrasing as a shorthand to refer to
`
`the facts that underlie a determination of anticipation, including the element-by-
`
`element comparison of the claims to the prior art.
`
`27.
`
`I understand that a patent claim may also be invalid if it would have
`
`been obvious to a person of ordinary skill in the art, at the time the invention was
`
`made. In evaluating whether a prior art reference or a combination of prior art
`
`references renders a patent claim obvious, it is necessary to determine the scope
`
`and content of the prior art references, ascertain the differences between the prior
`
`art references and the claim in issue, resolve the level of ordinary skill in the
`
`pertinent art, and evaluate evidence of secondary considerations. I further
`
`understand that a claim is not proved obvious merely by demonstrating that each of
`
`the claimed elements was independently known in the prior art. Rather, there must
`
`be some reason why a person of ordinary skill in the art would have been prompted
`
`to combine the elements in the manner claimed. A reason to combine known
`
`elements could come from a variety of sources, including but not limited to the
`
`prior art itself, the background knowledge of one of ordinary skill in the art, the
`
`nature of the problem to be solved, market demand, or common sense. To the
`
`
`
`10
`
`Google Ex. 1004
`
`

`
`
`
`extent that I refer to a particular reference or combination of references as
`
`rendering a particular term "obvious," I am using this phrasing as a shorthand to
`
`refer to the facts that underlie a legal determination of obviousness.
`
`28.
`
`I understand that secondary considerations are objective factors which
`
`may be considered when determining whether a claim was obvious at the time of
`
`invention. These include indicia of commercial success of the invention, lack of
`
`commercial success of the prior art, long-felt but unsolved need to solve the
`
`problem addressed by the invention, unexpected results, failure of others to
`
`develop a similar invention, and copying of the invention by others. In addition, I
`
`understand that if the claimed invention was invented independently by other
`
`persons, either before it was invented by the patentee or at about the same time,
`
`then this fact indicates that a claimed invention is obvious.
`
`VIII. SUMMARY OF OPINIONS
`29.
`It is my opinion that U.S. Patent No. 6,188,010 ("Iwamura", Ex. 1012)
`
`discloses each and every element of claims 1, 3-5, 7-9, 11-13, 15-16, 21-25, 29-30,
`
`33, 37, and 38 of the '237 patent. Therefore, these claims anticipated by Iwamura.
`
`30.
`
`It is my opinion that the combination of U.S. Patent No. 6,505,160
`
`("Levy", Ex. 1013) and Sunil Arya, et al., "An Optimal Algorithm for Approximate
`
`Nearest Neighbor Searching in Fixed Dimensions" ("Arya", Ex. 1006) discloses
`
`each and every element of claims 1-25, 29-30, 32-33, 37-38, and 40 of the '237
`
`
`
`11
`
`Google Ex. 1004
`
`

`
`
`
`patent. Therefore, these claims are obvious to a person of ordinary skill in the art
`
`over Levy in view of Arya.
`
`31.
`
`It is my opinion that the combination of U.S. Patent No. 6,597,405
`
`("Iggulden", Ex. 1011) and Christian Böhm, et al., "Efficient Similarity Search in
`
`Digital Libraries" ("Böhm") ("Böhm", Ex. 1007) discloses each and every element
`
`of claims 25, 32-33, and 40 of the '237 patent. Therefore, these claims are obvious
`
`to a person of ordinary skill in the art over Iggulden in view of Böhm.
`
`32.
`
`It is my opinion that U.S. Patent No. 5,874,686 ("Ghias", Ex. 1010)
`
`discloses each and every element of claims 1-3, 5-7, 9-11, 13-15 and 21-24 of the
`
`'237 patent. Therefore, these claims are anticipated by Ghias.
`
`33.
`
`It is my opinion that U.S. Patent No. 7,743,092 ("Wood", Ex. 1015)
`
`discloses each and every element of claims 9-16 and 23-24 of the '237 patent.
`
`Therefore, these claims are anticipated by Wood.
`
`34.
`
`It is my opinion that the combination of Iwamura and Chen discloses
`
`each and every element of claims 26-27 and 34-35. Therefore, these claims are
`
`obvious to a person of ordinary skill in the art over Iwamura in view of Chen.
`
`35.
`
`It is my opinion that the combination of Levy, Arya, and Chen
`
`discloses each and every element of claims 26-27 and 34-35. Therefore, these
`
`claims are obvious to a person of ordinary skill in the art over Levy in view of
`
`Arya and Chen.
`
`
`
`12
`
`Google Ex. 1004
`
`

`
`
`
`IX. CONSTRUCTION OF CLAIM TERMS
`36.
`I understand that, for purposes of this proceeding, the claim terms of
`
`the '237 patent should be given their broadest reasonable construction in light of
`
`the specification. I further understand that this standard of claim construction can
`
`be broader than that generally employed by a federal district court, which assigns
`
`claim terms their ordinary and customary meaning to a person of ordinary skill in
`
`the art at the time of filing, considering: (1) in the context of the entire patent,
`
`intrinsic evidence, such as the claims, specification, and file history; and (2) in the
`
`context of the intrinsic evidence, and to a lesser degree, extrinsic evidence, such as
`
`inventor testimony, dictionaries, and learned treatises.
`
`37. For the purposes of this Declaration, I will base my construction of
`
`the claim terms on the Petitioner's position in this proceeding. I understand and
`
`agree with Petitioner that for the majority of terms, the broadest reasonable
`
`construction is the plain and ordinary meaning to a person of ordinary skill in the
`
`art.
`
`38.
`
`It is my opinion that, regardless of the claim constructions ultimately
`
`adopted by the Board, the prior art references discussed herein anticipate and/or
`
`render obvious the enumerated claims of the asserted patents under any possible
`
`constructions.
`
`
`
`13
`
`Google Ex. 1004
`
`

`
`
`
`39.
`
`I understand and agree with Petitioner's position that the term
`
`"extracting features" means "generating a quantitative representation of the
`
`features of a media work, or a subset of the features of a media work."
`
`40. The patent specification supports this construction of "extracting
`
`features." In particular, the specification explains that "[t]he recognition literature
`
`contains many different" extraction algorithms, and enumerates a broad range of
`
`algorithms capable of "feature extraction," including "pseudo-random sample[s] of
`
`pixels," "the average intensities of n×n blocks of pixels," "frequency-based
`
`decomposition of the signal, such as produced by the Fourier, wavelet and or
`
`discrete cosine transforms" "principal component analysis," "a temporal sequence
`
`of feature vectors," or "a combination of these." E.g., Ex. 1001 at 6:51-7:11. The
`
`specification further notes that "other representations are possible." Id. at 7:6. The
`
`only unifying characteristic of these methods is that they generate a quantitative
`
`representation of the features of a media work.
`
`41. Moreover, in 2000, persons skilled in the art consistently employed
`
`this definition of "extracting features." In particular, persons skilled in the art
`
`understood that it was advantageous for content identification systems to generate
`
`quantitative representations of the features of a media work, because such
`
`representations were typically compact and logically organized, and therefore were
`
`more easily processed and searched by computers than raw media files. See Ex.
`
`
`
`14
`
`Google Ex. 1004
`
`

`
`
`
`1001 at 6:57-59. Such representations were often, but not always, formatted to
`
`maximize the efficiency of subsequent search.
`
`42.
`
`I understand and agree with Petitioner's position that the term
`
`"features extracted from the media work" means "a quantitative representation of
`
`the features of a media work, or a subset of the features of a media work,"
`
`consistent with the above discussion of "extracting features."
`
`43.
`
`I understand and agree with Petitioner's position that the broadest
`
`reasonable construction of the term "non-exhaustive search" is "a search that
`
`locates a match without conducting a brute force comparison of all possible
`
`matches, and all data within all possible matches." The specification never uses the
`
`terms "non-exhaustive" or "exhaustive." Ex. 1001. Moreover, in 2000, persons
`
`skilled in the art did not have a precise technical definition for "non-exhaustive
`
`search," and only used the term colloquially. As such, "non-exhaustive search" was
`
`understood to have multiple possible definitions, the broadest of which was
`
`consistent with the above definition. In particular, the broadest definition of "non-
`
`exhaustive search" employed by persons skilled in the art encompassed anything
`
`other than a "brute force" search. (As a corollary, the narrowest definition of
`
`"exhaustive search" was "brute force search.") A "brute force" search, in turn, is a
`
`search wherein a query is compared to every single portion of every single item in
`
`a database. Persons skilled in the art recognized that, while a properly constructed
`
`
`
`15
`
`Google Ex. 1004
`
`

`
`
`
`brute force search was guaranteed to find the best match in a database, brute force
`
`searches were inherently inefficient and required on average N/2 comparison
`
`operations (where N is the number of entries in the search database). See id. at
`
`8:54-63. Therefore, to maximize search efficiency, persons skilled in the art
`
`routinely employed more efficient searches that did not conduct a comparison of
`
`every single item in a database, sometimes referred to as non-exhaustive searches.
`
`44. Because a brute force search conducts a comparison of every item in a
`
`search database, it is guaranteed to find the best match. Thus, any search that is not
`
`guaranteed to find the best match is necessarily not brute force, and therefore is
`
`necessarily non-exhaustive.
`
`45. By contrast, certain non-exhaustive search algorithms are guaranteed
`
`to find the best match. Thus, the fact that a search is guaranteed to find the best
`
`match does not necessarily indicate that it is a brute force search, and therefore
`
`does not preclude it from being considered non-exhaustive.
`
`46. For the avoidance of doubt, it is my opinion that a search that does
`
`compare a query item to all items in a database, but does not consider all data
`
`within each database item, is non-exhaustive. For instance, suppose a content
`
`identification system in which each item of extracted data contains a first part and
`
`a second part. If that system compares an unknown work to every known work in
`
`the database, but does so only on the basis of the first part of each item of extracted
`
`
`
`16
`
`Google Ex. 1004
`
`

`
`
`
`data (ignoring the second part of each item of extracted data), the search is non-
`
`exhaustive.
`
`47.
`
`It is further my opinion that, when evaluating whether a search is non-
`
`exhaustive, only the search algorithm should be considered, and not the
`
`preprocessing algorithm, if any. When database entries are ingested into a content
`
`recognition database, the system will typically preprocess the data to organize it
`
`into a form that facilitates efficient search using a dedicated preprocessing
`
`algorithm. Subsequently and separately, when a search operation is initiated, the
`
`system will locate the best match in the database using a dedicated search
`
`algorithm. The search algorithm and the preprocessing algorithm are distinct, and
`
`operate at separate points in time. Thus, the fact that a preprocessing algorithm
`
`may analyze all data entries in a database at the time of ingestion does not render a
`
`search algorithm brute force.
`
`48.
`
`I understand and agree with Petitioner's position that the terms
`
`"identify a neighbor," "identify a near neighbor," and "nearest neighbor search."
`
`means "identify a close, but not necessarily exact, match." In 2000, persons skilled
`
`in the art consistently employed this definition. In particular, persons skilled in the
`
`art typically designed content identification systems that compared extracted
`
`features from a query work to extracted features from a known work. However,
`
`due to discrepancies in the source material, such as noise, distortion, and source,
`
`
`
`17
`
`Google Ex. 1004
`
`

`
`
`
`two different copies of the same work might produce somewhat different sets of
`
`extracted features. Ex. 1001 at 8:31-53. Thus, rather than design systems that
`
`searched only for exact matches, persons skilled in the art typically designed
`
`systems that identified close, but not necessarily exact, matches. Id. Persons skilled
`
`in the art understood that the term "identifying a neighbor" referred to such
`
`searches.
`
`49.
`
`It is my opinion that different algorithms for identifying a neighbor
`
`can employ different measures of distance. For instance, "Euclidean" distance
`
`refers to the literal distance between data points (i.e., "as the crow flies");
`
`"Manhattan" distance refers to the sum of the absolute differences of the
`
`coordinates of data points (i.e., as car drives in a city); and Hamming distance
`
`refers, in the context of binary data points, to a count of the number of bits that are
`
`not equal.
`
`50. Consistent with the above, I understand and agree with Petitioner's
`
`position that a "neighbor" means "a close, but not necessarily exact, match."
`
`51.
`
`I understand and agree with Petitioner's position that the term
`
`"portable client device" means "any portable electronic device" including, for
`
`example, a laptop. The specification never uses the terms "portable" or "client."
`
`However, in 2000, persons skilled in the art consistently employed the above
`
`definition. Further, persons skilled in the art understood that computers could
`
`
`
`18
`
`Google Ex. 1004
`
`

`
`
`
`embody essentially any form—including portable forms, such as laptops—while
`
`still being considered computers. Thus, it is further my opinion that the term
`
`"computer" broadly refers to any electronic device. As such, it is my opinion that
`
`any disclosure of a "computer" amounts to disclosure of a "portable client device."
`
`52.
`
`I understand and agree with Petitioner's position that the term "fixed
`
`radius search" means "a neighbor search that identifies matches within a threshold
`
`radius of the query item." The specification confirms that one type of neighbor
`
`search is a "fixed radius search" in which "if the database contains a vector that is
`
`within X of the query, then there is a match." Ex. 1001 at 21:52-22:7. As with a
`
`neighbor search, different algorithms for performing a fixed radius search can
`
`employ different measures of distance, including Euclidean distance, Manhattan
`
`distance, and Hamming distance. Moreover, in 2000, persons skilled in the art
`
`consistently employed this definition.
`
`53.
`
`I understand and agree with Petitioner's position that the term
`
`"sublinear search" means "a search whose execution time has a sublinear
`
`relationship to database size." For instance, a linear search of a 200-item database
`
`would take twice as long as a linear search of a 100-item database. By contrast, a
`
`sublinear search of a 200-item database would take less than twice as long as a
`
`sublinear search of a 100-item database, perhaps, for instance, 1.5 times as long.
`
`Thus, if search execution time were plotted as a function of database size, a linear
`
`
`
`19
`
`Google Ex. 1004
`
`

`
`
`
`search would yield a diagonal line of constant slope, while a sublinear search
`
`would yield a line with a slope approaching zero.
`
`54. To illustrate, the following chart (which plots search time as a
`
`function of the number of entries in the search database), depicts an exemplary
`
`linear search and an exemplary sublinear search. The linear search exhibits a
`
`constant slope, while the sublinear search exhibits a slope approaching zero. It is
`
`my opinion that a search exhibiting such a slope approaching zero would be
`
`considered sublinear.
`
`Linear Search vs Sublinear Search
`10
`
`0
`
`9
`8
`7
`6
`5
`4
`3
`2
`1
`Number of Entries in Search Database
`
`10
`
`Linear
`Sublinear
`
`
`
`0123456789
`
`Search Time
`
`55.
`
`In 2000, persons skilled in the art consistently employed the foregoing
`
`definition of sublinear. In particular, as the sizes of automatic content identification
`
`databases grew, persons skilled in the art consistently employed sublinear searches
`
`to avoid having to proportionally increase the computing power available to the
`
`identification system.
`
`
`
`20
`
`Google Ex. 1004
`
`

`
`
`
`56. Consistent with the above, it is my opinion that a search whose
`
`execution time is proportional to the logarithm of the size of the data set (e.g., a
`
`search with execution time proportional to A log (BN), where A and B are
`
`constants, and N is the number of entries in the database) is sublinear. The
`
`specification of the '237 patent confirms this. Ex. 1001 at 8:54-63 (contrasting
`
`binary search with linear search and noting that, "[i]f binary search was possible,
`
`then a database containing N vectors would require at most log(N) comparisons.").
`
`57. Consistent with the above, it is my opinion that the typical
`
`performance of search algorithms associated with "kd-trees and vantage point
`
`trees" is sublinear. The specification of the '237 patent confirms this. Ex. 1001 at
`
`21:56-60 ("A number of possible data structures are applicable including kd-trees
`
`and vantage point trees. These data structures and associated search algorithms
`
`organize a N-point dataset (N=90,000,000 in out previous example) so that sub-
`
`linear time searches can be performed on average.").
`
`58. Consistent with the above, it is my opinion that the typical
`
`performance of "excluded middle vantage point forest" searches is sublinear. The
`
`specification of the '237 patent confirms this. Ex. 1001 at 21:61-22:7 ("Recently,
`
`Yianilos proposed an excluded middle vantage point forest for nearest neighbor
`
`search. See, e.g., the Yianilos reference. This data structur

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket