`
`UNITED STATES PATENT AND TRADEMARK OFFICE
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`
`FACEBOOK, INC., LINKEDIN CORP., AND TWITTER, INC.,
`Petitioners
`v.
`
`SOFTWARE RIGHTS ARCHIVE, LLC
`Patent Owner
`
`Case IPR2013-00479
`Patent 5,832,494
`
`DECLARATION OF AMY LANGVILLE
`in Support of Patent Owner Response
`
`EXHIBIT 2114
`Facebook, Inc. et al.
`v.
`Software Rights Archive, LLC
`CASE IPR2013-00479
`
`
`
`I, Amy Langville, declare as follows:
`
`1.
`
`My name is Amy Langville. I am a tenured Associate Professor of
`
`Mathematics at the College of Charleston. My business address is 1014 E. Ashley
`
`Avenue, P.O. Box 295, Folly Beach, SC 29439. I understand that my declaration
`
`is being submitted in connection with the above-referenced inter partes review
`
`proceeding.
`
`I. QUALIFICATIONS, BACKGROUND, AND EXPERIENCE
`
`A. Background and Experience
`
`2.
`
`I was hired as an Associate Professor of Mathematics at The College
`
`of Charleston in 2004 and obtained tenure in 2010. In my current position at The
`
`College of Charleston, my teaching responsibilities range from undergraduate,
`
`general service courses in Calculus and Linear Algebra to graduate special topics
`
`courses in Linear Optimization, Evolutionary Optimization, Operations Research,
`
`and Integer Programming. I also have research mentoring duties. Consequently, I
`
`have supervised over two dozen student projects and B.S., M.S. and Ph.D. theses,
`
`covering such topics as ranking, clustering, graph theory, and optimization.
`
`3.
`
`I am also the Operations Research specialist in our department, having
`
`received my M.S. and Ph.D. degrees in Operations Research from N.C. State
`
`University in 1999 and 2002, respectively. During one summer of graduate school,
`
`I worked for The National Security Agency in their Operations Research program,
`
`researching network problems. My Ph.D. dissertation topic was Markov chains,
`
`which a few years later I learned was the mathematical process underlying the
`
`PageRank algorithm used by Google’s search engine for ranking webpages.1
`
`4.
`
`After my Ph.D., I took a postdoctoral position in the Mathematics
`
`Department at N.C. State University from 2002-2005, where I studied the
`
`mathematics of search engines. In particular, I studied Google’s search engine and
`
`its PageRank algorithm.
`
`5.
`
`My research and technical expertise are sought out by both business
`
`and professional scientific organizations. In particular, I have worked with the
`
`following companies, as a consultant and/or member of the Advisory Board: U.S.
`
`Olympic Committee, The Boeing Corporation, The SAS Institute, Semandex,
`
`Piffany, Fortune Interactive, Your Music On, Trilogy Excursions, College Football
`
`Performance Awards, and Tiger Falcon’s Prediculous. In addition, I am frequently
`
`requested to review work in my field by the National Science Foundation,
`
`Princeton University Press, and many scientific journals. While each reviewing
`
`responsibility appears in the attached curriculum vitae, here I list just a few of the
`
`more recognizable journals from a wide range of professional organizations (ACM,
`
`INFORMS, SIAM, WWW, and IEEE): The Association for Computing
`
`1 See Ex. 2053: Sergey Brin and Lawrence Page, The Anatomy of a Large-Scale
`Hypertextual Search Engine, WWW, 1998; Ex. 2054: Lawrence Page, Sergey Brin, Rajeev
`Motwani and Terry Winograd, The PageRank Citation Ranking: Bringing Order to the Web,
`Technical Report, Stanford InfoLab, December 2008.
`
`Machinery’s Transactions on Information Systems, the Institute for Operations
`
`Research and Management Science’s Journal on Computing, The Society for
`
`Industrial and Applied Mathematics’ Review, the World Wide Web conference,
`
`and the Institute for Electrical and Electronics Engineers’ Transactions on Signal
`
`Processing.
`
`6.
`
`I have chaired several symposiums and conferences in my field,
`
`including The Markov Anniversary Meeting in 2006, the Southeastern Ranking
`
`and Clustering workshop in 2009, and symposiums at the annual meetings of the
`
`Mathematical Association of America (MAA), the American Mathematical Society
`
`(AMS), and the Society for Industrial and Applied Mathematics (SIAM).
`
`7.
`
`I have received many awards for these activities. For example, within
`
`my university, I received the prestigious Distinguished Teacher-Scholar award, the
`
`Gordon E. Jones Distinguished Achievement award, and the Faculty of the Year
`
`award. From N.C. State University, I received a Distinguished Alumni award. In
`
`my state, I was nominated for the S.C. Governor’s Award for Excellence in
`
`Science and was a finalist for the S.C. Professor of the Year. Nationally, I have
`
`been selected to be a member of the Committee for Women in Mathematics, a
`
`group representing the AWM, MAA, AMS, INFORMS, and SIAM professional
`
`organizations. My books, publications, and algorithms have received many
`
`awards, a few of which are described in the next few sections.
`
`B. Publications and Conferences
`
`8.
`
`I have authored two books that specifically concern Google, its search
`
`engine, and the “PageRank” algorithm used by its search engine. The first book,
`
`published in 2006 by Princeton University Press, is titled Google’s PageRank and
`
`Beyond: The Science of Search Engine Rankings. This book won the runner-up
`
`award for the Best New Book in Information Science from the Association of
`
`American Publishers and has been translated into Japanese and Greek with a
`
`Russian translation under contract. The second book, published in January 2012
`
`by Princeton University Press, is titled Who’s #1? The Science of Ranking and
`
`Rating. It covers ranking more broadly with applications, methods, and measures
`
`beyond the web. A book on my clustering research is underway and expected to
`
`be completed in 2015. My clustering research includes two clustering algorithms,
`
`which were adopted by the popular software programs of SAS’s Enterprise Miner
`
`and MATLAB’s Matrix Laboratory software.
`
`9.
`
`In 2006, I was the co-editor for The Proceedings of the Markov
`
`Anniversary Meeting, a compendium of work on Markov chains.
`
`10.
`
`In addition to the books mentioned above, my research has been
`
`published in over 40 papers. My complete publication list appears in the attached
`
`curriculum vitae (Appendix A). Those publications include the following: The
`
`2006 paper “A Survey of Eigenvector Methods for Web Information Retrieval”
`
`appeared in The SIAM Review, the flagship journal of the Society for Industrial and
`
`Applied Mathematics. The recent 2011 paper “The Sensitivity and Stability of
`
`Ranking Vectors” appeared in the SIAM Journal on Scientific Computing and was
`
`highlighted with a SIAM press release that was quickly tweeted, emailed, and
`
`propagated, resulting in a podcast interview by IEEE, the most well-known
`
`professional organization for engineers. A 2007 paper on clustering methods
`
`published in Computational Statistics and Data Analysis won that journal’s Top
`
`Cited Award for the period of 2005-2010.
`
`11.
`
`I have been invited to give over 70 presentations of my research. Each
`
`talk is listed in the attached curriculum vitae (Appendix A). For now, I provide a
`
`quick summary with highlights. I have given industrial talks to Yahoo! Research,
`
`The Boeing Company, and The SAS Institute, service talks to students and
`
`colleagues at Columbia University, Stanford University, The University of Illinois
`
`at Urbana-Champaign, Tsukuba University in Japan, and The Hamilton Institute in
`
`Ireland, research talks at professional conferences including INFORMS, SIAM,
`
`MAA, and WWW; and talks to governmental organizations such as The National
`
`Security Agency and The Department of Energy. Perhaps the most prestigious
`
`invitation came from the American Mathematical Society with their annual “Talk
`
`on Capitol Hill” informing congressional members and their staffers on the state of
`
`ranking and clustering research.
`
`C. Significant Research
`
`12. My research deals with ranking (creating an ordered list of items from
`
`information about the relationships between those items) and clustering (grouping
`
`those same items together by some similarity measure). This research applies to
`
`many types of data, e.g., webpages, library collections of documents, sports teams,
`
`social networks, and genetic and biological data. My research has been funded
`
`continuously by the National Science Foundation. In 2005 I received the very
`
`prestigious and highly competitive CAREER award (totaling $432,722 for 5 years)
`
`for Early Career Faculty members. Most recently, I was awarded another NSF
`
`grant (totaling $400,000 over 3 years) to study optimization methods for ranking
`
`and clustering.
`
`13. Additional information concerning my background, qualifications,
`
`publications, conferences, honors, and awards is described in my curriculum
`
`vitae, attached to my declaration as Appendix A.
`
`14.
`
`I served as an expert in In re Google Litig., a patent infringement suit
`
`by Software Rights Archive, LLC (“SRA”) against Google, Microsoft, Yahoo, and
`
`a number of other defendants in United States District Court for the Northern
`
`District of California (05-cv-01372 RMW) (the “prior Google litigation”). I am
`
`currently serving as an expert in Britt et al. v. Trilogy Excursions in the Circuit
`
`Court of the Second Circuit, State of Hawaii (13-1-0085(2)).
`
`II. COMPENSATION AND ENGAGEMENT
`
`
`
`15.
`
`I understand that the Patent Trial and Appeal Board (the “Board”)
`
`granted petitions to Facebook, LinkedIn, and Twitter (collectively, the
`
`“Petitioners”) to institute inter partes review regarding the following claims on
`
`obviousness grounds: (1) claims 26, 28-30, 32, 34, and 39 of U.S. Patent No.
`
`5,544,352 (the “’352 patent”); (2) claims 8, 10, 11, 18-20, 35, 40, 45, 48-49, and
`
`51 of U.S. Patent No. 5,832,494 (the “’494 patent”);2 and (3) claims 12, 21, and 22
`
`of U.S. Patent No. 6,233,571 (the “’571 patent”). The ’352, ’494, and ’571 patents
`
`are collectively referred to as the “SRA patents,” and the identified claims of those
`
`patents at issue in these proceedings are collectively referred to as the “challenged
`
`claims.” I understand these patents are owned by SRA.
`
`16.
`
`I have been retained by DiNovo Price Ellwanger & Hardy, LLP to
`
`provide testimony and analysis in this case. I receive compensation in the amount
`
`of $300 for every hour I devote to providing the analysis and testimony requested
`
`of me in this case. SRA has also agreed to reimburse me for travel and other
`
`expenses that I have incurred that are related to providing this analysis. My
`
`compensation does not depend in any way on the nature of my opinion or the
`
`outcomes of any of the above-referenced inter partes review proceedings or any
`
`related action.
`
`2 I have not addressed claims 8, 10, 11, 35 and 40 of the ’494 patent, as I understand that
`SRA has requested cancellation of those claims.
`
`III. MATERIALS REVIEWED
`
`17.
`
`In preparing and rendering the testimony and opinions set forth in this
`
`declaration, I have reviewed the documents and information referred to and/or
`
`cited herein.
`
`IV. LEGAL STANDARD RELATING TO OBVIOUSNESS
`
`18.
`
`I understand that a patent claim may be found invalid if the claimed
`
`invention would have been obvious to a person of ordinary skill in the relevant
`
`field as of the priority date of the patent.3
`
`19.
`
`It is my understanding that the presence of any of the following
`
`factors may be considered an indication that the claimed invention would not have
`
`been obvious at the time the claimed invention was made:
`
`a. Commercial success of a product due to the merits of the claimed
`invention;
`
`b. A long felt need for the solution provided by the claimed
`invention;
`
`c. Unsuccessful attempts by others to find the solution provided by
`the claimed invention;
`
`d. Unexpected and superior results from the claimed invention;
`
`3 I understand that the field relevant to the claimed invention is computerized search and
`information retrieval. I further understand that a person of ordinary skill in the relevant field as
`of June 17, 1996, would have had familiarity with computerized search and information retrieval,
`and have at least a bachelor’s degree in one of computer science or electrical and computer
`engineering, or a comparable amount of combined education and equivalent industry experience
`in computerized search and information retrieval. I also understand that strength in one of these
`areas can compensate for a weakness in another.
`
`e. Acceptance by others of the claimed invention as shown by praise
`from others in the field or from the licensing of the claimed
`invention; and
`
`f. Other evidence tending to show nonobviousness.
`
`20.
`
`I further understand that in order for evidence to be relevant to the
`
`obviousness inquiry, there must be a relationship (or a nexus) between the
`
`advantages of the claimed invention and the evidence of secondary considerations.
`
`V. SUMMARY OF OPINIONS
`
`21.
`
`I am of the opinion that at least the following secondary
`
`considerations are present here: (1) unexpected results; (2) commercial success &
`
`commercial acquiescence; (3) long felt but unresolved needs; and (4) praise by
`
`others. I have been informed by counsel that these secondary considerations have
`
`been accepted by the Board, as well as by other courts, as demonstrating
`
`nonobviousness.
`
`22. Google’s search engine using its PageRank algorithm is a commercial
`
`embodiment of the inventions claimed in the SRA patents. See Section VI, infra.
`
`23.
`
`The superior search results obtained by the use of web-based link
`
`analysis in computer search were unexpected. Industry members, competitors, and
`
`investors initially doubted whether PageRank would improve computerized search.
`
`Yet the revolution of the search engine industry was directly tied to web-based link
`
`analysis as claimed in the SRA patents and embodied in PageRank. See Section
`
`VII, infra.
`
`24. Google achieved substantial commercial success with its search
`
`engine using PageRank. This success is demonstrated by at least the following
`
`facts. Shortly after introducing its search engine and PageRank, Google captured a
`
`large share of the search engine market and has increased that market share each
`
`year since. Based on its five-year revenue growth, Google is one of the fastest
`
`growing companies ever. Google revolutionized the search industry with its
`
`PageRank technology such that, after Google entered the market, every search
`
`engine company ultimately either adopted web-based link analysis algorithms like
`
`PageRank or went out of business. See Section VIII, infra.
`
`25.
`
`The commercial success of the inventions claimed in the SRA patents
`
`is further evidenced by the fact that approximately 99% of the search market,
`
`including the market leader Google, has taken licenses to the SRA patents for
`
`substantial royalty amounts. See Section VIII, infra.
`
`26.
`
`There is a nexus between the commercial success of Google and
`
`PageRank—that is, the commercial success and acquiescence are due to the
`
`inventions claimed in the SRA patents. See Section VIII, infra.
`
`27. Web-based link analysis technology satisfied a long felt need for
`
`improved computerized search. The search engine industry was plagued with the
`
`problem of the poor quality of search results. The first generation of search
`
`engines relied (primarily or exclusively) on word-based search methods alone that
`
`returned an overabundance of results and were unable to adequately distinguish
`
`relevant search results from irrelevant search results in electronic databases,
`
`thereby making the user sort through this vast result set. The patented SRA
`
`technology embodied in PageRank solved the problem by providing tools that were
`
`capable of analyzing electronic databases for non-semantical relationships and for
`
`using indirect citation relationships to enhance search and ranking of objects in
`
`computer databases, thereby bringing order to the search result set. See Section
`
`IX, infra.
`
`28. Google received significant praise within the search industry, the
`
`media, and academics, and it garnered numerous awards for PageRank, which is a
`
`commercial embodiment of the inventions claimed in the SRA patents. See
`
`Section X, infra.
`
`VI. PAGERANK IS A COMMERCIAL EMBODIMENT OF THE
`INVENTIONS CLAIMED IN THE SRA PATENTS
`
`29.
`
`In 1996, while attending Stanford University, Larry Page and Sergey
`
`Brin began collaborating on a research project relating to search engines. The
`
`eventual result of this project was a search engine originally called BackRub and
`
`later renamed Google that used an innovative link analysis algorithm called
`
`PageRank. As I describe more fully below, adding link analysis to computerized
`
`search, which prior to Google’s PageRank had focused on text analysis, was
`
`innovative, extremely productive, and ultimately revolutionized the search
`
`industry.
`
`30. At a high level, Google’s search engine works as follows: Google
`
`crawls the Web to collect the contents of accessible sites. This data is then broken
`
`down into an index (organized by word, just like the index of a textbook), thereby
`
`creating a way of finding any page based on its content. As a user of Google, you
`
`enter a query into the search box and in less than a second Google returns a ranked
`
`list of results that use your search terms. The first page of results contains the top
`
`10 results, the ones that Google has judged to be the most relevant to your query.
`
`This ranked list is the foundation of Google and the most important factor
`
`determining this ranking is Google’s PageRank algorithm.
`
`31.
`
`Page and Brin have described PageRank as:
`
`a query-independent technique for determining the importance of web
`pages by looking at the link structure of the web. PageRank treats a
`link from web page A to web page B as a “vote” by page A in favor of
`page B. The PageRank of a page is the sum of the PageRank of the
`pages that link to it. The PageRank of a web page also depends on the
`importance (or PageRank) of the other web pages casting the votes.
`Votes cast by important web pages with high PageRank weigh more
`heavily and are more influential in deciding the PageRank of pages on
`the web.4
`
`32.
`
` In short, PageRank (1) uses the hyperlink structure of the Web
`
`(i.e., non-semantic relationships) to (2) create a measure of each webpage’s
`
`4 Ex. 2055: Google Founders’ IPO Letter to Investors, April 2004.
`
`importance and (3) this measure is independent of the query entered. At query
`
`time, query-independent link measures such as PageRank are combined with
`
`traditional query-dependent text measures. Brin and Page visualized a giant
`
`directed graph of nodes and links that represents the Web. The nodes represented
`
`the webpages and the links, the hyperlinks pointing from one page to another.
`
`Adding an analysis of this graph, known as link analysis, to search, which prior to
`
`the SRA patents and development of Google’s PageRank had focused on text
`
`analysis, was innovative and extremely productive. The innovation and novelty
`
`that this technology brought to the search industry will be described in later
`
`sections.
`
`33. By thinking of the Web as a directed graph, Brin and Page treated a
`
`hyperlink from one page to another as a directed relationship between the two
`
`pages. The recursive calculation of their famous PageRank algorithm analyzed
`
`these direct relationships as well as indirect relationships. An indirect relationship
`
`occurs when two pages are connected through at least one intermediate page.
`
`These auxiliary relationships are both useful and essential to the PageRank
`
`algorithm.
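
The recursive calculation described above can be illustrated with a minimal sketch. It is not Google's implementation and is not drawn from any exhibit; it assumes the textbook power-iteration formulation of PageRank, a conventional damping value of 0.85, and an even redistribution of rank held by pages with no outgoing links. The function name `pagerank` and the four-page graph `web` are hypothetical and exist only for this illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Illustrative PageRank by power iteration.

    links: dict mapping a page to the list of pages it links to.
    damping, tol, max_iter: assumed, conventional parameter values.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform score

    for _ in range(max_iter):
        # Rank held by pages without out-links is spread evenly (assumed convention).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes an equal share of its current rank along each out-link,
        # so a page's score depends on the scores of the pages "voting" for it.
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share

        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical graph: A -> B -> C -> A, plus D -> C.
web = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(web)
```

In this toy graph, page D links only to C, yet D's link also raises the score of A, because C passes its score on to A. That is an indirect relationship through one intermediate page, and repeating the update is what carries such relationships through the score.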
`
`34. By mid-1997, Page and Brin had built a working search engine using
`
`the PageRank technology. The initial version of Google ran on Brin’s homepage
`
`on the computer network at Stanford. But it did not take long for word to spread
`
`that Brin and Page had built a better search engine. Many thousands of people
`
`quickly began using Google as their main search tool because it was better than
`
`any of the other search engines that were around at that time. New users were
`
`trying to access the Google server at an exponential rate. Seeing the increased
`
`traffic, the pair took steps to incorporate and patent their ideas.
`
`35.
`
`The first paper about Page and Brin’s project that described PageRank
`
`and the search engine prototype was the now famous and oft-cited “The Anatomy
`
`of a Large-Scale Hypertextual Search Engine,” which was published in 1998 and
`
`delivered at the World Wide Web Conference (see Ex. 2053). In January 1998,
`
`Page filed a patent application (No. 09/004827) for the PageRank algorithm, which
`
`was subsequently granted as U.S. Patent No. 6,285,999 (the “’999 patent” or the
`
`“PageRank patent”) (see Ex. 2086).
`
`36.
`
`Shortly thereafter, in September 1998, Google was incorporated.
`
`37. As set forth above, I was retained by SRA as an expert in the prior
`
`Google litigation. In that case, SRA alleged that Google’s search engine using
`
`PageRank infringed the challenged claims of the SRA patents.
`
`38.
`
`I have reviewed the PageRank patent (i.e., the ’999 patent). I have
`
`also reviewed the SRA patents. I am aware that Daniel Egger’s link analysis work
`
`and the priority date of the SRA patents predate Google’s link analysis work by
`
`more than a year. Specifically, I understand that the application that issued as the
`
`’352 patent was filed on June 14, 1993; the application that issued as the ’494 patent
`
`was filed on May 17, 1996; and the application that issued as the ’571 patent has
`
`an effective date of May 17, 1996. The ’999 patent was filed in January 1998,
`
`more than 4 years after Daniel Egger’s initial filing. I have also reviewed Sergey
`
`Brin and Lawrence Page’s original publications describing PageRank, Ex. 2054:
`
`The PageRank Citation Ranking: Bringing Order to the Web (1998) and Ex. 2053:
`
`The Anatomy of a Large-Scale Hypertextual Web Search Engine (1998), both of
`
`which were published several years after Daniel Egger’s initial filing.
`
`39.
`
`In the prior Google litigation, I consulted with Dr. Brian Davison,
`
`who supervised a team that reviewed what Google represented to be the complete
`
`source code implementation for PageRank. I understand that the team supervised
`
`by Dr. Davison spent over 9,000 hours reviewing code and documentation relating
`
`to the Google search engine. In addition, I personally spent over 200 hours
`
`reviewing the source code and documentation produced by Google in that prior
`
`litigation, particularly that which related to PageRank. Dr. Davison and I both
`
`concluded that PageRank infringes the SRA patents and that Google utilizes a
`
`version of PageRank within its search engine.
`
`40. Dr. Davison was exceptionally qualified to lead the review, as he was
`
`an integral part of the DiscoWeb and Teoma web search engine projects in the late
`
`1990s, which were contemporaneous major competitors to Page and Brin’s work
`
`with PageRank and Google. The Teoma algorithm was eventually acquired and
`
`commercially adopted by the Ask Jeeves search engine in 2001.
`
`41. Attached as Exhibits 2050, 2051, and 2052 are claim charts, based on
`
`publicly available information, demonstrating that Google’s search engine using
`
`PageRank infringes the challenged claims of the SRA patents.
`
`42.
`
`It is my opinion that PageRank is a commercial embodiment of the
`
`inventions covered by the challenged claims of the SRA patents. My testimony
`
`and the documentation supporting this opinion are set forth in the claim charts (see
`
`Exhibits 2050-2052).
`
`VII. THE RESULTS OF WEB-BASED LINK ANALYSIS WERE
`UNEXPECTED
`
`43.
`
`The PageRank algorithm (i.e., the inventions claimed in the SRA
`
`patents) provided significant unexpected results over the prior methods being used
`
`in automated information retrieval and web-based search, as described in more
`
`detail below.
`
`44.
`
`Prior to 1998, search engines relied primarily or exclusively on text
`
`analysis (i.e., semantic relationships) to answer queries. These early, text analysis
`
`search engines are commonly referred to as “first generation search engines.”
`
`45.
`
`Search engine analyst and author John Battelle succinctly describes
`
`how these first generation search engines generally worked:
`
`So how does a search engine work? … In essence, a search engine connects
`words you enter (queries) to a database it has created of Web pages (an
`index). It then produces a list of URLs (and summaries of content) it
`believes are most relevant for your query. While there are experimental
`approaches to search that are not driven by this paradigm, for the most part,
`every major search engine is driven by this text-based analysis. … As Tim
`Bray, a search pioneer now at Sun Microsystems, puts it in his excellent
`series “On Search,” “The fact of the matter is that there really hasn’t been
`much progress in the basic science of how to search since the
`seventies….Before Google, most search engines employed simple
`keyword-based algorithms to determine ranking.”5
`
`46.
`
`Early Web search engines focused on the content on a webpage,
`
`counting, for example, the number of times a term appeared on the page, where it
`
`appeared on the page (e.g., in the title, headings, body, etc.), and in what font (e.g.,
`
`italicized, boldfaced, or capitalized). Pages returned in response to a user query
`
`were then ranked by such content- or text-based measures. Early search engines
`
`competed either (1) to have the largest index and thereby extract the greatest
`
`content from the greatest number of webpages or (2) to seek more information
`
`from the textual content by, for instance, adding proximity features such that a
`
`multi-term phrase appearing on a page with the terms in close proximity to each
`
`5 Ex. 2056: John Battelle, The Search: How Google and Its Rivals Transformed Our
`Culture, Portfolio, 2005, pp. 20-21 and p. 103.
`
`other got a higher ranking than another page that used the same multi-term phrase
`
`yet with the terms spread throughout the page.
`
`47.
`
`Improved computerized search based on link analysis (such as that
`
`claimed in the SRA patents and embodied in Google’s PageRank and IAC’s HITS
`
`algorithms) could not have been predicted based on the prior art. I have reviewed
`
`the declarations of Paul Jacobs. I understand he has concluded based on his review
`
`of the prior art that the pre-1998 art discouraged the use of indirect relationships as
`
`part of computerized search systems.
`
`48.
`
`I attended the April 27 & 28, 2014 deposition of Edward A. Fox,
`
`Ph.D. I also reviewed the transcript of his testimony.6 Fox testified that his
`
`experiments showed in almost all cases that the use of indirect relationships
`
`(including bibliographic coupling and co-citations) “degraded” search results when
`
`compared to using either terms alone or terms in combination with direct links.7
`
`Additionally, he testified that his experiments were specific to the collections he
`
`experimented on and could not be generalized to other collections outside of his
`
`experiment collection (i.e., collections other than the CACM and the ISI
`
`6 See generally Ex. 2016: Deposition of Edward A. Fox, Ph.D., dated April 26, 2014
`(“Fox Depo. Trans. Pt. 1”); Ex. 2017: Deposition of Edward A. Fox, Ph.D., dated April 27, 2014
`(“Fox Depo. Trans. Pt. 2”).
`7 Ex. 2016 at 33:15-23, 56:10-57:21; see also Ex. 2016 at 35:1-36:3, 45:13-48:18,
`51:10-52:8, 138:20-139:23.
`
`collection).8 Dr. Fox further testified that he was unaware of any non-research,
`
`commercial application of his methods to the Web.9
`
`49. Dr. Jacobs’s conclusions and Dr. Fox’s testimony are consistent with
`
`my knowledge of the history and development of search engines, in at least the
`
`following ways.
`
`50. Dr. Fox’s focus on co-citation and bibliographic coupling would not
`
`have led to the significant breakthrough of the inventions claimed in the challenged
`
`claims of the SRA patents and embodied in PageRank because Dr. Fox’s research
`
`makes use of only the two most straightforward indirect relationships, indirect
`
`relationships of length two (i.e., involving two links). There are many other
`
`indirect relationships representing higher order relationships of increasing length
`
`and complexity of links. Compared to Fox’s indirect relationships of co-citation
`
`and bibliographic coupling, PageRank, HITS, and the cluster link generator
`
`claimed in the challenged claims of the ’494 patent examine far more types of
`
`indirect relationships, thereby providing much more complete and less manipulable
`
`information. In fact, PageRank, HITS, and the cluster link generator claimed in the
`
`challenged claims of the ’494 patent employ algorithms that examine indirect
`
`relationships of length up to a specified distance. Similarly, the inventions claimed
`
`8 Id. at 76:8-14, 74:24-75:3, 36:16-37:3.
`9 Ex. 2017 at 288:22-295:12, 296:4-22.
`
`in the challenged claims of the ’352 patent used 18 different patterns of both direct
`
`and indirect relationships to capture this more useful information.
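
The distinction drawn above between length-two relationships and longer indirect relationships can be illustrated with a minimal sketch in standard citation-analysis notation. It is an illustration under stated assumptions only: the adjacency matrix `A`, the helper functions `matmul` and `transpose`, and the four-document example are hypothetical, and the sketch does not represent the method of any patent claim, of PageRank, of HITS, or of Dr. Fox's experiments.

```python
def matmul(x, y):
    """Plain-Python matrix product of two lists of lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def transpose(m):
    """Swap rows and columns."""
    return [list(row) for row in zip(*m)]


# Hypothetical citation graph: A[i][j] = 1 when document i cites document j.
# Document 0 cites 2 and 3; document 1 cites 2; document 2 cites 3.
A = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

# Bibliographic coupling: entry [i][j] counts documents cited by both i and j.
coupling = matmul(A, transpose(A))

# Co-citation: entry [i][j] counts documents that cite both i and j.
cocitation = matmul(transpose(A), A)

# Chains of two links: entry [i][j] counts paths i -> (one intermediate) -> j.
two_step = matmul(A, A)

# Chains of three links; longer chains follow by multiplying by A again.
three_step = matmul(two_step, A)
```

Here documents 0 and 1 are bibliographically coupled through document 2, documents 2 and 3 are co-cited by document 0, and document 1 reaches document 3 only through document 2. Each of these relationships involves exactly two links, whereas repeated multiplication by `A` reaches chains of any chosen length.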
`
`51. Additionally, I am not aware of any publicly disclosed search engine
`
`that employed link analysis prior to Google’s search engine and its PageRank
`
`algorithm. Furthermore, I am not aware of any non-research application or usage
`
`of link analysis as part of a computerized search system that predates Daniel
`
`Egger’s link analysis work and/or the SRA patents. As Stanford computer science
`
`professor Rajeev Motwani 10 describes: “Before this [i.e., Google’s PageRank],
`
`people were only looking at the content. They were completely ignoring the fact
`
`that people were going to the effort of putting a link from one page to another and
`
`that there must be a meaning to that.”11
`
`52.
`
`In my research I have reviewed a substantial number of textbooks on
`
`information retrieval. I am aware of no textbooks published prior to Google that
`
`made any mention of link analysis for information retrieval on the Web; instead,
`
`when explaining how search engines work these textbooks described text analysis.
`
`I am aware that Stephen Levy, longtime technology writer and critic, likewise
`
`10 Rajeev Motwani was Page and Brin’s advisor and a co-author on Ex. 2054: The
`PageRank Citation Ranking: Bringing Order to the Web (1998).
`11 Ex. 2041: Michael Specter, “Search and Deploy,” The New Yorker, May 29, 2000.
`
`describes how pre-1998 “no one at the web search companies mentioned using
`
`links.”12
`
`53. My research has further revealed that the results of web-based link
`
`analysis algorithms were unexpected or surprising, as further demonstrated below.
`
`In short, those skilled in the art, industry members generally, competitors, and
`
`investors alike doubted whether PageRank would improve computerized search
`
`and revolutionize the search engine industry and/or were surprised when it did.
`
`54. BackRub, the first version of Google’s search engine, only used the
`
`titles of the documents in combination with their PageRank link analysis to return
`
`the search query results. Even the first set of results using only the titles of the
`
`documents was surprisingly successful. According to Brin and Page’s advisor at
`
`Stanford at the time, renowned search expert Hector Garcia-Molina, “[e]ven the
`
`first set of results was very convincing. It was pretty clear to everyone who saw
`
`[the] demo that this [i.e., PageRank] was a very good, very powerful way to order
`
`things.”13
`
`55.
`
`It was remarkable and very surprising that BackRub, which used only
`
`the titles of the documents, could outperform AltaVista, Excite, and Yahoo!, which
`
`used extensive text analysis of entire page context, title, metatag information, font
`
`12 Ex. 2048: Stephen Levy, In the Plex: How Google Thinks, Works, and Shapes Our Lives,
`Simon & Schuster, 2011, p. 21.
`13 Ex. 2048: Stephen Levy, In the Plex: How Google Thinks, Works, and Shapes Our Lives,
`Simon & Schuster, 2011, p. 18.
`
`size, proximity indexing, and term location to answer queries. These first
`
`generation search engines prided themselves on the size of their index. Yet
`
`BackRub outperformed them when looking only at the titles. This surprising result
`
`was due to BackRub’s innovative new feature – link analysis, as discussed more
`
`fully below.
`
`56.
`
`The u