throbber
Case 1:14-cv-02396-PGG-SN Document 241-14 Filed 11/12/20 Page 1 of 4
`
`
`
`
`
`
`
` Exhibit 39
`
`
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 241-14 Filed 11/12/20 Page 2 of 4
`
`Inverted index
`
`In computer science, an inverted index (also referred to as a postings file or inverted file) is a
`database index storing a mapping from content, such as words or numbers, to its locations in a table, or
`in a document or a set of documents (named in contrast to a forward index, which maps from documents
`to content). The purpose of an inverted index is to allow fast full-text searches, at a cost of increased
`processing when a document is added to the database. The inverted file may be the database file itself,
`rather than its index. It is the most popular data structure used in document retrieval systems,[1] used on
`a large scale for example in search engines. Additionally, several significant general-purpose mainframe-
`based database management systems have used inverted list architectures, including ADABAS,
`DATACOM/DB, and Model 204.
`
`There are two main variants of inverted indexes: A record-level inverted index (or inverted file
`index or just inverted file) contains a list of references to documents for each word. A word-level
`inverted index (or full inverted index or inverted list) additionally contains the positions of each
`word within a document.[2] The latter form offers more functionality (like phrase searches), but needs
`more processing power and space to be created.
`
`Contents
`Applications
`Compression
`See also
`Bibliography
`References
`External links
`
`Applications
`
`The inverted index data structure is a central component of a typical search engine indexing algorithm. A
`goal of a search engine implementation is to optimize the speed of the query: find the documents where
`word X occurs. Once a forward index is developed, which stores lists of words per document, it is next
`inverted to develop an inverted index. Querying the forward index would require sequential iteration
`through each document and to each word to verify a matching document. The time, memory, and
`processing resources to perform such a query are not always technically realistic. Instead of listing the
`words per document in the forward index, the inverted index data structure is developed which lists the
`documents per word.
`
`With the inverted index created, the query can now be resolved by jumping to the word ID (via random
`access) in the inverted index.
`
`In pre-computer times, concordances to important books were manually assembled. These were
`effectively inverted indexes with a small amount of accompanying commentary that required a
`tremendous amount of effort to produce.
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 241-14 Filed 11/12/20 Page 3 of 4
`In bioinformatics, inverted indexes are very important in the sequence assembly of short fragments of
`sequenced DNA. One way to find the source of a fragment is to search for it against a reference DNA
`sequence. A small number of mismatches (due to differences between the sequenced DNA and reference
`DNA, or errors) can be accounted for by dividing the fragment into smaller fragments—at least one
`subfragment is likely to match the reference DNA sequence. The matching requires constructing an
`inverted index of all substrings of a certain length from the reference DNA sequence. Since the human
`DNA contains more than 3 billion base pairs, and we need to store a DNA substring for every index and a
`32-bit integer for index itself, the storage requirement for such an inverted index would probably be in
`the tens of gigabytes.
`Compression
`
`For historical reasons, inverted list compression and bitmap compression were developed as separate
`lines of research, and only later were recognized as solving essentially the same problem.[3]
`See also
`Index (search engine)
`Reverse index
`Vector space model
`Bibliography
`Knuth, D. E. (1997) [1973]. "6.5. Retrieval on Secondary Keys". The Art of Computer Programming
`(Third ed.). Reading, Massachusetts: Addison-Wesley. ISBN 0-201-89685-0.
`Zobel, Justin; Moffat, Alistair; Ramamohanarao, Kotagiri (December 1998). "Inverted files versus
`signature files for text indexing". ACM Transactions on Database Systems. New York: Association for
`Computing Machinery. 23 (4): 453–490. doi:10.1145/296854.277632 (https://doi.org/10.1145%2F296
`854.277632).
`Zobel, Justin; Moffat, Alistair (July 2006). "Inverted Files for Text Search Engines". ACM Computing
`Surveys. New York: Association for Computing Machinery. 38 (2): 6. doi:10.1145/1132956.1132959
`(https://doi.org/10.1145%2F1132956.1132959).
`Baeza-Yates, Ricardo; Ribeiro-Neto, Berthier (1999). Modern information retrieval. Reading,
`Massachusetts: Addison-Wesley Longman. p. 192. ISBN 0-201-39829-X.
`Salton, Gerard; Fox, Edward A.; Wu, Harry (1983). "Extended Boolean information retrieval".
`Commun. ACM. ACM. 26 (11): 1022. doi:10.1145/182.358466 (https://doi.org/10.1145%2F182.35846
`6). hdl:1813/6351 (https://hdl.handle.net/1813%2F6351).
`Information Retrieval: Implementing and Evaluating Search Engines (http://www.ir.uwaterloo.ca/boo
`k/). Cambridge, Massachusetts: MIT Press. 2010. ISBN 978-0-262-02651-2.
`References
`1. Zobel, Moffat & Ramamohanarao 1998
`2. Baeza-Yates & Ribeiro-Neto 1999, p. 192
`3. Jianguo Wang; Chunbin Lin; Yannis Papakonstantinou; Steven Swanson. "An Experimental Study of
`Bitmap Compression vs. Inverted List Compression" (http://db.ucsd.edu/wp-content/uploads/2017/03/
`sidm338-wangA.pdf). 2017. doi: 10.1145/3035918.3064007
`
`

`

`Case 1:14-cv-02396-PGG-SN Document 241-14 Filed 11/12/20 Page 4 of 4
`External links
`NIST's Dictionary of Algorithms and Data Structures: inverted index (https://xlinux.nist.gov/dads/HTM
`L/invertedIndex.html)
`Managing Gigabytes for Java (http://mg4j.di.unimi.it) a free full-text search engine for large document
`collections written in Java.
`Lucene (http://lucene.apache.org/java/docs/) - Apache Lucene is a full-featured text search engine
`library written in Java.
`Sphinx Search (http://sphinxsearch.com/) - Open source high-performance, full-featured text search
`engine library used by craigslist and others employing an inverted index.
`Example implementations (http://rosettacode.org/wiki/Inverted_Index) on Rosetta Code
`Caltech Large Scale Image Search Toolbox (https://web.archive.org/web/20101203074412/http://ww
`w.vision.caltech.edu/malaa/software/research/image-search/): a Matlab toolbox implementing
`Inverted File Bag-of-Words image search.
`
`Retrieved from "https://en.wikipedia.org/w/index.php?title=Inverted_index&oldid=951292132"
`
`This page was last edited on 16 April 2020, at 13:00 (UTC).
`Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site,
`you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a
`non-profit organization.
`
`

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket