IPR2020-00755, No. 1030 Exhibit - Google Exhibit 1030 Tengi, Design and Implementation of the WordNet Lexical Database and Searching Software (P.T.A.B. Mar. 27, 2020)

`Design and Implementation of the WordNet Lexical Database
`and Searching Software†
`
`Richard Beckwith, George A. Miller, and Randee Tengi
`
`Lexicographers must be concerned with the presentation as well as the content of
`their work, and this concern is heightened when presentation moves from the printed
`page to the computer monitor. Printed dictionaries have become relatively standardized
`through many years of publishing (Vizetelly, 1915); expectations for electronic lexicons
`are still up for grabs. Indeed, computer technology itself is evolving rapidly; an
`indeﬁnite variety of ways to present lexical information is possible with this new
`technology, and the advantages and disadvantages of many possible alternatives are still
`matters for experimentation and debate. Given this degree of uncertainty, manner of
`presentation must be a central concern for the electronic lexicographer.
`WordNet is a pioneering excursion into this new medium. Considerable attention
`has been devoted to making it useful and convenient, but the solutions described here are
`unlikely to be the ﬁnal word on these matters. It is hoped that readers will not merely
`note the shortcomings of this work, but will also be inspired to make improvements on it.
`One’s ﬁrst impression of WordNet is likely to be that it is an on-line thesaurus. It is
`true that sets of synonyms are basic building blocks, and with nothing more than these
`synonym sets the system would have all the power of a thesaurus. When short glosses
`are added to the synonym sets, it resembles an on-line dictionary that has been
`supplemented with synonyms for cross referencing (Calzolari, 1988). But WordNet
`includes much more information than that. In an attempt to model the lexical knowledge
`of a native speaker of English, WordNet has been given detailed information about
`relations between word forms and synonym sets. How this relational structure should be
`presented to a user raises questions that outrun the experience of conventional
`lexicography.
`In developing this on-line lexical database, it has been convenient to divide the
`work into two interdependent tasks which bear a vague similarity to the traditional tasks
`of writing and printing a dictionary. One task was to write the source ﬁles that contain
`the basic lexical data — the contents of those ﬁles are the lexical substance of WordNet.
`The second task was to create a set of computer programs that would accept the source
† This is a revised version of "Implementing a Lexical Network" in CSL Report #43, prepared
by Randee Tengi. UNIX is a registered trademark of UNIX System Laboratories, Inc. Sun, Sun 3
and Sun 4 are trademarks of Sun Microsystems, Inc. Macintosh is a trademark of Macintosh
Laboratory, Inc. licensed to Apple Computer, Inc. NeXT is a trademark of NeXT. Microsoft
Windows is a trademark of Microsoft Corporation. IBM is a registered trademark of International
Business Machines Corporation. X Windows is a trademark of the Massachusetts Institute of
Technology. DECstation is a trademark of Digital Equipment Corporation.
`ﬁles and do all the work leading ultimately to the generation of a display for the user.
`The WordNet system falls naturally into four parts: the WordNet lexicographers’
`source ﬁles; the software to convert these ﬁles into the WordNet lexical database; the
`WordNet lexical database; and the suite of software tools used to access the database.
`The WordNet system is developed on a network of Sun-4 workstations. The software
`programs and tools are written using the C programming language, Unix utilities, and
`shell scripts. To date, WordNet has been ported to the following computer systems:
`Sun-3; DECstation; NeXT; IBM PC and PC clones; Macintosh.
`The remainder of this paper discusses general features of the design and
`implementation of WordNet. The ‘‘WordNet Reference Manual’’ is a set of manual
`pages that describe aspects of the WordNet system in detail, particularly the user
`interfaces and ﬁle formats. Together the two provide a fairly comprehensive view of the
`WordNet system.
`
`Index of Familiarity
`One of the best known and most important psycholinguistic facts about the mental
`lexicon is that some words are much more familiar than others. The familiarity of a word
`is known to inﬂuence a wide range of performance variables: speed of reading, speed of
`comprehension, ease of recall, probability of use. The effects are so ubiquitous that
`experimenters who hope to study anything else must take great pains to equate the words
`they use for familiarity. To ignore this variable in a lexical database that is supposed to
`reﬂect psycholinguistic principles would be unthinkable.
`In order to incorporate differences in familiarity into WordNet, a syntactically
`tagged index of familiarity is associated with each word form. This index does not
`reﬂect all of the consequences of differences of familiarity — some theorists would ask
`for strength indices associated with each relation — but accurate information on all of
`the consequences is not easily obtained. The present index is a ﬁrst step.
`Frequency of use is usually assumed to be the best indicator of familiarity. The
`closed class words that play an important syntactic role are the most frequently used, of
`course, but even within the open classes of words there are large differences in frequency
`of occurrence that are assumed to correlate with — or to explain — the large differences
`in familiarity. The frequency data that are readily available in the technical literature,
`however, are inadequate for a database as extensive as WordNet. Thorndike and Lorge
`(1944) published data based on a count of some 5,000,000 running words of text, but
`they reported their results only for the 30,000 most frequent words. Moreover, they
`deﬁned a ‘‘word’’ as any string of letters between successive spaces, so their counts for
`homographs are untrustworthy; there is no way to tell, for example, how often lead
occurred as a noun and how often as a verb. Francis and Kučera (1982) tag words for
`their syntactic category, but they report results for only 1,014,000 running words of text
`— or 50,400 word types, including many proper names — which is not a large enough
`sample to yield reliable counts for infrequently used words. (A comfortable rate of
`speaking is about 120 words/minute, so that 1,000,000 words corresponds to 140 hours,
`or about two weeks of normal exposure to language.)
`Fortunately, an alternative indicator of familiarity is available. It has been known at
`least since Zipf (1945) that frequency of occurrence and polysemy are correlated. That is
`to say, on the average, the more frequently a word is used the more different meanings it
will have in a dictionary. An intriguing finding in psycholinguistics (Jastrzembski,
`1981) is that polysemy seems to predict lexical access times as well as frequency does.
`Indeed, if the effect of frequency is controlled by choosing words of equivalent
`frequencies, polysemy is still a signiﬁcant predictor of lexical decision times.
`Instead of using frequency of occurrence as an index of familiarity, therefore,
`WordNet uses polysemy. This measure can be determined from an on-line dictionary. If
`an index value of 0 is assigned to words that do not appear in the dictionary, and if values
`of 1 or more are assigned according to the number of senses the word has, then an index
value can be made available for every word in every syntactic category. Associated with
every word form in WordNet, therefore, there is an integer that represents a count, taken
from the Collins Dictionary of the English Language, of the number of senses that word
form has when it is used as a noun, verb, adjective, or adverb.
A simple example of how the familiarity index might be used is shown in Table 1.
If, say, the superordinates of bronco are requested, WordNet can respond with the
sequence of hypernyms shown in Table 1. Now, if all the terms with a familiarity index
(polysemy count) of 0 or 1, which are primarily technical terms, are omitted, the
hypernyms of bronco include simply: bronco @→ pony @→ horse @→ animal @→ organism @→
entity. This shortened chain is much closer to what a layman would
expect. The index of familiarity should be useful, therefore, when making suggestions
for changes in wording. A user can search for a more familiar word by inspecting the
polysemy in the WordNet hierarchy.
`WordNet would be a better simulation of human semantic memory if a familiarity
`index could be assigned to word-meaning pairs rather than to word forms. The noun tie,
`for example, is used far more often with the meaning {tie, necktie} than with the
`meaning {tie, tie beam}, yet both are presently assigned the same index, 13.
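
The filtering of the bronco chain described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not WordNet code: the function name familiar_chain and the min_polysemy threshold are hypothetical, and the data are copied from Table 1.

```python
# Hypernym chain of "bronco" with polysemy counts, copied from Table 1.
BRONCO_CHAIN = [
    ("bronco", 1), ("mustang", 1), ("pony", 5), ("horse", 14),
    ("equine", 0), ("odd-toed ungulate", 0), ("placental mammal", 0),
    ("mammal", 1), ("vertebrate", 1), ("chordate", 1),
    ("animal", 4), ("organism", 2), ("entity", 3),
]


def familiar_chain(chain, min_polysemy=2):
    """Keep the search word itself, then only hypernyms whose
    polysemy count is at least min_polysemy (hypothetical threshold)."""
    head, rest = chain[0], chain[1:]
    return [head[0]] + [word for word, count in rest if count >= min_polysemy]


print(" @→ ".join(familiar_chain(BRONCO_CHAIN)))
# prints: bronco @→ pony @→ horse @→ animal @→ organism @→ entity
```

With the default threshold, every term whose index is 0 or 1 is dropped, which reproduces the shortened chain given in the text.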
`
`Lexicographers’ Source Files
`WordNet’s source ﬁles are written by lexicographers. They are the product of a
`detailed relational analysis of lexical semantics: a variety of lexical and semantic
`relations are used to represent the organization of lexical knowledge. Two kinds of
`building blocks are distinguished in the source ﬁles: word forms and word meanings.
`Word forms are represented in their familiar orthography; word meanings are represented
`by synonym sets — lists of synonymous word forms that are interchangeable in some
`syntax. Two kinds of relations are recognized: lexical and semantic. Lexical relations
`hold between word forms; semantic relations hold between word meanings.
`WordNet organizes nouns, verbs, adjectives and adverbs into synonym sets
`(synsets), which are further arranged into a set of lexicographers’ source ﬁles by syntactic
`category and other organizational criteria. Adverbs are maintained in one ﬁle, while
`nouns and verbs are grouped according to semantic ﬁelds. Adjectives are divided
`between two ﬁles: one for descriptive adjectives and one for relational adjectives.
`
Hypernyms of bronco and their index values

Word                     Polysemy
---------------------------------
bronco                       1
@→ mustang                   1
@→ pony                      5
@→ horse                    14
@→ equine                    0
@→ odd-toed ungulate         0
@→ placental mammal          0
@→ mammal                    1
@→ vertebrate                1
@→ chordate                  1
@→ animal                    4
@→ organism                  2
@→ entity                    3

Table 1
`
`Appendix A lists the names of the lexicographers’ source ﬁles.
`Each source ﬁle contains a list of synsets for one part of speech. Each synset
`consists of synonymous word forms, relational pointers, and other information. The
`relations represented by these pointers include (but are not limited to):
`hypernymy/hyponymy, antonymy, entailment, and meronymy/holonymy. Polysemous
`word forms are those that appear in more than one synset, therefore representing more
`than one concept. A lexicographer often enters a textual gloss in a synset, usually to
`provide some insight into the semantics intended by the synonymous word forms and
`their usage. If present, the textual gloss is included in the database and can be displayed
`by retrieval software. Comments can be entered, outside of a synset, by enclosing the
`text of the comment in parentheses, and are not included in the database.
`Descriptive adjectives are organized into clusters that represent the values, from one
`extreme to the other, of some attribute. Thus each adjective cluster has two (occasionally
`three) parts, each part headed by an antonymous pair of word forms called a head synset.
`Most head synsets are followed by one or more satellite synsets, each representing a
`concept that is similar in meaning to the concept represented by the head synset. One
`way to think of the cluster organization is to visualize a wheel, with each head synset as a
`hub and its satellite synsets as the spokes. Two or more wheels are logically connected
`via antonymy, which can be thought of as an axle between wheels.
`The Grinder utility compiles the lexicographers’ ﬁles. It veriﬁes the syntax of the
`ﬁles, resolves the relational pointers, then generates the WordNet database that is used
`with the retrieval software and other research tools.
`
`Word Forms
`In WordNet, a word form is represented as the orthographic representation of an
`individual word or a string of individual words joined with underscore characters. A
`string of words so joined is referred to as a collocation and represents a single concept,
`such as the noun collocation fountain_pen.
`In the lexicographers’ ﬁles a word form may be augmented with additional
`information, necessary for the correct processing and interpretation of the data. An
`integer sense number is added for sense disambiguation if the same word form appears
`more than once in a lexicographer ﬁle. A syntactic marker, enclosed in parentheses, is
`added to any adjectival word form whose use is limited to a speciﬁc syntactic position in
`relation to the noun that it modiﬁes. Each word form in WordNet is known by its
`orthographic representation, syntactic category, semantic ﬁeld, and sense number.
`Together, these data make a ‘‘key’’ which uniquely identiﬁes each word form in the
`database.
`
`Relational Pointers
Relational pointers represent the relations between the word forms in a synset and
other synsets, and are either lexical or semantic. Lexical relations exist between
relational adjectives and the nouns that they relate to, and between adverbs and the
adjectives from which they are derived. The semantic relations between adjectives and
the nouns for which they express values are encoded as attributes. The semantic relations
between noun attributes and the adjectives expressing their values are also encoded.
Presently these are the only pointers that cross from one syntactic category to another.
Antonyms are also lexically related. Synonymy of word forms is implicit by inclusion in
the same synset. Table 2 summarizes the relational pointers by syntactic category.
`Meronymy is further speciﬁed by appending one of the following characters to the
`meronymy pointer: p to indicate a part of something; s to indicate the substance of
`something; m to indicate a member of some group. Holonymy is speciﬁed in the same
`manner, each pointer representing the semantic relation opposite to the corresponding
`meronymy relation.
`Many pointers are reﬂexive, meaning that if a synset contains a pointer to another
`synset, the other synset should contain a corresponding reﬂexive pointer back to the
`original synset. The Grinder automatically generates the relations for missing reﬂexive
`pointers of the types listed in Table 3.
`A relational pointer can be entered by the lexicographer in one of two ways. If a
`pointer is to represent a relation between synsets — a semantic relation — it is entered
`following the list of word forms in the synset. Hypernymy always relates one synset to
`another, and is an example of a semantic relation. The lexicographer can also enclose a
`word form and a list of pointers within square brackets ([...]) to deﬁne a lexical relation
`between word forms. Relational adjectives are entered in this manner, showing the
`lexical relation between the adjective and the noun that it pertains to.
`
WordNet Relational Pointers

Noun           Verb            Adjective            Adverb
------------------------------------------------------------------
Antonym    !   Antonym     !   Antonym          !   Antonym       !
Hyponym    ~   Troponym    ~   Similar          &   Derived from  \
Hypernym   @   Hypernym    @   Relational Adj.  \
Meronym    #   Entailment  *   Also See         ^
Holonym    %   Cause       >   Attribute        =
Attribute  =   Also See    ^
------------------------------------------------------------------

Table 2
`
Reflexive Pointers

Pointer       Reflect
-----------------------
Antonym       Antonym
Hyponym       Hypernym
Hypernym      Hyponym
Holonym       Meronym
Meronym       Holonym
Similar to    Similar to
Attribute     Attribute
-----------------------

Table 3
`
`Verb Sentence Frames
`Each verb synset contains a list of verb frames illustrating the types of simple
`sentences in which the verbs in the synset can be used. A list of verb frames can be
`restricted to a word form by using the square bracket syntax described above. See
`Appendix B for a list of the verb sentence frames.
`
`Synset Syntax
`Strings in the source ﬁles that conform to the following syntactic rules are treated as
`synsets. Note that this is a brief description of the general synset syntax and is not a
`formal description of the source ﬁle format. A formal speciﬁcation is found in the
`manual page wninput(5) of the ‘‘WordNet Reference Manual’’.
`
`[1] Each synset begins with a left curly bracket ({).
`[2] Each synset is terminated with a right curly bracket (}).
`[3] Each synset contains a list of one or more word forms, each followed by a
`comma.
`[4] To code semantic relations, the list of word forms is followed by a list of
`relational pointers using the following syntax: a word form (optionally preceded
`by "ﬁlename:" to indicate a word form in a different lexicographer ﬁle) followed
`by a comma, followed by a relational pointer symbol.
`[5] For verb synsets, "frames:" is followed by a comma separated list of applicable
`verb frames. The verb frames follow all relational pointers.
`[6] To code lexical relations, a word form is followed by a list of elements from [4]
`and/or [5] inside square brackets ([...]).
`[7] To code adjective clusters, each part of a cluster (a head synset, optionally
`followed by satellite synsets) is separated from other parts of a cluster by a line
`containing only hyphens. Each entire cluster is enclosed in square brackets.
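
As an informal illustration of rules [1] through [4], the following Python sketch parses a synset that contains only word forms and semantic pointers. It is a deliberate simplification and not the Grinder's parser: the function name parse_simple_synset is hypothetical, and glosses, verb frames, square-bracket lexical relations, and adjective clusters are not handled. The formal specification remains wninput(5).

```python
def parse_simple_synset(text):
    """Parse a synset that uses only rules [1]-[4].

    Returns (word_forms, pointers), where pointers is a list of
    (target_word_form, pointer_symbol) pairs.
    """
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("a synset must be enclosed in curly brackets")

    word_forms, pointers = [], []
    for token in text[1:-1].split():
        target, comma, symbol = token.partition(",")
        if not comma:
            raise ValueError(f"missing comma after {token!r}")
        if symbol:
            # e.g. "mustang,@": a word form, a comma, then a pointer symbol
            pointers.append((target, symbol))
        elif pointers:
            raise ValueError("word forms must precede relational pointers")
        else:
            # e.g. "bronco,": a plain word form followed by a comma
            word_forms.append(target)

    if not word_forms:
        raise ValueError("a synset needs at least one word form")
    return word_forms, pointers


print(parse_simple_synset("{ bronco, mustang,@ }"))
# prints: (['bronco'], [('mustang', '@')])
```

A target written with a "filename:" prefix is kept whole in the target string, since the sketch splits each token only at its first comma.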
`
`Archive System
`The lexicographers’ source ﬁles are maintained in an archive system based on the
`Unix Revision Control System (RCS) for managing multiple revisions of text ﬁles. The
`archive system has been established for several reasons — to allow the reconstruction of
`any version of the WordNet database, to keep a history of all the changes to
`lexicographers’ ﬁles, to prevent people from making conﬂicting changes to the same ﬁle,
`and to ensure that it is always possible to produce an up-to-date version of the WordNet
`database. The programs in the archive system are Unix shell scripts which envelop RCS
`commands in a manner that maintains the desired control over the lexicographers’ source
`ﬁles and provides a user-friendly interface for the lexicographers.
`The reserve command extracts from the archive the most recent revision of a given
`ﬁle or ﬁles and locks the ﬁle for as long as a user is working on it. The review command
`extracts from the archive the most recent revision of a given ﬁle or ﬁles for the purpose
`of examination only, therefore the ﬁle is not locked. To discourage making changes,
`review ﬁles do not have write permission since any such changes could not be
`incorporated into the archive. The restore command veriﬁes the integrity of a reserved
`ﬁle and returns it to the archive system. The release command is used to break a lock
`placed on a ﬁle with the reserve command. This is generally used if the lexicographer
`decides that changes should not be returned to the archive. The whose command is used
`to ﬁnd out whether ﬁles are currently reserved, and if so, by whom.
`
`Grinder Utility
`The Grinder is a versatile utility with the primary purpose of compiling the
`lexicographers’ ﬁles into a database format that facilitates machine retrieval of the
`information in WordNet. The Grinder has several options that control its operation on a
`set of input ﬁles. To build a complete WordNet database, all of the lexicographers’ ﬁles
`must be processed at the same time. The Grinder is also used as a veriﬁcation tool to
`ensure the syntactic integrity of the lexicographers’ ﬁles when they are returned to the
`archive system with the restore command.
`
`Implementation
`The Grinder is a multi-pass compiler that is coded in C. The ﬁrst pass uses a parser,
`written in yacc and lex, to verify that the syntax of the input ﬁles conforms to the
`speciﬁcation of the input grammar and lexical items, and builds an internal representation
`of the parsed synsets. Additional passes refer only to this internal representation of the
`lexicographic data. Pass one attempts to ﬁnd as many syntactic and structural errors as
`possible. Syntactic errors are those in which the input ﬁle fails to conform to the input
`grammar’s speciﬁcation, and structural errors refer to relational pointers that cannot be
`resolved for some reason. Usually these errors occur because the lexicographer has made
`a typographical error, such as constructing a pointer to a non-existent ﬁle, or fails to
`specify a sense number when referring to an ambiguous word form. Pass one cannot
`determine structural errors in pointers to ﬁles that are not processed together. When used
`as a veriﬁcation tool, as from the restore command, only pass one is run.
`In its second pass, the Grinder resolves all of the semantic and lexical pointers. To
`do this, the pointers that were speciﬁed in each synset are examined in turn, and the
`target of each pointer (either a synset or a word form in a synset) is found. The source
`pointer is then resolved by adding an entry to the internal data structure which notes the
`‘‘location’’ of the target. In the case of reﬂexive pointers, the target pointer’s synset is
`then searched for a corresponding reﬂexive pointer. If found, the data structure
`representing the reﬂexive pointer is modiﬁed to note the ‘‘location’’ of its target, the
`original source pointer. If a reﬂexive pointer is not found, the Grinder automatically
`creates one with all the pertinent information.
`A subsequent pass through the list of word forms assigns a polysemy index value, or
`sense count, to each word form found in the on-line dictionary. There is a separate sense
`count for each syntactic category that the word form is found in. The Grinder’s ﬁnal pass
`generates the WordNet database.
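
The handling of reflexive pointers in the second pass can be sketched as follows. This is a minimal Python illustration, assuming a much simpler in-memory structure than the Grinder's linked lists: synsets are identified by hypothetical string ids, and each synset holds a set of (pointer type, target id) pairs. The pairing of pointer types is taken from Table 3.

```python
# Reflexive pairs from Table 3.
REFLECT = {
    "antonym": "antonym",
    "hyponym": "hypernym",
    "hypernym": "hyponym",
    "holonym": "meronym",
    "meronym": "holonym",
    "similar to": "similar to",
    "attribute": "attribute",
}


def add_missing_reflexive_pointers(synsets):
    """synsets maps a synset id to a set of (pointer_type, target_id) pairs.

    For every pointer whose type appears in REFLECT, make sure the target
    synset points back; return the list of pointers that had to be created.
    A target id that is missing from synsets raises KeyError, which stands
    in here for an unresolvable (structural) pointer error.
    """
    added = []
    for source, pointers in synsets.items():
        for ptr_type, target in list(pointers):  # snapshot: sets are mutated below
            back = REFLECT.get(ptr_type)
            if back is None:
                continue  # not a reflexive pointer type
            if (back, source) not in synsets[target]:
                synsets[target].add((back, source))
                added.append((target, back, source))
    return added


synsets = {"pony": {("hypernym", "horse")}, "horse": set()}
print(add_missing_reflexive_pointers(synsets))
# prints: [('horse', 'hyponym', 'pony')]
```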
`
`Internal Representation
`The internal representation of the lexicographic data is a network of interrelated
`linked lists. A hash table of word forms is created as the lexicographers’ ﬁles are parsed.
`Lower-case strings are used as keys; the original orthographic word form, if not in
`lower-case, is retained as part of the data structure for inclusion in the database ﬁles. As
`the parser processes an input ﬁle, it calls functions which create data structures for the
word forms, pointers, and verb frames in a synset. Once an entire synset has been
`parsed, a data structure is created for it which includes pointers to the various structures
`representing the word forms, pointers, and verb frames. All of the synsets from the input
`ﬁles are maintained as a single linked list. The Grinder’s different passes access the
`structures either through the linked list of synsets or the hash table of word forms. A list
`of synsets that specify each word form is maintained for the purposes of resolving
`pointers and generating the database’s index ﬁles.
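
The word-form table described above might be approximated as in the following Python sketch. A dictionary stands in for the Grinder's hash table, and the function name build_word_table and the field names "spellings" and "synsets" are assumptions made for illustration only.

```python
def build_word_table(synsets):
    """synsets: a list of word-form lists, one per synset, in input order.

    Keys are lower-case strings. The original spelling is retained only
    when it differs from the key, and each entry records the indices of
    the synsets that specify the word form.
    """
    table = {}
    for synset_index, word_forms in enumerate(synsets):
        for form in word_forms:
            key = form.lower()
            entry = table.setdefault(key, {"spellings": set(), "synsets": []})
            if form != key:
                entry["spellings"].add(form)  # keep original orthography
            entry["synsets"].append(synset_index)
    return table


table = build_word_table([["tie", "necktie"], ["tie", "tie_beam"], ["English"]])
print(table["tie"]["synsets"])        # prints: [0, 1]
print(table["english"]["spellings"])  # prints: {'English'}
```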
`
`WordNet Database
`For each syntactic category, two ﬁles represent the WordNet database — index.pos
`and data.pos, where pos is either noun, verb, adj or adv (the actual ﬁle names may be
`different on platforms other than Sun-4). The database is in an ASCII format that is
`human- and machine-readable, and is easily accessible to those who wish to use it with
`their own applications. Each index ﬁle is an alphabetized list of all of the word forms in
`WordNet for the corresponding syntactic category. Each data ﬁle contains all of the
`lexicographic data gathered from the lexicographers’ ﬁles for the corresponding syntactic
`category, with relational pointers resolved to addresses in data ﬁles.
`The index and data ﬁles are interrelated. Part of each entry in an index ﬁle is a list
`of one or more byte offsets, each indicating the starting address of a synset in a data ﬁle.
`The ﬁrst step to the retrieval of synsets or other information is typically a search for a
`word form in one or more index ﬁles to obtain all data ﬁle addresses of the synsets
`containing the word form. Each address is the byte offset (in the data ﬁle corresponding
`to the syntactic category of the index ﬁle) at which the synset’s information begins. The
`information pertaining to a single synset is encoded as described in the Data Files
`section below.
`One shortcoming of the database’s structure is that although all the ﬁles are in
`ASCII, and are therefore editable, and in theory extensible, in practice this is almost
`impossible. One of the Grinder’s primary functions is the calculation of addresses for the
`synsets in the data ﬁles. Editing any of the database ﬁles would (most likely) create
`incorrect byte offsets, and would thus derail many searching strategies. At the present
`time, building a WordNet database requires the use of the Grinder and the processing of
`all lexicographers’ source ﬁles at the same time.
`The descriptions of the Index and Data ﬁles that follow are brief and are intended to
`provide only a glimpse into the structure, syntax, and organization of the database. More
`detailed descriptions can be found in the manual page wndb(5) included in the
`‘‘WordNet Reference Manual’’.
`
`Index Files
`Word forms in an index ﬁle are in lower case regardless of how they were entered in
`the lexicographers’ ﬁles. The ﬁles are sorted according to the ASCII character set
`collating sequence and can be searched quickly with a binary search.
`Each index ﬁle begins with several lines containing a copyright notice, version
`number and license agreement, followed by the data lines. Each line of data contains the
`following information: the sense count from the on-line dictionary; a list of the relational
`pointer types used in all synsets containing the word (this is used by the retrieval
`software to indicate to a user which searches are applicable); a list of indices which are
`byte offsets into the corresponding data ﬁle, one for each occurrence of the word form in
`a synset. Each data line is terminated with an end-of-line character.
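
A binary search over sorted index lines can be sketched in Python as follows. The line layout used here ("word sense_count offset ...") is a hypothetical simplification that omits the list of pointer types, and the byte offsets are invented for the example; only the sense counts for horse, pony, and tie come from this paper. The actual layout is given in wndb(5).

```python
import bisect

# Hypothetical, simplified index lines: "word sense_count offset [offset ...]".
# The offsets are made-up values used only for illustration.
INDEX_LINES = sorted([
    "horse 14 2040 5120",
    "pony 5 1024",
    "tie 13 7000 7300",
])


def lookup(index_lines, word):
    """Binary-search sorted index lines for word.

    Returns (sense_count, byte_offsets), or None if the word is absent.
    Word forms are stored in lower case with underscores joining the
    parts of a collocation.
    """
    key = word.lower().replace(" ", "_")
    # Extracting all keys is linear; a real implementation would bisect the
    # file itself. The list is built here only to keep the sketch short.
    keys = [line.split(" ", 1)[0] for line in index_lines]
    pos = bisect.bisect_left(keys, key)
    if pos == len(keys) or keys[pos] != key:
        return None
    fields = index_lines[pos].split()
    return int(fields[1]), [int(f) for f in fields[2:]]


print(lookup(INDEX_LINES, "Pony"))   # prints: (5, [1024])
print(lookup(INDEX_LINES, "zebra"))  # prints: None
```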
`
`Data Files
`A data ﬁle contains information corresponding to the synsets that were deﬁned in
`the lexicographers’ ﬁles with pointers resolved to byte offsets in data.pos ﬁles.
`Each data ﬁle begins with several lines containing a copyright notice, version
`number and license agreement. This is followed by a list of the names of all the input
`ﬁles that were speciﬁed to the Grinder, in the order that they were given on the command
`line, followed by the data lines. Each line of data contains an encoding of the
`information entered by the lexicographer for a synset, as well as additional information
`provided by the Grinder which is useful to the retrieval software and other programs.
`Each data line is terminated with an end-of-line character. In the data ﬁles, word forms
`in a synset match the orthographic representation entered in the lexicographers’ ﬁles.
`The ﬁrst piece of information on each line is the byte offset, or address, of the
`synset. This is slightly redundant, since almost any computer program that reads a synset
`from a data ﬁle knows the byte offset that it read it from; however this piece of
`information is useful when using UNIX utilities like grep to trace synsets and pointers
`without the use of sophisticated software. It also provides a unique ‘‘key’’ for a synset,
`if a user’s application requires one. An integer, corresponding to the location in the list
`of ﬁle names of the ﬁle from which the synset originated, follows. This can be used by
`retrieval software to annotate the display of a synset with the name of the originating ﬁle,
`and can be helpful for distinguishing senses. A list of word forms, relational pointers,
`and verb frames follows. An optional textual gloss is the ﬁnal component of a data line.
`Relational pointers are represented by several pieces of information. The symbol
`for the pointer comes ﬁrst, followed by the address of the target synset and its syntactic
`category (necessary for pointers that cross over into a different syntactic category),
`followed by a ﬁeld which differentiates lexical and semantic pointers. If a lexical pointer
`is being represented, this ﬁeld indicates which word forms in the source and target
`synsets the pointer pertains to. For a semantic pointer, this ﬁeld is 0.
`
`Retrieving Lexical Information
`In order to give a user access to information in the database, an interface is required.
`Interfaces enable end users to retrieve the lexical data and display it via a window-based
`tool or the command line. When considering the role of the interface, it is important to
`recognize the difference between a printed dictionary and a lexical database. WordNet’s
`interface software creates its responses to a user’s requests on the ﬂy. Unlike an on-line
`version of a printed dictionary, where information is stored in a ﬁxed format and
`displayed on demand, WordNet’s information is stored in a format that would be
`meaningless to an ordinary reader. The interface provides a user with a variety of ways
`to retrieve and display lexical information. Different interfaces can be created to serve
`the purposes of different users, but all of them will draw on the same underlying lexical
`database, and may use the same software functions that interface to the database ﬁles.
`User interfaces to WordNet can take on many forms. The standard interface is an X
`Windows application, which has been ported to several computer platforms. Microsoft
`Windows and Macintosh interfaces have also been written. An alternative command line
`interface allows the user to retrieve the same data, with exactly the same output as the
`window-based interfaces, although the speciﬁcation of the retrieval criteria is more
`cumbersome, and the whole effect is less impressive. Nevertheless, the command line
`interface is useful because some users do not have access to windowing environments.
`Shell scripts and other programs can also be written around the command line interface.
`The search process is the same regardless of the type of search requested. The ﬁrst
`step is to retrieve the index entry located in the appropriate index ﬁle. This will contain a
`list of addresses of the synsets in the data ﬁle in which the word appears. Then each of
`the synsets in the data ﬁle is searched for the requested information, which is retrieved
`and formatted for output. Searching is complicated by the fact that each synset
`containing the search word also contains pointers to other synsets in the data ﬁle that may
`need to be retrieved and displayed, depending on the search type. For example, each
synset in the hypernymic pathway points to the next synset in the hierarchy. If a user
requests a recursive search on hypernyms, the retrieval process is repeated until a
synset is encountered that contains no further pointers.
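
The recursive retrieval of hypernyms through byte offsets can be illustrated with a toy data file held in memory. The Python sketch below assumes a hypothetical line layout ("offset word [@ hypernym_offset]") that is far simpler than the real data file format, and the helper names build_data_file and trace_hypernyms are invented for the example. The chain itself is the bronco hierarchy of Table 1, with collocations joined by underscores.

```python
import io

# Table 1 chain, listed from the top of the hierarchy down.
CHAIN = [
    "entity", "organism", "animal", "chordate", "vertebrate", "mammal",
    "placental_mammal", "odd-toed_ungulate", "equine", "horse", "pony",
    "mustang", "bronco",
]


def build_data_file(chain_top_down):
    """Write one line per synset; each line starts with its own byte offset
    and, except for the topmost synset, ends with '@ <hypernym offset>'."""
    buf = io.BytesIO()
    offsets = {}
    parent = None
    for word in chain_top_down:
        offset = buf.tell()
        line = f"{offset:08d} {word}"
        if parent is not None:
            line += f" @ {parent:08d}"
        buf.write((line + "\n").encode("ascii"))
        offsets[word] = offset
        parent = offset
    return buf, offsets


def trace_hypernyms(data, offset):
    """Seek to offset, read the synset, and follow '@' pointers recursively
    until a synset with no further pointer is reached."""
    data.seek(offset)
    fields = data.readline().decode("ascii").split()
    word = fields[1]
    if "@" not in fields:
        return [word]
    parent_offset = int(fields[fields.index("@") + 1])
    return [word] + trace_hypernyms(data, parent_offset)


data, offsets = build_data_file(CHAIN)
print(trace_hypernyms(data, offsets["pony"])[:3])
# prints: ['pony', 'horse', 'equine']
```

The starting offset would normally come from an index file lookup; here it is taken from the dictionary returned by the hypothetical build_data_file helper.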
`The user interfaces to WordNet and other software tools rely upon a library of
`functions that interface to the database files. A fairly comprehensive set of functions is
`provided: they handle searching and retrieval, morphology, and various other utility
`tasks. Appendix C contains a brief description of these functions. The structured,
`flexible design of the library provides a simple programming interface to the WordNet
`database. Low-level, complex, and utility functions are included. The user interface
`software depends upon the more complex functions to perform the actual data retrieval
`and formatting of the search results for display to the user. Low-level functions provide
`basic access to the lexical data in the index and data files, while shielding the
`programmer from the details of opening files, reading files, and parsing a line of data.
`These functions return the requested information in a data structure that can be
`interpreted and used as required by the application. Utility functions allow simple
`manipulations of the search strings.
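`The division between low-level and utility functions might look something like the
`sketch below. Everything in it is hypothetical: the record layout
`(address|words|hypernym), the Synset structure, and the function names are invented
`for illustration and do not reproduce the WordNet file format or library interface.

```python
from dataclasses import dataclass
from typing import List, Optional
import io


@dataclass
class Synset:
    """Parsed record handed back to the application (hypothetical layout)."""
    address: int
    words: List[str]
    hypernym: Optional[int]


def normalize_search_string(text):
    """Utility level: lowercase the string and join its words with underscores."""
    return "_".join(text.lower().split())


def parse_synset_line(line):
    """Low level: turn one raw line of the toy format into a Synset."""
    address, words, hypernym = line.rstrip("\n").split("|")
    return Synset(int(address), words.split(","), int(hypernym) if hypernym else None)


def read_synset(data_file, address):
    """Low level: scan an open file for a record, hiding the reading and parsing."""
    data_file.seek(0)
    for line in data_file:
        if line.startswith(f"{address}|"):
            return parse_synset_line(line)
    return None


if __name__ == "__main__":
    toy_file = io.StringIO("100|robin,redbreast|200\n200|bird|\n")
    print(read_synset(toy_file, 100))
    print(normalize_search_string("  Hot  Dog "))
```

`The caller receives a structured record and never touches the raw line, which is the
`kind of shielding the low-level functions are described as providing.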
`The basic searching function, findtheinfo(), receives as its input arguments a word
`form, syntactic category, and search type; findtheinfo() calls a low-level function to find
`the corresponding entry in the index file, and for each sense calls the appropriate function
`to trace the pointer corresponding to the search type. Most traces are done with the
`function traceptrs(), but specialized functions exist for search types which do not
`conform to the standard hierarchical search. As a synset is retrieved from the database, it
`is formatted as required by the search type into a large output buffer. The resulting
`buffer, containing all of the formatted synsets for all of the senses of the search word, is
`returned to the caller. The calling function simply has to print the buffer returned from
`findtheinfo().
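`The control flow just described (index lookup, one trace per sense, formatting into a
`single buffer) can be approximated by the following sketch. It is written in Python for
`brevity and is not the library's code: find_info_sketch, trace_pointers, no_trace, the
`output layout, and the toy INDEX and DATA tables are all assumed names and contents
`chosen for the example.

```python
# Hypothetical in-memory database: one index and one data table per category.
INDEX = {"noun": {"bank": [10, 20]}}
DATA = {
    "noun": {
        10: {"words": ["bank", "depository"], "hypernym": 30},
        20: {"words": ["bank", "riverbank"], "hypernym": 40},
        30: {"words": ["financial institution"], "hypernym": None},
        40: {"words": ["slope"], "hypernym": None},
    }
}


def trace_pointers(category, address, pointer, depth=1):
    """Generic hierarchical trace: follow one pointer type until it runs out."""
    lines = []
    target = DATA[category][address][pointer]
    while target is not None:
        synset = DATA[category][target]
        lines.append("  " * depth + "=> " + ", ".join(synset["words"]))
        target = synset[pointer]
        depth += 1
    return lines


def no_trace(category, address, pointer):
    """Specialized handler for a search type that has nothing to trace."""
    return []


# Search types that do not follow the standard hierarchical trace.
SPECIAL_HANDLERS = {"synonyms": no_trace}


def find_info_sketch(word, category, search_type):
    """Look up the word, trace each sense, and return one formatted buffer."""
    buffer = []
    addresses = INDEX.get(category, {}).get(word, [])
    for sense, address in enumerate(addresses, start=1):
        synset = DATA[category][address]
        buffer.append(f"Sense {sense}")
        buffer.append(", ".join(synset["words"]))
        handler = SPECIAL_HANDLERS.get(search_type, trace_pointers)
        buffer.extend(handler(category, address, search_type))
    return "\n".join(buffer)


if __name__ == "__main__":
    # The caller only has to print the returned buffer.
    print(find_info_sketch("bank", "noun", "hypernym"))
```

`Dispatching through a table of handlers is one way to accommodate search types that
`do not fit the generic trace; the original may organize this differently.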
`This general search and retrieval algorithm is used in s
