The SMART Automatic
Document Retrieval System
An Illustration

G. SALTON AND M. E. LESK
Harvard University*, Cambridge, Mass.

A fully automatic document retrieval system operating on
the IBM 7094 is described. The system is characterized by the
fact that several hundred different methods are available to
analyze documents and search requests. This feature is used
in the retrieval process by leaving the exact sequence of
operations initially unspecified, and adapting the search
strategy to the needs of individual users.
The system is used not only to simulate an actual operating
environment, but also to test the effectiveness of the various
available processing methods. Results obtained so far seem
to indicate that some combination of analysis procedures can
in general be relied upon to retrieve the wanted information.
A typical search request is used as an example in the present
report to illustrate systems operations and evaluation
procedures.
`
H. KOLLER, Editor

1. Introduction

In 1957, Luhn suggested a fully automatic procedure
for the processing of written texts, based on the frequency
of occurrence of words within the texts [1]. Specifically,
use of high-frequency words was advocated for purposes
of content identification, and document retrieval was to
be effected by manipulation of the corresponding
word-frequency lists. The suggested procedure, even though
imperfect, is still used as the basis for many automatic
text-processing systems.
In the SMART retrieval system, an attempt is made to
go beyond the original word matching procedures by
generating more effective content indicators to identify
documents and search requests. This is accomplished in
part by generating word stems from the original word
forms, by introducing synonym dictionaries to lessen the
effects of vocabulary variations, and, most importantly,
by identifying relations between certain words to be used
as content indicators in conjunction with the surrounding
words.
Stored documents and search requests are processed
without any prior manual analysis by one of several hundred
possible methods, and those documents which most
nearly match a given search request are extracted from
the document file. The system may be controlled by the
user in that a search request can be processed first in a
standard mode; the user is then free to analyze the output
obtained, and depending on his further requirements a
reprocessing of the request may be ordered under new
conditions. The new output can again be examined, and
the process iterated until such time as the right kind and
amount of information are retrieved.
SMART is thus designed to overcome many of the shortcomings
of presently available automatic retrieval systems,
and may serve as a reasonable prototype for fully
automatic document retrieval. The following summarizing
characteristics may be of principal interest.
(a) The information analysis is believed to be sufficiently
deep and refined to ensure the identification of
most relevant material in answer to most search requests.
(b) The varying needs of individual users are recognized
by enabling each user to call on many different text
processing modes, and by choosing a suitable sequence
of procedures, eventually to obtain satisfactory retrieval
performance.
(c) The system can serve as a means for evaluating the
effectiveness of a large variety of automatic analysis
procedures, in that the same search requests can be processed
against the same document collection in many different
ways and results compared.
The present report illustrates the capabilities of the
system by using a typical search request and exhibiting
some of the processing steps as well as the retrieval results.
The detailed systems organization as well as the main
programming aspects are not included.¹
`
The work described in this study was supported by the National
Science Foundation under grant GN-245.
* Computation Laboratory.
¹ Additional, detailed descriptions of the SMART system may
be found in [2, 3].

EXHIBIT 2035
Facebook, Inc. et al. v. Software Rights Archive, LLC
CASE IPR2013-00479

Communications of the ACM, Volume 8 / Number 6 / June, 1965

2. Processing Options
The SMART retrieval system is designed around a
general supervisory system which can in turn call on many
different subroutines. The supervisor accepts input
instructions to specify the type of operation to be performed,
as well as control data to choose the subroutines which
are to be called. At present, eight basic input instructions
are available, in addition to thirty-five different processing
options and six variable parameter settings.
Five basic dictionaries, or tables, are incorporated into
the system: an alphabetic-stem dictionary, also known as
the thesaurus, designed to supply each word stem with a
number of syntactic and semantic codes; an alphabetic-suffix
table to obtain syntactic codes for word suffixes; a
numeric-concept hierarchy to represent various relations
between semantic categories; a syntactic (criterion) phrase
dictionary to aid in the syntactic processing; and a
statistical-phrase list.
The following principal facilities are used in the system
for purposes of information analysis and general processing:
(a) a system for separating English words into stems
and affixes, using a dual left-to-right and right-to-left
scanning procedure, and incorporating extensive English
morphological rules to detect doubling of consonants,
deletion of final "e", and "y" to "i" changes, before
addition of a suffix;
(b) a thesaurus look-up method using list-tracing methods
to replace word stems by "concept" numbers (the
present thesaurus includes about 500 concepts in the
computer literature corresponding to about 3000 English
stems);
(c) a so-called "vacuous" thesaurus in which the original
word stems included in a text function as concepts;
(d) a hierarchical arrangement of the concepts included
in the thesaurus, and a complete set of list-processing
methods which make it possible, given any concept number,
to find its "parent" in the hierarchy, its "sons", its
"brothers", and any of a set of possible cross-references;
(e) statistical procedures to compute stem (or concept)
similarity coefficients based on co-occurrence of terms within
the sentences of a given document, or within the documents
of a given collection; association factors between documents
can also be determined, as can clusters (rather than
only pairs) of related documents or related concepts;
(f) syntactic-phrase-matching procedures which make
it possible to match the syntactically analyzed sentences
of documents and search requests with a precoded dictionary
of "criterion" phrases; the phrase matcher recognizes
a large number of semantically equivalent but syntactically
quite different constructions, and assigns the
same concept numbers to all such equivalent constructions
(as, for example, to "information retrieval", "the retrieval
of information", "the retrieval of documents", "text
processing", and so on); a dictionary of about 120 criterion
phrases, corresponding to several thousand English constructions
in the computer literature, is used at present,
and two phrases are defined as equivalent if concept numbers
and syntactic indicators match, and if the syntactic
dependencies between concepts are preserved;
(g) "statistical phrase" matching procedures which
operate like syntactic phrases except that no syntactic
analysis is performed, and syntactic dependencies are
disregarded (a statistical phrase is thus in fact equivalent
to a set of concepts co-occurring in a sentence of a document);
(h) a complete set of updating routines designed to
alter the five principal dictionaries included in the system
(stem thesaurus, suffix dictionary, concept hierarchy,
statistical phrases and syntactic criterion phrases);
(i) a supervisory system, called CHIEF, designed to
decode a large variety of input instructions and to arrange
the processing sequence in accordance with the instructions
given.

[Figure 1: flowchart not reproduced. Legend: compulsory operations, optional operations.]
FIG. 1. Preprocessing of input text including main content
analysis procedures

[Figure 2: flowchart not reproduced. Legend: compulsory operations, optional operations.]
FIG. 2. Request-document processing including hierarchical
processing and statistical correlations

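Facility (a) in the list above can be illustrated with a minimal sketch of suffix removal that undoes the three morphological changes named there (doubled consonant, deleted final "e", and "y" changed to "i"). The suffix table, the stem dictionary and the name `split_stem` are assumptions made for illustration; the dual-scan procedure and the actual rule set are not given in this report.

```python
# Hypothetical suffix table and stem dictionary, for illustration only.
SUFFIXES = ("ing", "ed", "es", "s")
KNOWN_STEMS = {"stop", "write", "study", "match", "equation"}


def split_stem(word):
    """Return (stem, suffix); (word, "") when no rule yields a known stem."""
    word = word.lower()
    if word in KNOWN_STEMS:
        return word, ""
    for suffix in SUFFIXES:
        if not word.endswith(suffix):
            continue
        base = word[: -len(suffix)]
        candidates = [base]
        # Doubled final consonant before the suffix: "stopp" -> "stop".
        if len(base) >= 2 and base[-1] == base[-2] and base[-1] not in "aeiou":
            candidates.append(base[:-1])
        # Final "e" deleted before the suffix: "writ" -> "write".
        candidates.append(base + "e")
        # "y" changed to "i" before the suffix: "studi" -> "study".
        if base.endswith("i"):
            candidates.append(base[:-1] + "y")
        for candidate in candidates:
            if candidate in KNOWN_STEMS:
                return candidate, suffix
    return word, ""
```

Checking candidates against a stem dictionary keeps the sketch from producing stems that the later thesaurus lookup could not find anyway.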
A flowchart of the complete system (exclusive of the
dictionary updating routines) is shown in Figures 1 and 2.
Figure 1 describes the processing which is
in general performed only once for each document or
search request, including the lookup operations in the
various dictionaries and the syntactic analysis. The request
processing proper, consisting of the matching of
requests and preprocessed document vectors, is shown in
Figure 2. This chart also illustrates the hierarchical
procedures, as well as the statistical term-term and
document-document correlations. Most of the procedures
described in Figures 1 and 2 are optional, as shown by the
dotted lines. The principal tape assignments included in
the flowcharts are decoded in Table 1.
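
The statistical term-term correlations mentioned above rest on the co-occurrence of concepts within sentences or documents. A minimal sketch follows, assuming a cosine-style coefficient over binary co-occurrence counts; the report does not state which coefficient is used, and the function name and data layout are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_similarity(units):
    """units: list of sets of concept numbers, one set per sentence or document.

    Returns {(a, b): coefficient} for each pair a < b that co-occurs at least once.
    """
    freq = {}   # number of units containing each concept
    joint = {}  # number of units containing both concepts of a pair
    for unit in units:
        for concept in unit:
            freq[concept] = freq.get(concept, 0) + 1
        for a, b in combinations(sorted(unit), 2):
            joint[(a, b)] = joint.get((a, b), 0) + 1
    # Assumed coefficient: joint count over the geometric mean of the two counts.
    return {pair: n / sqrt(freq[pair[0]] * freq[pair[1]])
            for pair, n in joint.items()}
```

Passing one set per sentence gives within-document similarities; passing one set per document gives similarities over a collection.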
`
TABLE 1. PRINCIPAL TAPE ASSIGNMENTS

Tape Number   Function
A2            Input parameters, input text, manual relevance judgments
A3            Output print tape for later printing
A5            Partially assembled concept vectors
A6            Preprocessed document vectors
A7            Merged concept vectors (preanalyzed and new)
A8            Input to and output from syntax analyzer
B1            Words not found in dictionary
B2            Correlations between document and request vectors
B5            Library tape including thesaurus, hierarchy and
              phrase dictionaries
B6            Scratch tape and merged concept vectors
`
3. Input Specifications
A typical analysis is best described by using an information
search actually run on the computer as an illustration.
Figure 3 shows the introduction of a new document
on differential equations (code: DIFFERNTL EQ).
This document serves as search request, and is compared
against a previously stored collection of 405 document
abstracts.²
The complete document is printed at the top of Figure 3.
The results of the thesaurus lookup, shown in the second
part of the figure, indicate that one word stem
(RESPECT), occurring in sentence 2, word 14, of the document
could not be found in the stem dictionary. This word
is effectively ignored in the remainder of the process.
Before a run is undertaken, instructions are given to the
supervisor concerning the processing mode and the type
of output desired. This information is shown in short
form in the third part of Figure 3. A longer, decoded version
of a set of processing instructions is included in Figure
4, described later in the report. Following the processing
instructions, new documents may be introduced from the
input tape, and previously available documents are processed
from a separate data tape. This is done under control
of a special instruction set recognized by the supervisor.

² The stored collection consists of the 405 abstracts of documents
in the computer literature published during 1959 in the
Transactions of the IRE on Electronic Computers. The document
abstracts are numbered from 1 to 405 for identification.

The instructions listed in the lower part of Figure 3
receive the following interpretation:

*LIST   The document whose 12-character identifier (DIFFERNTL EQ)
        follows, appears next on the input tape in binary form.
*LIKE   The designated document is identified as a request for
        later matching with other documents.
*TIME   The current time is printed.
*TAPE   The supervisor switches from the input tape to the data
        tape and treats this tape as if it were another input.
*NOTE   A comment follows next.
*LIST   The document file is introduced starting with document
        1 (first five documents shown in Figure 3).
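
The interpretations above suggest a simple decoding step for each card. The sketch below assumes that a card starts with an asterisk and a four-letter operation name, and that for *LIST and *LIKE the next twelve characters form the document identifier; the exact card layout is not specified in this report, and `parse_card` is a hypothetical name.

```python
OPERATIONS = {"LIST", "LIKE", "TIME", "TAPE", "NOTE"}


def parse_card(card):
    """Split one instruction card into (operation, identifier, remainder)."""
    if not card.startswith("*"):
        raise ValueError("instruction cards are assumed to start with '*'")
    op, body = card[1:5], card[6:]
    if op not in OPERATIONS:
        raise ValueError("unknown operation: " + op)
    if op in ("LIST", "LIKE"):
        # Assumed layout: 12-character identifier, then optional title text.
        return op, body[:12].rstrip(), body[12:].strip()
    # Other cards carry no identifier; any text is kept as a remainder.
    return op, "", body.strip()
```

A supervisor loop would call this on each card in turn and dispatch on the returned operation name.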
`
A typical processing record is reproduced in decoded
form in Figure 4. It may be noted that in the example
shown, version 2 of the regular (Harris) thesaurus is used
to normalize the vocabulary. No statistical processing is
done, but a syntactic analysis is performed instead, and
the syntactic phrases detected by matching the incoming
sentences with the criterion tree dictionary are weighted
3 to 1 compared with ordinary (nonphrase) concepts.
Concepts derived from the titles of documents are weighted
equally with all other concepts. The "cosine" function is
used to correlate the analyzed request with the document
identifications, and all documents whose correlation
coefficient exceeds 0.35 are printed out as answers.
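
The "cosine" correlation and the cutoff described above can be sketched as follows. Only the cosine function and the 0.35 cutoff come from the report; the sparse-dictionary representation and the function names are assumptions made for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine correlation of two sparse concept vectors {concept: weight}."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    norm = (sqrt(sum(w * w for w in u.values()))
            * sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def answers(request, documents, cutoff=0.35):
    """Return (doc_id, correlation) pairs exceeding the cutoff, best first."""
    scored = [(doc_id, cosine(request, vec)) for doc_id, vec in documents.items()]
    return sorted((s for s in scored if s[1] > cutoff), key=lambda s: -s[1])
```

Because the cosine is normalized by vector length, a long abstract is not favored over a short one merely for containing more concepts.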
`
4. Information Analysis

As an example of the kind of information analysis
included in the SMART system, Figure 5 shows an excerpt
of the output print produced by the syntactic matching
process. The first sentence of the request DIFFERNTL
EQ is processed: the sentence structure diagram produced
by the syntactic analysis of the sentence is shown at the
top of Figure 5. The format reflects the syntactic dependency
structure of the sentence, and is produced by the
Kuno-Oettinger syntactic analyzer incorporated into the
SMART system [4, 5].
The matching process between the syntactically analyzed
sentence and the set of criterion trees included in the
syntactic-phrase dictionary is illustrated in the center
part of Figure 5. It is seen that the combination of sentence
nodes 12 and 14, corresponding to concept numbers
181 and 274, or equivalently to the words "equations"
and "differential", respectively, matches a criterion tree
labeled DIFEQU with serial 47. Similarly, sentence nodes
9 and 7 in combination match the tree NUMERI with
serial 87. The concept numbers corresponding to these
two trees are therefore attached to the search request on
"differential equations" which is being analyzed.
The bottom part of Figure 5 shows this last operation.
Specifically, concept 274 originally introduced by thesaurus
lookup through the word "differential", and concept 181
corresponding to "equations", are replaced by the new
phrase concept 379 corresponding to "differential equations".
Concepts 13 ("number") and 11 ("analysis")
are replaced by 375 ("numerical analysis"). These two
phrase concepts are weighted three times more heavily
than were the original component concepts.
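
The replacement of component concepts by a phrase concept can be sketched as below. The report states only that phrase concepts are weighted three times more heavily than the component concepts; taking three times the smallest component weight is one possible reading and is an assumption here, as are the function name and any input weights.

```python
def apply_phrases(vector, phrases, factor=3):
    """Replace matched component concepts by their phrase concept.

    vector:  {concept: weight}
    phrases: {phrase_concept: (component, component, ...)}
    """
    result = dict(vector)
    for phrase, components in phrases.items():
        if all(c in result for c in components):
            # Assumed base weight: the smallest weight among the components.
            base = min(result[c] for c in components)
            for c in components:
                del result[c]
            result[phrase] = factor * base
    return result
```

With the concept numbers of the example, `apply_phrases(v, {379: (274, 181), 375: (13, 11)})` removes the four component concepts from `v` and adds the two phrase concepts in their place.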
In order to obtain a match between a criterion tree and a
sentence part, it is necessary to compare the concept numbers
attached to the components, the syntactic indicators
and the syntactic dependency structures. A graph matching
process is used for this purpose which has previously
been described in detail [6, 7].
Further aspects of the information analysis procedures
included in this system are shown in Figures 6 and 7.
Figure 6 is a composite print showing the indexing products
obtained for two documents (the original search request,
and document number 1 from the abstract collection) by
each of three different analysis methods. These automatically
generated indexing products are effectively equivalent
to the manually assigned keyword sets which are common
in ordinary coordinate-indexing systems; it is the comparison
between the indexing products representing documents
and search requests which is used to obtain the
similarity coefficients needed for retrieval. In the SMART
system, hundreds of different indexing products may be
obtained by suitable alterations of the analysis procedures.
The three analysis methods illustrated in Figure 6
correspond respectively to the use of the regular thesaurus
to obtain concept numbers from word stems, the use of the
"null" thesaurus (that is, of original word stems with
weights), and the use of the regular thesaurus followed by
a statistical phrase detection. In the center section of
Figure 6, up to six alphabetic characters are printed for
each word stem together with a weight (multiplied by 12
for internal reasons). For the two analysis procedures
which make use of a thesaurus lookup, the word stems
are replaced by concept numbers and the remainder of the
six-character field is filled out by mnemonic characters to
provide a clue to the significance of the corresponding
concept.
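
The difference between the null thesaurus and the regular thesaurus can be sketched as below. Only the factor of 12 comes from the text; the thesaurus passed in is a hypothetical excerpt supplied by the caller, and summing the weights of stems that share a concept is an assumption.

```python
from collections import Counter

WEIGHT_UNIT = 12  # weights are printed multiplied by 12


def null_thesaurus(stems):
    """Index vector in which each word stem acts as its own concept."""
    return {stem: WEIGHT_UNIT * n for stem, n in Counter(stems).items()}


def regular_thesaurus(stems, thesaurus):
    """Index vector of concept numbers.

    thesaurus maps a stem to a list of concept numbers; stems that are
    not found are ignored, as described for RESPECT in Section 3.
    """
    vector = Counter()
    for stem in stems:
        for concept in thesaurus.get(stem, ()):
            vector[concept] += WEIGHT_UNIT
    return dict(vector)
```

Several stems mapped to one concept accumulate weight under this rule, which is one way a concept could end up weighted more heavily than any single stem.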
A comparison of the index vectors produced by the
various methods shows that new concepts may be introduced
by switching to a new analysis procedure; furthermore,
concepts common to two or more of the indexing
systems may nevertheless be weighted differently in each
one. Consider, for example, the index vectors for the document
DIFFERNTL EQ. Using the null thesaurus, the
stem DIFFEREN (listed as DIFFER and obtained from
the word "differential") is assigned a weight of 24. The
corresponding concept number 274 obtained from the
thesaurus is listed with a weight of 36. Using the statistical
phrase matcher, a new concept 379 is created to represent
the phrase "differential equations", with a weight of 72.
Thus, the shift from word stems, to thesaurus, to statistical
phrases, assigns an increasingly larger importance to the
notion of "differential equations".
Figure 7 shows an example of a change produced in the
index vector for DIFFERNTL EQ by using the hierarchy.
The vectors are shown both before and after expansion; a
comparison indicates that new concepts are introduced
through the hierarchy, and that some existing concepts
receive a change in emphasis. For example, concept 110
is assigned a weight of 12 before expansion, and of 36
afterwards.
The usefulness of the various indexing products, and
therefore of the analysis procedures which give rise to
them, must be determined by an evaluation procedure.
This is further discussed in Section 6 of this report.
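
The hierarchy lookups named in Section 2 ("parent", "sons", "brothers") and a vector expansion of the kind shown in Figure 7 can be sketched as follows. The parent table is supplied by the caller and is hypothetical, since the report does not list the hierarchy; adding each concept's weight to its parent is only one conceivable expansion rule, assumed here for illustration.

```python
class Hierarchy:
    def __init__(self, parent_of):
        # Mapping concept -> parent concept (hypothetical data).
        self.parent_of = dict(parent_of)

    def parent(self, concept):
        return self.parent_of.get(concept)

    def sons(self, concept):
        return sorted(c for c, p in self.parent_of.items() if p == concept)

    def brothers(self, concept):
        p = self.parent(concept)
        if p is None:
            return []
        return [c for c in self.sons(p) if c != concept]

    def expand(self, vector):
        """Add each concept's weight to its parent (assumed expansion rule)."""
        expanded = dict(vector)
        for concept, weight in vector.items():
            p = self.parent(concept)
            if p is not None:
                expanded[p] = expanded.get(p, 0) + weight
        return expanded
```

Under this rule an expansion can both introduce a parent concept that was absent and raise the weight of one already present, the two effects described above.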
`
5. Information Retrieval
Following the information analysis, the index vectors
derived from documents and search requests are compared
in order to obtain for each document a coefficient of similarity
with each search request. Figure 8 shows, as an example,
the request-document correlations for the regular
thesaurus run, obtained by using the "cosine" function to
correlate the request DIFFERNTL EQ with the stored
document collection. The output of Figure 8 is presented
in three different forms: first in increasing document order,
next in decreasing correlation order, and finally as a type
of histogram. The histogram shows, for example, that
exactly one document had a correlation with the request
equal to or greater than 0.60, 8 documents exceeded 0.40,
60 documents exceeded 0.20, 165 documents exceeded
0.10, and so on.
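
The three presentations of the correlations can be sketched as follows. The threshold list is an assumption; the cumulative counting mirrors the way the histogram is read in the text, as the number of documents at or above each level.

```python
def by_document(correlations):
    """Pairs (doc_number, correlation) in increasing document order."""
    return sorted(correlations.items())


def by_correlation(correlations):
    """Pairs (doc_number, correlation) in decreasing correlation order."""
    return sorted(correlations.items(), key=lambda item: (-item[1], item[0]))


def cumulative_histogram(correlations, levels=(0.6, 0.4, 0.2, 0.1)):
    """For each level, count the documents whose correlation is at least that level."""
    return {level: sum(1 for c in correlations.values() if c >= level)
            for level in levels}
```

All three views are derived from the same `{doc_number: correlation}` mapping, so a single correlation pass over the collection is enough to produce them.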
The correlations shown in Figure 8 are produced for
analysis purposes and are not intended to be given to the
user. The user receives his "answers" in one of two forms,
shown in Figures 9 and 10. In either case, the documents
which receive the highest correlation with the search
request are listed in decreasing correlation order down to
the cutoff specified in the processing instructions (see
Figure 4). The shorter version of the output provides one
line per document, including only the document identification,
correlation coefficient and document title; the example
of Figure 9 shows the output obtained for the "titles
only" run, where the text of each document is disregarded
and titles only are used in the analysis.
The more complete form of the output is shown in Figure
10. Here the search request is reproduced at the top
of the page, as in Figure 9; however, complete bibliographic
citations are given for each document. The regular thesaurus
was used to obtain the output of Figure 10; this run
was previously illustrated also in Figure 8.
A very condensed form of the output, consisting for
each document listed of only one twelve-character identifier,
including document number and the first few characters
of the title, is also produced. Figure 11 shows examples
of the short-form answers obtained with a variety
of processing methods for the request on differential equations.
The documents are again listed in decreasing correlation
order, but correlation coefficients are omitted.
The cutoff, which may be different from one processing
method to the other, applies as before.
`
[ORIGINAL REQUEST]
ENGLISH TEXT PROVIDED FOR DOCUMENT DIFFERNTL EQ      SEPTEMBER 28, 1964      PAGE 345

GIVE ALGORITHMS USEFUL FOR THE NUMERICAL SOLUTION OF ORDINARY DIFFERENTIAL
EQUATIONS AND PARTIAL DIFFERENTIAL EQUATIONS ON DIGITAL COMPUTERS.
EVALUATE THE VARIOUS INTEGRATION PROCEDURES (TRY RUNGE-KUTTA,
MILNE-S METHOD) WITH RESPECT TO ACCURACY, STABILITY, AND SPEED.

[THESAURUS LOOKUP]
WORDS IN DOCUMENT DIFFERNTL EQ NOT FOUND IN THESAURUS

WORD NOT FOUND    KIND    SENTENCE AND WORD NUMBERS
RESPECT           STEM    2, 14

JOB COMPLETE. PRINT A6 UNDER PROGRAM CONTROL

[SHORT FORM PROCESSING INSTRUCTIONS]
INSTRUCTION CARDS TO SMART SUPERVISOR

ANSWER REQUESTS, FORMAT JUJ180, Pft SCORES YES, THESHR 2, MAXCON 511,
REQUEST CORRELATIONS, CORM03 COS, CUTOF3 3500, TEXTS PROCESSED

[SEARCH INSTRUCTION]
*LIST DIFFERNTL EQ NUMERICAL DIGITAL SOLN OF DIFFERENTIAL EQUATIONS
*LIKE DIFFERNTL EQ

[TIME INFORMATION]
*TIME
THE CURRENT TIME IS 72.1 MINUTES. YOU WILL REMEMBER THAT START-OF-JOB
WAS AT 69.9 MINUTES, WHILE THE CLOCK READ 71.8 WHEN EXECUTION BEGAN.

*TAPE
*NOTE THIS IS THE HARRIS THESAURUS VERSION TWO LOOKUP

[PARTIAL LIST OF DOCUMENT TITLES]
*LIST 1A COMPUTER ORIENTED TOWARD SPATIAL PROBLEMS
*LIST 2MICRO-PROGRAMMING
*LIST 3THE ROLE OF LARGE MEMORIES IN SCIENTIFIC COMMUNICATIONS
*LIST 4A NEW CLASS OF DIGITAL DIVISION METHODS
*LIST 5ANALYSIS OF SHIFT REGISTER COUNTERS

FIG. 3. Typical SMART processing instructions
`
INSTRUCTION CARDS TO SMART SUPERVISOR      SEPTEMBER 28, 1964

LIBRARY USED WAS VERSION NUMBER 2 OF THE HARRIS THESAURUS.
THESAURUS DISCRIMINATES NOT MORE THAN 511 CONCEPTS.
ENGLISH TEXTS WERE PRINTED DURING LOOKUP.
WORDS NOT FOUND WERE PRINTED.

STATISTICAL INTRA-DOCUMENT PROCESSING -- NONE.

STATISTICAL PHRASES -- NONE.

CRITERION TREES --
A SYNTACTIC ANALYSIS WAS PRINTED FOR EACH SENTENCE.
CRITERION TREES DETECTED WERE PRINTED.
NODE CORRESPONDENCES OF TREES TO SENTENCES WERE PRINTED.
SYNTACTIC PHRASES HAD WEIGHT OF 3.0

THE ABOVE DATA WAS SUPPLIED BY THE PROGRAMMER AND MAY BE INCORRECT.

THE FOLLOWING DATA IS FROM INSTRUCTIONS FOR THIS RUN WHICH DEFINITELY WERE EXECUTED.

TITLES WERE GIVEN A WEIGHT OF 1.0

DOCUMENT CORRELATION --
REQUEST CORRELATIONS WERE PRINTED.
CORRELATION MODE USED WAS COSINE.
CUTOFF WAS 0.3500

HIERARCHY -- NONE.

CONCEPT PROCESSING -- NONE.

REQUESTS WERE ANSWERED.
AUTO-EVALUATION WAS REQUESTED AND WILL BE ATTEMPTED.

FIG. 4. Typical processing record (long form)

[Figure 5: computer printout dated September 28, 1964, page 347, not reproduced. Panels: syntactic analysis output (node numbers and strings of sentence 1 of the request); syntactic tree output (node correspondences for tree DIFEQU, serial no. 47, and tree NUMERI, serial no. 87; the criterion routine processed 1 sentence with 2 matches of 2 distinct indices; trees detected syntactically in document DIFFERNTL EQ with their phrase concepts and component concepts).]
FIG. 5. Syntactic phrase matching

[Figure 6: computer printout "Occurrences of concepts and phrases in documents", not reproduced. Panels: vectors for documents DIFFERNTL EQ and 1A COMPUTER under the regular thesaurus, the null thesaurus, and the statistical phrase lookup.]
FIG. 6. Document vectors generated by three analysis methods

[Figure 7: computer printout headed "Expansion of document-concept matrix through use of hierarchy", showing the vector for DIFFERNTL EQ before expansion; the remainder of the printout, and a further page headed "Answers to requests for documents on specified topics" (page 169), are cut off.]
