United States Patent [19]
Froessl

[11] Patent Number: 5,396,588 (US005396588A)
[45] Date of Patent: Mar. 7, 1995

[54] DATA PROCESSING USING DIGITIZED IMAGES
[76] Inventor: Horst Froessl, Gutenbergstrasse 2-4, 6944 Hemsbach, Germany
[21] Appl. No.: 547,190
[22] Filed: Jul. 3, 1990
[51] Int. Cl.6: G06F 15/62
[52] U.S. Cl.: 395/145; 395/150; 382/69; 364/419.19; 364/DIG. 2; 364/963
[58] Field of Search: 395/145, 150, 600; 382/11, 48, 69; 364/419, 963, 419.19

[56] References Cited
U.S. PATENT DOCUMENTS
[...]
4,028,674  6/1977  Chuang  340/146.3
4,273,440  6/1981  Froessl  355/40
4,553,261  11/1985  Froessl  382/57
4,594,674  6/1986  Boulia et al.  395/145
4,672,683  6/1987  Matsueda  382/37
4,726,065  2/1988  Froessl  381/41
4,748,678  5/1988  Takeda et al.  332/56
4,758,980  7/1988  Tsunekawa et al.  364/900
4,760,606  7/1988  Lesnick et al.  382/48
4,887,304  12/1989  Terzian  382/30
4,907,283  3/1990  Tanaka et al.  382/40
4,933,979  6/1990  Suzuki et al.  382/61
4,944,022  7/1990  Yasujima et al.  382/14
4,985,863  1/1991  Fujisawa et al.  364/900
4,989,042  1/1991  Muramatsu et al.  355/244
5,038,392  8/1991  Morris et al.  382/61
5,048,113  9/1991  Yamagata et al.  382/57
5,051,925  9/1991  Kadono et al.  364/519
5,060,146  10/1991  Chang et al.  364/900
5,109,439  4/1992  Froessl  382/61

Primary Examiner: Mark K. Zimmerman
Assistant Examiner: Joseph Feild
Attorney, Agent, or Firm: Walter C. Farley

[57] ABSTRACT
A method of manipulating information is disclosed in which [...] digitized image [...] retained in image form for various data processing manipulations. A font table is formed having a matrix of fonts correlated with characters and symbols in code form such as ASCII. Desired material in the stored documents is located using pattern-match searching with a parallel processor search engine.

8 Claims, 4 Drawing Sheets
`
`
[Representative drawing: block diagram with INPUT (KEYBOARD, MOUSE, ETC.); DOCUMENT FEED, PRINT & SCANNER; COMPUTER; VOLATILE & NON-VOLATILE MEMORY; RAM, TABLES, HD]
`
`SONY - Ex.-1012
`Sony Corporation - Petitioner
`
`
`
`U.S. Patent
`
`Mar. 7, 1995
`
`Sheet 1 of 4
`
`5,396,588
`
`FIG. 1
`
`
`
`
`
U.S. Patent    Mar. 7, 1995    Sheet 2 of 4    5,396,588

FIG. 2
[Flow diagram. Box labels recoverable from the scan: 15 ESTABLISH FONT TABLE, ASCII TO PATTERN, WITH MULTIPLE FONTS AND SIZES; 16 ENTER FONT AND SEARCH AREA DESIRED; 17 GO TO DOCUMENTS IN IMAGE PATTERN STORAGE; DELETE NUMBERS; STOP PROCESS, ERROR MESSAGE TO OPERATOR; MATCH PATTERN OF NEXT CHARACTER WITH PATTERNS IN FONT TABLE (1ST HEADING LTR); STORE ALL CHARACTERS IN HEADING IN ALPHABET GROUP OF FIRST CHARACTER; 35 GO TO NEXT DOCUMENT IN STORAGE]
`
U.S. Patent    Mar. 7, 1995    Sheet 3 of 4    5,396,588

FIG. 3
[Flow diagram. Box labels recoverable from the scan: GO TO FIRST LETTER STORAGE, FIRST HEADING; RELOCATE HEADING IN ACCORDANCE WITH ALPHABET POSITION (three boxes); COMPARE NTH CHARACTER OF NEXT HEADING WITH FONT TABLE; COMPARE NTH CHARACTER OF SAME HEADING; SET N=2, GO TO NEXT LETTER STORE]
`
`
`
U.S. Patent    Mar. 7, 1995    Sheet 4 of 4    5,396,588

FIG. 4
[Block diagram. Box labels: INPUT (KEYBOARD, MOUSE, ETC.); DOCUMENT FEED, PRINT & SCANNER; COMPUTER; VOLATILE & NON-VOLATILE MEMORY; RAM, TABLES, HD]
`
`
`
`
`
`
`
`
`
`
rapidly using the full content of the document as search criteria for both text and graphics.

A further object is to provide such a method wherein document contents can be manipulated and processed without converting the alphanumeric characters in the document into code.

Yet another object is to provide a system which includes a method of retrieving image content by pattern matching, with or without indexing.

A further object is to provide a system which can convert existing paper documents, such as technical manuals, into a form suitable for interactive electronic display.
`
`
`
`
`
`
Briefly described, the invention comprises a method of data processing including storing digitized images of document contents, establishing in non-volatile memory a font table including code values of alphanumeric characters and symbols and images of characters and symbols in each font used in the documents, each of the characters and symbols in each font being correlated with the code values for the character or symbol, locating digitized images of selected portions of stored documents which are to be manipulated, and manipulating the selected portions in the form of digitized images.
BRIEF DESCRIPTION OF THE DRAWINGS

In order to impart full understanding of the manner in which these and other objects are attained in accordance with the invention, particularly advantageous embodiments thereof will be described with reference to the accompanying drawings, which form part of this specification, and wherein:

FIG. 1 is a schematic illustration of a font table showing the organization of a matrix usable to construct a correlation between characters and their equivalents in a plurality of fonts;

FIGS. 2 and 3 are parts of a flow diagram illustrating the steps of the method of the invention as applied to a specific storage and retrieval problem; and

FIG. 4 is a schematic diagram of a system for performing the method.
DESCRIPTION OF THE PREFERRED EMBODIMENTS

In most storage systems which deal with large amounts of data, the data is converted into dp code, ASCII being the most common, and stored in code form. When one wishes to retrieve some part of the stored data, various techniques can be used, depending on how the system is designed to operate. Some use index techniques while others rely on full-text searching for selected search words.
`
`
`
`
`
`
`
`
`
`
In accordance with the present invention, information is stored in image form, the word “information” being used to mean the content of documents which are being or have been transferred from a typed, printed or written form to digital storage. The stored information is preferably not indexed as it is entered into the system because any indexing system adds time to the input process.
`
`
`
`
`
`
`
While it would theoretically have been possible in prior art systems using image storage to conduct a pattern search to locate a specific word “match” in the stored images of a large number of documents, success would not have been likely unless the “searched for” word were presented in a font or typeface very similar to that used in the original document. Since such systems have had no way of identifying which font might
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`DATA PROCESSING USING DIGITIZED IMAGES
`
`
`
`
`
`
`
`
`
CROSS REFERENCE TO RELATED APPLICATION

Reference is made to application Ser. No. 536,769, filed Jun. 12, 1990, the entire content of which is hereby incorporated by reference.

SPECIFICATION

This invention relates to a method and apparatus for using image searching and manipulation techniques to store and retrieve information in such a way that the input of material from documents to mass storage is facilitated and the retrieval of desired information is not impeded.
`
`
`
BACKGROUND OF THE INVENTION

Mass storage is becoming a much more interesting tool than it has been in the past for a larger number of applications because of the introduction of relatively new mass storage media such as optical disks. However, it is still necessary to find efficient ways of putting the data into mass storage and retrieving it.

Certainly the most efficient technique for inputting the contents of typed or printed documentation is with the use of optical scanning techniques. Methods and apparatus for handling incoming mail and the like in large quantities are disclosed in my copending application Ser. No. 536,769. In that application, the technique is used of optically scanning each document, identifying by data processing techniques “search words” which can subsequently be used to retrieve the documents and then storing the documents in a mass store, either in image form or in a data processing code such as ASCII. By “image form” it is meant that a digitized representation of the image of the document is stored in a form which is sometimes referred to as “bit mapped”. While image storage requires much more memory, it has the advantage of speed over converting everything into dp code, which necessarily requires human editing to assure accuracy of conversion, and also has the advantage of being able to reproduce a replica of the original on a display or with a suitable printer, including signatures, letterhead “logos” and other non-text or unconvertible features such as drawings or graphics.
`
`
`
`
`
`
`
Retrieval has always been regarded as a requirement which necessitated conversion into dp code of all or a significant part of each document. Even in the system disclosed in Ser. No. 536,769, some conversion is used in connection with search words and the like, and that system is regarded as representing a minimum of conversion, and probably the most efficient system for bridging the gap between hard copy (paper) and mass electronic or optical storage. It would, however, be advantageous for many circumstances if the speed of putting information from documents into digital storage could be further increased so that the time for putting a page of printed or typed material into digital form in mass storage could be, in essence, not significantly longer than the time required for the page to be physically scanned by an optical scanning device.
SUMMARY OF THE INVENTION

An object of the present invention is to provide a method of retrievably storing contents of documents
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
have been used in the original document, a pattern search has had a low probability of success and could not be relied upon.
`
`
`
`
`
`
`
`
In order to overcome this problem, the present invention uses what will be referred to as a “font table”. The font table is a matrix of patterns organized in such a way that the alphanumeric characters and other symbols in a specific style of font or typeface are correlated with the ASCII (or other code system) values for those symbols. A schematic representation of a font table is shown in FIG. 1. When represented on paper, the table has a plurality of columns 10a, 10b, 10c, . . . and a plurality of rows 12a, 12b, 12c, . . . . Each column contains a list of the various patterns of characters which go to make up a font set, each character being in the particular style of that font. Each row contains various forms of each character or symbol in each of the various fonts. At the intersection of a row and column will be found a specific character pattern in a selected font. The fonts can be identified in any convenient way, such as by numbers, and the characters can also be identified in various ways although one of the most desirable is to use the ASCII value for each character.
`
`
`
`
`
`
`
`
`
`
`
`
`
`
The present invention does not use a font table imprinted on paper but, rather, uses a table in the form of a non-volatile memory such as a hard disk, i.e., a memory medium which is not erased when power is removed. As such, it may not physically have columns and rows, but can have any convenient equivalent form of organization which has characteristics similar to the written form, i.e., a font set can be located and recognized, the members of a set of fonts representing any single character can be recognized and the “intersection” patterns can be quickly located, e.g., the pattern which represents the letter “H” in font number 3 can be located in the memory.
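
As an illustration only, the organization described above can be sketched in Python as a mapping from a (font number, character code) pair to a stored bitmap. The patent does not prescribe any particular data structure; the names `Glyph`, `FontTable`, `build_font_table`, `font_set`, `character_forms` and `lookup`, and the tiny 3x3 bitmaps, are assumptions made up for this sketch.

```python
from typing import Dict, Tuple

# One bitmap: rows of '#' (ink) and '.' (blank). Real scanned glyphs would be
# far larger; 3x3 shapes are invented here purely for illustration.
Glyph = Tuple[str, ...]
# (font number, character code) -> bitmap, i.e. the row/column "intersection".
FontTable = Dict[Tuple[int, int], Glyph]


def build_font_table() -> FontTable:
    """Return a tiny two-font table holding hypothetical bitmaps."""
    table: FontTable = {}
    table[(1, ord("H"))] = ("#.#", "###", "#.#")
    table[(1, ord("I"))] = ("###", ".#.", "###")
    table[(3, ord("H"))] = ("#.#", "#.#", "###")
    table[(3, ord("I"))] = (".#.", ".#.", ".#.")
    return table


def font_set(table: FontTable, font: int) -> Dict[int, Glyph]:
    """One 'column': every character pattern belonging to a single font."""
    return {code: glyph for (f, code), glyph in table.items() if f == font}


def character_forms(table: FontTable, code: int) -> Dict[int, Glyph]:
    """One 'row': the forms of a single character across all fonts."""
    return {f: glyph for (f, c), glyph in table.items() if c == code}


def lookup(table: FontTable, font: int, char: str) -> Glyph:
    """The 'intersection': the pattern for one character in one font."""
    return table[(font, ord(char))]
```

With this layout, `lookup(table, 3, "H")` plays the role of locating the pattern for the letter “H” in font number 3. A dictionary is only one convenient equivalent of the row and column form; any organization offering the same three lookups would serve.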
`
`
`
`
`
`
`
`
`
The form of memory used is preferably revisable, i.e., additions of new fonts can be made and corrections can also be made. If this is considered unimportant for a specific application, the table can be stored in some form of read-only-memory (ROM) such as in one or more PROM chips.
`
`
`
`
`
`
`
`
`
`
As mentioned above, each member of each font set is stored in the table as an image. Thus, when one wishes to locate a specific word, printed in a specific font, in the mass store which contains all of the document text, the word is entered in code form as by keyboard and the character pattern equivalent to each letter of the word is copied from the font table into volatile memory (RAM) in the selected font. A pattern search is then conducted to find a string of patterns in the stored text which matches the string of patterns which has been constructed from the table. The system can require a full match but, more practically, a match of some accuracy less than 100% is used in order to be reasonably sure of finding the desired word and avoid the problem of missing the word because of a typographical error or a small defect in the stored image.
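
The search just described can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the stored text has already been segmented into equal-sized character cells, it scores a match as the fraction of identical pixels, and the table contents, the 0.85 threshold and the names `FONT_TABLE`, `word_pattern`, `similarity` and `find_word` are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Glyph = Tuple[str, ...]  # rows of '#' (ink) and '.' (blank)

# Hypothetical 3x3 bitmaps for a single font, invented for this sketch.
FONT_TABLE: Dict[Tuple[int, str], Glyph] = {
    (1, "A"): (".#.", "###", "#.#"),
    (1, "X"): ("#.#", ".#.", "#.#"),
    (1, "L"): ("#..", "#..", "###"),
    (1, "E"): ("###", "##.", "###"),
}


def word_pattern(word: str, font: int) -> List[Glyph]:
    """Copy the pattern for each letter of the keyed-in word from the table."""
    return [FONT_TABLE[(font, ch)] for ch in word]


def similarity(a: Glyph, b: Glyph) -> float:
    """Fraction of pixels on which two equally sized bitmaps agree."""
    pixels_a, pixels_b = "".join(a), "".join(b)
    if len(pixels_a) != len(pixels_b) or not pixels_a:
        return 0.0
    same = sum(x == y for x, y in zip(pixels_a, pixels_b))
    return same / len(pixels_a)


def find_word(stored: Sequence[Glyph], pattern: Sequence[Glyph],
              threshold: float = 0.85) -> List[int]:
    """Return start positions where the pattern string matches well enough."""
    if not pattern:
        return []
    hits = []
    for start in range(len(stored) - len(pattern) + 1):
        window = stored[start:start + len(pattern)]
        score = sum(similarity(s, p) for s, p in zip(window, pattern)) / len(pattern)
        if score >= threshold:
            hits.append(start)
    return hits
```

Setting `threshold=1.0` corresponds to requiring a full match; a lower value tolerates a small defect in the stored image, at the cost of more false hits.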
`
`
`
`
`
`
`
`
`
A pattern search of the type described can be performed very quickly using a very fast computer search engine such as that developed and marketed by the Benson Computer Research Corporation, McLean, Va. The Benson systems employ multiple processors in a parallel architecture arrangement to conduct comparisons at a high rate of speed. Once located, the words or the documents which contain the word images can be copied into RAM or disk for sorting or other manipulation. Alternatively, the contents of storage such as optical disk (WORM) can be copied to RAM for pattern match searching.
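
The internals of the Benson search engine are not described in this document, so the following is only a structural sketch of dividing a pattern search among several workers. It uses Python's standard `concurrent.futures` thread pool as a stand-in, treats each stored document as a `bytes` object, and uses exact `bytes.find` matching; all of these, and the names `search_document` and `parallel_search`, are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def search_document(job: Tuple[str, bytes, bytes]) -> Tuple[str, int]:
    """Search one stored image for the pattern; -1 means not found."""
    name, image, pattern = job
    return name, image.find(pattern)


def parallel_search(store: Dict[str, bytes], pattern: bytes,
                    workers: int = 4) -> List[str]:
    """Return the names of documents containing the pattern, in store order."""
    jobs = [(name, image, pattern) for name, image in store.items()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the submitted jobs.
        results = list(pool.map(search_document, jobs))
    return [name for name, position in results if position >= 0]
```

The sketch shows only how independent comparisons can be farmed out and collected; it makes no claim about the speed or architecture of the hardware named in the text.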
It is also possible to automatically provide a string of all fonts for each search word entered by a user for an individual search. However, for the most efficient searching, as suggested above, it is desirable to be able to specify the font or fonts in which the searched-for text appears. This may not always be possible, but with proper system arrangement it is possible in a large number of situations. When dealing with incoming correspondence, as in the system of application Ser. No. 536,769, it is quite helpful to also maintain a record of fonts used by a specific company and to add new fonts, when they are encountered, with the identification of the sender.
`
`
`
`
`
`
`
`
`
`
Consider, for example, the situation in which the mass store is dealing with correspondence and the Volkswagen Company is known to have used 5 specific fonts and further assume that retrieval of a document received from the Volkswagen Company containing the term “rear axle” is needed. It is known that the letter was sent by a Mr. Wagner in May, 1989. The operator enters as search criteria the search words “rear axle, Wagner, May 1989” and “Volkswagen=company”. The program goes to a company list and, under “Volkswagen”, identifies those fonts associated with Volkswagen in image and selects the search words “rear axle, Wagner, May 1989” in 5 different image fonts since the letter might have been written in any one of those. The search engine, in a relatively short time, searches the image files and extracts the desired letter, or possibly several letters meeting the criteria, ready for display and/or printout. This approach permits adding even old, outdated fonts for filing old documents, eliminating the requirement of warehouses stacked with old files.
`
`
`
`
`
`
`
`
`
It is a relatively simple matter to maintain the font record. In received correspondence, the font (or fonts) used in a newly received letter is compared with the font table and, if the font is recognized as being in the table, the font is listed with the name of the sender. If a font is not recognized, the document image can be flagged to bring it to the attention of an operator for addition to the table, generally a partly manual process.
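
The font record described above might be kept as a simple mapping from sender to the set of font numbers seen in that sender's letters. The sketch below assumes the font of an incoming letter has already been compared with the font table, with `None` standing for "not recognized"; the function names and the fallback to all fonts for an unknown sender are assumptions, not details from the patent.

```python
from typing import Dict, Iterable, List, Optional, Set


def record_font(sender_fonts: Dict[str, Set[int]], flagged: List[str],
                sender: str, document_id: str, font: Optional[int]) -> None:
    """List a recognized font under the sender, or flag the document."""
    if font is None:
        # Unrecognized font: bring the image to an operator's attention.
        flagged.append(document_id)
        return
    sender_fonts.setdefault(sender, set()).add(font)


def fonts_to_search(sender_fonts: Dict[str, Set[int]], sender: str,
                    all_fonts: Iterable[int]) -> List[int]:
    """Fonts in which search words should be built for this sender."""
    known = sender_fonts.get(sender)
    return sorted(known) if known else sorted(all_fonts)
```

For the Volkswagen example, `fonts_to_search` would return only the fonts recorded for that company, so the search words need to be constructed in those fonts alone rather than in every font in the table.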
`
`
`
`
`
`
`
`
`
As a more detailed and specific illustration of the method of the present invention, consider the case of a naval organization which has a large number of handbooks and instruction manuals all of which are needed for the routine maintenance of a ship. These manuals, or their equivalents, must be carried by the ship so that the personnel in various specialties can refer to them for routine maintenance, or non-routine repair, of any system aboard, whether of a mechanical, hydraulic, electrical or other nature. Such manuals are typically printed in a small number of type styles or fonts which are relatively standard. The presence of a full set of such manuals for an aircraft carrier has been estimated to be responsible for altering the draft of the ship by about three feet.
`
`
`
`
`
`
`
`
`
`
`
In accordance with the present invention, all of the manuals with their associated diagrams and illustrations can be stored in image form on a reasonable number of optical disks and can then be searched to locate the information of interest. Because of the limited number of fonts which are known in advance, the search speed is maximized and the weight associated with the printed documents is replaced by the comparatively trivial weight of several computers (which are already available on the ship in the form of personal computers) along with the optical disks and disk readers for cooperating with the computers.
`
`
`
`
`
`
`
`
Referring now to FIG. 2, the following example will involve the review, by the computer equipment, of the stored text of the manuals for the purpose of extracting (copying into RAM) the chapter headings of all chapters, and then putting them into alphabetical order. Although each chapter heading is preceded by a one- or two-digit reference number, for the present purpose those numbers are to be discarded and only the chapter names are to be retained. As indicated above, three fonts are used in the manuals, a large font for chapter headings, a medium font for subheadings and a “normal” font for the text. The first step 15 in the method thus is the establishment of a font table which cross-references the ASCII values for each letter, number and other symbol to the equivalent characters in the fonts used. For this example, only the largest font would be needed, but the table is established with all fonts when the mass storage files are created so that any manipulation can be done.
`
`
`
`
The identification of the font which is to be searched for is then entered, 16, along with the search area desired, i.e., any limitation on the area of the manuals which are to be searched. The search engine, such as the Benson system mentioned above, then examines the full text of the documents in image storage, 17, and attempts to match the pattern of the first character of each heading with the pattern in the table, 18. If there is a match, the character is examined, 20, to see if it is a number. If it is a number, the heading is further examined, 22, to see if the second character is a number or a blank space, 23. If the second character is found to be a number, there is a check to see if there is a blank space, 24, following the second number. Location of a blank in either place completes the pattern which identifies a heading. Thus, in addition to the font size, the material being examined has been confirmed as a heading and the numbers preceding the heading name have been isolated. Those numbers, having served their purpose for this search, are deleted, 25. Failure to find a match in the appropriate font size or a blank in the proper place, 26, is inconsistent with the known format of the stored documents. In such a case, the process is stopped and the operator is informed with an appropriate error message, 27.
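
The heading check in steps 20 through 27 can be sketched on already matched characters, that is, on the character codes obtained by comparing each image pattern with the heading font in the table. Working on a Python string rather than on bitmaps is a simplification made for this sketch, and the names `FormatError` and `strip_heading_number` are hypothetical. Treating a heading that does not begin with a digit as a format error is also an assumption; the text states only that each heading is preceded by a one- or two-digit number.

```python
class FormatError(ValueError):
    """Raised when a heading does not follow the known document format."""


def strip_heading_number(chars: str) -> str:
    """Confirm the 'digit(s) then blank' pattern and delete the number."""
    # Step 20: the first character must be a number.
    if not chars or not chars[0].isdigit():
        raise FormatError("heading does not start with a reference number")
    # Steps 22-23: a blank directly after a one-digit number.
    if len(chars) > 1 and chars[1] == " ":
        return chars[2:]
    # Step 24: a second digit must itself be followed by a blank.
    if len(chars) > 2 and chars[1].isdigit() and chars[2] == " ":
        return chars[3:]
    # Steps 26-27: no blank in the proper place, stop with an error.
    raise FormatError("no blank after the reference number")
```

For example, `strip_heading_number("7 HYDRAULIC SYSTEMS")` yields the heading name alone, while a three-digit prefix raises `FormatError`, mirroring the stop-and-report branch of the flow diagram.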
`
`
`
`
`
`
`
`
The following character is compared, 28, with the font table to determine which letter of the alphabet begins the first word of the heading. If there is no match, 30, then the method is stopped and the operator is informed, 27. If there is a match, all of the words in the heading are then copied and stored together, 32, as being a heading and they are stored in an alphabet group which will contain other headings starting with the same letter, i.e., if the heading is “HYDRAULIC SYSTEMS”, then it is stored in a memory area reserved for headings starting with H. The first letter of the stored heading will now be referred to as the “first character”.
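
Storing each heading in a group reserved for its first letter can be sketched as below. The sketch assumes each heading is carried as a pair of its matched first letter and a reference to its stored image; the pair layout and the name `group_by_first_character` are illustrative assumptions only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (matched first letter, reference to the heading's stored image)
Heading = Tuple[str, str]


def group_by_first_character(headings: Iterable[Heading]) -> Dict[str, List[str]]:
    """Place each heading image in the store reserved for its first letter."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for first_letter, image_ref in headings:
        groups[first_letter].append(image_ref)
    return dict(groups)
```

Only the first character is matched at this stage; the headings inside a group remain in the order in which they were encountered until the rearranging process of FIG. 3 is applied.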
`
The mass store is then examined to see if there is another document with a heading in storage, 34. If so, 35, the above steps beginning at 18 are repeated for each as indicated by the circled recirculation numeral 1. If the heading is found to be the last heading in the last document, then the process of arranging the documents within the alphabet groups begins as shown in FIG. 3. Actually, the rearranging process can be accomplished while the above steps are being repeated for the second and subsequent documents, but for simplicity, it is described herein as being a totally serial process.
`
`
`
`
`
`
`
`
The rearranging process begins, 36, with the first alphabet group in which the next, or Nth, character of each heading is examined, 38, for a match with a pattern in the font table. For this next character, N=2. If a match, 40, is found, the heading is relocated, 42, to a position in the memory consistent with that second letter position in the alphabet. The Nth character of the next heading is examined, 44, and if a match is found, 46, that heading is also relocated, 48. If no match is found, the character is checked, 50, to see if it is a blank. If it is not a blank, the process is stopped and an error message given, 27. If it is a blank, then N is increased by one, 52, and the next character of that same heading is checked, 54. If that also is a blank, 56, it is assumed that the heading has ended and the heading is relocated, 57. In addition, that heading is flagged as having been completed insofar as alphabetizing is concerned and it is passed by in subsequent operations. If it is not a blank, a match is sought, 58. A failure to find a match stops the process. If there is a match, then it means only that there was a space in the heading and it is relocated, 59. After either relocation 57 or 59, a check is made, 60, to see if this was the last heading in this first letter store. If not, N is reduced by one, 61, and the process continues from 44. If so, a check is made, 62, to see if this is the last heading in the last first letter store. If so, the entire alphabetizing process is ended. If not, N is reset to two, 64, and the process is continued with the next first letter store from 38.
`
`
`
`
`
`
`
`
`
Returning to step 48, after relocation, the process checks to see if that was the last heading, 66. If not, the process is recirculated to 44. If so, a check, 68, is made to see if this was the last heading in the last letter store. If so, the process has been completed and the headings are ready to be printed out or displayed in the desired order. If not, N is increased by one, 70, and the process repeats from 38.
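
The net effect of examining the Nth character of each heading and relocating it by alphabet position is an alphabetical ordering within each first-letter store. The sketch below does not reproduce the step numbering of FIG. 3; it swaps in a recursive bucket pass over the Nth character (a most-significant-character-first sort), works on matched character strings instead of image patterns, and treats a heading with no further characters as completed. The name `order_by_nth` and the zero-based index are assumptions of this sketch.

```python
from typing import Dict, List, Sequence


def order_by_nth(headings: Sequence[str], n: int = 1) -> List[str]:
    """Order headings that already agree on their first n characters.

    n is a zero-based index, so the default n=1 examines the second
    character, which the text calls N=2.
    """
    if len(headings) <= 1:
        return list(headings)
    # Headings with no character at position n have ended; they come first.
    ordered = [h for h in headings if len(h) <= n]
    buckets: Dict[int, List[str]] = {}
    for h in headings:
        if len(h) > n:
            buckets.setdefault(ord(h[n]), []).append(h)
    # Visit buckets in code order; a blank (code 32) precedes every letter.
    for code in sorted(buckets):
        ordered.extend(order_by_nth(buckets[code], n + 1))
    return ordered
```

Applied to one first-letter store, `order_by_nth(["HYDRAULIC SYSTEMS", "HULL", "HEATING"])` returns the headings in alphabetical order; repeating the call for each store in letter order gives the complete listing.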
`
`
`
`
`
`
`
`
`
`
In the above process, it will be apparent that instead of being relocated each time the next character is identified, an index can be built up to identify the storage locations of words having certain character values. Then, when a printout or display is desired, the images are read out on the basis of the index information.
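
The index alternative can be sketched as a list of storage positions ordered by the matched character values, with the images themselves left where they are. The sketch assumes the character values of each stored heading are available as a string in storage order; `build_index` and `read_out` are hypothetical names.

```python
from typing import List, Sequence


def build_index(character_values: Sequence[str]) -> List[int]:
    """Return storage positions ordered by the headings' character values."""
    return sorted(range(len(character_values)),
                  key=lambda position: character_values[position])


def read_out(images: Sequence[str], index: Sequence[int]) -> List[str]:
    """Read the stored images out in index order without moving them."""
    return [images[position] for position in index]
```

The design trade-off is the usual one: building the index costs one pass of comparisons, after which no image data has to be copied until a printout or display is actually requested.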
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
In the approach described above, the text is stored totally in image form, i.e., without conversion to ASCII or other code. In a modified version of that approach, special use can be made of the printed index which generally accompanies documents such as this to facilitate searching for and displaying desired portions of text or illustrations. In this modified approach, the printed index (as distinguished from any index created by the computer system) is not only stored in image but is also converted into code, using conventional character recognition equipment and software, either when the material is first scanned into mass storage or subsequently. Then, when one wishes to locate those parts of the stored text relating to a specific index item, the index is displayed from the stored code, the desired item is selected from the display and image search words are constructed from the font table in each of the fonts used in the document. Those image search words are then used in a pattern search, as discussed above, to locate the relevant parts of the text. The image store of the index need not be maintained after conversion. This approach retains the advantages of image storage for most of the material but facilitates retrieval by providing a more direct technique for finding relevant search words and constructing images from them for a pattern search. The conversion and editing time is minimized because the index is generally rather small (in terms of the number of characters) as compared with the entire document.
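
Constructing image search words from a coded index entry, once for each font used in the document, can be sketched as below. The font table layout, keyed by (font number, character), the 3x3 bitmaps used in any example, and the name `image_search_words` are assumptions for illustration; a character missing from the table would raise `KeyError` in this sketch.

```python
from typing import Dict, Iterable, List, Tuple

Glyph = Tuple[str, ...]                    # rows of '#' (ink) and '.' (blank)
FontTable = Dict[Tuple[int, str], Glyph]   # (font number, character) -> bitmap


def image_search_words(entry: str, fonts: Iterable[int],
                       table: FontTable) -> Dict[int, List[Glyph]]:
    """Build the pattern string for an index entry in every listed font."""
    words: Dict[int, List[Glyph]] = {}
    for font in fonts:
        words[font] = [table[(font, ch)] for ch in entry]
    return words
```

Each value in the returned mapping is one image search word, ready to be handed to a pattern search of the kind discussed earlier in the text.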
`
`
`
`
`
`
`
`
`
Generally speaking, the selection of search words is a topic which is discussed in detail in copending application Ser. No. 536,769, mentioned above. In documents not having a printed index of any kind, such as correspondence, search words are preferably selected in some fashion when the material is stored, or at least before it is expected to be used. Computer generated search word indexes can be created, as d