(12) United States Patent
Bressan et al.

(10) Patent No.: US 7,826,665 B2
(45) Date of Patent: Nov. 2, 2010

(54) PERSONAL INFORMATION RETRIEVAL USING KNOWLEDGE BASES FOR OPTICAL CHARACTER RECOGNITION CORRECTION

(75) Inventors: Marco Bressan, La Tronche (FR); Hervé Dejean, Grenoble (FR); Christopher R. Dance, Meylan (FR)

(73) Assignee: Xerox Corporation, Norwalk, CT (US)

( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1254 days.

(21) Appl. No.: 11/299,453

(22) Filed: Dec. 12, 2005

(65) Prior Publication Data
US 2007/0133874 A1    Jun. 14, 2007

(51) Int. Cl.
G06K 9/00    (2006.01)

(52) U.S. Cl. ........ 382/181; 382/186; 382/187

(58) Field of Classification Search ........ 382/181, 382/231
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,483,052 A      1/1996  Smith, III et al.
5,604,640 A      2/1997  Zipf et al.
5,754,671 A *    5/1998  Higgins et al. ............ 382/101
5,850,480 A     12/1998  Scanlon
6,783,060 B2     8/2004  Marappan
6,823,084 B2    11/2004  Myers et al.
7,120,302 B1 *  10/2006  Billester ................. 382/229
2001/0044324 A1    11/2001  Carayiannis et al.
2002/0131636 A1 *   9/2002  Hou ................... 382/181
2002/0191847 A1    12/2002  Newman et al.
2003/0069877 A1     4/2003  Grefenstette et al.
2003/0086615 A1     5/2003  Dance et al.
2005/0086205 A1     4/2005  Franciosa et al.
2005/0086224 A1     4/2005  Franciosa et al.

FOREIGN PATENT DOCUMENTS

DE  10104270    8/2002
GB   2392290    2/2004

OTHER PUBLICATIONS

Kilgarriff et al., “Introduction to the Special Issue on the Web as Corpus,” Computational Linguistics, vol. 29, No. 3, pp. 333-347 (2003).
Handley et al., “Document Understanding System Using Stochastic Context-Free Grammars,” pp. 5.

(Continued)

Primary Examiner—Samir A. Ahmed
Assistant Examiner—Edward Park

(74) Attorney, Agent, or Firm—Fay Sharpe LLP

(57) ABSTRACT

In a system for updating a contacts database (42, 46), a portable imager (12) acquires a digital business card image (10). An image segmenter (16) extracts text image segments from the digital business card image. An optical character recognizer (OCR) (26) generates one or more textual content candidates for each text image segment. A scoring processor (36) scores each textual content candidate based on results of database queries respective to the textual content candidates. A content selector (38) selects a textual content candidate for each text image segment based at least on the assigned scores. An interface (50) is configured to update the contacts list based on the selected textual content candidates.

15 Claims, 3 Drawing Sheets

[Front-page figure: FIG. 1 system diagram showing the portable imager (12), image pre-processor (14), image segmentor (16), segments tagger (32), text image segments (20), textual content candidates (optionally tagged), scoring processor (36), content selector (38), contact record (44), UI for verification or manual correction (50), personal contacts list (46), corporate directory (42), Internet addresses directory (40), and logos database (48).]

Page 1 of 11

ROTHSCHILD EXHIBIT 1009

US 7,826,665 B2
Page 2

OTHER PUBLICATIONS

Strohmaier et al., “Lexical Postcorrection of OCR-Results: The Web as a Dynamic Secondary Dictionary?,” pp. 5.
Martins et al., “Spelling Correction for Search Engine Queries,” pp. 12.
Xerox, “Xerox Document Imaging Technology Changes the Way People Communicate,” Public Relations Department, France, pp. 2 (2004).
Katsuyama et al., “Highly Accurate Retrieval of Japanese Document Images Through a Combination . . . ,” Proc. SPIE vol. 2670, pp. 57-67 (2002).
Saiga et al., “An OCR System for Business Cards,” IEEE Comput. Soc., pp. 802-805, 1993.
Likforman-Sulem et al., “Proper Names Extraction from Fax Images Combining Textual and Image Features,” IEEE, pp. 545-549, 2003.
Luo et al., “Design and implementation of a card reader based on build-in camera,” IEEE, vol. 1, pp. 417-420, 2004.

* cited by examiner

U.S. Patent    Nov. 2, 2010    Sheet 1 of 3    US 7,826,665 B2

[FIG. 1: system diagram showing the portable imager (12), image pre-processor (14), image segmentor (16), segments tagger (32), text image segments (20), logo image segments (22), personal contacts list, corporate directory, logos database, and Internet addresses directory.]

[Sheet 2 of 3 — FIG. 2: example business card bearing “John H. Smith”, “Process Engineer”, “ABC Widget Corporation”, “12345 Main Street”, “New York, NY 11111”.]

[Sheet 3 of 3 — FIG. 3: detail of the textual content candidates scoring processor, showing the text image segments (20), logo image segments (22), textual content candidates (optionally tagged), tagging adjuster, databases query (60), image DB query (62), local scoring processor, global scores adjuster (70), contact record content selector, and UI for verification or manual correction.]

PERSONAL INFORMATION RETRIEVAL USING KNOWLEDGE BASES FOR OPTICAL CHARACTER RECOGNITION CORRECTION

CROSS REFERENCE TO RELATED PATENTS AND APPLICATIONS

The following U.S. patent applications, relating generally at least to aspects of capturing text images and to processing of digitally captured text, are commonly assigned with the present application, and are incorporated herein by reference:

Dance et al., “Method and Apparatus for Capturing Text Images,” U.S. patent application Ser. No. 09/985,433 filed 2 Nov. 2001 and published as US 2003/0086615 A1, is incorporated by reference herein in its entirety.

Newman et al., “Portable Text Capturing Method and Device Therefor,” U.S. patent application Ser. No. 10/214,291 filed 8 Aug. 2002 and published as US 2002/0191847 A1, is incorporated by reference herein in its entirety.

The following U.S. patent application, relating generally at least to aspects of using knowledge bases for augmenting information, is commonly assigned with the present application, and is incorporated herein by reference:

Grefenstette et al., “System for Automatically Generating Queries,” U.S. patent application Ser. No. 09/683,235 filed 5 Dec. 2001 and published as US 2003/0069877 A1, is incorporated by reference herein in its entirety.

The following U.S. patent applications, relating generally at least to aspects of document retrieval, are commonly assigned with the present application, and are incorporated herein by reference:

Franciosa et al., “System and Method for Computing a Measure of Similarity between Documents,” U.S. patent application Ser. No. 10/605,631 filed 15 Oct. 2003 and published as US 2005/0086224 A1, is incorporated by reference herein in its entirety.

Franciosa et al., “System and Method for Performing Electronic Information Retrieval Using Keywords,” U.S. patent application Ser. No. 10/605,630 filed 15 Oct. 2003 and published as US 2005/0086205 A1, is incorporated by reference herein in its entirety.

BACKGROUND

The following relates to the information arts. It especially relates to methods and apparatuses for extracting textual personal information from business cards photographed using the built-in camera of a cellular telephone, and will be described with particular reference thereto. The following relates more generally to extraction of textual personal information from images acquired by portable imagers such as digital cameras, handheld scanners, and so forth, and to acquiring personal information by using a portable imager in conjunction with text extraction techniques, and so forth.

The cellular telephone including built-in digital camera is a common device carried by business and professional persons. While having a wide range of uses, one application to which the digital camera component of cellular telephones is applied is the rapid capture of business card images. When meeting someone for the first time, or when meeting someone whose personal information has changed due to a job transfer, promotion, or so forth, it is convenient for the business or professional person to use the built-in camera of his or her cellular telephone to photograph the business card of the newly met person, thus creating a digital image of the business card. In effect, the built-in digital camera of the cellular telephone is used as a kind of portable instant document scanner. However, the photograph is in an image format, such that the textual content is not immediately accessible for input to a text-based personal contacts list or other text-based database.

Optical character recognition (OCR) software extracts textual information from images. Thus, a desirable combination is to apply OCR to extract textual information from the business card image acquired using the built-in digital camera of the cellular telephone. Once text is extracted, each text line can optionally be tagged as to data type (such as tagging text lines as “personal name”, “job title”, “entity affiliation”, or so forth), and optionally incorporated into a contacts database. In practice, however, it has been found to be difficult to effectively apply OCR to business card images acquired using digital cameras.

One problem which arises is that the resolution of the built-in digital cameras of cellular telephones is typically low. The built-in cameras of existing cellular telephones sometimes have a so-called VGA resolution corresponding to the coarse pixel density of a typical display monitor. Some existing cellular telephones have built-in cameras with higher resolution, such as around 1-2 megapixels or more. It is anticipated that the built-in camera resolution will increase as cost-per-pixel decreases. However, even with improved pixel resolution, image quality is likely to be limited by poor optics. Higher manufacturing costs of the physical optical system as compared with electronics has tended to cause manufacturers to use optics of limited quality. Lens quality is improving at a substantially slower rate than resolution, and so this aspect of typical cellphone cameras is less likely to improve substantially in the near future. Further, the trend toward more compact or thinner cellular telephones calls for miniaturized optics, which are difficult to manufacture with high optical quality. Common adverse effects of poor lenses include image noise, aberrations, artifacts and blurring. OCR tends to produce more errors and higher uncertainty under these conditions.

Additionally, the cellular telephone is held by hand, focused on the small business card, during imaging of the business card. Accordingly, unsteadiness of the camera during the photographing can produce blurring, artifacts, or other image degradation. Image acquisition is typically done in uncontrolled conditions, such as variable lighting, strong shadows, non-expert usage, variable distance to objective, variable three-dimensional viewing angle, and so forth. The acquired document image orientation often has substantial scale, skew, and/or rotation components, and may have substantial variation in illumination. In summary, the physical characteristics of the camera, non-ideal imaging environment, and the typically limited photographic skill of the operator combine such that a built-in digital camera of a cellular telephone typically acquires business card images of relatively low quality with substantial image defects, which tends to lead to substantial errors and uncertainty in the OCR.

The textual content of the business card also does not lend itself to accurate OCR. In typical OCR processing, objects are recognized and identified as letters, numerals, punctuation, or other characters based on pattern matching, but with some uncertainty because the rendering of the characters is less than optimal, because the text font may vary, and so forth. To counter these difficulties, OCR processing sometimes resolves uncertainties by comparing uncertain words or phrases against an electronic dictionary or grammar checker. These approaches are relatively ineffective when applied to OCR conversion of the textual content of business cards, because the content (such as personal names, job titles, affiliations, addresses, and so forth) is typically not found in electronic dictionaries and typically does not follow conventional grammar rules. Thus, the nature of the textual content tends to lead to unresolvable errors and uncertainty in the OCR.

BRIEF DESCRIPTION

According to aspects illustrated herein, there is provided a system for updating a contacts database. A portable imager is configured to acquire a digital business card image. An image segmenter is configured to extract text image segments from the digital business card image. An optical character recognizer (OCR) is configured to generate one or more textual content candidates for each text image segment. A scoring processor is configured to score textual content candidates based on results of database queries respective to the textual content candidates. A content selector selects a textual content candidate for each text image segment based at least on the assigned scores. An interface is configured to update the contacts database based on the selected textual content candidates.

According to aspects illustrated herein, there is provided a method for acquiring personal information. A business card image is acquired. A text image segment is extracted from the business card image. Optical character recognition (OCR) is applied to the text image segment to generate a plurality of textual content candidates. At least one database is queried respective to each of the textual content candidates. A most likely one of the textual content candidates is selected based at least on records returned by the querying.
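The method paragraph above can be sketched end to end. The following is a minimal illustration only, with the segmentation, OCR, and database layers replaced by invented stubs (the candidate lists, hit counts, and function names are not from the patent):

```python
# Minimal sketch of the claimed method: for each text image segment,
# OCR yields several textual content candidates; each candidate is
# queried against a database, and the candidate with the most
# supporting records is selected. All names here are illustrative stubs.

# Stub "database": hit counts a real directory query might return.
FAKE_HITS = {
    "John H. Smith": 120,
    "John N. Smith": 85,
    "Yohn H. Smith": 0,
}

def query_hits(candidate):
    """Stand-in for querying a contacts, corporate, or Internet directory."""
    return FAKE_HITS.get(candidate, 0)

def select_candidate(candidates):
    """Score each OCR candidate by returned records; keep the best-supported one."""
    return max(candidates, key=query_hits)

segment_candidates = ["John N. Smith", "Yohn H. Smith", "John H. Smith"]
best = select_candidate(segment_candidates)
```

A real implementation would repeat this selection for every text image segment of the card before assembling the contact record.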
According to aspects illustrated herein, there is provided a system for generating a textual contact record from textual content candidates extracted by optical character recognition (OCR) from text image segments of a business card image. A databases query queries at least one database respective to the textual content candidates and collects records returned responsive to the queries. A content candidates scoring processor assigns scores to the textual content candidates based on the collected records. A content selector selects a textual content candidate for each text image segment based at least on the assigned scores.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 diagrammatically shows a system for acquiring a business card image and processing the business card image to construct a contact record.

FIG. 2 shows a typical example business card including personal name, title, business affiliation, business address, and a corporate logo.

FIG. 3 diagrammatically shows principal components of the textual content candidates scoring processor of the system of FIG. 1.

DETAILED DESCRIPTION

With reference to FIG. 1, a business or professional person or other person receives a business card 10, which he or she wants to add to a contacts database, contact list, or so forth. Accordingly, the person acquires an image of the business card 10 using a portable imager 12, which may for example be a built-in camera of a cellular telephone, a portable digital camera, a handheld document scanner, or so forth. Using the built-in camera of a cellular telephone as the portable imager 12 has advantages in that the cellular telephone is a portable device commonly carried by business and professional persons into meetings and other interpersonal transactional settings. Using the built-in camera of a cellular telephone or a point-and-shoot-type digital camera advantageously allows the business card image to be acquired at the click of a shutter button. However, it is also contemplated to acquire the image using a portable scanner or other scan-based portable imager.

In the illustrated embodiment, the business card 10 is a physical card, such as a 2×3½-inch business card or similar-sized business card that is commonly carried by business and professional persons. However, the term “business card” is intended to encompass other printed personal information summaries which may be advantageously digitally imaged by a portable imager and processed to extract textual personal information for inclusion in a contacts database. For example, the term “business card” as used herein may also encompass presenter information appearing on the title slide of a printed copy of an overhead presentation, or author information appearing on the first page of a scientific or technical article pre-print, or so forth. The personal information content of business cards typically includes personal name, job title, affiliation (such as a company name, university name, firm name, or so forth), graphical affiliation logo (such as a corporate logo, university logo, firm logo, or so forth), business address information, business telephone number, business facsimile number, email address, or so forth. A given business card may include only some of these items or all of these items, and may include additional or other information.

Optionally, an image pre-processor 14 performs selected image pre-processing on the acquired business card image. Such pre-processing may include, for example, squaring of the image, re-sizing the image, performing a blurring correction, shadow correction, reflection correction, or other correction, converting the business card image to black-and-white, performing image compression, or so forth. In some embodiments, the image pre-processor 14 is embodied by Mobile Document Imaging software (available from Xerox Corporation, Xerox Research Centre Europe, Grenoble, France) disposed on and executing on the cellular telephone or other portable imager 12. In other embodiments, the image pre-processor may be other pre-processing software disposed on and executing on the portable imager 12. In other embodiments, the image pre-processor 14 may execute on a network server, personal computer, or other computer, in which case the image pre-processor 14 receives the business card image from the cellular telephone or other portable imager 12 by a suitable wired or wireless communication path such as a Bluetooth link, a mobile telephone network link, or so forth.
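Two of the pre-processing operations listed above, conversion to black-and-white and re-sizing, can be illustrated without any imaging library. This is a toy sketch on a grayscale image held as a list of rows, with an assumed fixed threshold; the blur, shadow, and perspective corrections performed by a real pre-processor are omitted:

```python
# Toy pre-processing sketch: threshold a grayscale image (pixel values
# 0-255) to black-and-white, then downsample by 2 in each direction.
# The threshold of 128 is an assumption for illustration.

def to_black_and_white(image, threshold=128):
    """Map each pixel to 0 (black) or 255 (white)."""
    return [[255 if px >= threshold else 0 for px in row] for row in image]

def downsample_by_2(image):
    """Keep every second pixel of every second row."""
    return [row[::2] for row in image[::2]]

gray = [
    [200, 210, 30, 40],
    [190, 220, 20, 50],
    [100, 90, 240, 250],
    [110, 80, 230, 245],
]
bw = to_black_and_white(gray)
small = downsample_by_2(bw)
```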
With continuing reference to FIG. 1 and with further reference to FIG. 2, the acquired and optionally pre-processed business card image is segmented by an image segmenter 16 to extract text image segments 20 and optional logo image segments 22. Each of the text image segments 20 suitably corresponds to a dot-matrix representation of a line of text in the business card image. For the example business card of FIG. 2, the text image segments 20 may include the following five text image segments: “John H. Smith”, “Process Engineer”, “ABC Widget Corporation”, “12345 Main Street”, and “New York, NY 11111”. The text image segments retain the font characteristics such as kerning, since the text image segments are not character-based. In this example, the text image segments correspond to physical lines of text. However, depending upon the layout of the business card and the segmenting algorithm implemented by the image segmenter 16, the text image segments may in some embodiments correspond to units other than physical lines of text.
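Line-level segmentation of the kind described can be sketched with a horizontal projection profile: consecutive image rows containing ink are grouped into bands, each band becoming one text image segment. This is one common illustrative technique, not necessarily the algorithm used by the image segmenter 16:

```python
# Sketch: find text-line bands in a binary image (1 = ink) by grouping
# consecutive rows whose ink count is non-zero. Each (top, bottom) band
# would be cropped out as one text image segment.

def line_bands(binary_image):
    """Return (start_row, end_row_exclusive) pairs for each run of inked rows."""
    bands, start = [], None
    for y, row in enumerate(binary_image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new text band begins
        elif not has_ink and start is not None:
            bands.append((start, y))       # the band just ended
            start = None
    if start is not None:                  # band running to the bottom edge
        bands.append((start, len(binary_image)))
    return bands

img = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],   # line 1
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],   # line 2
]
bands = line_bands(img)
```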
Similarly, the image segmenter 16 optionally also extracts logo image segments 22. For the example business card of FIG. 2, the logo image segments 22 may include the single logo shown at the left side of the business card, showing the company name “ABC” with the A inscribed into a left-slanted “W” indicative of the widget products of ABC Widget Corporation.

An optical character recognizer (OCR) 26 processes each of the text image segments 20 to generate character-based textual content candidates 30. The OCR operates based on a pattern recognition algorithm or algorithms which identify characters based on matching with expected character shapes. Errors or uncertainty in the output of the OCR processing can be expected to occur due to various factors, such as: less than ideal match between a printed character and the expected pattern; non-optimal image quality (in spite of improvements provided by the image pre-processor 14); short or unusual textual content such as names and addresses; difficult-to-match fonts having substantial flourishes or other artistic features; and so forth. Accordingly, the OCR 26 outputs one or (if uncertainty exists) more than one character-based textual content candidate for each text image segment. For example, OCR processing of the text image segment: “John H. Smith” may produce several different textual content candidates, such as: “John N. Smith”, “Yohn H. Smith”, “John H. Smith”, and so forth.

To resolve uncertainties, the OCR 26 optionally utilizes additional information or post-conversion processing such as a spelling checker, a grammar checker, or so forth. However, because the content of business cards typically includes personal names, addresses, and so forth that are not commonly found in dictionaries, and because the content of business cards is typically not laid out in grammatically proper form, attempts to resolve uncertainties using dictionaries or grammar checkers are unlikely to be effective for the present application.
In some embodiments, the OCR 26 assigns a confidence level to each textual content candidate based on the closeness of the pattern match and optionally based on other information such as whether the textual content candidate (or words within the textual content candidate) match a term found in a dictionary. Again, because of the non-standard content, fonts, and text layout of typical business cards, the confidence levels assigned to the textual content candidates by the OCR 26 may be more suspect than in other typical OCR applications.
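The confidence assignment described can be sketched as a per-candidate score combining pattern-match closeness with a small dictionary bonus. The weighting and the word list below are invented for illustration; they also make visible why the dictionary signal is weak here, since names like “Smith” earn no bonus:

```python
# Sketch: confidence = pattern-match closeness, nudged upward when the
# candidate's words appear in a word list. The bonus weight (0.05 per
# known word) and the word list are assumptions, not from the patent.

WORD_LIST = {"process", "engineer", "main", "street"}  # illustrative only

def confidence(match_closeness, candidate):
    """Combine match closeness (0..1) with a capped dictionary bonus."""
    words = [w.strip(".,").lower() for w in candidate.split()]
    in_dict = sum(w in WORD_LIST for w in words)
    return min(1.0, match_closeness + 0.05 * in_dict)

c1 = confidence(0.80, "Process Engineer")   # both words in the list
c2 = confidence(0.80, "John H. Smith")      # names absent from the list
```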
Optionally, a segments tagger 32 attempts to tag the text image segments 20 with an indication of the type of content each text image segment conveys. For example, suitable tags for business card content may include “personal name”, “job title”, “entity” (a suitable tag for an affiliation such as a corporation, university, or the like), “address line”, or so forth. The segments tagger 32 can use various pieces of information in assigning tags to the text image segments 20. For example, the first line and/or the line with the largest font size is often the name of the person whose information is conveyed by the card. Hence, the position and large font size of the text image segment: “John H. Smith” may enable the tagger 32 to tag this text image segment as a personal name. The relative font size is suitably derived from the text image segment 20. Address information often starts with numerals—hence, the text image segment: “12345 Main Street” may be tagged as the first line of an address by the tagger 32. Recognition that text begins with numeric characters is suitably derived from the character-based textual content candidates 30. The tagger 32 may in general operate on the text image segments 20, the textual content candidates 30, or both. As with the OCR processing, it will be appreciated that tags optionally assigned by the segments tagger 32 will typically have a certain degree of uncertainty.
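The tagging heuristics mentioned (the largest font suggests a personal name; a leading numeral suggests an address line) can be sketched as simple rules over per-segment features. The tag strings follow the examples in the text; the feature representation is an assumption:

```python
# Sketch of rule-based segment tagging using two cues described above:
# the segment with the largest font is tagged "personal name", and
# segments whose text starts with digits are tagged "address line".

def tag_segments(segments):
    """segments: list of (text, font_size) pairs. Returns one tag per segment."""
    largest = max(size for _, size in segments)
    tags = []
    for text, size in segments:
        if size == largest:
            tags.append("personal name")
        elif text[:1].isdigit():
            tags.append("address line")
        else:
            tags.append("unknown")
    return tags

card = [
    ("John H. Smith", 14),
    ("Process Engineer", 10),
    ("12345 Main Street", 10),
]
tags = tag_segments(card)
```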
In view of the uncertainty and limited accuracy of the OCR and tagging, the textual content candidates 30 and assigned tags typically have a degree of uncertainty. In order to select from amongst two or more textual content candidates corresponding to a single text image segment (for example, to select from amongst “John N. Smith”, “Yohn H. Smith”, “John H. Smith”, and so forth for the text image segment “John H. Smith”), a textual content candidates scoring processor 36 assigns a score to each textual content candidate. The score assigned to each textual content candidate reflects a weight, probability, or other indicator of likelihood of correctness of the textual content candidate. A contact record content selector 38 selects the textual content candidate for each text image segment for inclusion in the contact record based at least on the assigned scores. The textual content candidates scoring processor 36 assigns the score to each textual content candidate on the basis of records collected from queries against at least one database respective to the textual content candidates.
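The score described above, “a weight, probability, or other indicator of likelihood,” can be illustrated by normalizing per-candidate record counts within one segment. The add-one smoothing constant is an assumption for illustration, not a detail of the patent:

```python
# Sketch: turn raw record counts for the candidates of one text image
# segment into probability-like scores. Add-one smoothing keeps a
# zero-hit candidate from being ruled out before other evidence
# (tags, logo metadata, cross-segment records) is considered.

def candidate_scores(hit_counts):
    """hit_counts: dict candidate -> records returned. Returns scores summing to 1."""
    smoothed = {c: n + 1 for c, n in hit_counts.items()}
    total = sum(smoothed.values())
    return {c: n / total for c, n in smoothed.items()}

hits = {"John H. Smith": 7, "John N. Smith": 1, "Yohn H. Smith": 0}
scores = candidate_scores(hits)
best = max(scores, key=scores.get)
```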
FIG. 1 diagrammatically illustrates some suitable databases for querying by the textual content candidates scoring processor 36, such as an Internet address book or directory 40 or a corporate directory 42, both of which are accessible via a network pathway 44, or a personal contacts list 46, which may reside on the portable imager 12 (for example as a cellular telephone contacts list) or on the business or professional person’s personal computer. In the latter case, the person identified by the business card 10 may be known to the person acquiring the personal information, but the business card 10 may include updated affiliation information (for example, if the person identified by the business card 10 has recently changed jobs), updated position (for example, if the person identified by the business card 10 has recently been promoted), or so forth.

The database queries provide information as to whether a textual content candidate is reasonable. For the example text image segment: “John H. Smith”, queries respective to the textual content candidates “John N. Smith”, “Yohn H. Smith”, “John H. Smith” should readily eliminate at least “Yohn H. Smith” since this candidate will produce few or no hits. Moreover, the collected records may be useful to update the tagging information. For example, if a collected record includes “John H. Smith” tagged as a personal name, this can be used to bias the tagging of the text image segment: “John H. Smith” toward being tagged as a personal name. In some embodiments, the optionally extracted logo image segments 22 are also queried against a logos database 48. The logo query may return metadata pertaining to the logo. For the example business card of FIG. 2, a logo query on the logo shown at the left side of the business card may return entity identification metadata associating the logo with the ABC Widget Corporation. This, in turn, can be used to select “ABC Widget Corporation” from amongst other possible textual content candidates for the text image segment “ABC Widget Corporation”.

The contact record content selector 38 selects a most likely textual content candidate for each text image segment for inclusion in the contact record. The contact record is suitably stored in a contacts database such as the corporate directory 42, the personal contacts list 46, both contacts databases 42, 46, or so forth. Optionally, a user interface 50 is provided to enable the acquiring person to review and optionally edit the constructed contact record for the person whose information is conveyed by the business card 10 prior to storage in one or more contacts databases 42, 46.

As noted previously, the image pre-processor 14 may reside either on the portable imager 12 (for example as a software application on a cellular telephone having a built-in camera), or may reside on a network server, personal computer, or so forth. In similar fashion, the image segmenter 16, OCR processor 26, segments tagger 32, scoring processor 36, contact record content selector 38, and user interface 50 may be variously distributed. In some embodiments, the processing components 14, 16, 26, 32, 36, 38, 50 all reside on the portable imager 12. In these embodiments, the network pathway 44 suitably includes a wireless mobile telephone connection, a wireless connection to an Internet hotspot, or other wireless portion. In some embodiments, some or all of the processing components 14, 16, 26, 32, 36, 38, 50 reside on a network server, personal computer, or the like. Data is transferred at the appropriate point in processing from the portable imager 12 to the network server, personal computer, or so forth. In these embodiments, the network pathway 44 may include wired or wireless components as is suitable for the network server, personal computer, or so forth performing the processing. As just one example of one of these latter embodiments, the image pre-processor 14 and image segmenter 16 may reside on the portable imager 12 as cellular telephone-based software, and the text image segments 20 and optional logo image segments 22 output by the cellphone-based image segmenter 16 are transferred to a corporate network (such as to a network server) for OCR and further processing. Once the contact record is generated on the network, it may be stored in the corporate directory 42, and/or may be sent back to the portable imager 12 for storage in the personal contacts list 46 residing on the portable imager 12 as a cellular telephone contacts list.

With continuing reference to FIG. 1 and with further reference to FIG. 3, some illustrative example embodiments of the textual content candidates scoring processor 36 are described. A databases query 60 queries one or more databases 40, 42, 46 respective to each of the textual content candidates 30, and collects records returned responsive to the query. A local scoring processor 62 computes a score for each textual content candidate based on records returned by the query respective to that textual content candidate. If only a single database is queried, then one suitable local score for textual content candidate p may be a total of the number of records or hits returned by the query. If more than one database is queried, then one suitable local score for textual content candidate p may be:
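Equation (1) is not legible in this reproduction of the document. Reading it from the surrounding text (the single-database local score is the raw hit count, and the multi-database case generalizes it), one plausible form is a weighted sum of per-database hit counts. The form of the sum and the weights below are assumptions, not the patent's actual equation or values:

```python
# Hedged sketch of a possible multi-database local score for candidate p:
#     localscore(p) = sum over databases n of w_n * hits_n(p)
# where hits_n(p) is the number of records database n returns for p and
# w_n reflects how much that database is trusted. Weights are invented.

WEIGHTS = {"corporate_directory": 2.0, "personal_contacts": 1.5, "internet": 0.5}

def local_score(hits_per_db):
    """hits_per_db: dict database name -> hit count for one candidate."""
    return sum(WEIGHTS[db] * n for db, n in hits_per_db.items())

score = local_score({"corporate_directory": 3, "personal_contacts": 1, "internet": 40})
```

The weighting lets a few hits in a trusted corporate directory outweigh many hits from a broad Internet query.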
If the person whose information is being acquired is one outside of the employing corporation, then the queries performed by the databases query 60 are suitably expanded to encompass other databases such as the Internet addresses directory 40.
In some embodiments, the local score of Equation (1) is suitably used as the score for selecting which of two or more textual content candidates is the correct extracted text for a corresponding text image segment. However, while computationally straightforward, this exclusively localized approach may fail to take advantage of interrelationships between the various text image segments 20. To take the business card of FIG. 2 as an example, the name “John Smith” is common in the United States—accordingly, database queries may collect a large number of records for both the correct “John H. Smith” textual content candidate, and also for the incorrect “John N. Smith” textual content candidate. Thus, an exclusively localized approach may be unable to accurately distinguish the correct “John H. Smith” over the incorrect “John N. Smith.” This ambiguity may be resolvable by taking into account records collected for queries on the text image segment “ABC Widget Corporation”. Collected records for the correct textual content candidate “ABC Widget Corporation” are likely to include records that also contain “John H. Smith” since he is an employee of ABC Widget Corporation, and are much less likely to include records that also contain “John N. Smith”, assuming that ABC Widget Corporation does not have any employees by that name.
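The cross-segment reasoning in this example, boosting “John H. Smith” because records retrieved for “ABC Widget Corporation” also contain that name, can be sketched as a co-occurrence bonus added to each local score. The bonus weight and function names are illustrative assumptions:

```python
# Sketch of a global adjustment: a candidate's score is raised when it
# appears inside records collected for candidates of *other* segments.
# CO_OCCURRENCE_BONUS is an invented weight, not from the patent.

CO_OCCURRENCE_BONUS = 10.0

def adjust(local_scores, other_segment_records):
    """local_scores: dict candidate -> local score for one segment.
    other_segment_records: record texts returned for other segments' queries."""
    adjusted = dict(local_scores)
    for cand in adjusted:
        hits = sum(cand in rec for rec in other_segment_records)
        adjusted[cand] += CO_OCCURRENCE_BONUS * hits
    return adjusted

local = {"John H. Smith": 50.0, "John N. Smith": 48.0}
records = ["ABC Widget Corporation staff: John H. Smith, Process Engineer"]
adjusted = adjust(local, records)
best = max(adjusted, key=adjusted.get)
```

Here the near-tie between the two name candidates is broken by the company-segment record mentioning only one of them.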
Thus, in some embodiments a global adjuster 70 modifies the score of a first textual content candidate corresponding to a first text image segment when at least one record returned by a query respective to another textual content candidate corresponding to another text image segment also includes the first textual content candidate. More generally, the global adjuster 70 is configure