(12) Patent Application Publication    (10) Pub. No.: US 2009/0196510 A1
Gokturk et al.    (43) Pub. Date: Aug. 6, 2009
`
`
`
US 20090196510A1
`
(54) SYSTEM AND METHOD FOR ENABLING
THE USE OF CAPTURED IMAGES
THROUGH RECOGNITION

(76) Inventors: Salih Burak Gokturk, Mountain View, CA (US); Dragomir Anguelov, San Francisco, CA (US); Vincent Vanhoucke, Menlo Park, CA (US); Kuang-Chih Lee, Mountain View, CA (US); Diem Vu, Mountain View, CA (US); Danny Yang, Palo Alto, CA (US); Munjal Shah, Los Altos, CA (US); Azhar Khan, San Francisco, CA (US)
`
Correspondence Address:
SHEMWELL MAHAMEDI LLP
4880 STEVENS CREEK BOULEVARD, SUITE 201
SAN JOSE, CA 95129-1034 (US)
`
(21) Appl. No.: 12/395,422

(22) Filed: Feb. 27, 2009
`
Related U.S. Application Data

(63) Continuation of application No. 11/246,742, filed on Oct. 7, 2005, now Pat. No. 7,519,200.

(60) Provisional application No. 60/679,591, filed on May 9, 2005.
Publication Classification

(51) Int. Cl.
G06K 9/62    (2006.01)
G06K 9/54    (2006.01)
(52) U.S. Cl. ........................ 382/224; 382/305

(57)
`ABSTRACT
An embodiment provides for enabling retrieval of a collection of captured images that form at least a portion of a library of images. For each image in the collection, a captured image may be analyzed to recognize information from image data contained in the captured image, and an index may be generated, where the index data is based on the recognized information. Using the index, functionality such as search and retrieval is enabled. Various recognition techniques, including those that use the face, clothing, apparel, and combinations of characteristics may be utilized. Recognition may be performed on, among other things, persons and text carried on objects.
`
`
`
[Sheet 1 of 16 — FIG. 2: block diagram relating detected objects 32, recognition information, correlation information 42, objectified image 50, input 52, search feature 60, a categorization/sort feature, and extrapolation feature 66. FIG. 3: flowchart — Train Classification Algorithm Using Training Set Of Images (210); Apply Algorithm To Window About Each Image Portion (220); Validate Guess Using Skin Color Of Detected Face Region (230); Validate Guess Using Marker Feature Of Face. FIG. 4: flowchart — Detect Face In Image (310); Create Normalized Rendition Of Detected Face (320); Generate Recognition Signature For Each Detected Face (330); Match Recognition Signature To Identity (340).]
`
`
`
[Sheet 2 of 16 — FIG. 5: flowchart — Detect Face Of Person (410); Extract Image Data From Window Located A Designated Location From Face (420); Determine A Quantification On Clothing In Window (430). FIG. 6: block diagram — input feeding face recognition 520, clothing recognition 530, location metadata 550, and time metadata 540, yielding a location vector 554, a clothing vector 558, and a time vector 556.]
[Sheet 3 of 16 — FIG. 7: flowchart — Cluster Image Files Deemed As Having The Same Person (710); Receive Identity Assignment For Cluster As A Whole From User (720); Store Correlation Information For Subsequent Use (730).]
`
`
`
[Sheet 4 of 16 — FIG. 8: screen shot of programmatically clustered images with per-cluster match counts; the overlaid text is illegible in the scan.]
`
`
`
[Sheet 5 of 16 — FIG. 9: flowchart — An Image Is Analyzed To Detect Presence Of Text (910); Recognize Detected Text (920); Interpret Text For Context And Meaning (930).]
[Sheet 6 of 16 — FIG. 10B and FIG. 10C: sample storefront-sign images illustrating the text stretching post-processing technique; FIG. 10D: sample images illustrating the text tilting post-processing technique (the sign text is largely illegible in the scan).]
`
`
`
[Sheet 7 of 16 — FIG. 11: flowchart — Detect Text From A Given Image In A Collection (1110); Make Determination As To Whether Text Provides A Relevant Tag (1120); if not relevant, Ignore Text (1125); if relevant, Make Determination As To Whether Text Is Spannable (1130); if not spannable, Do Not Span (1140); if spannable, Determine Image Grouping From Collection Using Spannable Text (1150).]
`
`
`
[Sheet 8 of 16 — FIG. 12: system diagram — a multi-user network service with a network library 1247 and a local library 1243 feed a crawler 1292, alongside a manual source and a programmatic source 1294; image input 1206 and text input 1204 are joined by an image-text link 1209; a new image check 1208 produces a picture ID 1228, new image data 1227, and a new/old response 1229 for an image analysis module 1220 comprising a person analysis component 1222, a text analysis component 1224, and an object analysis component 1226; person IDs 1233 and text object IDs 1235, together with non-image index data, feed an ID information indexer 1240 that builds an ID information index 1242 from ID index data 1245, while person and object signatures (1257, 1253) feed a signature indexer 1250 that builds a signature index 1252 from signature index data.]
`
`
`
[Sheet 9 of 16 — FIG. 13: person analysis component 1300 — image input 1302 passes to face detect 1310 and a metadata extractor 1312; a normalized input feeds a facial identifier, clothing/apparel analysis, hair info 1346, gender analysis 1328, and relationship analysis 1329, with refinement 1326; face info 1342 and general info 1348 flow to a context analysis & data inference (CADI) module 1340, which returns CADI feedback 1355; signatures 1352, 1353 and metadata pass to a recognition indexer 1360 and a metadata store.]
`
`
`
[Sheet 10 of 16 — FIG. 14A and FIG. 14B: graphical representations of the Markov random field models; no labels are recoverable from the scan.]
`
`
`
[Sheet 11 of 16 — FIG. 15: text recognition system — image input 1508 passes to a text detector 1510; detected image text 1522 undergoes text processing 1520 to yield processed text data 1532; a context and interpretation build 1540 combines image file metadata 1542 into recognition terms 1544; an indexer for image/text 1560 writes identity 1576 and metadata 1566 to a metadata store.]
`
`
`
[Sheet 12 of 16 — FIG. 16: search system — text input 1702 and an object select 1708 on an objectified image 1706 pass through a user-interface 1710, along with image/metadata 1712; an image analysis component 1720 generates a signature 1722; a search module 1730 applies criteria 1732 against a text index 1742, a signature index 1744, and an image store 1746.]
`
`
`
[Sheet 13 of 16 — FIG. 17: flowchart — Recognition Information And Data Is Generated (1810); Associate Recognition Information And Data With Image File (1820); Store Correlation Information For Subsequent Use (1830); Detect User-Action Resulting In Use Of Recognition Information And Data (1840).]
`
`
`
[Sheet 14 of 16 — FIG. 18: an objectified image file as rendered (1910), with header metadata for object ID 1920 and recognition information, and outlined selectable regions for a text image ("TEXT"), a face image, and a landmark ("LANDMARK") (1912, 1914). FIG. 19: an objectified image file (1930) whose text image, face image, and landmark regions are linked by file ID to a metadata data store 1970 (e.g. Object 1 -- Signature 1).]
`
`
`
[Sheet 15 of 16 — FIG. 20: example of an objectified image rendering with metadata displayed in correspondence with recognized objects; no text is recoverable from the scan.]
`
`
`
[Sheet 16 of 16 — FIG. 21: similarity matching system — image data with person 2010 passes to an analysis component 2020, which queries a database of stored signatures 2030 and returns similar matches 2034.]
`
`
`
`
`SYSTEM AND METHOD FOR ENABLING
`THE USE OF CAPTURED IMAGES
`THROUGH RECOGNITION
`
`RELATED APPLICATIONS
`
[0001] This application is a continuation of U.S. patent application Ser. No. 11/246,742 filed Oct. 7, 2005, entitled SYSTEM AND METHOD FOR ENABLING THE USE OF CAPTURED IMAGES THROUGH RECOGNITION, which claims priority to U.S. Provisional Patent Application No. 60/679,591 filed May 9, 2005, entitled METHOD FOR TAGGING IMAGES; all of the aforementioned applications are hereby incorporated by reference in their entirety for all purposes.
`
`TECHNICAL FIELD
`
[0002] The disclosed embodiments relate generally to the field of digital image processing. More particularly, the disclosed embodiments relate to a system and method for enabling the use of captured images.
`
`BACKGROUND
`
[0003] Digital photography has become a consumer application of great significance. It has afforded individuals convenience in capturing and sharing digital images. Devices that capture digital images have become low-cost, and the ability to send pictures from one location to another has been one of the driving forces behind the demand for more network bandwidth.
[0004] Due to the relatively low cost of memory and the availability of devices and platforms from which digital images can be viewed, the average consumer maintains most digital images on computer-readable mediums, such as hard drives, CD-ROMs, and flash memory. File folders are the primary means of organization, although applications have been created to aid users in organizing and viewing digital images. Some search engines, such as GOOGLE, also enable users to search for images, primarily by matching text-based search input to text metadata or content associated with images.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
[0005] FIG. 1 illustrates a sequence of processes which may be performed independently in order to enable various kinds of usages of images, according to an embodiment.
[0006] FIG. 2 illustrates an embodiment in which the correlation information may be used to create objectified image renderings, as well as enable other functionality.
[0007] FIG. 3 describes a technique for detecting a face in an image, under an embodiment of the invention.
[0008] FIG. 4 illustrates a technique for recognizing a face in an image, under an embodiment of the invention.
[0009] FIG. 5 illustrates a technique for recognizing a person in an image using clothing and/or apparel worn by the person in the image, under an embodiment of the invention.
[0010] FIG. 6 is a block diagram illustrating techniques for using recognition information from different physical characteristics of persons in order to determine a recognition signature for that person, under an embodiment of the invention.
[0011] FIG. 7 illustrates a method for correlating an identity of a person with recognition information for that person, under an embodiment of the invention.
[0012] FIG. 8 illustrates an embodiment in which clustering of images is performed programmatically.
[0013] FIG. 9 illustrates a basic method for recognizing and using text when text is provided on objects of an image, under an embodiment of the invention.
[0014] FIG. 10A provides individual examples of features, provided as block patterns, for purpose of detecting the presence of text in an image, under an embodiment of the invention.
[0015] FIG. 10B and FIG. 10C illustrate examples of a text stretching post-processing technique for text in images, under an embodiment of the invention.
[0016] FIG. 10D illustrates examples of a text tilting post-processing technique for text in images, under an embodiment of the invention.
[0017] FIG. 11 illustrates a technique in which a detected and recognized word in one image is then spanned across a set of images for purpose of tagging images in the set with the recognized text, under an embodiment of the invention.
[0018] FIG. 12 illustrates a system on which one or more embodiments of the invention may be performed or otherwise provided.
[0019] FIG. 13 illustrates a person analysis component for use in embodiments such as described with FIG. 12, in greater detail, under an embodiment of the invention.
[0020] FIG. 14A is a graphical representation of the Markov random field, which captures appearance and co-appearance statistics of different people, under an embodiment of the invention.
[0021] FIG. 14B is another graphical representation of the Markov random field, incorporating clothing recognition, under an embodiment of the invention.
[0022] FIG. 15 illustrates a system for text recognition of text carried in images, under an embodiment of the invention.
[0023] FIG. 16 illustrates a system in which searching for images based on their contents can be performed, under an embodiment of the invention.
[0024] FIG. 17 describes a method for creating objectified image renderings, under an embodiment of the invention.
[0025] FIG. 18 is a representation of an objectified image file as rendered, under an embodiment of the invention.
[0026] FIG. 19 is a representation of an objectified image file as rendered, under another embodiment of the invention.
[0027] FIG. 20 provides an example of an objectified image rendering, where metadata is displayed in correspondence with recognized objects in the image, under an embodiment of the invention.
[0028] FIG. 21 illustrates a basic system for enabling similarity matching of people, under an embodiment of the invention.
[0029] FIG. 22 illustrates an embodiment in which an image is selected for a text content.
`
`DETAILED DESCRIPTION
`
[0030] Embodiments described herein provide for various techniques that enable the programmatic use of digitally captured images using, among other advancements, image recognition. Embodiments described herein mine image files for data and information that enables, among other features, the indexing of the contents of images based on analysis of the images. Additionally, images may be made searchable based on recognition information of objects contained in the images. Other embodiments provide for rendering of image files in a manner that makes recognition information about objects in those images usable. Numerous other applications and embodiments are provided.
[0031] Various applications and implementations are contemplated for one or more embodiments of the invention. In the context of consumer photographs, for example, embodiments of the invention enable users to (i) categorize, sort, and label their images quickly and efficiently through recognition of the contents of the images, (ii) index images using recognition, and (iii) search and retrieve images through text or image input. For these purposes, recognition may be performed on persons, on text carried on objects, or on other objects that are identifiable for images. Techniques are also described in which images may be rendered in a form where individual objects previously recognized are made selectable or otherwise interactable to the user. Network services are also described that enable online management and use of consumer photographs. Additionally, embodiments contemplate amusement applications where image recognition may be used to match people who are look-alikes. Social network and image-based ad insertion applications are also contemplated and described with embodiments of the invention.
[0032] An embodiment provides for enabling retrieval of a collection of captured images that form at least a portion of a library of images. For each image in the collection, a captured image may be analyzed to recognize information from image data contained in the captured image. An index may be generated based on the recognized information. Using the index, functionality such as search and retrieval is enabled. Various recognition techniques, including those that use the face, clothing, apparel, and combinations of characteristics may be utilized. Recognition may be performed on, among other things, persons and text carried on objects.
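By way of a non-limiting illustration, the analyze-then-index loop described above can be sketched in a few lines of Python. The recognize_objects helper is a hypothetical stand-in for whatever recognition components (face, clothing, text) an implementation supplies; it is not part of the disclosure.

```python
from collections import defaultdict

def recognize_objects(image_path):
    """Hypothetical hook for the recognition techniques described
    herein (faces, clothing, text on objects). Returns a list of
    recognized terms for one captured image."""
    raise NotImplementedError

def build_index(image_paths):
    # For each image in the collection, analyze it and index the
    # recognized information, enabling search and retrieval.
    index = defaultdict(set)
    for path in image_paths:
        for term in recognize_objects(path):
            index[term].add(path)
    return index

def search(index, term):
    # Retrieval reduces to a lookup against the recognition index.
    return sorted(index.get(term, ()))
```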
[0033] Among the various applications contemplated, embodiments enable the search and retrieval of images based on recognition of objects appearing in the images being searched. Furthermore, one or more embodiments contemplate inputs that correspond to text or image input for purpose of identifying a search criteria. For example, an input may correspond to an image specified by a user, and that image is used to generate the search criteria from which other images are found.
`
[0034] For persons, embodiments provide for detection and recognition of faces. Additionally, one or more embodiments described enable recognition of persons to be based at least in part on clothing or apparel worn by those persons. Under one embodiment, a person may be detected from a captured image. Once the detection occurs, recognition information may be generated from the clothing or apparel of the person. In one embodiment, the person is detected first, using one or more markers indicating people (e.g. skin and/or facial features), and then the position of the clothing is identified from the location of the person's face. The recognition information of the clothing may correlate to the coloring present in a region predetermined in relative location to the detected face, taking into account the proportionality provided from the image.
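The following Python sketch illustrates one way to read this paragraph: given a detected face box, sample a window placed below the face in proportion to the face size and quantify its coloring with a coarse color histogram. The window geometry and the histogram choice are assumptions made for illustration, not the patent's specific procedure.

```python
import numpy as np

def clothing_signature(image, face_box, bins=4):
    """Quantify clothing coloring in a window positioned relative to a
    detected face. `image` is an HxWx3 RGB array; `face_box` is a
    (top, left, height, width) tuple from a prior face-detection step."""
    top, left, h, w = face_box
    # Hypothetical placement: a window starting one face-height below
    # the chin, sized in proportion to the face and widened to cover
    # the torso, so image proportionality is respected.
    c_top = min(image.shape[0], top + 2 * h)
    c_bottom = min(image.shape[0], top + 4 * h)
    c_left = max(0, left - w // 2)
    c_right = min(image.shape[1], left + w + w // 2)
    region = image[c_top:c_bottom, c_left:c_right]
    if region.size == 0:
        return np.zeros(bins ** 3)
    # Coarse RGB histogram as the "quantification" of clothing color.
    q = region.reshape(-1, 3).astype(int) // (256 // bins)
    flat = q[:, 0] * bins * bins + q[:, 1] * bins + q[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(float)
    return hist / hist.sum()
```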
[0035] According to another embodiment, information about captured images may be determined by identifying a cluster of images from a collection of captured images. The cluster may be based on a common characteristic of either the image or of the image file (such as metadata). In one embodiment, a recognition signature may be determined for a given person appearing in one of the cluster of images. The recognition signature may be used in identifying a recognition signature of one or more persons appearing in any one of the cluster of images.
[0036] In one embodiment, the persons in the other images are all the same person; thus recognition of one person leads to all persons (assuming only one person appears in the images in the cluster) in the cluster being identified as being the same person.
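A minimal sketch of this cluster-level propagation, assuming clustering and per-image recognition have already run (all names are hypothetical):

```python
def propagate_identity(clusters, recognized):
    """Assign an identity recognized in one image of a cluster to
    every image in that cluster. `clusters` maps cluster id -> list
    of image ids; `recognized` maps image id -> identity."""
    labels = {}
    for image_ids in clusters.values():
        # Any image in the cluster whose person was recognized...
        identity = next(
            (recognized[i] for i in image_ids if i in recognized), None)
        if identity is None:
            continue
        # ...labels every other image in the same cluster.
        for i in image_ids:
            labels[i] = identity
    return labels

# e.g. propagate_identity({"c1": ["a.jpg", "b.jpg"]}, {"a.jpg": "person-17"})
# -> {"a.jpg": "person-17", "b.jpg": "person-17"}
```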
[0037] According to another embodiment, a collection of images may be organized using recognition. In particular, an embodiment provides for detecting and recognizing text carried on objects. When such text is recognized, information related to the text may be used to categorize the image with other images. For example, the text may indicate a location because the name of a city, or of a business establishment for which the city is known, appears on a sign or other object in the image.
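A minimal sketch of this categorization step, assuming text recognition has already produced candidate tag strings per image (the tag values are invented examples):

```python
def categorize_by_text(images_text):
    """Group images that share recognized sign text, such as a city
    or establishment name appearing on an object in the image."""
    groups = {}
    for image_id, texts in images_text.items():
        for tag in texts:
            groups.setdefault(tag, []).append(image_id)
    return groups

# e.g. categorize_by_text({"img1.jpg": ["ASHBURY"], "img2.jpg": ["ASHBURY"]})
# -> {"ASHBURY": ["img1.jpg", "img2.jpg"]}
```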
[0038] According to another embodiment, recognition is performed on captured images for purpose of identifying people appearing in the images. In one embodiment, image data from the captured image is analyzed to detect a face of a person in the image. The image data is then normalized for one or more of the following: lighting, orientation, and size or relative size of the image.
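The sketch below illustrates two of the listed normalizations (size and lighting) on a cropped grayscale face region; it assumes the face has already been detected and cropped, and omits orientation correction (e.g. rotating so the eyes are level). It is one plausible reading of the normalization step, not the patent's specific procedure.

```python
import numpy as np

def normalize_face(face, size=(64, 64)):
    """Normalize a cropped face region (HxW grayscale array) for size
    and lighting so recognition signatures are comparable."""
    face = face.astype(float)
    # Size normalization: nearest-neighbor resample to a canonical size.
    rows = (np.arange(size[0]) * face.shape[0] / size[0]).astype(int)
    cols = (np.arange(size[1]) * face.shape[1] / size[1]).astype(int)
    resized = face[rows][:, cols]
    # Lighting normalization: zero mean, unit variance.
    return (resized - resized.mean()) / (resized.std() + 1e-8)
```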
[0039] In another embodiment, recognition may also be performed using more than one marker or physical characteristic of a person. In one embodiment, a combination of two or more markers is used. Specifically, embodiments contemplate generating a recognition signature based on recognition information from two or more of the following characteristics: facial features (e.g. eye or eye region including eye brow, nose, mouth, lips and ears), clothing and/or apparel, hair (including color, length and style) and gender.
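One plausible way to combine such markers is a weighted concatenation of per-characteristic feature vectors, sketched below; the marker names and weights are assumptions for illustration, not values taken from the disclosure.

```python
import numpy as np

def combined_signature(markers, weights=None):
    """Fuse per-marker recognition vectors (e.g. face, clothing, hair)
    into one recognition signature by weighted concatenation."""
    weights = weights or {"face": 1.0, "clothing": 0.5, "hair": 0.25}
    parts = []
    for name, vector in sorted(markers.items()):
        v = np.asarray(vector, dtype=float)
        norm = np.linalg.norm(v)
        if norm > 0:
            v = v / norm  # normalize so no single marker dominates
        parts.append(weights.get(name, 1.0) * v)
    return np.concatenate(parts)
```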
[0040] According to another embodiment, metadata about the image file, such as the time the image was captured, or the location from which the image was captured, may be used in combination with recognition information from one or more of the features listed above.
`
`In another embodiment, content analysis and data
`[0041]
`inference is used to determine a recognition signature for a
`person. For example, relationships between people in images
`maybe utilized to use probabilities to enhance recognition
`performance.
[0042] In another embodiment, images are displayed to a user in a manner where recognized objects from that image are made user-interactive. In one embodiment, stored data that corresponds to an image is supplemented with metadata that identifies one or more objects in the captured image that have been previously recognized. The captured image is then rendered, or made renderable, using the stored data and the metadata so that each of the recognized objects is made selectable. When selected, a programmatic action may be performed, such as the display of the supplemental information, or a search for other images containing the selected object.
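A hypothetical data layout for such supplemental metadata, together with the hit-test that makes recognized regions selectable, might look as follows; all field names and values are illustrative only.

```python
# Each previously recognized object carries the image region that
# should be selectable and the data a click handler would use.
objectified_image = {
    "file": "photo_0147.jpg",
    "objects": [
        {"type": "face", "region": (120, 88, 64, 64),
         "identity": "Person A", "action": "search_similar"},
        {"type": "text", "region": (10, 300, 40, 180),
         "value": "ASHBURY", "action": "show_info"},
    ],
}

def object_at(meta, x, y):
    """Map a click at (x, y) to the recognized object whose
    (top, left, height, width) region contains it, if any."""
    for obj in meta["objects"]:
        top, left, h, w = obj["region"]
        if top <= y < top + h and left <= x < left + w:
            return obj
    return None
```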
[0043] According to another embodiment, an image viewing system is provided comprising a memory that stores an image file and metadata that identifies one or more objects in the image file. The one or more objects have recognition information associated with them. A user-interface or viewer may be provided that is configured to use the metadata to display an indication or information about the one or more objects.
`Aug. 6, 2009
`
[0044] As used herein, the term "image data" is intended to mean data that corresponds to or is based on discrete portions of a captured image. For example, with digital images, such as those provided in a JPEG format, the image data may correspond to data or information about pixels that form the image, or data or information determined from pixels of the image.

[0045] The terms "recognize", or "recognition", or variants thereof, in the context of an image or image data (e.g. "recognize an image"), mean that a determination is made as to what the image correlates to, represents, identifies, means, and/or a context provided by the image. Recognition does not mean a determination of identity by name, unless stated so expressly, as name identification may require an additional step of correlation.

[0046] As used herein, the terms "programmatic", "programmatically" or variations thereof mean through execution of code, programming or other logic. A programmatic action may be performed with software, firmware or hardware, and generally without user-intervention, albeit not necessarily automatically, as the action may be manually triggered.

[0047] One or more embodiments described herein may be implemented using programmatic elements, often referred to as modules or components, although other names may be used. Such programmatic elements may include a program, a subroutine, a portion of a program, or a software component or a hardware component capable of performing one or more stated tasks or functions. As used herein, a module or component can exist on a hardware component independently of other modules/components, or a module/component can be a shared element or process of other modules/components, programs or machines. A module or component may reside on one machine, such as on a client or on a server, or a module/component may be distributed amongst multiple machines, such as on multiple clients or server machines. Any system described may be implemented in whole or in part on a server, or as part of a network service. Alternatively, a system such as described herein may be implemented on a local computer or terminal, in whole or in part. In either case, implementation of a system provided for in this application may require use of memory, processors and network resources (including data ports and signal lines (optical, electrical, etc.)), unless stated otherwise.

[0048] Embodiments described herein generally require the use of computers, including processing and memory resources. For example, systems described herein may be implemented on a server or network service. Such servers may be connected to and used by users over networks such as the Internet, or by a combination of networks, such as cellular networks and the Internet. Alternatively, one or more embodiments described herein may be implemented locally, in whole or in part, on computing machines such as desktops, cellular phones, personal digital assistants or laptop computers. Thus, memory, processing and network resources may all be used in connection with the establishment, use or performance of any embodiment described herein (including with the performance of any method or with the implementation of any system).

[0049] Furthermore, one or more embodiments described herein may be implemented through the use of instructions that are executable by one or more processors. These instructions may be carried on a computer-readable medium. Machines shown in figures below provide examples of processing resources and computer-readable mediums on which instructions for implementing embodiments of the invention can be carried and/or executed. In particular, the numerous machines shown with embodiments of the invention include processor(s) and various forms of memory for holding data and instructions. Examples of computer-readable mediums include permanent memory storage devices, such as hard drives on personal computers or servers. Other examples of computer storage mediums include portable storage units, such as CD or DVD units, flash memory (such as carried on many cell phones and personal digital assistants (PDAs)), and magnetic memory. Computers, terminals, network enabled devices (e.g. mobile devices such as cell phones) are all examples of machines and devices that utilize processors, memory, and instructions stored on computer-readable mediums.

[0050] Overview

[0051] FIG. 1 illustrates a sequence of processes which may be performed independently or otherwise, in order to enable various kinds of usages of images, according to an embodiment. A sequence such as illustrated by FIG. 1 is intended to illustrate just one implementation for enabling the use of captured images. As described below, each of the processes in the sequence of FIG. 1 may be performed independently, and with or without other processes described. Furthermore, other processes or functionality described elsewhere in this application may be implemented in addition to any of the processes illustrated by FIG. 1. While FIG. 1 illustrates an embodiment that utilizes a sequence of processes, each of the processes and sub-processes that comprise the described sequence may in and of itself form an embodiment of the invention.

[0052] In FIG. 1, image data 10 is retrieved from a source. The image data 10 may correspond to a captured image, or a portion or segment thereof. A system may be implemented in which one or more types of objects may be detected and recognized from the captured image. One or more object detection processes 20 may perform detection processes for different types of objects identified from the image data. In an embodiment, the object detected is a person, or a portion of a person, such as a face, a body, a hair or other characteristic. Numerous other types of objects may be detected by the one or more object detection processes, including (i) objects carrying text or other alphanumeric characters, and (ii) objects associated with people for purpose of identifying an individual. An example of the latter type of object includes apparel, such as a purse, a briefcase, or a hat. Other types of objects that can be detected from object detection processes include animals (such as dogs or cats), and landmarks.

[0053] Detected objects 22 are then analyzed and possibly recognized by one or more object recognition processes 30. Different recognition results may be generated for different types of objects. For persons, the recognition processes 30 may identify or indicate (such as by guess) one or more of the following for a given person: identity, ethnic classification, hair color or shape, gender, or type (e.g. size of the person). For objects carrying text, the recognition information may correspond to alphanumeric characters. These characters may be identified as guesses or candidates of the actual text carried on the detected object. For other types of objects, the recognition information may indicate or identify any one or more of the following: what the detected object is, a class of the detected object, a distinguishing characteristic of the detected object, or an identity of the detected object.

[0054] As the above examples illustrate, recognition information may recognize to different levels of granularity. In the case where the detected object is a person, the recognition information may correspond to a recognition signature that serves as a relatively unique identifier of that person. For example, a recognition signature may be used to identify an individual from any other individual in a collection of photographs depicting hundreds, thousands, or even millions of individuals (depending on the quality and/or confidence of the recognition). Alternatively, recognition information may only be able to identify a person as belonging to a set of persons that are identifiable from other persons in the same pool of people. For example, the recognition information may identify people by ethnic class or gender, or identify a person as being one of a limited number of matching possibilities.
`[0055]
`In an embodiment, recognition information is a
`quantitative expression. According to one implementation,
`for example, a recognition signature may correspond to a
`highly dimensional vector or other dimensional numerical
`value.
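Treating the recognition signature as a vector suggests a straightforward matching rule: compare signatures by a similarity measure and accept the best match above a confidence threshold. The following is a minimal sketch under that assumption; the cosine-similarity choice and threshold value are illustrative, not specified by the disclosure.

```python
import numpy as np

def best_match(signature, catalog, threshold=0.8):
    """Match a query recognition signature against known signatures
    (a dict of name -> vector) by cosine similarity."""
    query = np.asarray(signature, dtype=float)
    query = query / (np.linalg.norm(query) + 1e-8)
    best_name, best_score = None, threshold
    for name, vec in catalog.items():
        v = np.asarray(vec, dtype=float)
        score = float(query @ (v / (np.linalg.norm(v) + 1e-8)))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```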
`
[0056] Once the recognition information 32 is generated, a correlation process 40 can be used to correlate the detected and recognized object of the image with data and information items, and/or other information resources. Various types of functionality may be enabled with the correlation process 40, including, for example, search, categorization, and text object research. In one embodiment, the recognized object is a person, or a portion of a person. In such an embodiment, the correlation process 40 generates correlation information 42 that is an identity, or more generally identification information, for the person. In another embodiment, the recognized object carries text, and the correlation information 42 assigns meaning or context to the text.
[0057] As an alternative or addition to the correlation information described above, in another embodiment, correlation process 40 may, for a recognized face, generate correlation information 42 that correlates the recognition information 32 with other images that have been determined to carry the same recognized face. Thus, one recognition signature may be correlated to a collection of digital photographs carrying the same person. Examples of the types of information items and resources that recognized objects can be correlated to include some or all of the following: other images with the same recognition information or signature, clothing recognition information, text based content associated with a recognized object, audio or video content associated with the recognized object, other images that contain objects with similar but not the same detected object, or third-party Internet search engines that can retrieve information in response to specified criteria.
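The signature-to-collection correlation described here behaves like an inverted index from recognition signatures to image identifiers. A minimal sketch, assuming signatures have already been quantized to stable identifiers (all names hypothetical):

```python
from collections import defaultdict

class SignatureCorrelator:
    """Correlate one recognition signature with every image determined
    to carry the same recognized face, so a single signature retrieves
    a whole collection of photographs of the same person."""

    def __init__(self):
        self._images_by_signature = defaultdict(set)

    def correlate(self, signature_id, image_id):
        self._images_by_signature[signature_id].add(image_id)

    def images_for(self, signature_id):
        return sorted(self._images_by_signature[signature_id])
```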
`
[0058] With regard to text carrying objects, the correlation process 40 may correlate recognition information 32, in the form of a string of alphanumeric characters, to a meaning or context, such as to a proper name, classification, brand-name, or dictionary meaning. As an addition or alternative, the correlation process 40 may generate correlation information 42 that indirectly correlates recognition information 32 to a recognized word. For example, the recognition information 32 may correlate the popular name of a hotel with a city where the hotel is located.
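A toy illustration of such text correlation is a lookup table from recognized strings to a meaning or context, including the indirect hotel-to-city case; the entries below are invented examples, not data from the patent.

```python
# Hypothetical correlation table: recognized string -> meaning/context.
TEXT_CORRELATIONS = {
    "GOLDEN GATE": {"kind": "landmark", "context": "San Francisco"},
    "HOTEL DEL CORONADO": {"kind": "hotel", "context": "San Diego"},
}

def correlate_text(recognized):
    # Normalize the OCR string, then look up its meaning or context.
    return TEXT_CORRELATIONS.get(recognized.strip().upper())
```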
`
[0059] According to an embodiment, correlation information 42 resulting from the correlation process 40 may be