US007978936B1

Casillas et al.

(10) Patent No.:     US 7,978,936 B1
(45) Date of Patent: Jul. 12, 2011

(54) INDICATING A CORRESPONDENCE BETWEEN AN IMAGE AND AN OBJECT

(75) Inventors: Amy Casillas, San Jose, CA (US); Lubomir Bourdev, San Jose, CA (US)

(73) Assignee: Adobe Systems Incorporated, San Jose, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1077 days.

(21) Appl. No.: 11/342,225

(22) Filed: Jan. 26, 2006

(51) Int. Cl.
     G06K 9/54 (2006.01)
     G06K 9/00 (2006.01)

(52) U.S. Cl. ........ 382/305; 382/118; 707/914; 707/915; 715/838

(58) Field of Classification Search .......... 382/118, 382/305; 707/104.1; 345/642; 715/764, 803, 821, 823, 835, 838, 839
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

    4,651,146 A      3/1987  Lucash et al.
    5,943,093 A      8/1999  Anderson et al.
    5,963,203 A     10/1999  Goldberg et al.
    6,182,069 B1     1/2001  Niblack et al.
    6,324,555 B1    11/2001  Sites
    6,366,296 B1*    4/2002  Boreczky et al. ........ 715/719
    6,408,301 B1*    6/2002  Patton et al. .......... 707/741
    6,714,672 B1     3/2004  Berestov et al.
    6,721,733 B2*    4/2004  Lipson et al. .......... 707/3
    6,728,728 B2     4/2004  Spiegler et al.
    6,879,709 B2     4/2005  Tian et al.
    6,940,545 B1     9/2005  Ray et al.
    7,171,630 B2     1/2007  O'Leary et al.
    7,203,367 B2*    4/2007  Shniberg et al. ........ 382/224
    7,274,832 B2     9/2007  Nicponski
    7,343,365 B2     3/2008  Farnham et al.
    7,403,642 B2*    7/2008  Zhang et al. ........... 382/118
    7,477,805 B2*    1/2009  Ohtsuka et al. ......... 382/305
    7,519,200 B2     4/2009  Gokturk et al.
    7,587,101 B1     9/2009  Bourdev
 2001/0053292 A1    12/2001  Nakamura
 2002/0054059 A1     5/2002  Schneiderman .......... 345/700
 2003/0023754 A1     1/2003  Eichstadt et al.
 2003/0033296 A1     2/2003  Rothmuller et al.
 2003/0158855 A1     8/2003  Farnham et al. ........ 707/102

(Continued)

OTHER PUBLICATIONS

Pentland et al. (Jun. 1996) "Photobook: Content-based manipulation of image databases." Int'l J. Computer Vision, vol. 18 No. 3, pp. 233-254.

(Continued)

Primary Examiner — Bhavesh M Mehta
Assistant Examiner — Barry Drennan
(74) Attorney, Agent, or Firm — Van Pelt, Yi & James LLP

(57) ABSTRACT

Indicating an object is disclosed. Indicating an object includes receiving an indication associated with selecting an image and providing a second indication that a set of one or more objects corresponds to the image, wherein the objects have been detected from the image. Indicating an image is disclosed. Indicating an image includes receiving an indication associated with selecting an object, wherein the object has been detected from an image, and displaying the image such that a correspondence between the selected object and the image is conveyed.

33 Claims, 10 Drawing Sheets
`
[Representative drawing: interface 500 with options "Show: All Faces / Bob / Janet / Other" and a selected object 304]
`
`
US 7,978,936 B1
Page 2

U.S. PATENT DOCUMENTS

 2003/0210808 A1    11/2003  Chen et al.
 2004/0008906 A1     1/2004  Webb
 2004/0017930 A1     1/2004  Kim et al.
 2004/0060976 A1     4/2004  Blazey et al.
 2004/0064455 A1     4/2004  Rosenzweig et al.
 2004/0101212 A1     5/2004  Fedorovskaya et al.
 2004/0204635 A1    10/2004  Scharf et al.
 2004/0267612 A1    12/2004  Veach
 2005/0011959 A1     1/2005  Grosvenor
 2005/0013488 A1     1/2005  Hashimoto et al.
 2005/0025376 A1     2/2005  Ishida
 2005/0041114 A1     2/2005  Kagaya
 2005/0046730 A1     3/2005  Li .................... 348/333.12
 2005/0050027 A1     3/2005  Yeh et al.
 2005/0063568 A1     3/2005  Sun et al.
 2005/0105779 A1     5/2005  Kamei
 2005/0105806 A1*    5/2005  Nagaoka et al. ........ 382/224
 2005/0117870 A1     6/2005  Lee
 2005/0128221 A1     6/2005  Aratani et al.
 2005/0129276 A1     6/2005  Haynes et al.
 2005/0147302 A1     7/2005  Leung
 2005/0157952 A1     7/2005  Gohda et al. .......... 382/305
 2005/0172215 A1     8/2005  Squibbs et al.
 2005/0196069 A1     9/2005  Yonaha
 2005/0207630 A1     9/2005  Chan et al.
 2005/0213793 A1     9/2005  Oya et al.
 2005/0285943 A1    12/2005  Cutler
 2006/0008145 A1     1/2006  Kaku
 2006/0008152 A1     1/2006  Kumar et al. .......... 382/190
 2006/0032916 A1     2/2006  Mueller et al.
 2006/0050934 A1*    3/2006  Asai .................. 382/118
 2006/0053364 A1     3/2006  Hollander et al.
 2006/0071942 A1     4/2006  Ubillos et al.
 2006/0098737 A1     5/2006  Sethuraman et al.
 2006/0120572 A1     6/2006  Li et al.
 2006/0140455 A1     6/2006  Costache et al.
 2006/0161588 A1     7/2006  Nomoto ................ 707/104.1
 2006/0171573 A1     8/2006  Rogers
 2006/0222243 A1    10/2006  Newell et al.
 2006/0239515 A1*   10/2006  Zhang et al. .......... 382/118
 2006/0251338 A1*   11/2006  Gokturk et al. ........ 382/305
 2006/0251339 A1    11/2006  Gokturk et al. ........ 382/305
 2007/0071323 A1     3/2007  Kontsevich et al.
 2007/0081744 A1     4/2007  Gokturk et al.
 2007/0098303 A1*    5/2007  Gallagher et al. ...... 382/305
 2007/0183638 A1     8/2007  Nakamura
 2007/0242856 A1    10/2007  Suzuki et al.
 2008/0080745 A1     4/2008  Vanhoucke et al.
 2009/0016576 A1     1/2009  Goh et al. ............ 382/118
 2009/0160618 A1     6/2009  Kumagai et al.

OTHER PUBLICATIONS

Ma et al. (Sep. 2000) "An indexing and browsing system for home video." Proc. 2000 European Signal Processing Conf., pp. 131-134.*
Cox et al. (Jan. 2000) "The Bayesian image retrieval system, PicHunter: theory, implementation, and psychophysical experiments." IEEE Trans. on Image Processing, vol. 9 No. 1, pp. 20-37.*
Nakazato et al. (Aug. 2003) "ImageGrouper: a group-oriented user interface for content-based image retrieval and digital image arrangement." J. Visual Languages and Computing, vol. 14 No. 4, pp. 363-386.
Girgensohn et al. (Oct. 2004) "Leveraging face recognition technology to find and organize photos." Proc. 6th ACM SIGMM Int'l Workshop on Multimedia Information Retrieval, pp. 99-106.*
"Notes and Tags", p. 3 of 6, http://www.flickr.com/learn_more_3.gne.
Riya—Photo Search, http://www.riya.com.
Riya—Photo Search, http://www.riya.com/corp/learn-more.jsp.
Riya—Photo Search, http://www.riya.com/corp/learn-more-s2.jsp.
Riya—Photo Search, http://www.riya.com/corp/learn-more-s3.jsp.
Riya—Photo Search, http://www.riya.com/corp/learn-more-s5.jsp.
Riya—Photo Search, http://www.riya.com/corp/learn-more-s6.jsp.
Gormish, Michael J. "JPEG 2000: Worth the Wait?" Ricoh Silicon Valley, Inc., pp. 1-4.
Yang et al. "Detecting Faces in Images: A Survey." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, No. 1, Jan. 2002, pp. 34-58.
Sun et al. "Quantized Wavelet Features and Support Vector Machines for On-Road Vehicle Detection." Dept. of Computer Science, U. of Nevada, Reno & e-Technology Dept., Ford Motor Company, Dearborn, MI, pp. 1-6.
Schneiderman, Henry. "A Statistical Approach to 3D Object Detection." Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, 2000.
Yang, Ming-Hsuan. "Recent Advances in Face Detection." Honda Research Institute, Mountain View, CA.
U.S. Appl. No. 11/097,951, Newell et al.
Jennifer Granick, "Face It: Privacy Is Endangered", Dec. 7, 2005.
Michael Arrington, "First Screen Shots of Riya", Oct. 26, 2005.
Adobe® Photoshop® Album 2.0 User Guide for Windows®, Adobe Systems Incorporated, 2003.
Adobe® Photoshop® Elements 3.0 Getting Started Guide for Windows®, Adobe Systems Incorporated, 2004.
Adobe® Photoshop® Elements 3.0 Feature Highlights, Adobe Systems Incorporated, 2004.

* cited by examiner
`
`
`
`
[Drawing sheets 1-10 (Jul. 12, 2011, US 7,978,936 B1); image content omitted, recoverable labels below]

Sheet 1, FIG. 1A: an image including objects resulting from a detection process.
Sheet 2, FIG. 1B: block diagram of a system for detecting and processing objects, with blocks labeled "Object Identifier" and "Display".
Sheet 3, FIG. 2: interface with options "Show: Untagged / All Faces / Bob / Janet / Other"; reference numerals 204, 206, 208.
Sheet 4, FIG. 3: interface with options "Show: All Faces / Bob / Janet".
Sheet 5, FIG. 4: flow chart: "Receive Indication that Object Has Been Selected" (402), then "Display Image Such that a Correspondence Between the Selected Object and the Image Is Conveyed" (404).
Sheet 6, FIG. 5: interface with options "Show: All Faces / Bob / Janet".
Sheet 7, FIG. 6A: interface with panels labeled "Images" and "Objects in Selected ..."; reference numeral 602.
Sheet 8, FIG. 6B: interface; reference numeral 630.
Sheet 9, FIG. 7: interface; reference numeral 902.
Sheet 10, FIG. 8: flow chart: "Receive Indication that Image Has Been Selected" (802), then "Provide Indication that a Set of Objects Corresponds to the Image" (804).
`
INDICATING A CORRESPONDENCE BETWEEN AN IMAGE AND AN OBJECT
`
`BACKGROUND OF THE INVENTION
`
Automatic detection techniques can be used to detect objects in an image. For example, a face detection process can detect faces of people in an image. With digital cameras becoming increasingly popular, more and more digital images are being created for personal and commercial use. Face detection technology can be applied to these digital images to detect faces. However, existing methods for handling faces once they have been detected are limited. Improved techniques for managing faces or other objects resulting from a detection process would be useful.
`
`10
`
`15
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.

FIG. 1A is an embodiment of an image including objects resulting from a detection process.

FIG. 1B is a block diagram illustrating an embodiment of a system for detecting and processing objects.

FIG. 2 illustrates an embodiment of an interface for viewing objects.

FIG. 3 illustrates an embodiment of an interface for viewing objects when an object is selected.

FIG. 4 is a flow chart illustrating an embodiment of a process for indicating an image.

FIG. 5 illustrates an embodiment of an interface for viewing objects when an image is selected.

FIG. 6A illustrates an embodiment of an interface for viewing images when an image is selected.

FIG. 6B illustrates an embodiment of an interface for viewing images when more than one image is selected.

FIG. 7 illustrates an embodiment of an interface for viewing objects when a portion of an image is selected.

FIG. 8 is a flow chart illustrating an embodiment of a process for indicating a set of objects.
`
`25
`
`30
`
`35
`
`40
`
DETAILED DESCRIPTION

The invention can be implemented in numerous ways, including as a process, an apparatus, a system, a composition of matter, a computer readable medium such as a computer readable storage medium, or a computer network wherein program instructions are sent over optical or electronic communication links. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. A component such as a processor or a memory described as being configured to perform a task includes both a general component that is temporarily configured to perform the task at a given time and a specific component that is manufactured to perform the task. In some embodiments, the technique is performed by a processor and a memory coupled with the processor, wherein the memory is configured to provide the processor with instructions which, when executed, cause the processor to perform one or more steps. In some embodiments, the technique is performed by a computer program product embodied in a computer readable storage medium and comprising computer instructions for performing one or more steps. In general, the order of the steps of disclosed processes may be altered within the scope of the invention.

A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims, and the invention encompasses numerous alternatives, modifications, and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example, and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.

FIG. 1A is an embodiment of an image including objects resulting from a detection process. In the example shown, image 100 may be a file in a variety of formats, including Joint Photographic Experts Group (JPEG), Graphics Interchange Format (GIF), Tagged Image File Format (TIFF), and Portable Network Graphics (PNG). In some embodiments, image 100 is generated using a digital camera. Although images may be described in the examples herein, any data, including audio, video, streaming video, or graphical data, may be used in various embodiments. For example, image 100 may be a frame of video.

Automatic detection processing is performed on image 100. Automatic detection processing detects occurrences of a detection object in an image. Automatic detection processing may be performed using various techniques in various embodiments. For example, Eigenfaces, Adaboost, or neural networks may be used. A two dimensional pattern matching technique may be used. A three dimensional model of the object may be used to approximate the object. Detection may be performed based on the model. Adobe® Photoshop® Elements may be used to perform automatic face detection on photographs.
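
The detection techniques above are named without an implementation being fixed. As a rough illustration only, and not the disclosed method, the following sketch runs an Adaboost-trained Haar cascade shipped with OpenCV and returns rectangular objects; the cascade file name and tuning values are assumptions:

    import cv2  # OpenCV, used purely as an illustrative stand-in detector

    def detect_faces(image_path):
        # Load the image and convert to grayscale for the cascade detector.
        image = cv2.imread(image_path)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        # Each detection is a rectangle (x, y, w, h). Note that OpenCV
        # measures from the top-left corner, whereas the example in
        # FIG. 1A measures from the lower-left corner of the image.
        return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
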
Objects are output by the automatic detection process and are believed by the automatic detection process to include an occurrence of the detection object. Automatic detection processes may not necessarily attempt to detect a particular detection object (for example, the face of a particular person). Rather, the process may attempt to detect any occurrence of a detection object in an image (for example, any face). In some embodiments, including this example, each object includes one and only one occurrence of a detection object. Examples of detection objects include a face, person, animal, car, boat, book, table, tree, mountain, etc.

An object resulting from a detection process may be referred to as a "detected object" or an "object that has been detected from an image." A detected object may include (an occurrence of) a detection object. As used herein, "face" may refer to either an object that includes a face or a face as a detection object (i.e., a face that is shown in an object).

Objects may be associated with a subimage (i.e., a portion of an image) and may be described in a variety of ways. In this example, objects are approximated with a rectangle. In some embodiments, objects output by an automatic detection process have a different shape, such as a round shape. Object 102 may be described by coordinates (x, y). Coordinates (x, y) may describe the location of the lower left corner of object 102 with respect to the origin (i.e., the lower left corner of image 100). Any appropriate unit may be used for coordinates (x, y). Object 102 in this example is also described by a height, H, and a width, W. In some embodiments, objects output by an automatic detection process have a fixed aspect ratio (i.e., a fixed width to height ratio). For example, although the sizes of objects 102 and 104 are different, the aspect ratios of the two objects may be the same.

Additional information associated with each object may be output by an automatic detection process. In some embodiments, a probability that a given object includes the detection object is output. For example, object 106 may be associated with a probability that object 106 includes a face. In some embodiments, one or more angles are output by an automatic detection process. For example, one angle may describe the rotation of the detection object in the image plane (face tilted side-to-side), a second angle in the 3D space along the vertical axis (frontal vs. profile face, or a rotation), and a third angle in the 3D space along the horizontal axis (face looking up or down, or a tilt up or down).

Automatic detection processes can be imperfect. Sometimes, an automatic detection process may not be able to detect an occurrence of a detection object. For example, some face detection processes may not be able to detect the face of a person if the face is too small in an image. An automatic detection process can also generate "false alarms." A face detection process may output an object that does not include a face.

In some embodiments, additional processes may be applied to image 100 or an associated object after automatic detection is performed. For example, a face identification process may be performed where objects are evaluated to determine whether they contain the face of a particular person. Objects may be identified in various ways in various embodiments. For example, a technique based on Adaboost, Linear Discriminant Analysis (LDA), or principal component analysis (PCA) may be used to perform object identification. In some embodiments, a face that is identified is automatically tagged. Face identification may be imperfect. For example, a face may be misidentified or mistagged. In some embodiments, a probability that the face is identified correctly is provided. In some embodiments, a face matching process is performed, where multiple objects from multiple images are compared and similar faces are matched together. In some embodiments, a process generates a new object or modifies an existing object. For example, the aspect ratio of an object may be adjusted.

Object detection may be automatic or manual. A user may examine an image, detect an occurrence of a detection object, and specify the portion of the image associated with the new object. For example, a user may have drawn a box around any of faces 102, 104, and 106 to detect a face. The output of a manual detection process may include the same information as the output of an automatic detection process. The probability that a manually detected object includes the detection object may be set to 1.

Table 1 lists examples of information that may be stored for various objects. This information may be output by an object detector. In this example, objects 1-5 were automatically detected and object 6 was manually detected. Such information may be stored in one or more of a database, file metadata, file, or in any other appropriate way.

TABLE 1

Object  Source   Coordinates                         P(Object =          Date Object    Manually or     Identity
ID      File ID  of Origin    Width  Height  Angle   Detection Object)   Detected       Automatically   Confirmed?
                                                                                        Detected
1       1        x0, y0       5      8       0       0.8                 Jan. 1, 2005   Automatically   yes
2       1        x1, y1       5      7       5       0.7                 Jan. 1, 2005   Automatically   yes
3       1        x2, y2       1      1       0       0.5                 Jan. 1, 2005   Automatically   no
4       2        x3, y3       2      2       0       0.6                 Nov. 2, 2005   Automatically   yes
5       2        x4, y4       3      4       20      0.7                 Nov. 3, 2005   Automatically   yes
6       2        x5, y5       1      1       0       1                   Nov. 22, 2005  User
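
The per-object information of Table 1 could be held in a record such as the following sketch, whose field names merely mirror the table's columns and are not identifiers from the disclosure:

    from dataclasses import dataclass
    from typing import Optional, Tuple

    @dataclass
    class DetectedObject:
        # Field names mirror the columns of Table 1 (illustrative only).
        object_id: int
        source_file_id: int
        origin: Tuple[str, str]      # coordinates of origin, e.g. ("x5", "y5")
        width: int
        height: int
        angle: int
        p_detection: float           # P(object includes the detection object)
        date_detected: str
        detected_by: str             # "Automatically" or "User"
        identity_confirmed: Optional[bool] = None

    # Row 6 of Table 1: manually detected, so the probability is set to 1.
    obj6 = DetectedObject(6, 2, ("x5", "y5"), 1, 1, 0, 1.0, "Nov. 22, 2005", "User")
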
`
`25
`
`FIG.1B is a block diagram illustrating an embodiment of a
`system for detecting and processing objects. In this example,
`system 150 includes object detector 152, object identifier
`154, and object manager 156. Data 158 is input to object
`detector 152. Data 158 may include an image, video, audio
`clip, and/or other data. Object detector 152 performs an
`object detection process to detect occurrences of detection
`objects in data158. Object detector 152 may detect any occur
`rence of a detection object (e.g., any face). Object detector
`152 provides detected objects 162 as output.
`Objects 162 are provided as input to object identifier 154,
`which identifies detection objects. For example, object detec
`tor 152 may detect any face, and object identifier 154 may
`identify the face as belonging to a specific person. Object
`identifier may output one or more names associated with one
`or more of objects 162. In some embodiments, object identi
`fier 154 assigns a tag (such as the tag “Bob”) to an object.
`Objects 162 and the output of object identifier 154 are pro
`vided as input to object manager 156. User input 164 may also
`be provided as input to object manager 156. In some embodi
`ments, system 150 does not include object identifier 154.
`Object manager 156 manages objects 162, including orga
`nizing, tagging, and displaying information associated with
`objects 162 on display 160. For example, object manager 156
`may manage the tagging of objects, including assigning, Stor
`ing, and obtaining tag information. Object manager 156 may
`manage the display of detected objects and other information.
`For example, object manager 156 may indicate a correspon
`dence between an image and an object, as more fully
`described below.
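
A minimal sketch of the FIG. 1B data flow follows; all class and method names are assumed for illustration, since the disclosure does not specify an API:

    class ObjectDetector:
        """Detects any occurrence of the detection object (152)."""
        def detect(self, data):
            # Stand-in result; a real detector would scan `data` (an image,
            # video, audio clip, and/or other data) for, e.g., any face.
            return [{"id": 1}, {"id": 2}]

    class ObjectIdentifier:
        """Identifies whose face (or which object) was detected (154)."""
        def identify(self, obj):
            return "Bob" if obj["id"] == 1 else None

    class ObjectManager:
        """Organizes, tags, and displays objects (156) on a display (160)."""
        def __init__(self, display=print):
            self.display = display
        def manage(self, objects, user_input=None):
            for obj in objects:
                self.display(obj)

    def process(data):
        detector, identifier, manager = ObjectDetector(), ObjectIdentifier(), ObjectManager()
        objects = detector.detect(data)              # objects 162
        for obj in objects:                          # optional step: system 150
            obj["tag"] = identifier.identify(obj)    # may omit the identifier
        manager.manage(objects)
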
FIG. 2 illustrates an embodiment of an interface for viewing objects. In the example shown, interface 200 displays objects resulting from face detection performed on images. Some of the objects in this example are tagged while other objects are untagged. Object 202, for example, has been assigned a tag of "Bob" while object 204 is untagged. Object 210, which may include someone other than Bob (e.g., Janet), may have been mistagged, perhaps by a user or a face identification process. Interface 200 may be used to tag faces or other objects. Interface 200 may include results from a search query.

Tagging refers to the process of assigning a tag to an object or image. A user or an automatic process may assign a tag. A tag includes tag data. Tag data may be user specified or machine specified. Examples of tag data include a name, place, event, date, etc. A tag may represent descriptive information associated with an object or image.
For example, a vacation photograph may be tagged with "Boston," "Mom," or "Fourth of July." Tag data may include any type of data, including text, image, audio, or video. Tag data may include free form text or keywords. The same tag may be assigned to more than one object and/or image. An object or image may have multiple tags.

In some embodiments, the output of an object detector includes tag data for an object. For example, the coordinates of an object may be considered tag data for an object. In some embodiments, the output of an object identifier includes tag data for an object, where the tag data includes a name. In some embodiments, a tag may be designated as a particular type of tag, such as a name tag. A name tag may be assigned to an object that includes a face.

Table 2 lists examples of information that may be stored for various tags. Such information may be stored in one or more of a database, file metadata, file, or in any other appropriate way.

TABLE 2

Tag  Tag        Object(s)     P(Object =   User or Machine         User or Machine     Tag Icon or Object ID
ID   Data       Being Tagged  Tag Data)    Assigned                Specified Tag Data  to Use for Tag Icon
1    Bob        1, 6          0.6, 1       Machine, User           User                Object ID 1
2    Janet      4             0.5          User                    User                Object ID 2
3    teeth      1             1            User                    User                icon1.jpg
4    hat        1             1            User                    User                icon2.jpg
5    mountains  1, 2, 3       0.8, 0.7, 1  Machine, Machine, User  Machine             icon3.jpg
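
The per-tag information of Table 2 could likewise be held in a record; in this sketch the field names mirror the table's columns and are assumptions:

    from dataclasses import dataclass
    from typing import List, Union

    @dataclass
    class Tag:
        # Field names mirror the columns of Table 2 (illustrative only).
        tag_id: int
        tag_data: str                    # e.g., a name, place, event, or date
        objects_tagged: List[int]
        p_tag_data: List[float]          # P(object = tag data), per object
        assigned_by: List[str]           # "Machine" or "User", per object
        tag_data_specified_by: str       # the sixth column, discussed below
        tag_icon: Union[int, str]        # an object ID or an icon file

    # Tag 1 from Table 2: "Bob" assigned to objects 1 and 6 with P = 0.6 and 1.
    bob = Tag(1, "Bob", [1, 6], [0.6, 1.0], ["Machine", "User"], "User", 1)
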
`
`10
`
`15
`
`25
`
`30
`
`35
`
`6
`In some embodiments, interface 300 shows interface 200
`when an “All Faces' option is selected. In some embodi
`ments, the detection object is not a face and different objects
`are accordingly detected and displayed.
`One or more objects may be selected. To select objects, an
`input device may be used to interact with interface 300. The
`input device can be a mouse, a stylus, a touch sensitive dis
`play, or any pointing device. Using an input device, one or
`more objects may be selected from the objects displayed in
`interface 300. For example, by placing a mouse cursor over
`object 304 and clicking the mouse, object 304 may be
`selected. Clicking an object may toggle the object between a
`selected and an unselected state. If a user clicks object 306
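
The category/subcategory organization could be represented as a simple tree; a sketch under assumed names, using sample categories from the text:

    # Hypothetical nesting for the hierarchy described above.
    tag_hierarchy = {
        "people": {"family": ["Bob", "Janet"]},
        "places": {"USA": []},
        "events": {"sports": []},
    }

    def display_side_bar(tree, indent=0):
        # Prints "people" on one line and "family" indented beneath it,
        # as side bar 206 might display them.
        for name, children in tree.items():
            print(" " * indent + name)
            if isinstance(children, dict):
                display_side_bar(children, indent + 2)
            else:
                for tag in children:
                    print(" " * (indent + 2) + tag)
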
In some embodiments, objects, images, video, and/or audio may be organized into collections. For example, photos to use in a slideshow may form a collection. A collection tag may be a particular type of tag. Collections may be displayed in side bar 206.

In some embodiments, the interface used varies from that of interface 200. For example, an interface may have no concept of tags, and an interface may not necessarily display objects based on a tag. Faces 202 and 204 may, for example, be displayed based on a date. The date may be the date a photograph is taken, or may be the date automatic detection is performed.

FIG. 3 illustrates an embodiment of an interface for viewing objects where an object is selected. In the example shown, interface 300 displays objects resulting from face detection performed on images. Any number of faces may be displayed. In some embodiments, interface 300 shows interface 200 when an "All Faces" option is selected. In some embodiments, the detection object is not a face and different objects are accordingly detected and displayed.

One or more objects may be selected. To select objects, an input device may be used to interact with interface 300. The input device can be a mouse, a stylus, a touch sensitive display, or any pointing device. Using an input device, one or more objects may be selected from the objects displayed in interface 300. For example, by placing a mouse cursor over object 304 and clicking the mouse, object 304 may be selected. Clicking an object may toggle the object between a selected and an unselected state. If a user clicks object 306 after selecting object 304 (e.g., while holding down the "Control" button), object 306 may be selected in addition to object 304. Clicking a mouse cursor above object 306 one more time may unselect object 306. In some cases, multiple objects are selected. In other cases, a single object is selected. Objects can be selected based on a criterion. For example, all objects associated with certain tag(s) may be selected.

An object may be selected for various purposes. For example, an object may be selected in order to perform an action associated with the object. Actions such as saving, exporting, tagging, copying, editing, or opening may be performed. Such actions may be performed with a selected object as the target. For example, in order to save an object, the object is selected and a "SAVE" button is pressed.
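
The selection behavior just described (click to toggle, Control-click to extend, selection by criterion) might be modeled as follows; all names are assumed, and the handling of a plain click while other objects are selected is an assumption the text leaves open:

    class Selection:
        """Tracks which objects in interface 300 are currently selected."""
        def __init__(self):
            self.selected = set()

        def click(self, object_id, control_held=False):
            # Clicking toggles an object between its selected and unselected
            # states; holding "Control" preserves the rest of the selection
            # (e.g., 304 stays selected when 306 is clicked).
            was_selected = object_id in self.selected
            if not control_held:
                self.selected.clear()
            if was_selected:
                self.selected.discard(object_id)
            else:
                self.selected.add(object_id)

        def select_by_tag(self, objects, tag):
            # Selection based on a criterion: all objects with a given tag.
            self.selected = {o.object_id for o in objects
                             if tag in getattr(o, "tags", ())}
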
When an object is selected, the corresponding image is displayed. The corresponding image includes the image from which the object was detected. In this example, object 304 is selected, as shown by a thicker border around object 304. In response, side bar 310 displays image 312. Image 312 is the image from which object 304 is detected. If more than one object is selected, more than one image may be displayed. In some embodiments, selecting object 304 causes image 312 to be selected. For example, side bar 310 may display multiple images. When object 304 is selected, image 312 is selected among the multiple images. As with object 304, a thicker border around image 312 may indicate that image 312 is selected. In some embodiments, unselecting an object causes an associated image to be removed from display. For example, if object 304 is unselected, image 312 may be removed from side bar 310. In some embodiments, no images are displayed if no objects are selected. In some embodiments, if more than one object is selected, the most recently selected object's corresponding image is indicated.
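
The object-to-image correspondence just described can be resolved through the source file ID stored with each object (see Table 1); in this sketch, images, side_bar, and its show/remove methods are hypothetical:

    def on_object_selected(obj, images, side_bar):
        # Each object records the ID of the file it was detected from
        # (Table 1), so the corresponding image is a direct lookup.
        image = images[obj.source_file_id]
        side_bar.show(image, emphasized=True)  # e.g., a thicker border, as for image 312

    def on_object_unselected(obj, images, side_bar):
        # Unselecting an object may remove the associated image from display.
        side_bar.remove(images[obj.source_file_id])
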
In some embodiments, an object may be detected from video, audio, or other data. Thus, video, audio, or other data may be displayed in response to selecting an object. For example, an object may have been detected from video. Selecting such an object may cause video data to be displayed. The video data may be displayed as text (e.g., a filename), an icon corresponding to a video file, or in any other way. Video data may be played. In the case of an audio file, selecting an object associated with voices from the audio file may cause the audio file to be displayed. Selecting an object associated with a sound in the audio file may cause the audio file to be displayed. In some embodiments, a lower resolution or bandwidth version of video, audio, or other data is displayed to reduce the amount of resources consumed.

In some embodiments, each object is detected from one and only one data source. In some embodiments, an object may be detected from more than one data source. For example, the same object may be detected from images that are copies of each other or two videos that include a same portion of video. An object may be associated with more than one audio file. In such cases, one or more of the data sources
`associated with a given object can be displayed. In some
`embodiments, all data sources are displayed. In some
`embodiments, a representative data source is selected and
`displayed.
`In some embodiments, a sequence of interactions or an
`input device used differs from that described in this example.
`For example, instead of using a mouse as an input device, a
`touch sensitive display may be used.
FIG. 4 is a flow chart illustrating an embodiment of a process for indicating an image. In the example shown, a set of one or more objects is detected by a detection process. At 402, an indication that an object has been selected is received. For example, an indication that a user has selected object 304 using a mouse is received. The indication may be triggered by a cursor hovering over the object, double clicking on the object, or any other appropriate sequence of inputs. At 404, an image is displayed such that a correspondence between the selected object and the image is conveyed. For example, image 312, and only image 312, may be displayed. In some embodiments, multiple images are displayed and the correspondence is conveyed visually, for example using a border, a highlight, shading, transparency, etc. Image 312 may be obtained in various ways. For example, in Table 1, each object has a source file ID. By looking up a selected object, the source file can be o