(19) United States
(12) Patent Application Publication    (10) Pub. No.: US 2007/0030391 A1
     Kim et al.                        (43) Pub. Date: Feb. 8, 2007
(54) APPARATUS, MEDIUM, AND METHOD SEGMENTING VIDEO SEQUENCES BASED ON TOPIC

(75) Inventors: Jungbae Kim, Yongin-si (KR); Doosun Hwang, Seoul (KR); Jiyeun Kim, Seoul (KR)

Correspondence Address:
STAAS & HALSEY LLP
SUITE 700
1201 NEW YORK AVENUE, N.W.
WASHINGTON, DC 20005 (US)

(73) Assignee: Samsung Electronics Co., Ltd., Suwon-si (KR)

(21) Appl. No.: 11/498,857

(22) Filed: Aug. 4, 2006
(30) Foreign Application Priority Data

Aug. 4, 2005 (KR) ........ 10-2005-0071507

Publication Classification

(51) Int. Cl.
     H04N 5/445 (2006.01)
(52) U.S. Cl. ........ 348/564
(57) ABSTRACT

Provided are an apparatus, medium, and method segmenting video sequences based on a topic. The apparatus may include a start-shot determination unit detecting a plurality of key-frames by using character information from video sequences including a plurality of frames to determine the detected key-frames as start-shots for each topic, and a topic list creation unit creating a topic list by using the start-shots for each topic.
[Front-page figure: a news program segmented by topic into chapters (e.g., Chapters 4 to 7 and Chapter 22), each chapter listing a start and contents.]
`Petitioner Apple Inc. - Ex. 1049, p. 1
[FIG. 2 (Sheet 2 of 10): Apparatus including a start-shot determination unit 210 and a topic list creation unit 230.]

[FIG. 3 (Sheet 2 of 10): Start-shot determination unit receiving video sequences and an EPG signal, including a pre-processing unit 310, a face detection unit 330, and a key-frame determination unit 350.]

[FIG. 4A (Sheet 3 of 10): Example EPG information 411. Title: WJ Reporters; Channel: KBS2; Broadcasting time: PM 9:55 to PM 11:05; Genre: current affairs/documentary-society; Broadcaster: Jung-Min Hwang.]

[FIG. 4B (Sheet 3 of 10): Example key-frames.]

[FIG. 4C (Sheet 4 of 10): Example topic list created from the key-frames.]

[FIG. 5 (Sheet 4 of 10): Pre-processing unit receiving video sequences and an EPG signal, including a thumbnail image creation unit, a scene change detection unit, an EPG analyzing unit, and a number-of-main-characters determination unit, with output to the key-frame determination unit.]

[FIG. 6A (Sheet 5 of 10): Face detection unit including a thumbnail image re-organization unit, a sub-window generation unit, and a classifying unit.]

[FIG. 7 (Sheet 5 of 10): Examples of sub-windows 710, 730, and 750.]

[FIG. 8A (Sheet 6 of 10): Edge simple characteristics 811 and 812 and line simple characteristics 813, 814, 815, and 816.]

[FIG. 8B (Sheet 6 of 10): Example of eye detection using a line simple characteristic 821 and an edge simple characteristic 823.]

[FIG. 9 (Sheet 7 of 10): Example of frame image segmentation for face detection, with a face 900 in the first section.]

[FIGS. 10A and 10B (Sheets 8 and 9 of 10): Flowcharts of the face detection operation: an integral image is organized for the first section, sub-windows are generated and face detection is performed, the frame image is re-organized, and subsequent sections are processed in turn (excluding sub-windows located in only the first section) until face detection is completed.]

[FIG. 11 (Sheet 10 of 10): Key-frame determination unit including a clothing information extraction unit, a character clustering unit, and a main character determination unit.]

[FIG. 12 (Sheet 10 of 10): Operation 1210 of the clothing information extraction unit.]

APPARATUS, MEDIUM, AND METHOD SEGMENTING VIDEO SEQUENCES BASED ON TOPIC

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of Korean Patent Application No. 10-2005-0071507, filed on Aug. 4, 2005, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] An embodiment of the present invention relates to segmentation of video sequences, and more particularly, to an apparatus, medium, and method segmenting video sequences based on a topic at high speed by detecting main characters.
[0004] 2. Description of the Related Art

[0005] Developments in digital signal processing techniques such as video and audio compression have allowed users to retrieve and browse desired multimedia content at desired points in time. Fundamental techniques required to browse and retrieve non-linear multimedia content include shot segmentation and shot clustering, with these two techniques being most important for structurally and hierarchically analyzing multimedia content.
[0006] A "shot" in a video program is a sequence of frames that can be obtained from a video camera without interruption, and may function as a basic unit for analyzing or organizing the video program. The shot may mean a single frame or a plurality of frames; however, for simplicity of explanation, the term shot will be exemplified by the single frame, noting that embodiments of the invention are not limited to the same. In addition, a "scene" in the video program is a semantic element of a video construction or development of a story, and includes a collection of shots related to one another by the same semantic context. The concept of the shot or the scene may be similarly applied to an audio program as well as the video program.
[0007] A multimedia indexing technique allows users to easily browse or retrieve a desired part of the video program. A conventional multimedia indexing technique may include extracting organizational information of video content in units of shots or scenes, extracting main characteristic elements such as key-frames capable of representing a corresponding segment for each organizational unit, indexing the organizational information for multimedia content, and describing semantic information, such as an occurrence of an event, advent of visual or auditory objects, and conditions and backgrounds of objects, along a temporal axis.
[0008] However, such conventional multimedia content indexing techniques fail to easily identify the result of a summarization because excessive segments are generated when segmentation is performed on the basis of scene change. In addition, conventional techniques fail to accurately detect start points of the segments because the multimedia content is not segmented on the basis of similarity of content, but rather, the multimedia content is summarized using a single piece of information such as similarity of colors. Further, it is difficult to summarize the multimedia content when a broadcast type or genre is changed because only a characteristic of a particular genre is used. Moreover, due to an excessive processing load generated during the summarization of the multimedia content, it is difficult to apply conventional techniques to embedded systems such as mobile phones, personal digital assistants (PDAs), and digital cameras, which have low-performance processors.
SUMMARY OF THE INVENTION

[0009] An embodiment of the present invention provides an apparatus, medium, and method for segmenting video sequences based on a topic, at high speed, based on the detection of main characters.

[0010] Additional aspects and/or advantages of the invention will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the invention.

[0011] To achieve the above and/or other aspects and advantages, embodiments of the present invention include an apparatus for topic-based segmenting of a video program, the apparatus including a start-shot determination unit to detect a plurality of key-frames based on character information from video sequences including a plurality of frames to determine the detected key-frames as start-shots for each topic, and a topic list creation unit to create a topic list based on the start-shots for each topic.

[0012] The start-shot determination unit may detect key-frames based on clothing information of at least one main character.

[0013] The topic list creation unit may organize frames existing between a current topic start-shot and a next topic start-shot into a current topic episode, and add the current topic episode to the start-shot of each topic in the topic list.
[0014] Further, the start-shot determination unit may include a pre-processing unit to determine frames belonging to a respective scene by detecting scene change among frames included in the video sequences and to obtain a number of main characters appearing in the video sequences, a face detection unit to detect faces from the determined frames belonging to the respective scene to determine face detection frames, and a key-frame determination unit to cluster the determined face detection frames according to the main characters corresponding to the number of main characters to determine the key-frames.

[0015] The pre-processing unit may detect the scene change by calculating similarity between a current frame and a previous frame.

[0016] In addition, the pre-processing unit may obtain the number of main characters from an electronic program guide (EPG) signal.

[0017] The pre-processing unit may include a thumbnail image creation unit to create thumbnail images for input frames, a scene change detection unit to detect the scene change using similarity of color histograms between thumbnail images of neighboring frames, and a number-of-main-characters determination unit to determine the number of main characters by analyzing an EPG signal.

[0018] In addition, the face detection unit may include a thumbnail image re-organization unit to create an integral image for thumbnail images of input frames and to re-organize the thumbnail images using the integral image, a sub-window generation unit to generate a sub-window for the re-organized thumbnail images, and a classifying unit to determine whether the sub-window includes a face.

[0019] Here, the face detection unit may divide the thumbnail images of the input frames into a plurality of sections, including a section having the highest probability of detecting the face, and sequentially provide the plurality of sections to the thumbnail image re-organization unit in descending order from the section having the highest probability of detecting the face to a section having the lowest probability of detecting the face.

[0020] The key-frame determination unit may further include a clothing information extraction unit to extract clothing information from a face detection frame, a character clustering unit to perform a character clustering method based on the extracted clothing information, and a main character determination unit to select a cluster corresponding to the main character from a plurality of clusters, clustered in the character clustering unit, corresponding to the number of main characters and to provide frames included in the selected cluster as key-frames of each topic.

[0021] The clothing information may include a clothing color histogram.
[0022] To achieve the above and/or other aspects and advantages, embodiments of the present invention include a method of topic-based segmenting of video sequences, the method including detecting a plurality of key-frames based on character information from video sequences including a plurality of frames to determine the detected key-frames as start-shots for each topic, and creating a topic list based on the start-shots for each topic.

[0023] The determination of the start-shots may include detecting key-frames based on clothing information of at least one main character.

[0024] Further, the creation of the topic list may include organizing frames existing between a current topic start-shot and a next topic start-shot into a current topic episode, and adding the current topic episode to the start-shot of each topic in the topic list.

[0025] The determination of the start-shots may include detecting a scene change from the frames included in the video sequences to determine frames belonging to a respective scene and obtaining a number of main characters appearing in the video sequences, detecting faces from the determined frames belonging to the respective scene to determine face detection frames, and clustering the determined face detection frames according to the main characters corresponding to the number of main characters to determine the face detection frames as key-frames.

[0026] The scene change may be detected by creating thumbnail images of input frames and using similarity of color histograms between thumbnail images of neighboring frames.

[0027] In addition, the number of main characters may be obtained by analyzing an electronic program guide (EPG) signal.
[0028] The detection of the faces may include creating an integral image for thumbnail images of input frames and re-organizing the thumbnail images using the integral image, generating a sub-window for the re-organized thumbnail images, and determining whether the sub-window includes a face.

[0029] The detection of the faces may further include dividing the thumbnail images of the input frames into a plurality of sections including a section having the highest probability of detecting a face, and sequentially providing the thumbnail images for the thumbnail image re-organizing in descending order from the section having the highest probability of detecting the face to a section having the lowest probability of detecting the face.

[0030] The determination of the key-frames may include extracting clothing information from the face detection frames, performing a character clustering method based on the extracted clothing information, and selecting a cluster corresponding to the main character from a plurality of clusters corresponding to the number of main characters and providing frames included in the selected cluster as the key-frames of each topic.

[0031] To achieve the above and/or other aspects and advantages, embodiments of the present invention include a medium including computer readable code to implement a method of topic-based segmenting of video sequences, the method may include detecting a plurality of key-frames based on character information from video sequences including a plurality of frames to determine the detected key-frames as start-shots for each topic, and creating a topic list based on the start-shots for each topic.
BRIEF DESCRIPTION OF THE DRAWINGS

[0032] These and/or other aspects and advantages of the invention will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

[0033] FIG. 1 illustrates an example of topic-based segmentation of video sequences related to news;

[0034] FIG. 2 illustrates an apparatus for segmenting video sequences based on a topic, according to an embodiment of the present invention;

[0035] FIG. 3 illustrates a start-shot determination unit, such as that of FIG. 2, according to an embodiment of the present invention;

[0036] FIGS. 4A to 4C illustrate an operation of each element of a start-shot determination unit, such as that of FIG. 3, according to an embodiment of the present invention;

[0037] FIG. 5 illustrates a pre-processing unit of a start-shot determination unit, such as that of FIG. 3, according to an embodiment of the present invention;

[0038] FIG. 6A illustrates a face detection unit of a start-shot determination unit, such as that of FIG. 3, according to an embodiment of the present invention;

[0039] FIG. 6B illustrates a method of organizing an integral image, according to an embodiment of the present invention;
[0040] FIG. 7 illustrates an example of a sub-window used in a face detection unit of a start-shot determination unit, such as that of FIG. 3, according to an embodiment of the present invention;

[0041] FIGS. 8A and 8B illustrate examples of characteristics used in a classifier of a face detection unit, such as that of FIG. 6A, according to an embodiment of the present invention;

[0042] FIG. 9 illustrates an example of frame image segmentation for detecting faces in a face detection unit of a start-shot determination unit, such as that of FIG. 3, according to an embodiment of the present invention;

[0043] FIGS. 10A and 10B illustrate an operation of a face detection unit of a start-shot determination unit, such as that of FIG. 3, according to an embodiment of the present invention;

[0044] FIG. 11 illustrates a key-frame determination unit of a start-shot determination unit, such as that of FIG. 3, according to an embodiment of the present invention; and

[0045] FIG. 12 illustrates an operation of a clothing information extraction unit, such as that of FIG. 11, according to an embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0046] Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. Embodiments are described below to explain the present invention by referring to the figures.

[0047] FIG. 1 illustrates an example of topic-based segmentation of video sequences related to news. Referring to FIG. 1, Chapters 1 to 25 are segmented based on a topic, whereby each chapter includes a start-shot set as a key-frame having a main character and material frames, e.g., an episode, for supporting corresponding content. Here, though only news has been shown, embodiments of the present invention are equally available for alternate topics in addition to news.
[0048] FIG. 2 illustrates an apparatus for segmenting video sequences based on a topic, according to an embodiment of the present invention. Referring to FIG. 2, the apparatus for segmenting video sequences based on a topic may include a start-shot determination unit 210 and a topic list creation unit 230, for example, in order to segment video sequences based on the topic by detecting the main characters.

[0049] Referring to FIG. 2, the start-shot determination unit 210 may detect a plurality of key-frames by using character information from video sequences including a plurality of frames to determine the detected key-frames as start-shots for each topic. In one embodiment, a main character may appear in each key-frame. In addition, an operation of detecting the start-shot preferably may be performed in units of scenes.

[0050] The topic list creation unit 230 may further create a topic list by using the start-shots for each topic determined by the start-shot determination unit 210. The start-shots detected for each scene are combined to create the topic list. In one embodiment, frames existing between a current topic start-shot and a next topic start-shot are made into a current topic episode, and the current topic episode is added to the start-shot of each topic of the topic list.
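The episode grouping described above can be sketched as follows; the function name and the frame-index representation are illustrative, not part of the original disclosure.

```python
def create_topic_list(start_shots, total_frames):
    """Group frames between consecutive topic start-shots into episodes.

    start_shots: sorted frame indices determined as topic start-shots.
    Returns a list of (start_shot, episode_frames) pairs, where the
    episode holds the frames between this start-shot and the next one.
    """
    topic_list = []
    for i, start in enumerate(start_shots):
        # The episode for the last topic runs to the end of the sequence.
        end = start_shots[i + 1] if i + 1 < len(start_shots) else total_frames
        episode = list(range(start + 1, end))  # frames after the start-shot
        topic_list.append((start, episode))
    return topic_list
```

For example, with start-shots at frames 0, 5, and 9 in a 12-frame sequence, each topic entry pairs its start-shot with the frames up to the next start-shot.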
[0051] FIG. 3 illustrates a makeup of a start-shot determination unit 210, such as that of FIG. 2, according to an embodiment of the present invention. The start-shot determination unit 210 may include a pre-processing unit 310, a face detection unit 330, and a key-frame determination unit 350, for example.

[0052] Referring to FIG. 3, the pre-processing unit 310 may receive video sequences making up one video program and detect scene changes to determine frames belonging to a current scene. In addition, the pre-processing unit 310 may receive an electronic program guide (EPG) signal of a corresponding video program and determine the number of main characters. As shown in FIG. 4A, the EPG signal may include various kinds of information such as broadcasting time, program genre, title, name of a director, names of characters, plot, etc.
[0053] The face detection unit 330 may detect faces in each of the frames belonging to the current scene, e.g., as determined by the pre-processing unit 310. Since the main characters may look to the front, front faces may be detected. In this case, only whether a face exists may be determined, for example, regardless of the number of faces in each frame. Here, a variety of well-known face detection algorithms may be employed to detect faces.

[0054] The key-frame determination unit 350 may detect clothing information from the frames in which faces have been detected, e.g., in the face detection unit 330, cluster frames for each character corresponding to the clothing information, and determine frames including the main character as the key-frames, e.g., start-shots of a corresponding topic. Since the clothing information of a main character seldom changes in a single video program, the clothing information may be used in a character clustering method. Clusters having relatively few frames may also be removed from a plurality of clusters generated as a result of the clustering, in consideration of a determined number of main characters, e.g., as determined in the pre-processing unit 310, assuming that the main characters appear more frequently compared to other characters. The key-frame determination unit 350, thus, may determine the result of the character clustering, for example, the key-frames of FIG. 4B, and use the key-frames to create a topic list as shown in FIG. 4C.
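As an illustration of the character clustering idea described above, the sketch below greedily groups face-detection frames by clothing-color histogram intersection and keeps the largest clusters as main-character clusters. The function names, the similarity threshold, and the greedy strategy are assumptions; the text does not fix a particular clustering algorithm.

```python
def cluster_by_clothing(frames, num_main_characters, sim_threshold=0.8):
    """Greedily cluster face-detection frames by clothing-color histogram
    similarity, then keep the largest clusters as main characters.

    frames: list of (frame_id, histogram) pairs; histograms are
    normalized so that histogram intersection lies in [0, 1].
    Returns the sorted frame ids of the retained clusters (key-frames).
    """
    def intersection(h1, h2):
        return sum(min(a, b) for a, b in zip(h1, h2))

    clusters = []  # each cluster: list of (frame_id, histogram)
    for frame in frames:
        for cluster in clusters:
            # Compare against the cluster's first member as its representative.
            if intersection(frame[1], cluster[0][1]) >= sim_threshold:
                cluster.append(frame)
                break
        else:
            clusters.append([frame])

    # Assume main characters appear most often: keep the biggest clusters
    # and drop the clusters having relatively few frames.
    clusters.sort(key=len, reverse=True)
    return sorted(f[0] for c in clusters[:num_main_characters] for f in c)
```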
[0055] FIG. 5 illustrates a makeup of a pre-processing unit 310 of a start-shot determination unit 210, such as that of FIG. 3, according to an embodiment of the present invention. The pre-processing unit 310 may include a frame input unit 510, a thumbnail image creation unit 530, a scene change detection unit 550, an EPG analyzing unit 570, and a number-of-main-characters determination unit 590, for example.

[0056] Referring to FIG. 5, the frame input unit 510 may sequentially receive frame images detected from the video sequences.

[0057] The thumbnail image creation unit 530 may sample pixels with a constant interval for original frame images provided from the frame input unit 510 in a size of W×H to create thumbnail images having a reduced size of w×h. These thumbnail images allow the face detection unit 330 to detect faces at a higher speed in comparison with when the original frame images are used.
[0058] The scene change detection unit 550 may store previous frame images and calculate similarity of color histograms between two successive frame images, e.g., between a current frame image and the previous frame image. When the calculated similarity is lower than a predetermined threshold value, it may be determined that a scene change is detected in the current frame. In this case, the similarity Sim(H_t, H_{t+1}) may be calculated from the below Equation 1, for example.

    Sim(H_t, H_{t+1}) = Σ_{n=1}^{N} min(H_t(n), H_{t+1}(n))        Equation 1

[0059] Here, H_t corresponds to a color histogram of the previous frame image, H_{t+1} corresponds to a color histogram of the current frame image, and N corresponds to a histogram level.
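A minimal sketch of the scene change test of Equation 1, assuming small luminance histograms and an illustrative threshold (the text leaves both the histogram level N and the threshold value open):

```python
def color_histogram(pixels, levels=4):
    """Build a normalized histogram with `levels` bins from
    pixel values in the range 0-255."""
    hist = [0.0] * levels
    for p in pixels:
        hist[min(p * levels // 256, levels - 1)] += 1
    total = len(pixels)
    return [h / total for h in hist]

def is_scene_change(prev_pixels, cur_pixels, threshold=0.7, levels=4):
    """Equation 1: histogram intersection between successive thumbnails.
    A similarity below the threshold is treated as a scene change."""
    h_prev = color_histogram(prev_pixels, levels)
    h_cur = color_histogram(cur_pixels, levels)
    sim = sum(min(a, b) for a, b in zip(h_prev, h_cur))
    return sim < threshold
```

Two visually identical thumbnails give a similarity of 1 (no scene change), while a dark-to-bright cut gives a similarity near 0.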
[0060] The EPG analyzing unit 570 may analyze an EPG signal included in a single video program, and the number-of-main-characters determination unit 590 may determine the number of main characters based on the result of the analysis in the EPG analyzing unit 570.
[0061] FIG. 6A illustrates a detailed makeup of a face detection unit 330 of a start-shot determination unit 210, such as that of FIG. 3, according to an embodiment of the present invention. The face detection unit 330 may include a thumbnail image re-organization unit 610, a sub-window generation unit 630, and a classifying unit 650, for example.

[0062] Referring to FIG. 6A, the thumbnail re-organization unit 610 may obtain integral images at each point from the thumbnail images for the frames belonging to the current scene, e.g., as provided by the pre-processing unit 310, to re-organize a thumbnail image. A method of obtaining the integral images will be further described below in greater detail with reference to FIG. 6B.
[0063] Referring to FIG. 6B, the thumbnail image may include four regions A, B, C, and D, and four points a, b, c, and d, specified according to an embodiment of the present invention. An integral image of a point a refers to a sum of pixel values in a region on an upper left side of the point a. That is, the integral image of the point a corresponds to a sum of pixel values in the region A. In this case, each of the pixel values may include a luminance level of a pixel, for example. In addition, an integral square image of the point a refers to a sum of squared pixel values in the region on the upper left side of the point a. That is, the integral square image at the point a corresponds to a sum of squared pixel values included in the region A. This concept of such an integral image allows convenient calculation of the sum of the pixel values in any region of an image. In addition, use of such an integral image allows for fast segmentation in the segmentation unit 670. For example, the sum of the pixel values of the region D may be calculated from the below Equation 2, for example.

    S(D) = ii(d) - ii(b) - ii(c) + ii(a)        Equation 2

[0064] Here, ii(d) corresponds to the integral image of the point d, ii(b) corresponds to the integral image of the point b, ii(c) corresponds to the integral image of the point c, and ii(a) corresponds to the integral image of the point a.
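Equation 2 can be exercised with a small sketch that builds an integral image with one extra zero row and column (so the value at a point is the sum strictly above and to its left, as in FIG. 6B) and then computes an arbitrary rectangle sum from four lookups; the helper names are illustrative.

```python
def integral_image(img):
    """ii[y][x] = sum of img pixels strictly above and to the left of
    point (x, y); ii has one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def region_sum(ii, x0, y0, x1, y1):
    """Equation 2: sum over the rectangle [x0, x1) x [y0, y1) from the
    integral-image values at its four corners."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]
```

Each rectangle sum costs four lookups regardless of the rectangle size, which is what makes the simple-characteristic evaluation below fast.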
[0065] The thumbnail image re-organization unit 610 may reorganize the thumbnail images using integral images at each point, as calculated from Equation 2, for example. In one embodiment, the inclusion of the thumbnail re-organization unit 610 may be optional.

[0066] The sub-window generation unit 630 may generate sub-windows by dividing the re-organized thumbnail images, e.g., as re-organized in the thumbnail image re-organization unit 610. In one embodiment, the size of the sub-window may be previously determined and may be linearly enlarged by a predetermined ratio. For example, the size of the sub-window may be initially set to 20×20 pixels, and the entire image may be divided using the sub-window having the above initial size. Then the size of the sub-window may be linearly enlarged by a ratio of 1:2, and the entire image may be divided again using the sub-window having the enlarged size. The image may be divided by enlarging the size of the sub-window until the size of the sub-window becomes equal to the size of the entire image. The sub-windows generated in the sub-window generation unit 630 may be superposed with one another, for example. Reference numerals 710, 730, and 750 of FIG. 7 further illustrate examples of sub-windows generated by the sub-window generation unit 630.
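The multi-scale sub-window generation described above can be sketched as follows. The 20×20 initial size comes from the text; the stride value and the reading of the "1:2" enlargement ratio as doubling the side per pass are assumptions for illustration.

```python
def generate_subwindows(width, height, initial=20, scale=2, stride=10):
    """Slide square sub-windows over the image, enlarging the window
    each pass until it no longer fits. A stride smaller than the
    window size makes neighboring sub-windows overlap (superpose)."""
    windows = []
    size = initial
    while size <= min(width, height):
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                windows.append((x, y, size))
        size = int(size * scale)  # enlarge the sub-window for the next pass
    return windows
```

On a 40×40 thumbnail this yields nine overlapping 20×20 sub-windows and one 40×40 sub-window covering the entire image.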
[0067] The classifying unit 650 may be implemented by n stages S1 to Sn, which may further be cascaded. Each of the stages S1 to Sn detects faces using classifiers based on a simple characteristic. The number of classifiers may also increase as the stage number increases. For example, four or five classifiers may be used in the first stage S1, and fifteen to twenty classifiers may be used in the second stage S2, and so on.
[0068] Each stage may have a weighted sum for a plurality of classifiers and may determine whether the face has been successfully detected based on the sign of the weighted sum. The sign of the weighted sum of each stage can be expressed by the following Equation 3, for example.

    sign( Σ_{m=1}^{M} c_m · f_m(x) )        Equation 3

[0069] Here, c_m corresponds to a weighting value of a classifier, and f_m(x) corresponds to an output of a classifier. Each classifier has a single simple characteristic and a threshold value. As a result, -1 or +1 is output as the value of f_m(x).
[0070] In the classifying unit 650, the first stage S1 may receive the k-th sub-window provided from the sub-window generation unit 630 and try to detect faces. When the face detection fails, the k-th sub-window is determined as a non-face sub-window. Conversely, when the face detection is successful, the k-th sub-window image is provided to the second stage S2. When the face detection is successful in the k-th sub-window of the final stage Sn, the k-th sub-window is determined as a face sub-window. On the other hand, an Adaboost learning algorithm may also be employed in each classifier to select the weighting value. According to the Adaboost algorithm, some important visual characteristics are selected from a large characteristic set to generate a very efficient classifier. Such a cascaded stage structure allows the non-face sub-window to be determined even by using a small number of simple characteristics. Therefore, the non-face sub-window can be directly rejected at initial stages such as the first or second stage, and then the next (k+1)-th sub-window can be received to detect faces. As a result, it is possible to improve a total speed of the face detection process.
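A toy sketch of the cascade logic: each stage computes the weighted sum of Equation 3, and a sub-window is rejected at the first stage whose sum is not positive. The stage contents below are invented placeholders operating on a scalar input, not trained classifiers.

```python
def stage_passes(classifiers, x):
    """Equation 3: a stage accepts the input when the sign of the
    weighted sum of its classifier outputs is positive; each f_m(x)
    returns -1 or +1 and c_m is its weighting value."""
    return sum(c_m * f_m(x) for c_m, f_m in classifiers) > 0

def cascade_detect(stages, x):
    """Reject the sub-window at the first failing stage; only
    sub-windows passing every stage are reported as faces."""
    return all(stage_passes(stage, x) for stage in stages)

# Invented placeholder stages: each classifier thresholds the input.
toy_stages = [
    [(0.6, lambda x: 1 if x > 0 else -1)],                 # stage S1
    [(0.5, lambda x: 1 if x > 10 else -1),
     (0.2, lambda x: 1 if x > 5 else -1)],                 # stage S2
]
```

Because `all()` short-circuits, an input failing stage S1 never reaches stage S2, mirroring the early rejection of non-face sub-windows described above.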
[0071] FIG. 8A illustrates edge simple characteristics 811 and 812, and line simple characteristics 813, 814, 815, and 816 used in each classifier of the classifying unit 650, according to an embodiment of the present invention. Each simple characteristic includes two or three rectangular areas having a white or black color. Each classifier subtracts the sum of pixel values of the white rectangular area from the sum of pixel values of the black rectangular area according to the simple characteristics, and the subtraction result is compared with the threshold value corresponding to the simple characteristic. A sign value of -1 or +1 is output depending on the result of the comparison between the subtraction result and the threshold value. FIG. 8B further illustrates an example of eye detection using a line simple characteristic 821 having one white rectangular area and two black rectangular areas or an edge simple characteristic 823 having one white rectangular area and one black rectangular area. When the line simple characteristic is used, the difference of pixel values between an eye region and a nose ridge region of a face is measured, taking into consideration that the eye region is darker than the nose ridge region. When the edge simple characteristic is used, the difference of gradations between the eye region and the cheek region of a face is measured, taking into consideration that the eye region is darker than the cheek region. As described above, the simple characteristics for detecting the face may be variously provided.
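The rectangle-difference computation of a simple characteristic can be sketched as below for a two-rectangle edge characteristic (black upper half minus white lower half). The orientation and threshold are illustrative, and a real classifier would evaluate the rectangle sums via the integral image rather than by direct summation.

```python
def rect_sum(img, x0, y0, x1, y1):
    """Sum of pixel values over the rectangle [x0, x1) x [y0, y1)."""
    return sum(img[y][x] for y in range(y0, y1) for x in range(x0, x1))

def edge_characteristic(img, x, y, w, h, threshold):
    """Two-rectangle edge characteristic: sum over the black (upper)
    half minus sum over the white (lower) half, compared with the
    per-characteristic threshold; outputs the sign value +1 or -1."""
    black = rect_sum(img, x, y, x + w, y + h // 2)
    white = rect_sum(img, x, y + h // 2, x + w, y + h)
    return 1 if black - white > threshold else -1
```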
[0072] FIG. 9 illustrates an example of frame image segmentation for detecting faces at high speed using a face detection unit 330 of a start-shot determination unit 210, such as that of FIG. 3, according to an embodiment of the present invention. A frame image may be divided into first to fifth sections 910, 930, 950, 970, and 990 according to a possibility of face existence before the thumbnail images are input to the thumbnail re-organization unit 610. In this case, the segmentation locations for each section may be statistically determined through experiments or simulations, for example. Generally, since the first section 910 has the highest probability of detecting the face 900, the plurality of sections may be sequentially provided to the thumbnail image re-organization unit
