International Bureau

(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(10) International Publication Number: WO 2009/108645 A1

(43) International Publication Date: 3 September 2009 (03.09.2009)
`
(51) International Patent Classification: G06K 9/00 (2006.01)

(21) International Application Number: PCT/US2009/035032

(22) International Filing Date: 24 February 2009 (24.02.2009)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data: 61/032,028    27 February 2008 (27.02.2008)    US
(71) Applicants (for all designated States except US): SONY COMPUTER ENTERTAINMENT AMERICA INC. [US/US]; 919 East Hillsdale Boulevard, 2nd Floor, Foster City, CA 94404 (US). SONY COMPUTER ENTERTAINMENT EUROPE LIMITED [GB/GB]; 10 Great Marlborough Street, London W1F 7LP (GB).

(72) Inventors; and
(75) Inventors/Applicants (for US only): ZALEWSKI, Gary, M. [US/US]; 919 Hillsdale Boulevard, 2nd Floor, Foster City, CA 94404-2175 (US). HAIGH, Mike [GB/GB]; 10 Great Marlborough Street, London W1F 7LP (GB).

(74) Agent: CHAN, Konrad, K.; Martine Penilla & Gencarella, LLP, 710 Lakeway Drive, Suite 200, Sunnyvale, CA 94085 (US).
`
(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BR, BW, BY, BZ, CA, CH, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IS, JP, KE, KG, KM, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PG, PE, PL, PT, RO, RS, RU, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.

(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European (AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, SE, SI, SK, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).
`
`(54) Title: METHODS FOR CAPTURING DEPTH DATA OF A SCENE AND APPLYING COMPUTER ACTIONS
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
[Representative drawing: FIG. 1B]
(57) Abstract: A computer-implemented method is provided to automatically apply predefined privileges for identified and tracked users in a space having one or more media sources. The method includes an operation to define and save to memory a user profile. The user profile may include data that identifies and tracks a user with a depth-sensing camera. Alternatively, privileges defining levels of access to particular media for the user profile are defined and saved. The method also includes an operation to capture image and depth data from the depth-sensing camera of a scene within the space. In yet another operation, the user is tracked and identified within the scene from the image and depth data. Alternatively, defined privileges automatically apply to media sources so that the user is granted access to selected content from the one or more media sources when identified and tracked within the scene.
`
Published: with international search report (Art. 21(3))
`
`
`
`
`METHODS FOR CAPTURING DEPTH DATA OF A SCENE AND APPLYING
`COMPUTER ACTIONS
`
`by Gary Zalewski and Mike Haigh
`
`BACKGROUND OF THE INVENTION
`
`Description of the Related Art
`
`
`[0001] The video game industry has seen many changes over the years. As computing power
`
`has expanded, developers of video games have likewise created game software that takes
`
`advantage of these increases in computing power. To this end, video game developers have
`
`been coding games that incorporate sophisticated operations and mathematics to produce a
`
`very realistic game experience.
`
[0002] Example gaming platforms may be the Sony Playstation, Sony Playstation2 (PS2), and Sony Playstation3 (PS3), each of which is sold in the form of a game console. As is well known, the game console is designed to connect to a monitor (usually a television) and enable user interaction through handheld controllers. The game console is designed with specialized processing hardware, including a CPU, a graphics synthesizer for processing intensive graphics operations, a vector unit for performing geometry transformations, and other glue hardware, firmware, and software. The game console is further designed with an optical disc tray for receiving game compact discs for local play through the game console. Online gaming is also possible, where a user can interactively play against or with other users over the Internet.
`
[0003] Game consoles connected to the Internet with increasing processing power are beginning to function as entertainment portals capable of providing access to online interaction and online marketplaces supplying streaming and downloadable media. In an online environment, parents can often struggle with effective and efficient parental controls. Piecemeal implementation of parental controls can result in ineffective, overzealous, or partial implementation, creating gaps in parental controls. For example, where a cable box may provide parental control for television channels, separate parental controls may be required for videos stored on optical media such as DVDs or Blu-Ray discs.
`
`[0004] It is within this context that embodiments of the invention arise.
`
`
`
`
`SUMMARY
`
`[0005] Broadly speaking, the present invention enables users to be identified and tracked
`
`within a scene using a depth-sensing camera. The identification and tracking of users can
`
`enable automatic application of access controls to a variety of media sources based on the
`
`identified and tracked users within the scene. Users can configure the access controls as part
`
`of a user profile. A user can also configure an avatar as part of their user profile along with
`
`avatar animations. The avatar animations can be used to display user movement within and
`
`across the scene.
`
[0006] In one embodiment, a computer-implemented method to automatically apply predefined privileges for identified and tracked users in a space having one or more media sources is disclosed. The method includes an operation to define and save to memory a user profile. The user profile may include data that identifies and tracks a user with a depth-sensing camera. In another operation, privileges that define levels of access to particular media for the user profile are defined and saved. The method also includes an operation to capture image and depth data from the depth-sensing camera of a scene within the space. In yet another operation, the user is tracked and identified within the scene from the image and depth data. In still another operation, the defined privileges are automatically applied to one or more media sources, so that the user is granted access to selected content from the one or more media sources when identified and tracked within the scene.
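
For illustration only (not part of the original application), the sequence of operations in this embodiment can be sketched in Python as follows; the names UserProfile, identify_user, and apply_privileges, the signature representation, and the tolerance value are assumptions introduced here, not terms from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class UserProfile:
    """User profile saved to memory: identification data plus media privileges."""
    name: str
    face_signature: List[float]                    # measurements taken with the depth-sensing camera
    privileges: Dict[str, str] = field(default_factory=dict)   # media type -> highest allowed rating

def identify_user(signature: List[float],
                  profiles: List[UserProfile],
                  tolerance: float = 0.05) -> Optional[UserProfile]:
    """Return the saved profile whose signature substantially matches the captured one."""
    for profile in profiles:
        if len(profile.face_signature) == len(signature) and all(
                abs(a - b) <= tolerance for a, b in zip(profile.face_signature, signature)):
            return profile
    return None

def apply_privileges(profile: Optional[UserProfile],
                     default: Dict[str, str]) -> Dict[str, str]:
    """Use the identified user's privileges, or fall back to defaults."""
    return profile.privileges if profile else default

# Example: a captured signature close to dad's saved one selects his privileges.
dad = UserProfile("dad", [0.61, 0.34, 0.29], {"movies": "R", "games": "M"})
captured = [0.60, 0.35, 0.29]                      # derived from image and depth data of the scene
print(apply_privileges(identify_user(captured, [dad]), {"movies": "G"}))
```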
`
`[0007] In another embodiment, a computer-implemented method for identifying and tracking
`
`real-world objects to automatically apply predefined computer-generated effects to virtual
`
`world representations of the real world objects is disclosed. The method includes an operation
`
`to define and save to memory a user profile that includes data to identify and track the user
`
`with a depth-sensing camera. The method also includes an operation to define and save to the
`
`
`memory animations that are integrated into a virtual world scene associated with the user
`
`profile. In another operation the depth-sensing camera captures a scene where the user is
`
`identified and tracked within the scene. In yet another operation, the defined animations are
`
`automatically applied based on the identified and tracked user, so that a display screen shows
`
`the integrated animations.
`
[0008] In yet another embodiment, a computer-implemented method for identifying and

tracking a real-world user within a real-world space is disclosed. In one operation a user
`
`profile is defined from image and depth data captured by a depth-sensing camera. The user
`
`profile may include image and depth data related to physical characteristics of the real-world
`
`
`
`
`user. In another operation image and depth data may be captured for a scene using the depth-
`
`sensing camera. The method may also include an operation that identifies moving objects
`
`within the scene. In another operation, image and depth data for the moving objects allows a
`
`head of the real-world user to be locked onto and tracked within the scene. In yet another
`
`operation the image and depth data for the head is analyzed in real-time. The analysis can
`
`include comparing image and depth data for the head to user profile image and depth data
`
`related to physical characteristics, wherein a user is identified when image and depth data
`
`within the user profile substantially matches image and depth data for the head.
`
`[0009] Other aspects and advantages of the invention will become apparent from the
`
`
`following detailed description,
`
`taken in conjunction with the accompanying drawings,
`
`illustrating by way of example the principles of the invention.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`[0010] The invention, together with further advantages thereof, may best be understood by
`
`
`reference to the following description taken in conjunction with the accompanying drawings.
`
`[0011] Figure 1A is a flowchart including exemplary operations that can be used to identify
`
`and track real world objects in order to apply pre-defined computer generated effects to virtual
`
`world representations of the real-world objects, in accordance with one embodiment of the
`
`present invention.
`
[0012] Figure 1B shows a scene within the field of view of a depth-sensing camera that is connected to a computer system, in accordance with one embodiment of the present invention.
`
[0013] Figures 2A-2D illustrate exemplary screens that can be used to define a user profile, in accordance with one embodiment of the present invention.
`
[0014] Figure 2E is an exemplary screen illustrating completion of adding an additional user, mom, in accordance with one embodiment of the present invention.
`
[0015] Figure 2F-1 illustrates facial features captured by the depth-sensing camera that can be used to recognize users, in accordance with one embodiment of the present invention.
`
[0016] Figure 2F-2 illustrates capturing image and depth data of a user head 250 in a variety of positions, in accordance with one embodiment of the present invention.
`
[0017] Figure 2G illustrates a matrix of various poses of a modeled user's face 251 constructed from various views captured by the depth-sensing camera, in accordance with one embodiment of the present invention.
`
`
`
`
[0018] Figure 2H is a sequence of three images showing changes in relative position of various facial features, in accordance with one embodiment of the present invention.
`
[0019] Figure 2I is an exemplary flow chart illustrating a procedure to acquire image and depth data for a user's head, in accordance with one embodiment of the present invention.
`
[0020] Figure 2J is an exemplary flow chart illustrating exemplary operations within a procedure to identify a user within the field of view of the depth-sensing camera, in accordance with one embodiment of the present invention.
`
[0021] Figures 3A-3C show an abbreviated set of exemplary screens that can be used to create a user profile for a pet, in accordance with one embodiment of the present invention.
`
[0022] Figure 4A illustrates an exemplary chart showing various privileges assigned to users, in accordance with one embodiment of the present invention.
`
`[0023] Figure 4B is an exemplary chart illustrating animations created by users in accordance
`
`with one embodiment of the present invention.
`
[0024] Figure 5A illustrates a space including a real-world scene that is displayed on the screen as processed video, in accordance with one embodiment of the present invention.
`
`[0025] Figure 5B shows exemplary processed video in accordance with embodiments of the
`
`present invention.
`
[0026] Figure 6 illustrates a real-world scene and how the real-world scene is displayed on the screen as processed video 10c, in accordance with one embodiment of the present invention.
`
[0027] Figure 7 is an exemplary flow chart illustrating operations to apply point tracking in order to improve identification and tracking of recognized objects.
`
[0028] Figure 8 is an exemplary view of a scene illustrating point tracking, in accordance with one embodiment of the present invention.
`
`[0029] Figure 9 schematically illustrates the overall system architecture of the Sony®
`
`
`Playstation 3® entertainment device, a computer system capable of utilizing dynamic three-
`
`dimensional object mapping to create user-defined controllers in accordance with one
`
`embodiment of the present invention.
`
`
`
`
`DETAILED DESCRIPTION
`
[0030] An invention is disclosed for automatically applying user profiles for a computer system after a user is identified with image and depth data from a depth-sensing camera. Broadly speaking, the computer system can be any type of system that takes input from a user, whether it be a general purpose computer (e.g., desktop, notebook, handheld device, smartphone, etc.) or a special purpose computer like a game console. The depth-sensing camera can capture geometric depth data along with image data. The depth-sensing camera can provide image and depth data to the computer system for analysis and processing. In one embodiment, the depth-sensing camera is a single lens camera, and in other embodiments, multiple camera lenses can be used to capture images and depth data from various locations or perspectives.
`
[0031] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some or all of these specific details. In other instances, well known process steps have not been described in detail in order not to unnecessarily obscure the present invention.
`
[0032] Figure 1A is a flow chart including exemplary operations that can be used to identify and track real world objects in order to apply pre-defined computer generated effects to virtual world representations of the real-world objects, in accordance with one embodiment of the present invention. Operation 100 is used to define a user for identification and tracking. In one embodiment, operation 100 is performed using a depth-sensing camera connected to a computer system. The users can be distinguished by individual characteristics captured by the depth-sensing camera and recognized by software executed by the computer system. In various embodiments, facial characteristics including, but not limited to, various distances between facial features such as eyes, nose, and mouth can be captured. In other embodiments, the depth-sensing features of the camera can be used to recognize features of a user, such as the nose, eyes, head size, and relative positions of features.
`
[0033] Users can also define virtual world avatars as part of operation 100. As part of defining a user for identification and tracking, the user can configure a customized avatar that is representative of the user within a virtual world. A user can configure their avatar to be similar to their real-world appearance or choose to have a fanciful avatar not bound by their real-world appearance. To that end, avatars can include, but are not limited to, configurations for size and shape of bodies, heads, eyes, noses, ears, mouths, arms, legs, and hair. Users can
`
`
`
`
`also be allowed to configure virtual clothing and footwear for their avatar along with fashion
`
`accessories such as jewelry and sunglasses.
`
[0034] Operation 102 is used to define privileges and animations for identified users. The privileges allow restrictions to be placed on the type of content accessible via the computer system when an identified user is in front of the depth-sensing camera. In one embodiment, the computer system is connected to the internet and provides access to various media sources such as, but not limited to, streaming or downloadable music and multimedia such as movies, television shows, and video clips. Additionally, the computer system can provide access to online or downloadable games along with providing a web browser for accessing websites capable of streaming video such as YouTube. The computer system can also include an integrated media source that is capable of playing DVDs or other optical storage media such as Blu-Ray or HD-DVD discs. Privileges assigned to defined users can restrict access to particular types of movies, television shows, games and websites.
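
As an informal illustration of how such privileges might be represented (the rating scales, field names, and values below are assumptions made for the example, not taken from the application):

```python
# One possible encoding of privileges: each media type has an ordered rating scale,
# and a profile stores the highest rating that user may access.
RATING_SCALES = {
    "movies": ["G", "PG", "PG-13", "R"],
    "games":  ["EC", "E", "E10+", "T", "M"],
}

def is_allowed(profile_privileges: dict, media_type: str, content_rating: str) -> bool:
    """Return True if the content's rating is at or below the user's allowed level."""
    scale = RATING_SCALES[media_type]
    allowed = profile_privileges.get(media_type)
    return allowed is not None and scale.index(content_rating) <= scale.index(allowed)

# Privileges assigned to defined users (cf. the chart of Figure 4A).
son = {"movies": "PG", "games": "E10+"}
dad = {"movies": "R",  "games": "M"}

print(is_allowed(son, "movies", "PG-13"))   # False: restricted for the son
print(is_allowed(dad, "movies", "PG-13"))   # True
```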
`
[0035] Operation 102 can also be used to define animations for identified users. The animations can be used to animate an identified user's avatar on the screen in response to real-world actions detected by the depth-sensing camera and the computer system. For example, in one embodiment, when the speed of movements for a user crosses a threshold velocity, an animation can be applied to the user's avatar. In one embodiment, slow movement from a user can result in cobwebs or spider webs being animated on the user's virtual world avatar. In another embodiment, rapid movement from the user can result in animations emphasizing the user's high rate of speed, such as blurring the avatar or other animations such as motion clouds or sound effects. The user avatar along with the defined privileges and animations can be saved for recall when the depth-sensing camera and the computer system recognize the identified user.
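
A minimal sketch, assuming a tracker that reports successive head positions at known intervals, of how a threshold velocity might select one of the avatar animations described above; the thresholds, units, and animation labels are illustrative guesses, not values from the disclosure.

```python
def pick_avatar_animation(speed: float,
                          slow_threshold: float = 0.2,
                          fast_threshold: float = 2.0) -> str:
    """Map a tracked user's movement speed (assumed metres/second) to an avatar animation."""
    if speed < slow_threshold:
        return "cobwebs"        # very slow movement: cobwebs animated onto the avatar
    if speed > fast_threshold:
        return "motion_blur"    # rapid movement: blur, motion clouds, or sound effects
    return "none"

def estimate_speed(prev_pos, cur_pos, dt: float) -> float:
    """Speed estimated from two successive 3-D head positions reported by the tracker."""
    dx, dy, dz = (c - p for c, p in zip(cur_pos, prev_pos))
    return (dx * dx + dy * dy + dz * dz) ** 0.5 / dt

print(pick_avatar_animation(estimate_speed((0, 0, 2.0), (0.05, 0, 2.0), 1.0)))  # cobwebs
print(pick_avatar_animation(estimate_speed((0, 0, 2.0), (2.5, 0, 2.0), 1.0)))   # motion_blur
```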
`
[0036] Operation 104 can be used to identify and track moving objects that enter the field of view of the depth-sensing camera. If the object moving within the field of view of the camera is recognized as a defined user, operation 106 can automatically apply the privileges and/or animations. In situations where the moving object is not recognized as a defined user, operation 106 can automatically load default privileges and/or animations. Operation 104 can also utilize the computer system and depth-sensing camera to track and animate movements of recognized or unrecognized users over time. In one embodiment, when the depth-sensing camera identifies movement, it can begin creating a log file of the movement over time. In some embodiments, when an identified user enters the field of view of the depth-sensing camera, a log file is created using the preset animation for the user. The log file can be played back, showing the identified user's movement within the field of view of the depth-sensing camera over time.
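
The following toy sketch (not from the disclosure) illustrates one way operations 104 and 106 could be combined in software: unrecognized movers receive a default profile, and movement is appended to a log over time. The depth-grid representation and helper names are assumptions made for the example.

```python
import time

def find_moving_cell(prev_depth, depth, min_change=0.1):
    """Return the (row, col) whose depth changed the most between frames, or None."""
    best, best_change = None, min_change
    for r, row in enumerate(depth):
        for c, value in enumerate(row):
            change = abs(value - prev_depth[r][c])
            if change > best_change:
                best, best_change = (r, c), change
    return best

def track_scene(depth_frames, identify, default_profile):
    """Apply a recognized user's profile, or defaults, and keep a movement log over time."""
    log, prev = [], None
    for depth in depth_frames:
        if prev is not None:
            cell = find_moving_cell(prev, depth)
            if cell is not None:
                profile = identify(depth, cell) or default_profile
                log.append((time.time(), profile["name"], cell))
                # ... apply profile["privileges"] / profile["animations"] here ...
        prev = depth
    return log

# Toy run: something moves across a 3x3 depth grid; nobody is recognized, so defaults apply.
frames = [[[3.0] * 3 for _ in range(3)], [[3.0, 3.0, 3.0], [3.0, 1.5, 3.0], [3.0, 3.0, 3.0]]]
print(track_scene(frames, identify=lambda depth, cell: None,
                  default_profile={"name": "guest", "privileges": {}, "animations": []}))
```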
`
[0037] Figure 1B shows a scene 150 within the field of view 152 of a depth-sensing camera 110 that is connected to a computer system 120, in accordance with one embodiment of the present invention. The depth-sensing camera 110 can capture image data using an RGB image module 114 while the depth data module 112 can capture relative depth data for objects within its field of view 152. In one embodiment, the depth-sensing camera 110 can determine relative depths based on an amount of light reflected off of three-dimensional objects. In some embodiments, the depth-sensing camera includes an array of infrared Light Emitting Diodes (LEDs) capable of pulsing infrared light. The depth data module 112 can determine relative depth of objects within its field of view based on the amount of pulsed infrared light that is reflected back into the depth-sensing camera 110. In other embodiments, image and depth data from the depth-sensing camera 110 is sent to the computer system 120 for processing.
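
Purely as a simplified illustration of the reflected-light principle mentioned above (real depth modules also use pulse timing, reflectivity compensation, and calibration, none of which is specified here), reflected-IR intensity could be mapped to relative depth under an inverse-square assumption:

```python
import math

def relative_depth_map(ir_return, epsilon=1e-6):
    """Convert a grid of reflected-IR intensities (0..1) into *relative* depths.

    Simplifying assumption (not from the patent): the returned intensity of a pulsed
    IR LED falls off roughly with the square of distance, so relative depth can be
    approximated as 1 / sqrt(intensity). Values are comparative, not metric.
    """
    return [[1.0 / math.sqrt(max(i, epsilon)) for i in row] for row in ir_return]

# A bright (near) pixel and a dim (far) pixel.
print(relative_depth_map([[0.81, 0.09]]))   # [[1.111..., 3.333...]]
```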
`
[0038] A focusing module 118 can be included with the depth-sensing camera 110 along with a logic processing unit 116. In some embodiments, the logic processing unit 116 can be used to correlate data from the RGB image module 114 and the depth data module 112. In other embodiments, the logic processing unit 116 can assist in controlling the focusing module 118. The focusing module 118 can change the focus of the RGB image module 114 and the focus of the depth data module 112. Augmenting the intensity and duration of individual infrared LEDs within the infrared LED array can change the focus of the depth data module 112. The image and depth data captured by the depth-sensing camera can be processed in substantially real-time by the computer system 120.
`
`[0039] In addition to accepting and processing image and depth data from the depth-sensing
`
`
`camera 110, the computer system 120 can include or accept input from a variety of other
`
`sources. For example, TV source 122, DVD/Blu-Ray media 124, games 126 and the Internet
`
`128 can be accessed through the computer system 120. Users can select different media
`
`sources 130a/b via a user-interface for the computer system 120.
`
[0040] The scene 150 includes a user 140, a lamp 142 and a sofa 144. The computer system 120 can distinguish the user 140 from stationary objects such as the lamp 142 and the sofa 144. Responding to commands from the computer system 120, the depth-sensing camera 110 can focus on an area 146 around the user 140. In an effort to identify the user 140, the depth-sensing camera 110 can refine its focus to a head area 148 of the user 140. Focusing on the head area 148 can allow the depth-sensing camera to capture image and depth data for the user 140 that can be analyzed and compared to profile data associated with the computer system 120.
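
A rough sketch, under assumed data structures, of how a computer system might separate a moving user from stationary objects by differencing depth frames and then narrow attention to a head area; the grid representation and the "top quarter" heuristic are assumptions made for illustration.

```python
def moving_mask(background_depth, depth, threshold=0.3):
    """Mark cells whose depth differs from a static background by more than threshold."""
    return [[abs(d - b) > threshold for d, b in zip(drow, brow)]
            for drow, brow in zip(depth, background_depth)]

def head_area(mask):
    """Approximate a head region (cf. area 148) as the top rows of the moving region (area 146)."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, hit in enumerate(row) if hit]
    if not rows:
        return None
    top, height = min(rows), max(rows) - min(rows) + 1
    head_rows = max(1, height // 4)                  # assume the head is the top quarter
    return (top, top + head_rows - 1, min(cols), max(cols))

# Background = empty room (lamp, sofa); the current frame adds a nearer object (the user).
background = [[4.0] * 4 for _ in range(4)]
frame = [row[:] for row in background]
for r in range(1, 4):
    frame[r][2] = 2.0                                # a standing user two metres away
print(head_area(moving_mask(background, frame)))     # (1, 1, 2, 2): rows/cols bounding the head
```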
`
[0041] Figures 2A-2D illustrate exemplary screens that can be used to define a user profile, including an avatar, in accordance with one embodiment of the present invention. Figure 2A shows an exemplary screen 200 for the computer system where user profiles for dad 202, daughter 204 and son 206 have already been created. Also shown on the screen 200 is button 208 that allows a new user profile to be created. Figure 2B illustrates an exemplary screen 200b as a result of selecting button 208, in accordance with one embodiment of the present invention. Screen 200b displays different types of user profiles that can be created for one embodiment of the present invention. For example, profiles based on people can be created by selecting human icon 210. In embodiments where a user wishes to track the movement of pets within a room, selecting dog icon 212 or cat icon 214 can create dog or cat profiles. Additional types of profiles can be included and those listed should not be construed as limiting.
`
[0042] Figure 2C shows a representative screen 200c as a result of selecting human icon 210, in accordance with one embodiment of the present invention. Screen 200c allows a user to select between a male icon 216 or a female icon 218. In this example, the user chooses female icon 218. Figure 2D illustrates two different screens 200d/e for configuring an avatar in accordance with various embodiments of the present invention. Screen 200d illustrates a menu system that could be used to configure an avatar. As illustrated, the menu system can include selections for name, height, body type, eyes, hair, nose and mouth. As should be understood by those skilled in the art, each menu selection shown on screen 200d can call up another screen or sub-menu that allows users finer granularity for configuring an avatar.
`
[0043] Screen 200e illustrates an alternative avatar customization screen in accordance with one embodiment of the present invention. Using the alternative avatar customization, the depth-sensing camera can be used to capture images of the user. The captured images of the user can then be processed by the computer system to automatically create an avatar based on the captured images of the user. In some embodiments, the automatically created avatar is a baseline avatar where the user can modify features of the automatically created avatar. In both embodiments, a user can customize or tweak the self-created or automatically created avatar with clothing or fashion accessories.
`
`
`
`
[0044] Figure 2E is an exemplary screen 200f illustrating completion of adding an additional user mom 220, in accordance with one embodiment of the present invention. Screen 200f is the result of completing the creation of the mom 220 user profile as described in Figures 2A-2D. This results in the screen 200f showing user profiles for dad 202, daughter 204, son 206 and mom 220. Figure 2E also illustrates button 208 being selected to add another user profile.
`
`[0045] Figure 2F-1 illustrates facial features captured by the depth-sensing camera that can be
`
`used to recognize users in accordance with one embodiment of the present invention. During
`
`the configuration of a user's avatar, the depth-sensing camera can be used to capture images
`
`and depth data of a user's head 250 and facial features. The images and associated depth data
`
`
`can be analyzed by the computer system for identifying characteristics that will allow the
`
`computer system to recognize the user.
`
[0046] Various identifying characteristics can be used including, but not limited to, distances between facial features, relative sizes of facial features and relative location of facial features. In other embodiments, features on the user's head can be identified such as the relative location and size of ears. For example, depth data, shown in Figure 2F as distances in the Z-plane, can be used to determine and recognize Z1, the distance between the tip of a user's nose and the user's upper lip. Depth data can also be used to determine and recognize Z2, the distance between the tip of a user's nose and their eyes. Similarly, image data can be used to recognize the distance between a user's eyes, shown as distance X in Figure 2F-1. Likewise, the distance B, between a user's nose and their mouth, can be measured and used as an identifying characteristic. The image data and the associated depth data can determine ratios between depth data and measurements from image data in order to identify and recognize users.
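
For illustration, the measurements named above (Z1, Z2, X, and B) and ratios between them could be computed from landmark points carrying image-plane coordinates plus depth; the landmark keys and sample coordinates below are hypothetical, not values from the application.

```python
import math

def facial_signature(landmarks):
    """Compute the example measurements described above from 3-D landmark points
    (x, y from image data; z from depth data), then form scale-invariant ratios.

    Expected keys: 'nose_tip', 'upper_lip', 'left_eye', 'right_eye', 'mouth'.
    """
    def dz(a, b):                       # depth-only separation (Z-plane)
        return abs(landmarks[a][2] - landmarks[b][2])
    def dist(a, b):                     # full 3-D separation
        return math.dist(landmarks[a], landmarks[b])

    z1 = dz("nose_tip", "upper_lip")                    # nose tip to upper lip (depth)
    z2 = dz("nose_tip", "left_eye")                     # nose tip to eyes (depth)
    x = dist("left_eye", "right_eye")                   # eye-to-eye distance (image)
    b = dist("nose_tip", "mouth")                       # nose to mouth
    return {"Z1/X": z1 / x, "Z2/X": z2 / x, "B/X": b / x}

landmarks = {
    "nose_tip":  (0.00, 0.00, 0.50),
    "upper_lip": (0.00, -0.02, 0.52),
    "left_eye":  (-0.03, 0.04, 0.53),
    "right_eye": (0.03, 0.04, 0.53),
    "mouth":     (0.00, -0.04, 0.52),
}
print(facial_signature(landmarks))
```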
`
[0047] Figure 2F-2 illustrates capturing image and depth data of a user head 250 in a variety of positions, in accordance with one embodiment of the present invention. In some embodiments, when creating a user profile, the user can be prompted (e.g., by a GUI, voice/sound commands, or text) to turn or rotate their head into a variety of positions. This allows the depth-sensing camera to capture image and depth data for the user's entire head, or at least most of the front part of the head having the identifiable face features.
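
A minimal sketch of such a prompted capture routine, with stand-in prompt and capture callables so it runs without camera hardware; the pose list and function names are assumptions made for the example.

```python
POSES = ["face forward", "turn right", "turn left", "tilt up", "tilt down"]

def capture_head_poses(prompt, capture, poses=POSES):
    """Prompt the user (GUI, voice/sound, or text) through a list of head poses and
    collect an (image, depth) sample for each, keyed by pose label."""
    samples = {}
    for pose in poses:
        prompt(f"Please {pose} and hold still")
        samples[pose] = capture()          # one (image, depth) pair from the camera
    return samples

# Stand-ins: printing prompts and returning placeholder frames.
samples = capture_head_poses(prompt=print, capture=lambda: ("image", "depth"))
print(sorted(samples))
```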
`
[0048] The computer system can analyze the image and depth data to create a wire-frame model of the user's head. In some embodiments, the wire-frame model of the user's head can be used as part of the user's virtual-world avatar. As will be discussed in further detail below, the computer system can analyze the wire-frame model to determine user-specific ratios between facial and head characteristics at a variety of angles. The specific facial features and measurements that have been discussed are intended to be exemplary and should not be considered limiting. The image and depth data can be analyzed for additional measurements that can be used for identifying and recognizing a user.
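
One way (assumed here, not specified in the application) to derive pose-dependent ratios from a wire-frame head model is to rotate its landmark vertices through the yaw/tilt combinations of the Figure 2G matrix and recompute the ratios from the projected points:

```python
import math

def rotate_y(p, yaw):        # turn head left/right (columns of Figure 2G)
    x, y, z = p
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x + s * z, y, -s * x + c * z)

def rotate_x(p, tilt):       # tilt head up/down (rows of Figure 2G)
    x, y, z = p
    c, s = math.cos(tilt), math.sin(tilt)
    return (x, c * y - s * z, s * y + c * z)

def projected_ratio(model, yaw, tilt):
    """Project rotated wire-frame landmarks onto the image plane (drop z) and return
    a scale-invariant ratio of eye spacing to eye-nose distance for that pose."""
    pts = {k: rotate_x(rotate_y(p, yaw), tilt) for k, p in model.items()}
    def d2(a, b):
        (ax, ay, _), (bx, by, _) = pts[a], pts[b]
        return math.hypot(ax - bx, ay - by)
    return d2("left_eye", "right_eye") / d2("nose", "left_eye")

def pose_matrix(model, yaws, tilts):
    """Tabulate the ratio for each (tilt, yaw) combination, analogous to Figure 2G."""
    return {(tilt, yaw): projected_ratio(model, yaw, tilt)
            for tilt in tilts for yaw in yaws}

# A toy wire-frame head model (landmark vertices in head-centred coordinates).
model = {"left_eye": (-0.03, 0.04, 0.0), "right_eye": (0.03, 0.04, 0.0),
         "nose": (0.0, 0.0, 0.03)}
yaws = [math.radians(a) for a in (-60, -30, 0, 30, 60)]
tilts = [math.radians(a) for a in (-20, 0, 20)]
print(len(pose_matrix(model, yaws, tilts)))   # 15 tabulated poses
```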
`
[0049] In other embodiments, the depth-sensing camera can be used to capture image and depth data of a user's entire body in various poses. Similar to facial recognition, the computer system can analyze the images and associated depth data to identify the user. As will be discussed in more detail with Figure 8 and Figure 9, stationary objects within a scene can be recognized by the computer system and assist in identifying and tracking users by providing relative positioning of users within the scene.
`
[0050] Figure 2G illustrates a matrix of various poses of a modeled user's face 251 constructed, at least in part, from various views captured by the depth-sensing camera, in accordance with one embodiment of the present invention. In some cases, the constructed poses are generated by approximating dimensional and depth data (e.g., using the data captured in Figure 2F-2).
`
[0051] When the system is in use or operation, the depth-sensing camera may not always obtain a straight-forward view of a user because users can enter the field of view of the depth-sensing camera from a variety of angles. Thus, in order to identify and track a user, the computer system can use the wire-frame model of a user's head to extrapolate various ratios of facial and head characteristics for a user's head in a variety of positions.
`
[0052] Row 262 illustrates a sequence of images where the wire-frame model of the user's head is captured as if the user turned their head from right to left without tilting their head. Row 260 shows a similar sequence of images where the wire-frame model is positioned so the head is tilted backwards, while in row 264 the wire-frame model is tilted forward. Column 272 shows the wire-frame model face forward for the respective rows, while column 270 and column 274 show image data for the user in respective right and left one-third views to the depth-sensing camera. Similarly, column 268 and column 276 show the user in respective right and left two-thirds views, while column 266 and column 278 show the user in respective right and left profile to the depth-sensing camera. The matrix illustrated in Figure 2G has been simplified and should be considered exemplary. Real world embodiments can sample video images at various frame rates to compile more or less image and depth data for an individual user. As will be discussed below, in the different views of the user, the image and depth data can be analyzed for the relative location of various facial features. In one embodiment, a right ear E1, right eye e1, nose N, left eye e2 and left ear E2 can be identified and tracked from the image and depth data.
`
[0053] Figure 2H is a sequence of three images showing changes (e.g., delta values) in relative position of various facial features in accordance with one embodiment of the present invention. In profile 280 the user is directly facing the depth-sensing camera. From this angle, the position of the user's nose N can be determined relative to the position of the user's eyes e1 and e2, along with ears E1 and E2. On some users, the relative position of ears E1 and E2 may not be able to be determined in profile 280. In these instances, identification can be determined from the relative position of the user's eyes and nose. In order to identify a user, the relative position of the facial features and ears of the user can be compared to the matrix of wire-frame models. Should the user be identified, the computer system can automatically apply the appropriate user profile. Additionally, in some embodiments, the computer system can monitor image and depth data from the depth-sensing camera to monitor and track the position of the user's head. In still further embodiments, the image and depth data can also track eye movements of the user to determine where the user is looking within the space.
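
As an illustrative sketch of this comparison step (the ratio names, stored values, and tolerance are invented for the example), an observed set of feature ratios can be matched against each saved user's pose-indexed matrix, and the closest match, if close enough, selects the profile to apply:

```python
def best_match(observed, stored_matrices, tolerance=0.15):
    """Compare observed feature ratios against each user's matrix of pose-indexed
    ratios; return the best-matching user if the error is within tolerance."""
    best_user, best_error = None, tolerance
    for user, matrix in stored_matrices.items():
        for pose, reference in matrix.items():
            error = max(abs(observed[k] - reference[k]) for k in reference)
            if error < best_error:
                best_user, best_error = user, error
    return best_user

# Stored matrices: per user, per pose, a few illustrative ratios of eye/nose geometry.
stored = {
    "dad": {("front",): {"eye_span/nose_eye": 1.20, "nose_mouth/nose_eye": 0.85},
            ("left",):  {"eye_span/nose_eye": 0.95, "nose_mouth/nose_eye": 0.85}},
    "mom": {("front",): {"eye_span/nose_eye": 1.05, "nose_mouth/nose_eye": 0.70}},
}
observed = {"eye_span/nose_eye": 1.18, "nose_mouth/nose_eye": 0.83}
print(best_match(observed, stored))   # 'dad' -> the system would apply dad's profile
```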
`
[0054] Profile 282 illustrates how rotating the user's head 250 to the left changes the relative position of facial features for the user. Comparing profile 280 and profile 282, the user's left ear E2 is no longer visible to the dep