US007911513B2

(12) United States Patent
Garrison et al.

(10) Patent No.: US 7,911,513 B2
(45) Date of Patent: Mar. 22, 2011
(54) SIMULATING SHORT DEPTH OF FIELD TO MAXIMIZE PRIVACY IN VIDEOTELEPHONY

(75) Inventors: William J. Garrison, Warminster, PA (US); Albert Fitzgerald Elcock, Havertown, PA (US)

(73) Assignee: General Instrument Corporation, Horsham, PA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 456 days.

(21) Appl. No.: 11/737,813

(22) Filed: Apr. 20, 2007

(65) Prior Publication Data
US 2008/0259154 A1, Oct. 23, 2008

(51) Int. Cl.
H04N 5/262 (2006.01)
(52) U.S. Cl. .......... 348/239
(58) Field of Classification Search .......... 348/586, 348/239
See application file for complete search history.
(56) References Cited

U.S. PATENT DOCUMENTS

5,384,615 A       1/1995    Hsieh et al.
6,148,113 A       11/2000   Wolverton et al.
6,590,571 B2      7/2003    Laffargue et al.
6,950,130 B1 *    9/2005    Qian .......... 348/239
7,227,567 B1 *    6/2007    Beck et al. .......... 348/14.07
2002/0079425 A1      6/2002    Rhoads
2004/0120584 A1 *    6/2004    Jang et al. .......... 382/232
2006/0193509 A1 *    8/2006    Criminisi et al. .......... 382/154
2007/0053513 A1      3/2007    Hoffberg
2007/0237393 A1 *    10/2007   Zhang et al. .......... 382/173
OTHER PUBLICATIONS

Erik Hjelmas et al., "Face Detection: A Survey", Computer Vision and Image Understanding 83, 236-274 (2001).
Harold M. Merklinger, "A Technical View of Bokeh", Photo Techniques, May/Jun. 1997, 5 pages.
Gary Bradski et al., "Learning-Based Computer Vision with Intel's Open Source Computer Vision Library," Intel Technology Journal, Vol. 9, Issue 02, May 19, 2005, pp. 119-130.
PCT International Search Report and Written Opinion for PCT/US2008/058338, dated Jun. 30, 2008.

* cited by examiner

Primary Examiner - James M Hannett
(74) Attorney, Agent, or Firm - Stewart M. Wiener
(57) ABSTRACT

An arrangement for simulating a short depth of field in a captured videophone image is provided in which the background portion of the image is digitally segregated and blurred to render it indistinct. Thus, the displayed video of a user in the foreground is kept in focus while the background appears to be out of focus. Image tracking or fixed templates are used to segregate an area of interest that is kept in focus from the remaining captured video image. Image processing techniques are applied to groups of pixels in the remaining portion to blur that portion of the captured video image. Such techniques include the application of a filter that is alternatively selected from convolution filters in the spatial domain (e.g., mean, median, or Gaussian filters) or frequency filters in the frequency domain (e.g., low-pass or Gaussian filters). User-selectable control is optionally implemented for controlling the type of foreground/background segregation technique utilized (i.e., dynamic face-tracking or fixed template shape), the degree of blurring applied to the background, and on/off control of the background blurring.

20 Claims, 9 Drawing Sheets
[FIG. 4 (arrangement 400): pictorial view of a captured scene. Recoverable labels: MIDDLE GROUND (452) and BACKGROUND (462); reference numerals 413, 418, 435, and 442. Drawing not reproduced.]
[FIG. 5 (Sheet 3 of 9): pictorial view of a videophone; recoverable reference numeral 532 (operating controls). Drawing not reproduced.]
`


[FIGS. 12-14 (drawings not reproduced): FIG. 13 shows an image array (elements I11 through I69) and a kernel array (elements K11 through K23) used to illustrate convolution; FIG. 14 shows the mean-filter kernel, with each entry equal to 1/9. No content of FIG. 12 is recoverable.]

[FIG. 15: simplified diagram of the videophone architecture 1500; recoverable labels include "VIDEOPHONE ARCHITECTURE," "HARDWARE LAYER," and reference numeral 1516. Drawing not reproduced.]

[FIG. 16 (flowchart, Sheet 8 of 9), recoverable steps and reference numerals: START (1605); 1611: video image captured by camera with long depth of field; 1616: spatially segregate and buffer a portion of the captured video image; 1620: image process the segregated video portion to increase circle of confusion; 1622: generate composite video image; 1625: refresh buffer with composite video image; 1631 (shown dashed, optional): render composite video image onto display screen to confirm privacy enablement to user; 1635: transmit composite image to remote videophone; END (1640).]

SIMULATING SHORT DEPTH OF FIELD TO MAXIMIZE PRIVACY IN VIDEOTELEPHONY

BACKGROUND
Current videophones use cameras having a long depth of field, which results in the subject matter in a scene captured by the camera being in focus from foreground to background. This compares to video images captured by cameras having a shorter depth of field, where subject matter in the foreground appears in focus while subject matter in the background of the scene appears out of focus.

Long depth of field in videophones generally results from a small digital imaging sensor size relative to the lens aperture, in combination with a fixed focal length and shutter speed. These particular design parameters are selected in order to provide good videophone image quality while maintaining low component costs, which is important for videophones sold into the highly competitive consumer electronics market.

Consumer-market videophones provide excellent performance overall, and the long depth of field provided is normally acceptable in many settings. Not only does it provide a perception that the videophone image is sharp and clear overall, but a videophone can be used in a variety of settings without the user worrying that some portions of a captured scene will be out of focus. For example, a group of people on one end of a videophone call can have some participants positioned close to the camera while others are farther away. Another user may wish to use the videophone to show something that needs to be kept at some distance from the camera.

However, the videophone's long depth of field can present issues in some situations. Some users may find the details in the background of the received video image to be distracting. Others might be uncomfortable that their videophone captures too clear a view of themselves, their home, or their surroundings, and represents some degree of intrusion on their privacy. And even for those users who fully embrace the videophone's capabilities, it is possible that details of a user's life may be unintentionally revealed during a videophone call. For example, a person might not realize that a videophone call is taking place and walk through the background in a state of attire that is inappropriate for viewing by people outside the home.

One current solution to address privacy concerns includes placing controls on the videophone that let a user turn the videophone camera off while keeping the audio portion of the call intact. While effective in many situations, this is an all-or-nothing solution that not all users accept, since the loss of the video function removes a primary feature provided by the videophone. In addition, such user controls do not prevent the accidental capture of undesirable or inappropriate content by the videophone.
SUMMARY

An arrangement for simulating a short depth of field in a captured videophone image is provided in which the background portion of the image is digitally segregated and blurred to render it indistinct. As a result, the displayed video image of a videophone user in the foreground is kept in focus while the background appears to be out of focus.

In various illustrative examples, image detection and tracking techniques are used to dynamically segregate a portion of interest that is kept in focus, such as a person's face or face-and-shoulder area, from the remaining video image. Image processing techniques are applied to groups of pixels in the remaining portion to blur that portion and render it indistinct. Such techniques include the application of one or more filters selected from convolution filters in the spatial domain (e.g., mean, median, or Gaussian filters) or frequency filters in the frequency domain (e.g., low-pass or Gaussian filters). Fixed templates are also alternatively utilizable to segregate the portions of the captured video which are respectively focused and blurred. The templates have various shapes, including those that are substantially rectangular, oval, or arch-shaped. For example, application of the oval-shaped template keeps the portion of the captured video image falling inside a fixed oval in focus, and the remaining portion of the image falling outside the oval is then digitally blurred.

User-selectable control is optionally provided to enable control of the type of foreground/background segregation technique utilized (i.e., dynamic object detection/tracking or fixed template shape), the degree of blurring applied to the background, and on/off control of the background blurring.

The simulated short depth of field provided by the present arrangement advantageously enables a high degree of privacy to be implemented while preserving the intrinsic value of videophone telephony by keeping the video component of the videophone call intact. The privacy feature is provided using economically implemented digital image processing techniques that do not require modifications or additions to the camera hardware, which would add undesirable costs. In addition, the blurred background portion of the video image appears natural to the viewer because short depth of field images are in common use in television, movies, and other media presentations. Thus, privacy is enabled in a non-intrusive manner that does not interfere with the videophone call or bring attention to the fact that privacy is being utilized.
DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a camera and two black and white patterned targets located in the camera's field of view;
FIGS. 2 and 3 show images captured by the camera to illustrate depth of field;
FIG. 4 is a pictorial view of an illustrative arrangement showing two videophone users;
FIG. 5 is a pictorial view of one of the videophones shown in FIG. 4;
FIG. 6 shows an illustrative screen shot of a video image having a long depth of field that is rendered by a videophone;
FIG. 7 shows an illustrative screen shot of a video image with a simulated short depth of field that is rendered by a videophone in accordance with the principles of the present arrangement;
FIG. 8 is an illustration showing an illustrative segregation of a captured video image into a portion of interest that is kept in focus and a remaining portion that is blurred using a variety of alternative image processing techniques;
FIGS. 9-11 show various illustrative fixed templates, each of which segregates a portion of interest in a video image that is kept in focus while the remaining portions are blurred;
FIG. 12 is a diagram of an illustrative template having a transition area between the portion of interest that is kept in focus and the blurred portion;
FIG. 13 shows illustrative image and kernel arrays used to perform convolution attendant to application of digital filtering;
FIG. 14 is an illustrative kernel used with a mean (i.e., averaging) digital filter;
FIG. 15 is a simplified diagram of an illustrative videophone architecture;
FIG. 16 is a flowchart of an illustrative method for simulating depth of field effects in a video image; and
`

FIG. 17 shows an illustrative screen shot of a video image with a simulated short depth of field that is rendered by a videophone to provide positive feedback to a user that privacy is enabled in accordance with the principles of the present arrangement.

Like reference numerals indicate like elements throughout the drawings.
DETAILED DESCRIPTION

Various compositional techniques are employed in traditional photography to emphasize the primary subject matter in a scene. One such technique is known as "Bokeh", a Japanese term that translates as "fuzzy" or "dizziness." Bokeh refers to the use of out-of-focus highlights or areas in a rendered image. Bokeh techniques may be used for a variety of functional, artistic, or aesthetic reasons in which an attribute known as "depth of field" is manipulated to provide the desired effect where the primary subject is kept in focus while the remaining portion of the rendered image is out of focus.

Depth of field in both still and video photography is determined by lens aperture, film negative/image sensor size (in traditional/digital imaging, respectively), and focal length. Traditional 35 mm film has a short depth of field because the negative size is large compared with the lens aperture. By comparison, to minimize costs, most videophones targeted at the consumer market use a very small digital image sensor along with an optics package that includes a fixed focal length and shutter speed. Thus, traditional techniques used to shorten depth of field by adjusting the aperture number (i.e., f/stop) down below the lens's maximum aperture and reducing shutter speed to compensate for exposure are not generally applicable to videophone cameras.

Depth of field is the range of distance around the focal plane which is acceptably sharp. The depth of field varies depending on camera type, aperture, and focusing distance, although the rendered image size and viewing distance can influence the perception of it. The depth of field does not abruptly change from sharp to unsharp, but instead occurs as a gradual transition. In fact, everything immediately in front of or in back of the focusing distance begins to lose sharpness, even if this is not perceived by the viewer or by the resolution of the camera.
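The dependence of depth of field on these parameters can be made concrete with the standard thin-lens relations below. These are conventional textbook formulas added here for reference, not equations from the patent itself.

```latex
% Hyperfocal distance for focal length f, f-number N,
% and acceptable circle of confusion diameter c:
H = \frac{f^{2}}{N c} + f
% Near and far limits of acceptable sharpness when the lens
% is focused at distance s (measured from the lens), s < H:
D_{\mathrm{near}} = \frac{s (H - f)}{H + s - 2f}, \qquad
D_{\mathrm{far}} = \frac{s (H - f)}{H - s}
```

Because H scales with f squared divided by c, the very short focal lengths of small-sensor videophone cameras dominate, H becomes small, and nearly the whole scene falls between the near and far limits, which is why such cameras exhibit a long depth of field.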
Because there is no critical point of transition, a term called the "circle of confusion" is used to define how much a particular point needs to be blurred in order to be perceived as being unsharp. The circle of confusion is an optical spot caused by a cone of light from a lens not coming to a perfect focus when imaging a point source. Objects with a small circle of confusion show a clear and clean dot and are in focus. Objects with a large circle of confusion show a dot with blurry edges and are out of focus.

Accordingly, the present arrangement provides a person's face or other area of interest in the foreground of the rendered videophone image with a small circle of confusion. The remaining portion of the image is rendered with a large circle of confusion. Further discussion of Bokeh techniques, circle of confusion, and sample images is available in H. Merklinger, A Technical View of Bokeh, Photo Techniques, May/June (1997).
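To connect the circle of confusion to camera parameters, the following minimal Python sketch computes the blur-disk diameter for an out-of-focus point using the thin-lens approximation. The function name and the example numbers are illustrative assumptions, not values from the patent.

```python
def blur_disk_diameter_mm(f_mm, n_stop, focus_mm, subject_mm):
    """Approximate circle-of-confusion diameter (mm) on the sensor for a
    point at subject_mm when a lens of focal length f_mm and f-number
    n_stop is focused at focus_mm. Thin-lens model; distances in mm."""
    aperture = f_mm / n_stop                  # physical aperture diameter
    magnification = f_mm / (focus_mm - f_mm)  # image magnification at focus
    # The blur disk grows with the relative defocus |s2 - s1| / s2.
    return aperture * magnification * abs(subject_mm - focus_mm) / subject_mm

# A hypothetical videophone lens: f = 4 mm at f/2.8, focused at 1 m.
# A background object 3 m away blurs to only ~0.004 mm on the sensor,
# on the order of a pixel: optically sharp, so to hide the background
# the blur must be synthesized digitally.
print(blur_disk_diameter_mm(4.0, 2.8, 1000.0, 3000.0))
```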
FIGS. 1-3 are provided to illustrate the application of the principles of depth of field to the present arrangement. FIG. 1 is a pictorial illustration showing a camera 105 having two black and white patterned targets 112 and 115 within its field of view. As shown, target 112 is in the foreground of the camera's field of view and target 115 is in the background. FIG. 2 shows an example of the appearance of an image with a long depth of field taken by camera 105. As shown, targets 112 and 115 are both in focus. By comparison, FIG. 3 shows an example of an image having a shorter depth of field. Here, the target 112 in the foreground is in focus, but target 115 in the background is no longer in focus and appears blurry.
Turning to FIG. 4, there is shown an illustrative arrangement 400 in which two videophone users are engaged in a video telephony session. User 405 is using videophone 408 in home 413. Videophone 408 is coupled over a network 418 to videophone 426 used by user 430 in home 435. Videophones generally provide better image quality, with both higher frame rates and resolution, when calls are carried over broadband networks, although some videophones are configured to work over regular public switched telephone networks ("PSTNs"). Broadband network services are commonly provided by cable, DSL (Digital Subscriber Line), and satellite service providers. Videophones are normally used in pairs, where each party on the call uses a videophone.

FIG. 5 is a pictorial view of the videophone 408 shown in FIG. 4. Videophone 408 is representative of videophones that are available to the consumer market. Videophone 408 includes a display component 502 that is attached to a base 505 with a mounting arm 512. Base 505 is configured to allow videophone 408 to be positioned on a desk or table, for example. A camera 514 is disposed in the display component, having a lens that is oriented towards the videophone user, as shown. A microphone (not shown) is also positioned near camera 514 to capture voices and other sounds associated with a videophone call.
Camera 514 is commonly implemented using a CCD (charge-coupled device) image sensor that captures images, formed from a multiplicity of pixels (i.e., discrete picture elements), of the videophone user and surrounding area. The images from camera 514 are subjected to digital signal processing in videophone 408 to generate a digital video image output stream that is transmitted to the videophone 426 on the other end of the videophone call. In this illustrative example, the digital video image output stream is a compressed video stream compliant with the MPEG-4 video standard defined by the Moving Picture Experts Group with the International Organization for Standardization ("ISO"). In alternative embodiments, other formats and/or video compression schemes are usable, including one selected from MPEG-1, MPEG-2, MPEG-7, MPEG-21, VC-1 (also known as Society of Motion Picture and Television Engineers SMPTE 421M), DV (Digital Video), DivX created by DivX, Inc. (formerly known as DivXNetworks Inc.), International Telecommunications Union ITU H.261, ITU H.263, ITU H.264, WMV (Windows Media Video), RealMedia, RealVideo, Apple QuickTime, ASF (Advanced Streaming Format, also known as Advanced System Format), AVI (Audio Video Interleave), 3GPP (3rd Generation Partnership Project), 3GPP2 (3rd Generation Partnership Project 2), JPEG (Joint Photographic Experts Group), or Motion-JPEG.

Display component 502 includes a screen 516 that comprises a receiving picture area 520 and a sending picture area 525. The receiving picture area 520 of screen 516 is arranged to display the video image of the user 430 captured by a camera in videophone 426 shown in FIG. 4. The sending picture area 525 displays a relatively smaller image of the user 405 captured by the camera 514. Sending picture area 525 thus enables user 405 to see the picture of himself that is being sent and seen by the other user 430. Such feedback is important to enable user 405 to place himself in the field of view of camera 514 with the desired positioning and framing within the captured video image.
Mounting arm 512 is arranged to position the display component 502 and camera 514 at a distance above the base 505 to provide comfortable viewing of the displayed video image and to position the camera 514 with a good field of view of the videophone user. Disposed in mounting arm 512 are videophone operating controls 532, which are provided for the user to place videophone calls, set user preferences, adjust videophone settings, and the like.
Referring again to FIG. 4, videophone user 430 is positioned in the foreground of a scene 440 captured by the camera disposed in videophone 426. The foreground is indicated by reference numeral 442. Similarly, as shown, a houseplant 450 is in the middle ground 452 of the scene, and a family member 460 is in the background 462.

FIG. 6 shows an illustrative screen shot 600 of a video image of the captured scene 440 in FIG. 4 as rendered onto screen 516 by the videophone 408. As shown, the rendered image appears with a long depth of field, as user 430, houseplant 450, and family member 460 are all in focus. As noted above, such a long depth of field is normally provided for video images rendered by conventional videophones. And such clear imaging of all the subject matter in the captured scene may present privacy concerns.

In comparison to the conventional long depth of field video image shown in FIG. 6, FIG. 7 shows an illustrative screen shot 700 of a video image having a simulated short depth of field as provided by the present arrangement. The video image shown in screen shot 700 is of the same captured scene 440 as rendered onto screen 516 by the videophone 408. Here, only the image of the user 430 in the foreground 442 is kept in focus, while the houseplant 450 and family member 460 are blurred and rendered indistinct, as indicated by the dot patterns in FIG. 7.
FIG. 8 is an illustration showing an illustrative segregation of a captured video image into a region of interest 805 that is kept in focus and a remaining portion 810 that is blurred using one of several alternative image processing techniques (as described below in the text accompanying FIGS. 13 and 14). In this illustrative example, object detection techniques are utilized in which a specific feature, in this case the user's face, head, and shoulders, is dynamically detected in the captured video image and tracked as the user moves and/or changes position during the course of the videophone call. While FIG. 8 shows an area of interest comprising the user's face, head, and shoulder region, other areas of interest may also be defined for detection and tracking. For example, the area of the image kept in focus using a dynamic detection and tracking technique may be limited to just the user's face area.
Object detection, and in particular face detection, is an important element of various computer vision areas, such as image retrieval, shot detection, video surveillance, etc. The goal is to find an object of a pre-defined class in a video image. A variety of conventional object detection techniques for video images are usable, depending on the requirements of a specific application. Such techniques include feature-based approaches, which locate face geometry features by extracting certain image features, such as edges, color regions, textures, contours, video motion cues, etc., and then using some heuristics to find configurations and/or combinations of those features specific to the object of interest.

Other object detection techniques use image-based approaches, in which the location of objects such as faces is essentially treated as a pattern recognition problem. The basic approach in recognizing face patterns is via a training procedure which classifies examples into face and non-face prototype classes. Comparison between these classes and a 2D intensity array (hence the name image-based) extracted from an input image allows the decision of face existence to be made. Image-based approaches include linear subspace methods, neural networks, and statistical approaches.
An overview of these techniques and a discussion of others may be found in E. Hjelmas and B. K. Low, Face Detection: A Survey, Computer Vision and Image Understanding 83, 236-274 (2001). In addition, a variety of open source code sources are available to implement appropriate face-detection algorithms, including the OpenCV computer vision library from Intel Corporation, which provides both low-level and high-level APIs (application programming interfaces) for face detection using a statistical model. This statistical model, or classifier, takes multiple instances of the object class of interest, or "positive" samples, and multiple "negative" samples, i.e., images that do not contain objects of interest. Positive and negative samples together make a training set. During training, different features are extracted from the training samples and distinctive features that can be used to classify the object are selected. This information is "compressed" into the statistical model parameters. If the trained classifier does not detect an object (misses the object) or mistakenly detects an absent object (i.e., gives a false alarm), it is easy to make an adjustment by adding the corresponding positive or negative samples to the training set. More information on Intel OpenCV face detection may be found in G. Bradski, A. Kaehler, and V. Pisarevsky, Learning-Based Computer Vision with Intel's Open Source Computer Vision Library, Intel Technology Journal, Vol. 9, Issue 2 (2005).
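As a rough illustration of how such a pretrained classifier is used in practice, the Python sketch below runs OpenCV's stock Haar-cascade face detector on one captured frame. It assumes a modern opencv-python installation; the file name is a placeholder, and this is not the patent's own implementation.

```python
import cv2

# Pretrained frontal-face classifier (a statistical model built from
# positive and negative training samples, as described above).
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("captured_frame.png")        # one frame from the camera
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # detector uses intensity only

# Each hit is an (x, y, w, h) box; such boxes define the portion of
# interest that is kept in focus while everything else is blurred.
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1,
                                             minNeighbors=5):
    print("face region:", x, y, w, h)
```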
FIGS. 9-11 show illustrative examples of fixed templates that are applied to a captured video image to segregate the portion of interest from the remaining portion. By comparison to the object detection technique, where the shape of the target portion dynamically varies as the subject moves, the templates in FIGS. 9-11 use a fixed border between the target and remaining portions. Use of fixed templates may provide a less complex implementation of the segregation aspect of the present arrangement for implementing privacy, while maintaining the majority of its functionality, which may be beneficial in some scenarios. In an optional arrangement, control is provided to the videophone user to select from various templates to find a template that best matches the particular use and circumstances. In other arrangements, the relative sizes of the target and remaining portions may be adjusted, either in fixed increments or continuously within a fixed range.
As shown, template 900 in FIG. 9 has a substantially rectangular target portion 905 that is disposed in an area that fills approximately the central two-thirds of the screen. Target portion 905 is positioned to allow the remaining portion 910 to fill the top and sides of the screen. This template makes use of the observation that most videophone users position themselves to fill the central portion of the videophone camera's field of view. Accordingly, the areas of potential privacy concern will tend to be at the tops and sides of the captured image. As noted above, in optional arrangements the relative size between the target portion 905 and remaining portion 910 may be configured to be user adjustable, as indicated by the dashed rectangle 925 in FIG. 9.
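In implementation terms, a fixed template reduces to a binary mask. The sketch below builds a mask in the spirit of template 900: a rectangle covering roughly the central two-thirds of the frame, anchored at the bottom so that only the top and sides are blurred. The function name and dimensions are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def rectangular_template(height, width, fraction=2.0 / 3.0):
    """Boolean mask: True inside the in-focus target portion (a centered
    rectangle touching the bottom edge), False in the remaining portion
    that will be blurred."""
    mask = np.zeros((height, width), dtype=bool)
    rect_h = int(height * fraction)
    rect_w = int(width * fraction)
    left = (width - rect_w) // 2
    mask[height - rect_h:, left:left + rect_w] = True  # bottom-center anchor
    return mask

mask = rectangular_template(480, 640)  # e.g., one VGA videophone frame
```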
FIG. 10 shows a template 1000 that is similar to that shown in FIG. 9 (by occupying approximately the central two-thirds of the screen) except that the top of the target portion 1005 is curved. Thus, the target portion 1005 is substantially arch-shaped. Use of this shape increases the area of the remaining portion 1010 and may provide a better fit between in-focus and blurred portions for a particular user's application.

FIG. 11 shows a template 1100 in which the target portion is substantially oval-shaped. In this case, the remaining portion 1110 surrounds the target portion 1105, so that privacy
blurring will be performed at the bottom center of the rendered image (unlike templates 900 and 1000), along with the top and side areas of the screen.

FIG. 12 shows an illustrative template 1200 having a transition area 1202 between the target portion 1205, in which focus is kept intact, and the remaining portion 1210, which is blurred using the techniques described herein. The transition area 1202 is configured with an intermediate degree of circle of confusion between the target portion 1205 and remaining portion 1210. This enables a softer transition between focused and blurred areas to be achieved, which may help to make the rendered image appear more natural in some situations. The size of the transition area 1202 is a design choice that will normally be selected according to the requirements of a particular application. Although the transition area is shown being used with a template having an oval target portion, it is emphasized that such a transition area may be used with any target portion shape in both fixed template and dynamic object detection embodiments.

Once a captured video image is segregated into a portion of interest and a remaining portion, digital image processing is performed to increase the circle of confusion for groups of pixels in the remaining portion to thereby blur it and render it indistinct. In this illustrative example, the digital image processing comprises filtering in either the spatial domain or the frequency domain.

The spatial domain is normal image space, in which an image is represented by intensities at given points in space. The spatial domain is a common representation for image data. A convolution operator is applied to blur the pixels in the remaining portion. Convolution is a simple mathematical operation which is fundamental to many common image processing operations. Convolution provides a way of multiplying together two arrays of numbers, generally of different sizes but of the same dimensionality, to produce a third array of numbers of the same dimensionality. This can be used in image processing to implement operators whose output pixel values are simple linear combinations of certain input pixel values.

In an image processing context, one of the input arrays is typically a set of intensity values (i.e., gray levels) for one of the color components in the video image, for example using the RGB (red green blue) color model. The second array is usually much smaller, is also two-dimensional (although it may be just a single pixel thick), and is known as the kernel. FIG. 13 shows an example image 1305 and kernel 1310 used to illustrate convolution.

The convolution is performed by sliding the kernel over the image, generally starting at the top left corner, so as to move the kernel through all the positions where the kernel fits entirely within the boundaries of the image. (Note that implementations differ in what they do at the edges of images, as explained below.) Each kernel position corresponds to a single output pixel, the value of which is
O(i, j) = \sum_{k=1}^{m} \sum_{l=1}^{n} I(i + k - 1,\, j + l - 1)\, K(k, l)

where i runs from 1 to M - m + 1 and j runs from 1 to N - n + 1, for an M x N image I and an m x n kernel K.

In one illustrative example, the convolution filter applied is called a mean filter, where each pixel in the image is replaced by an average value of its neighbors, including itself. Mean filters are also commonly referred to as "box," "smoothing," or "averaging" filters. The kernel used for the mean filter represents the size and shape of the neighborhood to be sampled when calculating the mean. Often a 3x3 square kernel is used, as indicated by reference numeral 1410 in FIG. 14, although larger 5x5, 7x7, etc., kernels may also be used to create more blurring. The kernel 1405 may also be applied more than once.

A median filter is alternatively utilized, in which the average value used in the mean filter is replaced by the median value of neighboring pixels.

In another illustrative example, a Gaussian filter is applied to blur the remaining portions other than the portion of interest in the image to be rendered in focus. This filter uses a kernel having a shape that represents a Gaussian (i.e., bell-shaped) curve, as represented by

G(x, y) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{x^{2} + y^{2}}{2\sigma^{2}}}

where \sigma is the standard deviation of the distribution.
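Tying the filtering step back to the segregation step, the Python sketch below convolves the frame with the 3x3 mean kernel of FIG. 14 (each entry 1/9) and then composites the sharp and blurred layers using a mask such as the ones above; cv2.filter2D performs the convolution O(i, j) defined above. This is a sketch assuming an OpenCV/NumPy environment, not the patent's code; the mask feathering mirrors the transition area of FIG. 12, and a Gaussian or median filter, or a frequency-domain low-pass filter, could be substituted for the mean kernel.

```python
import cv2
import numpy as np

def simulate_short_depth_of_field(frame, focus_mask, passes=3):
    """Blur everything outside focus_mask (bool array, True = keep sharp)
    and composite the result with the untouched portion of interest."""
    kernel = np.full((3, 3), 1.0 / 9.0, dtype=np.float32)  # mean-filter kernel
    blurred = frame.astype(np.float32)
    for _ in range(passes):                          # repeated application
        blurred = cv2.filter2D(blurred, -1, kernel)  # widens the blur circle
    # Feather the mask edge for a soft focus-to-blur transition.
    alpha = cv2.GaussianBlur(focus_mask.astype(np.float32), (15, 15), 0)
    alpha = alpha[..., np.newaxis]                   # broadcast over channels
    out = alpha * frame.astype(np.float32) + (1.0 - alpha) * blurred
    return out.astype(frame.dtype)
```

Per frame, the composited image would then be encoded and transmitted in place of the raw capture, matching the flow reconstructed from FIG. 16.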
