(12) United States Patent
Tam et al.

(10) Patent No.: US 8,488,868 B2
(45) Date of Patent: *Jul. 16, 2013

(54) GENERATION OF A DEPTH MAP FROM A MONOSCOPIC COLOR IMAGE FOR RENDERING STEREOSCOPIC STILL AND VIDEO IMAGES

(75) Inventors: Wa James Tam, Orleans (CA); Carlos Vazquez, Gatineau (CA)

(73) Assignee: Her Majesty the Queen in Right of Canada, as represented by the Minister of Industry, through the Communications Research Centre Canada, Ottawa (CA)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1476 days.
    This patent is subject to a terminal disclaimer.

(21) Appl. No.: 12/060,978

(22) Filed: Apr. 2, 2008

(65) Prior Publication Data
    US 2008/0247670 A1    Oct. 9, 2008

Related U.S. Application Data
(60) Provisional application No. 60/907,475, filed on Apr. 3, 2007.

(51) Int. Cl.
    G06K 9/00 (2006.01)
(52) U.S. Cl.
    USPC .......... 382/154; 348/44; 345/419
(58) Field of Classification Search
    None
    See application file for complete search history.
(56) References Cited

U.S. PATENT DOCUMENTS

  4,641,177 A      2/1987  Ganss ...................... 358/3
  5,109,425 A *    4/1992  Lawton ..................... 382/107
  5,886,701 A *    3/1999  Chauvin et al. ............. 345/418
  6,215,516 B1     4/2001  Ma et al. .................. 348/43
  6,314,211 B1 *  11/2001  Kim et al. ................. 382/285
  7,035,451 B2     4/2006  Harman et al. .............. 382/154
  7,054,478 B2     5/2006  Harman ..................... 382/154
  7,180,536 B2     2/2007  Wolowelsky et al. .......... 348/42
  7,319,798 B2 *   1/2008  Kim et al. ................. 382/285
  8,036,451 B2 *  10/2011  Redert et al. .............. 382/154
  8,213,711 B2 *   7/2012  Tam et al. ................. 382/162
2003/0218606 A1 * 11/2003  Zhirkov et al. ............. 345/419
2005/0053276 A1    3/2005  Curti et al. ............... 382/154
2006/0056679 A1    3/2006  Redert et al. .............. 382/154
2006/0232666 A1   10/2006  Op de Beeck et al. ......... 348/51
(Continued)

OTHER PUBLICATIONS

L. MacMillan, "An image based approach to three dimensional computer graphics", Ph.D. dissertation, University of North Carolina, 1997.
(Continued)
Primary Examiner - Barry Drennan
(74) Attorney, Agent, or Firm - Teitelbaum & MacLean; Neil Teitelbaum; Doug MacLean

(57) ABSTRACT

The invention relates to a method and an apparatus for generating a depth map from a digital monoscopic color image. The method includes the following general steps: a) obtaining a first color component of the MCI, said first color component corresponding to partial color information of the MCI; and, b) assigning depth values to pixels of the MCI based on values of the first color component of respective pixels for forming the depth map for the MCI. In one embodiment, the depth values are generated by adjusting and/or scaling of pixel values of the Cr chroma component of the monoscopic source color image in the Y'CbCr color system.

19 Claims, 6 Drawing Sheets
[Representative drawing: flowchart showing step 10 (obtaining a 1st color component of the source image), the 1st color component image 15, optional 2D depth cues 30, step 20 (assigning depth values to pixels based on pixel values of the first color component), the depth map 25, DIBR processing, and the resulting deviated image and stereoscopic image pair 57.]
U.S. PATENT DOCUMENTS

2007/0008342 A1 *  1/2007  Sethuraman et al. .......... 345/635
2007/0024614 A1    2/2007  Tam et al. ................. 345/419
2007/0146232 A1    6/2007  Redert et al. .............. 345/6
2010/0182410 A1 *  7/2010  Verburgh et al. ............ 348/51
2011/0193860 A1 *  8/2011  Lee et al. ................. 345/419

OTHER PUBLICATIONS

K. T. Kim, M. Siegel, & J. Y. Son, "Synthesis of a high-resolution 3D stereoscopic image pair from a high-resolution monoscopic image and a low-resolution depth map," Proceedings of the SPIE: Stereoscopic Displays and Applications IX, vol. 3295A, pp. 76-86, San Jose, Calif., USA, 1998.
J. Flack, P. Harman, & S. Fox, "Low bandwidth stereoscopic image encoding and transmission," Proceedings of the SPIE: Stereoscopic Displays and Virtual Reality Systems X, vol. 5006, pp. 206-214, Santa Clara, Calif., USA, Jan. 2003.
L. Zhang & W. J. Tam, "Stereoscopic image generation based on depth images for 3D TV," IEEE Transactions on Broadcasting, vol. 51, pp. 191-199, 2005.
W. J. Tam, "Human Factors and Content Creation for Three-Dimensional Displays", Proceedings of the 14th International Display Workshops (IDW'07), Dec. 2007, vol. 3, pp. 2255-2258.
Redert et al., "Philips 3D solutions: from content creation to visualization", Proceedings of the Third International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06), University of North Carolina, Chapel Hill, USA, Jun. 14-16, 2006.
"Dynamic Digital Depth (DDD) and Real-time 2D to 3D conversion on the ARM processor", DDD Group plc., white paper, Nov. 2005.

* cited by examiner
`
[Sheet 1 of 6 - FIG. 1: flowchart of process 1, showing step 10 (obtaining the 1st color component), the 1st color component image 15, optional 2D depth cues 30, step 20 (assigning depth values to pixels based on pixel values of the first color component), the depth map 25, DIBR processing, and the resulting deviated image and stereoscopic image pair.]
[Sheet 2 of 6 - FIG. 2: flowchart of the depth map generation method, showing step 10 (obtaining the 1st color component), a step of obtaining pixel values of a 2nd color component, the 1st color component image 15, step 17 (adjusting selected regions of the 1st color component image), step 19 (scaling the 1st color component image), the depth map 25, a smoothing step, and the smoothed depth map 25s.]
[Sheet 3 of 6: schematic pixel-value profiles (value versus pixel position) from the FIG. 3 series, illustrating stages of the color-based depth map generation process.]
[Sheet 4 of 6 - FIGS. 3D and 3E: schematic diagrams (value versus pixel position) illustrating further stages of the color-based depth map generation process.]
[Sheet 5 of 6 - FIGS. 3F and 3G: schematic diagrams (value versus pixel position) illustrating later stages of the color-based depth map generation process.]
[Sheet 6 of 6 - FIG. 4: block diagram of the 3D image generating apparatus (receiver, color processor, color adjuster, scaling unit, spatial smoother, DIBR processor and 3D display device).]

GENERATION OF A DEPTH MAP FROM A MONOSCOPIC COLOR IMAGE FOR RENDERING STEREOSCOPIC STILL AND VIDEO IMAGES
`
CROSS-REFERENCE TO RELATED APPLICATIONS

The present invention claims priority from U.S. Provisional Patent Application No. 60/907,475 filed Apr. 3, 2007, entitled "Methods for Generating Synthetic Depth Maps from Colour Images for Stereoscopic and Multiview Imaging and Display", which is incorporated herein by reference.
`
TECHNICAL FIELD

The present invention generally relates to methods and systems for generating depth maps from monoscopic two-dimensional color images, and more particularly relates to utilizing color information contained in monoscopic images to generate depth maps for rendering stereoscopic still and video images.
`
BACKGROUND OF THE INVENTION

Stereoscopic or three-dimensional (3D) television (3D TV) is expected to be a next step in the advancement of television. Stereoscopic images that are displayed on a 3D TV are expected to increase visual impact and heighten the sense of presence for viewers. 3D-TV displays may also provide multiple stereoscopic views, offering motion parallax as well as stereoscopic information.

A successful adoption of 3D-TV by the general public will depend not only on technological advances in stereoscopic and multi-view 3D displays, but also on the availability of a wide variety of program contents in 3D. One way to alleviate the likely lack of program material in the early stages of 3D-TV rollout is to find a way to convert two-dimensional (2D) still and video images into 3D images, which would also enable content providers to re-use their vast library of program material in 3D-TV.

In order to generate a 3D impression on a multi-view display device, images from different view points have to be presented. This requires multiple input views consisting of either camera-captured images or rendered images based on some 3D or depth information. This depth information can be either recorded, generated from multi-view camera systems, or generated from conventional 2D video material. In a technique called depth image based rendering (DIBR), images with new camera viewpoints are generated using information from an original monoscopic source image and its corresponding depth map containing depth values for each pixel or groups of pixels of the monoscopic source image. These new images can then be used for 3D or multi-view imaging devices. The depth map can be viewed as a gray-scale image in which each pixel is assigned a depth value representing distance to the viewer, either relative or absolute. Alternatively, the depth value of a pixel may be understood as the distance of the point of the three-dimensional scene represented by the pixel from a reference plane that may, for example, coincide with the plane of the image during image capture or display. It is usually assumed that the higher the gray-value (lighter gray) associated with a pixel, the nearer it is situated to the viewer.

A depth map makes it possible to obtain from the starting image a second image that, together with the starting image, constitutes a stereoscopic pair providing a three-dimensional
vision of the scene. Examples of the DIBR technique are disclosed, for example, in the articles K. T. Kim, M. Siegel, & J. Y. Son, "Synthesis of a high-resolution 3D stereoscopic image pair from a high-resolution monoscopic image and a low-resolution depth map," Proceedings of the SPIE: Stereoscopic Displays and Applications IX, Vol. 3295A, pp. 76-86, San Jose, Calif., U.S.A., 1998; J. Flack, P. Harman, & S. Fox, "Low bandwidth stereoscopic image encoding and transmission," Proceedings of the SPIE: Stereoscopic Displays and Virtual Reality Systems X, Vol. 5006, pp. 206-214, Santa Clara, Calif., USA, January 2003; and L. Zhang & W. J. Tam, "Stereoscopic image generation based on depth images for 3D TV," IEEE Transactions on Broadcasting, Vol. 51, pp. 191-199, 2005.

Advantageously, based on information from the depth maps, DIBR permits the creation of a set of images as if they were captured with a camera from a range of viewpoints. This feature is particularly suited for multi-view stereoscopic displays where several views are required.

One problem with conventional DIBR is that accurate depth maps are expensive or cumbersome to acquire either directly or from a 2D image. For example, a "true" depth map can be generated using a commercial depth camera such as the ZCam(TM) available from 3DV Systems, Israel, that measures the distance to objects in a scene using an infra-red (IR) pulsed light source and an IR sensor sensing the reflected light from the surface of each object. Depth maps can also be obtained by projecting a structured light pattern onto the scene so that the depths of the various objects could be recovered by analyzing distortions of the light pattern. Disadvantageously, these methods require highly specialized hardware and/or cumbersome recording procedures, restrictive scene lighting, and limited scene depth.

Although many algorithms exist in the art for generating a depth map from a 2D image, they are typically computationally complex and often require manual or semi-automatic processing. For example, a typical step in the 2D-to-3D conversion process may be to generate depth maps by examining selected key frames in a video sequence and to manually mark regions that are foreground, mid-ground, and background. A specially designed computer software may then be used to track the regions in consecutive frames to allocate the depth values according to the markings. This type of approach requires trained technicians, and the task can be quite laborious and time-consuming for a full-length movie. Examples of prior art methods of depth map generation which involve intensive human intervention are disclosed in U.S. Pat. Nos. 7,035,451 and 7,054,478 issued to Harman et al.

Another group of approaches to depth map generation relies on extracting depth from the level of sharpness, or blur, in different image areas. These approaches are based on the realization that there is a relationship between the depth of an object, i.e., its distance from the camera, and the amount of blur of that object in the image, and that the depth information in a visual scene may be obtained by modeling the effect that a camera's focal parameters have on the image. Attempts have also been made to generate depth maps from blur without knowledge of camera parameters by assuming a general monotonic relationship between blur and distance. However, extracting depth from blur may be a difficult and/or unreliable task, as the blur found in images can also arise from other factors, such as lens aberration, atmospheric interference, fuzzy objects, and motion. In addition, a substantially same degree of blur arises for objects that are farther away and that are closer to the camera than the focal plane of the camera. Although methods to overcome some of these problems and to arrive at more accurate and precise depth values have been
disclosed in the art, they typically require more than one exposure to obtain two or more images. A further disadvantage of this approach is that it does not provide a simple way to determine depth values for regions for which there is no edge or texture information and where therefore no blur can be detected.

A recent U.S. patent application 2007/0024614, which is assigned to the assignee of the current application, discloses the use of sparse depth maps for DIBR applications. These sparse depth maps, also referred to as so-called "surrogate" depth maps, can be obtained using edge analysis of the monoscopic image followed by asymmetrical smoothing, and contain depth information that is concentrated mainly at edges and object boundaries in the 2D images. Although these surrogate depth maps can have large regions with missing and/or incorrect depth values, the perceived depth of the rendered stereoscopic images using the surrogate depth maps has been judged to provide enhanced depth perception relative to the original monoscopic image when tested on groups of viewers. It was speculated that the visual system combines the depth information available at the boundary regions together with pictorial depth cues to fill in the missing areas. One drawback of this approach is that this technique can introduce geometric distortions in images with vertical lines or edges. The lack of depth information within an object's boundaries might also negatively affect the perceived depth quality rating.

Accordingly, there is a need for methods and systems for generating depth maps from monoscopic images that provide accurate object segregation, are capable of resolving depth information within object boundaries, and are computationally simple, requiring only a small amount of processing.

An object of the present invention is to overcome at least some shortcomings of the prior art by providing a relatively simple and computationally inexpensive method and apparatus for depth map generation from a 2D image using color information comprised in said 2D image.

Another object of the present invention is to provide a relatively simple and computationally inexpensive method and apparatus for rendering stereoscopic and multi-view video and still images from 2D video and still images utilizing color information contained in said 2D images.
`
SUMMARY OF THE INVENTION

Accordingly, one aspect of the invention provides a method for generating a depth map from a monoscopic color image (MCI) composed of pixels. In one aspect of the invention, the method comprises the steps of: a) obtaining a first color component of the MCI, said first color component corresponding to partial color information of the MCI; and, b) assigning depth values to pixels of the MCI based on values of the first color component of respective pixels for forming the depth map for the MCI.

In accordance with an aspect of this invention, a method of generating a depth map from a monoscopic color image composed of pixels comprises the steps of: obtaining a Cr chroma component of the MCI; selectively adjusting pixel values of the Cr chroma component in a portion of the MCI that is selected based on color to obtain a color-adjusted chroma component; scaling values of the color-adjusted chroma component to obtain depth values for corresponding pixels to be used in the depth map; and, smoothing the depth map using a 2D spatial filter. The portion of the MCI in which pixel values of the Cr chroma component are adjusted may be selected based on pixel values of a second color component of the MCI, and may comprise one of: first pixels having R values of the MCI in the RGB color space satisfying a pre-determined
red threshold criterion, and second pixels having G values of the MCI in the RGB color space satisfying a pre-determined green threshold criterion, with the step of selectively adjusting pixel values of the first chroma component comprising one of selectively reducing values of the first color component for the first pixels, and selectively enhancing values of the first color component for the second pixels.
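Purely as an illustration of this aspect, the following Python/NumPy sketch strings these steps together. The threshold values, the adjustment factors, and the choice of a Gaussian as the 2D spatial filter are assumptions made for the example, not values prescribed by the invention.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def depth_map_from_cr(cr, r, g, red_thresh=200, green_thresh=200,
                      reduce_factor=0.5, boost_factor=1.5, smooth_sigma=9.0):
    """Sketch of the Cr-based depth map generation summarized above.

    cr:   H x W Cr chroma plane of the MCI (8-bit convention, 0..255).
    r, g: H x W red and green planes of the MCI in RGB, used only to
          select the regions whose Cr values are adjusted.
    All threshold and scale parameters are illustrative assumptions.
    Returns an H x W depth map with values in [0, 255].
    """
    depth = cr.astype(np.float64)

    # Selectively adjust the Cr component: reduce it for pixels whose R value
    # satisfies the (assumed) red threshold criterion, and enhance it for
    # pixels whose G value satisfies the (assumed) green threshold criterion.
    depth = np.where(r > red_thresh, depth * reduce_factor, depth)
    depth = np.where(g > green_thresh, depth * boost_factor, depth)

    # Scale the color-adjusted component onto the depth-value range.
    lo, hi = depth.min(), depth.max()
    depth = (depth - lo) / max(hi - lo, 1e-6) * 255.0

    # Smooth the depth map with a 2D spatial filter (a Gaussian is assumed).
    return gaussian_filter(depth, sigma=smooth_sigma)
```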
Another aspect of the present invention relates to a method of generating a multi-view image comprising the steps of: receiving a monoscopic color image composed of pixels; generating a depth map from the monoscopic color image based on a color component thereof, said color component corresponding to partial color information of the monoscopic color image; and utilizing a depth image based rendering (DIBR) algorithm to create at least one deviated image by processing the monoscopic color image based on the depth map, so as to form a stereoscopic image pair.
Another feature of the present invention provides a 3D image generating apparatus comprising: a data receiver for receiving data representing a monoscopic color image; a depth map generator for generating a depth map comprising pixel depth values based on a first color component of the monoscopic color image; and, a DIBR processor for processing the monoscopic color image based on the depth map to render at least one deviated image for forming at least one stereoscopic image pair. In one aspect of the present invention, the depth map generator comprises: a color processor for obtaining the first and a second color component from the monoscopic color image; a scaling unit for scaling pixel values of the first color component of the monoscopic color image for producing the pixel depth values; a color adjuster operatively connected between the color processor and the scaling unit for selectively adjusting pixel values of the first color component based on pixel values of the second color component for respective pixels; and a spatial smoother for smoothing a spatial distribution of the pixel depth values in the depth map.
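A minimal structural sketch of how such an apparatus might be composed is given below. The class and method names are illustrative only, and the individual units are passed in as callables so that, for instance, the Cr-based depth map routine sketched earlier and any DIBR routine can be plugged in; none of these identifiers come from the patent itself.

```python
import numpy as np

class DepthMapGenerator:
    """Depth map generator built from the units named above: a color
    processor, a color adjuster, a scaling unit, and a spatial smoother."""

    def __init__(self, color_processor, color_adjuster, scaler, smoother):
        self.color_processor = color_processor
        self.color_adjuster = color_adjuster
        self.scaler = scaler
        self.smoother = smoother

    def generate(self, mci: np.ndarray) -> np.ndarray:
        first, second = self.color_processor(mci)      # e.g. Cr plane and R plane
        adjusted = self.color_adjuster(first, second)  # selective adjustment
        depth = self.scaler(adjusted)                  # scale to pixel depth values
        return self.smoother(depth)                    # 2D spatial smoothing

class ImageGenerator3D:
    """3D image generating apparatus: receiver -> depth map generator -> DIBR."""

    def __init__(self, receiver, depth_map_generator, dibr_processor):
        self.receiver = receiver
        self.depth_map_generator = depth_map_generator
        self.dibr_processor = dibr_processor

    def process(self, source):
        mci = self.receiver(source)
        depth = self.depth_map_generator.generate(mci)
        deviated = self.dibr_processor(mci, depth)
        return mci, deviated  # together they form a stereoscopic image pair
```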
Another feature of the present invention provides an apparatus for generating 3D motion pictures from a sequence of monoscopic color images, comprising: an image receiver for receiving each monoscopic color image; a depth map generator for generating a depth map for each MCI based on a color component of the respective monoscopic color image, said color component provided by the image receiver; a DIBR processor for processing each monoscopic color image based on the corresponding depth map to render at least one deviated image to form at least one stereoscopic image pair for each of the monoscopic color images; and, a multi-view display for sequentially generating at least one stereoscopic view from each stereoscopic image pair.
`
BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described in greater detail with reference to the accompanying drawings which represent preferred embodiments thereof, wherein:

FIG. 1 is a flowchart of a method of generating multi-view images from a source 2D color image according to the present invention;

FIG. 2 is a flowchart of a method of generating a depth map based on a color component of a source 2D color image according to the present invention;
`
FIGS. 3A-3G are a set of schematic diagrams illustrating different stages of the color-based process of generating a depth map from a source 2D color image;

FIG. 4 is a block diagram of a 3D image generating apparatus.
`
DETAILED DESCRIPTION

The invention will be described in connection with a number of exemplary embodiments. To facilitate an understanding of the invention, many aspects of the invention are described in terms of sequences of actions to be performed by functional elements of a video-processing system. It will be recognized that in each of the embodiments, the various actions, including those depicted as blocks in flow-chart illustrations and block schemes, could be performed by specialized circuits, for example discrete logic gates interconnected to perform a specialized function, by computer program instructions being executed by one or more processors, or by a combination of both. Moreover, the invention can additionally be considered to be embodied entirely within any form of a computer readable storage medium having stored therein an appropriate set of computer instructions that would cause a processor to carry out the techniques described herein. Thus, the various aspects of the invention may be embodied in many different forms, and all such forms are contemplated to be within the scope of the invention.

In the context of the present specification the terms "monoscopic color image" and "2D color image" are used interchangeably to mean a picture, typically digital and two-dimensional planar, containing an image of a scene complete with visual characteristics and information that are observed with one eye, such as luminance intensity, color, shape, texture, etc. Images described in this specification are assumed to be composed of picture elements called pixels and can be viewed as two-dimensional arrays or matrices of pixels, with the term "array" understood herein to encompass matrices. A depth map is a two-dimensional array of pixels each assigned a depth value indicating the relative or absolute distance from a viewer or a reference plane to a part of an object in the scene that is depicted by the corresponding pixel or block of pixels. The term "color component", when used with reference to a color image, means a pixel array wherein each pixel is assigned a value representing a partial color content of the color image. A color component of a monoscopic color image can also be viewed as a gray-scale image. Examples of color components include any one or any combination of two of the RGB color components of the image, or a chrominance component of the image in a particular color space. The term "deviated image," with respect to a source image, means an image with a different viewpoint from the source image of a given scene. A deviated image and a source image may form a stereoscopic image pair; two deviated images with different viewpoints may also form a stereoscopic pair. The larger the viewpoint deviation, the larger will be the perceived depth of objects in a resulting stereoscopic view.

Flowcharts shown in FIGS. 1 and 2 illustrate exemplary embodiments of a method of the present invention for generation of a depth map from a monoscopic color image (MCI), which will now be described.
FIG. 1 generally illustrates a process 1 of the present invention wherein a stereoscopic image pair (SIP) 57 is generated from an MCI 5 using a depth map 25, which is obtained from the MCI 5 using a depth map generation method 3. The method 3, which according to the present invention can be used for generating depth maps from either still or video 2D color images, generally involves selecting, or otherwise obtaining, a color component 15 of the MCI 5, which is hereinafter referred to as the first color component 15, and then using this color component, with optional modifications, as the depth map 25 to render the SIP 57. In its preferred embodiment, method 3 makes use of the fact that digital video signals carry color images in the form of a luminance (luma) component and two or more chrominance (chroma) components, and thus those chroma components are readily available from the received video signal.
Generally, a variety of color models may be used for representing colors of the MCI 5, such as RGB, HSV, L*a*b*, YUV, Y'CbCr, CYMK, etc. The RGB (Red, Green, Blue) color space represents a color with a red component (R), a green component (G) and a blue component (B). In a three-dimensional coordinate system, each of the R, G, and B components of the RGB color space represents a value along an axis, the combination of the values defining a color space. For digital video signals in component format, the Y'CbCr color system is typically used; the Y'CbCr color space represents a color with a gamma-corrected luma component Y', and two chroma components, Cr and Cb. The chroma, or chrominance, components Cr and Cb are obtained by subtracting the luma component Y' from the red component R and the blue component B, respectively:

Cr = R - Y' and Cb = B - Y'.
The R, G, and B may refer to the tristimulus values of red, green, and blue that are combined to create the color image on a display, which may be gamma-corrected. The color components may have other scale factors and offsets applied to them, which differ depending on the video signal scheme used. Furthermore, chroma subsampling may be used wherein the luminance component representing brightness is provided with a higher resolution than the chroma components. For example, in 4:2:2 chroma subsampling, the two chroma components are sampled at half the sample rate of the luma, so horizontal chroma resolution is cut in half. Advantageously, this chroma sub-sampling reduces processing requirements of the method 3 of the present invention. Generally, the method 3 of the present invention may be applied to the MCI 5 provided in data formats corresponding to a variety of color models, as any color format, i.e., color space or model, can be converted to another color format.
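For illustration, a small RGB-to-Y'CbCr conversion in the 8-bit convention is sketched below. The ITU-R BT.601 coefficients and full-range scaling are assumptions made for the example, since, as noted above, the exact scale factors and offsets depend on the video signal scheme.

```python
import numpy as np

def rgb_to_ycbcr601(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 8-bit RGB image to Y'CbCr.

    Cb and Cr are offset-and-scaled differences B - Y' and R - Y', as
    described above; BT.601 luma weights and full-range 8-bit scaling
    are assumed here.
    """
    r, g, b = (rgb[..., i].astype(np.float64) for i in range(3))
    y = 0.299 * r + 0.587 * g + 0.114 * b   # gamma-corrected luma Y'
    cb = 128.0 + 0.564 * (b - y)            # blue-difference chroma
    cr = 128.0 + 0.713 * (r - y)            # red-difference chroma
    return np.clip(np.dstack([y, cb, cr]), 0, 255).astype(np.uint8)
```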
Turning again to FIG. 1, the first color component 15 of the MCI 5 is obtained in a first step 10. This step may involve, for example, receiving a digital video signal, identifying therein pixels corresponding to one image frame, wherein each pixel has three or more values associated therewith identifying the pixel's color and brightness, and extracting pixel values C1(n,m) of the first color component of the MCI 5, which may form a 2D array C1 of the pixel values C1(n,m), where integers n and m are pixel row and column counters in a respective gray-scale image. Alternatively, this step may involve reading an image file from computer-readable memory to obtain the
MCI 5, and performing video signal processing to extract therefrom the first color component 15 of the MCI 5. The depth map 25 is obtained in a step 20 from the first color component 15 of the MCI 5 by assigning depth values to pixels of the MCI 5 based on values of the first color component 15 for respective pixels. In some embodiments, this step may involve a simple spatially-uniform scaling, which may include uniform offsetting, of pixel values of the first color component 15, and using the resulting gray-scale image as the depth map 25, which we found may be adequate to provide an enhanced perception of depth in some cases. In other embodiments, this step may include selectively adjusting pixel values of the first color component 15 in selected regions thereof based on color or other depth cues, as described hereinbelow by way of example. The term "uniform scaling" as used herein means applying the same scaling rule to pixel values independently of the locations of the respective pixels in the image.
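A minimal sketch of such a spatially-uniform scaling, with an optional uniform offset, is shown below; the gain and offset values are illustrative assumptions, and the clipping range corresponds to an 8-bit depth map.

```python
import numpy as np

def uniform_scale(color_component: np.ndarray, gain: float = 1.2,
                  offset: float = 10.0) -> np.ndarray:
    """Apply the same scaling rule to every pixel of the first color
    component, regardless of pixel location, and use the result as a
    simple 8-bit depth map."""
    depth = gain * color_component.astype(np.float64) + offset
    return np.clip(depth, 0, 255).astype(np.uint8)
```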
Once generated, the depth map 25 is used in a step 40 to form at least one deviated color image (DCI) 7 by means of depth image based rendering (DIBR) processing of the MCI 5, with the DCI 7 corresponding to a different camera viewpoint as compared to the one used in recording the MCI 5. In some embodiments, the DCI 7 and the MCI 5 form the SIP 57 that is provided to a multi-view (3D) display. In some embodiments, more than one deviated image may be generated by the DIBR step 40 to form one or more stereoscopic image pairs with different viewpoints. A detailed description of a suitable DIBR algorithm can be found, for example, in the article entitled "Stereoscopic Image Generation Based on Depth Images for 3D TV", Liang Zhang; Tam, W. J., IEEE Transactions on Broadcasting, Volume 51, Issue 2, June 2005, pp. 191-199, which is incorporated herein by reference.
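The complete rendering algorithm is described in the cited article; purely to illustrate the idea of DIBR, a heavily simplified horizontal pixel-shift sketch is given below. The disparity range, the naive hole filling, and the absence of occlusion handling are simplifications assumed for the example and are not the method of the cited publication.

```python
import numpy as np

def render_deviated_image(mci: np.ndarray, depth_map: np.ndarray,
                          max_disparity: int = 16) -> np.ndarray:
    """Naive DIBR sketch: shift each pixel horizontally in proportion to depth.

    mci:       H x W x 3 source color image.
    depth_map: H x W array, 0..255, larger values = nearer to the viewer.
    Returns a deviated image for one new viewpoint. Holes are filled by
    repeating the previous pixel along the row, and occlusion ordering
    between overlapping pixels is ignored in this sketch.
    """
    h, w = depth_map.shape
    deviated = np.zeros_like(mci)
    filled = np.zeros((h, w), dtype=bool)
    disparity = (depth_map.astype(np.float64) / 255.0 * max_disparity).astype(int)

    for row in range(h):
        for x in range(w):
            xt = x + disparity[row, x]
            if 0 <= xt < w:
                deviated[row, xt] = mci[row, x]
                filled[row, xt] = True
        # Crude hole filling: copy the last filled pixel to the left.
        for x in range(1, w):
            if not filled[row, x]:
                deviated[row, x] = deviated[row, x - 1]

    return deviated
```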
The method 3 of the present invention for depth map generation takes advantage of the ability of the human visual cognitive system to mask any inaccuracies that might occur in the depth ordering of different components of the stereoscopic image. These inaccuracies may however be at least partially compensated by identifying regions of the first color component 15 wherein depth ordering errors are most likely to occur, and selectively adjusting pixel values of the first color component 15 in the respective regions of the MCI 5. Such identification may be performed based on one of known monoscopic depth cues in step 30, and/or based on color. Inaccurate depth ordering may occur, for example, for regions of the MCI 5 that are characterized by high-intensity saturated or nearly-saturated colors, resulting in the particularly brightly colored regions appearing too close or too far from the viewer in a 3D image obtained using the color-based depth map 25. Advantageously, these inaccuracies can be at least partially compensated by identifying the brightly-colored regions based on a color component or components other than the first color component and adjusting pixel values of the first color component 15 within these regions.
FIG. 2 illustrates one exemplary embodiment of the method 3 of the present invention in further detail. In this embodiment, the step 10 of obtaining the first color component of the MCI 5 may be accompanied by a step 13 wherein a second color component of the MCI 5 is obtained, for example as an array C2 of pixel values C2(n,m).

Note also that each of the first color component 15, the second color component and the depth map 25 can be viewed
as a gray-scale image composed of pixels; accordingly, we will be referring to pixel values of these images also as (pixel) intensity values, and we will be referring to regions of these maps