`
(19) United States
(12) Patent Application Publication
CHEN
(10) Pub. No.: US 2001/0010546 A1
(43) Pub. Date: Aug. 2, 2001
(54) VIRTUAL REALITY CAMERA

(76) Inventor: SHENCHANG ERIC CHEN, LOS GATOS, CA (US)

Correspondence Address:
BLAKELY SOKOLOFF TAYLOR & ZAFMAN
12400 WILSHIRE BLVD
7TH FLOOR
LOS ANGELES, CA 90025

(*) Notice: This is a publication of a continued prosecution application (CPA) filed under 37 CFR 1.53(d).

(21) Appl. No.: 08/938,366

(22) Filed: Sep. 26, 1997

Publication Classification

(51) Int. Cl.7: H04N 15/00; H04N 5/222
(52) U.S. Cl.: 348/218; 348/50; 348/333.01; 382/173

(57) ABSTRACT

A method and apparatus for creating and rendering multiple-view images. A camera includes an image sensor to receive images, sampling logic to digitize the images and a processor programmed to combine the images based upon a spatial relationship between the images.

[Representative drawing: discrete images combined into a composite image 59]
`
`
`
Patent Application Publication    Aug. 2, 2001    Sheet 1 of 9    US 2001/0010546 A1

[FIG. 1: block diagram of a VR camera; legible labels include user input panel(s) and non-volatile storage (image data)]
`
`
`
Patent Application Publication    Aug. 2, 2001    Sheet 2 of 9    US 2001/0010546 A1

[FIG. 2: generating a panoramic image; discrete images of an environment, captured with areas of overlap and varying pitch and roll, are mapped onto a cylindrical surface of revolution]
`
`
`
Patent Application Publication    Aug. 2, 2001    Sheet 3 of 9    US 2001/0010546 A1

[FIG. 3: a surface photographed from multiple viewpoints, one image offset from another; the discrete images are combined into composite image 59]
`
`
`
Patent Application Publication    Aug. 2, 2001    Sheet 4 of 9    US 2001/0010546 A1

[FIG. 4: an object being photographed from surrounding points of view to obtain discrete images]
`
`
`
Patent Application Publication    Aug. 2, 2001    Sheet 5 of 9    US 2001/0010546 A1

[FIG. 5: control inputs on a VR camera]
`
`
`
Patent Application Publication    Aug. 2, 2001    Sheet 6 of 9    US 2001/0010546 A1

[FIG. 6: overlaying a live video feed on a previously recorded scene]
`
`
`
Patent Application Publication    Aug. 2, 2001    Sheet 7 of 9    US 2001/0010546 A1

[FIG. 7: block diagram of a stereo VR camera; legible labels include image acquisition unit, user input panel(s) and non-volatile storage (program code)]
`
`
`
Patent Application Publication    Aug. 2, 2001    Sheet 8 of 9    US 2001/0010546 A1

[FIG. 8: method flowchart]
START
(141) RECEIVE A SET OF DISCRETE IMAGES IN A CAMERA
(143) DIGITIZE THE IMAGES
(145) COMBINE THE DIGITIZED IMAGES BASED UPON A SPATIAL RELATIONSHIP BETWEEN THE DIGITIZED IMAGES TO PRODUCE A MULTIPLE-VIEW IMAGE
(147) DISPLAY AT LEAST A PORTION OF THE MULTIPLE-VIEW IMAGE ON A DISPLAY OF THE CAMERA
`
`
`
Patent Application Publication    Aug. 2, 2001    Sheet 9 of 9    US 2001/0010546 A1

[FIG. 9: method flowchart]
START
RECEIVE A DISCRETE IMAGE i IN A CAMERA, WHERE i = 0, 1, ... N
(153) DIGITIZE IMAGE i
COMBINE DIGITIZED IMAGE i WITH ONE OR MORE PREVIOUSLY DIGITIZED IMAGES BASED UPON A SPATIAL RELATIONSHIP BETWEEN THE DIGITIZED IMAGE i AND THE ONE OR MORE PREVIOUSLY DIGITIZED IMAGES TO PRODUCE A MULTIPLE-VIEW IMAGE
(157) INCREMENT i AND REPEAT UNTIL ALL IMAGES ARE COMBINED
`
`
`
US 2001/0010546 A1
Aug. 2, 2001
`
`VIRTUAL REALITY CAMERA
`
`FIELD OF THE INVENTION
`
[0001] The present invention relates to the field of photography, and more particularly to a camera that combines images based on a spatial relationship between the images.
`
`BACKGROUND OF THE INVENTION
`
[0002] A panoramic image of a scene has traditionally been created by rotating a vertical slit camera about an optical center. Using this technique, film at the optical center is continuously exposed to create a wide field of view (e.g., a 360° field of view). Because of their specialized design, however, vertical slit cameras are relatively expensive. Further, because the panoramic image is captured in a continuous rotation of the camera, it is difficult to adjust the camera to account for changes in the scene, such as lighting or focal depth, as the camera is rotated.
`
[0003] In a more modern technique for creating panoramic images, called "image stitching", a scene is photographed from different camera orientations to obtain a set of discrete images. The discrete images of the scene are then transferred to a computer which executes application software to blend the discrete images into a panoramic image.
`
`[0004] After the panoramic image is created, application
`software may be executed to render user-specified portions
`of the panoramic image onto a display. The effect is to create
`a virtual environment that can be navigated by a user. Using
`a mouse, keyboard, headset or other input device, the user
`can pan about the virtual environment and zoom in or out to
`view objects of interest.
`
[0005] One disadvantage of existing image stitching techniques is that photographed images must be transferred from the camera to the computer before they can be stitched together to create a navigable panoramic image. For example, with a conventional exposed-film camera, film must be exposed, developed, printed and digitized (e.g., using a digital scanner) to obtain a set of images that can be stitched into a panoramic image. With a digital camera, the process is less cumbersome, but images must still be transferred to a computer to be stitched into a panoramic view.
`
[0006] Another disadvantage of existing image stitching techniques is that the orientation of the camera used to photograph each discrete image is typically unknown. This makes it more difficult to stitch the discrete images into a panoramic image because the spatial relationship between the constituent images of the panoramic image is determined, at least partly, by the respective camera orientations at which they were captured. In order to determine the spatial relationship between a set of images that are to be stitched into a panoramic image, application software must be executed to prompt the user for assistance, hunt for common features in the images, or both.
`
[0007] Yet another disadvantage of existing image stitching techniques is that it is usually not possible to determine whether there are missing views in the set of images used to create the panoramic image until after the images have been transferred to the computer and stitched. Depending on the subject of the panoramic image, it may be inconvenient or impossible to recreate the scene necessary to obtain the missing view. Because of the difficulty of determining whether a complete set of images has been captured, images to be combined into a panoramic image are typically photographed with conservative overlap to avoid gaps in the panoramic image. Because there is more redundancy in the captured images, however, a greater number of images must be obtained to produce the panoramic view. For conventional film cameras, this means that more film must be exposed, developed, printed and scanned to produce a panoramic image than if less conservative image overlap were possible. For digital cameras, more memory must typically be provided to hold the larger number of images that must be captured than if less conservative image overlap were possible.
`
`SUMMARY OF THE INVENTION
`
`[0008] A method and apparatus for creating and rendering
`multiple-view images are disclosed. Images are received on
`the image sensor of a camera and digitized by sampling logic
`in the camera. The digitized images are combined by a
`programmed processor in the camera based upon a spatial
`relationship between the images.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
[0009] The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements and in which:
`
`[0010] FIG. 1 is a block diagram of a virtual reality (VR)
`camera.
`
`[0011] FIG. 2 illustrates the use of a VR camera to
`generate a panoramic image.
`
`[0012] FIG. 3 illustrates the use of a VR camera to
`generate a composite image of a surface.
`
`[0013] FIG. 4 illustrates the use of a VR camera to
`generate an object image.
`
[0014] FIG. 5 illustrates control inputs on a VR camera according to one embodiment of the present invention.
`
`[0015] FIG. 6 illustrates the use of a VR camera to overlay
`a video feed over a previously recorded scene.
`
`[0016] FIG. 7 is a block diagram of a stereo VR camera.
`
`[0017] FIG. 8 is a diagram of a method according to one
`embodiment of the present invention.
`
`[0018] FIG. 9 is a diagram of a method according to an
`alternate embodiment of the present invention.
`
`DETAILED DESCRIPTION
`
[0019] According to the present invention, a virtual reality (VR) camera is provided to create and render panoramic images and other multiple-view images. In one embodiment, the VR camera includes a sensor to detect the camera orientation at which images in a scene are captured. A computer within the VR camera combines the images of the scene into a panoramic image based, at least partly, on the respective camera orientations at which the images were captured. A display in the VR camera is used to view the panoramic image. In one embodiment of the present invention, the orientation of the VR camera is used to select which portion of the panoramic image is displayed, so that a user can effectively pan about the panoramic image by changing the orientation of the camera.
`
[0020] FIG. 1 is a block diagram of a VR camera 12 according to one embodiment of the present invention. VR camera 12 may be either a video camera or a still-image camera and includes an optic 15, an image acquisition unit (IAU) 17, an orientation/position sensor (O/P sensor) 21, one or more user input panels 23, a processor 19, a non-volatile program code storage 24, a memory 25, a non-volatile data storage 26 and a display 27.
`[0021] The optic 15 generally includes an automatically or
`manually focused lens and an aperture having a diameter
`that is adjustable to allow more or less light to pass. The lens
`projects a focused image through the aperture and onto an
`image sensor in the IAU 17. The image sensor is typically
`a charge-coupled device (CCD) that is sampled by sampling
`logic in the IAU 17 to develop a digitized version of the
`image. The digitized image may then be read directly by the
`processor 19 or transferred from the IAU 17 to the memory
`25 for later access by the processor 19. Although a CCD
`sensor has been described, any type of image sensor that can
`be sampled to generate digitized images may be used
`without departing from the scope of the present invention.
`
`[0022]
`In one embodiment of the present invention, the
`processor 19 fetches and executes program code stored in
`the code storage 24 to implement a logic unit capable of
`obtaining the image from the IAU 17 (which may include
`sampling the image sensor), receiving orientation and posi(cid:173)
`tion information from the 0/P sensor 21, receiving input
`from the one or more user input panels 23 and outputting
`image data to the display 27. It will be appreciated that
`multiple processors, or hard-wired logic may alternatively
`be used to perform these functions. The memory 25 is
`provided for temporary storage of program variables and
`image data, and the non-volatile image storage 26 is pro(cid:173)
`vided for more permanent storage of image data. The
`non-volatile storage 26 may include a removable storage
`element, such as a magnetic disk or tape, to allow panoramic
`and other multiple-view images created using the VR cam(cid:173)
`era 12 to be stored indefinitely.
`
[0023] The O/P sensor 21 is used to detect the orientation and position of the VR camera 12. The orientation of the VR camera 12 (i.e., pitch, yaw and roll) may be determined relative to an arbitrary starting orientation or relative to a fixed reference (e.g., earth's gravitational and magnetic fields). For example, an electronic level of the type commonly used in virtual reality headsets can be used to detect camera pitch and roll (rotation about horizontal axes), and an electronic compass can be used to detect camera yaw (rotation about a vertical axis). As discussed below, by recording the orientation of the VR camera 12 at which each of a set of discrete images is captured, the VR camera 12 can automatically determine the spatial relationship between the discrete images and combine the images into a panoramic image, planar composite image, object image or any other type of multiple-view image.
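
The per-image orientation record that enables this automatic determination can be modeled as a small tagged structure. The following is an illustrative Python sketch only; the class, field names and degree conventions are assumptions, not part of the disclosure:

```python
from dataclasses import dataclass

@dataclass
class CapturedImage:
    """A digitized image tagged with the camera orientation at capture."""
    pixels: list   # 2-D pixel values from the sampling logic
    yaw: float     # rotation about the vertical axis, in degrees
    pitch: float   # rotation about a horizontal axis, in degrees
    roll: float    # rotation about the optical axis, in degrees

def yaw_offset(a: "CapturedImage", b: "CapturedImage") -> float:
    """Signed yaw difference between two captures, wrapped to (-180, 180],
    which fixes the horizontal spatial relationship of the two images."""
    d = (b.yaw - a.yaw) % 360.0
    return d - 360.0 if d > 180.0 else d

# Two captures recorded 30 degrees apart in yaw are known to overlap by
# a fixed horizontal angle before any feature matching is attempted.
left = CapturedImage(pixels=[[0]], yaw=350.0, pitch=0.0, roll=0.0)
right = CapturedImage(pixels=[[0]], yaw=20.0, pitch=0.0, roll=0.0)
print(yaw_offset(left, right))  # 30.0
```

The wrap-around arithmetic matters because a compass-based yaw reading crosses 0°/360° as the camera pans.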
`
[0024] Still referring to FIG. 1, when a panoramic image (or other multiple-view image) is displayed on display 27, changes in camera orientation are detected via the O/P sensor 21 and interpreted by the processor 19 as requests to pan about the panoramic image. Thus, by rotating the VR camera 12 in different directions, a user can view different portions of the previously generated panoramic image on the display 27. The VR camera's display 27 becomes, in effect, a window into a virtual environment that has been created in the VR camera 12.
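
This pan-by-rotation behavior can be sketched as a mapping from detected yaw change to a shift of the displayed window across the panoramic pixel map. A minimal sketch, assuming a pixel map that spans a full 360 degrees; the function name is illustrative:

```python
def pan_window(window_left: int, pano_width: int, yaw_delta_deg: float) -> int:
    """Interpret a detected change in camera yaw as a request to pan:
    shift the left edge of the displayed window across a panoramic
    pixel map spanning 360 degrees, wrapping at the seam."""
    shift = int(round(yaw_delta_deg / 360.0 * pano_width))
    return (window_left + shift) % pano_width

# A 3600-pixel-wide panorama spans 360 degrees, so rotating the camera
# 10 degrees to the right pans the view 100 pixels to the right.
print(pan_window(0, 3600, 10.0))     # 100
print(pan_window(100, 3600, -20.0))  # 3500 (wraps past the seam)
```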
`
`[0025]
`In one embodiment of the present invention, the
`position of the VR camera 12 in a three-dimensional (3D)
`space is determined relative to an arbitrary or absolute
`reference. This is accomplished, for example, by including
`in the 0/P sensor 21 accelerometers or other devices to
`detect translation of VR the camera 12 relative to an
`arbitrary starting point. As another example, the absolute
`position of the VR camera 12 may be determined including
`in the 0/P sensor 21 a sensor that communicates with a
`global positioning system (GPS). GPS is well known to
`those of ordinary skill in the positioning and tracking arts.
`As discussed below, the ability to detect translation of the
`VR camera 12 between image capture positions is useful for
`combining discrete images to produce a composite image of
`a surface.
`
`[0026]
`It will be appreciated from the foregoing discussion
`that the 0/P sensor 21 need not include both an orientation
`sensor and a position sensor, depending on the application of
`the VR camera 12. For example, to create and render a
`panoramic image, it is usually necessary to change the
`angular orientation of the VR camera 12 only. Consequently,
`in one embodiment of the present invention, the 0/P sensor
`21 is an orientation sensor only. Other combinations of
`sensors may be used without departing from the scope of the
`present invention.
`
[0027] Still referring to FIG. 1, the one or more user input panels 23 may be used to provide user control over such conventional camera functions as focus and zoom (and, at least in the case of a still camera, aperture size, shutter speed, etc.). As discussed below, the input panels 23 may also be used to receive user requests to pan about or zoom in and out on a panoramic image or other multiple-view image. Further, the input panels 23 may be used to receive user requests to set certain image capture parameters, including parameters that indicate the type of composite image to be produced, whether certain features are enabled, and so forth. It will be appreciated that focus and other camera settings may be adjusted using a traditional lens dial instead of an input panel 23. Similarly, other types of user input devices and techniques, including, but not limited to, user rotation and translation of the VR camera 12, may be used to receive requests to pan about or zoom in or out on an image.
`
`[0028] The display 27 is typically a liquid crystal display
`(LCD) but may be any type of display that can be included
`in the VR camera 12, including a cathode-ray tube display.
`Further, as discussed below, the display 27 may be a stereo
`display designed to present left and right stereo images to the
`left and right eyes, respectively, of the user.
`
[0029] FIG. 2 illustrates use of the VR camera 12 of FIG. 1 to generate a panoramic image 41. A panoramic image is an image that represents a wide-angle view of a scene and is one of a class of images referred to herein as multiple-view images. A multiple-view image is an image or collection of images that is displayed in user-selected portions.
`
[0030] To create panoramic image 41, a set of discrete images 35 is first obtained by capturing images of an environment 31 at different camera orientations. With a still camera, capturing images means taking photographs. With a video camera, capturing images refers to generating one or more video frames of each of the discrete images.
`
`
[0031] For ease of understanding, the environment 31 is depicted in FIG. 2 as being an enclosed space, but this is not necessary. In order to avoid gaps in the panoramic image, the camera is oriented such that each captured image overlaps the preceding captured image. This is indicated by the overlapped regions 33. The orientation of the VR camera is detected via the O/P sensor (e.g., element 21 of FIG. 1) and recorded for each of the discrete images 35.
`
[0032] In one still-image camera embodiment of the present invention, as the user pans the camera about the environment 31, the orientation sensor is monitored by the processor (e.g., element 19 of FIG. 1) to determine when the next photograph should be snapped. That is, the VR camera assists the photographer in determining the camera orientation at which each new discrete image 35 is to be snapped by signaling the photographer (e.g., by turning on a beeper or a light) when the region of overlap 33 is within a target size. Note that the VR camera may be programmed to determine when the region of overlap 33 is within a target size not only for camera yaw, but also for camera pitch or roll. In another embodiment of the present invention, the VR camera may be user-configured (e.g., via a control panel 23 input) to automatically snap a photograph whenever it detects sufficient change in orientation. In both manual and automatic image acquisition modes, the difference between camera orientations at which successive photographs are acquired may be input by the user or automatically determined by the VR camera based upon the camera's angle of view and the distance between the camera and subject.
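
Under a pure-yaw assumption, the "region of overlap within a target size" test reduces to comparing the yaw change against the camera's horizontal angle of view. A hedged sketch; the angle of view, target overlap and tolerance values below are illustrative assumptions, not values from the disclosure:

```python
def overlap_fraction(yaw_delta_deg: float, angle_of_view_deg: float) -> float:
    """Approximate horizontal overlap between the previous capture and the
    current framing, assuming pure yaw rotation of the camera."""
    frac = 1.0 - abs(yaw_delta_deg) / angle_of_view_deg
    return max(0.0, frac)

def should_signal(yaw_delta_deg: float, angle_of_view_deg: float = 50.0,
                  target: float = 0.30, tolerance: float = 0.05) -> bool:
    """True when the region of overlap is within the target size, i.e. when
    the camera should beep (or auto-snap) the next discrete image."""
    frac = overlap_fraction(yaw_delta_deg, angle_of_view_deg)
    return abs(frac - target) <= tolerance

# With a 50-degree angle of view, rotating 35 degrees leaves 30% overlap.
print(should_signal(35.0))  # True
print(should_signal(10.0))  # False (80% overlap, too soon)
```

A fuller version would apply the same test to pitch and roll, as the paragraph above notes.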
`
[0033] In a video camera embodiment of the present invention, the orientation sensor may be used to control the rate at which video frames are generated so that frames are generated only when the O/P sensor indicates sufficient change in orientation (much like the automatic image acquisition mode of the still camera discussed above), or video frames may be generated at standard rates with redundant frames being combined or discarded during the stitching process.
`
[0034] As stated above, the overlapping discrete images 35 can be combined based on their spatial relationship to form a panoramic image 41. Although the discrete images 35 are shown as being a single row of images (indicating that the images were all captured at approximately the same pitch angle), additional rows of images at higher or lower pitch angles could also have been obtained. Further, because the VR camera will typically be hand held (although a tripod may be used), a certain amount of angular error is incurred when the scene is recorded. This angular error is indicated in FIG. 2 by the slightly different pitch and roll orientations of the discrete images 35 relative to one another, and must be accounted for when the images are combined to form the panoramic image 41.
`
[0035] After the discrete images 35 have been captured and stored in the memory of the camera (or at least two of the discrete images have been captured and stored), program code is executed in the VR camera to combine the discrete images 35 into the panoramic image 41. This is accomplished by determining a spatial relationship between the discrete images 35 based on the camera orientation information recorded for each image 35, or based on common features in the overlapping regions of the images 35, or based on a combination of the two techniques.
`
[0036] One technique for determining a spatial relationship between images based on common features in the images is to "cross-correlate" the images. Consider, for example, two images having an unknown translational offset relative to one another. The images can be cross-correlated by "sliding" one image over the other image one step (e.g., one pixel) at a time and generating a cross-correlation value at each sliding step. Each cross-correlation value is generated by performing a combination of arithmetic operations on the pixel values within the overlapping regions of the two images. The offset that corresponds to the sliding step providing the highest correlation value is found to be the offset of the two images. Cross-correlation can be applied to finding offsets in more than one direction or to determining other unknown transformational parameters, such as rotation or scaling. Techniques other than cross-correlation, such as pattern matching, can also be used to find unknown image offsets and other transformational parameters.
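
The sliding-step search described above can be sketched with NumPy as an exhaustive search over candidate offsets. This is an illustrative, unoptimized sketch (practical stitchers use FFT-based correlation or coarse-to-fine search); normalized correlation is one possible choice for the "combination of arithmetic operations" on the overlapping pixels:

```python
import numpy as np

def best_offset(ref: np.ndarray, moved: np.ndarray, max_shift: int = 8):
    """Slide `moved` over `ref` one pixel at a time and score each step by
    normalized correlation over the overlapping region; the step with the
    highest score is taken to be the offset of the two images."""
    best, best_score = (0, 0), -np.inf
    h, w = ref.shape
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Overlapping regions when `moved` is offset by (dy, dx).
            a = ref[max(0, -dy):h + min(0, -dy), max(0, -dx):w + min(0, -dx)]
            b = moved[max(0, dy):h + min(0, dy), max(0, dx):w + min(0, dx)]
            a0, b0 = a - a.mean(), b - b.mean()
            denom = np.sqrt((a0 ** 2).sum() * (b0 ** 2).sum())
            score = (a0 * b0).sum() / denom if denom else 0.0
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best

# Shift a random image down 3 rows and left 2 columns; the search
# recovers the (3, -2) offset as the highest-scoring sliding step.
rng = np.random.default_rng(0)
img = rng.random((32, 32))
shifted = np.roll(img, (3, -2), axis=(0, 1))
print(best_offset(img, shifted))  # (3, -2)
```

In the camera described here, the recorded orientations would bound the search range, so only a few sliding steps near the predicted offset need to be scored.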
`
[0037] Based on the spatial relationship between the discrete images 35, the images 35 are mapped onto respective regions of a smooth surface such as a sphere or cylinder. The regions of overlap 33 are blended in the surface mapping. Depending on the geometry of the surface used, pixels in the discrete images 35 must be repositioned relative to one another in order to produce a two-dimensional pixel-map of the panoramic image 41. For example, if the discrete images 35 are mapped onto a cylinder 37 to produce the panoramic image 41, then horizontal lines in the discrete images 35 will become curved when mapped onto the cylinder 37, with the degree of curvature being determined by the latitude of the horizontal lines above the cylindrical equator. Thus, stitching the discrete images 35 together to generate a panoramic image 41 typically involves mathematical transformation of pixels to produce a panoramic image 41 that can be rendered without distortion.
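
The curvature of mapped horizontal lines can be made concrete with the standard plane-to-cylinder projection. A sketch under the assumption of a cylinder whose radius equals the focal length f (in pixels); this is textbook cylindrical warping, not a formula quoted from the disclosure:

```python
import math

def plane_to_cylinder(x: float, y: float, f: float) -> tuple:
    """Map a point on a flat image plane (origin at the image center,
    focal length f in pixels) onto an unrolled cylinder of radius f.
    Horizontal lines away from the equator bow into curves, which is
    why pixels must be repositioned when stitching onto a cylinder."""
    theta = math.atan2(x, f)      # azimuth along the cylinder
    h = f * y / math.hypot(x, f)  # height on the cylinder surface
    return f * theta, h

# A horizontal line (constant y = 100) curves: its mapped height
# shrinks toward the left and right edges of the image.
f = 500.0
heights = [plane_to_cylinder(x, 100.0, f)[1] for x in (-400.0, 0.0, 400.0)]
print([round(h, 1) for h in heights])  # [78.1, 100.0, 78.1]
```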
`
`[0038] FIG. 3 illustrates the use of the VR camera 12 to
`generate a composite image of a surface 55 that is too
`detailed to be adequately represented in a single photograph.
`Examples of such surfaces include a white-board having
`notes on it, a painting, an inscribed monument (e.g., the Viet
`Nam War Memorial), and so forth.
`
[0039] As indicated in FIG. 3, multiple discrete images 57 of the surface 55 are obtained by translating the VR camera 12 between a series of positions and capturing a portion of the surface 55 at each position. According to one embodiment of the present invention, the position of the VR camera 12 is obtained from the position sensing portion of the O/P sensor (element 21 of FIG. 1) and recorded for each discrete image 57. This allows the spatial relationship between the discrete images 57 to be determined no matter the order in which the images 57 are obtained. Consequently, the VR camera is able to generate an accurate composite image 59 of the complete surface 55 regardless of the order in which the discrete images 57 are captured. In the case of a still-image camera, the position sensor can be used to signal the user when the VR camera 12 has been sufficiently translated to take a new photograph. Alternatively, the VR camera may be user-configured to automatically snap photographs as the VR camera 12 is swept across the surface 55. In the case of a video camera, the position sensor can be used to control when each new video frame is generated, or video frames may be generated at the standard rate and then blended or discarded based on position information associated with each.
`
[0040] After two or more of the discrete images 57 have been stored in the memory of the VR camera 12, program code can be executed to combine the images into a composite image 59 based on the position information recorded for each discrete image 57, or based on common features in overlapping regions of the discrete images 57, or both. After the discrete images 57 have been combined into a composite image 59, the user may view different portions of the composite image 59 on the VR camera's display by changing the orientation of the VR camera 12 or by using controls on a user input panel. By zooming in on a selected portion of the image, text on a white-board, artwork detail, inscriptions on a monument, etc. may be easily viewed. Thus, the VR camera 12 provides a simple and powerful way to digitize and render high resolution surfaces with a lower resolution camera. Composite images of such surfaces are referred to herein as "planar composite images", to distinguish them from panoramic images.
`
[0041] FIG. 4 illustrates yet another application of the VR camera. In this case the VR camera is used to combine images into an object image 67. An object image is a set of discrete images that are spatially related to one another, but which have not been stitched together to form a composite image. The combination of images into an object image is accomplished by providing information indicating the location of the discrete images relative to one another, and not by creating a separate composite image.
`
`[0042] As shown in FIG. 4, images of an object 61 are
`captured from surrounding points of view 63. Though not
`shown in the plan view of the object 61, the VR camera may
`also be moved over or under the object 61, or may be raised
`or tilted to capture images of the object 61 at different
`heights. For example, the first floor of a multiple-story
`building could be captured in one sequence of video frames
`(or photographs), the second floor in a second sequence of
`video frames, and so forth. If the VR camera is maintained
`at an approximately fixed distance from the object 61, the
`orientation of the VR camera alone may be recorded to
`establish the spatial relationship between the discrete images
`65. If the object is filmed (or photographed) from positions
`that are not equidistant to the object 61, it may be necessary
`to record both the position and orientation of the VR camera
for each discrete image 65 in order to produce a coherent
object image 67.
`
[0043] After two or more discrete images 65 of object 61 have been obtained, they can be combined based upon the spatial relationship between them to form an object image 67. As stated above, combining the discrete images 65 to form an object image 67 typically does not involve stitching the discrete images 65 and is instead accomplished by associating with each of the discrete images 65 information that indicates the image's spatial location in the object image 67 relative to other images in the object image 67. This can be accomplished, for example, by generating a data structure having one member for each discrete image 65 which indicates neighboring images and their angular or positional proximity. Once the object image 67 is created, the user can pan through the images 65 by changing the orientation of the camera. Incremental changes in orientation can be used to select an image in the object image 67 that neighbors a previously displayed image. To the user, rendering of the object image 67 in this manner provides a sense of moving around, over and under the object of interest.
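
The neighbor-linked data structure suggested above might look like the following minimal sketch, where image names, link directions and angles are illustrative assumptions:

```python
# One member per discrete image: its recorded yaw and links to the
# neighboring images reachable by an incremental pan in each direction.
object_image = {
    "img_a": {"yaw": 0.0,  "left": "img_c", "right": "img_b"},
    "img_b": {"yaw": 30.0, "left": "img_a", "right": "img_c"},
    "img_c": {"yaw": 60.0, "left": "img_b", "right": "img_a"},
}

def step(current: str, direction: str) -> str:
    """An incremental change in camera orientation selects the image that
    neighbors the previously displayed one, giving the user a sense of
    moving around the object."""
    return object_image[current][direction]

print(step("img_a", "right"))  # img_b
print(step("img_b", "right"))  # img_c
```

A capture rig that also moves over and under the object would add corresponding "up" and "down" links per member.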
`
[0044] According to another embodiment of the present invention, the relative spatial location of each image in the object image 67 is provided by creating a data structure containing the camera orientation information recorded for each discrete image 65. To select a particular image in the object image 67, the user orients the VR camera in the direction that was used to capture the image. The VR camera's processor detects the orientation via the orientation sensor, and then searches the data structure to identify the discrete image 65 having a recorded orientation most nearly matching the input orientation. The identified image 65 is then displayed on the VR camera's display.
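
For yaw alone, the search for the most nearly matching recorded orientation reduces to a minimum-angular-distance lookup. An illustrative sketch (names and angles are assumptions; a full version would compare pitch and roll as well):

```python
def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two yaw angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def select_image(recorded: dict, query_yaw: float) -> str:
    """Search the per-image orientation records for the discrete image
    whose recorded yaw most nearly matches the input orientation."""
    return min(recorded, key=lambda name: angular_distance(recorded[name], query_yaw))

# Recorded capture orientations (illustrative names and angles).
recorded = {"front": 0.0, "side": 90.0, "back": 180.0}
print(select_image(recorded, 350.0))  # front (10 degrees away, wrapping)
```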
`
[0045] FIG. 5 depicts a VR camera 12 that is equipped with a number of control buttons that are included in user input panels 23a and 23b. The buttons provided in user input panel 23a vary depending on whether VR camera 12 is a video camera or a still-image camera. For example, in a still-image camera, panel 23a may include shutter speed and aperture control buttons, among others, to manage the quality of the photographed image. In a video camera, user input panel 23a may include, for example, zoom and focus controls. User input panel 23a may also include mode control buttons to allow a user to select certain modes and options associated with creating and rendering virtual reality images. In one embodiment, for example, mode control buttons may be used to select a panoramic image capture mode, planar composite image capture mode or object image capture mode. Generally, any feature of the VR camera that can be selected, enabled or disabled may be controlled using the mode control buttons.
`
[0046] According to one embodiment of the present invention, view control buttons Right/Left, Up/Down and Zoom are provided in user input panel 23b to allow the user to select which portion of a panoramic image, planar composite image, object image or other multiple-view image is presented on display 27. When the user presses the Right button, for example, view control logic in the camera detects the input and causes the displayed view of a composite image or object image to pan right. When the user presses the Zoom+ button, the view control logic causes the displayed image to be magnified. The view control logic may be implemented by a programmed processor (e.g., element 19 of FIG. 1), or by dedicated hardware. In one embodiment of the present invention, the view control logic will respond either to user input via panel 23b or to changes in camera orientation. Alternatively, the camera may be configured such that in one mode, view control is achieved by changing the VR camera orientation, and in another mode, view control is achieved via the user input panel 23b. In both cases, the user is provided with alternate ways to select a view of a multiple-view image.
`
[0047] FIG. 6 illustrates yet another application of the VR camera 12 of the present invention. In this application, a video signal captured via the IAU (element 17 of FIG. 1) is superimposed on a previously recorded scene using a chroma-key color replacement technique. For example, an individual 83 standing in front of a blue background 82 may be recorded using the VR camera 12 to generate a live video signal. Program code in the VR camera 12 may then be executed to implement an overlay function that replaces pixels in a displayed scene with non-blue pixels from the live video. The effect is to place the subject 83 of the live video in the previously generated scene. According to one embodiment of the present invention, the user may pan about a panoramic image on display 27 to locate a portion of the image into which the live video is to be inserted, then snap the overlaid subject of the video image into the scene. In effect, the later received image is made part of the earlier recorded panoramic image (or other mult