`© Society for Imaging Science and Technology 2007
`
`Analysis of the Influence of Vertical Disparities Arising
`in Toed-in Stereoscopic Cameras
`
`Robert S. Allison
`Department of Computer Science and Centre for Vision Research, York University,
`4700 Keele St., Toronto, Ontario M3J 1P3, Canada
`E-mail: allison@cs.yorku.ca
`
Abstract. A basic task in the construction and use of a stereoscopic camera and display system is the alignment of the left and right images appropriately, a task generally referred to as camera convergence. Convergence of the real or virtual stereoscopic cameras can shift the range of portrayed depth to improve visual comfort, can adjust the disparity of targets to bring them nearer to the screen and reduce accommodation-vergence conflict, or can bring objects of interest into the binocular field of view. Although camera convergence is acknowledged as a useful function, there has been considerable debate over the transformation required. It is well known that rotational camera convergence or "toe-in" distorts the images in the two cameras producing patterns of horizontal and vertical disparities that can cause problems with fusion of the stereoscopic imagery. Behaviorally, similar retinal vertical disparity patterns are known to correlate with viewing distance and strongly affect perception of stereoscopic shape and depth. There has been little analysis of the implications of recent findings on vertical disparity processing for the design of stereoscopic camera and display systems. I ask how such distortions caused by camera convergence affect the ability to fuse and perceive stereoscopic images. © 2007 Society for Imaging Science and Technology.
[DOI: 10.2352/J.ImagingSci.Technol.(2007)51:4(317)]
`
`INTRODUCTION
In many stereoscopic viewing situations it is necessary to adjust the screen disparity of the displayed images for viewer comfort, to optimize depth perception or to otherwise enhance the stereoscopic experience. Convergence of the real or virtual cameras is an effective means of adjusting portrayed disparities. A long-standing question in the stereoscopic imaging and display literature is what is the best method to converge the cameras? Humans use rotational movements to binocularly align the visual axes of their eyes on targets of interest. Similarly, one of the easiest ways to converge the cameras is to pan them in opposite directions to "toe-in" the cameras. However, convergence through camera toe-in has side effects that can lead to undesirable distortions of stereoscopic depth.1,2 In this paper we reanalyze these geometric distortions of stereoscopic space in the context of recent findings on the role of vertical disparities in stereoscopic space perception. We focus on a number of issues related to converged cameras and the mode of convergence: the effect of rectification; relation between the geometry of the imaging device and the display device; fused and augmented displays; orthostereoscopy; the relation between parallax distortions in the display and the resulting retinal disparity; and the effect of these toe-in induced retinal disparities on depth perception and binocular fusion.

Received Dec. 5, 2006; accepted for publication Mar. 7, 2007.
1062-3701/2007/51(4)/317/11/$20.00.

Our interests lie in augmented-reality applications and stereoscopic heads for tele-operation applications. In these systems a focus is on the match and registration between the stereoscopic imagery and the "real world" so we will concentrate on orthostereoscopic or near orthostereoscopic configurations. These configurations have well known limitations for applications such as visualization and cinema, and other configurations may result in displays that are more pleasing and easier to fuse. However, it is important to note that our basic analysis generalizes to other configurations, and we will discuss other viewing arrangements when appropriate.3,4 In a projector-based display system with separate right and left projectors, or in a binocular head mounted display (HMD) with independent left and right displays, the displays/projectors can also be converged mechanically or optically. In this paper we will also assume a single flat, fronto-parallel display (i.e., a monitor or projector display) so that the convergence of the projectors is not an issue. Since the left and right images are projected or displayed into the same plane we will refer to these configurations as a "parallel display." In most cases similar considerations will apply for a HMD with parallel left and right displays.
`
`OPTIONS FOR CAMERA CONVERGENCE
We use the term convergence here to refer to a variety of means of realigning one stereoscopic half-image with respect to the other, including toe-in (or rotational) convergence and translational image shift.
Convergence can shift the range of portrayed depth to improve visual comfort and composition. Looking at objects presented stereoscopically further or nearer than the screen causes a disruption of the normal synergy between vergence and accommodation in most displays. Normally accommodation and vergence covary but, in a stereoscopic display, the eyes should remain focused at the screen regardless of disparity. The accommodation-vergence conflict can cause visual stress and disrupt binocular vision.5 Convergence of the cameras can be used to adjust the disparity of targets of interest to bring them nearer to the screen and reduce this conflict.
`
`Legend3D, Inc.
`Exhibit 1012-0001
`
`
`
`Allison: Analysis of the influence of vertical disparities arising in toed-in stereoscopic cameras
`
Table I. Typical convergence for stereoscopic sensors and displays. "Natural" modes of convergence are shown in bold.

DISPLAY/SENSOR | REAL OR VIRTUAL CAMERA CONVERGENCE
GEOMETRY       | Translation                                  | Rotation
---------------+----------------------------------------------+----------------------------------------------
Flat           | Horizontal Image Translation                 | Toed-in camera, toed-in projector combination
               | Differential translation of computer         | Toed-in stereoscopic camera or robot head
               |   graphics images                            |
               | Image sensor shift                           |
               | Variable baseline camera                     |
Spherical      | Human viewing of planar stereoscopic         | Haploscope
               |   displays?                                  | Human physiological vergence

`
Convergence can also be used to shift the range of portrayed depth. For example, it is often preferable to portray stereoscopic imagery in the space behind rather than in front of the display. With convergence a user can shift stereoscopic imagery to appear "inside" the display and reduce interposition errors between the stereoscopic imagery and the edges of the displays.
Cameras used in stereoscopic imagers have limited field of view and convergence can be used to bring objects of interest into the binocular field of view.
Finally, convergence or more appropriately translation of the stereoscopic cameras can also be used to adjust for differences in a user's interpupillary distance. The latter transformation is not typically called convergence since the stereoscopic baseline is not maintained.
In choosing a method of convergence there are several issues one needs to consider. What type of 2D image transformation is most natural for the imaging geometry? Can a 3D movement of the imaging device accomplish this transformation? In a system consisting of separate acquisition and display systems is convergence best achieved by changing the imaging configuration and/or by transforming the images (or projector configuration) prior to display? If an unnatural convergence technique must be used, what is the impact on stereoscopic depth perception?
Although camera convergence is acknowledged as a useful function, there has been considerable debate over the correct transformation required. Since the eyes (and the cameras in imaging applications) are separated laterally, convergence needs to be an opposite horizontal shift of left and right eyes images on the sensor surface or, equivalently, on the display. The most appropriate type of transformation to accomplish this 2D shift (rotation or translation) depends on the geometry of the imaging and display devices. We agree with the view that the transformation should reflect the geometry of the display and imaging devices in order to minimize distortion (see Table I). One could argue that a "pure" vergence movement should affect the disparity of all objects equally, resulting in a change in mean disparity over the entire image without any change in relative disparity between points.
`
For example, consider a spherical imaging device such as the human eye where expressing disparity in terms of visual angle is a natural coding scheme. A rotational movement about the optical centre of the eye would scan an image over the retina without distorting the angular relationships within the image. Thus the natural convergence movement with such an imaging device is a differential rotation of the two eyes, as occurs in physiological convergence (although freedom to choose various spherical coordinate systems complicates the definition of disparity).
A flat sensor is the limiting form of spherical sensor with an infinite radius of curvature, and thus the rotation of the sensor becomes a translation parallel to the sensor plane. For displays that rely on projection onto a single flat, fronto-parallel display surface (many stereoscopic displays with the notable exception of some head-mounted displays and haploscopic systems) depth differences should be represented as linear horizontal disparities in the image plane. The natural convergence movement is a differential horizontal shift of the images in the plane of the display. Acquisition systems with parallel cameras are well-matched to such display geometry since a translation on the display corresponds to a translation in the sensor plane. This model of parallel cameras is typically used for the virtual cameras in stereoscopic computer graphics7 and the real cameras in many stereoscopic camera setups.
Thus horizontal image translation of the images on the display is the preferred minimal distortion method to shift convergence in a stereoscopic rig with parallel cameras when presented on a parallel display. This analysis corresponds to current conventional wisdom. If the stereo baseline is to be maintained then this vergence movement is a horizontal translation of the images obtained from the parallel cameras rather than a translation of the cameras themselves. For example, in computer-generated displays, the left and right half images can be shifted in opposite directions on the display surface to shift portrayed depth with respect to the screen. With real camera images, a problem with shifting the displayed images to accomplish convergence is that in doing so, part of each half-image is shifted off of the display resulting in a smaller stereoscopic image.
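As a rough illustration of the cropping cost of horizontal image translation, the following Python sketch shifts two half-images in opposite directions and keeps only the columns where both remain on the display. The function name `horizontal_image_translation`, the list-of-rows image format and the sign convention (a positive `shift` reduces right-minus-left parallax) are assumptions made for this example and are not taken from the paper.

```python
def horizontal_image_translation(left, right, shift):
    """Converge a parallel-camera stereo pair by opposite horizontal shifts.

    left, right: half-images as lists of rows of equal width (hypothetical format).
    shift: whole pixels each half-image moves; a positive value moves the left
           half-image rightward and the right half-image leftward.
    Returns both half-images cropped to the columns that stay on the display.
    """
    width = len(left[0])
    s = abs(shift)
    if 2 * s >= width:
        raise ValueError("shift leaves no overlapping columns")
    keep = width - 2 * s  # the shared region is 2*s columns narrower
    if shift >= 0:
        left_start, right_start = 0, 2 * s
    else:
        left_start, right_start = 2 * s, 0
    left_out = [row[left_start:left_start + keep] for row in left]
    right_out = [row[right_start:right_start + keep] for row in right]
    return left_out, right_out


# One-row example: a feature (value 9) with 2 px of uncrossed parallax.
left = [[0, 0, 0, 9, 0, 0, 0, 0]]
right = [[0, 0, 0, 0, 0, 9, 0, 0]]
left_c, right_c = horizontal_image_translation(left, right, 1)
parallax = right_c[0].index(9) - left_c[0].index(9)
```

In this toy case a one-pixel shift narrows the 8-pixel-wide pair to 6 pixels and brings the feature to zero parallax. Every parallax changes by the same amount, so differences between features are unaffected.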
An alternative is to shift the imaging device (e.g., CCD array) behind the camera lens, with opposite sign of shift in the two cameras forming the stereo rig. This avoids some of the problems associated with rotational convergence discussed below. Implementing a large, variable range of convergence with mechanical movements or selection of subarrays from a large CCD can be complicated. Furthermore, many lenses have significant radial distortion and translating the center of the imaging device away from the optical axis increases the amount of radial distortion. Worse, for matched lenses the distortions introduced in each sensor image will be opposite if the sensors are shifted in opposite directions. This leads to increased disparity distortion. Toed-in cameras can center the image on the optical axis and reduce this particular problem.
J. Imaging Sci. Technol. 51(4)/Jul.-Aug. 2007

Figure 1. A plan view of an array of points located in the X-Z plane at eye level (horizontal axis: X Position (m)). The solid dots show the true position of the points and also their reconstruction based on images from a parallel camera orthostereoscopic rig presented at a 0.7 m viewing distance. The open diamond shaped markers show the reconstructed position of the points in the array when the cameras are converged using horizontal image translation (HIT). As predicted the points that are truly at 1.1 m move in to appear near the screen distance of 0.7 m. Also depth and size should appear scaled appropriately for the nearer distance. But notice that depth ordering and planarity are maintained. Circles at a distance of zero denote the positions of the eyes.

If we converge nearer than infinity using horizontal image shift, then far objects should be brought toward the plane of the screen. With convergence via horizontal image shift, a frontal plane at the camera convergence distance should appear flat and at the screen distance. However, depth for a given retinal disparity increases approximately with the square of distance. Thus if the cameras are converged at a distance other than the screen distance to bring a farther (or nearer) target toward the screen, then the depth in the scene should be distorted nonlinearly but depth ordering and planarity are maintained (Figure 1). This apparent depth distortion is predicted for both the parallel and toed-in configurations. In the toed-in case it would be added to the curvature effects discussed below. Similar arguments can be made for size distortions in the image (or equivalently the apparent spacing of the dots in Fig. 1). See Woods1 and Diner and Fender2 for an extended discussion of these distortions.
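The geometric prediction for convergence by horizontal image translation can be sketched numerically. The Python function below is a minimal sketch under stated assumptions: parallel cameras whose baseline equals the viewer's interocular distance e, images scaled so that a unit-distance image plane maps onto a screen at distance D, half-images shifted so that a point at distance F has zero parallax, and a predicted position taken as the intersection of the two lines of sight. The name `hit_reconstruct` and the default values (e = 62.5 mm, D = 0.7 m, F = 1.1 m) are illustrative choices, not values prescribed by the text.

```python
def hit_reconstruct(X, Z, e=0.0625, D=0.7, F=1.1):
    """Geometrically predicted (X_d, Z_d), in metres, of a scene point (X, Z).

    Assumed setup: parallel cameras with baseline e, screen at distance D,
    half-images shifted so a point at distance F has zero screen parallax.
    """
    hit = D * e / (2.0 * F)               # shift applied to each half-image
    U_l = D * (X + e / 2.0) / Z - hit     # left-eye screen position
    U_r = D * (X - e / 2.0) / Z + hit     # right-eye screen position
    parallax = U_r - U_l                  # equals D * e * (1/F - 1/Z)
    # Intersect the lines of sight from eyes at x = -e/2 and x = +e/2.
    Z_d = e * D / (e - parallax)
    X_d = e * (U_l + U_r) / (2.0 * (e - parallax))
    return X_d, Z_d


at_convergence = hit_reconstruct(0.0, 1.1)   # point at the convergence distance
farther = hit_reconstruct(0.0, 1.5)          # point 0.4 m beyond it
```

Under these assumptions a point at the 1.1 m convergence distance is predicted at the 0.7 m screen distance. Z_d depends only on Z, so frontal planes stay planar, and it increases monotonically with Z, so depth order is kept. The 0.4 m separation between 1.1 m and 1.5 m shrinks to roughly 0.14 m. Setting F equal to D recovers the true position.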
It is important to note that these effects are predicted from the geometry and do not always correspond to human perception. Percepts of stereoscopic space tend to deviate from the geometric predictions based on the Keplerian projections and Euclidean geometry.6 Vergence on its own is not a strong cue to distance and other depth cues in the display besides horizontal disparity can affect the interpretation of stereoscopic displays. For example, it has been known for over 100 years that observers can use vertical disparities in the stereoscopic images to obtain more veridical estimates of stereoscopic form.8 In recent years, a role for vertical disparities in human stereoscopic depth perception has been confirmed.9,10
Translation of the images on the display or of the sensors behind the lenses maintains the stereoscopic camera baseline and hence the relative disparities in the acquired or simulated image. Shifting of the images can be used to shift this disparity range to be centered on the display to ease viewing comfort. However, in many applications this disparity range is excessive and other techniques may be more suitable. Laterally shifting the cameras toward or away from each other increases or decreases the range of disparities corresponding to a given scene. Control of the stereo rig baseline serves a complementary function to convergence by adjusting the "gain" of stereopsis instead of simply the mean disparity. This function is often very useful for mapping a depth range to a useful or comfortable disparity range in applications such as computer graphics,4,11 photogrammetry, etc.
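The complementary roles of baseline and image shift can be illustrated with a short Python sketch. It assumes ideal parallel pinhole cameras with unit focal length, for which the right-minus-left image disparity of a point at distance Z is -a/Z. The function name `parallax_range` and its parameters are hypothetical and introduced only for this example.

```python
def parallax_range(a, z_near, z_far, M=1.0, shift=0.0):
    """Screen parallax (right minus left) at two depths for parallel cameras.

    a: camera baseline; z_near, z_far: scene distances; M: magnification from
    the unit-focal-length image to the screen; shift: total horizontal image
    translation added to every parallax. All names are illustrative.
    Returns (parallax at z_near, parallax at z_far, range between them).
    """
    p_near = -M * a / z_near + shift
    p_far = -M * a / z_far + shift
    return p_near, p_far, p_far - p_near


base = parallax_range(0.065, 0.5, 2.0)
wide = parallax_range(0.130, 0.5, 2.0)                 # doubled baseline
shifted = parallax_range(0.065, 0.5, 2.0, shift=0.02)  # image translation only
```

In this sketch doubling the baseline doubles the parallax range (the "gain"), whereas a horizontal image translation moves every parallax by the same amount and leaves the range unchanged.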
In augmented reality or other enhanced vision systems that fuse stereoscopic imagery with direct views of the world (or with displays from other stereoscopic image sources), orthostereoscopic configurations (or at least consistent views) are important. In these systems, proper convergence of the camera systems and calibration of image geometry is required so that objects in the display have appropriate disparity relative to their real world counterparts. A parallel camera orthostereoscopic configuration presents true disparities to the user if presented on a parallel display. Thus, geometrically at least, we should expect to see true depth. In practice this seldom occurs because of the influence of other depth cues (accommodation-vergence conflict, changes in effective interpupillary distance with eye movements, flatness cues corresponding to viewing a flat display, etc.).
In summary, an orthostereoscopic parallel-camera/parallel-display configuration can present accurate disparities to the user.1,7 On parallel displays, convergence by horizontal shift of the images obtained from parallel cameras introduces no distortion of horizontal or vertical screen disparity (parallax). Essentially, convergence by this method brings the two half images into register without changing relative disparity. This can reduce vergence-accommodation conflict and improve the ability to fuse the imagery. Geometrically, one would predict effects on perceived depth: the apparent depth of imagery with respect to the screen and the depth scaling in the image are affected by the simulated vergence.1,13 However, this amounts to a relief transformation implying that depth ordering and coplanarity should be maintained.2,10
`
`CAMERA TOE-IN
While horizontal image translation is attractive theoretically, there are often practical considerations that limit use of the method and make rotational convergence attractive. For example, with a limited camera field of view and a nonzero stereo baseline there exists a region of space near to the cameras that cannot be seen by one or both cameras. In some applications such as landscape photography this region of space may be irrelevant; in other applications such as augmented reality or stereoscopic robot heads this may correspond to a crucial part of the normal working range (see Figure 2). Rotational convergence of the cameras can increase the near working space of the system and center the target in the camera images.14 Other motivations for rotational convergence include the desire to center the target on the camera optics (e.g., to minimize camera distortion) and the relative simplicity and large range of motion possible with rotational mechanisms. Given that rotational convergence of stereo cameras is often implemented in practice, we ask what effects the distortions produced by these movements have on the perception of stereoscopic displays?

Figure 2. (a) The Toronto IRIS Stereoscopic Head 2 (TRISH II), an example of a robot head built for a wide range of working distances. With such a system, a wide range of camera convergence is required to bring objects of interest into view of the cameras. With off-the-shelf cameras this can be most conveniently achieved with camera toe-in. (b) A hypothetical stereo rig with camera field of view θ. Objects in near working space are out of the binocular field of view which is indicated by the cross hatch pattern.

It is well known that the toed-in configuration distorts the images in the two cameras producing patterns of horizontal and vertical screen disparities (parallax). Geometrically, deviations from the parallel-camera configuration may result in spatial distortion unless compensating transformations are introduced mechanically, optically or electronically in the displayed images,2,12 for example unless a pair of projectors (or HMD with separate left and right displays) with matched convergence or a parallel display with special distortion correction techniques are used.15,16 For the rest of this paper we will assume a single projector or display system (parallel display) and a dual sensor system with parallel or toed-in cameras.
The effects of the horizontal disparities have been well described in the literature and we review them before turning to the vertical disparities in the next section. The depth distortions due to the horizontal disparities introduced can be estimated geometrically.1 The geometry of the situation is illustrated in Figure 3. The imaging space world coordinate system is centered between the cameras, a is the intercamera distance and the angle of convergence is β (using the conventional stereoscopic camera measure of convergence rather than the physiological one).
`
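Let us assume the cameras converge symmetrically at point C located at distance F. A local coordinate system is attached to each camera and rotated ±β about the y axis with respect to the imaging space world coordinate system. The coordinates of a point P = [X Y Z]^T in the left and right cameras are

$$
\begin{bmatrix} X_l \\ Y_l \\ Z_l \end{bmatrix} =
\begin{bmatrix}
\left(X + \frac{a}{2}\right)\cos(\beta) - Z\sin(\beta) \\
Y \\
Z\cos(\beta) + \left(X + \frac{a}{2}\right)\sin(\beta)
\end{bmatrix},
\qquad
\begin{bmatrix} X_r \\ Y_r \\ Z_r \end{bmatrix} =
\begin{bmatrix}
\left(X - \frac{a}{2}\right)\cos(\beta) + Z\sin(\beta) \\
Y \\
Z\cos(\beta) - \left(X - \frac{a}{2}\right)\sin(\beta)
\end{bmatrix}
\tag{1}
$$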
`
After perspective projection onto the converged CCD array (coordinate frame u-v centered on the optic axis and letting f = 1.0) we get the following image coordinates for the left, [u_l, v_l]^T, and right, [u_r, v_r]^T, arrays:

$$
\begin{bmatrix} u_l \\ v_l \end{bmatrix} =
\begin{bmatrix} X_l / Z_l \\ Y_l / Z_l \end{bmatrix} =
\begin{bmatrix}
\dfrac{\left(X + \frac{a}{2}\right)\cos(\beta) - Z\sin(\beta)}{Z\cos(\beta) + \left(X + \frac{a}{2}\right)\sin(\beta)} \\[2ex]
\dfrac{Y}{Z\cos(\beta) + \left(X + \frac{a}{2}\right)\sin(\beta)}
\end{bmatrix},
\qquad
\begin{bmatrix} u_r \\ v_r \end{bmatrix} =
\begin{bmatrix} X_r / Z_r \\ Y_r / Z_r \end{bmatrix} =
\begin{bmatrix}
\dfrac{\left(X - \frac{a}{2}\right)\cos(\beta) + Z\sin(\beta)}{Z\cos(\beta) - \left(X - \frac{a}{2}\right)\sin(\beta)} \\[2ex]
\dfrac{Y}{Z\cos(\beta) - \left(X - \frac{a}{2}\right)\sin(\beta)}
\end{bmatrix}
\tag{2}
$$

Figure 3. Imaging and display geometry for symmetric toe-in convergence at point C and viewing at distance D (plan view).

Figure 4. Keystone distortion due to toe-in. (a) Left (+) and right (×) images for a regularly spaced grid of points with the stereo camera converged (toed-in) on the grid. (b) Corresponding disparity vectors comparing left eye with right eye views demonstrate both horizontal and vertical components of the keystone distortion.

`
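The CCD image is then reprojected onto the display screen. We assume a single display/projector model with central projection and a magnification of M with respect to the CCD sensor image resulting in the following screen coordinates for the point in the left, [U_l, V_l]^T, and right, [U_r, V_r]^T, eye images:

$$
\begin{bmatrix} U_l \\ V_l \end{bmatrix} = M \begin{bmatrix} u_l \\ v_l \end{bmatrix},
\qquad
\begin{bmatrix} U_r \\ V_r \end{bmatrix} = M \begin{bmatrix} u_r \\ v_r \end{bmatrix}
\tag{3}
$$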
`
Toeing-in the stereoscopic rig to converge on a surface centers the images of the target in the two cameras but also introduces a keystone distortion due to the differential perspective (Figure 4). In contrast convergence by shifting the CCD sensor behind the camera lens (or shifting the half images on the display) changes the mean horizontal disparity but does not entail keystone distortion. For a given focal length and camera separation, the extent of the keystone distortion is a function of the convergence distance and not the distance of the target.
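The projection equations above can be evaluated directly to see the keystone pattern. The Python sketch below follows Eqs. (1) and (2) with f = 1.0 and treats the display step as a simple scaling by M. It then estimates the predicted distance of a point from its horizontal parallax by intersecting the two lines of sight of a viewer with interocular distance e seated at distance D from the screen. The function names, the choice M = D and the sample values (a = e = 62.5 mm, F = D = 0.70 m) are assumptions made for illustration.

```python
import math


def toe_in_angle(a, F):
    """Rotation of each camera needed to converge symmetrically at distance F."""
    return math.atan2(a / 2.0, F)


def screen_coords(X, Y, Z, a, beta, M):
    """Screen coordinates (U_l, V_l, U_r, V_r) of the scene point (X, Y, Z).

    Follows Eqs. (1) and (2) with f = 1.0; scaling by M is an assumed display model.
    """
    c, s = math.cos(beta), math.sin(beta)
    Z_l = Z * c + (X + a / 2.0) * s
    Z_r = Z * c - (X - a / 2.0) * s
    u_l = ((X + a / 2.0) * c - Z * s) / Z_l
    u_r = ((X - a / 2.0) * c + Z * s) / Z_r
    v_l = Y / Z_l
    v_r = Y / Z_r
    return M * u_l, M * v_l, M * u_r, M * v_r


def predicted_depth(U_l, U_r, e, D):
    """Distance at which the two lines of sight intersect (horizontal parallax only)."""
    return e * D / (e - (U_r - U_l))


a = e = 0.0625          # camera baseline and interocular distance (m)
F = D = 0.70            # convergence and viewing distance (m)
beta = toe_in_angle(a, F)
M = D                   # hypothetical: unit-distance image plane scaled to the screen

centre = screen_coords(0.0, 0.0, F, a, beta, M)
corner = screen_coords(0.3, 0.2, F, a, beta, M)
vertical_parallax = corner[3] - corner[1]
depth_centre = predicted_depth(centre[0], centre[2], e, D)
depth_corner = predicted_depth(corner[0], corner[2], e, D)
```

With these values the centre of a frontal plane at the convergence distance is predicted at the screen distance. The corner point acquires nonzero vertical parallax and is predicted farther away (roughly 0.86 m). Setting `beta` to zero removes the vertical parallax entirely.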
To see how the keystoning affects depth perception, assume the images are projected onto a screen at distance D and viewed by a viewer with interocular distance of e. If the magnification from the CCD sensor array to screen image is M and both images are centered on the display then the geometrically predicted coordinates of the point in display space are (after Ref. 1)

$$
P_d = \begin{bmatrix} X_d \\ Y_d \\ Z_d \end{bmatrix} =
\begin{bmatrix}
\dfrac{e\,(U_l + U_r)}{2\,[e - (U_r - U_l)]} \\[2ex]
\dfrac{e\,(V_l + V_r)}{2\,[e - (U_r - U_l)]} \\[2ex]
\dfrac{e\,D}{e - (U_r - U_l)}
\end{bmatrix}
\tag{4}
$$

where (U_r - U_l) is the horizontal screen parallax of the point.
If we ignore vertical disparities for the moment, converging the camera causes changes in the geometrically predicted depth. For instance, if the cameras toe-in to converge on a frontoparallel surface (parallel to the stereobaseline), then from geometric considerations the center of the object should appear at the screen distance but the surface should appear curved (Figure 5). This curvature should be especially apparent in the presence of undistorted stereoscopic reference imagery as would occur in augmented reality applications.16 In contrast, if convergence is accomplished via horizontal image translation then a frontal plane at the camera convergence distance should appear flat and at the screen distance although depth and size will be scaled as discussed in the previous section.

Figure 5. Geometrically predicted perception (curved grid) of displayed images taken from a toed-in stereoscopic camera rig converged on a fronto-parallel grid made with 10 cm spacing (asterisks) based on horizontal disparities (associated size distortion not shown). Camera convergence distance (F) and display viewing distance (D) are 0.70 m (e = a = 62.5 mm; f = 6.5 mm; see Fig. 3 and text for definitions). The icon at the bottom of the figure indicates the position of the world coordinate frame and the eyeballs.

USE OF VERTICAL DISPARITY IN STEREOPSIS
The pattern of vertical disparities in a stereoscopic image depends on the geometry of the stereoscopic rig. With our spherical retinas disparity is best defined in terms of visual angle. An object that is located eccentric to the median plane of the head is closer to one eye than the other (Figure 6). Hence, it subtends a larger angle at the nearer eye than at the further. The vertical size ratio (VSR) between the images of an object in the two eyes varies as a function of the object's eccentricity with respect to the head. Figure 6 also shows the variation of the vertical size ratio of the right eye image to the left eye image for a range of eccentricities and distances.
It is evident that, for centrally located targets, the gradient of vertical size ratios varies with distance of the surface from the head. This is relatively independent of the vergence state of the eyes and the local depth structure.17 Howard18 turned this relationship around and suggested that people could judge the distance of surfaces from the gradient of the VSR. Gillam and Lawergren19 proposed a computational model for the recovery of surface distance and eccentricity based upon processing of VSR and VSR gradients. An alternative computational framework10,20 uses vertical disparities to calculate the convergence posture and gaze eccentricity of the eyes rather than the distance and eccentricity of a target surface. For our purposes, these models make the same predictions about the effects of camera toe-in. However, the latter model uses projections onto flat projection surfaces (hypothetical flat retinae) which is easier for visualization and matches well with our previous discussion of camera toe-in.
With flat imaging planes, disparities are usually measured in terms of linear displacement in the image plane. If the cameras in a stereoscopic rig are toed in (or if eyes with flat retinae are converged), then the left and right camera images have opposite keystone distortion. It is interesting to note that in contrast to the angular disparity case the gradients of vertical disparities are a function of camera convergence but are affected little by the distance of the surface. These vertical disparity gradients on flat cameras/retinae provide an indication of the convergence angle of the cameras and hence the distance of the fixation point.
For a pair of objects or for depth within an object, the relationship between relative depth and relative disparity is a function of distance from the observer. To an extent, the visual system is able to maintain an accurate perception of depth of an object at various distances despite disparity varying inversely with the square of the distance between the object and the observer. This "depth constancy" demonstrates an ability to account for the effects of viewing distance on stereoscopic depth. The relationship between the retinal image size of an object and its linear size in the world is also a function of distance. To the degree that vertical disparity gradients are used as an indicator of the distance of a fixated surface for three-dimensional reconstruction, toe-in produced vertical disparity gradients would be expected to indirectly affect depth and size perception. Psychophysical experiments have demonstrated that vertical disparity gradients strongly affect perception of stereoscopic shape, size and depth9,10,21 and implicate vertical disparity processing in human size and depth constancy.
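The dependence of the vertical size ratio on eccentricity and distance can be sketched numerically. The Python function below is a minimal sketch that assumes point-like eyes separated by e = 62.5 mm, a short vertical line centred at eye level, and a ratio defined as right-eye angle over left-eye angle. The name `vertical_size_ratio` and the default line height are illustrative assumptions.

```python
import math


def vertical_size_ratio(azimuth_deg, distance, e=0.0625, height=0.01):
    """Right-eye to left-eye ratio of the angle subtended by a short vertical line.

    The line is centred at eye level at the given azimuth (positive to the
    right) and distance (metres) from the point midway between the eyes.
    """
    az = math.radians(azimuth_deg)
    X, Z = distance * math.sin(az), distance * math.cos(az)
    d_left = math.hypot(X + e / 2.0, Z)    # left eye at x = -e/2
    d_right = math.hypot(X - e / 2.0, Z)   # right eye at x = +e/2
    angle_left = 2.0 * math.atan(height / (2.0 * d_left))
    angle_right = 2.0 * math.atan(height / (2.0 * d_right))
    return angle_right / angle_left


near = vertical_size_ratio(20.0, 0.30)
far = vertical_size_ratio(20.0, 0.70)
```

In this sketch the ratio is 1 on the midline, exceeds 1 for targets to the right (nearer the right eye) and changes more steeply with azimuth at nearer distances, which is the relationship that allows the VSR gradient to serve as a distance cue.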
`
`
[Figure 6: (a) plan-view diagram of a point P and the two eyes; (b) plot with horizontal axis Azimuth (degrees) and vertical axis ticks from 0.95 to 1.15.]

Figure 6. (a) A vertical line located eccentric to the midline of the head is nearer to one eye than the other. Thus it subtends a larger angle in the nearer eye than the further (adapted from Howard and Rogers6). (b) The gradient of vertical size ratio of the image of a surface element in the left eye to that in the right eye varies as a function of distance of the surface (shown as a series of lines: distances of 70, 60, 50, 40 and 30
order of ste