`
`Focal Stack Compositing for Depth of Field Control
`
`David E. Jacobs
`
`Jongmin Baek
`Stanford University∗
`
`Marc Levoy
`
`(a) Single focal stack slice
`
`(b) Reduced depth of field composite
`
`(c) Extended depth of field composite
`
`(d) Scene depth map (dark means close)
`
`(e) Defocus maps used to generate the images in (b) and (c), respectively (orange means blurry)
`
`Figure 1: Manipulating depth of field using a focal stack. (a) A single slice from a focal stack of 32 photographs, captured with a Canon 7D
`and a 28mm lens at f/4.5. The slice shown is focused 64cm away. (b) A simulated f/2.0 composite, focused at the same depth. To simulate
`the additional blur, objects closer to the camera are rendered from a slice focused afar, and objects far from the camera are rendered from
`a slice focused near. (c) An extended depth of field composite that blurs the foreground flower and is sharp for all depths beyond it. (d) A
depth map for the scene, representing depth as image intensity (dark means close). (e) A pair of defocus maps that encapsulate the requested
amount of per-pixel defocus blur used to generate the composites above. The magnitude of the defocus is encoded with saturation.
`
`Abstract
`
`
`Many cameras provide insufficient control over depth of field.
`Some have a fixed aperture; others have a variable aperture that
`is either too small or too large to produce the desired amount of
`blur. To overcome this limitation, one can capture a focal stack,
`which is a collection of images each focused at a different depth,
`then combine these slices to form a single composite that exhibits
`the desired depth of field. In this paper, we present a theory of focal
`stack compositing, and algorithms for computing images with ex-
`tended depth of field, shallower depth of field than the lens aperture
`naturally provides, or even freeform (non-physical) depth of field.
`We show that while these composites are subject to halo artifacts,
`there is a principled methodology for avoiding these artifacts—by
`feathering a slice selection map according to certain rules before
`computing the composite image.
`
CR Categories: I.4.3 [Image Processing and Computer Vision]: Enhancement—Geometric correction; I.3.3 [Computer Graphics]: Picture/Image Generation—Display algorithms
`
Keywords: Focal stack, compositing, depth of field, halo correction, geometric optics
`
`∗e-mail: {dejacobs, jbaek, levoy}@cs.stanford.edu
`
1 Introduction

Depth of field is one of the principal artistic tools available to
`photographers. Decisions about which scene elements are imaged
`sharply and which are out of focus direct a viewer’s attention and
`affect the mood of the photograph. For traditional cameras, such
`decisions are made by controlling the lens’ aperture and focus dis-
`tance. Unfortunately, many consumer cameras—including mobile
`phone cameras and compact point-and-shoot cameras—have lim-
`ited or no control over the aperture because of constraints imposed
`by portability and expense. However, nearly all cameras have focus
`controls and are capable of capturing a stack of images focused at
`different distances. This set of images is called a focal stack. As we
will demonstrate in this paper, these images can be combined to sim-
`ulate depth of field effects beyond the range normally allowable by
`the camera’s optics, including depth of field reduction, extension,
`and even freeform non-physical effects. Figure 1 shows examples
`of two of these manipulations.
`
`In focal stack compositing, each pixel of the output is a weighted
`sum of corresponding pixels in the input images—often referred
`to as focal stack “slices.” The choice of pixel weights determines
`the depth of field of the composite. Given a focal stack and user-
`specified novel camera parameters, appropriate blending weights
`can be computed via a compositing pipeline—ours is illustrated in
`Figure 2. The first step in compositing is to generate or otherwise
`
`
`
`
`
(Figure 2 diagram. Panels, left to right: the scene; the focal stack {I1, I2, I3} captured at depths {Z1, Z2, Z3} with f-number N; the scene depth map Z and per-slice defocus maps |Ci|; the requested defocus map C*; the preliminary focus distance Z̃0 and preliminary composite Ĩ0; the final focus distance Z̃, final composite Ĩ, and final defocus map |C̃|; and the ground truth image.)
`
`
`
Figure 2: Our compositing pipeline. Given the scene pictured in the upper left, we capture a stack of images {I1, I2, I3} focused at depths
{Z1, Z2, Z3} with f-number N. This set of images is called a focal stack. We feed these images into a depth extraction algorithm (ours is
described in Section 4) to generate an estimate for the distance Z between the lens and the object imaged by each pixel. For scene depth and
focus distance maps (all images labeled Z) in the diagram above, we use intensity to represent depth; white means background, and black
means foreground. Given the scene depth map Z, we calculate per-pixel the signed defocus blur Ci corresponding to each slice Ii, using
Equation (1). Above, we visualize the degree of defocus blur (in images labeled |C|) using saturation, where orange means blurry and white
means sharp. Equation (1) also allows us to compute a requested defocus map C* given a user-specified focus distance Z* and f-number
N*. Some photographic examples of C* are shown in Figure 1. In the example above, the user is requesting a reduced depth of field
composite focused at Z* = Z2 with f-number N* = N/2. We then compute a preliminary focus distance map Z̃0 that gives the depth
at which each pixel should be focused in order to achieve the requested defocus C*. For example, in order to maximally defocus the distant
red object visible in the top third of the composite, the preliminary focus distance calls for those pixels to be drawn from I3, which is
focused close to the camera. Indexing into our focal stack as described creates a preliminary composite Ĩ0 that is inexpensive to compute,
but contains halo artifacts visible near depth edges. To prevent such artifacts, we apply geometric constraints (discussed in Section 3.4) on
Z̃0 to create a smoother focus distance map Z̃. The resulting composite Ĩ is locally artifact-free, but its corresponding defocus map C̃ does
not match C* perfectly. Finally, in the bottom right we show a ground truth image for a camera with the requested parameters Z*, N*.
`
acquire a proxy for the scene geometry—in our case, we use a depth
map. Some knowledge of the scene geometry is necessary in order
to estimate the per-pixel defocus blur present in each slice of the focal
stack. Additionally, scene geometry is required to calculate the
per-pixel defocus appropriate for the synthetic image taken with a
user's requested hypothetical camera. A basic focal stack composite,
then, is given by selecting or interpolating between the slices
that match the requested defocus as closely as possible at each pixel.
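As a concrete illustration of this selection step, the following sketch (Python/NumPy; the array names and shapes are our own assumptions, and the per-slice and requested defocus maps are computed as described in Section 3) picks, for every pixel, the slice whose defocus best matches the request:

import numpy as np

def basic_composite(slices, slice_defocus, requested_defocus):
    """Pick, per pixel, the focal stack slice whose signed defocus best
    matches the requested defocus map.

    slices:            (K, H, W, 3) focal stack images
    slice_defocus:     (K, H, W)    signed defocus of each slice at each pixel
    requested_defocus: (H, W)       requested per-pixel defocus
    """
    error = np.abs(slice_defocus - requested_defocus[None])  # (K, H, W)
    best = np.argmin(error, axis=0)                          # (H, W) slice index
    rows, cols = np.indices(best.shape)
    return slices[best, rows, cols]                          # (H, W, 3) composite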
`
Certainly, one may produce similar effects without a focal stack—
using only a single photograph. First, one can reduce the depth
of field by segmenting the image into layers and convolving each
layer with a blur kernel of the appropriate size. In practice, however,
synthetic blur fails to capture subtle details that are naturally
present in photographic (physically produced) blur. In particular,
saturated image regions cannot be blurred synthetically because
their true brightness is unknown. Similarly, scene inter-reflections
and translucencies can cause a single pixel to have multiple depths;
therefore, no single convolutional kernel will be correct. Photographic
blur, by contrast, guarantees a consistent defocus blur regardless
of depth map accuracy. In addition, physical optical effects
like contrast inversion [Goodman 1996] cannot be correctly modeled
by synthetic blur, but are present in photographic blur. Second,
one can extend depth of field without a focal stack via deconvolution,
but this process is ill-posed without significantly modifying
camera optics or assuming strong priors about the scene.
`
Finally, the requirement to capture a focal stack is not as onerous
as it would seem. Cameras that employ contrast-based autofocusing
[Bell 1992] already capture most, if not all, of the required
imagery as they sweep the lens through a range. Contrast-based
autofocusing is employed by nearly all cameras with electronic
viewfinders. The only additional cost is the bandwidth required
to save the autofocus ramp frames to disk. Additionally, the depth
map required for our compositing algorithm is easily computed as
a byproduct of capturing a focal stack.
`
We present a theory, framework, and pipeline for focal stack compositing
that produce composites matching a requested depth of
field. This pipeline is shown in Figure 2 and is described throughout
Section 3. We will analyze the geometry of such composites,
discuss how halo artifacts (especially at occlusion edges) can arise,
and show how the halo artifacts can be mathematically avoided
by minimal alteration of the desired depth of field. We will then
demonstrate the versatility of this framework in applications for
reducing depth of field, extending depth of field, and creating
freeform non-physical composites that are halo-free.
`
`2 Prior Work
`
Depth of field is a useful visual cue for conveying the scene geometry
and directing the viewer's attention. As such, it has been
well studied in the rendering literature. When raytracing a synthetic
scene, one can obtain the desired depth of field by simulating the
appropriate lens optics and aperture [Cook et al. 1984; Kolb et al.
1995] or by employing other image-space postprocessing [Barsky
and Pasztor 2004; Kosloff and Barsky 2007] that nevertheless relies
on access to the scene model. In traditional photography, however,
the photographer determines the depth of field via his choice of the
relevant camera parameters. While modifying the camera can partially
increase the range of possible depth of field [Mohan et al.
2009] or the bokeh shape [Lanman et al. 2008], the depth of field is
essentially fixed at capture time, barring post-processing.
`
`
`
`
`
Overcoming this limitation requires correctly estimating the
amount of blur present at each pixel, and then simulating the desired
blur (if different), which may be greater or smaller than the
pre-existing blur at the pixel location. For instance, defocus magnification
[Bae and Durand 2007] and variable-aperture photography
[Hasinoff and Kutulakos 2007] increase the per-pixel blur using
image-space convolution, thereby simulating a narrower depth
of field. Reducing the per-pixel blur, on the other hand, requires
deblurring the image, and can be ill-posed for traditional lens
bokehs [Levin et al. 2007].
`
There exists a large body of work in computational optics that combats
the numerical instability of deblurring a traditional photograph
by capturing a coded 2D image. Many of these methods employ spatiotemporal
coding of the aperture in order to increase the invertibility of
the defocus blur [Levin et al. 2007; Zhou and Nayar 2009], and a
large subset thereof is concerned with equalizing the defocus blur
across depth, thereby avoiding errors introduced by inaccurate
depth estimation [Dowski and Cathey 1995; Nagahara et al. 2008;
Levin et al. 2009]. While the field of deconvolution has advanced
significantly, deconvolved images tend to have a characteristically
flat texture and ringing artifacts.
`
One alternative to capturing a coded 2D image is acquiring a
redundant representation of the scene composed of many photographs.
Light fields [Levoy and Hanrahan 1996; Ng 2005] and
focal stacks [Streibl 1985] are composed of multiple images that
are either seen through different portions of the aperture or focused
at varying depths, respectively. Light fields can be rendered into an
image by synthetic aperture focusing [Isaksen et al. 2000; Vaish
et al. 2005]. Prior works in focal stack compositing [Agarwala
et al. 2004; Hasinoff et al. 2008] simulate a hypothetical camera's
depth of field by extracting from each slice the regions matching
the proper level of blur for a given aperture size and focus distance.
However, while focal stack compositing is a rather well-known
technique demonstrated to be light efficient, it has yet to be
analyzed with respect to the geometric implications of a particular
composite. Specifically, the proper spatial relationships between
composited pixels necessary for artifact prevention are not yet well
studied. As a result, composites produced by these techniques frequently
suffer from visual artifacts.
`
`3 Theory
`
We now build a theory of focal stack compositing as a tool for manipulating
depth of field, following the pipeline depicted in Figure 2.
Our theory differs from prior work in two key ways: 1) it
is fully general and allows non-physical, artistically driven depth-of-field
effects, and 2) it explicitly models the interplay between the
lens optics and scene geometry in order to remediate visual artifacts
at depth edges.
`
We begin by assuming a thin-lens model and the paraxial approximation.
For now, let us also assume the existence of a depth map
Z(p̄), where p̄ = (px, py) is a pixel's location on the sensor relative
to the optical axis. Z(p̄) is defined as the axial distance between the
lens and the object hit by the chief ray passing through p̄ when the
sensor is focused at infinity. We discuss depth map extraction in
Section 4.
`
Although the pipeline shown in Figure 2 is written in terms of
object-space depths Z, it is algebraically simpler to express our
theory in terms of the conjugate sensor-space distances S. This
simplification is a consequence of the 3D perspective transform applied
by the lens as the scene is imaged. Figure 3 illustrates this
transform. Many geometric relationships in object space become
arithmetic in sensor space and are thus less unwieldy to discuss.
`
`
Figure 3: A lens performs a perspective transformation. A linear
change in sensor position (on the right) corresponds to a non-linear
change in object distance (on the left). Coupled with this change in
depth is lateral magnification (up and down in the figure). Although
it is conceptually easier to reason about geometric relationships
in object space, the non-linearity it introduces makes the algebra
unwieldy. Therefore we primarily work in sensor space for the
remainder of the paper.
`
`
`
`
`
Figure 4: Defocus blur. The rays emitted by the object at depth
Z converge at a distance S behind the lens. A sensor placed at a
distance S', instead of the correct distance S, will not sharply image
the object. The rays do not converge at the sensor but rather create
a blur spot of radius C. The aperture radius A determines the rate
at which C grows as the sensor moves away from S.
`
The Gaussian lens formula tells us the relationship between a scene
point and its conjugate. If we apply this to the scene's depth map
Z(p̄), we define a map S(p̄) = (1/f - 1/Z(p̄))^-1, where f is the
focal length of the lens. S(p̄) is constructed such that the pixel at p̄
will be in sharp focus if the sensor is placed at a distance S(p̄).
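For illustration, a minimal Python/NumPy sketch of this conjugate mapping (the function name and the shared length units are our own conventions):

import numpy as np

def conjugate_sensor_distances(depth_map, focal_length):
    """S(p) = (1/f - 1/Z(p))^-1 from the Gaussian lens formula.
    depth_map and focal_length share the same length units; depths at or
    inside the focal length have no real conjugate and map to inf/negative."""
    with np.errstate(divide="ignore"):
        return 1.0 / (1.0 / focal_length - 1.0 / depth_map)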
`
Working in this conjugate sensor space, we define our framework
as follows: given a set of images {Ii}, taken at sensor positions
{Si} with f-number N, we want to calculate a composite Ĩ that
approximates a hypothetical camera with a sensor placed at S* and
f-number N*. Later we will relax this constraint and allow a
hypothetical camera that is non-physical.
`
`3.1 Defocus Blur
`
Defocus blur is a consequence of geometric optics. As shown in
Figure 4, if the sensor is placed at a distance S' from the lens, a
blur spot of radius C forms on the sensor. This blur spot is known
as the circle of confusion, and its shape is referred to as the lens'
bokeh. Using similar triangles, one can show that C = A(1 - S'/S).
Rewriting the aperture radius A in terms of the focal length f
and the f-number N, we obtain

C = (f / 2N)(1 - S'/S).    (1)
`
Note that C, as defined in Equation (1), is a signed quantity. If
C > 0, then the camera is focused behind the object. If C < 0,
then the camera is focused in front of the object, and the bokeh
will be inverted. For most lenses, the bokeh is approximately symmetric,
so it would be difficult for a human to distinguish between
defocus blurs of C and -C. Despite this perceptual equivalence,
we choose the above definition of C, rather than its absolute value,
because it maintains a monotonic relationship between scene depth
and defocus blur when compositing.
`
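A direct transcription of Equation (1) as a Python/NumPy helper (the naming and broadcasting conventions are ours); evaluating it at each captured sensor position Si yields the per-slice defocus maps used throughout the pipeline:

import numpy as np

def signed_defocus(sensor_distance, in_focus_distance, focal_length, f_number):
    """Signed circle-of-confusion radius from Equation (1):
    C = (f / 2N) (1 - S'/S), where S' is where the sensor actually sits and
    S is the in-focus (conjugate) sensor distance of the scene point."""
    return (focal_length / (2.0 * f_number)) * (1.0 - sensor_distance / in_focus_distance)

# Per-slice defocus maps: evaluate at each captured sensor position S_i.
# S_map has shape (H, W); stack_positions has shape (K,).
#   C_slices = signed_defocus(stack_positions[:, None, None], S_map, f, N)  # (K, H, W)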
`
Defocus Maps
`
Depth of field is typically defined as the range of depths within
which objects are imaged sharply. Implicit in this definition is, first,
that sharpness is dependent solely upon depth, and second, that the
range is a single contiguous interval, outside of which objects will
be blurry. However, the desired blurriness is often dependent also
on objects' locations in the frame (e.g., tilt-shift photography or
other spatially varying depth-of-field effects). Such non-standard
criteria can be captured by a more general representation, namely a
map of desired circle-of-confusion radii across the sensor. We call
this function C(p̄) a defocus map.
`
A defocus map encapsulates the goal of focal stack compositing.
For example, if we wish to simulate a camera with the sensor placed
at a distance S* with f-number N*, the desired defocus map can
be calculated from S(p̄) as

C*(p̄) = (f / 2N*)(1 - S*/S(p̄)).    (2)

An all-focus image is trivially specified by C*(p̄) = 0. One can
specify arbitrary spatially varying focus effects by manually painting
C*, using our stroke-based interface presented in Section 5.3.
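Equation (2) is the same relation evaluated with the hypothetical camera's parameters. A short sketch (again Python/NumPy with our own names; the painted tilt-shift-style example in the comments is purely illustrative and not from the paper):

import numpy as np

def requested_defocus_map(S_map, S_star, N_star, focal_length):
    """Requested defocus map from Equation (2): C* = (f / 2N*) (1 - S*/S(p))."""
    return (focal_length / (2.0 * N_star)) * (1.0 - S_star / S_map)

# An all-focus request is simply zeros:
#   C_star = np.zeros_like(S_map)
# A freeform request can be painted directly; e.g. a crude tilt-shift-like
# map whose blur grows toward the top and bottom rows (illustrative only):
#   H, W = S_map.shape
#   C_star = 0.002 * np.abs(np.linspace(-1.0, 1.0, H))[:, None] * np.ones((H, W))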
`
`3.2 Sensor Distance Maps
`
Given C* as a goal, our task now is to find a composite that corresponds
to a defocus map as close to C* as possible. We will
represent our solution as a function that defines the desired sensor
position for each pixel. We call such a function a sensor distance
map. This is a convenient choice as a proxy for focal stack indices
because it has physical meaning and is independent of the depth
resolution of our focal stack. It also lends itself well to an alternative
interpretation of focal stack compositing as the construction
of a sensor surface that is conjugate to a (potentially non-planar)
surface of focus in object space.
`
If our only concern is matching C* as closely as possible, finding
the optimal sensor distance map is straightforward. For any given
(unsigned) defocus blur radius, two sensor positions will achieve
the desired blur—one focused in front of the scene object and one
focused behind. Because we defined the defocus blur radius C*
to be a signed quantity, however, there is no ambiguity. Accordingly,
we can find the sensor distance map S̃0(p̄) for a preliminary
composite by inverting the relationship given by Equation (1):

S̃0(p̄) = S(p̄)(1 - 2N C*(p̄)/f).    (3)

An all-focus image is trivially specified by S̃0(p̄) = S(p̄). We call
S̃0(p̄) a preliminary sensor distance map because, as we will show
in Section 3.4, it may not produce a visually pleasing composite.
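A one-line transcription of Equation (3) (Python sketch, our naming; N is the f-number of the captured stack):

def preliminary_sensor_map(S_map, C_star, focal_length, f_number):
    """Preliminary sensor distance map from Equation (3):
    S~0(p) = S(p) (1 - 2 N C*(p) / f)."""
    return S_map * (1.0 - 2.0 * f_number * C_star / focal_length)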
`
3.3 Compositing
`
In order to build a preliminary composite Ĩ0, we must determine
which pixels from the focal stack best approximate the desired sensor
distance S̃0. A simple choice would be to quantize S̃0 to the
nearest sensor positions available in the focal stack and assign pixels
the colors from those slices. The resulting defocus blur C̃0 for
such a composite will approximate C*. Figure 5 shows a comparison
between C̃0 and C* for two depth-of-field manipulations.
`
`
`
Figure 5: Depth-defocus relationships for focal stack composites.
(a) An all-focus composite is characterized by |C*| = 0 (shown in
purple). If our stack has slices at sensor distances corresponding to
the red circles, then the composite will assign each pixel the color
of the nearest stack slice in sensor distance (as segmented by the
dotted vertical lines), thus creating the depth-defocus relationship
given by |C̃0| (shown in orange). |C̃0| is farthest from |C*| midway
between stack slices, so as one might suspect, adding more stack
slices will improve the all-focus composite. (b) A simulated wide
aperture composite is characterized by a |C*| that grows quickly
as it moves away from its conjugate plane of focus (dashed purple
line). This can be approximated by mirroring sensor positions
about the conjugate plane of focus, such that an object nearby is
assigned the color of a slice focused far away and vice versa.
`
However, quantizing S̃0 as described above can create discontinuities
in the defocus map C̃0, as seen in Figure 5(b). These discontinuities
manifest themselves as false edges in a composite when
transitioning between stack slices. We can smooth these transitions
by linearly interpolating between the two closest stack slices
as an approximation for S̃0 instead of quantizing. This provides
a good approximation in most cases. Interpolation should not be
used when C* calls for a pixel to be sharper than both of its nearest
stack slices (i.e., the scene object's focused sensor distance S(p̄) is
between the two nearest stack slice positions). In this circumstance,
blending the two slices will only increase the defocus blur at p̄, so
it is best to just choose the closer single slice—this case is shown
in Figure 5(a).
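The following sketch (Python/NumPy; array names, shapes, and the vectorization are our own) implements this rule: it blends the two bracketing slices, except where the in-focus sensor distance falls between them, in which case it snaps to the nearer slice:

import numpy as np

def interpolated_composite(slices, slice_positions, S0_map, S_map):
    """Blend the two focal stack slices bracketing the preliminary sensor
    distance at each pixel.

    slices:          (K, H, W, 3) focal stack, ordered by sensor position
    slice_positions: (K,) sorted NumPy array of slice sensor distances
    S0_map:          (H, W) preliminary sensor distance map
    S_map:           (H, W) in-focus (conjugate) sensor distances S(p)
    """
    K = len(slice_positions)
    hi = np.clip(np.searchsorted(slice_positions, S0_map), 1, K - 1)  # (H, W)
    lo = hi - 1
    s_lo, s_hi = slice_positions[lo], slice_positions[hi]
    w = np.clip((S0_map - s_lo) / (s_hi - s_lo), 0.0, 1.0)            # blend weight

    # Exception: if the in-focus distance lies between the bracketing slices,
    # blending only adds blur, so snap to the slice nearer to S0_map instead.
    between = (S_map > s_lo) & (S_map < s_hi)
    w = np.where(between, np.round(w), w)

    rows, cols = np.indices(S0_map.shape)
    return (1.0 - w)[..., None] * slices[lo, rows, cols] \
         + w[..., None] * slices[hi, rows, cols]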
`
`3.4 Eliminating Color Halos
`
Assigning each pixel a sensor distance independently of its neighbors
will create a composite whose per-pixel blur C̃0 matches C*
as closely as possible. However, our goal is not just to obtain blur
that matches C*, but to do so without producing visual artifacts.
Halo artifacts, which we define as color bleeding across depth discontinuities,
are common in preliminary composites and are visually
objectionable, as demonstrated in Figure 6. Therefore, we will
compute a final sensor distance map S̃(p̄) that generates a halo-free
composite whose per-pixel blur is close to C*.
`
Halos, we claim, are the manifestation of the "double-counting" of
rays, i.e., more than one pixel in the composite integrating a given
ray. Consider a ray from an object sharply imaged in one pixel.
If captured again by another pixel, it will necessarily appear as a
defocused contribution of the same object. Figure 7 illustrates this
geometry. The result is the characteristic color bleeding of halos.
`
`
(c) Sharply focused at Z2
(d) Adjusting focus to smooth halo
(e) A halo-free surface of focus
(f) Alternative halo-free surfaces of focus
`
Figure 7: Halo geometry. (a) An example scene with a green and a red object at depths Z1 and Z2, respectively. For an all-focus composite,
the requested surface of focus coincides with Z1, then jumps to Z2 at the occlusion boundary. This surface of focus corresponds to the sensor
surface shown in purple on the right side of the lens. (b) All light emitted from a point on the foreground object at Z1 (green shaded area)
converges properly at the sensor. (c) Near the occlusion boundary, only a fraction of the light emitted from the background object at Z2
(red shaded area) reaches the sensor. The light that is blocked is replaced with a defocused contribution from the foreground object (green
shaded area). This contribution appears visually as a green haze over the red background object—similar in appearance to that seen in Ĩ0 in
Figure 2. This haze, next to the otherwise sharp silhouette of the foreground object, is a halo artifact. (d) The closest halo-free alternative is
to focus on the line passing through the edge of the foreground object and the corner of the lens aperture. Any bundle of rays leaving a point
on this line will not be occluded by the foreground object. The blue portion of this line gives a halo-free transition in the surface of focus
between Z1 and Z2. The corresponding sensor surface transition is drawn in blue to the right of the lens. (e) A halo-free surface of focus and
its corresponding sensor surface. (f) Alternative halo-free surfaces and their conjugates can be found by choosing to defocus the foreground
instead of the background (the orange transition connecting the two focus distances) or some combination of both (the grey shaded regions).
The best transition choice is application specific.
`
`
`
`
`
`
`
`
`
(a) Haloed composite    (b) Ground truth

Figure 6: Halo artifacts. (a) A preliminary composite with a halo
artifact (inset). (b) A long-exposure ground truth f/2.0 image.
`
Intuitively, observing a single ray twice in a composite should be
avoided, because it is physically impossible. Real photographs of
opaque objects never contain halos: once a ray has been captured by
a pixel, it cannot be detected by another, even with an exotic, non-planar
sensor surface (which we simulate with our composites). Focal
stack composites are not constrained in this manner, because
pixels at different sensor distances are not necessarily captured simultaneously,
leaving open the possibility of double-counting rays.

Geometrically, the double-counting of rays by two distinct pixels
is equivalent to the two pixels being collinear with a point on the
aperture. In order to detect this condition, we need to examine each
pair of pixels, extend a line through them, and test whether this
line intersects the aperture. If it does, then that pair of pixels will
constitute a halo. Algebraically, this test is equivalent to asking
whether the gradient of S̃ is bounded by some maximum rate of
change. For example, for a pixel p̄ located on the optical axis, S̃(p̄)
should have its gradient bounded as follows:

||∇S̃(p̄)|| ≤ S̃(p̄)/A,    (4)

where A is the aperture radius. Under the paraxial approximation,
Equation (4) applies to all other pixels p̄ as well. Therefore, acceptable
sensor surfaces are those that satisfy Equation (4).
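As a brief check of this bound (our own worked example under the same thin-lens, paraxial assumptions), consider a pixel on the optical axis at sensor distance S̃ and a neighbor offset laterally by Δp̄ at sensor distance S̃ + ΔS̃. The line through the two points crosses the lens plane at lateral position -S̃ Δp̄/ΔS̃, so the pair can share a ray only if S̃ |Δp̄|/|ΔS̃| ≤ A, i.e. only if |ΔS̃|/|Δp̄| ≥ S̃/A. Keeping the slope of the sensor surface below this threshold, as Equation (4) requires, therefore rules out double-counting.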
Now that we know how to mathematically characterize halo-inducing
sensor configurations, we may construct a corrected sensor
distance map S̃ that avoids them by algorithmically enforcing
the constraints. Note that naively checking the slope of S̃ between
every pair of pixels will yield an algorithm whose runtime
is quadratic in the number of pixels. Instead, we observe that for
each pixel p̄, it is sufficient to check the slope between p̄ and its closest
neighbor q̄ whose sensor distance is s, for each possible value
of s. This holds because the constraints arising from checking all
other pixels are necessarily weaker than those we check. This optimization
reduces the time complexity of the algorithm to be linear
in the number of pixels, at the cost of introducing a linear dependence
on the depth resolution. However, the set of values occurring
in the sensor distance map is typically small—on the order of the
number of slices in the focal stack. Algorithm 1 summarizes the
implementation of this optimization.

Algorithm 1 iterates over the set of sensor distance values. Each
iteration, corresponding to a particular sensor distance s, enforces
all pairwise constraints that involve any pixel whose sensor distance
is s (the set of such pixels is denoted by Q0). More precisely, we
identify pixels that interact with Q0, ordered by increasing distance
from Q0, by iteratively dilating Q0. We then adjust their sensor
distances if their interaction with Q0 violates Equation (4).
`
`
`
`
Algorithm 1 Constructing S̃.

S̃ ← S̃0
DilationLimit ← length of image diagonal
for all sensor distances s do
    Let Q0 be the set of all pixels q̄ such that S̃(q̄) = s.
    for r = 1 to DilationLimit do
        Let Qr = dilate(Qr-1, 1 pixel).
        Let ∂Qr be the set of newly included pixels in Qr.
        Let (smin, smax) ← (s - rΔ, s + rΔ), where Δ is the per-pixel slope bound of Equation (4).
        for all pixels p̄ in ∂Qr do
            Clamp S̃(p̄) to lie in [smin, smax].
        end for
    end for
end for
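A compact sketch of this procedure (Python with NumPy and SciPy's binary dilation; the pixel_pitch parameter and the per-ring step derived from Equation (4) are our own simplifications, and the paper's conservative doubling of the halo extent for real lenses is omitted):

import numpy as np
from scipy.ndimage import binary_dilation

def enforce_halo_constraints(S0, aperture_radius, pixel_pitch):
    """Clamp a preliminary sensor distance map so its slope never exceeds the
    Equation (4) bound S/A. Distinct sensor distances (assumed few, roughly one
    per stack slice) are processed in decreasing order, which favors the
    foreground as in Figure 8(a)."""
    S = S0.copy()
    max_rings = int(np.ceil(np.hypot(*S.shape)))     # image diagonal, in pixels
    for s in np.unique(S0)[::-1]:
        grown = (S == s)                             # Q0: pixels at sensor distance s
        if not grown.any():
            continue
        step = pixel_pitch * s / aperture_radius     # allowed |dS| per pixel of dilation
        for r in range(1, max_rings + 1):
            ring = binary_dilation(grown) & ~grown   # pixels first reached at radius r
            if not ring.any():
                break
            S[ring] = np.clip(S[ring], s - r * step, s + r * step)
            grown |= ring
    return S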
`
The algorithm presented above generates a family of halo-free sensor
distance maps. Figure 8 shows composites from one such family.
The specific sensor distance map produced depends on the order
in which the sensor distances are considered in the outer loop of the
algorithm. Because each iteration resolves any conflict involving
pixels at a specific sensor distance s, those pixels at distance s will
be unaffected by future iterations. As such, more important sensor
distances should be prioritized if possible. It is difficult to determine
the best order a priori. In fact, it will vary based on the user's
intentions. Our implementation uses a foreground-favored ordering
by default, and would produce the composite shown in Figure 8(a).
`
The theory presented above relies heavily on knowing where the
edge of the aperture is, and hence on the thin-lens model and the
paraxial approximation. Real photographic lenses, on the other
hand, are complex systems of multiple elements, and as such may
deviate from our assumptions. If we had knowledge of the exact
optical parameters for a lens, we could perform a similar analysis
to more accurately model the spatial extent of halos. In practice,
without such knowledge, we conservatively over-estimate the size
of halo effects to be twice the amount the theory would imply.
`
`3.5 Reducing Blur with a Variable Aperture
`
The halo elimination algorithm just described removes color bleeding
by sacrificing some accuracy in matching C*. For extended
depth of field composites, this manifests as blurriness near depth
edges. If the camera is equipped with a controllable aperture, we
can further improve on our composite by operating on a focus-aperture
block rather than a focal stack. A focus-aperture block
is a 2D family of photographs with varying focus as well as varying
aperture radius.
`
Note that being able to capture narrow-aperture photographs does
not necessarily obviate the work needed to generate an all-focus
image, for two reasons: 1) the narrowest aperture may not be small
enough, and 2) images taken with a small aperture are noisy, assuming
a constant exposure duration. Wherever the depth map is
flat, a properly focused wide-aperture photograph should be just as
sharp as its narrow-aperture counterpart, and less noisy. However,
near depth discontinuities, we can trade off noise against blurriness
by selecting the appropriate f-number for each pixel.
`
`
`
(a) Foreground-favored composite    (b) Background-favored composite
`
Figure 8: Alternative halo-free composites. Recall from Figure 7(f)
that there exists a family of halo-free surfaces of focus that well approximate
a given preliminary surface of focus. The specific halo-free
surface of focus generated is determined by the order in which
sensor distances are processed in the outer loop of Algorithm 1. (a)
A composite that prioritizes foreground objects, produced by processing
sensor distances in decreasing order. This corresponds to
the blue depth transition shown in Figure 7(f). (b) A composite that
prioritizes background objects, produced by processing sensor distances
in increasing order. This corresponds to the orange depth
transition shown in Figure 7(f).
`
`
`
`
Figure 9: Adjusting aperture to prevent halo. If we use the full
aperture of the lens, the sensor will integrate all the light shown
in the shaded region, including a halo-causing contribution from
the foreground object. While this can be addressed by adjusting
focus as in Figure 7(d), stopping down the aperture as shown above
reduces the light reaching the sensor to just the red shaded region,
effectively blocking the contribution from the foreground object. In
general, we can eliminate halos without introducing any blur by
reducing the aperture near occlusion boundaries.
`
To handle focus-aperture blocks, we must slightly modify Algorithm
1. Specifically, we are to find not only a halo-free sensor
distance map S̃(p̄), but also a spatially varying aperture radius map
Ã(p̄) that accompanies it. We solve this problem using a two-pass
approach. In the first pass, we initialize Ã(p̄) to be the largest aperture
radius available, and execute Algorithm 1. However, instead of
clamping the sensor distance whenever a halo is encountered, we
narrow the aperture at the affected pixel location by the appropriate
amount to satisfy Equation (4). It may be that the narrowest
available aperture is still too large, in which case we settle for this
value. In the second pass, we execute the algorithm in its original
form, clamping the sensor distance according to the local constraint
based on the spatially varying aperture map computed in the previous
pass. Figure 10 visualizes the effect of the augmented algorithm
on S̃ and Ã and shows the