A Morphable Model For The Synthesis Of 3D Faces

Volker Blanz

Thomas Vetter

Max-Planck-Institut für biologische Kybernetik,
Tübingen, Germany*
Abstract

In this paper, a new technique for modeling textured 3D faces is
introduced. 3D faces can either be generated automatically from
one or more photographs, or modeled directly through an intuitive
user interface. Users are assisted in two key problems of computer
aided face modeling. First, new face images or new 3D face models
can be registered automatically by computing dense one-to-one
correspondence to an internal face model. Second, the approach
regulates the naturalness of modeled faces, avoiding faces with an
"unlikely" appearance.
Starting from an example set of 3D face models, we derive a
morphable face model by transforming the shape and texture of the
examples into a vector space representation. New faces and expressions
can be modeled by forming linear combinations of the prototypes.
Shape and texture constraints derived from the statistics of
our example faces are used to guide manual modeling or automated
matching algorithms.
We show 3D face reconstructions from single images and their
applications for photo-realistic image manipulations. We also
demonstrate face manipulations according to complex parameters
such as gender, fullness of a face or its distinctiveness.

Keywords: facial modeling, registration, photogrammetry, morphing,
facial animation, computer vision
`
1 Introduction

Computer aided modeling of human faces still requires a great deal
of expertise and manual control to avoid unrealistic, non-face-like
results. Most limitations of automated techniques for face synthesis,
face animation or for general changes in the appearance of an
individual face can be described either as the problem of finding
corresponding feature locations in different faces or as the problem
of separating realistic faces from faces that could never appear in
the real world. The correspondence problem is crucial for all morphing
techniques, both for the application of motion-capture data
to pictures or 3D face models, and for most 3D face reconstruction
techniques from images. A limited number of labeled feature points
marked in one face, e.g., the tip of the nose, the eye corner and less
prominent points on the cheek, must be located precisely in another
face. The number of manually labeled feature points varies from
* MPI für biol. Kybernetik, Spemannstr. 38, 72076 Tübingen, Germany.
E-mail: {volker.blanz, thomas.vetter}@tuebingen.mpg.de
`
[Figure 1: block diagram — a 3D database feeds the morphable face
model, which connects a face analyzer (2D input) and a modeler (2D
output).]

Figure 1: Derived from a dataset of prototypical 3D scans of faces,
the morphable face model contributes to two main steps in face
manipulation: (1) deriving a 3D face model from a novel image,
and (2) modifying shape and texture in a natural way.
application to application, but usually ranges from 50 to 300.
Only a correct alignment of all these points allows acceptable
intermediate morphs, a convincing mapping of motion data from the
reference to a new model, or the adaptation of a 3D face model to
2D images for 'video cloning'. Human knowledge and experience
is necessary to compensate for the variations between individual
faces and to guarantee a valid location assignment in the different
faces. At present, automated matching techniques can be utilized
only for very prominent feature points such as the corners of eyes
and mouth.
A second type of problem in face modeling is the separation of
natural faces from non-faces. For this, human knowledge is even
more critical. Many applications involve the design of completely
new natural looking faces that can occur in the real world but which
have no "real" counterpart. Others require the manipulation of an
existing face according to changes in age, body weight or simply to
emphasize the characteristics of the face. Such tasks usually require
time-consuming manual work combined with the skills of an artist.
In this paper, we present a parametric face modeling technique
that assists in both problems. First, arbitrary human faces can be
created while simultaneously controlling the likelihood of the generated
faces. Second, the system is able to compute correspondence
between new faces. Exploiting the statistics of a large dataset of 3D
face scans (geometric and textural data, Cyberware™), we built
a morphable face model and recover domain knowledge about face
variations by applying pattern classification methods. The morphable
face model is a multidimensional 3D morphing function that
is based on the linear combination of a large number of 3D face
scans. Computing the average face and the main modes of variation
in our dataset, a probability distribution is imposed on the
morphing function to avoid unlikely faces. We also derive parametric
descriptions of face attributes such as gender, distinctiveness,
"hooked" noses or the weight of a person, by evaluating the distribution
of exemplar faces for each attribute within our face space.
Having constructed a parametric face model that is able to generate
almost any face, the correspondence problem turns into a mathematical
optimization problem. New faces, images or 3D face scans,
can be registered by minimizing the difference between the new
face and its reconstruction by the face model function. We developed
an algorithm that adjusts the model parameters automatically
for an optimal reconstruction of the target, requiring only a minimum
of manual initialization. The output of the matching procedure
is a high quality 3D face model that is in full correspondence
with our morphable face model. Consequently, all face manipulations
parameterized in our model function can be mapped to the
target face. The prior knowledge about the shape and texture of
faces in general that is captured in our model function is sufficient
to make reasonable estimates of the full 3D shape and texture of a
face even when only a single picture is available. When applying
the method to several images of a person, the reconstructions reach
almost the quality of laser scans.
`
1.1 Previous and related work

Modeling human faces has challenged researchers in computer
graphics since its beginning. Since the pioneering work of Parke
[25, 26], various techniques have been reported for modeling the
geometry of faces [10, 11, 22, 34, 21] and for animating them
[28, 14, 19, 32, 22, 38, 29]. A detailed overview can be found in
the book of Parke and Waters [24].
The key part of our approach is a generalized model of human
faces. Similar to the approach of DeCarlo et al. [10], we restrict
the range of allowable faces according to constraints derived from
prototypical human faces. However, instead of using a limited set
of measurements and proportions between a set of facial landmarks,
we directly use the densely sampled geometry of the exemplar faces
obtained by laser scanning (Cyberware™). The dense modeling
of facial geometry (several thousand vertices per face) leads
directly to a triangulation of the surface. Consequently, there is no
need for variational surface interpolation techniques [10, 23, 33].
We also added a model of texture variations between faces. The
morphable 3D face model is a consequent extension of the interpolation
technique between face geometries, as introduced by Parke
[26]. Computing correspondence between individual 3D face data
automatically, we are able to increase the number of vertices used
in the face representation from a few hundred to tens of thousands.
Moreover, we are able to use a higher number of faces, and thus
to interpolate between hundreds of 'basis' faces rather than just a
few. The goal of such an extended morphable face model is to represent
any face as a linear combination of a limited basis set of face
prototypes. Representing the face of an arbitrary person as a linear
combination (morph) of "prototype" faces was first formulated for
image compression in telecommunications [8]. Image-based linear
2D face models that exploit large data sets of prototype faces were
developed for face recognition and image coding [4, 18, 37].
Different approaches have been taken to automate the matching
step necessary for building up morphable models. One class
of techniques is based on optic flow algorithms [5, 4] and another
on an active model matching strategy [12, 16]. Combinations of
both techniques have been applied to the problem of image matching
[36]. In this paper we extend this approach to the problem of
matching 3D faces.
The correspondence problem between different three-dimensional
face data has been addressed previously by Lee
et al. [20]. Their shape-matching algorithm differs significantly
from our approach in several respects. First, we compute the
correspondence in high resolution, considering shape and texture
data simultaneously. Second, instead of using a physical tissue
model to constrain the range of allowed mesh deformations, we use
the statistics of our example faces to keep deformations plausible.
Third, we do not rely on routines that are specifically designed to
detect the features exclusively found in faces, e.g., eyes, nose.
Our general matching strategy can be used not only to adapt the
morphable model to a 3D face scan, but also to 2D images of faces.
Unlike a previous approach [35], the morphable 3D face model is
now directly matched to images, avoiding the detour of generating
intermediate 2D morphable image models. As a consequence,
head orientation, illumination conditions and other parameters can
be free variables subject to optimization. It is sufficient to use rough
estimates of their values as a starting point of the automated matching
procedure.
Most techniques for 'face cloning', the reconstruction of a 3D
face model from one or more images, still rely on manual assistance
for matching a deformable 3D face model to the images [26, 1, 30].
The approach of Pighin et al. [28] demonstrates the high realism
that can be achieved for the synthesis of faces and facial expressions
from photographs where several images of a face are matched to a
single 3D face model. Our automated matching procedure could be
used to replace the manual initialization step, where several corresponding
features have to be labeled in the presented images.
For the animation of faces, a variety of methods have been proposed.
For a complete overview we again refer to the book of
Parke and Waters [24]. The techniques can be roughly separated
into those that rely on physical modeling of facial muscles [38, 17],
and those applying previously captured facial expressions to a
face [25, 3]. These performance-based animation techniques compute
the correspondence between the different facial expressions of
a person by tracking markers glued to the face from image to image.
To obtain photo-realistic face animations, up to 182 markers
are used [14]. Working directly on faces without markers, our automated
approach extends this number to its limit: it matches the
full number of vertices available in the face model to images. The
resulting dense correspondence fields can even capture changes in
wrinkles and map these from one face to another.
`
1.2 Organization of the paper

We start with a description of the database of 3D face scans from
which our morphable model is built.
In Section 3, we introduce the concept of the morphable face
model, assuming a set of 3D face scans that are in full correspondence.
Exploiting the statistics of a dataset, we derive a parametric
description of faces, as well as the range of plausible faces. Additionally,
we define facial attributes, such as gender or fullness of
faces, in the parameter space of the model.
In Section 4, we describe an algorithm for matching our flexible
model to novel images or 3D scans of faces. Along with a 3D reconstruction,
the algorithm can compute correspondence, based on
the morphable model.
In Section 5, we introduce an iterative method for building a morphable
model automatically from a raw data set of 3D face scans
when no correspondences between the exemplar faces are available.
`
2 Database

Laser scans (Cyberware™) of 200 heads of young adults (100
male and 100 female) were used. The laser scans provide head
structure data in a cylindrical representation, with radii r(h, φ) of
surface points sampled at 512 equally-spaced angles φ and at 512
equally-spaced vertical steps h. Additionally, the RGB-color values
R(h, φ), G(h, φ), and B(h, φ) were recorded in the same spatial
resolution and were stored in a texture map with 8 bit per channel.
All faces were without makeup, accessories, and facial hair. The
subjects were scanned wearing bathing caps that were removed
digitally. Additional automatic pre-processing of the scans, which
for most heads required no human interaction, consisted of a vertical
cut behind the ears, a horizontal cut to remove the shoulders,
and a normalization routine that brought each face to a standard
orientation and position in space. The resultant faces were represented
by approximately 70,000 vertices and the same number of
color values.
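For illustration, a minimal NumPy sketch (our own, not part of the
original pipeline; the axis conventions and the vertical step size are
assumptions) of how such a cylindrical scan maps to Cartesian vertex
positions:

```python
import numpy as np

def cylindrical_to_vertices(radii, h_step=1.0):
    """Convert a cylindrical range scan r(h, phi), sampled on a
    512x512 grid, into Cartesian vertex positions (X, Y, Z).
    Axis conventions and h_step are illustrative assumptions."""
    n_h, n_phi = radii.shape
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    x = radii * np.cos(phi)[None, :]   # radius times horizontal direction
    z = radii * np.sin(phi)[None, :]
    y = np.repeat(np.arange(n_h)[:, None] * h_step, n_phi, axis=1)
    # One row per vertex, ready to be flattened into a shape-vector.
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```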
`
`
`
`
3 Morphable 3D Face Model

The morphable model is based on a data set of 3D faces. Morphing
between faces requires full correspondence between all of the faces.
In this section, we will assume that all exemplar faces are in full
correspondence. The algorithm for computing correspondence will
be described in Section 5.
We represent the geometry of a face with a shape-vector S =
(X₁, Y₁, Z₁, X₂, ..., Yₙ, Zₙ)ᵀ ∈ ℝ³ⁿ, that contains the X, Y, Z-coordinates
of its n vertices. For simplicity, we assume that the
number of valid texture values in the texture map is equal to the
number of vertices. We therefore represent the texture of a face by
a texture-vector T = (R₁, G₁, B₁, R₂, ..., Gₙ, Bₙ)ᵀ ∈ ℝ³ⁿ, that
contains the R, G, B color values of the n corresponding vertices.
A morphable face model was then constructed using a data set of m
exemplar faces, each represented by its shape-vector Sᵢ and
texture-vector Tᵢ. Since we assume all faces in full correspondence (see
Section 5), new shapes S_model and new textures T_model can be
expressed in barycentric coordinates as a linear combination of the
shapes and textures of the m exemplar faces¹:

    S_{model} = \sum_{i=1}^{m} a_i S_i, \qquad T_{model} = \sum_{i=1}^{m} b_i T_i, \qquad \sum_{i=1}^{m} a_i = \sum_{i=1}^{m} b_i = 1.    (1)

We define the morphable model as the set of faces (S_model(a),
T_model(b)), parameterized by the coefficients a = (a₁, a₂, ..., aₘ)ᵀ
and b = (b₁, b₂, ..., bₘ)ᵀ. Arbitrary new faces can be generated by
varying the parameters a and b that control shape and texture.
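As a minimal sketch of Equation (1) (hypothetical NumPy code; the
array layout, with one exemplar per row, is our assumption):

```python
import numpy as np

def morph_face(shapes, textures, a, b):
    """New face as a linear combination of exemplars (Equation 1).

    shapes:   (m, 3n) array, one shape-vector S_i per row
    textures: (m, 3n) array, one texture-vector T_i per row
    a, b:     (m,) coefficients; normalized so each sums to 1
    """
    a = np.asarray(a, dtype=float) / np.sum(a)   # barycentric constraint
    b = np.asarray(b, dtype=float) / np.sum(b)
    return a @ shapes, b @ textures

# For m = 2 and a = b = (0.5, 0.5), this is the standard morph
# halfway between two faces (cf. footnote 1).
```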
For a useful face synthesis system, it is important to be able to
quantify the results in terms of their plausibility of being faces. We
therefore estimated the probability distribution for the coefficients
aᵢ and bᵢ from our example set of faces. This distribution enables
us to control the likelihood of the coefficients aᵢ and bᵢ and consequently
regulates the likelihood of the appearance of the generated
faces.
We fit a multivariate normal distribution to our data set of 200
faces, based on the averages of shape S̄ and texture T̄ and the
covariance matrices C_S and C_T computed over the shape and texture
differences ΔSᵢ = Sᵢ − S̄ and ΔTᵢ = Tᵢ − T̄.
A common technique for data compression known as Principal
Component Analysis (PCA) [15, 31] performs a basis transformation
to an orthogonal coordinate system formed by the eigenvectors
sᵢ and tᵢ of the covariance matrices (in descending order according
to their eigenvalues)²:

    S_{model} = \bar{S} + \sum_{j=1}^{m-1} \alpha_j s_j, \qquad T_{model} = \bar{T} + \sum_{j=1}^{m-1} \beta_j t_j,    (2)

with α, β ∈ ℝ^{m−1}. The probability for coefficients α is given by

    p(\alpha) \sim \exp\Big[ -\frac{1}{2} \sum_{j=1}^{m-1} (\alpha_j / \sigma_j)^2 \Big],

with σⱼ² being the eigenvalues of the shape covariance matrix C_S.
The probability p(β) is computed similarly.
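A sketch of how the PCA model and prior could be computed (our own
NumPy code; normalizing the covariance by m rather than m − 1 is an
assumption, as the paper does not specify the implementation):

```python
import numpy as np

def fit_pca(shapes):
    """PCA face model of Equation (2) for shape-vectors.

    shapes: (m, 3n) array of exemplar shape-vectors S_i.
    Returns the average shape, eigenvectors s_j (rows, in descending
    eigenvalue order) and eigenvalues sigma_j^2 of C_S.
    """
    m = shapes.shape[0]
    s_bar = shapes.mean(axis=0)
    deltas = shapes - s_bar                    # Delta S_i = S_i - S_bar
    # SVD of the data matrix avoids forming the huge 3n x 3n covariance.
    _, sing, vt = np.linalg.svd(deltas, full_matrices=False)
    return s_bar, vt[:m - 1], (sing**2 / m)[:m - 1]   # rank is at most m - 1

def log_prior(alpha, eigvals):
    """log p(alpha) up to a constant: -0.5 * sum(alpha_j^2 / sigma_j^2)."""
    return -0.5 * np.sum(alpha**2 / eigvals)
```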
¹Standard morphing between two faces (m = 2) is obtained if the
parameters a₁, b₁ are varied between 0 and 1, setting a₂ = 1 − a₁ and
b₂ = 1 − b₁.
²Due to the subtracted average vectors S̄ and T̄, the dimensions of
span{ΔSᵢ} and span{ΔTᵢ} are at most m − 1.

Figure 2: A single prototype adds a large variety of new faces to the
morphable model. The deviation of a prototype from the average is
added (+) or subtracted (−) from the average. A standard morph is
located halfway between average and the prototype. Subtracting
the differences from the average yields an 'anti'-face. Adding
and subtracting deviations independently for shape (S) and texture
(T) on each of four segments produces a number of distinct faces.

Segmented morphable model: The morphable model described
in equation (1) has m − 1 degrees of freedom for texture and m − 1
for shape. The expressiveness of the model can be increased by
dividing faces into independent subregions that are morphed
independently, for example into eyes, nose, mouth and a surrounding
region (see Figure 2). Since all faces are assumed to be in
correspondence, it is sufficient to define these regions on a
reference face. This segmentation is equivalent to subdividing the
vector space of faces into independent subspaces. A complete 3D
face is generated by computing linear combinations for each segment
separately and blending them at the borders according to an
algorithm proposed for images by [7].
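The border blending follows the image-blending algorithm of [7]; as
a rough stand-in, the following sketch (our simplification, using
per-vertex soft masks instead of multiresolution blending) shows the
per-segment combination:

```python
import numpy as np

def blend_segments(segment_shapes, masks):
    """Combine per-segment morphs into one shape-vector.

    segment_shapes: (k, 3n) array, one independently morphed shape
                    per segment (eyes, nose, mouth, surround)
    masks:          (k, 3n) soft weights, smooth near segment borders
    """
    masks = masks / masks.sum(axis=0, keepdims=True)  # weights sum to 1
    return np.sum(masks * segment_shapes, axis=0)
```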
`
3.1 Facial attributes

Shape and texture coefficients αᵢ and βᵢ in our morphable face
model do not correspond to the facial attributes used in human
language. While some facial attributes can easily be related to biophysical
measurements [13, 10], such as the width of the mouth, others,
such as facial femininity or being more or less bony, can hardly be
described by numbers. In this section, we describe a method for
mapping facial attributes, defined by a hand-labeled set of example
faces, to the parameter space of our morphable model. At each position
in face space (that is, for any possible face), we define shape
and texture vectors that, when added to or subtracted from a face,
will manipulate a specific attribute while keeping all other attributes
as constant as possible.
In a performance-based technique [25], facial expressions can be
transferred by recording two scans of the same individual with different
expressions, and adding the differences ΔS = S_expression − S_neutral,
ΔT = T_expression − T_neutral to a different individual
in a neutral expression.
Unlike facial expressions, attributes that are invariant within an individual
are more difficult to isolate. The following method allows
us to model facial attributes such as gender, fullness of faces, darkness
of eyebrows, double chins, and hooked versus concave noses
(Figure 3). Based on a set of faces (Sᵢ, Tᵢ) with manually assigned
labels μᵢ describing the markedness of the attribute, we compute
`
weighted sums

    \Delta S = \sum_{i=1}^{m} \mu_i (S_i - \bar{S}), \qquad \Delta T = \sum_{i=1}^{m} \mu_i (T_i - \bar{T}).    (3)
Multiples of (ΔS, ΔT) can now be added to or subtracted from
any individual face. For binary attributes, such as gender, we assign
constant values μ_A for all m_A faces in class A, and μ_B ≠ μ_A for
all m_B faces in class B. Affecting only the scaling of ΔS and ΔT, the
choice of the μs is arbitrary.
To justify this method, let μ(S, T) be the overall function describing
the markedness of the attribute in a face (S, T). Since
μ(S, T) is not available per se for all (S, T), the regression problem
of estimating μ(S, T) from a sample set of labeled faces has
to be solved. Our technique assumes that μ(S, T) is a linear function.
Consequently, in order to achieve a change Δμ of the attribute,
there is only a single optimal direction (ΔS, ΔT) for the
whole space of faces. It can be shown that Equation (3) defines
the direction with minimal variance-normalized length ‖ΔS‖²_M =
⟨ΔS, C_S⁻¹ΔS⟩, ‖ΔT‖²_M = ⟨ΔT, C_T⁻¹ΔT⟩.
A different kind of facial attribute is its "distinctiveness", which
is commonly manipulated in caricatures. The automated production
of caricatures has been possible for many years [6]. This technique
can easily be extended from 2D images to our morphable face
model. Individual faces are caricatured by increasing their distance
from the average face. In our representation, shape and texture coefficients
αᵢ, βᵢ are simply multiplied by a constant factor.
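A minimal sketch of Equation (3) (hypothetical NumPy code; array
layout as before):

```python
import numpy as np

def attribute_direction(shapes, textures, mu):
    """Attribute direction (Delta S, Delta T) of Equation (3).

    mu: (m,) hand-assigned markedness labels; for a binary attribute
        such as gender, constant mu_A in class A and mu_B != mu_A in B.
    """
    mu = np.asarray(mu, dtype=float)
    d_s = mu @ (shapes - shapes.mean(axis=0))
    d_t = mu @ (textures - textures.mean(axis=0))
    return d_s, d_t

# face_s + c * d_s, face_t + c * d_t strengthens the attribute; a
# negative c weakens it. Caricatures instead scale the coefficients
# alpha_i, beta_i of a face directly.
```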
`
[Figure 3 panels: ORIGINAL, CARICATURE, MORE MALE, FEMALE, SMILE,
FROWN, WEIGHT, HOOKED NOSE.]

Figure 3: Variation of facial attributes of a single face. The appearance
of an original face can be changed by adding or subtracting
shape and texture vectors specific to the attribute.
`
4 Matching a morphable model to images

A crucial element of our framework is an algorithm for automatically
matching the morphable face model to one or more images.
Providing an estimate of the face's 3D structure (Figure 4), it closes
the gap between the specific manipulations described in Section 3.1
and the type of data available in typical applications.
Coefficients of the 3D model are optimized along with a set of
rendering parameters such that they produce an image as close as
possible to the input image. In an analysis-by-synthesis loop, the
algorithm creates a texture mapped 3D face from the current model
parameters, renders an image, and updates the parameters according
to the residual difference. It starts with the average head and
with rendering parameters roughly estimated by the user.
`
[Figure 4 panels: 2D input; initializing the morphable model (rough
interactive alignment of the average head); automated 3D shape and
texture reconstruction; illumination-corrected texture extraction;
detail.]

Figure 4: Processing steps for reconstructing 3D shape and texture
of a new face from a single image. After a rough manual alignment
of the average 3D head (top row), the automated matching procedure
fits the 3D morphable model to the image (center row). In the
right column, the model is rendered on top of the input image. Details
in texture can be improved by illumination-corrected texture
extraction from the input (bottom row).
Model Parameters: Facial shape and texture are defined by
coefficients α_j and β_j, j = 1, ..., m − 1 (Equation 2).
Rendering parameters ρ contain camera position (azimuth and
elevation), object scale, image plane rotation and translation,
intensity i_{r,amb}, i_{g,amb}, i_{b,amb} of ambient light, and intensity
i_{r,dir}, i_{g,dir}, i_{b,dir} of directed light. In order to handle photographs
taken under a wide variety of conditions, ρ also includes color contrast
as well as offset and gain in the red, green, and blue channels.
Other parameters, such as camera distance, light direction, and surface
shininess, remain fixed to the values estimated by the user.
From parameters (α, β, ρ), colored images

    I_{model}(x, y) = ( I_{r,model}(x, y), I_{g,model}(x, y), I_{b,model}(x, y) )^T

are rendered using perspective projection and the Phong illumination
model. The reconstructed image is supposed to be closest to
the input image in terms of Euclidean distance

    E_I = \sum_{x,y} \| I_{input}(x, y) - I_{model}(x, y) \|^2.    (4)
Matching a 3D surface to a given image is an ill-posed problem.
Along with the desired solution, many non-face-like surfaces lead
to the same image. It is therefore essential to impose constraints
on the set of solutions. In our morphable model, shape and texture
vectors are restricted to the vector space spanned by the database.
Within the vector space of faces, solutions can be further restricted
by a tradeoff between matching quality and prior probabilities,
using P(α), P(β) from Section 3 and an ad-hoc estimate
of P(ρ). In terms of Bayes decision theory, the problem is to find
the set of parameters (α, β, ρ) with maximum posterior probability,
given an image I_input. While α, β, and rendering parameters
ρ completely determine the predicted image I_model, the observed
image I_input may vary due to noise. For Gaussian noise
`
`
`
`
with a standard deviation σ_N, the likelihood to observe I_input is
p(I_input | α, β, ρ) ∼ exp[−E_I / (2σ_N²)]. Maximum posterior probability
is then achieved by minimizing the cost function

    E = \frac{1}{\sigma_N^2} E_I + \sum_{j=1}^{m-1} \frac{\alpha_j^2}{\sigma_{S,j}^2} + \sum_{j=1}^{m-1} \frac{\beta_j^2}{\sigma_{T,j}^2} + \sum_{j} \frac{(\rho_j - \bar{\rho}_j)^2}{\sigma_{R,j}^2}.    (5)
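In code, the cost of Equation (5) might look as follows (a sketch with
our own variable names; sigma_s2 and sigma_t2 stand for the PCA
eigenvalues, sigma_r2 for the ad-hoc variances of the rendering
parameters):

```python
import numpy as np

def cost(e_image, alpha, beta, rho, rho_bar,
         sigma_n, sigma_s2, sigma_t2, sigma_r2):
    """Posterior cost of Equation (5): image error plus priors.

    e_image:  E_I, summed squared pixel differences
    sigma_*2: prior variances for alpha, beta and rho
    """
    return (e_image / sigma_n**2
            + np.sum(alpha**2 / sigma_s2)
            + np.sum(beta**2 / sigma_t2)
            + np.sum((rho - rho_bar)**2 / sigma_r2))
```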
`
The optimization algorithm described below uses an estimate of
E based on a random selection of surface points. Predicted color
values I_model are easiest to evaluate in the centers of triangles. In
the center of triangle k, texture (R̄_k, Ḡ_k, B̄_k) and 3D location
(X̄_k, Ȳ_k, Z̄_k) are averages of the values at the corners. Perspective
projection maps these points to image locations (p̄_{x,k}, p̄_{y,k})ᵀ.
Surface normals n_k of each triangle k are determined by the 3D locations
of the corners. According to Phong illumination, the color
components I_{r,model}, I_{g,model} and I_{b,model} take the form

    I_{r,model,k} = ( i_{r,amb} + i_{r,dir} \cdot (n_k \cdot l) )\, \bar{R}_k + i_{r,dir}\, s\, (r_k \cdot v_k)^{\nu},    (6)

where l is the direction of illumination, v_k the normalized difference
of camera position and the position of the triangle's center, and
r_k = 2(n_k · l)n_k − l the direction of the reflected ray. s denotes surface
shininess, and ν controls the angular distribution of the specular
reflection. Equation (6) reduces to I_{r,model,k} = i_{r,amb} R̄_k if
a shadow is cast on the center of the triangle, which is tested in a
method described below.
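A per-triangle, per-channel sketch of Equation (6) (our own code;
the clamping of dot products to non-negative values is a standard
rendering convention the paper does not spell out):

```python
import numpy as np

def phong_channel(i_amb, i_dir, albedo, n, l, v, s, nu, shadowed=False):
    """Phong color of one channel at a triangle center (Equation 6).

    n, l, v: unit normal, light direction, direction to camera
    s, nu:   specular strength and angular-falloff exponent
    """
    if shadowed:
        return i_amb * albedo           # cast shadow: ambient term only
    r = 2.0 * np.dot(n, l) * n - l      # direction of the reflected ray
    diffuse = (i_amb + i_dir * max(np.dot(n, l), 0.0)) * albedo
    specular = i_dir * s * max(np.dot(r, v), 0.0)**nu
    return diffuse + specular
```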
For high resolution 3D meshes, variations in I_model across each
triangle k ∈ {1, ..., n_t} are small, so E_I may be approximated by

    E_I \approx \sum_{k=1}^{n_t} a_k \cdot \| I_{input}(\bar{p}_{x,k}, \bar{p}_{y,k}) - I_{model,k} \|^2,

where a_k is the image area covered by triangle k. If the triangle is
occluded, a_k = 0.
In gradient descent, contributions from different triangles of the
mesh would be redundant. In each iteration, we therefore select a
random subset K ⊂ {1, ..., n_t} of 40 triangles k and replace E_I by

    E_K = \sum_{k \in K} \| I_{input}(\bar{p}_{x,k}, \bar{p}_{y,k}) - I_{model,k} \|^2.    (7)

The probability of selecting k is p(k ∈ K) ∼ a_k. This method of
stochastic gradient descent [16] is not only more efficient computationally,
but also helps to avoid local minima by adding noise to the
gradient estimate.
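The area-weighted random selection can be written directly (a short
NumPy sketch):

```python
import numpy as np

def sample_triangles(areas, count=40, rng=None):
    """Draw the random subset K of Equation (7): each triangle k is
    selected with probability proportional to its visible image area
    a_k, so occluded triangles (a_k = 0) are never chosen."""
    if rng is None:
        rng = np.random.default_rng()
    p = areas / areas.sum()
    return rng.choice(len(areas), size=count, replace=False, p=p)
```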
Before the first iteration, and once every 1000 steps, the algorithm
computes the full 3D shape of the current model, and 2D positions
(p_x, p_y)ᵀ of all vertices. It then determines a_k, and detects
hidden surfaces and cast shadows in a two-pass z-buffer technique.
We assume that occlusions and cast shadows are constant during
each subset of iterations.
Parameters are updated depending on analytical derivatives of
the cost function E, using α_j ↦ α_j − λ_j · ∂E/∂α_j, and similarly for
β_j and ρ_j, with suitable factors λ_j.
Derivatives of texture and shape (Equation 1) yield derivatives
of 2D locations (p̄_{x,k}, p̄_{y,k})ᵀ, surface normals n_k, vectors v_k and
r_k, and I_{model,k} (Equation 6) using the chain rule. From Equation (7),
partial derivatives ∂E_K/∂α_j, ∂E_K/∂β_j and ∂E_K/∂ρ_j can be obtained.
Coarse-to-Fine: In order to avoid local minima, the algorithm follows
a coarse-to-fine strategy in several respects:
a) The first set of iterations is performed on a down-sampled version
of the input image with a low resolution morphable model.
b) We start by optimizing only the first coefficients α_j and β_j, controlling
the first principal components, along with all parameters ρ_j.
In subsequent iterations, more and more principal components are added.
`
[Figure 5 panels: pair of input images; automated simultaneous
matching; reconstruction of 3D shape and texture; illumination-corrected
texture extraction; original, reconstruction, new views.]

Figure 5: Simultaneous reconstruction of 3D shape and texture of a
new face from two images taken under different conditions. In the
center row, the 3D face is rendered on top of the input images.
c) Starting with a relatively large σ_N, which puts a strong weight
on prior probability in equation (5) and ties the optimum towards
the prior expectation value, we later reduce σ_N to obtain maximum
matching quality.
d) In the last iterations, the face model is broken down into segments
(Section 3). With parameters ρ_j fixed, coefficients α_j and
β_j are optimized independently for each segment. The increased
number of degrees of freedom significantly improves facial details.
Multiple Images: It is straightforward to extend this technique to
the case where several images of a person are available (Figure 5).
While shape and texture are still described by a common set of α_j
and β_j, there is now a separate set of ρ_j for each input image. E_I
is replaced by a sum of image distances for each pair of input and
model images, and all parameters are optimized simultaneously.
Illumination-Corrected Texture Extraction: Specific features of
individual faces that are not captured by the morphable model, such
as blemishes, are extracted from the image in a subsequent texture
adaptation process. Extracting texture from images is a technique
widely used in constructing 3D models from images (e.g. [28]).
However, in order to be able to change pose and illumination, it
is important to separate pure albedo at any given point from the
influence of shading and cast shadows in the image. In our approach,
this can be achieved because our matching procedure provides
an estimate of 3D shape, pose, and illumination conditions.
Subsequent to matching, we compare the prediction I_{model,i} for each
vertex i with I_input(p_{x,i}, p_{y,i}), and compute the change in texture
(R_i, G_i, B_i) that accounts for the difference. In areas occluded in
the image, we rely on the prediction made by the model. Data from
multiple images can be blended using methods similar to [28].
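As a simplified sketch of the idea (our own code; it inverts only the
diffuse part of Equation (6) and ignores specular highlights and the
color transforms, so it is not the full procedure):

```python
import numpy as np

def corrected_albedo(i_input, i_amb, i_dir, n, l, shadowed):
    """Illumination-corrected texture for one vertex and channel:
    divide the observed intensity by the predicted shading, so the
    stored value approximates pure albedo."""
    shading = i_amb if shadowed else i_amb + i_dir * max(np.dot(n, l), 0.0)
    return i_input / max(shading, 1e-6)   # guard against division by zero
```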
`
4.1 Matching a morphable model to 3D scans

The method described above can also be applied to register new
3D faces. Analogous to images, where perspective projection
P : ℝ³ → ℝ² and an illumination model define a colored image
I(x, y) = (R(x, y), G(x, y), B(x, y))ᵀ, laser scans provide
a two-dimensional cylindrical parameterization of the surface by
means of a mapping C : ℝ³ → ℝ², (x, y, z) ↦ (h, φ). Hence,
a scan can be represented as

    I(h, \phi) = ( R(h,\phi), G(h,\phi), B(h,\phi), r(h,\phi) )^T.    (8)

In a face (S, T), defined by shape and texture coefficients α_j and
β_j (Equation 1), vertex i with texture values (R_i, G_i, B_i) and
cylindrical coordinates (h_i, φ_i) is mapped to I_model(h_i, φ_i) =
(R_i, G_i, B_i, r_i)ᵀ. The matching algorithm from the previous section
now determines α_j and β_j minimizing

    E = \sum_{h,\phi} \| I_{input}(h,\phi) - I_{model}(h,\phi) \|^2.
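A sketch of the four-channel scan representation of Equation (8) and
the matching error (our own NumPy code):

```python
import numpy as np

def scan_image(rgb, radii):
    """Stack a cylindrical scan into the (R, G, B, r) representation
    of Equation (8); rgb is (H, W, 3), radii is (H, W)."""
    return np.concatenate([rgb, radii[..., None]], axis=-1)

def scan_error(scan_input, scan_model):
    """Summed squared difference over all (h, phi) samples."""
    return np.sum((scan_input - scan_model)**2)
```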
`
5 Building a morphable model

In this section, we describe how to build the morphable model from
a set of unregistered 3D prototypes, and how to add a new face to the
existing morphable model, increasing its dimensionality.
The key problem is to compute a dense point-to-point correspondence
between the vertices of the faces. Since the method described
in Section 4.1 finds the best match of a given face only within the
range of the morphable model, it cannot add new dimensions to the
vector space of faces. To determine residual deviations between a
novel face and the best match within the model, as well as to set
unregistered prototypes in correspondence, we use an optic flow algorithm
that computes correspondence between two faces without
the need of a morphable model [35]. The following section summarizes
this technique.
`
5.1 3D Correspondence using Optic Flow

Initially designed to find corresponding points in grey-level images
I(x, y), a gradient-based optic flow algorithm [2] is modified to establish
correspondence between a pair of 3D scans I(h, φ) (Equation
8), taking into account color and radius values simultaneously
[35]. The algorithm computes a flow field (δh(h, φ), δφ(h, φ)) that
minimizes differences of ‖I₁(h, φ) − I₂(h + δh, φ + δφ)‖ in a norm
that weights variations in texture and shape equally. Surface properties
from differential geometry, such as mean curvature, may be
used as additional components in I(h, φ).
On facial regions with little structure in texture and shape, such
as forehead and cheeks, the results of the optic flow algorithm are
sometimes spurious. We therefore perform a smooth interpolation
based on simulated relaxation of a system of flow vectors that are
coupled with their neighbors. The quadratic coupling potential is
equal for all flow vectors. On high-contrast areas, components of
flow vectors orthogonal to edges are bound to the result of the previous
optic flow computation. The system is otherwise free to take
on a smooth minimum-energy arrangement. Unlike simple filtering
routines, our technique fully retains matching quality wherever
the flow field is reliable. Optic flow and smooth interpolation are
computed on several consecutive levels of resolution.
Constructing a morphable face model from a set of unregistered
3D scans requires the computation of the flow fields between each
face and an arbitrary reference face. Given a definition of shape and
texture vectors S_ref and T_ref for the reference face, S and T for
each face in the database can be obtained by means of the point-to-point
correspondence provided by (δh(h, φ), δφ(h, φ)).
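Once the flow field is known, the target scan can be resampled so that
each reference sample receives its corresponding point; a sketch (our
own code, using SciPy interpolation and assuming the flow is given in
grid units with a periodic φ axis):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def resample_with_flow(scan, dh, dphi):
    """Pull back a 4-channel scan I(h, phi) along the flow field
    (dh, dphi), so index (h, phi) of the result holds the target
    values at (h + dh, phi + dphi)."""
    n_h, n_phi = scan.shape[:2]
    hh, pp = np.meshgrid(np.arange(n_h), np.arange(n_phi), indexing='ij')
    coords = np.stack([hh + dh, (pp + dphi) % n_phi])   # wrap phi
    return np.stack([map_coordinates(scan[..., c], coords,
                                     order=1, mode='nearest')
                     for c in range(scan.shape[-1])], axis=-1)
```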
`
5.2 Bootstrapping the model

Because the optic flow algorithm does not incorporate any constraints
on the set of solutions, it fails on some of the more unusual

Figure 6: Matching a morphable model to a single image (1) of a
face results in a 3D shape (2) and a texture map estimate. The texture
estimate can be improved by additional texture extraction (4).
The 3D model is rendered back into the image after changing facial
attributes, such as gaining (