GTL 1013
IPR of U.S. Patent 9,007,420

Computer Vision and Image Understanding
Volume 101, Number 1, January 2006

© 2006 Elsevier Inc. All rights reserved.
`
`
`
`
`
`
`
`
`
Available online at www.sciencedirect.com

Computer Vision and Image Understanding 101 (2006) 1–15

www.elsevier.com/locate/cviu
`
A survey of approaches and challenges in
3D and multi-modal 3D + 2D face recognition
`
Kevin W. Bowyer *, Kyong Chang, Patrick Flynn

Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA

Received 27 August 2004; accepted May 2005
Available online 11 October 2005
`
Abstract

This survey focuses on recognition performed by matching models of the three-dimensional shape of the face, either alone or in combination with matching corresponding two-dimensional intensity images. Research trends to date are summarized, and challenges confronting the development of more accurate three-dimensional face recognition are identified. These challenges include the need for better sensors, improved recognition algorithms, and more rigorous experimental methodology.
© 2005 Elsevier Inc. All rights reserved.

Keywords: Biometrics; Face recognition; Three-dimensional face recognition; Range image; Multi-modal
`
1. Introduction
`
Evaluations such as the Face Recognition Vendor Test (FRVT) 2002 [46] make it clear that the current state of the art in face recognition is not yet sufficient for the more demanding applications. However, biometric technologies that currently offer greater accuracy, such as fingerprint and iris, require much greater explicit cooperation from the user. For example, fingerprint requires that the subject cooperate in making physical contact with the sensor surface. This raises issues of how to keep the surface clean and germ-free in a high-throughput application. Iris imaging currently requires that the subject cooperate to carefully position their eye relative to the sensor. This can also cause problems in a high-throughput application. Thus there is significant potential application-driven demand for improved performance in face recognition. One goal of the Face Recognition Grand Challenge program [45] sponsored by various government agencies is to foster an order-of-magnitude increase in face recognition performance over that documented in FRVT 2002.
`
* Corresponding author. Fax: +1 574 631 9260.
E-mail addresses: kwb@cse.nd.edu (K.W. Bowyer), jin.chang@philips.com (K. Chang), flynn@cse.nd.edu (P. Flynn).

1077-3142/$ - see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.cviu.2005.05.005
`
`
The vast majority of face recognition research and commercial face recognition systems use typical intensity images of the face. We refer to these as “2D images.” In contrast, a “3D image” of the face is one that represents three-dimensional shape. A recent extensive survey of face recognition research is given in [60], but does not include research efforts based on matching 3D shape. Our survey given here focuses specifically on 3D face recognition. This is an update and expansion of earlier versions [8,9], to include the initial round of research results coming out of the Face Recognition Grand Challenge [16,33,41,44,50], as well as other recent results [42,28,29,30,32,31]. Scheenstra et al. [51] give an alternate survey of some of the earlier work in 3D face recognition.
We are particularly interested in 3D face recognition because it is commonly thought that the use of 3D sensing has the potential for greater recognition accuracy than 2D. For example, one paper states: “Because we are working in 3D, we overcome limitations due to viewpoint and lighting variations” [34]. Another paper describing a different approach to 3D face recognition states: “Range images have the advantage of capturing shape variation irrespective of illumination variabilities” [22]. Similarly, a third paper states: “Depth and curvature features have several advantages over more traditional intensity-based features. Specifically, curvature descriptors: (1) have the potential for higher accuracy in describing surface-based events, (2) are better suited to describe properties of the face in areas such as the cheeks, forehead, and chin, and (3) are viewpoint invariant” [21].
`
2. Background concepts and terminology

The general term “face recognition” can refer to different application scenarios. One scenario is called “recognition” or “identification,” and another is called “authentication” or “verification.” In either scenario, face images of known persons are initially enrolled into the system. This set of persons is sometimes referred to as the “gallery.” Later images of these or other persons are used as “probes” to match against images in the gallery. In a recognition scenario, the matching is one-to-many, in the sense that a probe is matched against all of the gallery to find the best match above some threshold. In an authentication scenario, the matching is one-to-one, in the sense that the probe is matched against the gallery entry for a claimed identity, and the claimed identity is taken to be authenticated if the quality of match exceeds some threshold. The recognition scenario is more technically challenging than the authentication scenario. One reason is that a larger gallery tends to present more chances for incorrect recognition. Another reason is that the whole gallery must be searched in some manner on each recognition attempt.
`
While research results may be presented in the context of either recognition or authentication, the core 3D representation and matching issues are essentially the same. In fact, the raw matching scores underlying the cumulative match characteristic (CMC) curve for a recognition experiment can readily be tabulated in a different manner to produce the receiver operating characteristic (ROC) curve for an authentication experiment. The CMC curve summarizes the percent of a set of probes that is considered to be correctly matched as a function of the match rank that is counted as a correct match. The rank-one recognition rate is the most commonly stated single number from the CMC curve. The ROC curve summarizes the percent of a set of probes that is falsely rejected as a tradeoff against the percent that is falsely accepted. The equal-error rate (EER), the point where the false reject rate equals the false accept rate, is the most commonly stated single number from the ROC curve.
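The tabulation described above is mechanical enough to sketch. The following is a minimal illustration (not from any of the surveyed systems; all function names are ours), assuming a probe-by-gallery similarity matrix in which probe i's true mate is gallery entry i:

```python
import numpy as np

def rank_one_rate(scores):
    """CMC rank-one rate from a probe-by-gallery similarity matrix,
    assuming probe i's true mate is gallery entry i."""
    best = np.argmax(scores, axis=1)
    return float(np.mean(best == np.arange(scores.shape[0])))

def equal_error_rate(scores):
    """EER from the same scores: genuine scores on the diagonal,
    impostor scores off it, swept over every candidate threshold."""
    genuine = np.diag(scores)
    impostor = scores[~np.eye(scores.shape[0], dtype=bool)]
    best_gap, eer = np.inf, None
    for t in np.unique(scores):
        frr = np.mean(genuine < t)    # false reject rate at threshold t
        far = np.mean(impostor >= t)  # false accept rate at threshold t
        if abs(frr - far) < best_gap:
            best_gap, eer = abs(frr - far), (frr + far) / 2
    return float(eer)

# Toy 3-probe, 3-gallery example with strong diagonal (genuine) scores.
scores = np.array([[0.9, 0.2, 0.1],
                   [0.3, 0.8, 0.2],
                   [0.1, 0.4, 0.7]])
print(rank_one_rate(scores), equal_error_rate(scores))  # prints: 1.0 0.0
```

The point of the sketch is that both summary numbers come from the same underlying score matrix, differing only in how the scores are tabulated.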
`
The 3D shape of the face is often sensed in combination with a 2D intensity image. In this case, the 2D image can be thought of as a “texture map” overlaid on the 3D shape. An example of a 2D intensity image and the corresponding 3D shape are shown in Fig. 1, with the 3D shape rendered in the form of a range image, a shaded 3D model, and a mesh of points. A “range image,” also sometimes called a “depth image,” is an image in which the pixel value reflects the distance from the sensor to the imaged surface. In Fig. 1, the lighter values are closer to the sensor and the darker values are farther away. A range image, a shaded model, and a wire-frame mesh are common alternatives for displaying 3D face data.

Fig. 1. Example of 2D intensity and 3D shape data. The 2D intensity image and the 3D range image are representations that would be used with “eigenface” style approaches. (A) Cropped 2D intensity image. (B) 3D rendered as range image. (C) 3D rendered as shaded model. (D) 3D rendered as wireframe.
As commonly used, the term multi-modal biometrics refers to the use of multiple imaging modalities, such as 3D and 2D images of the face. The term “multi-modal” is perhaps imprecise here, because the two types of data may be acquired by the same imaging system. In this survey, we consider algorithms for multi-modal 3D and 2D face recognition as well as algorithms that use only 3D shape. We do not consider here the family of approaches in which a generic, “morphable” 3D face model is used as an intermediate step in matching two 2D images for face recognition. This approach was popularized by Blanz and Vetter [5]. Its potential was investigated in the FRVT 2002 report [46], and variations of this type of approach are already used in various commercial face recognition systems. However, this type of approach does not involve the sensing or matching of 3D shape descriptions. Rather, a 2D image is mapped onto a deformable 3D model, and the 3D model with texture is used to produce a set of synthetic 2D images for the matching process.
`
3. Recognition based solely on 3D shape

Table 1 gives a comparison of selected elements of algorithms that use only 3D shape to recognize faces. The works are listed chronologically by year of publication, and alphabetically by first author within a given year. The earliest work in this area was done over a decade ago [12,21,26,39]. There was relatively little work in this area through the 1990s, but activity has increased greatly in recent years.

Most papers report performance as the rank-one recognition rate, although some report equal-error rate or verification rate at a specified false accept rate. Historically, the experimental component of work in this area was rather modest. The number of persons represented in experimental data sets did not reach 100 until 2003, and only a few works have dealt with data sets that explicitly incorporate pose and/or expression variation [38,30,44,16,11]. It is therefore perhaps not surprising that most of the early works reported rank-one recognition rates of 100%. However, the Face Recognition Grand Challenge program [45] has already resulted in several research groups publishing results on a common data set representing over 4000 images of over 400 persons, with substantial variation in facial expression. Examples of the different facial expressions present in the FRGC version two dataset are shown in Fig. 2. As experimental data sets have become larger and more challenging, algorithms have become more sophisticated even if the reported recognition rates are not as high as in some earlier works.
`
Table 1
Recognition algorithms using 3D shape alone

Author, year [ref] | Persons in dataset | Images in dataset | Image size | 3D face data | Core matching algorithm | Reported performance
Cartoux, 1989 [12] | 5 | 18 | Not available | Profile, surface | Minimum distance | 100%
Lee, 1990 [26] | 6 | 6 | 256 x 150 | EGI | Correlation | None
Gordon, 1992 [21] | 26 train, 8 test | 26 train, 24 test | Not available | Feature vector | Closest vector | 100%
Nagamine, 1992 [39] | 16 | 160 | 256 x 240 | Multiple profiles | Closest vector | 100%
Achermann, 1997 [2] | 24 | 240 | 75 x 150 | Range image | PCA, HMM | 100%
Tanaka, 1998 [52] | 37 | 37 | 256 x 256 | EGI | Correlation | 100%
Achermann, 2000 [3] | 24 | 240 | 75 x 150 | Point set | Hausdorff distance | 100%
Chua, 2000 [17] | 6 | 24 | Not available | Point set | Point signature | 100%
Hesher, 2003 [22] | 37 | 222 | 242 x 347 | Range image | PCA | 97%
Lee, 2003 [27] | 35 | 70 | 320 x 320 | Feature vector | Closest vector | 94% at rank 5
Medioni, 2003 [34] | 100 | 700 | Not available | Point set | ICP | 98%
Moreno, 2003 [38] | 60 | 420 | 2.2K points | Feature vector | Closest vector | 78%
Pan, 2003 [42] | 30 | 360 | 3K points | Point set, range image | Hausdorff and PCA | 3-5% EER, 5-7% EER
Lee, 2004 [28] | 42 | 84 | 240 x 320 | Range, curvature | Weighted Hausdorff | 98%
Lu, 2004 [30] | 18 | 113 | 240 x 320 | Point set | ICP | 96%
Russ, 2004 [49] | 200 FRGC v1 | 468 | 480 x 640 | Range image | Hausdorff distance | 98% verification
Xu, 2004 [57] | 120 | 720 | Not available | Point set + feature vector | Minimum distance | 96% on 30, 72% on 120
Bronstein, 2005 [11] | 30 | 220 | Not available | Point set | “Canonical forms” | 100%
Chang, 2005 [16] | 466 FRGC v2 | 4007 | 480 x 640 | Point set | Multi-ICP | 92%
Gökberk, 2005 [20] | 106 | 571 | Not available | Multiple | Multiple | 99%
Lee, 2005 [29] | 100 | 300 | Various | Feature vector | SVM | 96%
Lu, 2005 [31] | 100 | 196 probes | 240 x 320 | Surface mesh | ICP, TPS | 89%
Pan, 2005 [41] | 275 FRGC v1 | 943 | 480 x 640 | Range image | PCA | 95%, 3.8% EER
Passalis, 2005 [44] | 466 FRGC v2 | 4007 | 480 x 640 | Surface mesh | Deformable model | 90%
Russ, 2005 [50] | 200 FRGC v1 | 393 | 480 x 640 | Range image | Hausdorff distance | 98.5%
`
`
Fig. 2. Example images in 2D and 3D with different expressions. The seven expressions depicted are: neutral, angry, happy, sad, surprised, disgusted, and puffy.
`
Cartoux et al. [12] approach 3D face recognition by segmenting a range image based on principal curvature and finding a plane of bilateral symmetry through the face. This plane is used to normalize for pose. They consider methods of matching the profile from the plane of symmetry and of matching the face surface, and report 100% recognition for either in a small dataset.
`
Lee and Milios [26] segment convex regions in a range image based on the sign of the mean and Gaussian curvatures, and create an extended Gaussian image (EGI) for each convex region. A match between a region in a probe image and in a gallery image is done by correlating EGIs. The EGI describes the shape of an object by the distribution of surface normals over the object surface. A graph matching algorithm incorporating relational constraints is used to establish an overall match of probe image to gallery image. Convex regions are asserted to change shape less than other regions in response to changes in facial expression. This gives some ability to cope with changes in facial expression. However, EGIs are not sensitive to change in object size, and so two faces of similar shape but different size will not be distinguishable in this representation.
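The EGI construction described above, a histogram of surface normals over the object surface, can be sketched as follows. This is an illustrative simplification, not Lee and Milios's implementation; the binning resolution and all names are our choices:

```python
import numpy as np

def extended_gaussian_image(normals, n_theta=8, n_phi=16):
    """Bin unit-normalized surface normals into a coarse spherical
    histogram: EGI cell (i, j) holds the fraction of the surface whose
    normal points into that (polar, azimuth) direction cell."""
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    theta = np.arccos(np.clip(n[:, 2], -1.0, 1.0))          # polar angle
    phi = np.mod(np.arctan2(n[:, 1], n[:, 0]), 2 * np.pi)   # azimuth
    ti = np.minimum((theta / np.pi * n_theta).astype(int), n_theta - 1)
    pj = np.minimum((phi / (2 * np.pi) * n_phi).astype(int), n_phi - 1)
    egi = np.zeros((n_theta, n_phi))
    np.add.at(egi, (ti, pj), 1.0)                           # accumulate counts
    return egi / len(n)

def egi_correlation(a, b):
    """Normalized correlation of two EGIs, used here as the match score."""
    a, b = a.ravel() - a.mean(), b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

Note that scaling a shape rescales its vertex positions but not its normal directions, so an enlarged copy produces an identical EGI, which is exactly the size insensitivity noted above.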
Gordon [21] begins with a curvature-based segmentation of the face. Then a set of features is extracted that describe both curvature and metric size properties of the face. Thus each face becomes a point in feature space, and nearest-neighbor matching is done. Experiments are reported with a test set of three views of each of eight faces, and recognition rates as high as 100% are reported. It is noted that the values of the features used are generally similar for different images of the same face, “except for the cases with large feature detection error, or variation due to expression” [21].
Nagamine et al. [39] approach 3D face recognition by finding five feature points, using those feature points to standardize face pose, and then matching various curves or profiles through the face data. Experiments are performed for 16 subjects, with 10 images per subject. The best recognition rates are found using vertical profile curves that pass through the central portion of the face. Computational requirements were apparently regarded as severe at the time this work was performed, as the authors note that “using the whole facial data may not be feasible considering the large computation and hardware capacity needed” [39].
Achermann et al. [2] extend eigenface and hidden Markov model (HMM) approaches used for 2D face recognition to work with range images. They present results for a dataset of 24 persons, with 10 images per person, and report 100% recognition using an adaptation of the 2D face recognition algorithms.
Tanaka et al. [52] also perform curvature-based segmentation and represent the face using an extended Gaussian image (EGI). Recognition is performed using a spherical correlation of the EGIs. Experiments are reported with a set of 37 images from a National Research Council of Canada range image dataset [48], and 100% recognition is reported.
Chua et al. [17] use “point signatures” in 3D face recognition. To deal with facial expression change, only the approximately rigid portion of the face from just below the nose up through the forehead is used in matching. Point signatures are used to locate reference points that are used to standardize the pose. Experiments are done with multiple images with different expressions from six subjects, and 100% recognition is reported.
Achermann and Bunke [3] report on a method of 3D face recognition that uses an extension of Hausdorff distance matching. They report on experiments using 240 range images, 10 images of each of 24 persons, and achieve 100% recognition for some instances of the algorithm.
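The Hausdorff distance that this family of methods builds on can be stated compactly. The following brute-force numpy sketch is for illustration only; Achermann and Bunke's actual extension differs:

```python
import numpy as np

def directed_hausdorff(a, b):
    """Worst-case distance from a point of set a to its nearest point of b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return float(d.min(axis=1).max())

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two 3D point sets (brute force)."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two identical scans have distance 0; shifting one copy by 0.5 along z
# moves every nearest-neighbor distance, and hence the maximum, to 0.5.
a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(hausdorff(a, a), hausdorff(a, a + np.array([0.0, 0.0, 0.5])))  # prints: 0.0 0.5
```

Because the metric takes a maximum over points, it is sensitive to outliers, which is one motivation for the weighted and partial variants discussed later in this section.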
Hesher et al. [22] explore principal component analysis (PCA) style approaches using different numbers of eigenvectors and image sizes. The image data set used has six different facial expressions for each of 37 subjects. The performance figures reported result from using multiple images per subject in the gallery. This effectively gives the probe image more chances to make a correct match, and is known to raise the recognition rate relative to having a single sample per subject in the gallery [36].
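The PCA, or eigenface, style of matching on row-vectorized range images can be sketched as follows. This is illustrative only; the training set, subspace dimension, and all names are our assumptions, not details of Hesher et al.'s system:

```python
import numpy as np

def fit_pca(train, k):
    """Learn a k-dimensional "eigenface" basis from row-vectorized range
    images (one image per row of `train`)."""
    mean = train.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:k]

def project(images, mean, basis):
    """Coordinates of each image in the learned subspace."""
    return (images - mean) @ basis.T

def rank_one_match(gallery, probes, mean, basis):
    """Index of the nearest gallery projection for each probe projection."""
    g, p = project(gallery, mean, basis), project(probes, mean, basis)
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=2)
    return d.argmin(axis=1)

# Synthetic stand-in for 5 enrolled range images and 5 noisy re-acquisitions.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(5, 30))
probes = gallery + 1e-3 * rng.normal(size=(5, 30))
mean, basis = fit_pca(gallery, k=3)
print(rank_one_match(gallery, probes, mean, basis))
```

The same machinery applies to 2D intensity images; only the pixel interpretation (depth versus intensity) changes.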
Medioni and Waupotitsch [34] perform 3D face recognition using an iterative closest point (ICP) approach to match face surfaces. Whereas most of the works covered here use 3D shapes acquired through a structured-light sensor, this work uses 3D shapes acquired by a passive stereo sensor. Experiments with seven images each from a set of 100 subjects are reported, with the seven images sampling different poses. An EER of “better than 2%” is reported.
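A minimal point-to-point ICP loop of the kind these methods build on can be sketched as follows. This is a generic illustration, not Medioni and Waupotitsch's system; the rigid-alignment step uses the standard Kabsch least-squares solution, and all names are ours:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation r and translation t aligning matched point
    pairs (Kabsch): minimizes sum |r @ src_i + t - dst_i|^2."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    h = (src - cs).T @ (dst - cd)
    u, _, vt = np.linalg.svd(h)
    sign = np.sign(np.linalg.det(vt.T @ u.T))   # guard against reflections
    r = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    return r, cd - r @ cs

def icp(probe, gallery, iters=10):
    """Point-to-point ICP: alternate nearest-neighbor correspondence with
    rigid alignment; returns the aligned probe and its RMS residual."""
    p = probe.copy()
    for _ in range(iters):
        d = np.linalg.norm(p[:, None, :] - gallery[None, :, :], axis=2)
        r, t = best_rigid_transform(p, gallery[d.argmin(axis=1)])
        p = p @ r.T + t
    d = np.linalg.norm(p[:, None, :] - gallery[None, :, :], axis=2)
    return p, float(np.sqrt((d.min(axis=1) ** 2).mean()))

# Well-separated cloud displaced by a small rotation about z plus a shift:
# a correct correspondence pass recovers the transform almost exactly.
gallery = np.array([[0, 0, 0], [2, 0, 0], [0, 2, 0], [0, 0, 2], [2, 2, 0],
                    [2, 0, 2], [0, 2, 2], [2, 2, 2], [1, 1, 3], [3, 1, 1]], float)
c, s = np.cos(0.1), np.sin(0.1)
rz = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
probe = gallery @ rz.T + np.array([0.1, -0.05, 0.02])
aligned, rms = icp(probe, gallery)
```

When used for recognition, the final RMS residual (or a robust variant of it) typically serves as the match score between probe and gallery surfaces.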
Moreno and co-workers [38] approach 3D face recognition by first performing a segmentation based on Gaussian curvature and then creating a feature vector based on the segmented regions. They report results on a dataset of 420 face meshes representing 60 different persons, with some sampling of different expressions and poses for each person. Rank-one recognition of 78% is achieved on the subset of frontal views.
`
Lee et al. [27] perform 3D face recognition by locating the nose tip, and then forming a feature vector based on contours along the face at a sequence of depth values. They report 94% correct recognition at rank five, but do not report rank-one recognition. The recognition rate can change dramatically between ranks one and five, and so it is not possible to project how this approach would perform at rank one.
`
Pan et al. [42] experiment with 3D face recognition using both a Hausdorff distance approach and a PCA-based approach. In experiments with images from the M2VTS database [35] they report an equal-error rate (EER) in the range of 3-5% for the Hausdorff distance approach and an EER in the range of 5-7% for the PCA-based approach.
Lee and Shim [28] consider approaches to using a “depth-weighted Hausdorff distance” and surface curvature information (the minimum, maximum, and Gaussian curvatures) for 3D face recognition. They present results of experiments with a data set representing 42 persons, with two images for each person. A rank-one recognition rate as high as 98% is reported for the best combination method investigated, whereas the plain Hausdorff distance achieved less than 90%.
`
Lu et al. [30] report on results of an ICP-based approach to 3D face recognition. This approach assumes that the gallery 3D image is a more complete face model and the probe 3D image is a frontal view that is likely a subset of the gallery image. In experiments with images from 18 persons, with multiple probe images per person, incorporating some variation in pose and expression, a recognition rate of 96% was achieved.
`
Russ et al. [49] present results of Hausdorff matching on range images. They use portions of the dataset used in [14] in their experiments. In a verification experiment, 200 persons were enrolled in the gallery, and the same 200 persons plus another 68 imposters were represented in the probe set. A probability of correct verification as high as 98% (of the 200) was achieved at a false alarm rate of 0 (of the 68). In a recognition experiment, 30 persons were enrolled in the gallery and the same 30 persons imaged at a later time were represented in the probe set. A 50% probability of recognition was achieved at a false alarm rate of 0. The recognition experiment uses a subset of the available data “because of the computational cost of the current algorithm” [49].
Xu et al. [57] developed a method for 3D face recognition and evaluated it using the database from Beumier and Acheroy [4]. The original 3D point cloud is converted to a regular mesh. The nose region is found and used as an anchor to find other local regions. A feature vector is computed from the data in the local regions of mouth, nose, left eye, and right eye. Feature space dimensionality is reduced using principal component analysis, and matching is based on minimum distance using both global and local shape components. Experimental results are reported for the full 120 persons in the dataset and for a subset of 30 persons, with performance of 72 and 96%, respectively. This illustrates the general point that reported experimental performance can be highly dependent on the dataset size. Most other works have not considered performance variation with dataset size. It should be mentioned that the reported performance was obtained with five images of a person used for enrollment in the gallery. Performance would generally be expected to be lower with only one image used to enroll a person.
Bronstein et al. [11] present an approach to 3D face recognition intended to allow for deformation related to facial expression. The idea is to convert the 3D face data to an “eigenform” that is invariant to the type of shape deformation that is modeled. In effect, there is an assumption that “the change of the geodesic distances due to facial expressions is insignificant.” Experimental evaluation is done using a dataset containing 220 images of 30 persons (27 real persons and 3 mannequins), and 100% recognition is reported. A total of 65 enrollment images were used for the 30 subjects, so that a subject is represented by more than one image. As already mentioned, use of more than one enrollment image per person will generally increase recognition rates. The method is compared to a 2D eigenface approach on the same subjects, but the face space is trained using just 35 images and has just 23 dimensions. The method is also compared to a rigid surface matching approach. Perhaps the most unusual aspect of this work is the claim that the approach “can distinguish between identical twins.”
`
Gökberk et al. [20] compare five approaches to 3D face recognition using a subset of the data used by Beumier and Acheroy [4]. They compare methods based on extended Gaussian images, ICP matching, range profile, PCA, and linear discriminant analysis (LDA). Their experimental dataset has 571 images from 106 people. They find that the ICP and LDA approaches offer the best performance, although performance is relatively similar among all approaches but PCA. They also explore methods of fusing the results of the five approaches and are able to achieve 99% rank-one recognition with a combination of recognizers. This work is relatively novel in comparing the performance of different 3D face recognition algorithms, and in documenting a performance increase by combining results of multiple algorithms. Additional work exploring these sorts of issues would seem to be valuable.
`
Lee et al. [29] propose an approach to 3D face recognition based on the curvature values at eight feature points on the face. Using a support vector machine for classification, they report a rank-one recognition rate of 96% for a data set representing 100 persons. They use a Cyberware sensor to acquire the enrollment images and a Genex sensor to acquire the probe images. The recognition results are called “simulation” results, apparently because the feature points are manually located.
Lu and Jain [31] extend previous work using an ICP-based recognition approach [30] to deal explicitly with variation in facial expression. The problem is approached as a rigid transformation of probe to gallery, done with ICP, along with a non-rigid deformation, done using thin-plate spline (TPS) techniques. The approach is evaluated using a 100-person dataset, with neutral-expression and smiling probes, matched to neutral-expression gallery images. The gallery entries are whole-head data structures, whereas the probes are frontal views. Most errors after the rigid transformation result from smiling probes, and these errors are reduced substantially after the non-rigid deformation stage. For the total 196 probes (98 neutral and 98 smiling), performance reaches 89% for shape-based matching and 91% for multi-modal 3D + 2D matching [32].
Russ et al. [50] developed an approach to using Hausdorff distance matching on the range image representation of the 3D face data. An iterative registration procedure similar to that in ICP is used to adjust the alignment of probe data to gallery data. Various means of reducing the space and time complexity of the matching process are explored. Experimental results are presented on a part of the FRGC version 1 data set, using one probe per person rather than all available probes. Performance as high as 98.5% rank-one recognition, or 93.5% verification at a false accept rate of 0.1%, is achieved. In related work, Koudelka et al. [24] have developed a Hausdorff-based approach to prescreening a large dataset to select the most likely matches for more careful consideration.
Pan et al. [41] apply PCA, or eigenface, matching to a novel mapping of the 3D data to a range, or depth, image. Finding the nose tip to use as a center point, and an axis of symmetry to use for alignment, the face data are mapped to a circular range image. Experimental results are reported using the FRGC version 1 data set. The facial region used in the mapping contains approximately 12,500-110,000 points. Performance is reported as 95% rank-one recognition or 3.8% EER in a verification scenario. It is not clear whether the reported performance includes the approximately 1% of the images for which the mapping process fails.
`
Chang et al. [16] describe a “multi-region” approach to 3D face recognition. It is a type of classifier ensemble approach in which multiple overlapping subregions around the nose are independently matched using ICP, and the results of the multiple 3D matches fused. The experimental evaluation in this work uses essentially the FRGC version 2 data set, representing over 4000 images from over 400 persons. In an experiment in which one neutral-expression image is enrolled as the gallery for each person, and all subsequent images (of varied facial expressions) are used as probes, performance of 92% rank-one recognition is reported.
Passalis et al. [44] describe an approach to 3D face recognition that uses annotated deformable models. An average 3D face is computed on a statistical basis from a training set. Landmark points on the 3D face are selected based on descriptions by Farkas [18]. Experimental results are presented using the FRGC version 2 data set. For an identification experiment in which one image per person is enrolled in the gallery (466 total) and all later images (3541) are used as probes, performance reaches nearly 90% rank-one recognition.
`
4. Multi-modal algorithms using 3D and 2D data

While 3D face recognition research dates back to before 1990, algorithms that combine results from 3D