IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE

Learning a Fixed-Length Fingerprint Representation

Joshua J. Engelsma, Kai Cao, and Anil K. Jain, Life Fellow, IEEE

arXiv:1909.09901v2 [cs.CV] 18 Dec 2019
Abstract—We present DeepPrint, a deep network which learns to extract fixed-length fingerprint representations of only 200 bytes. DeepPrint incorporates fingerprint domain knowledge, including alignment and minutiae detection, into the deep network architecture to maximize the discriminative power of its representation. The compact DeepPrint representation has several advantages over the prevailing variable length minutiae representation, which (i) requires computationally expensive graph matching techniques, (ii) is difficult to secure using strong encryption schemes (e.g., homomorphic encryption), and (iii) has low discriminative power in poor quality fingerprints where minutiae extraction is unreliable. We benchmark DeepPrint against two top performing COTS SDKs (Verifinger and Innovatrics) from the NIST and FVC evaluations. Coupled with a re-ranking scheme, the DeepPrint rank-1 search accuracy on the NIST SD4 dataset against a gallery of 1.1 million fingerprints is comparable to the top COTS matcher, but it is significantly faster (DeepPrint: 98.80% in 0.3 seconds vs. COTS A: 98.85% in 27 seconds). To the best of our knowledge, the DeepPrint representation is the most compact and discriminative fixed-length fingerprint representation reported in the academic literature.

Index Terms—Fingerprint Matching, Minutiae Representation, Fixed-Length Representation, Representation Learning, Deep Networks, Large-scale Search, Domain Knowledge in Deep Networks
1 INTRODUCTION

Over 100 years ago, the pioneering giant of modern day fingerprint recognition, Sir Francis Galton, astutely commented on fingerprints in his 1892 book titled “Finger Prints”:

“They have the unique merit of retaining all their peculiarities unchanged throughout life, and afford in consequence an incomparably surer criterion of identity than any other bodily feature.” [1]

Galton went on to describe fingerprint minutiae, the small details woven throughout the papillary ridges on each of our fingers, which he believed provided uniqueness and permanence properties for accurately identifying individuals. Over the 100 years since Galton's groundbreaking scientific observations, fingerprint recognition systems have become ubiquitous and can be found in a plethora of different domains [2] such as forensics [3], healthcare, mobile device security [4], mobile payments [4], border crossing [5], and national ID [6]. To date, virtually all of these systems continue to rely upon the location and orientation of minutiae within fingerprint images for recognition (Fig. 1).

Although automated fingerprint recognition systems based on minutiae representations (i.e., handcrafted features) have seen tremendous success over the years, they have several limitations.
• Minutiae-based representations are of variable length, since the number of extracted minutiae (Table 1) varies amongst different fingerprint images, even of the same finger (Fig. 2 (a)). Variations in the number of minutiae originate from a user's interaction with the fingerprint reader (placement position and applied pressure) and the condition of the finger (dry, wet, cuts, bruises, etc.). This variation in the number of minutiae causes two main problems: (i) pairwise fingerprint comparison is computationally demanding and varies with the number of minutiae, and (ii) matching in the encrypted domain, a necessity for user privacy protection, is computationally expensive and results in loss of accuracy [9].

• In the context of global population registration, fingerprint recognition can be viewed as a 75 billion class problem (≈ 7.5 billion living persons, assuming nearly all with 10 fingers) with large intra-class variability and large inter-class similarity (Fig. 2). This necessitates extremely discriminative yet compact representations that are complementary to, and at least as discriminative as, the traditional minutiae-based representation. For example, India's civil registration system, Aadhaar, now has a database of ≈ 1.3 billion residents who are enrolled based on their 10 fingerprints, 2 irises, and face image [6].

• Reliable minutiae extraction in low quality fingerprints (due to noise, distortion, or finger condition) is problematic, causing false rejects in the recognition system (Fig. 2 (a)). See also the NIST fingerprint evaluation FpVTE 2012 [10].

Fig. 1. The most popular fingerprint representation consists of (a) global level-1 features (ridge flow, core, and delta) and (b) local level-2 features, called minutiae points, together with their descriptors (e.g., texture in local minutiae neighborhoods). The fingerprint image illustrated here is a rolled impression from the NIST SD4 database [7]. The number of minutiae in NIST SD4 rolled fingerprint images ranges from 12 to 196.

J. J. Engelsma and A. K. Jain are with the Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824. E-mail: engelsm7@msu.edu, jain@cse.msu.edu
K. Cao is a Senior Biometrics Researcher at Goodix, San Diego, CA. E-mail: caokai0505@gmail.com
Fig. 2. Failures of the COTS A minutiae-based matcher (minutiae annotated with COTS A). The genuine pair (two impressions from the same finger) in (a) was falsely rejected at 0.1% FAR (score of 9) due to heavy non-linear distortion and moist fingers. The imposter pair (impressions from two different fingers) in (b) was falsely accepted at 0.1% FAR (score of 38) due to the similar minutiae distribution in these two fingerprint images (the score threshold for COTS A @ FAR = 0.1% is 34). In contrast, DeepPrint is able to correctly match the genuine pair in (a) and reject the imposter pair in (b). These slap fingerprint impressions come from the public domain FVC 2004 DB1 A database [8]. The number of minutiae in FVC 2004 DB1 A images ranges from 11 to 87.
TABLE 1
Comparison of the variable length minutiae representation with the fixed-length DeepPrint representation

Matcher  | Template Size (kB) (Min, Max) | # of Minutiae¹ (Min, Max)
COTS A   | (1.5, 23.7)                   | (12, 196)
COTS B   | (0.6, 5.3)                    | (12, 225)
Proposed | 0.2†                          | N.A.²

1 Statistics from NIST SD4 and FVC 2004 DB1.
2 Template is not explicitly comprised of minutiae.
† Template size is fixed at 200 bytes, irrespective of the number of minutiae (192 bytes for the features and 8 bytes for 2 decompression scalars).
To overcome the limitations of minutiae-based matchers, we present a reformulation of the fingerprint recognition problem. In particular, rather than extracting varying length minutiae-sets for matching (i.e., handcrafted features), we design a deep network embedded with fingerprint domain knowledge, called DeepPrint, to learn a fixed-length representation of 200 bytes which discriminates between fingerprint images from different fingers (Fig. 4).

Fig. 3. Fixed-length, 192-dimensional fingerprint representations extracted by DeepPrint (shown as 16 × 12 feature maps) from the same four fingerprints shown in Figure 2. Unlike COTS A, we correctly classify the pair in (a) as a genuine pair, and the pair in (b) as an imposter pair. The score threshold of DeepPrint @ FAR = 0.1% is 0.76.

Our work follows the trajectory of state-of-the-art automated face recognition systems, which have almost entirely abandoned traditional handcrafted features in favor of deep features extracted by deep networks, with remarkable success [11], [12], [13]. However, unlike deep network based face recognition systems, we do not completely abandon handcrafted features. Instead, we aim to integrate handcrafted fingerprint features (minutiae¹) into the deep network architecture to exploit the benefits of both deep networks and traditional, domain knowledge inspired features.
While prevailing minutiae-matchers require expensive graph matching algorithms for fingerprint comparison, the 200 byte representations extracted by DeepPrint can be compared using simple distance metrics such as the cosine similarity, requiring only d multiplications and d − 1 additions, where d is the dimensionality of the representation (for DeepPrint, d = 192)². Another significant advantage of this fixed-length representation is that it can be matched in the encrypted domain using fully homomorphic encryption [14], [15], [16], [17]. Finally, since DeepPrint is able to encode features that go beyond fingerprint minutiae, it is able to match poor quality fingerprints when reliable minutiae extraction is not possible (Figs. 2 and 3).
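As an illustration of this comparison cost, the following sketch (our own, not code from the paper) scores two DeepPrint representations with cosine similarity; on unit-normalized vectors this reduces to a single dot product, i.e., d multiplications and d − 1 additions:

    import numpy as np

    def deepprint_score(r1: np.ndarray, r2: np.ndarray) -> float:
        # Cosine similarity between two fixed-length representations
        # (d = 192 for DeepPrint). If templates are stored already
        # unit-normalized, the comparison is just np.dot(r1, r2).
        r1 = r1 / np.linalg.norm(r1)
        r2 = r2 / np.linalg.norm(r2)
        return float(np.dot(r1, r2))  # d multiplications, d - 1 additions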
1. Note that we do not require explicitly storing minutiae in our final template. Rather, we aim to guide DeepPrint to extract features related to minutiae during training of the network.
2. The DeepPrint representation is originally 768 bytes (192 features and 4 bytes per float value). We compress the 768 bytes to 200 by scaling the floats to integer values between [0, 255] and saving the two compression parameters with the features. This loss in precision (which saves significant disk storage space) very minimally affects matching accuracy.
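To make footnote 2 concrete, a minimal sketch of one plausible scaling scheme follows; the exact parameterization is our assumption, since the paper only specifies that two compression scalars are stored alongside the 192 one-byte features:

    import numpy as np

    def compress(r):
        # Map a 192-d float32 vector to uint8 codes in [0, 255]; the two
        # scalars (lo, hi) are the 8 bytes stored for decompression.
        lo, hi = float(r.min()), float(r.max())
        codes = np.round(255 * (r - lo) / (hi - lo)).astype(np.uint8)
        return codes, lo, hi  # 192 + 8 = 200 bytes total

    def decompress(codes, lo, hi):
        # Invert the scaling; the precision loss minimally affects accuracy.
        return codes.astype(np.float32) * (hi - lo) / 255 + lo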
Fig. 4. Flow diagram of DeepPrint: (i) a query fingerprint is aligned via a Localization Network which has been trained end-to-end with the Base-Network and Feature Extraction Networks (no reference points are needed for alignment); (ii) the aligned fingerprint proceeds to the Base-Network which is followed by two branches; (iii) the first branch extracts a 96-dimensional texture-based representation; (iv) the second branch extracts a 96-dimensional minutiae-based representation, guided by a side-task of minutiae detection (via a minutiae map which does not have to be extracted during testing); (v) the texture-based representation and minutiae-based representation are concatenated into a 192-dimensional representation of 768 bytes (192 features and 4 bytes per float). The 768 byte template is compressed into a 200 byte fixed-length representation by truncating floating point value features into integer value features, and saving the scaling and shifting values (8 bytes) used to truncate from floating point values to integers. The 200 byte DeepPrint representations can be used both for authentication and large-scale fingerprint search. The minutiae-map can be used to further improve system accuracy and interpretability by re-ranking candidates retrieved by the fixed-length representation.
To arrive at a compact and discriminative representation of only 200 bytes, the DeepPrint architecture is embedded with fingerprint domain knowledge via an automatic alignment module and a multi-task learning objective which requires minutiae detection (in the form of a minutiae-map) as a side task to representation learning. More specifically, DeepPrint automatically aligns an input fingerprint and subsequently extracts both a texture representation and a minutiae-based representation (both with 96 features). The 192-dimensional concatenation of these two representations, followed by compression from floating point features to integer value features, comprises a 200 byte fixed-length representation (192 bytes for the feature vector and 8 bytes for storing the 2 compression parameters). As a final step, we utilize Product Quantization [18] to further compress the DeepPrint representations stored in the gallery, significantly reducing the computational requirements and time for large-scale fingerprint search.
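For the gallery-side compression, the following sketch uses Product Quantization via the faiss library; the parameter choices (64 sub-quantizers × 8 bits, giving 64-byte codes, consistent with the footnote to Table 2) and the random stand-in data are our own assumptions:

    import numpy as np
    import faiss

    d = 192            # DeepPrint representation dimensionality
    M, nbits = 64, 8   # 64 sub-quantizers x 8 bits -> 64-byte codes

    gallery = np.random.rand(100000, d).astype("float32")  # stand-in gallery
    index = faiss.IndexPQ(d, M, nbits)
    index.train(gallery)  # learn the sub-quantizer codebooks
    index.add(gallery)    # store 64-byte PQ codes instead of 768-byte floats

    query = np.random.rand(1, d).astype("float32")
    distances, candidates = index.search(query, 100)  # top-100 for re-ranking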
Detecting minutiae (in the form of a minutiae-map) as a side-task to representation learning has several key benefits:

• We guide our representation to incorporate domain inspired features pertaining to minutiae by sharing parameters between the minutiae-map output task and the representation learning task in the multi-task learning framework.

• Since minutiae representations are the most popular for fingerprint recognition, we posit that our method for guiding the DeepPrint feature extraction via its minutiae-map side-task falls in line with the goal of “Explainable AI” [19].
• Given a probe fingerprint, we first use its DeepPrint representation to find the top k candidates and then re-rank the top k candidates using the minutiae-map provided by DeepPrint³. This optional re-ranking add-on further improves both accuracy and interpretability (see the sketch following footnote 3).

3. The 128 × 128 × 6 DeepPrint minutiae-map can be easily converted into a minutiae-set with n minutiae: {(x1, y1, θ1), ..., (xn, yn, θn)} and passed to any minutiae-matcher (e.g., COTS A, COTS B, or [20]).
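The sketch below outlines this two-stage scheme; the minutiae_match function is a hypothetical stand-in for any external minutiae matcher (e.g., COTS A, COTS B, or [20]), and the additive score fusion is our illustrative choice, not necessarily the paper's exact rule:

    import numpy as np

    def two_stage_search(probe_rep, probe_minutiae,
                         gallery_reps, gallery_minutiae, k=100):
        # Stage 1: rank the whole gallery with fast fixed-length comparisons
        # (gallery_reps: N x 192 matrix of unit-normalized representations).
        scores = gallery_reps @ probe_rep
        top_k = np.argsort(-scores)[:k]
        # Stage 2: re-rank only the k candidates with a minutiae matcher,
        # using minutiae sets recovered from the DeepPrint minutiae-maps.
        fused = [(i, scores[i] + minutiae_match(probe_minutiae,
                                                gallery_minutiae[i]))
                 for i in top_k]
        return sorted(fused, key=lambda pair: -pair[1])

    def minutiae_match(m1, m2):
        # Hypothetical hook for an external minutiae matcher.
        raise NotImplementedError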
The primary benefit of the 200 byte representation extracted by DeepPrint comes into play when performing mega-scale search against millions or even billions of identities (e.g., India's Aadhaar [6] and the FBI's Next Generation Identification (NGI) databases [3]). To highlight the significance of this benefit, we benchmark the search performance of DeepPrint against the latest version SDKs (as of July 2019) of two top performers in the NIST FpVTE 2012 (Innovatrics⁴ v7.2.1.40 and Verifinger⁵ v10.0⁶) on the NIST SD4 [7] and NIST SD14 [21] databases augmented with a gallery of nearly 1.1 million rolled fingerprints. Our empirical results demonstrate that DeepPrint is competitive with these two state-of-the-art COTS matchers in accuracy while requiring only a fraction of the search time. Furthermore, a given DeepPrint fixed-length representation can also be matched in the encrypted domain via homomorphic encryption with minor loss to recognition accuracy, as shown in [14] for face recognition.
More concisely, the primary contributions of this work are:

• A customized deep network (Fig. 4), called DeepPrint, which utilizes fingerprint domain knowledge (alignment and minutiae detection) to learn and extract a discriminative fixed-length fingerprint representation.

• Demonstrating, in a manner similar to [29], that Product Quantization can be used to compress DeepPrint fingerprint representations, enabling even faster mega-scale search (51 ms search time against a gallery of 1.1 million fingerprints vs. 27,000 ms for a COTS with comparable accuracy).

• Demonstrating, with a two-stage search scheme similar to [29], that candidates retrieved by DeepPrint representations can be re-ranked using a minutiae-matcher in conjunction with the DeepPrint minutiae-map. This further improves system interpretability and accuracy and demonstrates that the DeepPrint features are complementary to the traditional minutiae representation.

4. https://www.innovatrics.com/
5. https://www.neurotechnology.com/
6. We note that Verifinger v10.0 performs significantly better than earlier versions of the SDK often used in the literature.
TABLE 2
Published Studies on Fixed-Length Fingerprint Representations

Algorithm | HR @ PR = 1.0%¹ (NIST SD4)² | HR @ PR = 1.0% (NIST SD14)³ | Template Size (bytes) | Gallery Size⁴ | Description
Jain et al. [22], [23] | N.A. | N.A. | 640 | N.A. | Fingercode: global representation extracted using Gabor Filters
Cappelli et al. [24] | 93.2% | 91.0% | 1,913 | 2,700 | MCC: local descriptors via 3D cylindrical structures comprised of the minutiae-set representation
Cao and Jain [25] | 98.65% | 98.93% | 8,192 | 250,000 | Inception v3: global deep representation extracted via alignment and Inception v3
Song and Feng [26] | 93.3% | N.A. | N.A. | 2,000 | PDC: deep representations extracted at different resolutions and aggregated into a global representation
Song et al. [27] | 99.2% | 99.6% | 1,200 | 2,700 | MDC: deep representations extracted from minutiae and aggregated into a global representation
Li et al. [28] | 99.83% | 99.89% | 1,024 | 2,700 | Finger Patches: local deep representations aggregated into a global representation via global average pooling
Proposed | 99.75% | 99.93% | 200† | 1,100,000 | DeepPrint: global deep representation extracted via multi-task CNN with built-in fingerprint alignment

1 In some baselines we estimated the data points from a figure (specific data points were not reported in the paper).
2 Only 2,000 fingerprints are included in the gallery to enable comparison with previous works. (HR = Hit Rate, PR = Penetration Rate)
3 Only the last 2,700 pairs (2,700 probes; 2,700 gallery) are used to enable comparison with previous works.
4 Largest gallery size used in the paper.
† The DeepPrint representation can be further compressed to only 64 bytes using product quantization with minor loss in accuracy.
• Benchmarking DeepPrint against two state-of-the-art COTS matchers (Innovatrics and Verifinger) on NIST SD4 and NIST SD14 against a gallery of 1.1 million fingerprints. Empirical results demonstrate that DeepPrint is comparable to COTS matchers in accuracy at a significantly faster search speed.

• Benchmarking the authentication performance of DeepPrint on the NIST SD4 and NIST SD14 rolled-fingerprint databases and the FVC 2004 DB1 A slap fingerprint database [8]. Again, DeepPrint shows comparable performance against the two COTS matchers, demonstrating the generalization ability of DeepPrint to both rolled and slap fingerprint databases.

• Demonstrating that homomorphic encryption can be used to match DeepPrint templates in the encrypted domain, in real time (1.26 ms), with minimal loss to matching accuracy, as shown for fixed-length face representations [14].

• An interpretability visualization which demonstrates our ability to guide DeepPrint to look at minutiae-related features.
2 PRIOR WORK

Several early works [22], [23], [24] presented fixed-length fingerprint representations using traditional image processing techniques. In [22], [23], Jain et al. extracted a global fixed-length representation of 640 bytes, called Fingercode, using a set of Gabor Filters. Cappelli et al. introduced a fixed-length minutiae descriptor, called Minutiae Cylinder Code (MCC), using 3D cylindrical structures computed with minutiae points [24]. While both of these representations demonstrated success at the time they were proposed, their accuracy is now significantly inferior to that of state-of-the-art COTS matchers.
Following the seminal contributions of [22], [23] and [24], the past 10 years of research on fixed-length fingerprint representations [31], [32], [33], [34], [35], [36], [37], [38], [39] has not produced a representation competitive in terms of fingerprint recognition accuracy with the traditional minutiae-based representation. However, recent studies [25], [26], [27], [28] have utilized deep networks to extract highly discriminative fixed-length fingerprint representations. More specifically, (i) Cao and Jain [25] used global alignment and Inception v3 to learn fixed-length fingerprint representations. (ii) Song and Feng [26] used deep networks to extract representations at various resolutions which were then aggregated into a global fixed-length representation. (iii) Song et al. [27] further learned fixed-length minutiae descriptors which were aggregated into a global fixed-length representation via an aggregation network. Finally, (iv) Li et al. [28] extracted local descriptors from predefined “fingerprint classes” which were then aggregated into a global fixed-length representation through global average pooling.
While these efforts show tremendous promise, each method has some limitations. In particular, (i) the algorithms proposed in [25] and [26] both required computationally demanding global alignment as a preprocessing step, and their accuracy is inferior to state-of-the-art COTS matchers. (ii) The representations extracted in [27] require the arduous process of minutiae detection, patch extraction, patch-level inference, and an aggregation network to build a single global feature representation. (iii) While the algorithm in [28] obtains high performance on rolled fingerprints
(with small gallery size), the accuracy was not reported for slap fingerprints. Since [28] aggregates local descriptors by averaging them together, it is unlikely that the approach would work well when areas of the fingerprint are occluded or missing (oftentimes the case in slap fingerprint databases like FVC 2004 DB1 A). Finally, (iv) all of the algorithms suffer from a lack of interpretability compared to traditional minutiae representations.

Fig. 5. Fingerprint impressions from one subject in the DeepPrint training dataset [30]. Impressions were captured longitudinally, resulting in the variability across impressions (contrast and intensity from environmental conditions; distortion and alignment from user placement). Importantly, training with longitudinal data enables learning compact representations which are invariant to the typical noise observed across fingerprint impressions over time, a necessity in any fingerprint recognition system.

In addition, existing studies targeting deep, fixed-length fingerprint representations all lack an extensive, large-scale evaluation of the deep features. Indeed, one of the primary motivations for fixed-length fingerprint representations is to perform orders of magnitude faster large-scale search. However, with the exception of Cao and Jain [25], who evaluate against a database of 250K fingerprints, the next largest gallery size used in any of the aforementioned studies is only 2,700.
As an addendum, deep networks have also been used to improve specific sub-modules of fingerprint recognition systems such as segmentation [40], [41], [42], [43], orientation field estimation [44], [45], [46], minutiae extraction [47], [48], [49], and minutiae descriptor extraction [50]. However, these works all still operate within the conventional paradigm of extracting an unordered, variable length set of minutiae for fingerprint matching.
3 DEEPPRINT

In the following section, we (i) provide a high-level overview and intuition of DeepPrint, (ii) present how we incorporate automatic alignment into DeepPrint, and (iii) demonstrate how the accuracy and interpretability of DeepPrint are improved through the injection of fingerprint domain knowledge.
3.1 Overview

A high level overview of DeepPrint is provided in Figure 4, with pseudocode in Algorithm 1. DeepPrint is trained with a longitudinal database (Fig. 5) comprised of 455K rolled fingerprint images stemming from 38,291 unique fingers [30]. Longitudinal fingerprint databases consist of fingerprints from distinct subjects captured over time (Fig. 5) [30]. It is necessary to train DeepPrint with a large, longitudinal database so that it can learn compact, fixed-length representations which are invariant to the differences introduced during fingerprint image acquisition at different times and in different environments (humidity, temperature, user interaction with the reader, and finger injuries). The primary task during training is to predict the finger identity label c ∈ [0, 38291] (encoded as a one-hot vector) of each of the 455K training fingerprint images (≈ 12 fingerprint impressions per finger). The last fully connected layer is taken as the representation for fingerprint comparison during authentication and search.
Algorithm 1 Extract DeepPrint Representation
1: L(If): Shallow localization network; outputs x, y, θ
2: A: Affine matrix composed with parameters x, y, θ
3: G(If, A): Bilinear grid sampler; outputs aligned fingerprint
4: S(It): Inception v4 stem
5: E(x): Shared minutiae parameters
6: M(x): Minutiae representation branch
7: D(x): Minutiae map estimation
8: T(x): Texture representation branch
9:
10: Input: Unaligned 448 × 448 fingerprint image If
11: A ← (x, y, θ) ← L(If)
12: It ← G(If, A)
13: Fmap ← S(It)
14: Mmap ← E(Fmap)
15: R1 ← M(Mmap)
16: H ← D(Mmap)
17: R2 ← T(Fmap)
18: R ← R1 ⊕ R2
19: Output: Fingerprint representation R ∈ R¹⁹² and minutiae-map H. (H can be optionally utilized for (i) visualization and (ii) fusion of DeepPrint scores obtained via R with minutiae-matching scores.)
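A PyTorch-style sketch of Algorithm 1's forward pass is given below for concreteness. The sub-module names on net (localization, stem, etc.) are our own placeholders for the networks L, S, E, M, D, and T; only the data flow comes from the pseudocode:

    import torch
    import torch.nn.functional as F

    def deepprint_forward(net, I_f):
        # Lines 1-2, 11: localization network predicts x, y, theta; build
        # the batch of 2 x 3 affine matrices A.
        x, y, theta = net.localization(I_f).unbind(dim=1)
        cos_t, sin_t = torch.cos(theta), torch.sin(theta)
        A = torch.stack([torch.stack([cos_t, -sin_t, x], dim=1),
                         torch.stack([sin_t, cos_t, y], dim=1)], dim=1)
        # Lines 3, 12: bilinear grid sampler G aligns the fingerprint.
        grid = F.affine_grid(A, list(I_f.shape), align_corners=False)
        I_t = F.grid_sample(I_f, grid, align_corners=False)
        # Lines 4, 13: Inception v4 stem S produces the shared feature map.
        F_map = net.stem(I_t)
        # Lines 5-7, 14-16: minutiae branch (shared layers E,
        # representation head M, minutiae-map head D).
        M_map = net.shared_minutiae(F_map)
        R1 = net.minutiae_rep(M_map)   # 96-d minutiae-guided representation
        H = net.minutiae_map(M_map)    # 128 x 128 x 6 minutiae map
        # Lines 8, 17: texture branch T.
        R2 = net.texture_rep(F_map)    # 96-d texture representation
        # Line 18: concatenate into the final 192-d representation R.
        return torch.cat([R1, R2], dim=1), H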
The input to DeepPrint is a 448 × 448⁷ grayscale fingerprint image, If, which is first passed through the alignment module (Fig. 4). The alignment module consists of a localization network, L, and a grid sampler, G [51]. After applying the localization network and grid sampler to If, an aligned fingerprint It is passed to the base-network, S.

The base-network is the stem of the Inception v4 architecture (Inception v4 minus Inception modules). Following the base-network are two different branches (Fig. 4) comprised primarily of the three Inception modules (A, B, and C) described in [52]. The first branch, T(x), completes the Inception v4 architecture⁸ as T(S(It)) and performs the primary learning task of predicting a finger identity label directly from the cropped, aligned fingerprint It. It is included in order to learn the textural cues in the fingerprint image. The second branch (Figs. 4 and 8), M(E(S(It))), again predicts the finger identity label from the aligned fingerprint It, but it also has a related side task (Fig. 8) of detecting the minutiae locations and orientations in It via D(E(S(It))). In this manner, we guide this branch of the network to extract representations influenced by fingerprint minutiae (since parameters between the minutiae detection task and representation learning task are shared in E(x)). The textural cues act as complementary discriminative information to the minutiae-guided representation. The two 96-dimensional representations (each dimension is a float, consuming 4 bytes of space) are concatenated into a 192-dimensional representation (768 total bytes). Finally, the floats are truncated from 32 bits to 8 bit integer values, compressing the template size to 200 bytes (192 bytes for features and 8 bytes for 2 decompression parameters). Note that the minutiae set is not explicitly used in the final representation. Rather, we use the minutiae-map to guide our network training. However, for improved accuracy and interpretability, we can optionally store the minutiae set for use in a re-ranking scheme during large-scale search operations.

In the following subsections, we provide details of the major sub-components of the proposed network architecture.
7. Fingerprint images in our training dataset vary in size from ≈ 512 × 512 to ≈ 800 × 800. As a pre-processing step, we center crop (using Gaussian filtering, dilation and erosion, and thresholding) all images to ≈ 448 × 448. This size is sufficient to cover most of the rolled fingerprint area without extraneous background pixels.
8. We selected Inception v4 after evaluating numerous other architectures such as ResNet, Inception v3, Inception ResNet, and MobileNet.
Fig. 6. Unaligned fingerprint images from NIST SD4 (top row) and corresponding DeepPrint aligned fingerprint images (bottom row).
TABLE 3
Localization Network Architecture

Type            | Filter Size, Stride | Output Size
Convolution     | 5 × 5, 1 | 128 × 128 × 24
Max Pooling     | 2 × 2, 2 | 64 × 64 × 24
Convolution     | 3 × 3, 1 | 64 × 64 × 32
Max Pooling     | 2 × 2, 2 | 32 × 32 × 32
Convolution     | 3 × 3, 1 | 32 × 32 × 48
Max Pooling     | 2 × 2, 2 | 16 × 16 × 48
Convolution     | 3 × 3, 1 | 16 × 16 × 64
Max Pooling     | 2 × 2, 2 | 8 × 8 × 64
Fully Connected |          | 64
Fully Connected |          | 3†

† These three outputs correspond to x, y, θ shown in Fig. 4.
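Table 3 can be transcribed into a PyTorch module as follows; this is a sketch under our own assumptions (padding choices that preserve the listed output sizes, and omitted activation functions, which the table does not specify):

    import torch.nn as nn

    localization_net = nn.Sequential(
        # Input: 1 x 128 x 128 grayscale fingerprint (resized from 448 x 448)
        nn.Conv2d(1, 24, kernel_size=5, stride=1, padding=2),   # 128 x 128 x 24
        nn.MaxPool2d(kernel_size=2, stride=2),                  # 64 x 64 x 24
        nn.Conv2d(24, 32, kernel_size=3, stride=1, padding=1),  # 64 x 64 x 32
        nn.MaxPool2d(kernel_size=2, stride=2),                  # 32 x 32 x 32
        nn.Conv2d(32, 48, kernel_size=3, stride=1, padding=1),  # 32 x 32 x 48
        nn.MaxPool2d(kernel_size=2, stride=2),                  # 16 x 16 x 48
        nn.Conv2d(48, 64, kernel_size=3, stride=1, padding=1),  # 16 x 16 x 64
        nn.MaxPool2d(kernel_size=2, stride=2),                  # 8 x 8 x 64
        nn.Flatten(),
        nn.Linear(8 * 8 * 64, 64),  # fully connected, 64 outputs
        nn.Linear(64, 3),           # x, y, theta (Fig. 4)
    )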
3.2 Alignment

In nearly all fingerprint recognition systems, the first step is to perform alignment based on some reference points (such as the core point). However, this alignment is computationally expensive. This motivated us to adopt attention mechanisms such as the spatial transformers in [51].

The advantages of using the spatial transformer module in place of reference point based alignment algorithms are two-fold: (i) it requires only one forward pass through a shallow localization network (Table 3), followed by bilinear grid sampling. This reduces the computational complexity of alignment (we resize the 448 × 448 fingerprints to 128 × 128⁹ to further speed up the localization estimation); (ii) the parameters of the localization network are tuned to minimize the loss (Eq. 9) of the base-network and representation extraction networks. In other words, rather than supervising the transformation via reference points (such as the core point), we let the base-network and representation extraction networks tell the localization network what a “good” transformation is, so that it can learn a more discriminative representation for the input fingerprint.

9. We also tried 64 × 64; however, we could not obtain consistent alignment at this resolution.
Given an unaligned fingerprint image If, a shallow localization network first hypothesizes the translation and rotation parameters (x, y, and θ) of an affine transformation matrix Aθ (Fig. 4). A user specified scaling parameter λ is used to complete Aθ (Fig. 4). This scaling parameter stipulates the area of the input fingerprint image which will be cropped. We train two DeepPrint models, one for rolled fingerprints (λ = 1) and one for slap fingerprints (λ = 285/448), meaning a 285 × 285 fingerprint area window will be cropped from the 448 × 448 input fingerprint image. Given Aθ, a grid sampler G samples pixels (xfᵢ, yfᵢ) of the input image If for every target grid location (xtᵢ, ytᵢ) to output the aligned fingerprint image It.
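A sketch of this sampling step follows (our own transcription; the exact composition of Aθ is illustrative, with λ scaling the sampling window, e.g., λ = 285/448 for the slap-print model):

    import torch
    import torch.nn.functional as F

    def align(I_f, x, y, theta, lam=285.0 / 448.0):
        # Compose the 2 x 3 affine matrix A_theta from the localization
        # outputs (x, y, theta) and the user-specified scale lambda.
        cos_t, sin_t = torch.cos(theta), torch.sin(theta)
        A = torch.stack([torch.stack([lam * cos_t, -lam * sin_t, x], dim=1),
                         torch.stack([lam * sin_t, lam * cos_t, y], dim=1)],
                        dim=1)
        # The grid sampler reads input pixels for every target grid
        # location, producing the aligned output I_t.
        grid = F.affine_grid(A, list(I_f.shape), align_corners=False)
        return F.grid_sample(I_f, grid, align_corners=False)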
3.3 Minutiae Map Domain Knowledge

To prevent overfitting the network to the training data and to extract interpretable deep features, we incorporate fingerprint domain knowledge into DeepPrint. The specific domain knowledge we incorporate into our network architecture is hereafter referred to as the minutiae map [20]. Note that the minutiae map is not explicitly used in the fixed-length fingerprint representation, but the information contained in the map is indirectly embedded in the network during training.

A minutiae map is essentially a 6-channel heatmap quantizing the locations (x, y) and orientations θ ∈ [0, 2π] of the minutiae within a fingerprint image. More formally, let h and w be the height and width of an input fingerprint image and T = {m1, m2, ..., mn} be its minutiae template with n minutiae points, where mt = (xt, yt, θt) and t = 1, ..., n. Then, the minutiae map H ∈ R^(h×w×6) at (i, j, k) can be computed by summing the location and orientation contributions of each of the minutiae in T to obtain the heat map (Fig. 7 (b)).
H(i, j, k) = Σ_{t=1}^{n} Cs((xt, yt), (i, j)) · Co(θt, 2kπ/6)    (2)

where Cs(·) and Co(·) calculate the spatial and orientation contribution of minutia mt to the minutiae map at (i, j, k), based upon the Euclidean distance of (xt, yt) to (i, j) and the orientation difference between θt and 2kπ/6, as follows:

Cs((xt, yt), (i, j)) = exp(−‖(xt, yt) − (i, j)‖₂² / (2σs²))    (3)

Co(θt, 2kπ/6) = exp(−dφ(θt, 2kπ/6) / (2σs²))    (4)

where σs² is the parameter which controls the width of the gaussian, and dφ(θ1, θ2) is the orientation difference between angles θ1 and θ2:

dφ(θ1, θ2) = |θ1 − θ2| if −π ≤ θ1 − θ2 ≤ π; 2π − |θ1 − θ2| otherwise.    (5)

An example fingerprint image and its corresponding minutiae map are shown in Figure 7. A minutiae-map can be converted back to a minutiae set by finding the local maximums in a channel (location), and individual channel contributions (orientation), followed by non-maximal suppression to remove spurious minutiae¹⁰.
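A direct numpy transcription of Eqs. (2)-(5) might look as follows (the image-coordinate convention and the σs value are our assumptions):

    import numpy as np

    def minutiae_map(minutiae, h, w, sigma_s=2.0):
        # Build the h x w x 6 minutiae map H of Eq. (2).
        # `minutiae` is a list of (x_t, y_t, theta_t), theta_t in [0, 2*pi).
        H = np.zeros((h, w, 6), dtype=np.float32)
        jj, ii = np.meshgrid(np.arange(w), np.arange(h))
        for x_t, y_t, theta_t in minutiae:
            # Spatial contribution C_s, Eq. (3): Gaussian in the squared
            # Euclidean distance from (x_t, y_t) to each location.
            C_s = np.exp(-((jj - x_t) ** 2 + (ii - y_t) ** 2)
                         / (2 * sigma_s ** 2))
            for k in range(6):
                # Orientation contribution C_o, Eq. (4), with the wrapped
                # angular difference d_phi of Eq. (5).
                diff = theta_t - 2 * k * np.pi / 6
                d_phi = abs(diff) if -np.pi <= diff <= np.pi \
                    else 2 * np.pi - abs(diff)
                C_o = np.exp(-d_phi / (2 * sigma_s ** 2))
                H[:, :, k] += C_s * C_o  # Eq. (2): sum over minutiae
        return H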
3.4 Multi-Task Architecture

The minutiae-map domain knowledge is injected into DeepPrint via multi-task learning. Multi-task learning improves the generalizability of a model since domain knowledge within the training signals of related tasks acts as an inductive bias [53], [54]. The multi-task branch of the DeepPrint architecture is shown in Figures 4 and 8. The primary task of the branch is to extract a representation and subsequently classify a given finger
