`
`Fingerprint Matching on Splash 2
`
`Chapter 10
`
`single match takes, say, one millisecond of CPU time, matching against a database
`of one million fingerprints would require a total of 103 seconds of CPU time. If we
`have to process 100 queries per day, we would need 105 seconds or 27.78 hours
`of CPU time alone, not including the I/O time in reading the fingerprints from the
`database.
`
`In order to provide a reasonable response time for each query, commercial sys-
`tems use dedicated hardware accelerators or application-specific integrated circuits
`(ASICs). While application-specific architectures and ASICs have been designed to
`meet the computing requirements of complex image processing tasks, such designs
`have the following two major limitations: (i) once fabricated, they are difficult to
`modify; and (ii) the cost of building special-purpose application accelerators is very
`expensive for low—volume applications. Both of these limitations have been the driv-
`ing force behind the design of custom computing machines (CCMs) using reconfig-
`urable logic arrays known as field—programmable gate arrays (FPGAs). An attached
`processor built with FPGAs can overcome the two limitations noted above. High
`performance is achieved with FPGAs by exploiting an important principle: most of
`the processing time of a compute-intensive job is spent within a small portion of
`its execution code [3], and if an architecture can provide efficient execution sup—
`port for the frequently executed code, then the overall performance can be improved
`substantially. Portions of the matching algorithm have been identified for implemen-
`tation on Splash 2, leaving the remainder to be implemented using software on the
`host.
`
`The goal of this chapter is threefold. First, it describes a successful application
`using Splash 2. Second, we demonstrate that a suitable mapping of an algorithm
`to a given architecture results in excellent performance. Third, we illustrate how
`FPGAs can facilitate this mapping process without sacrificing speed and flexibility.
`In fact, FPGAs offer greater flexibilty since the hardware is customized to meet the
`requirements of the algorithm.
`This chapter is organized as follows. In Section 2, a brief introduction to pat-
`tern recognition systems is given, followed by definition of the terminology used
`in fingerprint matching, and introduction of various stages in an AFIS. Section 3
`briefly reviews the Splash 2 architecture and its programming paradigm. The finger-
`print matching algorithm and its computational requirement are briefly presented in
`Section 4. The hardware-software design is presented in Section 5. The hardware
`component of the parallel algorithm has been simulated using the Splash simulator.
`The results of simulation and synthesis are discussed in Section 6. The synthesized
`logic has been executed on a set of actual fingerprints. For measuring execution
`speed, a synthetic database of 10,000 fingerprints has been created from 100 real
`fingerprints. The execution speed of the matching module is analyzed in Section 7,
`followed by conclusions in Section 8.
`
`10.2 BACKGROUND
`
`This section is devoted to an introduction to pattern recognition systems, some
`basic definitions with respect to fingerprints, and automatic fingerprint identification
`systems (AFIS).
`'
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 120
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 120
`
`
`
`Section 10.2
`
`Background
`
`121
`
`10.2.1 Pattern Recognition Systems
`
`Pattern recognition techniques are used to classify or describe complex patterns or
`objects by means of some measured properties or features. A pattern is an entity,
`vaguely defined, that could be given a name. A speech waveform, a person’s face,
`and a piston head are examples of patterns. The goals of pattern recognition are to
`(i) assign a pattern to a heretofore unknown class of patterns (clustering) or (ii) iden-
`tify a pattern as a member of an already known class (classification). Two examples
`of the recognition problem are identifying a suspect in a criminal case based on
`fingerprints, and finding defects in a printed circuit board.
`A pattern recognition system (PRS) classifies an object into one of several
`predefined classes. The input to a PRS is a set of N measurements represented
`by an N—dimensional vector called a pattern vector. A PRS can be used to com-
`pletely automate the decision-making process without any human intervention. A PRS
`requires data acquisition via some sensors, data representation, and data analysis or
`classification. The data are usually either in the form of pictures, as in the case of
`fingerprint matching, or one-dimensional time signals, as in the case of speech recog-
`nition. Although these images or signals can be interpreted, analyzed, or classified
`by trained human operators, pattern recognition systems can provide more reliable
`and faster analysis, often at a lower cost.
`The design of a PRS involves the following three steps:
`
`0 Sensing
`
`0 Representation
`0 Decision making
`
`The problem domain influences the choice of sensor, representation, and decision
`making model. An ideal representation is characterized by the following desirable
`properties; it is
`
`. Provided with discriminatory information at several levels of resolution (detail)
`
`Ul-BSle-d
`
`Easily computable
`Amenable to automated matching algorithms
`Stable and invariant to noise and distortions
`
`. Efficient and compact
`
`The compactness property of a representation often constrains its discriminating
`power.
`The pattern recognition problem is difficult because various sources of noise
`distort the patterns, and often there exists a substantial amount of variability among
`the patterns belonging to the same category [5]. For example, the character ‘A’
`written by different people looks different, though we assign the same class label
`‘A’ to all of them. Hence, it is not appropriate to use the raw pattern vector for
`classification. Invariant features that characterize a set of patterns are used to represent
`a class of patterns. Several issues arise, such as what features should be used and‘
`how they should be extracted reliably. The features of a pattern are the input to a
`classification stage. The challenge in designing a recognition system is in extraction
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 121
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 121
`
`
`
`122
`
`Fingerprint Matching on Splash 2
`
`Chapter 10
`
`of features that can tolerate the intra-class variations and still possess the inter-class
`discriminating power. If the extracted features have sufficient discriminating power,
`then the decision making stage is simple. Conversely, a sophisticated decision making
`stage can compensate for an unreliable feature extraction stage. In practice, we never
`have a noiseless input pattern, an ideal representation, perfect feature extraction, or
`robust decision maker. Imperfections in any of these stages may result in classification
`error. The goal of a pattern recognition system is to minimize the classification error.
`Many successful pattern recognition systems have been built in the area of document
`analysis, medical diagnosis, and fingerprint identification. A large number of books
`and survey papers have been written on pattern recognition. Readers interested in
`more details are referred to [5].
`
`10.2.2 Terminology
`
`The structural features that are commonly extracted from the gray-level input finger-
`print image are ridge bifurcations and ridge endings. Each of the features has three
`components, namely, the x-coordinate, the y-coordinate, and the local ridge direction
`at the feature location, as shown in Figure 10.4. Many other features that have been
`used for fingerprint matching are derived from this basic three-dimensional feature
`vector [1].
`
`Definitions of some relevant fingerprint—related terms are given below. Readers
`interested in more details are referred to [2].
`
`0 Fingerprint image: A digitized image of a fingerprint impression usually con-
`taining 512 X 512 pixels and 256 gray levels.
`
`0 Fingerprint card: A paper card with a provision to record impressions of all
`10 fingers of a person, including other textual details (such as name, sex, and
`age) useful for identification.
`
` Minutia (x, y) X
`
`M
`
`FIGURE 10.4 Components of a
`Minutia Feature
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 122
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 122
`A
`
`
`
`Section 10.2
`
`Background
`
`123
`
`
`
`FIGURE 10.5 A Core Point Marked on
`a Gray-level Fingerprint
`
`Pattern area: The area of the image where the fingerprint pattern is located.
`
`Ridge: A black line in a fingerprint image. See Figure 10.1.
`Valley: A white line in a fingerprint image. See Figure 10.1.
`Ridge bifurcation point: A point where a ridge branches into two ridges. See
`Figure 10.2(a).
`Ridge end point: A point where a ridge stops flowing. See Figure 10.2(b).
`
`Minutia: A ridge ending or bifurcation point.
`Classification: Based on the ridge flow type, the process of categorizing fin-
`gerprints into one of the following five classes: (i) arch, (ii) loop, (iii) whorl,
`(iv) double loop, and (V) accidental. The first three fingerprint classes are shown
`in Figure 10.1.
`Matching: The process of comparing a pair of fingerprints based on their minu—
`tiae feature sets. The AFIS systems usually determine a list of probables (possi-
`ble matches) from the database, often sorted on a matching score that indicates
`the degree of match.
`Core point: For whorls, loops, and double loops, the core point is defined as
`the topmost point on the innermost ridge, assuming the fingerprint is oriented.
`See Figure 10.5.
`
`10.2.3 Stages in AFIS
`
`An AFIS is a pattern recognition system for fingerprint matching. A typical AFIS
`consists of various processing stages as shown in Figure 10.6. For the purpose of
`automation, a suitable representation of fingerprints is essential. Clearly,
`the raw
`digital image (set of pixels) of a fingerprint itself does not meet the requirements '
`of an ideal representation described earlier. Hence, high-level structural features are
`extracted from the fingerprint image for the purpose of representation and matching.
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 123
`
`Petitioner Microsoft Corporation - EX. 1007, p. 123
`
`
`
`124
`
`Fingerprint Matching on Splash 2
`
`Chapter 10
`
`
`Image Acquisition
`
`
`
`
`
`(Optional)
`Feature Extraction
`
`
`Manual Feature Editing
`
`Matching
`
`Manual Verification
`
`
`
`FIGURE 10.6 Stages in an Automatic Fingerprint Identification System (AFIS)
`
`The commercially available fingerprint systems typically use ridge bifurcations
`and ridge endings as features (see Figure 10.2). Because of the large size of the fin-
`gerprint database and the noisy fingerprints encountered in practice, it is very difficult
`to achieve a reliable one-to-one matching in all test cases. Therefore, the commer-
`cial systems provide a ranked list of possible matches (usually the top 10 matches)
`that are then verified by a human expert. The matching stage uses the position and
`orientation of these features and the total number of such features. As a result, the
`
`accuracy of feature extraction has a strong impact on the overall accuracy of finger-
`print matching. Reliable and robust features can simplify the matching algorithm and
`obviate the manual verification stage.
`One of the main problems in extracting structural features is the presence of
`noise in the fingerprint image. Commonly used methods for taking fingerprint impres-
`sions involve applying a uniform layer of ink on the finger and rolling the finger on
`paper. This leads to the following problems. Smudgy areas in the image are created by
`overinked areas of the finger, while breaks in ridges are created by underinked areas.
`Additionally, the elastic nature of the skin can change the positional characteristics
`of the fingerprint features depending on the pressure applied on the fingers. Though
`inkless methods for taking fingerprint impressions are now available, these methods
`still suffer from the positional shifting caused by the skin elasticity. The AFIS used
`for criminal identification poses yet another problem. A noncooperative attitude of
`suspects or criminals in providing the impressions leads to smearing parts of the
`fingerprint impression. Thus, noisy features are inevitable in real fingerprint images.
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 124
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 124
`
`
`
`Section 10.4
`
`Fingerprint Matching Algorithm
`
`125
`
`The matching module must be robust to overcome the noisy features generated by
`the feature extraction module.
`
`The functioning of an AFIS can be described starting with the input stage. A
`gray-scale fingerprint image is obtained using a scanner or a camera. Recently, inkless
`methods have been used for this stage [7]. The input image needs enhancement before
`further processing can be done. This stage involves image processing techniques to
`minimize noise and enhance image contrast. A feature extraction stage locates the
`minutiae points in the enhanced image. Often, it is difficult to extract minutiae reliably
`from noisy inputs. In such cases, a human fingerprint expert interactively updates the
`location of the minutiae. The set of minutiae forms the input to a matcher. The
`matcher reads fingerprint features from the database and matches these with the
`query fingerprint feature set. It outputs a list of probables from the database in order
`of their degree of match. The system output is verified by the human expert to arrive
`at the final decision for each query fingerprint.
`
`10.3 SPLASH 2 ARCHITECTURE AND PROGRAMMING MODELS
`
`We review the major components of the Splash 2 system that are used by our finger—
`print matching algorithm. (For details, refer to the chapters on Splash 2 architecture
`and programming.)
`Each Splash 2 processing board has 16 Xilinx 40108 as Processing Elements
`(PEs X1—X16) in addition to a seventeenth Xilinx 4010 (X0) that controls the data
`flow into the processor board. Each PE has 512 KB of memory. The Sun SPARC-
`station host can read/write this memory. The PEs are connected through a crossbar
`that is programmed by X0. There is a 36-bit linear data path (SIMD Bus) running
`through all the PBS. The PEs can read data either from their respective memory or
`from any other PE. A broadcast path also exists by suitably programming X0.
`The Splash 2 system supports several models of computation, including PEs
`executing a single instruction on multiple data (SIMD mode) and PBS executing
`multiple instructions on multiple data (MIMD mode). It can also execute the same or
`different instructions on single data by receiving data through the global broadcast
`bus. The most common mode of operation is systolic, in which the SIMD Bus is
`used for data transfer. Also, individual memory available with each PE is used to
`conveniently store temporary results and tables.
`To program Splash 2, we need to program each of the PEs (X1—Xl6), the
`crossbar, and the host interface. The crossbar sets the communication paths for any
`arbitrary pattern of communication between PEs. In case the crossbar is used, X0
`needs to be programmed. The host interface handles data transfers in and out of the
`Splash 2 board.
`
`10.4 FINGERPRINT MATCHING ALGORITHM
`
`image and‘
`The feature extraction process takes the input fingerprint gray—level
`extracts the minutiae features described in Section 1, making no efforts to distinguish
`between the two categories (ridge endings and ridge bifurcations). Figure 10.7 shows
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 125
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 125
`
`
`
`126
`
`Fingerprint Matching on Splash 2
`
`Chapter 10
`
`
`
`FIGURE 10.7 Feature Extraction. (a) A gray—scale image of a fingerprint; (b) its skeleton
`with features
`
`a gray-scale fingerprint image and its skeleton image where these features are marked.
`In this section, an algorithm for matching rolled fingerprints against a database of
`rolled fingerprints is presented. A query fingerprint is matched with every fingerprint
`in the database, discarding candidates whose matching scores are below a user-
`specified threshold. Rolled fingerprints usually contain a large number of minutiae
`(between 50 and 100). Since the main focus of this chapter is on parallelizing the
`matching algorithm, we assume that the features have been extracted from the fin-
`gerprint images and the important information is available. In particular, we assume
`that the core point of the fingerprint is known and that the fingerprints are oriented
`properly.
`
`10.4.1 Minutia Matching
`
`Matching a query and database fingerprint is equivalent to matching their minutiae
`sets. Each query fingerprint minutia is examined to determine whether there is a
`corresponding database fingerprint minutia. Two minutiae are said to be paired or
`matched if their components (x, y, 6) are equal within some tolerance after regis—
`tration, which is the process of aligning the two sets of minutiae along a common
`core point (see section 4.2 for precise definitions). Three situations arise as shown in
`Figure 10.8.
`
`1. A database fingerprint minutia matches the query fingerprint minutia in all the
`components (paired minutiae);
`
`2. A database fingerprint minutia matches the query fingerprint minutia in the x
`and y coordinates, but does not match in the direction (minutiae with unmatched
`angle);
`
`3. No database fingerprint minutia matches
`(unmatched minutia).
`'
`
`the query fingerprint minutia
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 126
`
`Petitioner Microsoft Corporation - EX. 1007, p. 126
`
`
`
`Section 10.4
`
`Fingerprint Matching Algorithm
`
`127
`
`Paired minutiae
`
`Paired minutiae
`
`Minutiae with
`unmatched angle
`
`
`
`
`
`Unmatched minutiae
`Unmatched minutiae
`(Lying outside tolerance box)
`(No pairing possible)
`L____,____J
`CI Tolerance box
`
`. Query fingerprint minutiae
`0 Database fingerprint minutiae
`
`FIGURE 10.8 Different Situations in
`Minutia Matching
`
`Of the three cases described above, only in the first case are the minutiae said to be
`
`paired.
`
`10.4.2 Matching Algorithm
`
`The following notation is used in the sequential and parallel algorithms described
`below. Let the query fingerprint be represented as an n-dimensional feature vec-
`tor fq = (fq,fq, ...... ,fg). Note that each of the n elements is a feature vector cor-
`responding to one minutia, and the ith feature vector contains three components,
`fi = (fl-(X), My), 15(0))-
`The components of a feature vector are shown geometrically in Figure 10.4.
`The query fingerprint core point is located at (C3, C3). Similarly, let the rth ref—
`erence (database) fingerprint be represented as an mr-dimensional feature vector
`fr = (f', 5,.....IL), and the reference fingerprint core point
`is located at
`(0;, C5).
`Let (x3451) and (x2, y2) define the bounding box for the query fingerprint,
`where x;
`is the x-coordinate of the top—left comer of the box and x";
`is the x-
`coordinate of the bottom—right corner of the box. Quantities y; and y"; are defined
`similarly. A bounding box is the smallest rectangle that encloses all the feature points.
`Note that the query fingerprint f q may or may not belong to the fingerprint database
`fD. The fingerprints are assumed to be registered with a known orientation. Hence,
`there is no need of normalization for rotation.
`The matching algorithm is based on finding the number of paired minutiae
`between each database fingerprint and the query fingerprint. It uses the concept of
`minutiae matching described in Section 4.1. A tolerance box is shown graphically in
`Figure 10.9. In order to reduce the amount of computation, the matching algorithm
`takes into account only those minutiae that fall within a common bounding box.
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 127
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 127
`
`
`
`128
`
`Fingerprint Matching on Splash 2
`
`Chapter 10
`
`Minutia point
`
`Idsin(<I))
`
`Idcos(<I>)
`
`
`
`
`Core point ——-—> .1
`
`X
`
`FIGURE 10.9 Tolerance Box for X- and Y-components of a Minutia Point
`
`The common bounding box is the intersection of the bounding box for query and
`reference (database) fingerprints. Once the count of matching minutiae is obtained, a
`matching score is computed. The matching score is used for deciding the degree of
`match. Finally, a set of top-scoring reference fingerprints is obtained as a result of
`matching.
`In order to accommodate the shift in the minutia features, a tolerance box is
`
`created around each feature. The size of the box depends on the ridge widths and
`distance from the core point in the fingerprint.
`In the
`The sequential matching algorithm is described in Figure 10.10.
`sequential algorithm, the tolerance box (shown in Figure 10.9 with respect to a query
`fingerprint minutia) is calculated for the reference (database) fingerprint minutia. In
`the parallel algorithm described in the next section, it is calculated for the query
`fingerprint (as in Figure 10.9). A similar sequential matching algorithm is described
`by Wegstein [9]. Depending on the desired accuracy, more than one finger could be
`used in matching. In that case, a composite score is computed for each set.
`
`10.5 PARALLEL MATCHING ALGORITHM
`
`We parallelize the matching algorithm so that it utilizes the specific characteristics
`of the Splash 2 architecture. While performing this mapping, we need to take into
`account the limitations of the available FPGA technology. This is consistent with the
`approaches taken in hardware-software codesign. Any preprocessing needed on the
`query minutiae set is a one-time operation, whereas reference fingerprint minutiae
`matching is a repetitive operation. Computing the matching score involves floating-
`point division. The floating-point operations and one-time operations are performed
`in software on the host, whereas the repetitive operations are delegated to the FPGA-
`based PEs of Splash 2. The parallel version of the algorithm involves operations on
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 128
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 128
`
`
`
`Section 10.5
`
`Parallel Matching Algorithm
`
`129
`
`Input: Query fingerprint n—dimensional feature vector fq and the rolled fingerprint database fD = {f r}:v=1.
`The rth database fingerprint is represented as an mr-dimensional featurevector.
`Output: A list of top ten records from the database with matching score > T.
`Begin
`
`For r = 1 to N do
`
`1. Register the database fingerprint with respect to the core point (C3, C3) of the
`query fingerprint:
`Fori=1tomr do
`mx) = f,-’(x) ~ CE
`flm=flm—d
`2. Compute the common bounding box for the query and reference fingerprints:
`Let (x51, yfl) and (x5, y5) define the bounding box for the query fingerprint.
`Let (xi, yfi) and (xf, yf) define the bounding box for the rth reference fingerprint.
`The intersection of these two boxes is the common bounding box.
`Let the query print have Mg and reference print have N; minutiae in this box.
`3. Compute the tolerance vector for ith feature vector fir:
`If the distance from the reference core point to the current reference feature is less than K
`then
`
`ti’(x) = Id cos(¢),
`t,” (y) = Id sin(¢), and
`IKE) = k3,
`
`else
`
`if“) = k1,
`t{(y) 2 kg, and
`IN“) = k3,
`where l, k1, kg and k3 are prespecified constants determined
`empirically based on the average ridge width,
`4) is the angle of the line joining the core point and the i th feature with the x—axis,
`and d is the distance of the feature from the core point.
`Tolerance box is shown geometrically in Figure 10.9.
`4. Match minutiae:
`
`Two minutiae ff and f? are said to match if the following conditions are satisfied:
`qu(x) - 4’06) 5 f,’(X) S 1706) + t,’(X),
`f,~"(y) - 4’0) 5 f,’(y) : f,-”(y) + t{(y). and
`fl—rw)sflw)sfiwr+mm.
`where ti’ = (2‘1? (x), t,’ (y), t} (6)) is the tolerance vector.
`Set the number of paired features, m; :2 0;
`For all query features f?, j = 1,2, ...Meq, do
`.
`If f}! matches with any feature in f f i = 1,2,
`Mark the corresponding feature in f r as paired.
`5. Compute the matching score (MS (q,r)):
`
`m’ *m’
`MS(q’ r) = (MEWEY
`Sort the database fingerprints and obtain top 10 scoring database fingerprints.
`
`rP
`
`‘
`
`. ., N; , then increment m
`
`End
`
`FIGURE 10.10 Sequential Fingerprint Matching Algorithm
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 129
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 129
`
`
`
`130
`
`Fingerprint Matching on Splash 2
`
`Chapter 10
`
`X1
`
`
`
`X16
`
`Collect_flag
`
`From host
`
`
`
`
`FIGURE 10.11
`
`Fingerprint Matching in Splash 2
`
`the host, on X0, and on each PE. The schematic of fingerprint matching algorithm
`using Splash 2 is shown in Figure 10.11.
`One of the main constructs of the parallel algorithm is a lookup table. The
`lookup table consists of all possible points within the tolerance box that a feature
`may be mapped to. The Splash 2 data paths for the parallel algorithm are shown in
`Figure 10.12.
`,
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 130
`
`Petitioner Microsoft Corporation - EX. 1007, p. 130
`
`
`
`Section 10.5
`
`Parallel Matching Algorithm
`
`131
`
`
`
`
`Broadcast Bus (Using crossbar)
`
`Global OR Bus
`
`FIGURE 10.12 Data Flow in Parallel Matching Algorithm
`
`10.5.1 Preprocessing on the Host
`
`The host processes the query and database fingerprints as follows. The query finger-
`print is read first and the following preprocessing is done:
`
`1. The core point is assumed to be available. For each query feature f?, j = 1, 2,
`.
`. .n, generate a tolerance box. Enumerate a total of (IX X 1), X t9) grid points
`in this box, where tJr
`is the tolerance in x,
`ty is the tolerance in y and t9 is
`tolerance in 9.
`
`2. Allocate each feature to one PE in Splash 2. Repeat this cyclically, that is,
`features 1—16 are allocated to PBS X1 to X16, features 17—32 are allocated to
`PBS X1 to X16, and so on.
`
`3. Initialize the lookup tables by loading the grid points within each tolerance box
`in step (1) into the memory.
`
`the tolerance box is computed with respect to the query
`In this algorithm,
`fingerprint features. The host then reads the database of fingerprints and sends their
`feature vectors for matching to the Splash 2 board.
`For each database fingerprint, the host performs the following operations:
`
`1. Read the feature vectors.
`
`2. Register the features as described in step (1) of the sequential algorithm in
`Figure 10.10.
`3. Send each of the feature vectors over the broadcast bus to all PEs if it is within
`
`the bounding box of the query fingerprint.
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 131
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 131
`
`
`
`132
`
`_
`
`Fingerprint Matching on Splash 2
`
`Chapter 10
`
`(2) Broadcast feature vector
`Broadcast Bus
`
`(3) Check for paired feature
`Global OR Bus
`
`
`
`(4) Increment counter if paired
`feature; store in memory and
`reset after all features
`
`processed.
`
`(5) Host reads count from X0
`
`(X, Y, 6) (1) Feature vector received from host
`
`—i FingerprintDatabase
`
`
`
`
`Host (Sun SPARC)
`
`FIGURE 10.13 Data Flow in X0
`
`For each database fingerprint, the host then reads the number of paired features m;
`that was computed by the Splash 2 system, r = 1, .. . N. Finally, the matching score
`is computed as in the sequential method.
`
`10.5.2 Computations on Splash
`
`The computations carried out on each PE of Splash 2 are described below. As men-
`tioned earlier, X0 plays a special role in controlling the crossbar in Splash 2.
`
`1. Operations on X0:
`Each database feature vector received from the host is broadcast to all PEs. If
`
`it is matched with a feature in a lookup table, the PE drives the Global OR bus
`high. When the OR bus is high, X0 increments a counter. The host reads this
`counter value (mg) after all the feature vectors for the current database finger-
`print have been processed. Operations on X0 are highlighted in Figure 10.13.
`
`2. Operations on each PE:
`On receiving the broadcasted feature, a PE computes its address in the lookup
`table through a hashing function. If the data at the computed address is a ‘1’,
`then the feature is paired, and the PE drives the Global OR bus high. Operations
`on a PE are highlighted in Figure 1014.
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 132
`
`Petitioner Microsoft Corporation - EX. 1007, p. 132
`
`
`
`Section 10.5
`
`Parallel Matching Algorithm
`
`133
`
`(2) Check Lookup Table for ’1’
`at address of feature vector;
`
`indicates paired.
`
`(3) If paired drive OR (X, Y, 0) from Database
`Bus to ’1’.
`
`Global OR Bus
`
`FIGURE 10.14 Data Flow in a PE
`
`10.5.3 VHDL Specification for X0
`
`We illustrate how the operations on X0 are customized by describing segments of
`its VHDL code. The tasks carried out by the other PEs are relatively simpler. The
`following functions are carried out by X0:
`
`1. Broadcast feature vector to all PEs
`
`2. Update a counter if at least one of the bits of the Global OR bus is ‘1’, and
`3. Reset the counter after all the minutiae of a database fingerprint are processed
`and the result is updated in X0 memory.
`
`Five segments of VHDL code are shown in Figure 10.15 and are briefly
`described here. Segment 1 (lines 1.1—1.7) shows the signal declarations. The hard—
`ware buses have been directly mapped to bit vectors in VHDL. Some of the program
`variables have been tailored for the range needed based on the application require-
`ment (such as count, features). Segment 2 describes the padding instructions. Note
`that because of using input-output pads, there is a delay in a signal reaching all the
`PEs after it has been seen by X0. The delay is accounted for by using a data pipeline
`of suitable length (in our case the pipeline is 6 stages deep). The code in line 1.7
`combined with code segment 5 (line 5.1) show the use of the pipeline. X0 maintains
`this pipeline by writing data into the pipeline and flushing out the last data sets by
`writing zeros. The code in X0 looks at the end of the pipeline. Thus, the data is seen
`by X0 code when it would have reached other PEs.
`By setting suitable configuration parameters, X0 can be set to broadcast the
`contents of the SIMD Bus to all PEs. To set this mode, code segment 3 is used.
`In code segment 4, the collection of OR flags from all 16 PEs (PE X1 through
`Xl6) is being checked for any possible match by comparing with a bit vector of all
`0’s. If any of the bits is a ‘l’, we increment the counter count.
`If the input for a new database record is initiated, indicated by the 33rd bit of
`the SIMD bus, then the final paired count and the number of features for the previous
`record is stored in memory. The two counters count and features are reset to zero.
`These activities are carried out in code segment 5.
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 133
`
`Petitioner Microsoft Corporation - EX. 1007, p. 133
`
`
`
`134
`
`Fingerprint Matching on Splash 2
`
`Chapter 10
`
`— Signal declarations — (Segment 1)
`
`1.1— SIGNAL Data
`1.2— SIGNAL Address
`1.3- SIGNAL
`count
`1.4— SIGNAL
`features
`1.5— SIGNAL
`SIMD
`1.6— SIGNAL
`Collect-flag
`1.7— SIGNAL feat_pipeline
`
`:
`:
`:
`:
`:
`:
`pipeline;
`
`Bit_Vector(15 downto 0);
`Bit_Vector(17 downto O);
`natural range 0 to 255 := 0;
`natural range 0 to 255 := 0;
`Bit_Vector(35 downto 0);
`Bit_Vector(15 downto 0);
`
`——«
`
`Connections to I/O pads — (Segment 2)
`
`2.1— pad_output (XOMemA, Address);
`2.2— pad_output (X0.Mem_D, Data);
`2.3— pad_Input
`(X0_SIMD, SIMD);
`2.4— pad-Output (XOXB.Data, XbarrOut);
`2.5— padanut
`(X0_GOR_Result_in, CollecLflag);
`
`- Setting X0 to be the crossbar master — (Segment 3)
`
`3.1—— XO_Xbar_En_L <= ’0’;
`3.2— X0_Xl6_Disable <= ’1’;
`3.3— X0)(bar-Send <= ’1’;
`
`— ..... — (Segment 4)
`
`4.1— IF (CollecLfiag /= itobv(0,16)) THEN
`4.2—
`count <= count + 1;
`4.3— END IF;
`
`—— New person record, store present counters and then reset ~— (Segment 5)
`
`5.1— IF (feat_pipeline(0)(32) = ’1’) THEN
`5.2—
`Data(7 downto 0) <= itobv(count,8);
`5.3~
`Data(15 downto 8) <= itobv(features,8);
`5.4——
`count <: 0;
`5.5—
`features <= 0;
`5.6—
`Address <= itobv(bvtoi(Address) + 1,18);
`5.7— END IF;
`
`FIGURE 10.15 VHDL Specification Segments for X0
`
`10.6 SIMULATION AND SYNTHESIS RESULTS
`
`The VHDL behavioral modeling code for PBS X0—X16 has been tested using the
`Splash simulation environment. The simulation environment loads the lookup tables
`and crossbar configuration file into the simulator. Note that the Splash simulator runs
`independently of the Splash 2 hardware and runs on the host. The input data are read
`from a specified file, and the data on each of the signals declared in the VHDL code
`can be traced as a function of time. A sample output of simulation using test inputs
`
`Petitioner Microsoft Corporation - Ex. 1007, p. 134
`
`Petitioner Microsoft Corporation - EX. 1007, p. 134
`
`
`
`Section 10.6
`
`Simulation and Synthesis Results
`
`135
`
`/'
`
`* _
`
`,
`
`,
`
`L
`
`’
`
`,
`
`‘
`
`.
`
`a
`
`.
`
`’
`
`
`
` a gdu: lump grew fiisc
`800
`820
`840
`880
`660
`900
`920
`340
`981]
`
`
`
`
`
`
`
` BOARDS(0)IBDIXI]IXO_CLK W“
`
`0);BDIX00<0_GOR_R