`Techniques
`
`R. L. Rivest, S.L. Graham
`Editors
`
`Perfect Hashing
`Functions: A Single
`Probe Retrieving
`Method for Static Sets
`Renzo Sprugnoli
`Istituto di Elaborazione della Informazione
`del Consiglio Nazionale delle Ricerche
`
`A refinement of hashing which allows retrieval of
`an item in a static table with a single probe is
`considered. Given a set I of identifiers, two methods
`are presented for building, in a mechanical way,
`perfect hashing functions, i.e. functions transforming
`the elements of I into unique addresses. The first
`method, the "quotient reduction" method, is shown to
`be complete in the sense that for every set I the
`smallest table in which the elements of I can be stored
`and from which they can be retrieved by using a
`perfect hashing function constructed by this method
`can be found. However, for nonuniformly distributed
`sets, this method can give rather sparse tables. The
`second method, the "remainder reduction" method, is
`not complete in the above sense, but it seems to give
`minimal (or almost minimal) tables for every kind of
`set. The two techniques are applicable directly to
`small sets. Some methods to extend these results to
`larger sets are also presented. A rough comparison
`with ordinary hashing is given which shows that this
`method can be used conveniently in several practical
`applications.
`Key Words and Phrases: hashing, hashing
`methods, hash coding, direct addressing, identifier-to(cid:173)
`address transformations, perfect hashing functions,
`perfect hash coding, reduction, scatter storage,
`searching
`CR Categories: 3.7, 3.74, 4.34
`
`Copyright© 1977, Association for Computing Machinery, Inc.
`General permission to republish, but not for profit, all or part of
`t~is material is granted provided that ACM's copyright notice is
`given and that reference is made to the publication, to its date of
`issue, and to the fact that reprinting privileges were granted by per(cid:173)
`mission of the Association for Computing Machinery.
`Author's address: Istituto di Elaborazione della Informazione
`del Consiglio Nazionale delle Ricerche, Via S. Maria 46, 156100
`Pisa, Italy.
`
`841
`
`Introduction
`
`This paper is devoted to a refinement of the well(cid:173)
`known technique of hashing, which allows retrieval of
`an item in a static table with a single probe. Let I be a
`given set of identifiers; if we wish to know whether an
`identifier w belongs to I, it is common practice to use
`an identifier-to-address function h to store the elements
`of I in a hash table and then to use the same function h
`to retrieve w in the table. In general, several probes
`are necessary to locate w in the table or to be convinced
`that w is not there (i.e. w E /).
`However, if h transforms the identifiers in I into
`unique addresses, a single probe is sufficient. Such a
`transformation will be called a perfect hashing function .
`In [4] Knuth defines an "amusing puzzle," to find a
`perfect hashing function for a given set I. He points
`out that a slight modification of I may change h
`completely so that the tedious calculations to find h
`become useless and everything has to be started over
`from scratch.
`Here we show how the problem of finding a perfect
`hashing function for a given set I can be mechanized.
`We claim that the use of perfect hashing functions can
`be useful in many applications.
`As a very simple example, let us consider an
`Assembler/370 program dealing with a lot of dates, in
`which the month is abbreviated to the first three
`characters. A fast routine is to be designed to recognize
`the month, to check if it is correctly spelled, and to
`point to some related information. Since the set is so
`small, a linear search is usually preferred. The month
`abbreviations are stored in the table TABLE in the
`last three bytes of a full word, with the first byte set to
`zero; the month to be searched for is right adjusted in
`the full word MONTH, located at the end of TABLE,
`i.e. MONTH equals TABLE + 48.
`Figure 1 gives a possible piece of coding, where,
`on the right, we have written the units of time (see
`Knuth [3]) relative to each instruction. C is the number
`of times the loop is executed. Assuming C = 6.5, we
`have a total score of 34. We remark that it is useless
`to unroll the loop because we wish to know explicitly
`the index in the table for further use.
`Now, according to the theory developed in Section
`3, we can set up a perfect hash table with an appropri(cid:173)
`ate perfect hashing function (Table I) and use the
`piece of coding given in Figure 2.
`The perfect hashing function is performed by the
`four instructions in the shaded area. Summing up the
`units of time we get a total score of 26, an improvement
`of about 24 percent over the linear search.
`As an example, let us suppose that MONTH con(cid:173)
`tains 'FEB'. In register 3 we load the characters 'EB'
`corresponding to the decimal number 50626; adding 9
`and multiplying by 28 = 256 we get 12962560; dividing
`by 23, in register 2 we find the remainder 13, so that
`the last shift gives 6 in register 3. In fact, 'FEB' is the
`month at location 6 in TABLE.
`
`Communications
`of
`the ACM
`
`November 1977
`Volume 20
`Number 11
`
`ASSA ABLOY Ex. 1022 - Page 1
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01093 - U.S. Patent No. 8,620,039
`
`
`
`2
`1
`1
`1
`C
`2C
`C
`2
`1
`
`Fig. 1.
`
`LOOP
`
`L
`LA
`LA
`LNR
`LA
`C
`BNER
`C
`BE
`
`1 ,~1ONTH
`2,LOOP
`3,4
`3,3
`3, 4 (3)
`1 ,TABLI: (3)
`2
`3,=F'48'
`FAILURE
`
`Fig. 2.
`1 ,MONTH
`L
`LA
`3 I 1
`N
`3,=X'FFFF'
`2,2
`SR
`LA=3 9(3
`SLA= 3 8
`o= 2 =F'23'
`SRDA= 2 33
`3,2
`SLA
`1,TABLE (3)
`C
`FAILURE
`BNE
`
`2
`1
`2
`1
`1
`2
`10
`2
`2
`2
`1
`
`Table I.
`
`TABLE
`+1
`+2
`+3
`+4
`+5
`+6
`+7
`+8
`+9
`+10
`+11
`
`MAR
`OCT
`JUN
`SEP
`AUG
`JAN
`FEB
`APR
`DEC
`NOV
`JUL
`MAY
`
`We have developed direct methods to derive per(cid:173)
`fect hashing functions for small sets, say with at most
`10-12 elements, and we have extended our results to
`larger sets.
`In Section 1 we give some general definitions and a
`more formal approach to the problem; in Section 2 we
`present the "quotient reduction" method for construct(cid:173)
`ing perfect hashing functions. Section 3 is devoted to
`the "remainder reduction" method, and, finally, in
`Section 4, we extend our results to larger sets.
`
`1. General Considerations
`
`Essentially we are interested in functions defined
`on sets of identifiers. However, because of the repre(cid:173)
`sentation of characters in the computer memory, we
`may consider functions on integers without any loss of
`generality.
`Let N denote the set {O, 1, 2, 3, ... } of natural
`numbers and Z the set of integers; for n, n ', m E Z,
`"n mod m" is the remainder (~O) of the integer
`division of n by m, and n = n' (modulo m) is the
`congruence relation defined by: n mod m = n' mod m.
`Furthermore, if n < m, [n, m] is the interval {n, n +
`1, ... , m - 1, m}, the length of which ism - n + 1.
`
`842
`
`Zm denotes the set of residues modulo m, i.e. the
`interval [O, m - 1]; Gm is the set of integers q such
`that O < q < Im I and q is prime to m; the number of
`the elements in Gm is <f,(m), the Euler totient function
`applied to m. Occasionally we use the group structure
`of Zm and Gm imposed by addition and multiplication
`modulo m, respectively. Finally, Lx J and f x 1 denote the
`floor and ceiling functions applied to the (real) number
`X.
`
`Given a set / = {w 1 , w 2 , • • • , wn} of natural
`numbers and w E N, we consider the problem of
`determining in a practical way whether w E /. Usually
`the elements of / are stored in a table with m (~n)
`locations, and w is to be searched for in the table. Two
`reviews of a variety of techniques usable for this
`purpose can be found in [4] and [7].
`One of the most efficient techniques is hashing
`(see also [2] and [5]). The method consists in using a
`hashing function h:N -
`Zm and storing an element W;
`E / at location h(w;) in the table. The same function h
`is then used to check whether a given w E N is present
`in the table. Since several elements in / can hash to
`the same address, different methods have been devel(cid:173)
`oped to store and locate colliding elements. Therefore
`in general more than one probe (i.e. access to the
`table) is necessary to verify that w E / or to be
`convinced that w ft. /. The average number of probes
`can be made close to 1 at the cost of having sparse
`tables, that is, tables with a loading factor n/m much
`less than 1.
`The set / may be static, that is, it does not change
`during the execution of a program. In this case, the
`distribution of the elements of/ can be used to design
`a hashing function h which improves on the usual
`performance of hashing.
`The best situation is given by a function h such that
`h on / is injective and max h(I) = n - 1. The first
`condition assures that a single probe is sufficient to
`retrieve an element, and the second assures that the
`table is full. Given the set/, we are going to show that
`it is possible to construct a function h ( depending on /)
`which satisfies the first condition and allows us to use
`a table with a loading factor very close to 1.
`A perfect hashing function (phf for short) for / is a
`function h: N - N such that h on/ is injective, min
`h(I) = 0, and max h(I) = m - 1 for some m ~ n (m is
`the length of the table). A minimal perfect hashing
`function for/ is one for which m = n.
`The elements in / are assumed to be in ascending
`order; thus the set/ is determined by its first element
`w 1 and the ordered set of its 1st-differences 6; = W;+ 1
`- w; for every i E [l, n - 1]. Obviously any difference
`wi - w; can be expressed in terms of the differences 6;.
`
`2. Quotient Reduction Method
`
`The first, and perhaps the only, previous systematic
`attempt to construct perfect hashing functions is given
`in [1]. However, if we consider the set / = {l, 3, 8,
`
`Communications
`of
`the ACM
`
`November 1977
`Volume 20
`Number 11
`
`ASSA ABLOY Ex. 1022 - Page 2
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01093 - U.S. Patent No. 8,620,039
`
`
`
`14, 17, 23}, given there as an example, the function
`h(w) = l(w + 3)/5J is a far simpler minimal perfect
`hashing function for I than the function described by
`Greniewski and Turski.
`This example illustrates the first method we pre(cid:173)
`sent, the quotient reduction method. Given a finite set
`I C N, the method consists in translating the set I and
`then taking the integer quotients by some divisor N E
`N: ('v'wEI) more formally, h(w) = l(w + s)/NJ for
`some integers depending on I. A function of this form
`will be called a quotient reduction perfect hashing
`function.
`The translation term s can be decomposed s = qN
`+ s' for some integers q ands' (0 :s; s' < N). The term
`qN allows us to have h(w 1) = 0, while s' is used to
`adjust the elements of I to different intervals [kN, (k
`+ l)N - 1] so that
`h(w;) =I- h(w1)
`for every i, j E [1, n] and i =I=- j.
`The perfect hashing functions generated by this
`method are very simple and work well for uniformly
`distributed sets. Furthermore the results obtained will
`be used later to develop more general techniques.
`
`(2.1)
`
`Algorithm Q (Quotient reduction): Given a set / as above, this
`algorithm finds the best quotient reduction phf for/.
`(a) [find upper bound for NJ N 0+-min{l(w1-w,-1)/(i-i-l)J I i,
`jE[l, n-1) andj>i+l};
`(b) [initialize A] A+-[1, N 0 ] (at the end of step (c) the set A will
`contain all the possible values for N);
`(c) [scan/] Vi,jE[l, n-1) such thatj>i and 8,+8;SN0 do:
`(cl) [initialize D] d+-8,+81-l; D+-[l, d] (at the end of step
`( c3) the set D will contain all the values N satisfying (2 .1)
`relative to i and j);
`(c2) [find 0' and 0"] m+-(w1+l)-w;+ 1 ; M+-w1+ 1 -(w,+l);
`0'+-[m/N0l; 0''+-[m/(d+l)l;
`(c3) [determine D] '1'0EZ (0's(Js(J''), do: D+-DU[fm/01, LM/0J];
`(c4) [update A] A+-AnD;
`(d) [find NJ N+-maxA (N is taken as large as possible in order to
`get the smallest table);
`(e) [initialize J] J+-ZN (at the end of step (f) the set J will contain
`the integerss(Oss<N) satisfying (2.1) for every dj);
`[determine J] ViE[l, n-1) such that 81<N, do: J+-J n
`{(t-w;+ 1)mod N I Ost<S,};
`(g) [a smaller N, if necessary] if J=0, drop N from A and go to
`step (d);
`(h) [choose best t] let t be any element in J minimizing (w,+t)mod
`N (this condition assures that the table is of minimal length);
`[finds] s+-t-Nl(w, +t)/NJ.
`
`(f)
`
`(i)
`
`Now let us show that this algorithm is correct.
`First we prove that N 0 is an upper bound for N. By
`(2.1), for j > i + l we have h(w1) - h(w;) > j -
`l;
`i -
`hence (w1 + s) > (w; + s) + (j -
`l)N, and so:
`i -
`N :s; l(w1 - W; - 1)/(j - i -
`l)J
`for i, j E [l, n] andj > i + l.
`As an example throughout this section, let us
`consider the set I= {17, 138, 173, 294, 306, 472,
`540, 551, 618}. We compute N 0 by means of Table II,
`which contains the differences of the elements in I. On
`each row we have circled the least difference w 1 - w;,
`and on the right we have computed l(w; - W; - 1)/(j
`
`(2.2)
`
`843
`
`Table II.
`
`L77/1J =77
`L14 5;2J =n
`
`L256/3J =85
`
`L323/4J =80
`
`L412;5J =82
`
`L4 79/GJ =79
`
`L600;1J =85
`
`l)J. Thus we have N 0 = 72 and, by step (b), a=
`i -
`-
`[l, 72].
`Now, before explaining step (c), we have to analyze
`the role played by the set J. So let us suppose that N
`has already been found and consider steps (e) and (f).
`For every w; E /, an admissible increment for W; is any
`integer t for which condition (2 .1) is satisfied for j = i
`+ 1:
`
`(2.3)
`
`l(wi+ 1 + t)/NJ =I=- l(w; + t)/NJ.
`In other words, an admissible increment for W; is any
`translation value which adjusts w; and W;+ 1 to two
`different intervals [kN, (k + l)N -
`1]. Clearly a
`quotient reduction phf for I can be found if and only if
`there exists an admissible increment which works for
`every w; E /.
`The set of all the admissible increments for W; is
`given by J*(w;) = {u - wi+ 1 + kN IO :s; u < 8; and k E
`Z}. In fact, let t = u - wi+ 1 + kN; then we have W; + t
`= w; + u - wH 1 + kN = u - 8; + kN and wH 1 + t =
`W;+ 1 + u - W;+ 1 + kN = u + kN, and relation (2.3)
`holds if and only if O :s; u < 8;.
`We can ignore the term kN in the expression for
`J*(w;) and define the set J(w;) of the reduced admissible
`increments for w;:
`J(w;) = {(t - W;+ 1)mod NI 0 :s; t < 8;}.
`(2.4)
`Obviously, if 8; 2: N, J(w;) = ZN and no computation
`is necessary. Thus steps (e) and (f) determine the set J
`= nf,:-l J(w;) correctly, and, as we remarked above,
`there exists a quotient reduction phf for I if and only if
`J =I=- 0.
`In our example, let us suppose N = 64 (:572 =
`N 0 ). In Table II we put in a box the 1st differences less
`than 64. It is:
`1(138) = {(t - 173)mod 64 J O :s; t < 35} = [19, 53],
`1(294) = {(t - 306)mod 64 I O :s; t < 12} = [14, 25],
`1(540) = {(t - 551)mod 64 I O :s; t < 11} = [25, 35],
`and J = {25}. The reader can verify that for N = 65
`and N = 63, J = 0 and J = [16, 21], respectively.
`Thus there exist quotient reduction phf's for N = 63
`and N = 64, but not for N = 65.
`Now, looking at the thing from the other side, we
`can determine the set a of possible values of N for
`which the associated set J is nonempty. We can prove
`that if J =I=- 0 (relative to N) then 'v'w;, w 1 E I (i < j)
`November 1977
`Communications
`Volume 20
`of
`Number 11
`the ACM
`
`ASSA ABLOY Ex. 1022 - Page 3
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01093 - U.S. Patent No. 8,620,039
`
`
`
`there exists a multiple of N in the interval Tu = [mu,
`Mu], where mu = (wj + 1) - wi+ 1 and Mu = wJ+ 1 -
`(w;
`+ 1). In fact, by (2.4), J(wi) n J (wj) =f- 0 if and only if
`t I O :5 t <
`there exist two elements a EA = {wi+1 -
`6;} and b EB = {wHl - t I 0 :5 t < 6j} such that a = b
`(modulo N). This means that the set of all the differ(cid:173)
`ences between the elements in B and the elements in
`A must contain a multiple of N. But this set is just Tu,
`because its limits are the cross-differences of the limits
`of Band A.
`The converse is not true because J can be empty
`although the J(wi)'s are not pairwise disjoint. Thus
`maxA is a better approximation to N than N 0 , but A
`may contain elements which do not correspond to any
`quotient reduction phf for I. Actually step (c) can be
`eliminated; however, as shown by the example below,
`when N is smaller than N 0 , step (c) can save a lot of
`computations relative to the J(w;)'s.
`Now, if we call Du the set of positive integers with
`a multiple in the interval Tu, we have to choose N in
`the set A = [l, N 0] n ni<j Du. By definition, it is Du =
`U1s11sm0 ffmu/01, LMu/0J]; however, we can remark
`that:
`'(a) since N :s; N 0 , the useful lower limit for 0 is 0' =
`rmu/Nol;
`(b) the length of the interval Tu is d = wJ+ 1 -
`(w; + 1)
`(wj + 1) + wi+ 1 + 1 = 6; + 61 - 1; so every
`-
`integer in [l, d] has a multiple in Tu. This implies
`that (i) every couple (i, j) for which d = 6; + 6j -
`1 2:: N 0 can be ignored in the computation of A,
`and (ii) the upper limit for 0 is 0'' = [mu/(d + 1)].
`So we have Tu = [l, d] U U6'sl/slf' [[mu/01, LMu/0]],
`and step ( c) determines the set A correctly.
`In our example we have to consider the following
`intervals:
`T24 = [122, 167]
`T27 = [368,412]
`T 47 = [235, 256]
`so:
`
`for which d = 46, 0' = 2, 0" = 3;
`for which d = 45, 0' = 6, 0" = 8;
`for which d = 22, 0' = 4, 0" = 11;
`
`[1, 55] U [61, 72];
`[1, 51] U [53, 58] U [62, 68];
`[l, 25] U [27, 28] U [30, 32] U [34, 36]
`U [40, 42] U [47, 51] U [59, 64];
`[1, 25] U [27, 28] U [30, 32] U [34, 36]
`U [40, 42] U [47, 51] U [62, 64].
`Now, every N E A can possibly determine one or
`more quotient reduction phf's. It is possible to show
`that if N > N' the table length for N is not longer than
`any table length for N'. Hence we choose N as large as
`possible (steps (d) and (g)). In our example, for N =
`64 = max A, it is J = {25}.
`Finally, let us come to steps (h) and (i). Up to this
`moment, we have found N and the set J of the reduced
`admissible increments relative to N. Every t E J
`determines a quotient reduction phf for/; in order to
`have h(w 1) = 0, the translation value St is given by St =
`[(w 1 + t)/N]. Since we are looking for the shortest
`t -
`
`844
`
`table, h(wn) is to be as smalll as possible. This occurs
`when w1 + St (which is non-negative and less than N)
`takes its nearest value to 0. However, since w1 + t =
`w1 + St (modulo N), the quantity (w 1 + t)mod N is to
`be as small as possible, and this condition determines
`the value t of the "best" redm:ed admissible increment.
`t = 25, and the best
`In our example, it is s =
`quotient reduction phf is h(w) = L(w + 25)/64J. It is
`not minimal, as shown by the! following table:
`
`w
`h(w)
`
`17 138 173 294 306 472 540 551 618
`0
`2
`3
`4
`7
`8
`9
`10
`5
`
`Now, let us consider a possible improvement of the
`quotient reduction method. Obviously, in order to
`have an (almost) full table for a given set/, we should
`have N = (wn - w 1)/(n - 1). However, if the elements
`in I are not uniformly distributed, N can be consider(cid:173)
`ably smaller than this best value; so the table is sparse.
`Often it is possible to obtain shorter tables introducing
`the concept of a cut. A cut consists in translating all
`the elements in I larger than a certain value (the cut
`value) before performing the quotient reduction. Since
`the cut value can be identified with some Wt E /, the
`index t will be called the cut point and the perfect
`hashing function h is defined by:
`
`h(w;) = [(wi + s)/N],
`h(w;) = [(w; + s + r)/N],
`
`Vi :s; t,
`Vi> t,
`
`(2.5)
`
`for some integer r.
`The problem consists in finding the cut value (or,
`equivalently, the cut point) and the integer r for which
`the table may be of minimal length. In order to satisfy
`condition (2.3), the integer r should produce an admis(cid:173)
`sible increment common to all the elements in I. An
`obvious approach is to try successively all the elements
`w1 , w2 , • • • , Wn-I as possible cut values. This leads to
`the following Algorithm C. An important point in this
`algorithm is the evaluation of N; we ignore the differ(cid:173)
`ences of elements in I on opposite sides of the cut
`point. Actually the values of these differences will be
`determined only when the integer r is known.
`
`Algorithm C (Quotient reduction with a cut). Given a set /, this
`algorithm determines the cut point t and the integer r, allowing us to
`obtain the best cut reduction phf of the form (2.5).
`(a) [initialize t] ,-1;
`(b) [upper bound for N,] N 0-min {l(wJ-w,-1)/(j-i-l)Jl(i,jE[l, t]
`or i,jE[t+ 1, n]) andj>i+ l} (ignore couples w,, WJ such that i:5t<j);
`(c) [find A] evaluate A as in steps (b) and (c) of Algorithm Q;
`(d) [find N,] N,-maxA;
`(e) [admissible increments] h-nl:l {(v-wi+,) mod N,I0:5v< 61};
`la-nf.:;'t-, {(v-wi+,) mod N,I0:5v<61};
`(f) [a smaller N,, if necessary] if either h or la is empty, drop N,
`from A and go to step (d);
`(g) [lower bound for 6;] 60-min {(j-i-1) N,+1-wJ+w,+6,li:5t<j};
`(h) [finds', 6;] perform the following steps:
`(hl) p-min {pE[l, N,]1(-w,-p) mod N,Eh};
`(h2) lets' be the element in h corresponding to the value of p;
`(h3) 6;-min {62:60 l 3j"Ela such that 6=w 1+1+j"+p (moduloN,)};
`(i) [determiner, s] r,~;-61; s1~-s'-N,L(w 1+l·')/N,J;
`(j) [length of the table] L,~(w.+s1+r,)/N,J-1;
`
`Communications
`of
`the ACM
`
`November 1977
`Volume 20
`Number 11
`
`ASSA ABLOY Ex. 1022 - Page 4
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01093 - U.S. Patent No. 8,620,039
`
`
`
`(k) [loop on t] t+-t+ 1 and go to step (b) if t<n;
`(I) [table of minimal length] let zany index for which Lz is minimal
`and outputs,, N,, w,, rz.
`
`Now let us show that this algorithm is correct. For
`fixed t, we consider the set I 1 = {w;j 1 :5 i :5 t} U {w; +
`rlt < i :5 n}; then the function (2.5) relative to I is
`equivalent to a quotient reduction phf for I 1• However,
`the 1st differences of I 1 equal the 1st differences of I,
`except for 8; = 81 + r. Thus the determination of r is
`equivalent to the evaluation of 8;, and the most impor(cid:173)
`tant steps of the algorithm are (g) and (h).
`Step (g) determines a lower bound for 8; by apply(cid:173)
`ing relation (2 .2) to the set of differences w i - w; (i :5
`t < j). By steps (hl) and (h2) we have p = -w 1 - s' +
`k'N, and by (h3), 8; =w1+1 + j" + p + k" N = 81 + j"
`- s' + kN, where k = k' + k"; hence, for every w; E
`Ii (i > t), w;+I + s' = W;+1 + 8; -
`81 + s' = W;+1 + j" +
`kN. This proves thats' is a reduced admissible incre(cid:173)
`ment for w;. Since 8; ~ 80 , s' is a reduced admissible
`increment for every element in ft. Actually p has been
`defined in such a way that s' is the "best" reduced
`admissible increment. Thus Algorithm C determines
`the best cut reduction phf (2 .5) correctly.
`The run time of Algorithm C, however, is exceed(cid:173)
`ingly high, at least of order Kn 3 , where K is the largest
`0' + 1 (step (c3)). If we content
`value of 0" -
`ourselves with a near-to-optimal function of the form
`(2.5), we can find a far better algorithm. In fact, N is
`usually very close to N 0 , and, by construction, a cut
`adjusts w1 and w1+ 1 to two consecutive intervals [kN,
`(k + 1) N -
`l]. Thus we may assume that the value
`(wn - w 1 - 81)/N0 + 3 approximates the length of the
`table obtained using t as a cut point. This leads to the
`following:
`
`Algorithm S (Simplified quotient reduction with a cut). Given a set
`I, this algorithm first determines the cut point t corresponding to a
`nearly optimal perfect hashing function of the form (2.5) and then
`finds the other parameters of the function.
`
`(a) [initialize t] t+---1;
`(b) [upper bound for N] N'/ +- min {l(wJ-w,-1)/(j-i-l)Jf(i, jE[l,
`t] or i,jE[t+l, n]) andj>i+l};
`(c) [approximate length of the table] L 1--(w.-w,-6,)/N'/;
`(d) [loop on t] let t+-t+l and go to step (b) if t<n;
`(e) [shortest table] let z be any index for which Lz is minimal;
`(f) [initialize N] N+-N~+ 1;
`(g) [decrement N] N+-N-1;
`(h) [admissible increments] h+-nf;;;/ {(v-wH,) mod Nf0sv<6,};
`JR+-nr;;;J+1 {(v-wi+,) mod NI 0sv<6i};
`[lower bound for 6;] 60+-min{(j-i-l) N+l-wJ+w,+6zfisz<j};
`(i)
`(j) [finds', 6;] perform the following steps:
`(jl) p+-min {pE[l, N]l(-wz-p) mod NEh};
`(j2) lets' be the element in h corresponding to the value of p;
`(j3) 6;+-min {62e:60 f 3j"EJR such that 6=Wz+1+j"+p (moduloN)};
`(k) [determiner, s] r--8;-6z; s+-s'-NL(w,+s')/NJ;
`(I) [output results] outputs, N, w,, r.
`
`Let us apply Algorithm S to our example; we get:
`
`8
`7
`6
`5
`4
`3
`2
`1
`78
`83
`77
`72
`72
`72
`72
`72
`6.67 7.86 6.67 8.18 6.04 6.92 7.11 6.85
`
`845
`
`so z = 5. Considering the differences in the shaded
`area of Table II, we obtain 80 = 131. Now we have h
`= 1(138) n 1(294) = [54, 65] and 18 = 1(472) n 1(540)
`n 1(551) = [30, 31]; sop = min {p E [l, 72]1 (-306 -
`p) mod 72 E [54, 65]} = 61, ands' = 65. Hence 8~ =
`min {8 2:: 131 I 3j" E [30, 31] such that 8 = 4 72 + j" +
`61 (modulo 72)}, and it is simple to see that 8 = 59 or
`8 = 60 (modulo 72). However, 131 = 59 (modulo
`72); so 8~ = 131, r = -35, s = -7. The perfect
`hashing function obtained is minimal and can be imple(cid:173)
`mented by the two Fortran statements:
`IF(W.GT.306) W = W - 35
`H = (W - 7)/72
`
`where Hand Ware INTEGER variables.
`We conclude this section with three remarks:
`(i) The programmer has to consider the fact that if w
`> wn (and h(w) > m), w is compared to s·omething
`outside the table. Thus some care has to be taken
`when allocating the table or an extra comparison
`between h(w) and rn must be introduced.
`It is possible to consider cut reduction phf's with
`more than one cut. An algorithm similar to Algo(cid:173)
`rithm S is easily devised to get a near-to-optimal
`function of this new type.
`(iii) All the functions considered in this section are
`monotonic.
`
`(ii)
`
`3. Remainder Reduction Method
`
`As we have remarked, the quotient reduction
`method works well when the set I is uniformly distrib(cid:173)
`uted. However, this is not always the case, especially
`when the elements in I are derived from identifiers
`coded in EBCDIC. In fact, the collating sequence for
`letters and digits contains three large gaps - between I
`and J, Rand S, and Zand 0.
`When the set I is not uniformly distributed, we can
`scramble its elements to get a more uniform distribu(cid:173)
`tion and try to apply a quotient reduction phf to the
`scrambled set. To scramble the elements of I in a 1-1
`manner, we take the moduli of the elements in I by
`some appropriate integer divisor M. This method will
`be called the remainder reduction method.
`In order that the scrambled set IM= {w; mod Mlw;
`EI} C ZM be of interest to us, we should have:
`
`w; =I:- wi (modulo M), Vi, j E [l, n] and i =I= j.
`(3.1)
`If D is the set of divisors of some difference w i - w; (j
`> i), condition (3 .1) is verified if and only if M ft. D.
`In what follows, we suppose M ft. D if not otherwise
`stated.
`We are interested in obtaining remainder reduction
`perfect hashing function of the form:
`h(w) = l((d + wq) mod M)/NJ.
`In order to do this, the values d, q, N, and M must be
`suitably chosen. In particular, to chose N and d, we
`
`(3.2)
`
`Communications
`of
`the ACM
`
`November 1977
`Volume 20
`Number 11
`
`ASSA ABLOY Ex. 1022 - Page 5
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01093 - U.S. Patent No. 8,620,039
`
`
`
`need to consider some properties of the scrambled set
`IM with respect to a simpler form of function, which
`we call a rotation reduction perfect hashing function:
`h(w) = l((d + w) mod M)/NJ.
`(3.3)
`A (scrambled) set IM will be called rotationally reduci(cid:173)
`ble by N E Z iff there exists a rotation reduction phf
`(3.3) for IM.
`Since (d + w) mod M = (d + (w mod M) mod M),
`the set d + IM = {(d + w;) mod Mlw; E IM} is a
`rotation of IM; we call d the rotation value. Thus, in
`other words, IM is rotationally reducible by N iff there
`exists a rotation value d such that d + IM has a
`quotient reduction phf of the form h(w) = lw/NJ.
`Therefore, in a sense, a rotation reduction phf is
`analogous to a quotient reduction phf in which the
`translation has become a rotation. However, a rotation
`reduction phf has a remarkable advantage over a
`quotient reduction phf. In fact, the rotation allows us
`to ignore the 1st difference between the two elements
`which become the last and the first elements of the
`rotated set. We can choose this difference as the one
`which causes trouble for determining N. Thus a rota(cid:173)
`tion reduction phf is more like a cut reduction phf
`than it is like a simple quotient reduction phf.
`It is possible to verify the validity of these remarks
`by an example. Let I 19 = {0, 1, 4, 6, 8, 12, 14}; it is
`rotationally reducible by 3, and a suitable rotation is
`14 + I 19 = {l, 3, 7, 9, 14, 15, 18}.
`The simplest method to find whether a set IM is
`rotationally reducible is to look successively at all the
`M possible rotation values; however, we can look at
`them in parallel. Let us consider the extended set of IM,
`defined as the set of 2n - 1 elements XM = IM U (M +
`(IM "'-{w})) C hM, where w is the largest element in IM.
`We always suppose that the elements in XM are ar(cid:173)
`ranged in ascending order. The following theorem is
`the basis for finding the rotation reduction phf's for
`IM; moreover, it tells the whole story about the connec(cid:173)
`tions between rotation and quotient reduction phf's.
`THEOREM A. Let IM and X M be as above. IM is
`rotationally reducible by N if and only if there exists a
`subset X'Jj = {wi+ 1 , wi+2 , • • • , W;+n} of n consecutive
`elements of XM which has a quotient reduction phf h(w)
`= l(w + s)/NJ such that W;+n + s < M.
`In fact, if IM is rotationally reducible, then w;+i is
`the element which becomes the first after the rotation.
`Since wi+n becomes the last element, the condition
`w i+n + s < M is verified. Conversely, if there exists a
`set X'J;J satisfying the conditions of the theorem, then
`either i = O and s ~ 0 so that the rotation value d
`equals s, or s < 0 and then d = s + M.
`For the set I 19 , we have X 19 = {0, 1, 4, 6, 8, 12,
`14, 19, 20, 23, 25, 27, 31} and we try to find a
`quotient reduction phf for N = 3 as shown in Table III.
`In order that a quotient reduction phf exist for a
`subset X'Jj, condition (2.2) must hold. We have circled
`the differences violating the condition; so the differ(cid:173)
`ences in the dark shaded area can be ignored during
`
`846
`
`Table III.
`
`0
`
`1
`
`4
`
`6
`
`8 12 14 19 20 23 25 27 31 N(k-1)+1
`
`1st-diff.
`
`1
`
`3
`
`2
`
`2
`
`4
`
`2
`
`5
`
`1
`
`3
`
`2
`
`2
`
`4
`
`2nd-diff.
`
`4
`
`5
`
`4
`
`6
`
`6
`
`7
`
`6
`
`4
`
`5
`
`4
`
`6
`
`3rd-cliff.
`
`4t" -jiff.
`
`5th-cliff.
`
`6th-diff.
`
`7th-diff.
`
`7
`
`10
`
`13
`
`16
`
`19
`
`the computation. Moreover, in the tuple of kth differ(cid:173)
`ences there must be n -
`k consecutive differences
`satisfying (2.2); so the differences in the light shaded
`area can also be ignored. Usually these properties
`provide a fast way to decide that a set is not rotationally
`reducible.
`In our example, we have three candidate sets XW,
`X~:i/, and XW, corresponding to the (n -
`l)th differ(cid:173)
`ences not enclosed in the shaded areas. We have the
`following sets of admissible increments: 1(4) = {O, l},
`1(6) = 1(12) = {1, 2}, 1(19) = {l}; so 1 is the only
`admissible increment for all the three sets.
`For XW we haves = l; however, w 2+1 + s = 20 >
`19, and the last condition of Theorem A is violated.
`For X~:il it is s = -2 and w 3+1 + s = 18 < 19; so we
`have the reducible rotation (19 - 2) + I19 = 17 + I19.
`Finally, for XW, it is s = -5 and w4+ 7 + s = 18 < 19;
`so we have the reducible rotation (19 - 5) + I 19 = 14
`+ I 19 , which is the rotation we have considered already.
`At this point we can state an algorithm for finding
`rotation reduction phf's.
`Algorithm T (Rotation reduction): Given a (scrambled) set IM, this
`algorithm finds all the rotation values d corresponding to rotation
`reduction phf's for IM by a given integer N.
`
`(a) [extended set] XM+-IM U (M+(JM "-._{max IM}));
`(b) (initialize V] V +- [I, n] (at the end of step (c), V will contain
`the indices i corresponding to sets xi-o for which the first
`condition of Theorem A is satisfied);
`(c) (determine V] 'v'kE[2, n-1] do:
`{cl) [kth differences] compute the set ~ of kth differences (~
`contains 2n-k-1 elements);
`(c2) [verify condition (2.2)] let D be the set of indices t of l>,E~
`such that l>,>N(k-1);
`(c3) [update V] V+-Vn{iE[l, n]l[i, i+n-k] CD} (D has to
`contain (n-k) consecutive differences satisfying (2.2));
`(c4) [no reduction] if V=0, exit {the set IM does not have any
`rotation reduction phf by N);
`(d) [initialize i] let i be the first element in V (we consider xi- 0 );
`(e) [find J] J+-nJ:tr-2 {(t-w;+,) mod Nlw