throbber
Volume 3 I Sorting and Searching
`
`H
`
`lanford University
`
`THE ART OF
`COMPUTER PROGRAMMING
`
`ISHINO COMPANY
`
`Reading, Massachusetts
`Menlo Park, California · London · Amsterdam · Don Mills, Ontario · Sydney
`
`ASSA ABLOY Ex. 1021 - Page 1
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01094 - U.S. Patent No. 8,620,039
`
`

`

`This book is in the
`ADDISON-WESLEY SERIES IN
`COMPUTER SCIENCE AND INFORMATION PROCESSING
`
`Consulting Editors
`RICHARD s. VARGA and MICHAEL A. HARRISON
`
`Second printing, March 1975
`
`Copyright© 1973 by Addison-Wesley Publishing Company, Inc. Philippines copyright 1973
`by Addison-Wesley Publishing Company, Inc.
`All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
`or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording,
`or otherwise, without the prior written permission of the publisher. Printed in the United
`States of America. Published simultaneously in Canada. Library of Congress Catalog Card
`No. 67-26020.
`
`ISBN 0-201-03803-X
`BCDEFGH IJ-M A -79B76
`
`This book forms a nat
`Chapter 2, because it
`basic structural ideas.
`book is only for those E
`tion of general-purpos(
`But in fact the area o
`discussing a wide vari(
`
`How are good algc
`How can given alg
`How can the effici,
`How can a persor
`same application?
`In what senses car
`How does the the,
`How can external
`with large data ha
`
`Indeed, I believe that
`somewhere in the cont
`This volume comp
`is concerned with sor1
`been divided chiefly in
`also are supplementar:
`tations (Section 5. 1)
`Chapter 6 deals with
`fil es; this is subdivided
`of keys, or by digital
`problem of secondary :
`
`ASSA ABLOY Ex. 1021 - Page 2
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01094 - U.S. Patent No. 8,620,039
`
`

`

`506
`
`SEARCHING
`
`6.4 HASHING
`
`6.4
`
`G.4
`
`So far we have considered search methods based on comparing the given argu(cid:173)
`ment K to the keys in the table, or using its digits to govern a branching process.
`A third possibility is to avoid all this rummaging around by doing so me arith(cid:173)
`metical calculation on K, computing a function f (K) which is the location of K
`and the associated data in the table.
`For example, let's consider again the set of 31 English words which we have
`subj ected to various search strategies in Section 6.2.2 and 6.3. Table 1 shows
`a short MIX program which transforms each of the 31 keys into a unique number
`f(K) between -10 and 30. If we compare this method to the MIX programs for
`the other methods we have considered (e.g., binary search, optimal tree search,
`trie memory, digital tree search), we find that it is superior from the standpoint
`of both space and speed, except that binary search uses slightly less space. In
`fact, the average time for a successful search, using the program of Table 1
`with the frequency data of Fig. 12, is only about 17.Su, and only 41 table loca(cid:173)
`tions are needed to store the 31 keys.
`Unfortunately it isn't very easy to discover such functions f(K). There
`are 41 31 ~ 10 50 possible functions from a 31-element set into a 41-element set,
`and only 41 • 40 • • • · • 11 = 41 !/10! ~ 104 3 of them will give distinct values
`for each argument; thus only about one of every 10 million function s will be
`suitable.
`Functions which avoid duplicate values are surprisingly rare, even with a
`fairly large table. For example, the famous "birthday paradox" asserts that if
`23 or more people are present in a room, chances are good that two of them
`
`TRANSFORMING A SET OF KEYS INTO UNIQUE ADDRESSES
`
`Table 1
`
`will have the same montl
`dom function which map
`no two keys map into t i
`Skeptics who doubt this r
`large parties they atten<
`unpublished work of H.
`Essays (1939), 45. See a
`Mecmuasi 4 (1939), 145·
`Theory (New York: Wile)
`On the other hand, tb
`niewski and W. Turski, C
`a suitable function can b
`amusing to solve a puzzl,
`Of course this methc
`must be known in advan<
`making it necessary to s1
`more versatile method if
`keys to yield the same ,
`ambiguity after f(K) has
`These considerations
`known as hashing or sea
`chop so mething up or to
`off some aspects of the k
`searching. We compute
`where the search begins.
`The birthday parad,
`Ki ~ Ki which hash to
`
`Instruction
`
`LDlN K(l: 1 )
`LD2 K(2:2)
`INCl
`-8,2
`JlP
`*+2
`INCl 16,2
`LD2 K(3:3)
`J2Z
`9F
`INCl
`-28,2
`JlP
`9F
`INCl 11,2
`LDA
`K(4:4)
`JAZ
`9F
`DECl
`-5,2
`9F
`JlN
`INCl 10
`9H LDA
`K
`CMPA TABLE,l
`JNE
`FAILURE
`
`0
`z
`<
`
`Cil
`0::
`<
`
`<
`
`Ul
`<
`
`E-<
`<
`
`Cil
`
`'°
`
`E-<
`::,
`
`'°
`
`><
`'°
`
`0::
`D
`~
`
`:,;
`D
`
`e:
`
`0
`<
`::c
`
`Cil
`>
`<
`::c
`
`Cil
`::c
`
`0::
`Cil
`::c
`
`Ul
`H
`::c
`
`z
`H
`
`H
`
`Ul
`H
`
`E-<
`H
`
`E-<
`D
`z
`
`- 1 - 2
`- 2
`- 1
`14
`- 5
`14
`- 5
`16
`16
`16
`
`14
`14
`
`- 1
`- 1
`- 9
`- 9
`7
`7
`7
`
`- 1
`- 1
`6
`6
`
`- 1
`- 1
`10
`10
`
`- 1
`- 1
`13
`13
`
`13
`13
`
`6
`10
`6
`10
`- 13
`- 18
`-18 - 13
`- 3
`3
`- 3
`3
`- 3
`3
`
`- 2
`- 2
`- 6
`- 2 - 2 - 6
`14
`18
`2
`14
`18
`2
`
`- 6
`- 6
`5
`5
`
`18
`18
`
`14
`14
`9
`9
`
`7
`7
`7
`
`- 3
`- 3
`- 3
`
`3
`3
`3
`
`13
`13
`13
`
`14
`14
`14
`
`16
`16
`16
`
`9
`9
`9
`
`18
`18
`18
`
`-8 -8
`- 8
`- 8
`- 15
`- 15
`- 15
`- 15
`2
`2
`2
`2
`2
`2
`- 22
`- 1
`- 22
`- 1
`- 7
`35
`-7
`35
`- 7
`35
`15
`15
`25
`25
`25
`25
`
`- 7
`- 7
`-7
`
`- 8
`- 8
`- 11
`- 11
`10
`10
`10
`
`- 8
`- 8
`- 11
`- 11
`10
`10
`10
`
`10
`10
`10
`
`- 15
`- 9 -9 -9 - 9
`- 8
`-9 -9 - 9 -9 - 15
`- 8
`- 7
`-7 -17 -2
`5
`6
`-7
`- 17 -2
`5
`6
`- 7
`25
`18
`-1
`29
`25
`18
`-1
`29
`25
`18
`-1
`29
`20
`12
`20
`12
`
`6
`6
`
`5
`5
`
`12
`12
`12
`
`- 1
`- 1
`- 1
`
`29
`29
`29
`
`5
`5
`5
`
`6
`6
`6
`
`20
`20
`20
`
`2
`5
`2
`5
`- 7 - 7
`- 7 - 7
`23
`20
`23
`20
`23
`20
`9
`9
`19
`19
`19
`19
`
`23
`23
`23
`
`ASSA ABLOY Ex. 1021 - Page 3
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01094 - U.S. Patent No. 8,620,039
`
`

`

`6.4
`
`6.4
`
`HA SHDIG
`
`507
`
`will have the same month and day of birth! In other words, if we select a ran(cid:173)
`dom function which maps 23 keys into a table of size 365, the probability that
`no two keys map into the same location is only 0.4927 (less than one-half) .
`Skeptics who doubt this result should try to find the birthday mat es at the next
`large parties they attend.
`[The birthday paradox apparently originated in
`unpublished work of H. Davenport; cf. W.W. R. Ball, Math. R ecreations and
`Essays (1939), 45. See also R. von Mises, j stanbul Universitesi Fen Fakultesi
`Mecmuasi 4 (1939), 145- 163, and W. Feller, An Introduction to Probability
`Theory (New York: Wiley, 1950), Section 2.3.]
`On the other hand, the approach used in Table 1 is fairly flexible [cf. M . Gre(cid:173)
`niewski and W. Turski, CACM 6 (1963), 322-323], and for a medium-sized table
`a suitable function can be found after about a day's work. In fact it is rather
`amusing to solve a puzzle like this.
`Of course this method has a serious fl aw, since the contents of the table
`must be knowri in advance; adding one more key will probably ruin everything,
`making it necessary to start over almost from scratch. We can obtain a much
`more versatile method if we give up the idea of uniqueness, permitting different
`keys to yield the same value f(K), and using a special method to resolve any
`ambiguity after f(K) has been computed.
`These considerations lead to a popular class of search methods commonly
`knqwn as hashing or scatter storage techniques. The verb "to hash" means to
`chop something up or to make a mess out of it; the idea in hashing is to chop
`off so me aspects of the key and to use this partial information as the basis for
`searching. We compute a hash function h(K) and use this value as the address
`where the search begins.
`The birthday paradox tells us that there will probably be distinct keys
`K i ~ K i which hash to the same value h(K i) = h(K;). Such an occurrence is
`
`(I)
`
`...
`
`:i:
`
`...
`
`...
`z
`
`(I)
`
`...
`
`r,-,
`
`...
`
`r,-,
`0
`z
`
`...
`:i:
`u
`...
`r,-,
`w
`::,
`<
`z
`<
`iii
`:i:
`11.
`:i:
`:i:
`0::
`0
`0
`><
`r,-,
`r,-,
`r,-,
`r,-,
`0
`0
`0
`~
`~
`Contents of rll after executing the instruction, given a particular key K
`
`(I)
`
`:i:
`r,-,
`
`(I)
`
`...
`
`~omparing the given argu(cid:173)
`overn a branching process.
`mnd by doing some arith(cid:173)
`which is the location of K
`
`glish words which we have
`2 and 6.3. T able 1 shows
`mys into a unique number
`>d to the MIX programs for
`~arch, optimal tree search,
`Jerior from the standpoint
`ses slightly less space. In
`; the program of Table 1
`:u, and only 41 tabl e loca-
`
`:h functions f(K) . There
`set into a 41-element set,
`1 will give distinct values
`million function s will be
`
`risingly rare, even with a
`y paradox" asserts th at if
`·e good that two of them
`
`QUE ADDRESSES
`
`:s
`0 g;
`
`0
`<
`:i:
`
`- 6
`- 6
`2
`2
`
`2
`2
`-7
`-7
`?3
`?3
`l3
`
`l3
`l3
`l3
`
`- 6
`- 6
`5
`5
`
`5
`5
`- 7
`- 7
`20
`20
`20
`9
`9
`19
`19
`19
`19
`
`-8
`- 15
`- 15
`2
`2
`2
`- 22
`- 22
`- 7
`- 7
`- 7
`
`- 7
`- 7
`- 7
`
`-8
`- 15
`- 15
`2
`2
`2
`- 1
`- 1
`35
`35
`35
`15
`15
`25
`25
`25
`25
`
`w
`:i:
`
`0:: w
`:i:
`
`-8
`-8
`- 11
`- 11
`10
`10
`10
`
`- 8
`- 8
`- 11
`- 11
`JO
`10
`10
`
`10
`10
`10
`
`- 8
`- 8
`- 7
`- 7
`18
`18
`18
`12
`12
`
`- 9
`- 9
`- 9 -9 - 15
`- 16
`-9 - 9 -9 - 9 -15
`- 16
`- 2
`- 17
`-7 - 18
`6
`5
`- 17
`- 2
`-7 - 18
`6
`5
`- 1
`29
`25
`4
`- 1
`29
`25
`4
`- 1
`29
`25
`4
`20
`20
`
`5
`5
`
`6
`6
`
`- 16
`- 16
`- 9
`- 9
`22
`22
`22
`
`12
`12
`12
`
`- 1
`- 1
`- 1
`
`29
`29
`29
`
`5
`5
`5
`
`6
`6
`6
`
`20
`20
`20
`
`4
`4
`4
`
`22
`22
`22
`
`- 26 -26 -28
`- 23 -23 - 26
`- 23
`- 23
`- 16
`- 23 -26 -26 - 26
`- 28
`- 23 -23 - 23
`- 16
`- 20
`- 15 -33 -26 -25
`-5 -23 -23 -23
`- 15 -33 -26 - 25 -20
`-5 -23 -23 - 23
`-2
`- 16
`17
`12
`0
`30
`-2
`- 16
`12
`17
`0
`30
`- 16
`- 2
`12
`17
`0
`30
`-5
`-22 -21
`8
`-5
`- 21
`- 22
`8
`-1
`11
`29
`- 1
`29
`11
`- 1
`29
`11
`-5
`11
`-5
`11
`21
`21
`21
`21
`
`30 -1 0
`- 10
`30
`- 10
`30
`
`- 2
`- 6
`- 2
`-6
`-6 -2
`
`17
`17
`17
`
`11
`11
`11
`
`-5
`-5
`-5
`
`8
`8
`8
`
`1
`1
`- 26 -22 - 18
`-26 -22 - 18
`-14
`-6
`2
`-14
`-6
`2
`-14
`-6
`2
`- 2
`- 10
`- 2
`- 10
`
`ASSA ABLOY Ex. 1021 - Page 4
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01094 - U.S. Patent No. 8,620,039
`
`

`

`508
`
`SEARCHING
`
`6.4
`
`6.-!
`
`called a collision, and several interesting approaches have been devised to
`In order to use a scatter table, a programmer
`handle the collision problem.
`must make two almost independent decisions: He must choose a hash function
`h(K), and he must select a method for collision resolution. We shall now con(cid:173)
`sider th ese two aspects of the problem in turn.
`
`(1)
`
`Hash functions . To make things more explicit, let us assume throughout this
`section that our hash function h takes on at most M different values, with
`0 _::; h(K) < M,
`for all keys K. The keys in actual files that arise in practice usually have a great
`deal of redundancy; we must be careful to find a hash function that breaks up
`clusters of almost identical keys, in order to reduce the number of collisions.
`It is theoretically impossible to define a hash function that creates random
`data from the nonrandom data in actual files. But in practice it is not difficult
`to produce a pretty good imitation of random data, by using simple arithmetic
`as we have discussed in Chapter 3. And in fact we can often do even better,
`by exploiting the nonrandom properties of actual data to construct a hash
`function that leads to fewer collisions than truly random keys would produce.
`Consider, for example, the case of 10-digit keys on a decimal computer.
`One hash function that suggests itself is to let M = 1000, say, and to let h(K)
`be three digits chosen from somewhere near the middle of the 20-digit product
`K X K. This would seem to yield a fairly good spread of values between 000
`and 999, with low probability of collisions. Experiments with actual data show,
`in fact, that this "middle square" method isn't bad, provided th at the keys
`do not have a lot of leading or trailing zeros; but it turns out that there are
`safer and saner ways to proceed, just as we found in Chapter 3 th at the middle
`flquare method is not an especially good random number generator.
`Extensive tests on typical files have shown that two major types of hash
`functions work quite well. One of these is based on division, and the other is
`based on multiplication.
`The division method is particularly easy; we simply use the remainder
`modulo M:
`
`(2)
`
`h(K ) = K mod M.
`In this case, some values of l'vl are obviously much better than others. For
`example, if M is an even number, h(K) will be even when K is even and odd
`when K is odd, and this will lead to a substanti al bias in many files. It would
`be even worse to let M be a power of the radix of the computer, since K mod M
`would then be simply the least significant digits of K (independent of the other
`digits). Similarly we can argue that M probably shouldn't be a multiple of 3
`either; for if the keys are alphabetic, two keys which differ from each other
`only by permutation of letters would then differ in numeric value by a multiple
`of 3. (This occurs because 10n mod 3 = 4n mod 3 = 1.) In general, we want
`to avoid values of M which divide r k ± a, where k and a are small numbers and
`r is the radix of the alphabetic character set (usually r = 64, 256, or 100),
`
`since a remainder modulo
`position of the key digits
`a prime number such that
`has been found to be quit
`For example, on the I
`h(K) by the sequence
`LDX
`ENTA
`DIV
`The multiplicative hf
`harder to describe becaw
`instead of with integers.
`usually 10 10 or 23 0 for M
`if we imagine the radix p
`choose some integer const
`
`h
`
`In this case we usually h
`h(K) consists of the leadi1
`In MIX code, if we let .
`hash function is
`
`K
`LDA
`A
`MUL
`ENTA 0
`SLB
`m
`
`Now h(K) appears in reg
`shift instructions, this se
`on many machines multir
`In a sense this metho
`could for example take 1
`the reciprocal of a constai
`that (5) is almost a "midi
`ence: We shall see that rr
`good properties.
`One of the nice featm
`was lost in (5); we could
`after (5) has finished. Th
`algorithm can be used to
`that K = (A'(AK modi
`tents of register X just b
`
`ASSA ABLOY Ex. 1021 - Page 5
`ASSA ABLOY AB v. CPC Patent Technologies Pty Ltd.
`IPR2022-01094 - U.S. Patent No. 8,620,039
`
`

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket