throbber
Case 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 1 of 5
`Case 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 1of5
`
`EXHIBIT 17
`EXHIBIT 17
`
`
`
`
`
`
`
`
`
`
`
`
`
`

`

`Lase 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 2 of5
`Case 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 2 of 5
`
`Volume 3 / Sorting and Searching
`
`THE ART OF
`
`COMPUTER PROGRAMMING
`
`SIE
`
`SereFTETIHAS
`
`PUTERSS
`
`.
`Reading, Massachusetts
`Menlo Park, California - London: Amsterdam: Don Mills, Ontario * Sydney
`
`

`

`Case 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 3 of 5
`Case 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 3of5 .
`
`This book is in the
`
`ADDISON-WESLEY SERIES IN
`
`COMPUTER SCIENCE AND INFORMATION PROCESSING
`
`Consulting Editors
`RICHARD §. VARGA and MICHAEL A, HARRISON
`
`Copyright © 1973 by Addison-Wesley Publishing Company, Inc. Philippines copyright 1973
`by Addison-Wesley Publishing Company,Inc.
`All rights reserved. No part of this publication may be reproduced,stored in a retrieval system,
`or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording,
`or otherwise, without the prior written permission of the publisher. Printed in the United
`States of America. Published simultaneously in Canada. Library of Congress Catalog Card
`No. 67-26020.
`
`ISBN 0-201-03803.x
`FGHIL-MA-79
`
`

`

`Case 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 4of5
`Case 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 4 of 5
`
`| 6.4
`
`HASHING
`
`507
`
`_will have the same month and day of birth! In other words,if we select a ran-
`'dom function which maps 23 keys into a table of size 365, the probability that
`no two keys map into the same location is only 0.4927 (less than one-half).
`| Skeptics who doubt this result should try to find the birthday mates at the next
`large parties they attend.
`[The birthday paradox apparently originated in
`unpublished work of H. Davenport; ef. W. W. R. Ball, Math. Recreations and
`| Essays (1939), 45. See also R. von Mises,
`Istanbul Uiniversitesi Fen Fakiiliesi
`Mecmuasi 4 (1939), 145-163, and W. Feller, An Introduction to Probability
`Theory (New York: Wiley, 1950), Section 2.3.]
`On the other hand, the approach used in Table 1 is fairly flexible [ef. M. Gre-
`_niewski and W. Turski, CACM 6 (1963), 322-323], and for a medium-sized table
`-a suitable function can be found after about a day’s work.
`In fact it is rather
`-amusing to solve a puzzle like this,
`Of course this method has a serious flaw, since the contents of the table
`must be known in advance; adding one morekey will probably ruin everything,
`making it necessary to start over almost from scratch. We can obtain a: much
`more versatile method if we give up the idea of uniqueness, permitting different
`_keys to yield the same value f(K), and using a special method to resolve any
`‘ambiguity after f(K) has been computed.
`These considerations lead to a popular class of search methods commonly
`| known as hashing or scatter storage techniques. The verb “to hash” means to
`:chop something up or to make a mess out of it; the idea in hashing is to chop
`| off some aspects of the key and to use this partial information as the basis for
`‘searching. We compute a hash function h(K) and use this value as the address
`| where the search begins.
`The birthday paradox tells us that there will probably be distinct keys
`‘Ky # K; which hash to the same value h(K;) = A(K;). Such an occurrenceis
`
`1G
`i H
`
`=
`HAH
`
`ww
`H
`
`es
`HF
`
`9
`z
`
`&
`5
`n
`ei
`8
`H
`r
`a
`oo
`Tr
`z
`=
`eG
`z
`Fy
`ph
`=
`E
`=
`Et
`&
`So
`S
`o
`Contents of rI1 after executing the instruction, given a particular key K
`
`—8 —9 —9 —9 —9 —15 —16 —16 —16 —23 —23 —23 —23 —26 —26 —26 —28
`—§ —9 —9 —9 —9 —15 —I16 —16 —16 —23 —23
`~—23 —23 —26 —26 —26 —28
`—7 —17 —-2
`5
`6 —7
`-18
`~9 —5 —23 —23 —23 —15
`~—33 —26 —25 —20
`—7 —17 -2
`5
`6 —7
`-18
`-9 —5 —23 —23 —23 —15 —33 —26 —25 —20
`18 —1
`29
`.
`25
`4
`22
`=©30
`1
`1
`1
`17 —16 —2
`0
`12
`
`
`
`‘18 —1 6=229 +5 4 22 830 1 1 1 17 —16 —2 0 12
`
`
`
`
`
`
`
`
`
`18 —J
`29
`5
`6
`25
`4
`22
`30
`1
`1
`1
`17 —16 —2
`0
`12
`12
`.
`.
`.
`.
`20
`.
`.
`. —26 —22 —18
`—22 —21 —5
`8
`12
`.
`20
`.
`—26
`~—22 —18
`—22 —21 —5
`8
`.
`.
`.
`.
`—14 —6
`2
`1k —L
`29
`.
`.
`:
`.
`—14 —6
`2
`ll —1
`29
`:
`:
`.
`.
`—14 —6
`2
`11 —1
`29
`.
`.
`:
`.
`«—10
`—2
`:
`—5
`ll
`:
`.
`.
`.
`—10
`—2
`.
`—5
`iil
`.
`
`.
`.
`
`12 —1
`12 —1
`(12 —1
`
`29
`29
`29
`
`5
`5
`5
`
`6
`6
`6
`
`8620
`20
`20
`
`4
`4
`+
`
`22
`22
`22
`
`30 —10
`380 —10
`30 —10
`
`-6 -—2
`-—6
`-2
`-6 -—2
`
`17
`17
`17
`
`1k —5
`Il —5
`ll
`~—5
`
`2
`21
`2i
`
`8
`8
`8
`
`

`

`Case 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 5 of 5
`Case 3:17-cv-05659-WHA Document 371-21 Filed 02/14/19 Page 5of5
`
`
`
`508
`
`SEARCHING
`
`6.4
`
`called a collision, and several interesting approaches have been devised to
`handle the collision problem.
`In order to use a seatter table, a programmer
`must make two almost independent decisions: He must choose a hash function
`h(K), and he must select a method for collision resolution. We shall now con-
`sider these two aspects of the problem in turn.
`
`Hash functions. To make things more explicit, let us assume throughout this
`section that our hash function h takes on at most M different values, with
`
`0 < h(K) < M,
`
`(1)
`
`for all keys K. The keysin actual files that arise in practice usually have a great
`deal of redundancy; we must be careful to find a hash function that breaks up
`clusters of almost identical keys, in order to reduce the numberofcollisions.
`It is theoretically impossible to define a hash function that creates random
`data from the nonrandom data in actualfiles. But in practice it is not difficult
`to produce a pretty good imitation of random data, by using simple arithmetic
`as we have discussed in Chapter 3. And in fact we can often do even better,
`by exploiting the nonrandom properties of actual data to construct a hash
`function that leads to fewer collisions than truly random keys would produce.
`Consider, for example, the case of 10-digit keys on a decimal computer.
`One hash function that suggests itself is to let IZ = 1000, say, and to let h(K)
`be three digits chosen from somewherenear the middle of the 20-digit product
`K xX K. This would seem to yield a fairly good spread of values between 000
`and 999, with low probability of collisions. Experiments with actual data show,
`in fact, that this “middle square” method isn’t bad, provided that the keys
`do not have a lot of leading or trailing zeros; but it turns out that there are
`safer and saner ways to proceed, just as we found in Chapter 3 that the middle
`square method is not an especially good random numbergenerator.
`Extensive tests on typicalfiles have shown that two major types of hash
`functions work quite well. One of these is based on division, and the other is
`based on multiplication.
`The division method is particularly easy; we simply use the remainder
`modulo M7:
`
`h(K) = K mod M.
`
`(2)
`
`In this case, some values of M are obviously much better than others. For
`example, if M is an even number, h(K) will be even when K is even and odd
`when K is odd, and this will lead to a substantial bias in manyfiles. It would
`be even worse to let M be a powerof the radix of the computer, since K mod M
`would then be simply theleast significant digits of K (independentof the other
`digits). Similarly we can argue that M probably shouldn’t be a multiple of3
`either; for if the keys are alphabetic, two keys which differ from each other
`only by permutation of letters would then differ in numeric value by a multiple
`of 3.
`(This occurs because 10" mod 3 = 4” mod 3 = 1.)
`In general, we want
`to avoid values of M which divide r* +: a, where k and a are small numbers and
`r is the radix of the alphabetic character set (usually r= 64, 256, or 100),
`
`

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket