`
`Ross N, Williams
`ross@rocksoft.com
`Rocksoft —
`
`ABSTRACT
`
`the
`tools produced by the field of cryptology is
`One.of the most useful
`cryptographic hash, or message digest. Digestsare like checksums, but provide such
`a good fingerprint of the data thatit is actually computationally infeasible to change
`the data without also changing the data's digest. This special property provides new
`guarantees of integrity that
`lead to some surprising applications. Many digest
`algorithms are simple to use, patentfree, and have C implementationsthatare freely
`available on the Internet by FIP. This paper provides an introduction to digest
`algorithms, and reviews their application to disk monitoring,
`intruder and virus
`detection, file transfer verification, notarization, and authentication. It also gives
`
` enough practical details to enable the reader to deploy the popular MD5 digest
`
`the field of cryptology has
`In the last. twenty years,
`undergonea revolution, sparked by the invention of public
`key cryptography [Diffie76)]. This revolution has yielded a
`range of sew techniques that can be applied in various
`combinations to solve almost any secrecy,
`integrity, or
`authentication problem. Some of these techniques are
`complex,
`requiring a high degree of cryptographic
`sophistication to be applied securely. Others are simple,
`general-purpose tools that can be applied effectively by
`anyone who takes an interest in them. One suchtool is the
`digest algorithm, which reads.a-block of data of any size
`and generates a small
`(eg., 16-byte) fixed-width, non-
`invertible "digest". This digest has particular
`special
`properties that enable it to act as a form of identity of the
`original block. The best way to gain an understanding of
`digests is to approach them through. their ancestors,
`the
`checksums.
`,
`
`algorithm.
`
`Introduction
`
`Checksums
`
`.
`
`Checksum algorithms are a particular class of hashing
`algorithm that were devised to solve the problem of
`detecting errors in messages transmitted through noisy
`communication lines. To enable errorsto be detected, the
`transmitter calculates the checksum of the message and
`transmits it after the message. To check:the message, the
`receiver calculates the checksum.of the received message
`and compares it with the received checksum. For example,
`if the checksum algorithm simply summed the bytes in the
`
`message mod 256,
`follows:
`
`then the transfer might proceed as
`.
`
`Msg
`Msg,checksum,
`Msg received
`
`:
`:
`
`6-23
`6 23
`6 27
`
`#4
`4 33
`4 33
`
`In this example, the second byte of the message has been
`corrupted from 23 to 27 by the communications channel.
`However, the receiver can detect that something is wrong
`by comparing the transmitted checksum (33) with the
`calculated checksum of 37 (6 + 27 + 4).
`
`{f the checksumitself is corrupted, a correctly transmitted
`message might be incorrectly identified as a corrupted one.
`However, such safe-side failures don’t matter much. A
`dangerous-side failure occurs when the message and/or
`checksum is corrupted in a way that
`results
`in’ a
`transmission that
`is internally consistent. Unfortunately,
`this possibility is completely unavoidable, and the best that
`can be doneis to minimise its probability by strengthening
`the checksum algorithm until the chance of a dangerous-
`‘side failure is acceptably low.
`
`One wayto strengthen the checksum.algorithm is to change
`from an 8-bit register to a 16-bit register (i.e. suum the bytes
`mod 65536 instead of med 256) so as to apparently reduce
`the probability of failure from 1/256 to 1/65536. While
`essentially a good idea, widening the checksum aloneis not
`sufficient to significantly increase the strength of the error
`detection, as the formula used may not be sufficiently
`random. With a simple summing formula, each incoming
`
`EMCVMW 1028
`
`
`
`
`
`byte affects rouglily one byte of the summing register no
`matter how wide the summing register is. In the example
`above,
`the errorwould still go undetected even if the
`summing register was one Megabyte wide. This problem
`can only be solved by replacing the simple summing
`formula with a more sophisticated calculation that enables —
`each incoming byte to have the potential to affect the entire
`checksum register. Thus,
`there
`are
`actually.
`two
`requirements for a strong checksum algorithm:
`
`~
`
`Width: The algorithm should have a register wide
`enough to provide a low a-priori
`probability of
`failute (¢.g. 32-bits gives'a 1/232 chance of
`failure).
`
`Randomness: The: algorithm should give each
`input byte the potential to affect-many bits of the
`register.
`:
`
`Io other words, there is aneed for enough width in the
`checksum to provide an acceptably low chance of a-priori
`dangerous-side failure, and enough randomness to ensure
`that every byte of the mesgage will have a significant effect
`on the result. These goals are reflected in contemporary |
`checksum algorithms
`such as CRC-32, Unlike
`their |
`predecessors, modern checksum algorithms no longer
`simply sum the bytes; in fact, they are more likely to divide ©
`them! Despite this,
`the term checksum is still used to
`describe any algorithm that produces an error correction
`value, and which’ does not have the cryptographic strength
`of the digest algorithms described in the next section. Some
`common checksums are: CRC-32, CRC-16, Fletcher, and
`IP.
`,
`
`_ Problems With Checksums
`
`Checksum algorithms provide a good solution to the
`problem of detecting errors in data introduced by random
`phenomena such as noise on communication lines or
`defects in'a hard-disk surface. However, they are totally
`_ inadequate in the face of hostile humans.
`
`the example’ of a military commander who
`Consider
`despatches the message “LAND* by courier to the other
`half of bis army, to indicate that an attack is to take place
`by land, Being fearful thatthe courier could be intercepted
`and the message substituted,
`the commander takes the
`precaution of authenticating the messageby calculatingits
`checksum (31) and transmitting the checksum by smoke
`signal, a slow, but completely incorruptible
`channel. In
`doing this,
`the commander
`relies on the checksum
`. algorithm as an authenticator.
`
`‘Msg in ASCII
`Msg in decimal
`Msg, checksum
`Corrupted (Dec)
`Corrupted (ASCII)
`
`N
`A
`4
`:
`65 78
`7: 76
`65
`#78
`: 76
`: 83 101
`97
`:
`S$
`e
`a
`
`D
`68
`68
`25
`??
`
`31
`31
`
`=.
`
`Unfortunately, the enemy does intercept the message, and
`decides to replace it with a different one: "Sea", Being
`aware of the checksum smoke signal, and realising that a
`naive substitution would be detected, the enemy modifies
`the message in such a way thatit has the same checksum. It
`does this by appending the control character 25 to the end
`of the message to make the bytes of the message sum to 31
`(mod 256). Asa result, the other half of the commander's
`army receives the message, checksits checksum against the
`smoke
`signal checksum,
`and accepts
`the substituted
`message,
`
`that, whereas checksums
`-This”example demonstrates
`provide excellent protection against random errors, they —
`provide very little protection against
`intelligent
`and
`malicious agents’ who are not constrained to corrupt
`messages
`in a
`random manner. To protect
`against
`intelligent opponents,
`signature
`algorithms must
`be.
`‘strengthened to the point where,it is extremely difficult to
`find or construct a message having a particular signature.
`
`Digest Algorithms
`
`To provide greater security, checksum algorithms can be
`strengthened éither by increasing their width or Increasing
`their randomness. Digest algorithms do both, providing
`extra width and randomness, and raising the level of
`security to a point of practical unforgetabillty. This makes
`them slower than checksum algorithms ~~ the price of the
`extra secutity.
`
`Whereas most checksum algorithms are invertible and
`generate a 2-byte or 4-byte checksum, most digest
`algorithms generate a 16-byte (128-bit} digest, and are non-
`invertible. If the digest algorithm is designed properly, it
`will possess the following two properties:
`
`e
`
`e
`
` Itis computationally infeasible to find any message
`that correspondsto a given digest.
`
`tis computationaily infeasible to find any two
`messages that correspond to the same digest.
`
`Such algorithms are referred to as strong one-way hash
`functions, which we may equate with the term digest.
`Algorithms that possess only the first of the two properties
`are referred to as weak one-way hash functions and are
`best
`ignored by the practitioner, Strong one-way hash
`functions are sometimes also referred to as -collision-free
`hash functions. This is somewhat of a misnomer, as any
`functionthat maps a space containing an infinite number of
`values (eg., the set ofall finite blocks of bytes) to a space
`containing a finite number of values (eg., the set of all 128-
`bit blocks), must have an infinite number of collisions! Io
`practice,
`however, you're
`unlikely
`to ever
`actually
`encounter a colliding pair. You'd have to examine about
`
`
`
` the enemywill be able to find a substitute message that has
`
`Property: From a practical perspective, digest
`algorithms
`implement
`the apparent miracle of
`providing a one-way one-to-one mapping between
`the infinite set of data blocks and the finite set of n-
`bit digests,
`
`Of course, the mapping isn't really one-to-one, but forall
`practical purposes we can consider it to be. The “proof” of
`this is that irs practically impossible to demonstrate that
`the mapping jis not one-to-one,
`as
`it's practically
`impossible to find a collision!
`
`Thus one way to gain a practical mastery in the application
`of digests is‘ simply to imagine that they really are one-to-
`one (even though they're pot). For all practical purposes,
`the digest of a block of data can act as that block's unique
`
`
`
`' Calculation: A birthday attack on 128-bit space over
`seventy years would require the evaluation of:
`¥21284¢79*365*24*60*60) values per second,
`
`eightbillion values every second to have a greater than
`§0% chance offinding a collisionin your lifetime!.
`
`Following a crushing defeat on the battlefield, we find that
`our commander has mended his ways and is using a digest
`algorithm instead of the checksum algorithm:
`
`OD
`N
`A
`LG
`-;
`Msg in ASCIT
`68
`78
`65
`: 76
`Msg in decimal
`68
`78
`65
`: 76
`Msg + hex digest
`F87C04A7D828D9B050773833DE2382FE
`“While
`our
`commander might
`find
`transmitting
`F87C04A7D828D9B050773B33DE2382FE by smoke
`signal to be a little inconvenient, it's extremely valikely that
`
`the same digest. As a consequence, any observer who sees
`the ‘smoke signal digest can have confidence in any
`corresponding message, The digest acts asa secure form of
`authentication.
`
`identity. The various applications of digest algorithms flow
`naturally from this perspective.
`
`Applications
`
`The special properties ofdigests lead naturally to a
`variety of applications:
`
`Disk monitoring
`
`Digests can be used to detéct changesin file systems. If the
`digest of a file is recorded, any future change in the file can
`be detected by calculating the digest of the file and
`_.-comparing it to the recorded digest. At first glance, it might
`seem that checksums would work just as well. However, ”
`digests provide two major advantages. First, digests are
`much wider
`than checksums, providing far greater
`assurance that no change has occurred when the values do
`match.
`
`—
`
`In addition, digest algorithms are constructed so as to make
`it extremely difficult to recover any part of the original
`message from the digest. Thus, even if the message is
`secret,
`the general can transmit
`its digest publicly (by
`smoke signal) with confidence. The only-threat to secrecy
`arises if the number of messages is so small
`that
`the
`opponent could predict some of thent, calculate their
`digests, and hence “recognise” particular messages bytheir
`digests. This threat can be eliminated by starting each
`message with a fixed amount ofnoise.
`The Essential Property
`From a practical point of view, the essential property of
`digests is:
`
`Second, digests provide more security against attacks by
`intruders and viruses. A clever intruder mightfind it easy to
`modify a file in such a way that the modified file has the -
`same checksum as the old file, but would findit practically
`impossible to fool a digest.
`The public security tool” -
`Tripwire [Kim94] [Spafford92] and the commercial data.
`integrity tool Veracity [Williams94]
`[Rocksoft94] both
`record digests to monitorfiles,
`
`File transfers
`
`Digests do not have any special application to the .
`verification of file transfers. However, as they are wider
`than most checksums, digests can provide a far higher
`degree of confidence io
`the
`transfer
`than ordinary -
`checksums. As the volume of packets transmitted through
`data networks
`increases, existing checksums may not
`provide a
`sufficiently high degree of checking, For |
`example, a 16-bit checksumwith a one in 65536 chance of
`a false positive would have about an 8%chance of failing
`to detect at least one error in the transmission of 3000
`blocks down a phone line requiring an average of three
`attempts to transfer each block.2 For a 32 bit checksum,
`there would be about an 0.0001% chance4. Digests are
`usually 128-bits or wider and effectively eliminate this
`uncertainty.
`
`_ Notarization
`
`Digests can be used to timestamp (notarise) blocks of data .
`{eg., documents). To notarise a document, publish its digest
`in a Secure archival medium, such as the classified section
`of a widely circulated and archived newspaper. Oncethis is
`done, the document can be shown to have existed at that
`time by producing a machine readable copy of the
`
`2 Calculation: 1- (65535/65536)6000
`3 Calculation: Hem. 1)/232)6000
`
`
`
`wpeeeeaeaeeettelaniaigShaa
`
`document, along with a reference to the archive. This
`evidence, combined withthe computational infeasibility of
`creating a document that would match the digest, provides
`cryptographic proof that the file existed on or before the
`date when the newspaper was published. Publication of a
`document's digest
`is equivalent
`to -timestamping the
`document,
`
`The beauty of this form of timestampingis thatit can be
`performed silently (ie., without anyone sighting the
`documentor even noticing) and securely (ie., the
`cryptographic "proof"is very strong). Some specific
`applications. ofthis notarization technique are:
`
`1.
`
`2.
`
`stamping the date ofinventionforpatents without
`disclosing the invention to anyone,
`stamping cridical corporate or government documents
`so that they can't béforged or tampered with at a later
`date,
`
`Warning: Although timestamping using digests provides
`extremely strong cryptographic proof of the existence of a
`document at a particular time, and would probably
`convince most cryptologists, to our knowledge such proof
`has not yet been tested in court.
`
`Private key authentication
`
`Digests can be used to implement the authentication of
`messages between two parties sharing asectet key K. To
`send a message, the transmitter appends the secret key to
`the message M, computes the digest of the result X=d(MX),
`and then transmits MX.4 The receiver receives M’X’ and
`authenticates M' by testing for X'=d(M’K). This scheme
`works because any active eavesdropper would have only Mf
`and X to work with, and would haveto find soine NY such
`that Y=d(NK) without knowing KX. This is‘equivalent to the
`problem of breaking the digest.
`
`One attack on this scheme isthe playback attack. An
`“ opponent could record one of the messages and playit back
`later,
`to some detrimental effect. This attack can be
`prevented by including some unique stamp in each message
`that identifies it as being current. Some stamping ideas are:
`the date and time, a serial number, or one of a set of agreed
`numbers {with a record being kept of the ones used).
`
`Public key authentication
`
`Digests can also be used as a component of public key
`authentication, The most straightforward way to transmit a
`
`4 Note:In this paper XY usually denotes .
`concatenation, not multiptication:
`
`the
`for
`is
`message with public key authentication
`transmitter to append some redundantinformation Rto the
`message M, encrypt MR using the private key, and transmit
`the result. The receiver can authenticate the transmission
`by decrypting it using the public key and checking that the
`received redundant
`information is consistent with the
`received message. The disadvantage with this scheme is
`that some public key methods (eg., RSA) are quite slow,
`and, if there is no need for secrecy,it's inefficient to have
`to encrypt the entire messagejust to provide authentication.
`A mote efficient alternative is to encrypt just the digest of
`the message, using the private key. The encrypted digest
`can be appended to the unencrypted message. to provide
`_, @fficient 16-byte authentication,
`
`Storage of passwords
`
`Multi-user computer systems have to store a database of
`passwords to enable them to authenticate users. However,.
`it's insecure to keep a file of passwords anywhere in the
`computer system, as it's possible that an intruder might
`somehow read parts of the file without first breaking
`system security (eg., the intruder might find part of the. file
`in a cecycled block on the disk), The solutionis to store not.
`the passwords themselves, but their digests. This enables .
`the operating system to check passwords of users logging
`in, but does not enable an intruder who stumbles across the
`digests from reconstructingthe passwords themselves. This
`scheme is currently in use in most operating systems (see
`{Evans74] for an early paper on this topic). One attack on
`this schemeis to feed each word in a dictionary through the
`digest algorithm and comparethe resulting digests with the
`’ digests in the password file. This attack can be slowed by
`generating a random salt for each password entry,storing it
`with the entry, and calculating digests from. the password
`and the salt. This forces the opponent
`to reprocess the
`entire dictionary for each password entry,
`
`Proof of existence
`
`Digests can act as a proof of the existence of a particular
`block of data, making them suitable for remote server
`verification. Using digests, an entity B can prove to an
`entity A that it possesses a particular block of data (that A
`also possesses). The most obvious way to do this (short of
`having B actually transmit the block to A) would be simply
`to have B send A the digest ofthe block. Unfortunately, this
`is not convincing ‘as A might suspect that B has thrown
`away the data and is merely storing the digest! Stronger
`proof can be achieved by arranging for A to challenge B by
`sending B a random number R. To respond, B calculates
`and returns the digest of RD, A can then perform the same
`calculation to verify that B still possesses the data. It's
`important that RD is used rather than DR as if DR were
`used, B could simply precess D once, throw D away, and
`store only the 128-bit state of the digest algorithm.
`
`
`
`client could generate a large backup file and then calculate
`and record the digests of the file appendedto a hundred or
`60 different random values. These random number/digest
`pairs could then be used on a regular basis to challenge the
`backup service to prove thatit still has the file, even if the
`Client doesn't. Such proof would be secure, efficient and
`would require just a few bytes of network traffic. To the
`author'sknowledge,this is an original application for digest
`algorithms,
`Protocol protection
`
`The capacity of digests to act as proof of the existence of
`data also leads to @ protocol protection scheme. Suppose
`that you are developiag a commercial product of some kind
`that uses some kind of protocol or file. format, and you
`don't want competitors producing: interoperable products.
`One way to do this is to patent an algorithm and then build
`it into the protocol, For example, if the protocol relies on
`compression, a data compression patent could be used to
`stop interoperable products. However, this is a rather
`“overengineered” approach. Digests provide the possibility
`of a simpler way.
`
`To use a digest to protect a protocol, make the calculation
`of the digest of blocks of data and a key X an essential part
`of the protocol. For example, XK could be appended to each
`networkpacketand the digest of the result transmitted with
`each packet. The next step is to choose a K that
`is
`copyrighted! K could be part of your program,a technical
`paper, a copyright message, or anything else that
`is
`obviously a copyrighted work and is small enough to be
`embedded In executable programs, Once all this is in place,
`the creation of an interoperable product will be either
`computationally or
`legally infeasible,
`for
`it will be
`computationally infeasible to interoperate without X, but if
`K is used, you can sue them for copyright infringement of
`K\ Even if an attempt is made to hide K in the compatible
`product, the non-invertibility of digests, and theit use as a
`proof of existence of information could be used to prove
`that the compatible product must contain K somewhere! To
`the author's knowledge, this is an original applicationfor
`digest algorithms and has neverbeen tested in court,
`-
`
`"
`
`Data structures
`
` This technique can be applied to audit backup services. The
`
`Because it's extremely improbable that collisions (different
`‘inputs, same output) will ever be encoustered in practice,
`digest algorithms can be used in data structures to generate
`unique fixed-length identifiers for arbitrary blocks of data
`in situations where the identfier of identical blocks must
`be the same, but arbitrary identifiers (eg.,-serial numbers)
`cannot be assigned because there is no central entity to
`, issue identifiers.
`
`Forexample; a network of sempernglenetng a
`
`distributed database of documents could collect documents
`
`from their various input sources during the day and then
`synchronise at night by exchanging and comparing the
`digests of the new documents, Checksums could be used
`for this purpose, but would require a follow-up comparison
`’ between documents whose checksums matched. Digests
`provide enough assurance of uniqueness to eliminate the
`‘need for any such follow-up comparison.
`
`This example of a document synchronisation system is just
`one application for digests in data structures, ard it's likely
`that they could be applied in many other data structure
`applications as well. The essence is that the digest can act
`as a convenient identity for a block of data,
`
`-Suitimary
`
`Perbaps the most exciting aspect of these applications Is
`that they do not require a great degree of cryptographic
`sophistication to implement, All that one needs is a digest
`algorithm and an understanding ofthe properties of# digests
`mentionedearlier.
`~
`
`How Secure Are Digest Algorithms?
`
`‘Tn assessing the security of digest algorithms, we must
`consider their strength in the face of random and intelligent
`attacks.
`.
`
`The security of digest algorithms against random attacks
`can be measured in the same way as for checksums. As
`most digest algorithms produce 128-bit (16-byte) digests,
`the chance of a random error in the data going undetected
`is astronomically small, being just one in 2128
`(about one
`in 3.4.x.1038), Tobe this unlucky, you would have to win a
`_ tnillion-ticketlottery six times in a row! Thus, because of
`their increased width, digests generally provide far more
`protection against random attacks than checksums.
`
`Assessing the strength of a digest algorithm against attacks
`by an intelligent opponent is very much harder, as it is
`possible that any digest algorithm could have a hidden
`weakness that could be exploited. Currently, the only way
`to determineif a digestalgorithm is secure is to expose it to
`public scrutiny over a period of years, Unfortunately, the
`field of digests is relatively young and so most digests are
`only a few years old,
`
`1f we ignore the potential for hidden weaknesses, and
`assume that the digest algorithm being assessed provides
`perfect randomness, we need also to take a look at
`that
`other determinantof security: width. High randomnessin a
`digest algorithm is no use if the opponent can use brute
`force to search the digest space. As an n-bit digest can take
`2" different values, any opponent attemptingto search this
`space will find a match after an average of 2"-1 attempts.
`
`$ Calculation: 106° = 1036 = 1038,
`
`
`
`
`
`weeeaee
`
`.
`
`For a 128-bit digest this is 2)27 (1.7 x 1038) attempts. This
`is highly secure. Even If an opponent could test 10? new
`messages per second (which in 1994 technology would
`.tequire at least one thousand high speed CPUs), it would
`still take about 5.4 x 102! years.6
`.
`Unfortunately,
`there is a. different kind of brute-force
`attack, called a "birthday attack" that reduces the amount of
`work required from O(28) to O(V2"), Birthday attacksare
`named after the famous birthday paradox which states that
`a-proup of just 23 people are needed for there to be a
`probability of at least one half that at least two people in
`the proup share the same birthday.
`It's considered a
`paradox because common sense suggests that many more
`people would be required. Birthday attacks do not provide
`an-opponent with a faster-than-brute-force way offinding a
`docuinent having a particular digest, but they do allow an
`opponent to create a pair of documents having the same
`digest in about ¥2" attempts. To execute a birthday attack,
`the opponent generates V2"
`random documents and
`computes each document's digest. Theset of such digests is
`then searched for duplicates. A result
`related to the
`birthday paradox states that the probability that there will
`be at least one matchingpair will be about 0.5,
`
`two documents* can be
`In a variation of this attack,
`produced that are approximations of two target documents,
`and which have the same digest. This would enable an .
`opponent to present one of the two documents officially
`and then later substitute the second document [Yuval79}.
`
`How exposed to birthday attacks are existing digest
`algorithms? As it happens, not very. But birthday attacks
`do bring digest algorithms much closertothe brink. If a
`digest algorithm produces a 128-bit digest, then a birthday
`attack will eaan average of24 operations to succeed
`(instead of 2128-1), at 109 tests per second, 294 tests
`would take over 500 years.? Thus, a very determined
`Opponent with a network of one thousand workstations,
`with a solid year of CPU time to bum, would have a one in
`500 chance of finding two documents with the same digest.
`Whether you consider this to he aa serious risk depends on
`the criticality of your data,
`
`Forfurther information onattacks on digest algorithms, see
`[Pieprzyk93). For more
`information on how digest
`algorithms
`fit
`into
`the
`field of - cryptology,
`see
`[Simmons92].
`he
`
`-
`
`Technical Restrictions
`
`section contains a. brief discussion of technical
`This
`restrictions on the use of digest algorithms,
`
`6 Calculation: (2127/109¥(365*24*60*60) = 5.4 x 1021,
`
`7 Calculation: (264/109)/(365*24*60*60) = 580.
`
`Memory consumption: Most digest algorithms require
`verylittle memory, using it only for their internal state(eg.,,
`128-bits), aid to buffer Incoming bytes into blocks (eg.,-64
`bytes).
`
`CPU consumption: Digest algorithms are much slower
`than checksum algoritlims as digest algorithms have to
`provide a computational barrier for their opponent rather
`than merely a probabilistic one. However, digests are still
`fast enough to warrant the routine replacement of
`checksums by digests in many applications. The design of
`secure, high-speed digest algorithms is an active research
`area [Anderson94} [Pieprzyk93] [Scl.crypt].
`
`Legal Restrictions
`
`Copyright: Many of the more popular digest algorithms
`have implementations whose C source code is available -
`through
`the
`Iutemet
`by FIP. Many
`of
`these
`implementations come with public copyright licences that
`‘apply. only minorrestrictions to. thé programmer, such as
`requiring that the algorithm be properly identified. The
`MDS algorithm is one such algorithm and its copyright
`message is reproduced here in full as an example:
`
`.
`
`.
`
`Copyright (C) 1991-2. RSA Data Security, Inc. Created
`1991. All rights reserved.
`
`License to copy and use this software is granted
`provided thatit is identified as:the “RSA Data Security,
`Inc. MDS Message-Digest Algorithm" in all material
`mentioningor referencing this softwareorthis function.
`
`License is also granted to make and use derivative
`works provided that such works are identified as
`"derived from the RSA Data Security,
`Inc. MDS5
`Message-Digest Algorithm" in all material mentioning
`or referencing the derived work.
`
`Inc. makes no representations
`RSA Data Security,
`concerning either the merchantability of this software
`or the suitability of this software for any particular —
`without express or
`purpose.
`It
`is provided “as
`is"
`implied warranty of any kind,
`
`‘These notices must be retained in any copies of any part
`of this documentation and/or software.
`
`An important aspect of this notice is that it does not restrict
`the commercial use of MD5. This means that you can
`incorporate MDS into commercial products so long as you
`conform with the minimal requirements above.
`
`Patents: Software patents provide far greater restrictions
`than copyright as they protect the algérithm ‘itself, rather
`
`
`
`digest algorithms are patented (eg., MDG-2 and MDC-4
`(by IBM)), these patents are usually fairly oarrow, and the
`field seems to be remarkably free from the kind of broad
`software patents that have muddied fields such as data
`compression. In particular, many digest algorithms (eg,
`"MDS and SHA-1) appear to be entirelypatentfree.
`
`Exportability: The US Government's current policy is to
`restrict the export of cryptographic products for defence
`reasons. This policy has been propagated through treaties
`to-associated countries such as Australia and has caused
`uncertainty about what software can and can't be exported.
`
`embody cryptographic
`digest algorithms
`Although
`strengths, it seems that there is no problem with exporting
`them,. so long as they do not provide secrecy,
`just
`authentication. However, it would be wise to check with the
`relevant authorities anyway.
`‘
`
`A Short Catalogue of Digest Algorithms
`This section contains a description of some of the more
`commondigest algorithms.
`
`MD2: Thisis the "RSA Data Security, Inc. MD2 Message-
`Digest Algorithm". The defining document for MD2' is
`RFC-1319 [Kalisky92}. MD2is secure, but slow, Jts width
`is 16 bytes (128 bits).
`-
`
`MD4:This is the “RSA Data Security, Inc. MD4 Message-
`Digest Algorithm", MD4 is a fast, fairly secure digest
`algorithm. However, when it was released, it became so
`popular that {ts inventors became concerned thatit might
`not be secure enough for widespread use and released a
`more secure version which they named. MD5. As such,
`MDSis preferred. The defining document for MD4is RFC-
`1320 {Rivest92a}. Its width is 16 bytes (128 bits).
`°
`
` than a particular implementation. Although quite a few
`
`_
`
`SNEFRU: This is the Snefru algorithm, also known as
`"The Xerox Secure Hash Function”. Although the name
`"“Snefru" looks as if itis an acronym,it is actually the name
`of a Pharaoh of ancient Egypt. The Snefro algorithm is
`described in detail
`in
`[Merkle??]: Unlike
`the other
`algorithms, Snefru has two parameters called the security
`level (numberof transformation iterations) and the output
`size (width of the digest in longwords (4-byte chunks)),
`each of which can takeeither the value 4.or 8, This leads to
`four different versions of Snefru, Snefru is one of the few
`digest algorithms that yield a-digest longer than 16 bytes.
`This in itself could provide extra security, particularly if
`you are concerned aboutbirthday attacks.
`“Other Digest Algorithms: Other digest algorithms that the ~
`author has beard of, but not yet tracked down are: 3-WAY,
`AS, BLOWFISH, FEAL, FISH, Haval, MDC-2, MDC-4,
`N-hash, Purdy Hash, RIPE-MACI, RIPE-MAC3, RIPE-
`MD, RSADSI, SAFER, SEAL, VINO, and WAKE.There ©
`are apparently many more. Some of these are reputed to be
`-better (faster, more secure)
`than the digests described
`earlier, Someare them are reputed to have been broken.
`
`Getting Started
`
`The best wayto get started with digest algorithms is simply
`“to try one out. A good algorithm to start with is the RSA
`Data Security,
`Inc. MDS Message-Digest Algorithm
`CMDS" for short).
`It is a suitable "default" algorithm
`because it is simple, well-defined, well-ksown, patent-free,
`not
`too slow, has been exposed to the cryptological
`community for a few years, and has source code readily
`available by FTP through the Internet. The MD5 algorithm
`is described in RFC-1321, which also contains
`an
`implementation in C and a small test suite so that you can
`confirm that you've gotit working correctly, RFC-1321 can
`be obtained by FTP from:
`munnari.oz.au:/rfc/rfci321.2
`
`Conclusions
`
`MDS: This is the "RSA Data Security, Inc. MD5 Message-
`* Digest Algorithm”, This is a fast and secure algorithm. The
`defining document for MDS is the RFC-1321 [Rivest®2b}.
`Its widt