throbber
‘OFTWARE
`
`EMCVMW 1017
`
`RpliEYS
`
`1807 v9 soley)
`
`ron
`JOHN WILEY & SONS
`Chichester - New York - Brisbane - Toronto - Singapore
`A Wiley-Interscience Publication
`SPEXBL 12(12) 1085-1173
`ISSN 0038-0644
`
`EMCVMW 1017
`
`ACTICE & EXPERIENCE
`DME 12, No. 12
`DECEMBER 1982
`
`EDITORS
`
`DAVID BARRON
`
`CHARLES LANG
`
`DAVID HANSON
`
`

`

`SHORT COMMUNICATIONS
`
`1165
`
`‘FINGERPRINTING’—A
`TECHNIQUEFOR FILE
`IDENTIFICATION AND
`MAINTENANCE
`
`D. R. MCGREGOR AND J. A. MARIANI
`Department of Computer Science, University of
`Strathclyde, Glasgow, Scotland
`
`‘ideal’ value (1 in 237), but the simplicity of the
`algorithm makes for a simple and very fast fin-
`gerprinting program.
`as
`Hence, we are using the file’s content,
`opposed to the file’s name as in conventional
`systems, to identify thefile. Ina simple fingerprint
`(FP) system,
`for example, we can maintain a
`database of FPS against associated documentation.
`Therefore, when the user comes across an un-
`documentedfile he can try to match its FP against
`one in the database in an attempt to obtain any
`This short note describes a technique, which we
`information regarding thatfile. To build up the FP
`have called ‘fingerprinting’,
`to produce a quasi-
`database, users are free to add newfiles to the
`unique identifier for a file, derived from thatfile’s
`contents. This allows the identification of identical
`system, as long as theyare willing to enter appro-
`priate information regarding that file.
`files with different names and the notification of
`Systemscan naturally range upwardsin sophist-
`any changes to a knownfile.
`ication from the simple one described above. In the
`KEY WORDS
`File identification©Sumcheck
`case of program files, to ensure correct identificat-
`Information content Documentation
`ion, the FP database could also hold test data and
`Programmesupport environment
`matching output for the program (for example, a
`Source code control
`Pascal compiler could have a Pascal program as
`data and the corresponding compiled listing as
`output) which can be checked with the output of
`the unknown program (by fingerprints again). In
`the event of any difference, a more sophisticated
`comparison of the programs could be called into
`play.
`The fingerprinting technique, as outlined this
`far,
`is best used statically—i.e. with programs
`which are rarely expected to change. The FP
`database can be extended to handle different ver-
`sions of a program, while programs (or program
`versions) which are no longer supported can have
`the updated version’s FP overwrite the old FP.
`Static fingerprints can be used to detect any file
`corruptions. One use FP was put to was as an
`assuranceof correct software transportation. Every
`file transported was fingerprinted at our installa-
`tion, and a list of FPs plus a copy of the finger-
`printer
`sent with the software. The finger-
`printer was then employed at the destination to
`confirm the
`safe
`transportation of
`the
`files
`involved.
`The main use of the FP technique has been in
`conjunction with the AutoProg system,’ which
`essentially (for the purposes of this paper) main-
`tains a library of user source modules accessible by
`all users but maintained by their owners within
`ther own file store. In this case, the files can be
`expected to change, and the fact that a file has been
`altered can be
`exposed by occasionally re-
`fingerprintingall the modules and comparing these
`dynamic FPs against the FPs as recorded in the
`AutoProg database.
`If modifications ‘are detected, users of the af-
`fected modules can be notified and re-compilation
`
`One problem associated with current operating
`systemsis the easy proliferation of multiple copies
`of programs and data. While this is not only
`wasteful of space and other resources, it can cause
`problems whenassociated documentation has been
`connected with a program by the program’s file
`name.
`The ‘fingerprinting’ technique described in this
`short note outlines a solution wherebyinformation
`about a file is connectedto that file by meansofthe
`file’s contents, as opposed to its name.
`The ideais to provide an identifying feature for
`every file, which is intrinsically distinctive, and
`analogous (hopefully) to a human’s fingerprint. In
`our current implementation, a fingerprint is a 32-
`bit number,
`formed simply by performing a
`double check-sum on the program file, and provid-
`ing a uniqueidentifier for the file.
`A fingerprint is calculated as follows—given a
`file with N characters, the double checksum (s2) is
`formed bythe algorithm—
`
`si ¢= 0; 52.:= 0;
`for i:=—1to N do
`begin
`sl :=si4+c[z];
`s2:=s2+4+s1
`end
`
`where si and s2 are 16-bit integers (overflow is
`ignored) and c[/] is the ith character of thefile.
`These 2 16-bit numbers are then stored as a 32-
`bit fingerprint. From an information theoretical
`standpoint this algorithm does not generate the
`best possible checksum. The chance of a clash in
`fingerprints is therefore somewhathigher than the
`Received 26 July 1982 Revised 21 September 1982
`
`

`

`1166
`
`BOOK REVIEWS
`
`of affected programs can be automatically gen-
`erated. Similarly, FPs can be extended to detect
`changesin the associated documentation files.
`In this short note we have presented a very
`simple technique for connecting information about
`a file with that file’s content. This method shares
`the drawbackof associating documentation with a
`file name, as both a file’s name and contents can be
`transient. However, the FP technique can be use-
`fully applied in the diametrically opposing situa-
`
`tions where files are relatively static or where
`changesto files are required to be trapped, such as
`the AutoProg system or as a detection mechanism
`for file corruption.
`
`REFERENCES
`
`1. D. R. McGregor and J. A. Mariani, ‘AutoProg—a
`software development and maintenance system’,
`IUCC Bulletin, Summer 1981.
`
`
`
`Book Reviews
`
`INTRODUCTORY ALGOL 68 PROGRAMMING, D. F.
`Brailsford and A. N. Walker, Ellis Horwood,
`Chichester. No. of pages: 281. Price:£14.00 (hard-
`back), £5-95 (paperback).
`
`For example, consider the veryfirst coverage of
`arrays. There is one paragraph explaining in gen-
`eral terms that it maybe useful to regard a group of
`objects as forming a composite object and that
`Algol 68 has row modes and structured modes in
`order to do this. The next paragraph contains an
`Algol 68 is not an easy language to teach. This
`example of an object of mode []real and intro-
`text on Algol 68 is
`intended both for under-
`duces the notation [, ]real and [,, ]real and the
`graduates attending their
`first course in pro-
`final paragraph explains howa [,, Jreal is repre-
`gramming and for computer programmers and
`sented inside a computer. I feel that most of this
`system analysts in industry, research and com-
`material should havebeenleft until a later chapter.
`merce. I am afraid that the book’s approach is such
`Chapter2is called ‘Program Structure’. It starts
`that most beginners will soon be put off pro-
`with sections on ‘primaries’, ‘expressions’, ‘unitary
`gramming. Even for programmers conversant with
`clauses’ and ‘serial clauses’. This is an important
`Fortran, Basic or Cobol, a gentler introduction to
`chapter because. it contains someofthe essentials
`some of the concepts would be helpful.
`such as assignments,
`expressions, conditional
`For a first course in programmingI believeit is
`clauses and loops. However,
`there is too much
`necessary to have an approach led by a need to
`theory: there is not enough demonstration of how
`write programs in order to solve problemsrather
`these constructs are used and put together. For
`than a breadth-first approach throughthe elements
`example, at
`the start of this chapter,
`the first
`of a programminglanguage. Onerequires a gentle
`Algol 68 program of the book is presented. It
`lead-in to the waythat objects are represented in
`contains two heap declarations, two calls of read
`programsrather than knowall about objects, then
`and a call of print. The second Algol 68 program of
`find out how programsare written andfinallyfind
`the book appearsat the end ofthe section on‘serial
`out how objects are manipulated within programs.
`clauses’: it contains three heap declarations, a call
`In another context, there is the phrase Algorithms
`of read, an obscure while loop and a call of print.
`+ Data Structures = Programs where the operator
`In between the two programs, the fragments of
`+ evaluates its operands concurrently.
`In this
`Algol 68 that are given are not presentedasfulfil-
`book the right hand operand of + is evaluated
`ling someneed.
`first, then the left hand operandandfinally the +
`Chapter 2 continues with a discussion of scopes
`is done in order to yield some programs.
`and ranges. Until now heap generators have been
`Thus, chapter 1
`is about objects. It introduces
`used. However, after a discussion of howruntime
`the modes, real, int, bool and char, row modes
`storage is organized in terms of a stack and a stack
`and structured modes. The chapter also mentions
`pointer,
`the book points out
`that heap storage
`heap generators and dope vectors and has a very
`cannot be handled in this fashion and soit intro-
`heavy dose of refs including:
`duces
`loc. Abbreviated declarations are then
`ref ref int chain = heaprefint
`covered.
`:=heap int :=123
`Chapter 3 takes some elementary problems and
`produces programs to solve them. The chapter
`starts by producing many programsto solve a
`
`In myopinion this chapter contains too many new
`ideas.
`
`

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket