`
http://1997.webhistory.org/www.lists/www-talk.1994q3/0416.html
`
`re: Dienst and BFD/LIFN document
`
`Reed Wade (wade@cs.utk.edu)
`Mon, 08 Aug 1994 17:16:44 -0400
`
Hi,
`
This seems to relate to the Dienst discussion.
`
We're working on a similar piece of the problem. Using LIFNs
(Location Independent File Names: essentially, URNs that refer
to an immutable set of octets), we expect to be able to provide:
`
`support for easy replication/caching
`high scalability
`file authenticity and integrity
`
We (Keith Moore) gave a short presentation describing our scheme
to the URI and IIR working groups at the last IETF meeting in
Toronto.
`
`See attached.
`
`Reed Wade
`
`wade@cs.utk.edu -- http://www.netlib.org/utk/people/ReedWade.html
`
Network Working Group                                      Keith Moore
Internet Draft                                               Reed Wade
Expires: January 27, 1995                                   Stan Green
                                               University of Tennessee
                                                         July 27, 1994
`
`An Architecture for Bulk File Distribution
`
draft-moore-bfd-arch-01.txt
`
`Status of this Memo
`
This document is an Internet Draft. Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas, and
its Working Groups. Note that other groups may also distribute working
documents as Internet Drafts.
`
Internet Drafts are valid for a maximum of six months and may be
updated, replaced, or obsoleted by other documents at any time. It is
inappropriate to use Internet Drafts as reference material or to cite
them other than as a "work in progress".
`
`Abstract
`
This memo describes a system for the automated replication of data files
and their descriptions to various file servers across the Internet. The
system maintains a distributed database which contains the locations of
each file distributed by the system, and will provide a list of
locations for any file upon request. The system provides assurances of
integrity, and authenticity, of the replicated files. It is intended
for use with the World Wide Web, Gopher, and similar applications, to
provide higher availability, improved response, and better use of
network resources.
`
`1. Introduction
`
There are a number of problems associated with the current Internet
information infrastructure, which result in poor service to its users.
These problems include:
`
+ lack of scalability. Many files are available at only a single file
  server. Any popular file (e.g. Mosaic home page, weather map) will
  cause a file server to be swamped.
`
+ lack of fault tolerance. If a file server is unavailable, there is no
  mechanism to find alternate servers for that file.
`
+ inefficient use of network resources. The primary location of a file
  may be halfway across the globe, or on the other side of a
  low-bandwidth link. Even when alternate locations exist, there is
  currently no mechanism to find a "nearby" location of a particular
  file.

Moore/Green/Wade            Expires 27 January 1995            [Page 1]

Bulk File Distribution                                    27 July 1994
`
+ no assurances of authenticity or integrity. Files are currently
  replicated from one server to another using a variety of ad hoc
  mechanisms. Various translations may occur during this process, and
  errors (or even deliberate modifications) may be introduced. There is
  currently no mechanism for ensuring the integrity of replicated files,
  nor any assurance that a copy of a file which is available on a server
  has not been modified by someone other than the author.
`
`2. Proposal
`
In order to address these problems, we propose the following
architecture. It is intended to provide replication of files across
multiple servers, scalable access to the files distributed by the
system, and the assurance of integrity and (optionally) authenticity for
each file. In addition it provides the ability to reliably cache such
files as well as the potential to take advantage of network proximity
for improved utilization.
`
Each file is given a unique name called a Location Independent File Name
(LIFN), which refers to that particular sequence of octets. Once a LIFN
has been assigned to a file, the binding between the LIFN and that
sequence of octets may not be changed. The space of LIFNs is
subdivided among several "publishers" (or "naming authorities"), who are
responsible for ensuring the uniqueness of LIFNs within their portion of
LIFN-space, and also provide a LIFN-to-location mapping service for
those LIFNs.
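
The immutable binding described above can be illustrated with a small
sketch. It is not taken from the draft: the class name, the
"publisher/name" LIFN syntax, and the choice to remember an MD5 digest
of the octets are all assumptions made for the example.

```python
import hashlib


class LifnRegistry:
    """Hypothetical per-publisher registry of LIFN bindings."""

    def __init__(self, publisher: str) -> None:
        self.publisher = publisher
        # LIFN -> digest of the octets it was bound to (assumed bookkeeping).
        self._bindings: dict[str, str] = {}

    def assign(self, name: str, octets: bytes) -> str:
        """Bind a LIFN to a sequence of octets; refuse to rebind it."""
        lifn = f"{self.publisher}/{name}"  # assumed LIFN syntax
        digest = hashlib.md5(octets).hexdigest()
        existing = self._bindings.get(lifn)
        if existing is not None and existing != digest:
            raise ValueError(f"{lifn} is already bound to different octets")
        self._bindings[lifn] = digest
        return lifn
```

Re-assigning the same octets is treated as harmless here; any attempt to
bind an existing LIFN to different octets is rejected, which is the
property the rest of the architecture relies on.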
`
The LIFN-to-location mapping service is provided by a network of
"location servers" collectively known as the "location database". These
servers accept requests for locations of LIFNs, as well as updates
containing new locations or requests to delete old LIFN-to-location
mappings. Such update requests require authentication; only those file
servers which are authorized by the publisher may store locations in the
database.
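
As a rough sketch of this behaviour, the following models a single
location server in memory. The names are hypothetical, and a simple
membership test on a server identifier stands in for the real
authentication, which the draft does not specify.

```python
from dataclasses import dataclass, field


@dataclass
class LocationServer:
    """Hypothetical in-memory location server for one publisher."""

    # File servers the publisher has authorized (stand-in for real auth).
    authorized: set[str]
    locations: dict[str, set[str]] = field(default_factory=dict)

    def update(self, server_id: str, lifn: str, url: str,
               delete: bool = False) -> bool:
        """Add or delete a LIFN-to-location mapping; False if refused."""
        if server_id not in self.authorized:
            return False
        urls = self.locations.setdefault(lifn, set())
        if delete:
            urls.discard(url)
        else:
            urls.add(url)
        return True

    def query(self, lifn: str) -> list[str]:
        """Return the known locations for a LIFN (possibly empty)."""
        return sorted(self.locations.get(lifn, set()))
```

Queries are open to anyone, while updates from unknown servers are
silently refused, matching the split between requests and authenticated
updates described above.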
`
Access to files themselves is provided by more-or-less conventional file
servers, using any protocol which provides binary transparent file
access. Such protocols would include HTTP, Gopher, FTP, and others, as
long as certain restrictions are observed.
`
Files are replicated among file servers using a "replication daemon". A
copy of the replication daemon runs on each file server. It accepts
descriptions of newly published files, and decides (based on
site-provided criteria) which files should be acquired by the file server.
It then queries the location database to find a location for each file
desired, and retrieves the file from one of the locations listed.
Finally, it updates the location database to inform it of the new
location for that file. The replication daemon may also act as a file
reaper, deciding when to delete files, and informing the location
database when such files will no longer be available.
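
The acquisition steps of the daemon can be sketched as a single pass
over incoming descriptions. This is only an illustration: the function
and parameter names are invented, descriptions are modelled as plain
dictionaries with a "lifn" key, and the network operations are passed in
as callables so that the example stays self-contained.

```python
from typing import Callable, Iterable, Optional


def replication_pass(
    descriptions: Iterable[dict],
    wanted: Callable[[dict], bool],
    locate: Callable[[str], list],
    fetch: Callable[[str], Optional[bytes]],
    store_and_announce: Callable[[str, bytes], None],
) -> list:
    """Run one acquisition pass and return the LIFNs that were acquired."""
    acquired = []
    for desc in descriptions:
        # Site-provided criteria decide which files this server wants.
        if not wanted(desc):
            continue
        lifn = desc["lifn"]
        # Ask the location database, then try each listed location in turn.
        for url in locate(lifn):
            data = fetch(url)
            if data is None:
                continue  # unreachable or stale location; try the next one
            # Keep a local copy and post the new location to the database.
            store_and_announce(lifn, data)
            acquired.append(lifn)
            break
    return acquired
```

The reaper role mentioned above is left out of the sketch; it would be
the mirror image, posting a deletion notice before removing a file.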
`
Associated with each file is a description. Included in the description
is so-called "bibliographic information", such as title, author,
content-type, etc., but also an MD5 or similar fingerprint of the file.
The relevant portions of the description are cryptographically signed by
the publisher. To perform an integrity check, a file server or user can
retrieve the description for any file (using whois++ or a similar
protocol), compute the MD5 fingerprint for that file, and compare this
with the one listed in the description. To check authenticity, it can
also verify the credentials of the file's description.
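
A minimal sketch of the integrity check, using Python's standard
hashlib module. The description is modelled as a dictionary with an
"md5" field holding a hexadecimal digest; that layout is an assumption,
since the draft does not define the description format. Verification of
the publisher's signature is deliberately left out.

```python
import hashlib


def md5_fingerprint(data: bytes) -> str:
    """Return the MD5 fingerprint of a file's octets as hex."""
    return hashlib.md5(data).hexdigest()


def check_integrity(data: bytes, description: dict) -> bool:
    """Compare a retrieved file against the fingerprint in its description.

    The "md5" key is a hypothetical field name for this example.
    """
    return md5_fingerprint(data) == description["md5"]
```

A client would retrieve the description and the file separately, then
call check_integrity on the pair; a mismatch means the copy differs from
what the publisher described.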
`
A file is "published" in the system by creating a description to go
along with the file, signing the description with the publisher's
private key, making the file available via one or more "master" file
servers, and listing those locations in the location database. The
description may also be sent (perhaps via ordinary email) to interested
parties. Such parties may include slave file servers (which can use
them to decide which new files to acquire), resource discovery servers
(which can then provide search services based on the file descriptions
and/or the files themselves), and ordinary users.
`
The location database consists of one or more servers for each
publisher. These are listed in either a well-known master directory, or
a reserved portion of the DNS namespace, so that a client can easily
find out which server to query for a particular LIFN. The query itself
uses a datagram-based protocol, which is designed to impose the least
possible overhead for both client and server. Updates to the location
server use a similar protocol; however, these protocols also require
authentication to prevent unauthorized (or untrusted) servers from
listing alternate locations for a file. Location updates are posted to
a single location server and propagated to the other peer servers via a
batch version of the update protocol (using virtual circuits rather than
datagrams).
`
It is not necessary to keep all location servers for a publisher in
sync. The location query service does not guarantee that it returns all
instances of a given file. If the list of locations provided by one
server is insufficient, the client is free to consult the other servers
in the hope of finding a better one. Similarly, if one or more of the
locations thus provided is "stale" (that is, points to a file that no
longer exists), the client may also look for the file at its other
listed locations. (Note that while a file server may delete a file
whose location is listed in the location database, it may not re-use a
filename for a different file or change that file in any way.)
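
Because the servers need not agree, a client can simply work through
them until some location yields the file. The sketch below shows one way
to do that; the function name and the callable-based interfaces are
assumptions for illustration, with each location server represented by a
function that maps a LIFN to a list of locations.

```python
def find_file(lifn, location_servers, fetch):
    """Try location servers in order until one listed location works.

    location_servers: callables mapping a LIFN to a list of locations.
    fetch: callable returning the file's octets, or None if the
           location is stale or unreachable.
    Returns (location, octets), or None if every location failed.
    """
    tried = set()
    for query in location_servers:
        for url in query(lifn):
            if url in tried:
                continue  # already known to be stale; do not retry it
            tried.add(url)
            data = fetch(url)
            if data is not None:
                return url, data
    return None
```

Later servers are consulted only when the earlier lists are exhausted,
so the common case costs a single query.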
`
To minimize the likelihood of stale file locations, file servers are
encouraged to inform the location database in advance of actually
deleting a file. The response to a location query includes a "time to
live" field which is used by clients or proxy servers to maintain a
cache of LIFN-to-location mappings. After being informed that a file is
going to disappear, the location servers will adjust the "time to live"
field in future responses to queries for that file, to reflect the time
when the file is expected to disappear. Until that time, the "time to
live" field is the value supplied by the file server when it posted the
location. Specifying a time to live of "N" in a location update is
tantamount to an agreement that the file server will not delete the file
without informing the location server "N" seconds in advance of doing
so.
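
One possible reading of this rule is sketched below. The draft does not
give a formula, so the function is an assumption: before any deletion
notice the posted value is returned unchanged, and afterwards the
response never promises more than the time remaining until the announced
deletion.

```python
from typing import Optional


def effective_ttl(posted_ttl: int, now: int,
                  deletion_time: Optional[int] = None) -> int:
    """Time to live (seconds) to put in a location query response.

    posted_ttl: value "N" supplied when the location was posted.
    now, deletion_time: timestamps in seconds; deletion_time is None
    until the file server has announced a deletion.
    """
    if deletion_time is None:
        return posted_ttl
    remaining = deletion_time - now
    # Capping at posted_ttl is a conservative choice made for this sketch.
    return max(0, min(posted_ttl, remaining))
```

For instance, a location posted with N = 3600 and a deletion announced
600 seconds ahead would be answered with a time to live of 600.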
`
A special file replication protocol is used between file servers. It
provides mutual authentication to prevent spoofing, and pipelining to
transfer large numbers of files efficiently, even over high-delay links.
It may also accommodate compression on a per-file basis (for
low-bandwidth links), and checkpointing to allow for recovery when
transfers of large files are interrupted.
`
`3. Evaluation
`
The system should scale in several ways. User demand for any particular
file can be distributed over multiple file servers. The location
database is also distributed, both because each publisher maintains its
own servers, and also because several servers can be provided for any
publisher. In addition, the current location query protocol provides
for a cacheable "redirect" response that allows the LIFN space for a
particular publisher to be divided across several secondary servers,
without imposing any additional structure on the LIFNs themselves.
Because there is no need for synchronization, location updates can also
be distributed across several servers, and efficiently transmitted among
location server peers. This avoids the overhead associated with
multi-phase commit protocols, which would be needed to ensure consistency.
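
The redirect mechanism might work along the following lines. Everything
here is hypothetical: the draft does not say how a primary server picks
a secondary, so a CRC-32 hash of the LIFN is used as one way of dividing
the space without relying on any structure in the name, and the
("redirect", server) / ("locations", list) response shape is invented
for the example.

```python
import zlib


def pick_secondary(lifn: str, secondaries: list) -> str:
    """Choose the secondary server responsible for a LIFN (assumed scheme)."""
    return secondaries[zlib.crc32(lifn.encode()) % len(secondaries)]


def resolve(lifn, start, ask, max_hops=4):
    """Follow redirect responses until some server returns locations.

    ask(server, lifn) returns ("redirect", other_server) or
    ("locations", list_of_locations).
    Returns (answering_server, locations); the answering server can be
    cached by the caller so that later queries skip the redirect.
    """
    server = start
    for _ in range(max_hops):
        kind, value = ask(server, lifn)
        if kind == "locations":
            return server, value
        server = value  # redirected to a secondary server
    raise RuntimeError("too many redirects")
```

The hop limit guards against misconfigured servers that redirect to each
other indefinitely.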
`
Integrity and authenticity are provided by the MD5 fingerprint and the
cryptographic signature in the file description. A possible weak point
in the current system is the assumption that DNS will be used to
identify location servers for a particular publisher, since DNS is not
itself secure. It should be pointed out that since only trusted
locations will be listed by the location service, a user may not wish to
perform integrity or authenticity checks for every file accessed.
However, the capability is there for when it is needed.
`
The system allows a client to consult a local cache or proxy server
before attempting to access a file which may already be available
locally. Since the binding between a LIFN and the file is fixed, if a
client has a LIFN for a file, and the cache has a file which goes with
it, the client has a reasonable assurance that the cached copy is
correct (assuming it trusts the cache). If the time-to-live field in a
location response is nonzero, LIFN-to-location bindings can also be
reliably cached for that amount of time.
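
A client-side cache of LIFN-to-location bindings honoring the
time-to-live field could look like the sketch below. The class and
method names are illustrative, and the clock is injectable only so the
behaviour can be exercised without waiting.

```python
import time


class LocationCache:
    """Hypothetical cache of LIFN-to-location bindings with expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # LIFN -> (expiry time, list of locations)

    def put(self, lifn, locations, ttl):
        """Remember a location response; a zero TTL is never cached."""
        if ttl > 0:
            self._entries[lifn] = (self._clock() + ttl, list(locations))

    def get(self, lifn):
        """Return cached locations, or None if absent or expired."""
        entry = self._entries.get(lifn)
        if entry is None:
            return None
        expires, locations = entry
        if self._clock() >= expires:
            del self._entries[lifn]
            return None
        return locations
```

Cached file contents need no expiry at all, since a LIFN always names
the same octets; only the location bindings age out.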
`
Finally, this system allows the potential that a client can select a
"nearby" location from among several locations for a file, or from among
several available location servers. A means by which this may be
accomplished has been proposed and is under investigation.
`
`4. Open Problems
`
As stated above, there is a need to provide a means by which a client
may choose from among several service locations to take advantage of
network proximity.
`
If existing file servers are to adopt this plan, there needs to be a
transition scheme. As stated above, locations (i.e. file names) of
files provided by the location databases may not be reused, even for
updates to the file. This is in contrast to the present-day use of file
locations (URLs) which are expected to be stable references to the
*current* version of a file. If the replication daemon is to replace
the ordinary mirroring software that is presently in use, it must also
provide "stable" locations for the same files, which may be updated in
place and accessed by more traditional means. This could be
accomplished by having each description include a "suggested-filename"
field. The file server would concoct a local filename from this field;
any new file from the same publisher with the same suggested-filename
would replace the old copy of the file stored in that location (but not
the copy of the file stored in the location listed in the location
database).
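
The two kinds of location can be kept side by side, as in the sketch
below. The directory layout, the class, and the rule of keeping only the
final path component of the suggested filename are assumptions made for
the example; the draft leaves the local naming to the file server.

```python
from pathlib import PurePosixPath


def stable_path(publisher: str, suggested_filename: str) -> str:
    """Concoct a local "stable" filename from the suggested-filename field."""
    # Keep only the last path component so the name stays inside the
    # publisher's directory (a choice made for this sketch).
    name = PurePosixPath(suggested_filename).name
    if name in ("", ".", ".."):
        raise ValueError("unusable suggested-filename")
    return f"stable/{publisher}/{name}"


class FileStore:
    """Hypothetical store with immutable and stable locations."""

    def __init__(self):
        self.by_lifn = {}         # never overwritten once stored
        self.by_stable_path = {}  # updated in place

    def add(self, publisher, lifn, suggested_filename, data):
        """Store a new file and refresh the stable location; return its path."""
        self.by_lifn.setdefault(lifn, data)
        path = stable_path(publisher, suggested_filename)
        self.by_stable_path[path] = data
        return path
```

A newer file with the same suggested filename replaces what the stable
path serves, while the copy reachable through the older LIFN is left
untouched.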
`
There is a need to describe many types of relationships in the file
description. The details of such descriptions are yet to be defined.
`
`5. Implementation status
`
A prototype version of this system is being constructed by the authors.
A distributed location database and client library have been constructed
and interfaced to Mosaic; the resulting client demonstrated the ability
to (crudely) select from among multiple locations of a file, and to
recover from the failure of both file servers and location database
servers. The replication daemon and its associated protocols are
currently under development.
`
`Experience from the use of the prototype will be used to construct a
`second version of the system, which the authors intend to make widely
`available.
`