FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment
`
`Atul Adya, William J. Bolosky, Miguel Castro, Ronnie Chaiken, Gerald Cermak,
`John R. Douceur, Jon Howell, Jacob R. Lorch, Marvin Theimer, Roger P. Wattenhofer
`Microsoft Research, Redmond, WA 98052
`{adya, bolosky, mcastro, rchaiken, gcermak, johndo, howell, lorch, theimer}@microsoft.com;
`wattenhofer@inf.ethz.ch
`
`Abstract
`Farsite is a secure, scalable file system that logically functions as a centralized file server but is physically
`distributed among a set of untrusted computers. Farsite provides file availability and reliability through randomized
`replicated storage; it ensures the secrecy of file contents with cryptographic techniques; it maintains the integrity of
`file and directory data with a Byzantine-fault-tolerant protocol; it is designed to be scalable by using a distributed
`hint mechanism and delegation certificates for pathname translations; and it achieves good performance by locally
`caching file data, lazily propagating file updates, and varying the duration and granularity of content leases. We
`report on the design of Farsite and the lessons we have learned by implementing much of that design.
`
`1. Introduction
`This paper describes Farsite, a serverless distributed file
`system that logically functions as a centralized file
`server but whose physical realization is dispersed
`among a network of untrusted desktop workstations.
`Farsite is intended to provide both the benefits of a
`central file server (a shared namespace, location-
`transparent access, and reliable data storage) and the
`benefits of local desktop file systems (low cost, privacy
`from nosy sysadmins, and resistance to geographically
`localized faults). Farsite replaces the physical security
`of a server in a locked room with the virtual security of
`cryptography, randomized replication, and Byzantine
`fault-tolerance [8]. Farsite is designed to support
typical desktop file-I/O workloads in academic and corporate environments, rather than the high-performance I/O of scientific applications or the large-scale write sharing of database applications. It requires
`minimal administrative effort to initially configure and
`practically no central administration to maintain. With
`a few notable exceptions (such as crash recovery and
`interaction between multiple Byzantine-fault-tolerant
`groups), nearly all of the design we describe has been
`implemented.
`
`Traditionally, file service for workstations has been
`provided either by a local file system such as FFS [28]
`or by a remote server-based file system such as NFS
`[39] or AFS [21]. Server-based file systems provide a
`shared namespace among users, and they can offer
`greater file reliability than local file systems because of
better-maintained, higher-quality, and more highly
`redundant components. Servers also afford greater
`physical security than personal workstations in offices.
`
`However, server-based systems carry a direct cost in
`equipment, physical plant, and personnel beyond those
`already sunk into the desktop infrastructure commonly
`found in modern companies and institutions. A server
`requires a dedicated administrative staff, upon whose
`competence its reliability depends [19] and upon whose
`trustworthiness its security depends [47]. Physically
`centralized servers are vulnerable to geographically
localized faults, and their store of increasingly sensitive and valuable information makes them attractive, concentrated targets for subversion and data theft, in
`contrast to the inherent decentralization of desktop
`workstations.
`
`In designing Farsite, our goal has been to harness the
`collective resources of loosely coupled, insecure, and
`unreliable machines to provide logically centralized,
`secure, and reliable file-storage service. Our system
`protects and preserves file data and directory metadata
`primarily through the techniques of cryptography and
`replication. Since file data is large and opaque to the
`system, the techniques of encryption, one-way hashing,
`and raw replication provide means to ensure its privacy,
`integrity, and durability, respectively. By contrast,
`directory metadata is relatively small, but it must be
`comprehensible and revisable directly by the system;
`therefore, it is maintained by Byzantine-replicated
`state-machines [8, 36] and specialized cryptographic
`techniques that permit metadata syntax enforcement
`without compromising privacy [15]. One of Farsite’s
`key design objectives is to provide the benefits of
`Byzantine fault-tolerance while avoiding the cost of full
`Byzantine agreement in the common case, by using
`signed and dated certificates to cache the authorization
`granted through Byzantine operations.
`
`Both Farsite’s intended workload and its expected
`machine characteristics are those typically observed on
`desktop machines in academic and corporate settings.
`These workloads exhibit high access locality, a low
`persistent update rate, and a pattern of read/write
`sharing that is usually sequential and rarely concurrent
`[22, 48]. The expected machine characteristics include
`a high fail-stop rate (often just a user turning a machine
`off for a while) [6] and a low but significant rate [41] of
`malicious or opportunistic subversion. In our design,
`analysis, evaluation, and discussion, we focus on this
`environment, but we note that corporate administrators
`might choose to supplement Farsite’s reliability and
`security by adding userless machines to the system or
`even running entirely on machines in locked rooms.
`
`Farsite requires no central administration beyond that
`needed to initially configure a minimal system and to
`authenticate new users and machines as they join the
`system. Administration is mainly an issue of signing
`certificates: Machine certificates bind machines to their
`public keys; user certificates bind users to their public
`keys; and namespace certificates bind namespace roots
`to their managing machines. Beyond initially signing
`the namespace certificate and subsequently signing
`certificates for new machines and users, no effort is
`required from a central administrator.
`
`There are many directions we could have explored in
`the Farsite design space that we have chosen not to.
`Farsite is not a high-speed parallel I/O system such as
`SGI's XFS [43], and it does not efficiently support
`large-scale write sharing of files. Farsite is intended to
`emulate the behavior of a traditional local file system,
`in particular NTFS [42]; therefore, it introduces no new
`user-visible semantics, such as an object-model
`interface, transactional support, versioning [39], user-
`specifiable file importance, or Coda-like [22] hooks for
application-specific conflict resolvers to support concurrent file updates during disconnected operation.
`
`We have implemented most – but not all – of the design
`described in this paper. The exceptions, which mainly
`relate to scalability and crash recovery, are itemized in
`section 6 and identified throughout the text with the
`term Farsite design, indicating a mechanism that we
`have designed but not yet implemented.
`
`The following section presents a detailed overview of
`the system. Section 3, the bulk of the paper, describes
`the mechanisms that provide Farsite’s key features.
`Section 4 describes our prototype implementation.
`Section 5 analytically evaluates our design’s scalability
`and empirically evaluates our prototype’s performance.
`We discuss future work in section 6, related work in
`section 7, and conclusions in section 8.
`
`2. System Overview
`This section provides a background and overview of
`Farsite by itemizing underlying design assumptions,
`highlighting two enabling technology trends, explaining
`the fundamental concept of namespace roots, describing
`Farsite’s certification model and mechanisms, outlining
`the system architecture, and describing several semantic
`differences between Farsite and a local file system.
`
`2.1 Design Assumptions
`Most of our design assumptions stem from the fact that
`Farsite is intended to run on the desktop workstations of
`a large corporation or university. Thus, we assume a
maximum scale of ~10^5 machines, none of which are
`dedicated servers, and all of which are interconnected
`by a high-bandwidth, low-latency network whose
`topology can be ignored. Machine availability is
`assumed to be lower than that of dedicated servers but
`higher than that of hosts on the Internet; specifically,
`we expect the majority of machines to be up and
`accessible for the majority of the time. We assume
`machine downtimes to be generally uncorrelated and
`permanent machine failures to be mostly independent.
`Although Farsite tolerates large-scale read-only sharing
`and small-scale read/write sharing, we assume that no
`files are both read by many users and also frequently
`updated by at least one user. Empirical data [6, 48]
`corroborates this set of assumptions.
`
`We assume that a small but significant fraction of users
`will maliciously attempt to destroy or corrupt file data
`or metadata, a reasonable assumption for our target
`environment but an unreasonably optimistic one on the
`open Internet. We assume that a large fraction of users
`may independently and opportunistically attempt to
`read file data or metadata to which they have not been
`granted access. Each machine is assumed to be under
`the control of its immediate user, and a party of
`malicious users can subvert only machines owned by
`members of the party.
`
`We assume that the local machine of each user can be
trusted to perform correctly for that user, and that no user-sensitive data persists beyond user logoff or system
`reboot. This latter assumption is known to be false
`without a technique such as crypto-paging [50], which
`is not employed by any commodity operating system
`including Windows, the platform for our prototype.
`
`2.2 Enabling Technology Trends
`Two technology trends are fundamental in rendering
`Farsite's design practical: a general increase in unused
`disk capacity and a decrease in the computational cost
`of cryptographic operations relative to I/O.
`
`A very large fraction of the disk space on desktop
`workstations is unused, and disk capacity is increasing
`at a faster rate than disk usage. In 1998, measurements
`of 4800 desktop workstations at Microsoft [13] showed
`that 49% of disk space was unused. In 1999 and 2000,
`we measured the unused portions to be 50% and 58%.
`
`As computation speed has increased with Moore’s Law,
`the cost of cryptographic operations has fallen relative
`to the cost of the I/O operations that are the mainstay of
`a file system. We measured a modern workstation and
`found a symmetric encryption bandwidth of 72 MB/s
`and a one-way hashing bandwidth of 53 MB/s, both of
`which exceed the disk’s sequential-I/O bandwidth of 32
`MB/s. Encrypting or decrypting 32 KB of data adds
`roughly one millisecond of latency to the file-read/write
`path. Computing an RSA signature with a 1024-bit key
`takes 6.5 milliseconds, which is less than one rotation
`time for a 7200-RPM disk.
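
These figures are simple to reproduce. The following sketch (illustrative code only, assuming SHA-256 as the one-way hash, which the text above does not specify) measures hashing throughput and derives the per-32-KB latency the same way; a symmetric cipher can be timed analogously.

    import hashlib, os, time

    def hash_bandwidth_mb_per_s(total_mb=256, block_kb=32):
        """Measure one-way hashing throughput over random 32 KB blocks."""
        block = os.urandom(block_kb * 1024)
        iterations = (total_mb * 1024) // block_kb
        start = time.perf_counter()
        for _ in range(iterations):
            hashlib.sha256(block).digest()
        return total_mb / (time.perf_counter() - start)

    bw = hash_bandwidth_mb_per_s()
    print("hashing bandwidth: %.1f MB/s" % bw)
    print("latency added per 32 KB block: %.2f ms" % (32.0 / 1024.0 / bw * 1000.0))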
`
`The large amount of unused disk capacity enables the
`use of replication for reliability, and the relatively low
`cost of strong cryptography enables distributed security.
`
`2.3 Namespace Roots
`The primary construct established by a file system is a
`hierarchical directory namespace, which is the logical
`repository for files. Since a namespace hierarchy is a
`tree, it has to have a root. Rather than mandating a
`single namespace root for any given collection of
`machines that form a Farsite system, we have chosen to
`allow the flexibility of multiple roots, each of which
`can be regarded as the name of a virtual file server that
`is collaboratively created by the participating machines.
`An administrator creates a namespace root simply by
`specifying a unique (for that administrator) root name
`and designating a set of machines to manage the root.
`These machines will form a Byzantine-fault-tolerant
`group (see subsection 2.5), so it is not crucial to select a
`specially protected set of machines to manage the root.
`
`2.4 Trust and Certification
`The security of any distributed system is essentially an
`issue of managing trust. Users need to trust in the
`authority of machines that offer to present data and
`metadata. Machines need to trust the validity of
`requests from remote users to modify file contents or
`regions of the namespace. Security components that
`rely on redundancy need to trust that an apparently
`distinct set of machines is truly distinct and not a single
`malicious machine pretending to be many, a potentially
`devastating attack known as a Sybil attack [11].
`
`Farsite manages trust using public-key-cryptographic
`certificates. A certificate is a semantically meaningful
`data structure that has been signed using a private key.
`
`The principal types of certificates are namespace
`certificates, user certificates, and machine certificates.
`A namespace certificate associates the root of a file-
`system namespace with a set of machines that manage
`the root metadata. A user certificate associates a user
`with his personal public key, so that the user identity
`can be validated for access control. A machine
`certificate associates a machine with its own public
`key, which is used for establishing the validity of the
`machine as a physically unique resource.
`
`Trust is bootstrapped by fiat: Machines are instructed
`to accept the authorization of any certificate that can be
`validated with one or more particular public keys. The
`corresponding private keys are known as certification
`authorities (CAs). Consider a company with a central
`certification department, which generates a public/
`private key pair to function as a CA. Every employee
`generates a public/private key pair, and the CA signs a
`certificate associating the employee’s username with
`the public key. For a small set of machines, the CA
`generates public/private key pairs and embeds the
`public keys in a namespace certificate that designates
`these machines as managers of its namespace root. The
`CA then performs the following four operations for
`every machine in the company: (1) Have the machine
`generate a public/private key pair; (2) sign a certificate
`associating the machine with the public key; (3) install
`the machine certificate and root namespace certificate
`on the machine; and (4) instruct the machine to accept
`any certificate signed by the CA.
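
As an illustration of this certification pattern (a sketch in our own notation, not Farsite's actual certificate format), the following code signs and validates a machine certificate with RSA keys, corresponding to steps (1), (2), and (4) above.

    import json
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    def make_certificate(ca_private_key, subject, subject_public_key, expires):
        """CA signs a statement binding a subject (user or machine) to its public key."""
        body = json.dumps({
            "subject": subject,
            "public_key": subject_public_key.public_bytes(
                serialization.Encoding.PEM,
                serialization.PublicFormat.SubjectPublicKeyInfo).decode(),
            "expires": expires,
        }, sort_keys=True).encode()
        signature = ca_private_key.sign(body, padding.PKCS1v15(), hashes.SHA256())
        return body, signature

    def validate_certificate(ca_public_key, body, signature):
        """Raises InvalidSignature unless the certificate was signed by the CA."""
        ca_public_key.verify(signature, body, padding.PKCS1v15(), hashes.SHA256())
        return json.loads(body)

    # Example: a CA certifies one (hypothetically named) machine's key.
    ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    machine_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    body, sig = make_certificate(ca_key, "machine-042", machine_key.public_key(), "2026-01-01")
    print(validate_certificate(ca_key.public_key(), body, sig)["subject"])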
`
`Farsite’s certification model is more general than the
`above paragraph suggests, because machine certificates
`are not signed directly by CAs but rather are signed by
`users whose certificates designate them as authorized to
`certify machines, akin to Windows’ model for domain
`administration. This generalization allows a company
`to separate the responsibilities of authenticating users
`and machines, e.g., among HR and IT departments. A
`single machine can participate in multiple CA domains,
`but responsibility granted by one CA is only delegated
`to other machines authorized by that same CA.
`
`A machine’s private key is stored on the machine itself.
`In the Farsite design, each user private key is encrypted
`with a symmetric key derived from the user’s password
`and then stored in a globally-readable directory in
`Farsite, where it can be accessed upon login. CA
`private keys should be kept offline, because the entire
`security of a Farsite installation rests on their secrecy.
`
`User or machine keys can be revoked by the issuing
`CA, using the standard techniques [29]: Signed
`revocation lists are periodically posted in a prominent,
`highly replicated location. All certificates include an
`expiration date, which allows revocation lists to be
`garbage collected.
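
The resulting validity test reduces to two conditions, sketched below with hypothetical field names ("serial", "expires"): a certificate is rejected if it is past its expiration date or if it appears on the latest signed revocation list.

    from datetime import date

    def certificate_is_valid(cert, revoked_serials, today=None):
        """A certificate is rejected if it has expired or has been revoked by its CA."""
        today = today or date.today()
        if date.fromisoformat(cert["expires"]) < today:
            return False   # expired entries can also be dropped from revocation lists
        return cert["serial"] not in revoked_serials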
`
`2.5 System Architecture
`We now present a high-level overview of the Farsite
`architecture, beginning with a simplified system and
`subsequently introducing some of Farsite’s key aspects.
`2.5.1 Basic System
`Every machine in Farsite may perform three roles: It is
`a client, a member of a directory group, and a file host,
but we initially ignore the last of these. A client is a
`machine that directly interacts with a user. A directory
`group is a set of machines that collectively manage file
`information using a Byzantine-fault-tolerant protocol
`[8]: Every member of the group stores a replica of the
`information, and as the group receives client requests,
`each member processes these requests deterministically,
`updates its replica, and sends replies to the client. The
`Byzantine protocol guarantees data consistency as long
`as fewer than a third of the machines misbehave.
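
Concretely, a group of n replicas tolerates f Byzantine faults only when n >= 3f + 1 [8]; the helper below (illustrative only) computes the tolerated number of faulty members for a given group size.

    def tolerated_byzantine_faults(group_size: int) -> int:
        """Largest f satisfying group_size >= 3f + 1."""
        return (group_size - 1) // 3

    assert tolerated_byzantine_faults(4) == 1
    assert tolerated_byzantine_faults(7) == 2    # directory-group sizes mentioned
    assert tolerated_byzantine_faults(10) == 3   # in section 2.5.2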
`
`Consider a system that includes several clients and one
`directory group. For the moment, imagine that the
`directory group manages all of the file-system data and
`metadata, storing it redundantly on all machines in the
`group. When a client wishes to read a file, it sends a
`message to the directory group, which replies with the
`contents of the requested file. If the client updates the
`file, it sends the update to the directory group. If
`another client tries to open the file while the first client
`has it open, the directory group evaluates the specified
`sharing semantics requested by the two clients to
`determine whether to grant the second client access.
`2.5.2 System Enhancements
`The above-described system provides reliability and
`data integrity through Byzantine replication. However,
`it may have performance problems since all file-system
`requests involve remote Byzantine operations; it does
`not provide data privacy from users who have physical
`access to the directory group members; it does not have
`the means to enforce user access control; it consumes a
`large amount of storage space since files are replicated
on every directory group member; and it does not scale.
`
`Farsite enhances the basic system in several ways. It
`adds local caching of file content on the client to
`improve read performance. Farsite’s directory groups
`issue leases on files to clients, granting them access to
`the files for a specified period of time, so a client with
`an active lease and a cached copy of a file can perform
`operations entirely locally. Farsite delays pushing
`updates to the directory group, because most file writes
`are deleted or overwritten shortly after they occur [4,
`48]. To protect user privacy and provide read-access
`control, clients encrypt written file data with the public
`keys of all authorized readers; and the directory group
`enforces write-access control by cryptographically
`validating requests from users before accepting updates.
`
`Since Byzantine groups fail to make progress if a third
`or more of their members fail, directory groups require
`a high replication factor (e.g., 7 or 10) [8]. However,
`since the vast majority of file-system data is opaque file
`content, we can offload the storage of this content onto
`a smaller number of other machines (“file hosts”), using
`a level of indirection. By keeping a cryptographically
`secure hash of the content on the directory group,
`putative file content returned from a file host can be
`validated by the directory group, thereby preventing the
`file host from corrupting the data. Therefore, Farsite
`can tolerate the failure of all but one machine in a set of
`file hosts, thus allowing a far lower replication factor
`(e.g., 3 or 4) [6] for the lion’s share of file-system data.
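
The indirection can be sketched as follows (function and variable names are ours): the directory group retains only a secure hash, so content fetched from any file host, honest or not, can be checked before use.

    import hashlib

    def record_file_content(directory_state, file_id, content):
        """Directory group (Byzantine-replicated) keeps just the content hash."""
        directory_state[file_id] = hashlib.sha256(content).hexdigest()

    def accept_from_file_host(directory_state, file_id, putative_content):
        """Any single honest file host suffices; corrupted replicas are detected."""
        return hashlib.sha256(putative_content).hexdigest() == directory_state[file_id]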
`
`As machines and users join a Farsite system, the
`volume of file-system metadata will grow. At some
`point, the storage and/or operation load will overwhelm
`the directory group that manages the namespace. The
`Farsite design addresses this problem by allowing the
`directory group to delegate a portion of its namespace
`to another directory group. Specifically, the group
`randomly selects, from all of the machines it knows
`about, a set of machines that it then instructs to form a
`new directory group. The first group collectively signs
`a new namespace certificate delegating authority over a
`portion of its namespace to the newly formed group.
`This group can further delegate a portion of its region
`of the namespace. Because each file-system namespace
`forms a hierarchical tree, each directory group can
`delegate any sub-subtree of the subtree that it manages
`without involving any other extant directory groups.
`
`In this enhanced system, when a client wishes to read a
`file, it sends a message to the directory group that
`manages the file’s metadata. This group can prove to
`the client that it has authority over this directory by
`presenting a namespace certificate signed by its parent
`group, another certificate signed by its parent’s parent,
`and so on up to the root namespace certificate signed by
`a CA that the client regards as authoritative. The group
`replies with these namespace certificates, a lease on the
`file, a one-way hash of the file contents, and a list of
`file hosts that store encrypted replicas of the file. The
`client retrieves a replica from a file host and validates
`its contents using the one-way hash. If the user on the
`client machine has read access to the file, then he can
`use his private key to decrypt the file. If the client
`updates the file, then, after a delay, it sends an updated
`hash to the directory group. The directory group
`verifies the user’s permission to write to the file, and it
`instructs the file hosts to retrieve copies of the new data
`from the client. If another client tries to open the file
`while the first client has an active lease, the directory
`group may need to contact the first client to see whether
`the file is still in use, and the group may recall the lease
`to satisfy the new request.
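
The client's certificate-chain check can be sketched as below (attribute names and the verify callback are hypothetical): the root certificate must be signed by a trusted CA, and each subsequent certificate must be signed by the delegating group and must cover a subtree of that group's region.

    def chain_is_authoritative(chain, trusted_ca_key, verify):
        """chain[0] is the root namespace certificate; chain[-1] covers the
        directory being opened. verify(key, cert) checks cert's signature."""
        if not verify(trusted_ca_key, chain[0]):
            return False
        for parent, child in zip(chain, chain[1:]):
            if not verify(parent.group_key, child):
                return False
            if not child.subtree.startswith(parent.subtree):  # simplified prefix test
                return False
        return True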
`
`2.6 Semantic Differences from NTFS
`Despite our goal of emulating a local NTFS file system
`as closely as possible, performance or behavioral issues
`have occasionally motivated semantics in Farsite that
`diverge from those of NTFS.
`
`If many (e.g., more than a thousand) clients hold a file
`open concurrently, the load of consistency management
`– via querying and possibly recalling leases – becomes
`excessive, reducing system performance. To prevent
`this, the Farsite design places a hard limit on the
`number of clients that can have a file open for
`concurrent writing and a soft limit on the number that
`can have a file open for concurrent reading. Additional
`attempts to open the file for writing will fail with a
`sharing violation. Additional attempts to open the file
`for reading will not receive a handle to the file; instead,
`they will receive a handle to a snapshot of the file,
`taken at the time of the open request. Because it is only
`a snapshot, it will not change to reflect updates by
`remote writers, and so it may become stale. Also, an
`open snapshot handle will not prevent another client
`from opening the file for exclusive access, as a real file
`handle would. An application can query the Farsite
`client to find out whether it has a snapshot handle or a
`true file handle, but this is not part of NTFS semantics.
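
A toy sketch of this policy (limit values and class names are ours, purely illustrative): writers beyond the hard limit get a sharing violation, while readers beyond the soft limit get a snapshot handle that neither tracks updates nor blocks exclusive opens.

    class SharingViolation(Exception):
        pass

    class OpenFileState:
        HARD_WRITE_LIMIT = 1000      # illustrative values only
        SOFT_READ_LIMIT = 1000

        def __init__(self, contents=b""):
            self.contents, self.writers, self.readers = contents, 0, 0

        def open(self, mode):
            if mode == "write":
                if self.writers >= self.HARD_WRITE_LIMIT:
                    raise SharingViolation("too many concurrent writers")
                self.writers += 1
                return ("file-handle", self)
            if self.readers >= self.SOFT_READ_LIMIT:
                return ("snapshot-handle", self.contents)   # may become stale
            self.readers += 1
            return ("file-handle", self)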
`
`NTFS does not allow a directory to be renamed if there
`is an open handle on a file in the directory or in any of
its descendants. In a system of 10^5 machines, there will
`almost always be an open handle on a file somewhere
`in the namespace, so these semantics would effectively
`prevent a directory near the root of the tree from ever
`being renamed. Thus, we instead implement the Unix-
`like semantics of not name-locking an open file’s path.
`
`The results of directory rename operations are not
`propagated synchronously to all descendent directory
`groups during the rename operation, because this would
`unacceptably retard the rename operation, particularly
`for directories near the root of the namespace tree.
`Instead, they are propagated lazily, so they might not be
`immediately visible to all clients.
`
`Windows applications can register to be informed about
`changes that occur in directories or directory subtrees.
`This notification is specified to be best-effort, so we
`support it without issuing read leases to registrants.
`However, for reasons of scalability, we only support
`notification on single directories and not on subtrees.
`
`3. File System Features
`This section describes the mechanisms behind Farsite’s
`key features, which include reliability and availability,
`security, durability, consistency, scalability, efficiency,
`and manageability.
`
`3.1 Reliability and Availability
`Farsite achieves reliability (long-term data persistence)
`and availability (immediate accessibility of file data
`when requested) mainly through replication. Directory
`metadata is replicated among members of a directory
`group, and file data is replicated on multiple file hosts.
`Directory groups employ Byzantine-fault-tolerant
`replication, and file hosts employ raw replication.
`
`For redundantly storing file data, Farsite could have
`used an erasure coding scheme [3] rather than raw
`replication. We chose the latter in part because it is
`simpler and in part because we had concerns about the
`additional latency introduced by fragment reassembly
`of erasure-coded data during file reads. Some empirical
`data [25] suggests that our performance concerns might
`have been overly pessimistic, so we may revisit this
`decision in the future. The file replication subsystem is
`a readily separable component of both the architecture
`and the implementation, so it will be straightforward to
`replace if we so decide.
`
`With regard to reliability, replication guards against the
`permanent death of individual machines, including
`data-loss failures (such as head crashes) and explicit
`user decommissioning. With regard to availability,
`replication guards against the transient inaccessibility of
`individual machines, including system crashes, network
`partitions, and explicit shutdowns. In a directory group
of R_D members, metadata is preserved and accessible if no more than ⌊(R_D – 1) / 3⌋ of the machines die. For files replicated on R_F file hosts, file data is preserved
`and accessible if at least one file host remains alive.
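
Under the (idealized) assumption that machines die independently with probability p, these two conditions give the simple loss estimates sketched below; the numbers are illustrative, not measurements.

    from math import comb

    def metadata_loss_probability(r_d: int, p: float) -> float:
        """Lost if more than floor((r_d - 1) / 3) of the R_D directory replicas die."""
        f = (r_d - 1) // 3
        return sum(comb(r_d, k) * p**k * (1 - p)**(r_d - k)
                   for k in range(f + 1, r_d + 1))

    def file_loss_probability(r_f: int, p: float) -> float:
        """Lost only if all R_F raw replicas die."""
        return p ** r_f

    print(metadata_loss_probability(7, 0.05))   # 7-member directory group
    print(file_loss_probability(3, 0.05))       # 3 file-host replicas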
`
`In the Farsite design, when a machine is unavailable for
`an extended period of time, its functions migrate to one
`or more other machines, using the other replicas of the
`file data and directory metadata to regenerate that data
`and metadata on the replacement machines. Thus, data
`is lost permanently only if too many machines fail
`within too small a time window to permit regeneration.
`
`Because the volume of directory data is much smaller
`than that of file data, directory migration is performed
`more aggressively than file-host migration: Whenever
`a directory group member is down or inaccessible for
`even a short time, the other members of the group select
`a replacement randomly from the set of accessible
`machines they know about. Since low-availability
`machines are – by definition – up for a smaller fraction
`of time than high-availability machines, they are more
`likely to have their state migrated to another machine
`and less likely to be an accessible target for migration
`from another machine. These factors bias directory-
group membership toward highly available machines,
`without introducing security-impairing non-randomness
into member selection. This bias is desirable since
`Byzantine agreement is only possible when more than
`two thirds of the replicas are operational. The increase
`in the (light) metadata workload on high-availability
`machines is compensated by a small decrease in their
`(heavy) file storage and replication workload.
`
`Farsite improves global file availability by continuously
`relocating file replicas at a sustainable background rate
`[14]. Overall mean file uptime is maximized by
`achieving an equitable distribution of file availability,
`because low-availability files degrade mean file uptime
`more than high-availability files improve it. Therefore,
`Farsite successively swaps the machine locations of
`replicas of high-availability files with those of low-
`availability files, which progressively equalizes file
`availability. Simulation experiments [14] driven by
`actual measurements of desktop machine availability
`show that Farsite needs to swap 1% of file replicas per
`day to compensate for changes in machine availability.
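
A caricature of the swapping policy, with hypothetical data structures (each file exposes its replica sites, and each machine an uptime estimate): exchanging a well-placed replica of the most-available file for a poorly placed replica of the least-available one nudges all files toward the mean.

    def swap_once(files):
        """One background swap step; 'files' is a list of objects with
        .availability() and .replica_sites (machines with an .uptime field)."""
        files.sort(key=lambda f: f.availability())
        low, high = files[0], files[-1]
        best = max(high.replica_sites, key=lambda m: m.uptime)
        worst = min(low.replica_sites, key=lambda m: m.uptime)
        high.replica_sites.remove(best); low.replica_sites.remove(worst)
        high.replica_sites.append(worst); low.replica_sites.append(best)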
`
`File availability is further improved by caching file data
`on client disks. These caches are not fixed in size but
`rather hold file content for a specified interval called
`the cache retention period (roughly one week) [6].
`
`3.2 Security
`3.2.1 Access Control
`Farsite uses different mechanisms to enforce write- and
`read-access controls. Because directory groups only
`modify their shared state via a Byzantine-fault-tolerant
`protocol, we trust the group not to make an incorrect
`update to directory metadata. This metadata includes
`an access control list (ACL) of public keys of all users
`who are authorized writers to that directory and to files
`therein. When a client establishes cryptographically
`authenticated channels to a directory group’s members,
`the channel-establishment protocol involves a user’s
`private key, thereby authenticating the messages on that
`channel as originating from a specific user. The
`directory group validates the authorization of a user’s
`update request before accepting the update.
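
In outline (names are ours), every directory-group replica runs the same deterministic check before applying a metadata update.

    def authorize_update(directory_meta, authenticated_user_key, update, apply_update):
        """authenticated_user_key comes from the signed channel establishment;
        the write ACL is part of the Byzantine-replicated directory metadata."""
        if authenticated_user_key not in directory_meta["write_acl"]:
            raise PermissionError("requester is not an authorized writer")
        apply_update(directory_meta, update)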
`
`Because a single compromised directory-group member
can inappropriately disclose information, Farsite
`enforces read-access control via strong cryptography, as
`described in the next subsection.
`3.2.2 Privacy
`Both file content and user-sensitive metadata (meaning
`file and directory names) are encrypted for privacy.
`
`When a client creates a new file, it randomly generates
`a symmetric file key with which it encrypts the file. It
`then encrypts the file key using the public keys of all
`authorized readers of the file, and it stores the file key
`encryptions with the file, so a user with a corresponding
private key can decrypt the file key and therewith the file. Actually, there is one more level of indirection
`because of the need to identify and coalesce identical
`files (see subsection 3.6.1) even if they are encrypted
`with different user keys: The client first computes a
`one-way hash of each block of the file, and this hash is
`used as a key for encrypting the block. The file key is
`used to encrypt the hashes rather than to encrypt the file
`blocks directly. We call this technique convergent
encryption [12], because identical file plaintext
`converges to identical ciphertext, irrespective of the
`user keys. Performing encryption on a block level
`enables a client to write an individual block without
`having to rewrite the entire file. It also enables the
`client to read individual blocks without having to wait
`for the download of an entire file from a file host.
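
A block-level sketch of convergent encryption, assuming SHA-256 and AES-CTR (the text does not mandate particular algorithms): the key and nonce for each block are derived from the block's own plaintext, so identical blocks encrypt identically; only the small per-block keys are then protected under the per-file key.

    import hashlib
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def convergent_encrypt_block(block: bytes):
        """Identical plaintext blocks yield identical ciphertext, enabling coalescing."""
        block_key = hashlib.sha256(block).digest()                  # 32-byte AES key
        nonce = hashlib.sha256(b"nonce" + block).digest()[:16]
        encryptor = Cipher(algorithms.AES(block_key), modes.CTR(nonce)).encryptor()
        return block_key, nonce, encryptor.update(block)

    def convergent_decrypt_block(block_key, nonce, ciphertext):
        decryptor = Cipher(algorithms.AES(block_key), modes.CTR(nonce)).decryptor()
        return decryptor.update(ciphertext)

    # The per-block keys (not the blocks themselves) are then encrypted under the
    # file key, and the file key under each authorized reader's public key.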
`
`To prevent members of a directory group from viewing
`file or directory names, they are encrypted by clients
`before being sent to the group, using a symmetric key
`that is encrypted with the public keys of authorized
`directory readers and stored in the directory metadata
`[15]. To prevent a malicious client from encrypting a
`syntactically illegal name [31], the Farsite design uses a
`technique called exclusive encryption, which augments
`a cryptosystem in a way that guarantees that decryption
`can only produce legal names no matter what bit string
`is given as putative ciphertext [15].
`3.2.3 Integrity
`As long as fewer than one third of the members of a
`directory group are faulty or malicious, the integrity of
`directory metadata is maintained by the Byzantine-
`fault-tolerant protocol. Integrity of file data is ensured
`by computing a Merkle hash tree [30] over the file data
`blocks, storing a copy of the tree with the file, and
`keeping a copy of the root hash in the directory group
`that manages the file’s metadata. Because of the tree,
`the cost of an in-place file-block update is logarithmic –
`rather than linear – in the file size. The hash tree also
`enables a client to validate any file block in logarithmic
`time, without waiting to download the entire file. The
`time to validate the entire file is linear in the file size,
`not log-linear, because the count of internal hash nodes
is proportional to the count of leaf nodes.
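
The Merkle-tree mechanics can be sketched as follows (illustrative code using SHA-256; Farsite's actual hash and block size are not specified here): the directory group keeps only the root, and verify_block checks one block against that root with a logarithmic-length authentication path.

    import hashlib

    def _h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def build_tree(blocks):
        """Returns a list of levels; level[0] holds the leaf hashes, level[-1] the root."""
        level = [_h(b) for b in blocks]
        levels = [level]
        while len(level) > 1:
            if len(level) % 2:                       # duplicate last hash on odd levels
                level = level + [level[-1]]
            level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
            levels.append(level)
        return levels

    def auth_path(levels, index):
        """Sibling hashes from leaf to root: O(log n) of them."""
        path = []
        for level in levels[:-1]:
            if len(level) % 2:
                level = level + [level[-1]]
            path.append((index % 2, level[index ^ 1]))
            index //= 2
        return path

    def verify_block(block, path, root):
        """Client-side check: recompute the root from one block and its path."""
        node = _h(block)
        for is_right, sibling in path:
            node = _h(sibling + node) if is_right else _h(node + sibling)
        return node == root

    blocks = [bytes([i]) * 4096 for i in range(5)]
    levels = build_tree(blocks)
    root = levels[-1][0]
    assert verify_block(blocks[3], auth_path(levels, 3), root)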