`
`
`
`
`
`
`
`
`THE ADVANCED COMPUTING SYSTEMS ASSOCIATION
`
`The following paper was originally published in the
`Proceedings of the USENIX Annual Technical Conference
`
`Monterey, California, USA, June 6-11, 1999
`
`Operation-based Update Propagation
`in a Mobile File System
`
`Yui-Wah Lee, Kwong-Sak Leung
`The Chinese University of Hong Kong
`Mahadev Satyanarayanan
`Carnegie Mellon University
`
`© 1999 by The USENIX Association
`All Rights Reserved
`
`Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial
`reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper.
`USENIX acknowledges all trademarks herein.
`
`For more information about the USENIX Association:
`Phone: 1 510 528 8649
`FAX: 1 510 548 5738
`Email: office@usenix.org WWW: http://www.usenix.org
`
`
`
`
Operation-based Update Propagation
in a Mobile File System *

Yui-Wah Lee, Kwong-Sak Leung
The Chinese University of Hong Kong
{clement,ksleung}@cse.cuhk.edu.hk

Mahadev Satyanarayanan
Carnegie Mellon University
satya+@cs.cmu.edu
`
`Abstract
`
`In this paper we describe a technique called operation-
`based update propagation for efficiently transmitting up-
`dates to large files that have been modified on a weakly
`connected client of a distributed file system. In this tech-
`nique, modifications are captured above the file-system
`layer at the client, shipped to a surrogate client that is
`strongly connected to a server, re-executed at the surro-
`gate, and the resulting files transmitted from the surro-
`gate to the server. If re-execution fails to produce a file
`identical to the original, the system falls back to ship-
`ping the file from the client over the slow network. We
`have implemented a prototype of this mechanism in the
`Coda File System on Linux, and demonstrated perfor-
mance improvements ranging from 40 percent to nearly
`three orders of magnitude in reduced network traffic and
`elapsed time. We also found a novel use of forward error
`correction in this context.
`
`1 Introduction
`
`The use of a distributed file system on a mobile client is
`often hindered by poor network connectivity. Although
`disconnected operation [8] is feasible, a mobile client
`with an extensive amount of updates should not defer
`propagating them to a server for too long. Damage, theft
`or destruction of the client before update propagation
`will result in loss of those updates. Further, their timely
`propagation may be critical to successful collaboration
and in reducing the likelihood of update conflicts.

* This research was partially supported by the Defense Advanced Research Projects Agency (DARPA), Air Force Materiel Command, USAF under agreement number F19628-96-C-0061, the Intel Corporation, and the Novell Corporation. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA, Intel, Novell, or the U.S. Government.
`
`Propagation of updates from a mobile client is often
`impeded by weak connectivity in the form of wireless
`or wired networks that are low-bandwidth or intermit-
`tent. Aggressive update propagation under these condi-
tions increases the demand on scarce bandwidth. A major component of bandwidth demand is the shipping of
`large files in their entirety. Two obvious solutions to the
`problem are delta shipping and data compression. The
`former tries to ship only the incremental differences be-
`tween versions of files, and the latter “compresses out”
`the redundancies of files before shipping the files. Un-
`fortunately, as discussed later in this paper, both of these
`methods have shortcomings that limit their usefulness.
`
`In this paper, we focus on a radically different solution,
`which is called operation-based update propagation (or
`operation shipping for brevity). It is motivated by two
`observations. First, large files are often created or mod-
`ified by user operations that can be easily intercepted
`and compactly represented. Second, the cost of shipping
`and re-executing the user operations is often significantly
`smaller than that of shipping the large files over a weak
`network.
`
`To propagate a file, the client ships the operation to a
`surrogate client that is well connected to the server, as
`shown in Figure 1. The surrogate re-executes the opera-
`tion, validates that the re-generated file is identical to the
`original, and then propagates the file via its high-speed
`connection to the server. If the file re-generated by the
`surrogate does not match that from the original execu-
`tion, the system falls back to shipping the original from
`the client to the server over the slow connection. This
`validation and fall-back mechanism is essential to ensure
`the correctness of update propagation.
`
`
`
`
[Figure 1 sketch: a client holding updated large files is connected by a weak network to a surrogate, which in turn is connected by a strong network to the server. Step 1: logging of user operations at the client. Step 2: shipping of the compact operation log to the surrogate. Step 3: re-execution of the user operations at the surrogate, re-generating the large files. Step 4: validation of the re-execution and shipping of the re-generated files to the server.]
`Figure 1: An overview of operation shipping
`
`The use of a surrogate in our approach ensures that
`server scalability and security are preserved — servers
`are not required to execute potentially malicious and
`long-running code, nor are they required to instantiate
`the execution environments of the numerous clients they
`service.
`In our model, we assume each mobile client
`has pre-arranged for a suitable surrogate of an appropri-
`ate machine type, at an adequate level of security, pos-
`sessing suitable authentication tickets, and providing an
`appropriate execution environment. This assumption is
reasonable, since many users of mobile clients also own powerful and well-connected personal desktops in their offices, and they can arrange for these otherwise idle personal machines to serve as surrogates.
`
`Even though the idea of operation shipping is concep-
`tually simple, there are many details that have to be ad-
`dressed to make it work in practice. In the rest of this
`paper, we describe how we handle these details by imple-
menting a prototype based on the Coda File System. Our experiments confirm the substantial performance gains
`of this approach. Depending on the metric and the spe-
`cific application, we have observed improvements rang-
`ing from a factor of three to nearly three orders of mag-
`nitude.
`
`2 Coda background
`
`We have implemented our prototype by extending the
`Coda File System [3]. Although our experience is based
`on Coda, the general principles should also be applicable
`to other mobile file systems. We briefly describe Coda in
`this section; more information is available in the litera-
`ture [3, 8, 7, 14, 13].
`
The Coda model is that there are many clients and a few servers. On each client, a cache manager, called Venus, carefully manages and persistently stores cached
`file-system objects. To support mobile computing, Coda
`clients can be used in disconnected and weakly connected
`mode, where Venus emulates the servers and allows file-
`system operations on cached objects. Updates are ap-
`plied immediately to locally cached objects, and are also
`logged in a client-modify log (CML). The logging mech-
`anism allows propagation of updates to servers to be de-
`ferred until a convenient time, such as when network
`connectivity is restored. Venus propagates these updates
`with a mechanism called trickle reintegration [14, 13].
`When propagation is attempted, a prefix of the log is
`shipped to the server in temporal order, the size of the
`prefix being determined by available bandwidth.
`
`The effect of each mutating file-system operation is rep-
`resented as a record in the CML. For example, a chmod
`operation is logged as a CHMOD record, a mkdir opera-
`tion is logged as a MKDIR record. Furthermore, if there
`have been some intervening write operations made to
`an open’ed file, a subsequent close operation will be
`logged as a STORE record.
`
`Records of the type STORE are special, because the as-
`sociated data include the contents of the files. There-
`fore, these records are much bigger than records of other
`types. STORE records can be as large as several kilo-
`bytes or even megabytes, whereas other records are typ-
`ically smaller than a few hundred bytes. Although the
`contents of a file logically constitute a part of the CML,
`they are physically stored in a separate container file in
`the client’s local file system.
`
`Although trickle reintegration has been shown to be ef-
`fective in decoupling the foreground file-system activi-
`ties from the slow propagations of updates, it still suffers
`from an important limitation: updated files are propa-
`gated in their entirety. In other words, although the users
`perceive a fast response time from the file system, the ac-
`tual propagations of the updates are very slow, and they
`generate heavy network traffic. On a weak network, we
`need a more efficient scheme for shipping updates.
`
`3 Design and Implementation
`
`3.1 Overview
`
`
`We organize our discussion according to the four steps
`shown in Figure 1:
`
`
`
`
`1. Logging of user operations. When a file is updated
`on a client, the client keeps the new file value V (the
`contents of the file) and also logs the user operation
`O.
`
`
`2. Shipping of operation log. If the network band-
`width is low, the client does not ship V to the server.
`Instead, it ships O, the fingerprint of V , and other
`meta-data to a surrogate client.
`
3. Re-execution of operations. The surrogate re-executes O and produces a file value V′.
`
4. Validation of re-execution. The surrogate validates the re-execution by checking the fingerprints and meta-data. If it accepts the re-execution, then it will present V′ to the server; otherwise, it will report failure to the client, which will then fall back to value shipping and ship V directly to the server.
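To make these steps concrete, the following is a minimal sketch of the decision between operation shipping and the value-shipping fallback. It is an illustration only: the helper objects (surrogate, server) and their methods are hypothetical stand-ins rather than Coda interfaces, and the MD5 fingerprint anticipates Section 3.5.2.

    import hashlib

    def fingerprint(data):
        # MD5 fingerprint of a container file's contents (see Section 3.5.2).
        return hashlib.md5(data).hexdigest()

    def propagate_update(operation, original_value, surrogate, server):
        # Step 2: ship the compact operation log plus the fingerprint of V.
        original_fp = fingerprint(original_value)

        # Step 3: the surrogate re-executes the operation and produces V'.
        regenerated_value = surrogate.reexecute(operation)

        # Step 4: validate by comparing fingerprints; fall back on mismatch.
        if fingerprint(regenerated_value) == original_fp:
            surrogate.ship_to_server(regenerated_value, server)   # strong network
            return "operation shipped"
        server.receive_file(original_value)                       # weak-network fallback
        return "value shipped"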
`
`3.2 Logging of user operations
`
`3.2.1 Level of abstraction
`
`We need to find the right level of abstraction for log-
`ging operations, such that the operation logs are com-
`pact. Currently, Venus logs only the low-level file-system
`operations. File-system operation logs are not compact
`enough for efficient operation shipping, since they con-
`tain the contents of the files.
`
`On the other hand, computer users tend to think at a
`higher level. Typically, when they are working with a
`computer, they issue a series of commands to it. Since
`human beings type relatively slowly, the commands must
`be compact; hence, we focus on this level for operation
`shipping. When we speak of user operations, we mean
`the high level commands of computer users that can be
`intercepted, logged, and replayed later.
`
`There must be an entity that intercepts the user oper-
`ations and supplies the relevant information to the file
`system. This entity can be an interactive shell, or it can
`be the application itself. We refer to the two cases as
`application-transparent and application-aware logging.
`
`3.2.2 Application-transparent logging
`
Application-transparent logging is possible if the application is non-interactive. In this case, an interactive shell can intercept the information related to the user operation. For example, operation logging can be transparent to the following types of applications (examples are listed in parentheses): compilers and linkers (gcc, yacc, and ld), text processors (latex and troff), file-format converters (dvips and sgml2latex), software-project management tools (make), and file packagers (tar).
`
For application-transparent operation shipping, we need not modify the applications; we simply need to modify some interactive shells – the meta-applications – such as bash and tcsh. Fortunately, there are far fewer meta-applications than applications. We assume,
`of course, that the users are willing to use a modified
`shell to take advantage of operation shipping.
`
We extend the file-system API (application programming interface) with two new commands for the ioctl system call: VIOC_BEGIN_OP and VIOC_END_OP. They delineate the beginning and the end of a user operation. The user operation is tagged with the process group, so forked children in the same group are considered part of
`the same user operation. When the file system receives a
`file-system call, it can determine whether the call comes
`from a tagged user operation by examining the process-
`group information of the caller. The information nec-
`essary for re-execution, including the name of the user
`command, the command-line arguments, current work-
`ing directory, and so on, is passed to the file system with
the VIOC_BEGIN_OP command.
`
`Our file system provides the logging mechanism, but the
`logging entities can choose the appropriate logging poli-
`cies. For example, an interactive shell may allow users
`to specifically enable or disable operation logging based
`on certain criteria.
`
`We have experimented with the bash shell, a com-
`mon Unix shell. We added a few lines of code so
that the modified shell issues the VIOC_BEGIN_OP and VIOC_END_OP commands appropriately. Currently, we
`implement the most straightforward policy: the shell logs
`every user operation. In the future, we may implement a
`more flexible policy.
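As a rough sketch of how a modified shell could bracket a command with these calls, consider the following. The ioctl request numbers and the control-file path are hypothetical placeholders; in practice they would come from Coda's headers, and the actual modification was made to bash itself.

    import fcntl, json, os, subprocess

    VIOC_BEGIN_OP = 0xC0DA0001        # hypothetical request numbers
    VIOC_END_OP   = 0xC0DA0002
    CODA_CONTROL  = "/coda/.CONTROL"  # hypothetical control-file path

    def run_logged(argv):
        # Pass the information needed for re-execution with VIOC_BEGIN_OP,
        # run the command, and mark the end of the user operation.
        info = json.dumps({"command": argv, "cwd": os.getcwd()}).encode()
        with open(CODA_CONTROL, "rb") as ctrl:
            fcntl.ioctl(ctrl, VIOC_BEGIN_OP, info)
            try:
                return subprocess.run(argv).returncode
            finally:
                fcntl.ioctl(ctrl, VIOC_END_OP)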
`
`3.2.3 Application-aware logging
`
`Application-aware logging is needed when the applica-
`tion is interactive.
`In this case, the application is in-
`volved in capturing the user operations. The follow-
`ing types of applications fall into this category (ex-
`amples are listed in parentheses):
`text editors and
`word processors (emacs, vi, Applix Word and
`Microsoft Word), drawing tools (xfig and Corel
`Draw), presentation software (Applix Present and
Microsoft PowerPoint), computer-aided-design tools (AutoCAD and magic), and image manipulators (xv and GNU Image Manipulation Program).
`
`
`In this paper, we focus on application-transparent op-
`eration shipping. The mechanism that we have de-
`signed, and the evaluation that we have performed, is
`limited to non-interactive applications. We plan to study
`application-aware operation shipping next, and we will
`report our findings in the future.
`
`3.3 Shipping of operation log
`
`3.3.1 Shipping mechanism
`
`The reintegrator, which is a user-level thread within
`Venus, manages update propagation. It periodically se-
`lects several records from the head of the CML and ships
`them to the servers. For records with no user-operation
`information attached, the reintegrator uses value ship-
`ping and makes a ViceReintegrate remote proce-
`dure call (RPC) to the server. The server, when process-
`ing the RPC, back-fetches the related container files from
`the client. If the reply of the RPC indicates success, the
`reintegrator will locally commit the updates. Local com-
`mitment of updates is the final step of successful update
`propagation, and includes updating the states of relevant
`objects, such as version vectors and dirty bits, and trun-
`cating the CML.
`
`If user-operation information is available for a record,
`the reintegrator will attempt operation shipping first. All
the records associated with the same user operation will be operation shipped together. The reintegrator selects the records, packs the operation log, and makes a
`UserOpPropagate RPC to the surrogate. If the reply
`indicates success, the reintegrator will locally commit the
`updates. However, if the reply indicates failure, the rein-
`tegrator will set a flag in each of the records indicating
`that it has tried and failed propagation by operation ship-
`ping. These records will then be value shipped.
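The following sketch illustrates this selection and fallback logic. The record fields, RPC stubs, and local-commit helper are hypothetical simplifications of the Coda code paths described above.

    from itertools import groupby

    def reintegrate_prefix(cml_prefix, server, surrogate):
        # Records of the same user operation are adjacent in the temporally
        # ordered CML prefix, so they can be grouped and shipped together.
        for user_op, group in groupby(cml_prefix, key=lambda r: r.get("user_op")):
            records = list(group)
            if user_op is None:
                server.vice_reintegrate(records)                  # plain value shipping
            else:
                try:
                    surrogate.user_op_propagate(user_op, records)  # operation shipping
                except RuntimeError:
                    for rec in records:
                        rec["op_ship_failed"] = True               # remember the failed attempt
                    server.vice_reintegrate(records)               # fall back to value shipping
            commit_locally(records)

    def commit_locally(records):
        # Placeholder: update version vectors and dirty bits, truncate the CML.
        pass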
`
`3.3.2 Cost model
`
`The current version of our prototype attempts operation
`shipping for a record whenever there is user-operation
`information available. This static approach implicitly
`assumes that the connectivity between a mobile client
`and its servers is always weak.
`In real life, a mobile
`client may have strong connectivity occasionally. Dur-
`ing that time, as explained in the following paragraphs,
`value shipping is more efficient than operation shipping.
We plan to enhance our prototype so that mobile clients dynamically decide whether they should use operation shipping or value shipping.
`
`Our cost model compares the costs of value shipping
`with that of operation shipping. For each case, there are
`two different costs involved: network traffic and elapsed
`time.
`
For value shipping, assuming the overhead is negligible, the network traffic is the total length L of the updated files, and the elapsed time is Tv = L/Bc, where Bc is the bandwidth of the network connecting the client to the server.
`
For operation shipping, the network traffic is the length of the operation log, Lop, and the elapsed time is Top. The latter is composed of four components: (1) the time needed to ship the operation log (Lop/Bc), (2) the time needed for re-executing the operation (E), (3) the time needed for additional computational overhead (Hop), such as computing checksum information and encoding and decoding of forward-error-correction codes, and (4) the time needed to ship the updated files to the servers. There are two cases for the last component. If the re-execution passes the validation (is accepted), the updated files will be shipped from the surrogate (the time cost will be L/Bs, where Bs is the bandwidth of the network connecting the surrogate to the server); if the re-execution fails the validation, the updated file will be shipped from the client (the time cost will be L/Bc). The following equation summarizes the time costs involved:

    Top = Lop/Bc + E + Hop + L/Bs    if accepted
    Top = Lop/Bc + E + Hop + L/Bc    if rejected        (1)
`
Therefore, operation shipping is more favorable than value shipping only under certain conditions. Operation shipping saves network traffic if the operation log is more compact than the updated files (Lop < L). Also, it speeds up the update propagation (Top < Tv) if the following five conditions are true: (1) the re-execution is accepted, (2) the operation log is compact (Lop ≪ L), (3) the re-execution is fast (E ≪ L/Bc), (4) the time needed for additional computational overheads is small (Hop ≪ L/Bc), and (5) the surrogate has much better network connectivity than the client (Bs ≫ Bc).
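As a concrete reading of Equation (1), the sketch below evaluates both elapsed times for one set of made-up numbers; nothing here is part of the Coda prototype.

    def elapsed_value_shipping(L, Bc):
        # Tv = L / Bc
        return L / Bc

    def elapsed_operation_shipping(L, Lop, E, Hop, Bc, Bs, accepted=True):
        # Equation (1): ship the log, re-execute, pay the overheads, then ship
        # the files from the surrogate (accepted) or from the client (rejected).
        return Lop / Bc + E + Hop + (L / Bs if accepted else L / Bc)

    if __name__ == "__main__":
        L   = 10e6              # bytes of updated files (hypothetical)
        Lop = 200               # bytes of operation log
        E, Hop = 30.0, 5.0      # seconds of re-execution and overhead
        Bc  = 9.6e3 / 8         # client-server bandwidth (9.6 kbit/s modem)
        Bs  = 10e6 / 8          # surrogate-server bandwidth (10 Mbit/s Ethernet)
        print("value shipping:     %8.0f s" % elapsed_value_shipping(L, Bc))
        print("operation shipping: %8.0f s" % elapsed_operation_shipping(L, Lop, E, Hop, Bc, Bs))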
`
`3.4 Re-execution of user operations
`
`3.4.1 Re-execution mechanism
`
`Although we anticipate most re-executions will be suc-
`cessful (execute completely and pass the validation), we
`have to prepare for the possibility that they may fail (can-
`not execute completely or fail the validation). Therefore,
`we have to ensure that re-executions are abortable trans-
`
`4
`
`Dropbox Ex. 1012
`
`Page 4
`
`
`
`actions such that failed re-executions will have no lasting
`effect. We implement this as follows.
`
`
Upon receiving a UserOpPropagate RPC from the client, Venus on the surrogate will temporarily put the affected volume¹ into write-disconnected mode, and then re-execute the user operation via a spawned child called
`the re-executor. During the re-execution, since the vol-
`ume is write-disconnected, input files can be retrieved
`from the servers, but file-system updates are not writ-
`ten immediately to the server. These updates are tagged
`with the identity of the re-execution and are logged in the
`CML. If the re-execution later passes the validation, the
`surrogate will re-connect the volume with the servers and
`reintegrate the updates. At the end, and only when the
`reintegration succeeds, the surrogate will locally commit
`the updates, and indicate a success to the client in a RPC
`reply. On the other hand, failures may happen any time
`during the re-execution, the validation, or the reintegra-
`tion. If any such failures occur, the surrogate will discard
`all the updates and indicate a failure to the client in a RPC
`reply.
`
`It is possible that a reintegration completes successfully
`at the servers, but the RPC response fails to arrive at the
`client in spite of retransmission. This can happen when
`there is an untimely failure of the surrogate or the com-
`munication channels. We make use of the existing Coda
mechanism of ensuring atomicity of reintegrations [13] to handle this kind of failure. In short, the client presumes a reintegration has failed if it does not receive a positive response from the surrogate, and it will retry
`the reintegration. At the same time, the server retains the
`information necessary to detect whether a record has al-
`ready been reintegrated earlier. If a client attempts such
`an improper retry, the server will reply with the error
`code EALREADY (Operation already in progress), and
`the client will then know that the records have already
`been successfully reintegrated in a previous attempt, and
`it will simply locally commit those records.
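A minimal sketch of this retry behavior follows. The RPC stub and record representation are hypothetical; only the treatment of EALREADY as an already-committed reintegration is taken from the text.

    import errno

    def propagate_with_retry(surrogate, records, max_attempts=5):
        for _ in range(max_attempts):
            try:
                surrogate.user_op_propagate(records)
                return commit_locally(records)          # normal success
            except OSError as err:
                if err.errno == errno.EALREADY:
                    # An earlier attempt already reintegrated these records.
                    return commit_locally(records)
                # Otherwise presume the reintegration failed and retry.
        raise RuntimeError("update propagation failed after retries")

    def commit_locally(records):
        # Placeholder for local commitment (truncate CML, clear dirty bits, ...).
        return len(records)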
`
`3.4.2 Facilitating repeating re-executions
`
`A re-execution of a user operation is accepted when it
`produces a result that is identical to that of the original
`execution. We say the re-execution is repeating the orig-
`inal execution. Only these repeating re-executions are
`useful to operation shipping. We know that a determin-
`istic procedure will produce the same result in different
`runs provided that it has the same input and the same en-
vironment in each run. Our file system makes use of this principle to facilitate repeating re-executions.
`
¹ In Coda, a volume is a collection of files forming a partial subtree of the Coda name space.
`
First, the re-executor runs with the four Unix-process attributes identical to those of the original execution: (1) the
`working directory, (2) the environment-variable list, (3)
`the command-line arguments, and (4) the file-creation
`mask [23].
`
`Second, our file system expects that the surrogate ma-
`chine has software and hardware configurations similar
`to the client machine. Two machines are said to be iden-
`tically configured if they have the same CPU type, the
`same operating system, the same system-header files,
`the same system libraries, and any other system environ-
`ments that can affect the outcomes of computations on
`the two machines.
`
`Third, if a re-execution requires an input file stored in
`Coda, we can rely on the file system to ensure that the
`client and the surrogate will use an identical version of
`the input file. Coda can ensure this because a client
`ships updates to its servers in temporal order, and the
`surrogate will always retrieve the latest version of a file
`from the servers. For example, consider a user issuing
`three user operations successively on a client machine:
`(Op1) update a source file using an editor, (Op2) com-
`pile the source file into an object file using a compiler,
`and (Op3) update the source file again. When the surro-
`gate re-executes Op2, the client must have shipped Op1
`but not Op3, and the re-executor will see exactly the ver-
`sion updated by Op1.
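A minimal sketch of the first condition, re-running a logged command under the original four process attributes, might look like this (the field names of the logged operation are hypothetical; the paper does not specify a wire format):

    import os, subprocess

    def reexecute(op):
        # op is a dict with hypothetical keys: argv, cwd, environ, umask.
        def restore_umask():
            os.umask(op["umask"])              # (4) file-creation mask
        return subprocess.run(
            op["argv"],                        # (3) command-line arguments
            cwd=op["cwd"],                     # (1) working directory
            env=op["environ"],                 # (2) environment-variable list
            preexec_fn=restore_umask,          # applied in the child before exec
            check=False,
        ).returncode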
`
`We emphasize that our file system does not guarantee
`that all re-executions will be repeating their original ex-
`ecutions; it just increases the probability of that happen-
`ing. More importantly, it ensures that only repeating re-
`executions will be accepted. This is achieved by the pro-
`cedure of validation (Section 3.5).
`
`In the evaluation section, we will see that many appli-
`cations exhibit repeating re-executions. Although some
`other applications exhibit non-repeating side effects dur-
`ing re-executions, these side effects can be handled in
`simple ways. Therefore, we believe we can use opera-
`tion shipping with a large number of applications.
`
`3.4.3 Status information
`
`In addition to updating contents of files, user opera-
`tions also update some meta-data (status information).
`Some of the meta-data are internal to the file system and
`are invisible to the users (e.g., every STORE operation
has an identity number for concurrency control); some are external and are visible to the users (e.g., the last-
`modification time of a file).
`In both cases, to make a
`re-execution’s result identical to that of the original ex-
`ecution, the values of the meta-data of the re-execution
`should be reset to those of the original execution. There-
`fore, the client needs to pack these meta-data as part of
`the operation log, and the surrogate needs to adjust the
`meta-data of the re-execution to match the supplied val-
`ues.
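On the surrogate, the adjustment of externally visible meta-data such as the last-modification time could look like the following sketch; the field names of the shipped record are hypothetical.

    import os

    def adjust_metadata(path, record):
        # Reset the re-generated file's visible meta-data to the values
        # recorded by the client during the original execution.
        mtime = record["client_mtime"]                  # seconds since the epoch
        atime = record.get("client_atime", mtime)
        os.utime(path, (atime, mtime))
        os.chmod(path, record["client_mode"])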
`
`3.4.4 Non-repeating side effects
`
`We focus on applications that perform deterministic
`tasks, such as compiling binaries or formatting texts,
`and exclude applications such as games and probabilistic
`search that are randomized in nature. In an early stage of
`this project, we expected the re-executions of these ap-
`plications to repeat their original executions completely.
`However, we found that some common applications ex-
`hibited non-repeating side effects. So far we have found
`two types of such side effects: (1) time stamps in output
`files, and (2) temporary names of intermediate files. For-
`tunately, we are able to handle these side effects automat-
`ically, so we are still able to use operation shipping with
`these applications. We will discuss the handling methods
`in the two following subsections.
`
`We also anticipate a third possible kind of side effect: ex-
`ternal side effects. For example, if an application sends
`an email message as the last step of execution, then the
`user may be confused by the additional message sent by
`re-execution. To cope with this kind of side effect, we
`plan to allow users to disable the use of operation ship-
`ping for some applications.
`
`3.4.5 Side effects due to time stamps
`
`Some applications, such as rp2gen, ar, and latex,
put time stamps into the files that they produce. rp2gen generates stub routines for remote procedure calls, ar
`builds library files, and latex formats documents. They
`use time stamps for various reasons. Because of the
`time stamps, a file generated by a re-execution will dif-
`fer slightly from the version generated by the original
`execution. Observing that only a few bytes are differ-
`ent, we can treat the changed bytes as “errors” and use
`the technique of forward error correction (FEC) [6, 4]
`to “correct” them. (We are indebted to Matt Mathis for
`suggesting this idea to us.)
`
`Our file system, therefore, does the following. Venus on
`the remote client computes an error-correction code (we
`use the Reed-Solomon code) for each updated file that
`
`is to be shipped by operation, and it packs the code as a
`part of the operation log. Venus on the surrogate, after
`re-executing the operation, uses the code to correct the
`time stamps that may have occurred in the re-generated
`version of the file.
`
`Note that this is a novel use of the Reed-Solomon code.
`Usually, data blocks are sent together with parity blocks
`(the error-correction code); but our clients send only the
`parity blocks. The data blocks are instead re-generated
`by the surrogate. Whereas others use the code to correct
`communication errors, we use it to correct some minor
`re-execution discrepancies. If a discrepancy is so large
`that our error-correction procedure cannot correct it, our
`file system simply falls back to value shipping. This en-
`tails a loss of performance but preserves correctness.
`
The additional network traffic due to the error-correction code is quite small. We choose to use a (65535, 65503) Reed-Solomon block code over GF(2^16). In other words, the symbol size is 16 bits; each block has 65,503 data symbols (131,006 bytes) and 32 parity symbols (64 bytes). The system can correct up to 16 errors (32 bytes) in each data block.
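The parity-only usage can be sketched as follows. For simplicity this illustration uses the third-party reedsolo package with its default GF(2^8) code and a file short enough to fit in one block, rather than the (65535, 65503) code over GF(2^16) used in the prototype; the helper names are assumptions, not Coda interfaces.

    from reedsolo import RSCodec       # pip install reedsolo

    NPARITY = 32                       # parity bytes per block (corrects up to 16 byte errors)
    rsc = RSCodec(NPARITY)

    def client_side_parity(original):
        # Ship only the parity; the data itself will be re-generated remotely.
        return bytes(rsc.encode(original)[-NPARITY:])

    def surrogate_side_correct(regenerated, parity):
        # Append the client's parity to the re-generated data and decode,
        # correcting small discrepancies such as differing time stamps.
        decoded = rsc.decode(bytearray(regenerated) + bytearray(parity))
        return bytes(decoded[0])       # recent reedsolo versions return (data, data+ecc, errata)

    original    = b"library built at 09:17 ..."
    regenerated = b"library built at 16:29 ..."   # same length, a few bytes differ
    assert surrogate_side_correct(regenerated, client_side_parity(original)) == original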
`
`
`
`However, the additional CPU time due to encoding and
`decoding is not small. We discuss this in more detail
`in Section 4.4. Also, the Reed-Solomon code cannot
`correct discrepancies that change length (for example,
`the two timestamps “9:17” and “14:49” have differ-
`ent lengths). The rsync algorithm [24] can handle length
`change, but we favor the Reed-Solomon code because it
`has a smaller overhead on network traffic.
`
`3.4.6 Side effects due to temporary files
`
`The program ar is an example of an application that
`uses temporary files. Figure 2 shows the two CMLs on
`the client and the surrogate after the execution and re-
execution of the following user operation:
`
`ar rv libsth.a foo.o bar.o.
`
`The user operation builds a library file libsth.a from
`two object modules foo.o and bar.o.
`
`Note that ar used two temporary files sta09395 and
`sta16294 in the two executions. The names of the
`temporary files are generated based on the identity num-
`bers of processes executing the application, and hence
`they are time dependent. Our validation procedure (Sec-
`tion 3.5) might naively reject the re-execution “because
`the records are different.”
`
`
`
`
    Original Execution               Re-execution
    ----------------------------     ----------------------------
    Create sta09395                  Create sta16294
    Store  sta09395                  Store  sta16294
    Remove libsth.a                  Remove libsth.a
    Rename sta09395 to libsth.a      Rename sta16294 to libsth.a

Figure 2: CMLs of two executions of ar
`
`Temporary files appear only in the intermediate steps of
`the execution. They will either be deleted or renamed at
`the end, so their names do not affect the final file-system
`state. An application uses temporary files to provide ex-
`ecution atomicity. For example, ar writes intermediate
`computation results to a temporary file, and it will re-
`name the file to the target filename only if the execution
`succeeds. This measure is to ensure that the target file
`will not be destroyed accidentally by a futile execution.
`
`If a temporary file is created and subsequently deleted
`during the execution of a user operation, its modification
`records will be deleted by the existing identity cancella-
`tion procedure [7]. They will not appear in the two CMLs
`and will not cause naive rejections of re-execution.
`
`However, if a temporary file is created and subsequently
`renamed during the execution of a user operation, its
modification records will be present in the CMLs, and
`might cause our validation to reject the re-execution. Our
`file system uses a procedure of temporary-file renaming
`to compensate for the side effect. This procedure is done
`after the re-executor has finished the re-execution and be-
`fore the surrogate begins the validation.
`
`The idea of the temporary-file renaming is to scan the
`two CMLs and identify all the temporary files as well
as their final names. We identify temporary files by the fact that they are created and subsequently renamed in a user operation. For each temporary file used by the surrogate, our file system determines the temporary-file name N used by the client in the original execution. It then renames the surrogate's temporary file to N. In our ar exam-
`ple, the temporary file sta16294 will be renamed to
`sta09395.
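The renaming step can be sketched by treating each CML as a list of simplified records; the tuple representation below is a hypothetical simplification of the real record format.

    def temp_file_map(cml):
        # Map final name -> temporary name for files created and later renamed.
        created, mapping = set(), {}
        for rec in cml:
            if rec[0] == "create":
                created.add(rec[1])
            elif rec[0] == "rename" and rec[1] in created:
                mapping[rec[2]] = rec[1]
        return mapping

    def rename_plan(client_cml, surrogate_cml):
        # For each temporary file on the surrogate, the client name it should take.
        client_map = temp_file_map(client_cml)          # {'libsth.a': 'sta09395'}
        surrogate_map = temp_file_map(surrogate_cml)    # {'libsth.a': 'sta16294'}
        return [(surrogate_map[f], client_map[f])
                for f in surrogate_map if f in client_map]

    client_cml = [("create", "sta09395"), ("store", "sta09395"),
                  ("remove", "libsth.a"), ("rename", "sta09395", "libsth.a")]
    surrogate_cml = [("create", "sta16294"), ("store", "sta16294"),
                     ("remove", "libsth.a"), ("rename", "sta16294", "libsth.a")]
    print(rename_plan(client_cml, surrogate_cml))       # [('sta16294', 'sta09395')]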
`
`
3.5 Validation of re-executions

3.5.1 Validation mechanism

Validation is done after the handling of side effects and the adjustment of status information. By that time, if a
`re-execution is repeating its original execution, the set of
`mutations incurred on the surrogate should be the same
`as that on the client. Since mutations are captured on
`CMLs, our file system can validate a re-execution by
`comparing the relevant portion of the CML of the sur-
`rogate to that of the client.
`
To facilitate the comparison, the client packs every
`record in the relevant portion of the CML as part of the
`operation log. However, the container files, which are as-
`sociated with STORE records, are not packed; otherwise
`they would incur a heavy network traffic for shipping the
`operation log, amounting to the traffic needed for value
`shipping. Instead, the client packs the fingerprint of each
`container file. When comparing CMLs, the surrogate as-
`serts that two container files are equal if they have the
`same fingerprint.
`
`3.5.2 Fingerprint
`
A fingerprint function produces a fixed-length fingerprint f(M) for a given arbitrary-length message M. A good fingerprint function should have two properties: (1) computing f(M) from M is easy, and (2) the probability that another message M′ gives the same fingerprint is small. For our purpose, the messages for which we compute fingerprints are the contents of the container files.
`
Our file system employs MD5 (Message Digest 5) fingerprints [17, 21]. Each fingerprint has 128 bits, so the overhead is very small. Also, the probability that two different container files give the same fingerprint is very small; it is on the order of 1/2^128.
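Computing such a fingerprint takes only a few lines; the sketch below uses the standard hashlib implementation of MD5 named above.

    import hashlib

    def fingerprint(path, chunk_size=64 * 1024):
        # 128-bit MD5 fingerprint of a container file's contents.
        digest = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # The surrogate asserts that two container files are equal if their
    # fingerprints match, e.g. fingerprint(original) == fingerprint(regenerated).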
`
`The fact that the probability is non-zero, albeit very
`small, may worry some readers. However, even value
`shipping is vulnerable to a small but non-zero proba-
`bility of error. That is, there is a small probability that
`a communication error has occurred but is not detected
`by the error-correction subsystem of the communication
`channel. We believe people can tolerate the small prob-
`abilities of errors of both opera