USENIX Association

Summer Conference Proceedings

Portland 1985

June 11-14, 1985
Portland, Oregon, USA

For additional copies of these proceedings, contact:

USENIX Association
2560 Ninth Street, Suite 215
Berkeley, CA 94710 U.S.A.
510/528-8649

Copyright 1985 by The USENIX Association. All rights reserved.

This volume is published as a collective work. Rights to individual papers remain with the author or the author's employer.

UNIX is a trademark of AT&T Bell Laboratories. Other trademarks are noted in the text.

Design and Implementation of the Sun Network Filesystem

Russel Sandberg
David Goldberg
Steve Kleiman
Dan Walsh
Bob Lyon

Sun Microsystems, Inc.
2550 Garcia Ave.
Mountain View, CA 94110
(415) 960-7293

Introduction

The Sun Network Filesystem (NFS) provides transparent, remote access to filesystems. Unlike many other remote filesystem implementations under UNIX†, the NFS is designed to be easily portable to other operating systems and machine architectures. It uses an External Data Representation (XDR) specification to describe protocols in a machine and system independent way. The NFS is implemented on top of a Remote Procedure Call package (RPC) to help simplify protocol definition, implementation, and maintenance.

In order to build the NFS into the UNIX 4.2 kernel in a user transparent way, we decided to add a new interface to the kernel which separates generic filesystem operations from specific filesystem implementations. The filesystem interface consists of two parts: the Virtual File System (VFS) interface defines the operations that can be done on a filesystem, while the vnode interface defines the operations that can be done on a file within that filesystem. This new interface allows us to implement and install new filesystems in much the same way as new device drivers are added to the kernel.

In this paper we discuss the design and implementation of the filesystem interface in the kernel and the NFS virtual filesystem. We describe some interesting design issues and how they were resolved, and point out some of the shortcomings of the current implementation. We conclude with some ideas for future enhancements.

Design Goals

The NFS was designed to make sharing of filesystem resources in a network of non-homogeneous machines easier. Our goal was to provide a UNIX-like way of making remote files available to local programs without having to modify or even recompile those programs. In addition, we wanted remote file access to be comparable in speed to local file access.

The overall design goals of the NFS were:

Machine and Operating System Independence
    The protocols used should be independent of UNIX so that an NFS server can supply files to many different types of clients. The protocols should also be simple enough that they can be implemented on low end machines like the PC.

Crash Recovery
    When clients can mount remote filesystems from many different servers it is very important that clients be able to recover easily from server crashes.

Transparent Access
    We want to provide a system which allows programs to access remote files in exactly the same way as local files. No pathname parsing, no special libraries, no recompiling. Programs should not be able to tell whether a file is remote or local.

UNIX Semantics Maintained on Client
    In order for transparent access to work on UNIX machines, UNIX filesystem semantics have to be maintained for remote files.

Reasonable Performance
    People will not want to use the NFS if it is no faster than the existing networking utilities, such as rcp, even if it is easier to use. Our design goal is to make NFS as fast as the Sun Network Disk protocol (ND)¹, or about 80% as fast as a local disk.

† UNIX is a trademark of Bell Laboratories.
¹ ND, the Sun Network Disk Protocol, provides block-level access to remote, sub-partitioned disks.

Basic Design

The NFS design consists of three major pieces: the protocol, the server side, and the client side.

NFS Protocol

The NFS protocol uses the Sun Remote Procedure Call (RPC) mechanism. For the same reasons that procedure calls help simplify programs, RPC helps simplify the definition, organization, and implementation of remote services. The NFS protocol is defined in terms of a set of procedures, their arguments and results, and their effects. Remote procedure calls are synchronous, that is, the client blocks until the server has completed the call and returned the results. This makes RPC very easy to use since it behaves like a local procedure call.

The NFS uses a stateless protocol. The parameters to each procedure call contain all of the information necessary to complete the call, and the server does not keep track of any past requests. This makes crash recovery very easy: when a server crashes, the client resends NFS requests until a response is received, and the server does no crash recovery at all. When a client crashes no recovery is necessary for either the client or the server. When state is maintained on the server, on the other hand, recovery is much harder. Both client and server need to be able to reliably detect crashes. The server needs to detect client crashes so that it can discard any state it is holding for the client, and the client must detect server crashes so that it can rebuild the server's state.

Using a stateless protocol allows us to avoid complex crash recovery and simplifies the protocol. If a client just resends requests until a response is received, data will never be lost due to a server crash. In fact the client cannot tell the difference between a server that has crashed and recovered and a server that is slow.

Sun's remote procedure call package is designed to be transport independent. New transport protocols can be plugged in to the RPC implementation without affecting the higher level protocol code. The NFS uses the ARPA User Datagram Protocol (UDP) and Internet Protocol (IP) for its transport level. Since UDP is an unreliable datagram protocol, packets can get lost, but because the NFS protocol is stateless and the NFS requests are idempotent, the client can recover by retrying the call until the packet gets through.

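As a rough illustration of this recovery strategy (and not the actual Sun RPC code), the fragment below sketches a client that simply resends an idempotent request over UDP until an answer arrives; the timeout value, buffer handling, and function name are assumptions made for the sketch.

    /*
     * Sketch: retry an idempotent request over UDP until a reply arrives.
     * Illustrative only; a real RPC layer also matches replies to requests
     * with a transaction id and backs off between retries.
     */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/time.h>

    ssize_t
    call_until_answered(int sock, const struct sockaddr *srv, socklen_t srvlen,
                        const void *req, size_t reqlen, void *reply, size_t replylen)
    {
        struct timeval tmo = { 1, 0 };          /* wait one second per try */
        ssize_t n;

        setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tmo, sizeof(tmo));
        for (;;) {
            if (sendto(sock, req, reqlen, 0, srv, srvlen) < 0)
                continue;                       /* transient send error: try again */
            n = recv(sock, reply, replylen, 0);
            if (n >= 0)
                return n;                       /* got an answer */
            /* timeout or dropped packet: the request is idempotent, so just resend */
        }
    }
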
The most common NFS procedure parameter is a structure called a file handle (fhandle or fh) which is provided by the server and used by the client to reference a file. The fhandle is opaque, that is, the client never looks at the contents of the fhandle, but uses it when operations are done on that file.

An outline of the NFS protocol procedures is given below. For the complete specification see the Sun Network Filesystem Protocol Specification.

null() returns ()
    Do nothing procedure to ping the server and measure round trip time.

lookup(dirfh, name) returns (fh, attr)
    Returns a new fhandle and attributes for the named file in a directory.

create(dirfh, name, attr) returns (newfh, attr)
    Creates a new file and returns its fhandle and attributes.

remove(dirfh, name) returns (status)
    Removes a file from a directory.

getattr(fh) returns (attr)
    Returns file attributes. This procedure is like a stat call.

setattr(fh, attr) returns (attr)
    Sets the mode, uid, gid, size, access time and modify time of a file. Setting the size to zero truncates the file.

read(fh, offset, count) returns (attr, data)
    Returns up to count bytes of data from a file starting offset bytes into the file. read also returns the attributes of the file.

write(fh, offset, count, data) returns (attr)
    Writes count bytes of data to a file beginning offset bytes from the beginning of the file. Returns the attributes of the file after the write takes place.

rename(dirfh, name, tofh, toname) returns (status)
    Renames the file name in the directory dirfh to toname in the directory tofh.

link(dirfh, name, tofh, toname) returns (status)
    Creates the file toname in the directory tofh, which is a link to the file name in the directory dirfh.

symlink(dirfh, name, string) returns (status)
    Creates a symbolic link name in the directory dirfh with value string. The server does not interpret the string argument in any way, just saves it and makes an association to the new symbolic link file.

readlink(fh) returns (string)
    Returns the string which is associated with the symbolic link file.

mkdir(dirfh, name, attr) returns (fh, newattr)
    Creates a new directory name in the directory dirfh and returns the new fhandle and attributes.

rmdir(dirfh, name) returns (status)
    Removes the empty directory name from the parent directory dirfh.

readdir(dirfh, cookie, count) returns (entries)
    Returns up to count bytes of directory entries from the directory dirfh. Each entry contains a file name, file id, and an opaque pointer to the next directory entry called a cookie. The cookie is used in subsequent readdir calls to start reading at a specific entry in the directory. A readdir call with the cookie of zero returns entries starting with the first entry in the directory.

statfs(fh) returns (fsstats)
    Returns filesystem information such as block size, number of free blocks, etc.

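To show how the readdir cookie is meant to be used, here is a hedged sketch of a client enumerating a whole directory one RPC at a time; nfs_readdir(), struct nfs_entry, fhandle_t and the buffer size are invented names for this sketch, not part of the protocol specification.

    /*
     * Hypothetical sketch of directory enumeration with readdir cookies.
     * nfs_readdir() stands in for whatever client code issues the RPC
     * and decodes the reply; it returns the number of entries decoded.
     */
    #include <stdio.h>

    typedef long fhandle_t;                 /* stand-in for the opaque handle type */

    struct nfs_entry {
        char *name;                         /* file name                         */
        long  fileid;                       /* file id                           */
        long  cookie;                       /* opaque position of the next entry */
    };

    extern int nfs_readdir(fhandle_t dirfh, long cookie, int count,
                           struct nfs_entry *buf);   /* invented helper */

    void
    list_directory(fhandle_t dirfh)
    {
        struct nfs_entry ent[64];
        long cookie = 0;                    /* zero means start at the beginning */
        int  n, i;

        do {
            n = nfs_readdir(dirfh, cookie, sizeof(ent), ent);  /* one RPC */
            for (i = 0; i < n; i++) {
                printf("%ld %s\n", ent[i].fileid, ent[i].name);
                cookie = ent[i].cookie;     /* resume here on the next call */
            }
        } while (n > 0);                    /* an empty reply means end of directory */
    }
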
New fhandles are returned by the lookup, create and mkdir procedures, which also take an fhandle as an argument. The first remote fhandle, for the root of a filesystem, is obtained by the client using another RPC based protocol. The MOUNT protocol takes a directory pathname and returns an fhandle if the client has access permission to the filesystem which contains that directory. The reason for making this a separate protocol is that this makes it easier to plug in new filesystem access checking methods, and it separates out the operating system dependent aspects of the protocol. Note that the MOUNT protocol is the only place that UNIX pathnames are passed to the server. In other operating system implementations the MOUNT protocol can be replaced without having to change the NFS protocol.

The NFS protocol and RPC are built on top of an External Data Representation (XDR) specification. XDR defines the size, byte order and alignment of basic data types such as string, integer, union, boolean and array. Complex structures can be built from the basic data types. Using XDR not only makes protocols machine and language independent, it also makes them easy to define. The arguments and results of RPC procedures are defined using an XDR data definition language that looks a lot like C declarations.

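To give a feel for what such definitions look like, the C-style declarations below sketch the kind of argument and result structures a read call might be described with; the field names, the fixed data size and the abridged attribute layout are assumptions for this sketch, not a transcription of the actual protocol specification.

    /*
     * Illustrative C-style declarations in the spirit of an XDR definition
     * for the read procedure.  All names and sizes here are invented.
     */
    typedef char fhandle[32];            /* opaque handle; width is illustrative   */
    typedef struct {
        long mode, uid, gid, size, mtime;/* heavily abridged file attributes       */
    } fattr;

    struct readargs {
        fhandle  file;                   /* handle naming the file to read         */
        unsigned offset;                 /* byte offset to start reading at        */
        unsigned count;                  /* maximum number of bytes to return      */
    };

    struct readres {
        fattr    attributes;             /* file attributes after the read         */
        unsigned len;                    /* number of bytes actually returned      */
        char     data[8192];             /* at most count bytes of file data       */
    };
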
Server Side

Because the NFS server is stateless, as mentioned above, when servicing an NFS request it must commit any modified data to stable storage before returning results. The implication for UNIX based servers is that requests which modify the filesystem must flush all modified data to disk before returning from the call. This means that, for example on a write request, not only the data block but also any modified indirect blocks and the block containing the inode must be flushed if they have been modified.

Another modification to UNIX necessary to make the server work is the addition of a generation number in the inode and a filesystem id in the superblock. These extra numbers make it possible for the server to use the inode number, inode generation number and filesystem id together as the fhandle for a file. The inode generation number is necessary because the server may hand out an fhandle with an inode number of a file that is later removed and the inode reused. When the original fhandle comes back, the server must be able to tell that this inode number now refers to a different file. The generation number has to be incremented every time the inode is freed.

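A minimal sketch of what a server might pack into the opaque handle under this scheme is shown below; the field widths and ordering are assumptions for illustration, since the client never interprets the contents.

    /*
     * Sketch of the server-side contents of an fhandle.  The client treats
     * the whole structure as opaque bytes; only the server unpacks it.
     * Field widths here are illustrative assumptions.
     */
    struct svr_fhandle {
        long fsid;          /* filesystem id from the superblock            */
        long inumber;       /* inode number within that filesystem          */
        long generation;    /* bumped each time the inode is freed, so a    */
                            /* stale handle to a reused inode is detected   */
    };
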
Client Side

The client side provides the transparent interface to the NFS. To make transparent access to remote files work we had to use a method of locating remote files that does not change the structure of path names. Some UNIX based remote file access schemes use host:path to name remote files. This does not allow real transparent access since existing programs that parse pathnames have to be modified.

Rather than doing a "late binding" of file address, we decided to do the hostname lookup and file address binding once per filesystem by allowing the client to attach a remote filesystem to a directory using the mount program. This method has the advantage that the client only has to deal with hostnames once, at mount time. It also allows the server to limit access to filesystems by checking client credentials. The disadvantage is that remote files are not available to the client until a mount is done.

Transparent access to different types of filesystems mounted on a single machine is provided by a new filesystems interface in the kernel. Each "filesystem type" supports two sets of operations: the Virtual Filesystem (VFS) interface defines the procedures that operate on the filesystem as a whole, and the Virtual Node (vnode) interface defines the procedures that operate on an individual file within that filesystem type. Figure 1 is a schematic diagram of the filesystem interface and how the NFS uses it.

[Figure 1: schematic diagram of the filesystem interface, showing a client and a server connected by the network.]

The Filesystem Interface

The VFS interface is implemented using a structure that contains the operations that can be done on a whole filesystem. Likewise, the vnode interface is a structure that contains the operations that can be done on a node (file or directory) within a filesystem. There is one VFS structure per mounted filesystem in the kernel and one vnode structure for each active node. Using this abstract data type implementation allows the kernel to treat all filesystems and nodes in the same way without knowing which underlying filesystem implementation it is using.

Each vnode contains a pointer to its parent VFS and a pointer to a mounted-on VFS. This means that any node in a filesystem tree can be a mount point for another filesystem. A root operation is provided in the VFS to return the root vnode of a mounted filesystem. This is used by the pathname traversal routines in the kernel to bridge mount points. The root operation is used instead of just keeping a pointer so that the root vnode for each mounted filesystem can be released. The VFS of a mounted filesystem also contains a back pointer to the vnode on which it is mounted so that pathnames that include ".." can also be traversed across mount points.

In addition to the VFS and vnode operations, each filesystem type must provide mount and mount_root operations to mount normal and root filesystems. The operations defined for the filesystem interface are:

Filesystem Operations
    mount(varies)                                   System call to mount filesystem
    mount_root()                                    Mount filesystem as root

VFS Operations
    unmount(vfs)                                    Unmount filesystem
    root(vfs) returns (vnode)                       Return the vnode of the filesystem root
    statfs(vfs) returns (fsstatbuf)                 Return filesystem statistics
    sync(vfs)                                       Flush delayed write blocks

Vnode Operations
    open(vnode, flags)                              Mark file open
    close(vnode, flags)                             Mark file closed
    rdwr(vnode, uio, rwflag, flags)                 Read or write a file
    ioctl(vnode, cmd, data, rwflag)                 Do I/O control operation
    select(vnode, rwflag)                           Do select
    getattr(vnode) returns (attr)                   Return file attributes
    setattr(vnode, attr)                            Set file attributes
    access(vnode, mode)                             Check access permission
    lookup(dvnode, name) returns (vnode)            Look up file name in a directory
    create(dvnode, name, attr, excl, mode) returns (vnode)   Create a file
    remove(dvnode, name)                            Remove a file name from a directory
    link(vnode, todvnode, toname)                   Link to a file
    rename(dvnode, name, todvnode, toname)          Rename a file
    mkdir(dvnode, name, attr) returns (dvnode)      Create a directory
    rmdir(dvnode, name)                             Remove a directory
    readdir(dvnode) returns (entries)               Read directory entries
    symlink(dvnode, name, attr, to_name)            Create a symbolic link
    readlink(vp) returns (data)                     Read the value of a symbolic link
    fsync(vnode)                                    Flush dirty blocks of a file
    inactive(vnode)                                 Mark vnode inactive and do clean up
    bmap(vnode, blk) returns (devnode, mappedblk)   Map block number
    strategy(bp)                                    Read and write filesystem blocks
    bread(vnode, blockno) returns (buf)             Read a block
    brelse(vnode, buf)                              Release a block buffer

Notice that many of the vnode procedures map one-to-one with NFS protocol procedures, while other, UNIX dependent procedures such as open, close and ioctl do not. The bmap, strategy, bread and brelse procedures are used to do reading and writing using the buffer cache.

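One way such operation tables can be expressed is as C structures of function pointers, in the spirit of the interface listed above; the sketch below uses invented structure and member names and shows only a handful of the operations, so it is an illustration rather than a transcription of the 4.2 kernel sources.

    /*
     * Illustrative sketch of per-filesystem-type operation tables as
     * structures of function pointers.  Names are invented for this sketch.
     */
    struct vnode;   /* one per active file or directory */
    struct vfs;     /* one per mounted filesystem       */
    struct vattr;   /* file attributes, assumed defined elsewhere */
    struct uio;     /* scatter/gather I/O descriptor, assumed elsewhere */

    struct vfsops {
        int (*vfs_unmount)(struct vfs *);
        int (*vfs_root)(struct vfs *, struct vnode **);
        int (*vfs_sync)(struct vfs *);
    };

    struct vnodeops {
        int (*vn_open)(struct vnode *, int flags);
        int (*vn_getattr)(struct vnode *, struct vattr *);
        int (*vn_lookup)(struct vnode *dvp, char *name, struct vnode **vpp);
        int (*vn_rdwr)(struct vnode *, struct uio *, int rwflag, int flags);
    };

    /*
     * Each filesystem implementation (local disk, NFS, ...) fills in one
     * instance of each table; every vnode carries a pointer to the table
     * of its own filesystem type, so callers never know which one they use.
     */
    struct vnode {
        struct vnodeops *v_op;              /* operations for this filesystem type      */
        struct vfs      *v_vfsp;            /* parent VFS                               */
        struct vfs      *v_vfsmountedhere;  /* non-NULL if a filesystem is mounted here */
    };
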
Pathname traversal is done in the kernel by breaking the path into directory components and doing a lookup call through the vnode for each component. At first glance it seems like a waste of time to pass only one component with each call instead of passing the whole path and receiving back a target vnode. The main reason for this is that any component of the path could be a mount point for another filesystem, and the mount information is kept above the vnode implementation level. In the NFS filesystem, passing whole pathnames would force the server to keep track of all of the mount points of its clients in order to determine where to break the pathname, and this would violate server statelessness. The inefficiency of looking up one component at a time is alleviated with a cache of directory vnodes.

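A compressed sketch of that component-at-a-time walk is shown below; all of the helper names are invented for the sketch, and locking, "..", symbolic links and the directory vnode cache are left out.

    /*
     * Sketch of component-at-a-time pathname traversal over vnodes.
     * Helper names are invented; error handling is minimal.
     */
    #include <stddef.h>

    struct vnode;
    struct vfs;

    extern int         next_component(char **path, char *comp);       /* peel one name off the path          */
    extern int         vn_lookup(struct vnode *dvp, char *name, struct vnode **vpp);
    extern struct vfs *vn_mounted_here(struct vnode *vp);             /* non-NULL if something is mounted on vp */
    extern int         vfs_root(struct vfs *vfsp, struct vnode **vpp);/* the VFS root operation               */

    struct vnode *
    walk_path(struct vnode *rootvp, char *path)
    {
        struct vnode *vp = rootvp;
        char comp[256];

        while (next_component(&path, comp)) {
            struct vnode *next;
            struct vfs *mp;

            if (vn_lookup(vp, comp, &next) != 0)      /* ask the directory's own filesystem type */
                return NULL;
            if ((mp = vn_mounted_here(next)) != NULL) /* bridge a mount point ...                */
                vfs_root(mp, &next);                  /* ... via the root operation              */
            vp = next;
        }
        return vp;
    }
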
Implementation

Implementation of the NFS started in March 1984. The first step in the implementation was modification of the 4.2 kernel to include the filesystem interface. By June we had the first "vnode kernel" running. We did some benchmarks to test the amount of overhead added by the extra interface. It turned out that in most cases the difference was not measurable, and in the worst case the kernel had only slowed down by about 2%. Most of the work in adding the new interface was in finding and fixing all of the places in the kernel that used inodes directly, and code that contained implicit knowledge of inodes or disk layout.

Only a few of the filesystem routines in the kernel had to be completely rewritten to use vnodes. Namei, the routine that does pathname lookup, was changed to use the vnode lookup operation, and cleaned up so that it doesn't use global state. The direnter routine, which adds new directory entries (used by create, rename, etc.), also had to be fixed because it depended on the global state from namei. Direnter also had to be modified to do directory locking during directory rename operations because inode locking is no longer available at this level and vnodes are never locked.

To avoid having a fixed upper limit on the number of active vnode and VFS structures we added a memory allocator to the kernel so that these and other structures can be allocated and freed dynamically.

A new system call, getdirentries, was added to read directory entries from different types of filesystems. The 4.2 readdir library routine was modified to use the new system call so programs would not have to be rewritten. This change does, however, mean that programs that use readdir have to be relinked.

Beginning in March, the user level RPC and XDR libraries were ported to the kernel and we were able to make kernel to user and kernel to kernel RPC calls in June. We worked on RPC performance for about a month until the round trip time for a kernel to kernel null RPC call was 8.8 milliseconds. The performance tuning included several speed ups to the UDP and IP code in the kernel.

Once RPC and the vnode kernel were in place, the implementation of NFS was simply a matter of writing the XDR routines to do the NFS protocol, implementing an RPC server for the NFS procedures in the kernel, and implementing a filesystem interface which translates vnode operations into NFS remote procedure calls. The first NFS kernel was up and running in mid August. At this point we had to make some modifications to the vnode interface to allow the NFS server to do synchronous write operations. This was necessary since unwritten blocks in the server's buffer cache are part of the "client's state".

Our first implementation of the MOUNT protocol was built into the NFS protocol. It wasn't until later that we broke the MOUNT protocol into a separate, user level RPC service. The MOUNT server is a user level daemon that is started automatically when a mount request comes in. It checks the file /etc/exports, which contains a list of exported filesystems and the clients that can import them. If the client has import permission, the mount daemon does a getfh system call to convert a pathname into an fhandle which is returned to the client.

On the client side, the mount command was modified to take additional arguments including a filesystem type and options string. The filesystem type allows one mount command to mount any type of filesystem. The options string is used to pass optional flags to the different filesystem mount system calls. For example, the NFS allows two flavors of mount, soft and hard. A hard mounted filesystem will retry NFS calls forever if the server goes down, while a soft mount gives up after a while and returns an error. The problem with soft mounts is that most UNIX programs are not very good about checking return status from system calls, so you can get some strange behavior when servers go down. A hard mounted filesystem, on the other hand, will never fail due to a server crash; it may cause processes to hang for a while, but data will not be lost.

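The difference between the two flavors comes down to how the client-side RPC loop bounds its retries; the fragment below is a simplified sketch of that policy, with invented names and an arbitrary retry limit rather than the real mount option plumbing.

    /*
     * Sketch of the client retry policy behind hard and soft mounts.
     * nfs_rpc_once() stands in for a single timed RPC attempt; the
     * retry limit and error code are illustrative.
     */
    #include <errno.h>

    struct nfsmount;             /* per-mount state, assumed defined elsewhere */
    struct rpc_req;              /* one encoded NFS request, assumed elsewhere */

    extern int nfs_rpc_once(struct nfsmount *mnt, struct rpc_req *req);

    #define SOFT_RETRY_LIMIT 5   /* illustrative; a real client would tune this */

    int
    nfs_call(struct nfsmount *mnt, struct rpc_req *req, int soft)
    {
        int tries = 0;

        for (;;) {
            if (nfs_rpc_once(mnt, req) == 0)
                return 0;                     /* the server answered                  */
            if (soft && ++tries >= SOFT_RETRY_LIMIT)
                return ETIMEDOUT;             /* soft mount: give up and report error */
            /* hard mount: retry forever; the process simply waits for the server */
        }
    }
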
In addition to the MOUNT server, we have added NFS server daemons. These are user level processes that make an nfsd system call into the kernel and never return. This provides a user context to the kernel NFS server which allows the server to sleep. Similarly, the block I/O daemon on the client side is a user level process that lives in the kernel and services asynchronous block I/O requests. Because the RPC requests are blocking, a user context is necessary to wait for read-ahead and write-behind requests to complete. These daemons provide a temporary solution to the problem of handling parallel, synchronous requests in the kernel. In the future we hope to use a light-weight process mechanism in the kernel to handle these requests.

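The daemon itself can be tiny, since all of the interesting work happens on the kernel side of the system call; the sketch below shows the general shape of such a wrapper. The nfsd() prototype and the habit of forking several copies are assumptions for illustration, not the shipped program.

    /*
     * Sketch of an NFS server daemon: a user process that traps into the
     * kernel via the nfsd system call and donates its context forever.
     * The prototype and the number of copies started are illustrative.
     */
    #include <stdio.h>
    #include <unistd.h>

    extern int nfsd(void);        /* assumed wrapper for the nfsd system call */

    int
    main(void)
    {
        int i;

        for (i = 0; i < 3; i++)   /* extra copies let several requests sleep at once */
            if (fork() == 0)
                break;
        nfsd();                   /* never returns while the server is up */
        perror("nfsd");
        return 1;
    }
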
The NFS group started using the NFS in September, and spent the next six months working on performance enhancements and administrative tools to make the NFS easier to install and use. One of the advantages of the NFS was immediately obvious: as the df output below shows, a diskless workstation can have access to more than a Gigabyte of disk!

Filesystem            kbytes    used   avail capacity  Mounted on
/dev/nd0                7445    5788     912      86%  /
/dev/ndp0               5691    2798    2323      65%  /pub
panic:/usr             27487   21398    3340      86%  /usr
fiat:/usr/src         345915  220122   91201      71%  /usr/src
panic:/usr/panic      148371  116505   17028      87%  /usr/panic
galaxy:/usr/galaxy      7429    5150    1536      77%  /usr/galaxy
mercury:/usr/mercury  301719  215179   56368      79%  /usr/mercury
opium:/usr/opium      327599   36392  258447      12%  /usr/opium

The Hard Issues

Several hard design issues were resolved during the development of the NFS. One of the toughest was deciding how we wanted to use the NFS. Lots of flexibility can lead to lots of confusion.

Root Filesystems

Our current NFS implementation does not allow shared NFS root filesystems. There are many hard problems associated with shared root filesystems that we just didn't have time to address. For example, many well-known, machine specific files are on the root filesystem, and too many programs use them. Also, sharing a root filesystem implies sharing /tmp and /dev. Sharing /tmp is a problem because programs create temporary files using their process id, which is not unique across machines. Sharing /dev requires a remote device access system. We considered allowing shared access to /dev by making operations on device nodes appear local. The problem with this simple solution is that many programs make special use of the ownership and permissions of device nodes.

Since every client has private storage (either real disk or ND) for the root filesystem, we were able to move machine specific files from shared filesystems into a new directory called /private, and replace those files with symbolic links. Things like /usr/lib/crontab and the whole directory /usr/adm have been moved. This allows clients to boot with only /etc and /bin executables local. The /usr and other filesystems are then remote mounted.

Filesystem Naming

Servers export whole filesystems, but clients can mount any sub-directory of a remote filesystem on top of a local filesystem, or on top of another remote filesystem. In fact, a remote filesystem can be mounted more than once, and can even be mounted on another copy of itself! This means that clients can have different "names" for filesystems by mounting them in different places.

To alleviate some of the confusion we use a set of basic mounted filesystems on each machine and then let users add other filesystems on top of that. Remember, though, that this is just policy; there is no mechanism in the NFS to enforce this.

User home directories are mounted on /usr/servername. This may seem like a violation of our goals because hostnames are now part of pathnames, but in fact the directories could have been called /usr/1, /usr/2, etc. Using server names is just a convenience. This scheme makes workstations look more like timesharing terminals because a user can log in to any workstation and her home directory will be there. It also makes tilde expansion (~username is expanded to the user's home directory) in the shell work in a network with many workstations.

To avoid the problems of loop detection and dynamic filesystem access checking, servers do not cross mount points on remote lookup requests. This means that in order to see the same filesystem layout as a server, a client has to remote mount each of the server's exported filesystems.

Credentials, Authentication and Security

We wanted to use UNIX style permission checking on the server and client so that UNIX users would see very little difference between remote and local files. RPC allows different authentication parameters to be "plugged-in" to the packet header of each call, so we were able to make the NFS use a UNIX flavor authenticator to pass uid, gid and groups on each call. The server uses the authentication parameters to do permission checking as if the user making the call were doing the operation locally.

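A rough sketch of the kind of credential block such an authenticator carries on each request is shown below; the field names and the group-list limit are assumptions made for the sketch rather than the actual RPC authentication layout.

    /*
     * Sketch of a UNIX-flavor credential as it might travel in the RPC
     * header of every NFS request.  Field names and the group limit are
     * illustrative assumptions.
     */
    #define MAXGROUPS 8

    struct unix_cred {
        long stamp;                 /* arbitrary client-chosen id       */
        char machinename[32];       /* name of the calling machine      */
        int  uid;                   /* effective user id of the caller  */
        int  gid;                   /* effective group id               */
        int  ngroups;               /* how many entries in groups[]     */
        int  groups[MAXGROUPS];     /* supplementary group list         */
    };

    /*
     * The server checks permissions against these ids exactly as if the
     * same user had issued the operation on the server machine itself.
     */
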
The problem with this authentication method is that the mapping from uid and gid to user must be the same on the server and client. This implies a flat uid, gid space over a whole local network. This is not acceptable in the long run and we are working on different authentication schemes. In the mean time, we have developed another RPC based service called the Yellow Pages (YP) to provide a simple, replicated database lookup service. By letting YP handle /etc/passwd and /etc/group we make the flat uid space much easier to administrate.

Another issue related to client authentication is super-user access to remote files. It is not clear that the super-user on a workstation should have root access to files on a server machine through the NFS. To solve this problem the server maps user root (uid 0) to user nobody before checking access permission. This solves the problem but, unfortunately, causes some strange behavior for users logged in as root, since root may have fewer access rights to a file than a normal user.

Remote root access also affects programs which are set-uid root and need access to remote user files, for example lpr. To make these programs more likely to succeed we check on the client side for RPC calls that fail with EACCES and retry the call with the real-uid instead of the effective-uid. This is only done when the effective-uid is zero and the real-uid is something other than zero, so normal users are not affected.

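That check is a small wrapper around the client's RPC call; a sketch of the idea follows, with invented helper names and a minimal stand-in for the credential structure.

    /*
     * Sketch of the client-side EACCES fallback for set-uid root programs.
     * nfs_call_as() and the credential layout are invented for this sketch.
     */
    #include <errno.h>

    struct nfs_req;                             /* opaque encoded request, assumed elsewhere      */
    struct ucred { int cr_uid; int cr_ruid; };  /* minimal stand-in for the caller's credentials  */

    extern int nfs_call_as(struct nfs_req *req, int uid);  /* issue the RPC as this uid (invented) */

    int
    nfs_call_checked(struct nfs_req *req, struct ucred *cred)
    {
        int err = nfs_call_as(req, cred->cr_uid);          /* first try with the effective-uid     */

        if (err == EACCES && cred->cr_uid == 0 && cred->cr_ruid != 0)
            err = nfs_call_as(req, cred->cr_ruid);         /* root only: retry with the real-uid   */
        return err;
    }
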
While restricting super-user access helps to protect remote files, the super-user on a client machine can still gain access by using su to change her effective-uid to the uid of the owner of a remote file.

Concurrent Access and File Locking

The NFS does not support remote file locking. We purposely did not include this as part of the protocol because we could not find a set of locking facilities that everyone agrees is correct. Instead we plan to build separate, RPC based file locking facilities. In this way people can use the locking facility with the flavor of their choice with minimal effort.

Related to the problem of file locking is concurrent access to remote files by multiple clients. In the local filesystem, file modifications are locked at the inode level. This prevents two processes writing to the same file from intermixing data on a single write. Since the server maintains no locks between requests, and a write may span several RPC requests, two clients writing to the same remote file may get intermixed data on long writes.

UNIX Open File Semantics

We tried very hard to make the NFS client obey UNIX filesystem semantics without modifying the server or the protocol. In some cases this was hard to do. For example, UNIX allows removal of open files. A process can open a file, then remove the directory entry for the file so that it has no name anywhere in the filesystem, and still read and write the file. This is a disgusting bit of UNIX trivia and at first we were just not going to support it, but it turns out that all of the programs that we didn't want to have to fix (csh, sendmail, etc.) use this for temporary files.

What we did to make open file removal work on remote files was check in the client VFS remove operation if the file is open, and if so rename it instead of removing it. This makes it (sort of) invisible to the client and still allows reading and writing. The client kernel then removes the new name when the vnode becomes inactive. We call this the 3/4 solution because if the client crashes between the rename and remove, a garbage file is left on the server. An entry to cron can be added to clean up on the server.

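A condensed sketch of that client-side remove path is shown below; the helper names, the hidden-name field, and the clean-up hook are invented for the sketch, and the bookkeeping a real client needs is only hinted at.

    /*
     * Sketch of the "3/4 solution": in the client remove path an open
     * file is renamed to a hidden temporary name instead of being removed,
     * and the hidden name is deleted when the vnode goes inactive.
     * All names below are invented for this sketch.
     */
    #include <string.h>
    #include <stdlib.h>

    struct vnode {
        char *v_hidden_name;    /* invented field: temporary name while the file stays open */
        /* ... the rest of the vnode is omitted ... */
    };

    extern int  vnode_is_open(struct vnode *vp);
    extern void silly_name(char *buf, size_t len, struct vnode *vp);   /* make up a hidden name */
    extern int  vn_remove(struct vnode *dvp, char *name);
    extern int  vn_rename(struct vnode *dvp, char *from, struct vnode *todvp, char *to);

    int
    nfs_client_remove(struct vnode *dvp, char *name, struct vnode *vp)
    {
        char hidden[32];

        if (!vnode_is_open(vp))
            return vn_remove(dvp, name);            /* normal case: remove immediately       */

        silly_name(hidden, sizeof(hidden), vp);     /* a name the user will never look for   */
        vp->v_hidden_name = strdup(hidden);         /* remember it for clean up at inactive  */
        return vn_rename(dvp, name, dvp, hidden);   /* hide the file; reads and writes go on */
    }

    void
    nfs_client_inactive(struct vnode *dvp, struct vnode *vp)
    {
        if (vp->v_hidden_name != NULL) {
            vn_remove(dvp, vp->v_hidden_name);      /* last reference gone: delete the hidden name */
            free(vp->v_hidden_name);
            vp->v_hidden_name = NULL;
        }
    }
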
Another problem associated with remote, open files is that access permission on the file can change while the file is open. In the local case the access permission is only checked when the file is opened, but in the remote case permission is checked on every NFS call. This means that if a client program opens a file, then changes the permission bits so that it no longer has read permission, a subsequent read request will fail. To get around this problem we save the client credentials in the file table at open time, and use them in later file access requests.

Not all of the UNIX open file semantics have been preserved because interactions between two clients using the same remote file can not be controlled on a single client. For example, if one client opens a file and another client removes that file, the first client's read request will fail even though the file is still open.

Time Skew

Time skew between two clients, or a client and a server, can cause the time associated with a file to be inconsistent. For example, ranlib saves the current time in a library entry, and ld checks the modify time of the library against the time saved in the library. When ranlib is run on a remote file the modify time comes from the server while the current time that gets saved in the library comes from the client. If the server's time is far ahead of the client's, it looks to ld like the library is out of date. There were only three programs that we found that were affected by this, ranlib, ls and emacs, so we fixed them.

This is a potential problem for any program that compares system time to file modification time. We plan to fix this by limiting the time skew between machines with a time synchronization protocol.

Performance

The final hard issue is the one everyone is most interested in: performance.

Much of the time since the NFS first came up has been spent in improving performance. Our goal was to make NFS faster than the ND in the 1.1 Sun release (about 80% of the speed of a local disk). The speed we are interested in is not raw throughput, but how long it takes to do normal work.

set of benchm
