The Design and Implementation of the
4.4BSD
Operating System

Marshall Kirk McKusick
Consultant

Keith Bostic
Berkeley Software Design, Inc.

Michael J. Karels
Berkeley Software Design, Inc.

John S. Quarterman
Texas Internet Consulting

ADDISON-WESLEY
Boston • San Francisco • New York • Toronto • Montreal
London • Munich • Paris • Madrid
Capetown • Sydney • Tokyo • Singapore • Mexico City

Canon Exhibit 1111

This book is in the Addison-Wesley UNIX and Open Systems Series

Series Editors: Marshall Kirk McKusick and John S. Quarterman
Publishing Partner: Peter S. Gordon
Associate Editor: Deborah R. Lafferty
Associate Production Supervisor: Patricia A. Oduor
Marketing Manager: Bob Donegan
Senior Manufacturing Manager: Roy E. Logan
Cover Designer: Barbara Atkinson
Troff Macro Designer: Jaap Akkerhuis
Copy Editor: Lyn Dupre
Cover Art: John Lasseter

UNIX is a registered trademark of X/Open in the United States and other countries. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps.

The programs and applications presented in this book have been included for their instructional value. They have been tested with care, but are not guaranteed for any particular purpose. The publisher offers no warranties or representations, nor does it accept any liabilities with respect to the programs or applications.

Library of Congress Cataloging-in-Publication Data

The design and implementation of the 4.4BSD operating system /
Marshall Kirk McKusick ... [et al.].
p. cm.
Includes bibliographical references and index.
ISBN 0-201-54979-4
1. UNIX (Computer file) 2. Operating systems (Computers)
I. McKusick, Marshall Kirk.
QA76.76.O63D4743 1996
005.4'3--dc20

96-2433
CIP

Copyright © 1996 by Addison-Wesley Longman, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. Published simultaneously in Canada.

Text printed on recycled and acid-free paper.
ISBN 0201549794
11 12 13 14 15 MA
03 02 01

10th Printing

February 2001

36        Chapter 2    Design Overview of 4.4BSD
This facility allows buffers in different parts of a process address space to be written atomically, without the need to copy them to a single contiguous buffer. Atomic writes are necessary in the case where the underlying abstraction is record based, such as tape drives that output a tape block on each write request. It is also convenient to be able to read a single request into several different buffers (such as a record header into one place and the data into another). Although an application can simulate the ability to scatter data by reading the data into a large buffer and then copying the pieces to their intended destinations, the cost of memory-to-memory copying in such cases often would more than double the running time of the affected application.

Just as send and recv could have been implemented as library interfaces to sendto and recvfrom, it also would have been possible to simulate read with readv and write with writev. However, read and write are used so much more frequently that the added cost of simulating them would not have been worthwhile.
Multiple Filesystem Support

With the expansion of network computing, it became desirable to support both local and remote filesystems. To simplify the support of multiple filesystems, the developers added a new virtual node, or vnode, interface to the kernel. The set of operations exported from the vnode interface appears much like the filesystem operations previously supported by the local filesystem. However, they may be supported by a wide range of filesystem types:

• Local disk-based filesystems

• Files imported using a variety of remote filesystem protocols

• Read-only CD-ROM filesystems

• Filesystems providing special-purpose interfaces, for example, the /proc filesystem

A few variants of 4.4BSD, such as FreeBSD, allow filesystems to be loaded dynamically when the filesystems are first referenced by the mount system call. The vnode interface is described in Section 6.5; its ancillary support routines are described in Section 6.6; several of the special-purpose filesystems are described in Section 6.7.
2.7 Filesystems

A regular file is a linear array of bytes, and can be read and written starting at any byte in the file. The kernel distinguishes no record boundaries in regular files, although many programs recognize line-feed characters as distinguishing the ends of lines, and other programs may impose other structure. No system-related information about a file is kept in the file itself, but the filesystem stores a small amount of ownership, protection, and usage information with each file.
A filename component is a string of up to 255 characters. These filenames are stored in a type of file called a directory. The information in a directory about a file is called a directory entry and includes, in addition to the filename, a pointer to the file itself. Directory entries may refer to other directories, as well as to plain files. A hierarchy of directories and files is thus formed, and is called a filesystem; a small one is shown in Fig. 2.2. Directories may contain subdirectories, and there is no inherent limitation to the depth with which directory nesting may occur. To protect the consistency of the filesystem, the kernel does not permit processes to write directly into directories. A filesystem may include not only plain files and directories, but also references to other objects, such as devices and sockets.

The filesystem forms a tree, the beginning of which is the root directory, sometimes referred to by the name slash, spelled with a single solidus character (/). The root directory contains files; in our example in Fig. 2.2, it contains vmunix, a copy of the kernel-executable object file. It also contains directories; in this example, it contains the usr directory. Within the usr directory is the bin directory, which mostly contains executable object code of programs, such as the files ls and vi.

A process identifies a file by specifying that file's pathname, which is a string composed of zero or more filenames separated by slash (/) characters. The kernel associates two directories with each process for use in interpreting pathnames. A process's root directory is the topmost point in the filesystem that the process can access; it is ordinarily set to the root directory of the entire filesystem. A pathname beginning with a slash is called an absolute pathname, and is interpreted by the kernel starting with the process's root directory.
Figure 2.2 A small filesystem tree.
A pathname that does not begin with a slash is called a relative pathname, and is interpreted relative to the current working directory of the process. (This directory also is known by the shorter names current directory or working directory.) The current directory itself may be referred to directly by the name dot, spelled with a single period (.). The filename dot-dot (..) refers to a directory's parent directory. The root directory is its own parent.

A process may set its root directory with the chroot system call, and its current directory with the chdir system call. Any process may do chdir at any time, but chroot is permitted only to a process with superuser privileges. Chroot is normally used to set up restricted access to the system.

Using the filesystem shown in Fig. 2.2, if a process has the root of the filesystem as its root directory, and has /usr as its current directory, it can refer to the file vi either from the root with the absolute pathname /usr/bin/vi, or from its current directory with the relative pathname bin/vi.
System utilities and databases are kept in certain well-known directories. Part of the well-defined hierarchy includes a directory that contains the home directory for each user, for example, /usr/staff/mckusick and /usr/staff/karels in Fig. 2.2. When users log in, the current working directory of their shell is set to the home directory. Within their home directories, users can create directories as easily as they can regular files. Thus, a user can build arbitrarily complex subhierarchies.
The user usually knows of only one filesystem, but the system may know that this one virtual filesystem is really composed of several physical filesystems, each on a different device. A physical filesystem may not span multiple hardware devices. Since most physical disk devices are divided into several logical devices, there may be more than one filesystem per physical device, but there will be no more than one per logical device. One filesystem, the filesystem that anchors all absolute pathnames, is called the root filesystem, and is always available. Others may be mounted; that is, they may be integrated into the directory hierarchy of the root filesystem. References to a directory that has a filesystem mounted on it are converted transparently by the kernel into references to the root directory of the mounted filesystem.
The link system call takes the name of an existing file and another name to create for that file. After a successful link, the file can be accessed by either filename. A filename can be removed with the unlink system call. When the final name for a file is removed (and the final process that has the file open closes it), the file is deleted.
Files are organized hierarchically in directories. A directory is a type of file, but, in contrast to regular files, a directory has a structure imposed on it by the system. A process can read a directory as it would an ordinary file, but only the kernel is permitted to modify a directory. Directories are created by the mkdir system call and are removed by the rmdir system call. Before 4.2BSD, the mkdir and rmdir operations were implemented as a series of link and unlink system calls. There were three reasons for adding system calls explicitly to create and delete directories:

1. The operation could be made atomic. If the system crashed, the directory would not be left half-constructed, as could happen when a series of link operations were used.

2. When a networked filesystem is being run, the creation and deletion of files and directories need to be specified atomically so that they can be serialized.

3. When supporting non-UNIX filesystems, such as an MS-DOS filesystem, on another partition of the disk, the other filesystem may not support link operations. Although other filesystems might support the concept of directories, they probably would not create and delete the directories with links, as the UNIX filesystem does. Consequently, they could create and delete directories only if explicit directory create and delete requests were presented.
The chown system call sets the owner and group of a file, and chmod changes protection attributes. Stat applied to a filename can be used to read back such properties of a file. The fchown, fchmod, and fstat system calls are applied to a descriptor, instead of to a filename, to do the same set of operations. The rename system call can be used to give a file a new name in the filesystem, replacing one of the file's old names. Like the directory-creation and directory-deletion operations, the rename system call was added to 4.2BSD to provide atomicity to name changes in the local filesystem. Later, it proved useful to export renaming operations explicitly to foreign filesystems and over the network.
The truncate system call was added to 4.2BSD to allow files to be shortened to an arbitrary offset. The call was added primarily in support of the Fortran runtime library, which has semantics such that the end of a random-access file is set to be wherever the program most recently accessed that file. Without the truncate system call, the only way to shorten a file was to copy the desired part to a new file, to delete the old file, then to rename the copy to the original name. Besides being slow, this algorithm could fail on a full filesystem.
Once the filesystem had the ability to shorten files, the kernel took advantage of that ability to shorten large empty directories. The advantage of shortening empty directories is that it reduces the time spent in the kernel searching them when names are being created or deleted.
Newly created files are assigned the user identifier of the process that created them and the group identifier of the directory in which they were created. A three-level access-control mechanism is provided for the protection of files. These three levels specify the accessibility of a file to

1. The user who owns the file

2. The group that owns the file

3. Everyone else
Each level of access has separate indicators for read permission, write permission, and execute permission.
Files are created with zero length, and may grow when they are written. While a file is open, the system maintains a pointer into the file indicating the current location in the file associated with the descriptor. This pointer can be moved about in the file in a random-access fashion. Processes sharing a file descriptor through a fork or dup system call share the current location pointer. Descriptors created by separate open system calls have separate current location pointers. Files may have holes in them. Holes are void areas in the linear extent of the file where data have never been written. A process can create these holes by positioning the pointer past the current end-of-file and writing. When read, holes are treated by the system as zero-valued bytes.
Earlier UNIX systems had a limit of 14 characters per filename component. This limitation was often a problem. For example, in addition to the natural desire of users to give files long descriptive names, a common way of forming filenames is as basename.extension, where the extension (indicating the kind of file, such as .c for C source or .o for intermediate binary object) is one to three characters, leaving 10 to 12 characters for the basename. Source-code-control systems and editors usually take up another two characters, either as a prefix or a suffix, for their purposes, leaving eight to 10 characters. It is easy to use 10 or 12 characters in a single English word as a basename (e.g., "multiplexer").
It is possible to keep within these limits, but it is inconvenient or even dangerous, because other UNIX systems accept strings longer than the limit when creating files, but then truncate to the limit. A C language source file named multiplexer.c (already 13 characters) might have a source-code-control file with s. prepended, producing a filename s.multiplexer that is indistinguishable from the source-code-control file for multiplexer.ms, a file containing troff source for documentation for the C program. The contents of the two original files could easily get confused with no warning from the source-code-control system. Careful coding can detect this problem, but the long filenames first introduced in 4.2BSD practically eliminate it.
2.8 Filestores

The operations defined for local filesystems are divided into two parts. Common to all local filesystems are hierarchical naming, locking, quotas, attribute management, and protection. These features are independent of how the data will be stored. 4.4BSD has a single implementation to provide these semantics.

The other part of the local filesystem is the organization and management of the data on the storage media. Laying out the contents of files on the storage media is the responsibility of the filestore. 4.4BSD supports three different filestore layouts:
• The traditional Berkeley Fast Filesystem

• The log-structured filesystem, based on the Sprite operating-system design [Rosenblum & Ousterhout, 1992]

• A memory-based filesystem
Although the organizations of these filestores are completely different, these differences are indistinguishable to the processes using the filestores.

The Fast Filesystem organizes data into cylinder groups. Files that are likely to be accessed together, based on their locations in the filesystem hierarchy, are stored in the same cylinder group. Files that are not expected to be accessed together are moved into different cylinder groups. Thus, files written at the same time may be placed far apart on the disk.

The log-structured filesystem organizes data as a log. All data being written at any point in time are gathered together, and are written at the same disk location. Data are never overwritten; instead, a new copy of the file is written that replaces the old one. The old files are reclaimed by a garbage-collection process that runs when the filesystem becomes full and additional free space is needed.

The memory-based filesystem is designed to store data in virtual memory. It is used for filesystems that need to support fast but temporary data, such as /tmp. The goal of the memory-based filesystem is to keep the storage packed as compactly as possible to minimize the usage of virtual-memory resources.
2.9 Network Filesystem

Initially, networking was used to transfer data from one machine to another. Later, it evolved to allowing users to log in remotely to another machine. The next logical step was to bring the data to the user, instead of having the user go to the data, and network filesystems were born. Users working locally do not experience the network delays on each keystroke, so they have a more responsive environment.
Bringing the filesystem to a local machine was among the first of the major client-server applications. The server is the remote machine that exports one or more of its filesystems. The client is the local machine that imports those filesystems. From the local client's point of view, a remotely mounted filesystem appears in the file-tree name space just like any other locally mounted filesystem. Local clients can change into directories on the remote filesystem, and can read, write, and execute binaries within that remote filesystem identically to the way that they can do these operations on a local filesystem.

When the local client does an operation on a remote filesystem, the request is packaged and is sent to the server. The server does the requested operation and returns either the requested information or an error indicating why the request was denied.
196        Chapter 6    I/O System Overview

Interrupt Handling

Interrupts are generated by devices to signal that an operation has completed or that a change in status has occurred. On receiving a device interrupt, the system invokes the appropriate device-driver interrupt service routine with one or more parameters that identify uniquely the device that requires service. These parameters are needed because device drivers typically support multiple devices of the same type. If the interrupting device's identity were not supplied with each interrupt, the driver would be forced to poll all the potential devices to identify the device that interrupted.

The system arranges for the unit-number parameter to be passed to the interrupt service routine for each device by installing the address of an auxiliary glue routine in the interrupt-vector table. This glue routine, rather than the actual interrupt service routine, is invoked to service the interrupt; it takes the following actions:
1. Save all volatile registers.

2. Update statistics on device interrupts.

3. Call the interrupt service routine with the appropriate unit-number parameter.

4. Restore the volatile registers saved in step 1.

5. Return from the interrupt.
Because a glue routine is interposed between the interrupt-vector table and the interrupt service routine, device drivers do not need to be concerned with saving and restoring machine state. In addition, special-purpose instructions that cannot be generated from C, which are needed by the hardware to support interrupts, can be kept out of the device driver; this interposition of a glue routine permits device drivers to be written without assembly language.
6.2 Block Devices

Block devices include disks and tapes. The task of the block-device interface is to convert from the user abstraction of a disk as an array of bytes to the structure imposed by the underlying physical medium. Although the user may wish to write a single byte to a disk, the hardware can read and write only in multiples of sectors. Hence, the system must arrange to read in the sector containing the byte to be modified, to replace the affected byte, and to write back the sector to the disk. This operation of converting random access to an array of bytes to reads and writes of disk sectors is known as block I/O. Block devices are accessible directly through appropriate device special files, but are more commonly accessed indirectly through the filesystem (see Section 8.2).
Processes may read data in sizes smaller than a disk block. The first time that a small read is required from a particular disk block, the block will be transferred from the disk into a kernel buffer. Later reads of parts of the same block then require only copying from the kernel buffer to the memory of the user process. Multiple small writes are treated similarly. A buffer is allocated from the cache when the first write to a disk block is made, and later writes to part of the same block are then likely to require only copying into the kernel buffer, and no disk I/O.
In addition to providing the abstraction of arbitrary alignment of reads and writes, the block buffer cache reduces the number of disk I/O transfers required by filesystem accesses. Because system-parameter files, commands, and directories are read repeatedly, their data blocks are usually in the buffer cache when they are needed. Thus, the kernel does not need to read them from the disk every time that they are requested.

If the system crashes while data for a particular block are in the cache but have not yet been written to disk, the filesystem on the disk will be incorrect and those data will be lost. (Critical system data, such as the contents of directories, however, are written synchronously to disk, to ensure filesystem consistency; operations requiring synchronous I/O are described in the last subsection of Section 8.2.) So that lost data are minimized, writes are forced periodically for dirty buffer blocks. These forced writes are done (usually every 30 seconds) by a user process, update, which uses the sync system call. There is also a system call, fsync, that a process can use to force all dirty blocks of a single file to be written to disk immediately; this synchronization is useful for ensuring database consistency or before removing an editor backup file.
Most magnetic-tape accesses are done through the appropriate raw tape device, bypassing the block buffer cache. When the cache is used, tape blocks must still be written in order, so the tape driver forces synchronous writes for them.
Entry Points for Block-Device Drivers

Device drivers for block devices are described by an entry in the bdevsw table. Each bdevsw structure contains the following entry points:
open      Open the device in preparation for I/O operations. A device's open entry point will be called for each open system call on a block special device file, or, internally, when a device is prepared for mounting a filesystem with the mount system call. The open() routine will commonly verify the integrity of the associated medium. For example, it will verify that the device was identified during the autoconfiguration phase and, for tape and disk drives, that a medium is present and on-line.

strategy  Start a read or write operation, and return immediately. I/O requests to or from filesystems located on a device are translated by the system into calls to the block I/O routines bread() and bwrite(). These block I/O routines in turn call the device's strategy routine to read or write data not in the cache. Each call to the strategy routine specifies a pointer to a buf structure containing the parameters for an I/O request. If the request is synchronous, the caller must sleep (on the address of the buf structure) until I/O completes.

close     Close a device. The close() routine is called after the final client interested in using the device terminates. These semantics are defined by the higher-level I/O facilities. Disk devices have nothing to do when a device is closed, and thus use a null close() routine. Devices that support access to only a single client must mark the device as available once again. Closing a tape drive that was open for writing typically causes end-of-file marks to be written on the tape and the tape to be rewound.

dump      Write all physical memory to the device. The dump entry point saves the contents of memory on secondary storage. The system automatically takes a dump when it detects an unrecoverable error and is about to crash. The dump is used in a postmortem analysis of the problem that caused the system to crash. The dump routine is invoked with the processor priority at its highest level; thus, the device driver must poll for device status, rather than wait for interrupts. All disk devices are expected to support this entry point; some tape devices do as well.

psize     Return the size of a disk-drive partition. The driver is supplied a logical unit and is expected to return the size of that unit, typically a disk-drive partition, in DEV_BSIZE blocks. This entry point is used during the bootstrap procedure to calculate the location at which a crash dump should be placed and to determine the sizes of the swap devices.

Sorting of Disk I/O Requests

The kernel provides a generic disksort() routine that can be used by all the disk device drivers to sort I/O requests into a drive's request queue using an elevator sorting algorithm. This algorithm sorts requests in a cyclic, ascending, cylinder order, so that requests can be serviced with a minimal number of one-way scans over the drive. This ordering was originally designed to support the normal read-ahead requested by the filesystem as well as to counteract the filesystem's random placement of data on a drive. With the improved placement algorithms in the current filesystem, the effect of the disksort() routine is less noticeable; disksort() produces the largest effect when there are multiple simultaneous users of a drive.

The disksort() algorithm is shown in Fig. 6.2. A drive's request queue is made up of one or two lists of requests ordered by cylinder number. The request at the front of the first list indicates the current position of the drive. If a second list is present, it is made up of requests that lie before the current position. Each new request is sorted into either the first or the second list, according to the request's location. When the heads reach the end of the first list, the drive begins servicing the other list.
Disk sorting can also be important on machines that have a fast processor, but that do not sort requests within the device driver. In this situation, if a write of several Kbyte is honored in order of queueing, it can block other processes from accessing the disk while it completes. Sorting requests provides some scheduling, which more fairly distributes accesses to the disk controller.

disksort(dq, bp)
	drive queue *dq;
	buffer *bp;
{
	if (drive queue is empty) {
		place the buffer at the front of the drive queue;
		return;
	}
	if (request lies before the first active request) {
		locate the beginning of the second request list;
		sort bp into the second request list;
	} else
		sort bp into the current request list;
}

Figure 6.2 Algorithm for disksort().
Disk Labels

Many disk controllers require the device driver to identify the location of disk sectors that are to be transferred by their cylinder, track, and rotational offset. For maximum throughput efficiency, this information is also needed by the filesystem when deciding how to lay out files. Finally, a disk may be broken up into several partitions, each of which may be used for a separate filesystem or swap area.

Historically, the information about the geometry of the disk and about the layout of the partitions was compiled into the kernel device drivers. This approach had several flaws. First, it was cumbersome to have all the possible disk types and partitions compiled into the kernel. Any time that a disk with a new geometry was added, the driver tables had to be updated and the kernel recompiled. It was also restrictive in that there was only one choice of partition table for each drive type. Choosing a different set of tables required modifying the disk driver and rebuilding the kernel. Installing new tables also required dumping all the disks of that type on the system, then booting the new kernel and restoring them onto the new partitions. Disks with different partition layouts could not be moved from one system to another. An additional problem arose when nonstandard partition tables were used; new releases from the vendor had to have the partition tables modified before they could be used on an existing system.

For all these reasons, 4.4BSD and most commercial UNIX vendors added disk labels. A disk label contains detailed geometry information, including cylinder, track, and sector layout, along with any other driver-specific information. It also contains information about the partition layout and usage, the latter describing partition usage: type of filesystem, swap partition, or unused. For the fast filesystem, the partition usage contains enough additional information to enable the filesystem check program (fsck) to locate the alternate superblocks for the filesystem.
Having labels on each disk means that partition information can be different for each disk, and that it carries over when the disk is moved from one system to another. It also means that, when previously unknown types of disks are connected to the system, the system administrator can use them without changing the disk driver, recompiling, and rebooting the system.

The label is located near the beginning of each drive, usually in block zero. It must be located in the first track, because the device driver does not know the geometry of the disk until the driver has read the label. Thus, it must assume that the label is in cylinder zero, track zero, at some valid offset within that track. Most architectures have hardware (or first-level) bootstrap code stored in read-only memory (ROM). When the machine is powered up or the reset button is pressed, the CPU executes the hardware bootstrap code from the ROM. The hardware bootstrap code typically reads the first few sectors on the disk into the main memory,

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket