`
`FILE
`SYSTEM
`IMPLEMENTATION
`
`As we saw in Chapter 10, the file system provides the mechanism for on(cid:173)
`line storage and access to both data and programs. The file system resides
`permanently on secondary storage, which has the main requirement that it
`must be able to hold a large amount of data, permanently. This chapter is
`primarily concerned with issues concerning file storage and access on the
`most. common secondary-storage medium, the disk. We explore ways to
`allocate disk space, to recover freed space, to track the locations of data,
`and to interface other parts of the operating system to secondary storage.
`Performance issues are considered throughout the chapter.
`
`11.1 • File-System Structure
`
`Disks provide the bulk of secondary storage on which a file system is
`maintained. To improve I/0 efficiency, 110 transfers between memory and
`disk are performed in units of blocks. Each block is one or more sectors.
`Depending on the disk drive, sectors vary from 32 bytes to 4096 bytes;
`usually, they are 512 bytes. Disks have two important characteristics that
`mal<e them a convenient medium for storing multiple files:
`
`1. They can be rewritten in place; it is possible to read a block from the
`disk, to modify the block, and to write it back into the same place.
`
`2. We can access directly any given block of information on the disk.
`Thus, it is simple to access any file either sequentially or randomly,
`
`383
`
`Apple 1013 (Part 3 of 4)
`U.S. Pat. 8,504,746
`
`
`
`384 • Chapter 11: File-System Implementation
`
`and switching from one file to another requires only moving the
`read-write heads and waiting for the disk to revolve.
`
`We discuss disk structure in great detail in Chapter 12.
`
`·11.1.1 File-System Organization
`To provide an efficient and convenie:nt access to the disk, the operating
`system imposes a file system to allow the data to be stored, located, and
`retrieved easily. A file system poses two quite different design problems.
`The first problem is defining how the file system should look to the user.
`This task involves the definition of a file and its attributes, operations
`allowed on a file, and the directory structure for organizing the files. Next,
`algorithms and data structures must be created to map the logical file
`system onto the physical secondary-storage devices.
`The file system itself is generally composed of many different levels.
`The structure shown in Figure 11.1 is an example of a layered design. Each
`level in the design uses the features of lower levels to create new features
`for use by higher levels.
`The lowest level, the 110 control, consists of device drivers and interrupt
`ha11diers to transfer information between memory and the disk system. A
`device driver can be thought of as a translator. Its input consists of high(cid:173)
`level commands such as "retrieve block 123." Its output consists of low(cid:173)
`level, hardware-specific instructions, which are used by the hardware
`controller, which interfaces the I/O device to the rest of the system. The
`
`application programs
`
`J
`'
`'
`' 1/0 control ,.
`
`logical file system
`
`file- organization module
`
`basic fil~ system
`
`devices
`
`Figure 11.1 Layered file system.
`
`
`
`11.1 File-System Structure • 385
`
`device driver usually writes specific bit patterns to special locations ·in the
`110 controller's memory to tell the controller on which device location to act
`and what actions to take.
`The basic file system needs only to issue generic commands to the
`appropriate device driver to read and write physical blocks on the disk.
`Each physical block is identified by its numeric disk address (for example,
`drive 1, cylinder 73, surface 2, sector 10).
`The file-organization module knows about files and their logical blocks, as
`well as physical blocks. By knowing the type of file allocation used and the
`location of the file, the file-organization module can translate logical block
`addresses to physical block addresses for the basic file system to transfer.
`Each file's logical blocks are numbered from 0 (or 1) through N, whereas
`the physical-· blocks containing this data usually do not match the logical
`numbers, so a translation is needed to locate each block. The file(cid:173)
`organization module also includes the free-space manager, which tracks
`unallocated blocks and provides these blocks to the file-organization
`module when requested.
`Finally, the logical file system uses the directory structure to provide the
`file-organization module with the information the latter needs, given a
`symbolic file name. The logical file system is also responsible for
`protection and security, as was discussed in Chapter 10 and will be further
`discussed in Chapter 13.
`To create a new file, an application program calls the logical file
`system. The logical file system knows the format of the directory
`structures. To create a new file, it reads the appropriate directory into
`memory, updates it with the new entry, and writes it back to the disk. A
`one with a type field indicating
`directory can be treated exactly as a file -
`that it is a directory. Thus, the logical file system can call the file(cid:173)
`organization module to map the directory 110 into disk-block numbers,
`which are passed on to the basic file system and 110 control system.
`Once the directory has been updated, the logical file system can. use it
`to perform I/O. When a file is opened, the directory structure is searched
`for the desired file entry. It would be possible to search the directory
`structure for every 110 operation, but that would be inefficient. To speed
`the search, the operating system generally keeps in memory a table,
`referred to as the open-file table, consisting of information about all the
`currently opened files (Figure 11.2).
`The first reference to. a file (normally an open) causes the directory
`structure to be searched and the directory entry for this file to be copie~
`into the table of opened files. The index into this table is returned to the
`user program, and all .further ·references are made through the index (a file
`descriptor or file control block),
`rather than with a symbolic name.
`Consequently, as long as the file is not closed, all directory lookups are
`done on the open-file table. All changes to the directory entry are made to
`the open-file table in memory. When the file is closed by all users that
`
`
`
`386 • Chapter 11: File-System Implementation
`
`file name
`TEST.C
`MAIL. TXT
`
`permissions
`rw·
`rw rw
`rw
`
`access dates
`...
`
`pointer to disk block
`~
`
`~
`
`index
`0
`1
`2
`
`n
`
`Figure 11.2 A typical open-file table.
`
`have opened it, the updated entry is copied back to the disk-based
`directory structure.
`Some systems complicate this scheme even further by using multiple
`levels of in-memory tables. For example, in the BSD UNIX file system, each
`process has an open-file table that merely points to a systemwide open-file
`table, which in turn points to a table of active inodes. The active-inodes
`table is an in-memory cache of inodes currently in use, and includes the
`inode index fields that point to the on-disk data blocks. In this way, once
`a file is opened, all but the actual data blocks are in memory for rapid
`access by any process accessing the file. The BSD UNIX system is typical in
`its use of caches wherever disk 110 can be saved. Its cache hit rate of 85
`percent shows that these techniques are well worth implementing. The BSD
`UNIX system is described fully in Chapter 19. The open-file table is detailed
`in Section 10.1.2.
`
`11.1.2 File-System Mounting
`Just as a file must be opened before it is used, a file system must be
`mounted before it can be available to processes on the system. The mount
`procedure is straightforward. The operating system is given the name of
`. the device, and the location within the file structure at which to attach the
`file system (called the mount point). For instance, on a UNIX system, a file
`system containing user's home directories might be mounted as /home;
`then, to access the directory struch~re within that file system, one could
`precede the directory names with /home, as in /home/jane. Mounting that
`file system under /users would result in the path name !users/jane to reach
`the same directory.
`Next, the operating system verifies that the device contains a valid file
`system. It does so by asking the device driver to read the device directory
`and verifying that the directory has the expected format. Finally, the
`operating system notes in its directory structure that a file system is
`mounted at the specified mount point. This scheme enables· the operating
`
`
`
`11.2 Allocation Methods • 387
`
`system to traverse its directory structure, switching among file systems as
`appropriate.
`Consider the actions of the Macintosh Operating System. Whenever
`the system encounters a disk for the first time (hard disks are found at
`boot time, floppy disks are seen when they are inserted into the drive), the
`Macintosh Operating System searches for a file system on the device. If it
`finds one, it automatically mounts the file system at the root level, adding
`a folder icon on the screen labeled with the name of the file system (as
`stored in the device directory). The user is then able to click on the icon
`and thus to display-the newly mounted file system.
`File system mounting is further discussed in Sections 17.6.2 and 19.7.5.
`
`11.2 • Allocation Methods
`
`The direct-access nature of disks allows us flexibility in the implementation
`of files. In almost every case, many files will be stored on the same disk.
`The main problem is how to allocate space to these files so that disk space
`is utilized effectively and files can be accessed quickly. Three major
`methods of allocating disk space are in wide use: contiguous, linked, and
`indexed. Each method has its advantages and disadvantages. Accordingly,
`some systems (such as Data General's RDOS for its Nova line of computers)
`support all three. More common, a system will use one particular method
`for all files.
`
`11.2.1 Contiguous Allocation
`The contiguous allocation method requires each file to occupy a set of
`contiguous blocks on the disk. Disk addresses define a linear ordering on
`the disk. Notice that, with this ordering, accessing block b + 1 after block b
`normally requires no head movement. When head movement is needed
`(from the last sector of one cylinder to the first sector of the next cylinder),
`it is only one track. Thus, the number of disk seeks required for accessing
`contiguously allocated files is minimal, as is seek time when a seek is
`finally needed. The IBM VM/CMS operating system uses contiguous
`allocation because it provides such good performance.
`Contiguous allocation of a file is defined by the disk address and
`length (in block units) of the first block. If the file is n blocks long, and
`starts at location b, then it occupies blocks b, b + 1, b + 2, ... , b + n - 1.
`The directory entry for each file indicates the address of the starting block.
`and the length of the area allocated for this file (Figure 11.3).
`Accessing a file that has been allocated contiguously is easy. For
`sequential access, the file system remembers the disk address of the last
`block referenced and, when necessary, reads the next block. For direct
`access to block i of a file that starts at block b, we can immediately access
`
`
`
`Chapter 11: File-System Implementation
`
`directory
`
`file
`count
`tr
`fist
`f
`
`start
`0
`14
`19
`28
`6
`
`length
`2
`3
`6
`4
`2
`
`Figure 11.3 Contiguous allocation of disk
`
`block b + i. Thus, both sequential and direct access can be
`contiguous allocation.
`finding
`One difficulty with contiguous allocation
`. The implementation of the free-space management
`Section 11.3, determines how
`this
`task
`management system can be used, but some are slower than
`The contiguous disk-space-aiJocation problem can be
`application of the general dynamic
`""""'""""'- in Section 8.4, which is how to satisfy a
`holes. First-fit and best-fit are the most common
`to select a
`hole from the set of available
`shown that both first-fit and best-fit are more efficient
`terms of both tin1e and storage utilization. Neither
`generally ...... "~.._.._
`best in terms of storage utilization, but first-fit
`These algorithms suffer from external
`fragmentation.
`allocated and deleted, the free disk space
`broken into
`fragmentation exists whenever free space
`broken into
`becomes a problem when the largest contiguous chunk
`a request; storage is fragmented into a number of holes, no one
`enough to store the data. Depending on the total ~u''""''"'
`and the average file size, external fragmentation may be
`or a major problem.
`·
`
`
`
`11.2 Allocation Methods • 389
`
`Some older microcomputer systems used contiguous allocation on
`floppy disks. To prevent loss of significant amounts of disk space to
`external fragmentation, the user had to run a repacking routine that copied
`the entire file system onto another floppy disk or onto a tape. The original
`floppy disk was then freed completely, creating one large contiguous free
`space. The routine then copied the files back onto the floppy disk by
`allocating contiguous space from this one large hole. This scheme
`effectively compacts all free space into one contiguous space, solving the
`fragmentation problem. The cost of this compaction is time. Copying all
`the files from a disk to compact space may take hours and may be
`necessary on a weekly basis. During this down time, normal system
`operation cannot continue, so such compaction is avoided at all costs on
`production machines.
`There are other problems with contiguous allocation. A major problem
`is determining how much space is needed for a file. When the file is
`created, the total amount of space it will need must be found and
`allocated. How does the creator (program or person) know the size of the
`file to be created? In some cases, this determination may be fairly simple
`(copying an existing file, for example); in general, however, the size of an
`output file may be difficult to estimate.
`If we allocate too little space to a file, we may find that that file cannot
`be extended. Especially with a best-fit allocation strategy, the space on
`both sides of the file may be in use. Hence, we cannot make the file larger
`in place. Two possibilities then exist. First, the user program can be
`terminated, with an appropriate error message. The user must then
`allocat~ more space and run the program again. These repeated runs may
`be costly. To prevent them, the user will normally overestimate the
`amount of space needed, resulting in considerable wasted space.
`The other possibility is to find a larger hole, to copy the contents of the
`file to the new space, and to release the previous space. This series of
`actions may be repeated as long as space exists, although it can also be
`time-consuming. Notice, however, that in this case the user never needs to
`be informed explicitly about what is happening; the system continues
`despite the problem, although more and more slowly.
`Even if the total amount of space needed for a file is known in
`advance, preallocation may be inefficient. A file that grows slowly over a
`long period (months or years) must be allocated enough space for its final
`size, even though much of that space may be unused for a long time. The
`file therefore has a large amount of internal fragmentation.
`To avoid several of these drawbacks, some operating systems use a
`modified contiguous allocation scheme, where a contiguous chunk of space
`is allocated initially, and then, when that amount is not large enough,
`ctnother chunk of contiguous space, an extent, is added to the initial
`allocation. The location of a file's blocks is then recorded as a location and
`a block count,· plus a link to the first block of the next extent. On some
`
`
`
`390
`
`Chapter 11: File·System Implementation
`
`the owner of the file can set the extent
`results in inefficiencies if the owner
`incorrect.
`Internal
`can still be a problem if the extents are
`too
`large,
`fragmentation can be a problem as extents of varying
`deallocated.
`
`Allocation
`2
`allocation solves all problems of contiguous allocation.
`allocation, each file is a linked list of disk blocks; the disk blocks
`anywhere on the disk. The directory contains a pointer
`and last blocks of the file. For example, a file of five blocks
`block 9, continue at block 16, then block 1, block 10, and finally
`11.4). Each block contains a pointer to the next block.
`pointers are not made available to the user. Thus, if
`block
`bytes, and a disk address (the pointer) requires 4 bytes, then
`user sees
`blocks of 508 bytes.
`To create a new file, we simply create a new entry
`With linked allocation, each directory entry has a pointer
`block of the file. This pointer is initialized to nil (the
`value) to signify an empty file. The size field
`also set to 0.
`file causes a free block to be found via the
`
`directory
`start
`9
`
`end
`25
`
`file
`jeep
`
`Figure 11.4 Linked allocation of disk space.
`
`
`
`11.2 Allocation Methods • 391
`
`system, and this new block is then written to, and is linked to the end of
`the file. To read a file, we simply re~d blocks by following the pointers
`from block to block.
`There is no external fragmentation with linked allocation, and any free
`block on the free-space list can be used to satisfy a request. Notice also .
`that there is no need to declare the size of a file when that file is created.
`A file can continue to grow as long as there are free blocks. Consequently,
`it is never necessary to compact disk space.
`Linked allocation does have disadvantages, however. The major
`problem is that it can be used effectively for only sequential-access files. To
`find the ith block of a file, we must start at the beginning of that file, and
`follow the pointers until we get to the ith block. Each access to a pointer
`requires a disk read, and sometimes a disk seek. Consequently, it is
`inefficient to support a direct-access capability for linked allocation files.
`Another disadvantage to linked allocation is the space required for the
`pointers. If a pointer requires 4 bytes out of a 512-byte block, then 0.78
`percent of the disk is being used for pointers, rather than for information.
`Each file requires slightly more space than it otherwise would.
`The usual solution to this problem is to collect blocks into multiples,
`called clusters, and to allocate the clusters rather than blocks. For instance,
`the file system may define a cluster as 4 blocks, and operate on the disk in
`only cluster units. Pointers then use a much smaller percentage of the
`file's disk space. This method allows the logical-to-physical block mapping
`to remain simple, but improves disk throughput (fewer disk head seeks)
`and decreases
`the space needed for block allocation and free-list
`management. The cost of this approach is an increase in internal
`fragmentation, because more space is wasted if a cluster is partially full
`than when a block is partially full. Clusters can be used to optimize disk
`access for many other algorithms, so they ;:lre used in most operating
`systems.
`Yet another problem is reliability. Since the files are linked together by
`pointers scattered all over the disk, consider what would happen if a
`pointer were lost or damaged. A bug in the operating-system software or a
`disk hardware failure might result in picking up the wrong pointer. This
`error could result in linking into the free-space list or into another file. One
`partial solution is to use doubly linked lists or to store the file name and
`relative block number in each block; however, these schemes require even
`more overhead for each file.
`An important variation on the linked allocation method is the use of a.
`file-allocation table (FAT). This simple but efficient method of disk-space
`allocation is used by the MS-DOS and OS/2 operating systems. A section of
`disk at the beginning of each partition is set aside to contain the table.
`The table has one entry for each disk block, and is indexed by block
`number. The FAT is used much as is a linked list. The directory entry
`contains the block number of the first block of the file. The table entry
`
`
`
`392 II Chapter 11: File-System Implementation
`
`indexed by that block number then contains the block number
`block in the file. This chain continues until the last block,
`special end-of-file value as the table entry. Unused blocks are
`a 0 table value. Allocating a new block to a file is a simple
`finding the first 0-valued table entry, and replacing the previous
`value with the address of the new block. The 0 is then replaced
`the FAT structure of H•c.rn"'""
`end-,of-file value. An illustrative example
`for a file consisting of disk blocks 217, 618, and 3311.
`Note that the FAT allocation scheme can result in a significant
`of head seeks, unless the FAT is cached. The disk head must move
`start of the partition to read the FAT and find the location of
`question, then move to the location of the block itself. In the
`both moves occur for each block. A benefit is that random access
`optimized, because the disk head can find the location of any
`by
`rectding the information in the FAT.
`
`11.2.3 Indexed Allocation
`Linked allocation solves the external-fragmentation and
`problems of contiguous allocation. However, linked
`support efficient direct access, since the pointers to the blocks are
`
`directory entry
`
`name
`
`start block
`
`339
`
`FAT
`
`Figure 11.5 File-allocation table.
`
`
`
`with the blocks themselves all over the disk and need to be
`order. Indexed allocation solves this problem by bringing
`together into one location: the index block.
`Each file has its own index block, which is an array
`ith
`addresses. The ith entry in the index block points to
`file. The directory contains the address of the index block (Figure
`read the ith block, we use the pointer in the ith index-block
`and read the desired block. this scheme is similar to the
`described in Chapter 8.
`When the file is created, all pointers in the index block
`When the ith block is first written, a block is obtained from
`manager, and its address is put in the ith index-block entry.
`Indexed allocation supports direct access, without
`external fragmentation, because any free block on the disk
`request for more space.
`Indexed allocation does suffer from wasted
`overhead of the index block is generally greater than the
`of linked allocation. Consider a common case in which we
`only one or two blocks. With linked allocation, we lose
`one pointer per block (one or two pointers). With
`entire index block must be allocated, even
`only one or
`be non-nil.
`
`directory
`
`file
`jeep
`
`index block
`19
`
`Figure 11.6
`
`Indexed allocation of disk
`
`
`
`394 • Chapter 11: File-System Implementation
`
`This point raises the question of how large the index block should be.
`Every file must have an index block, so we want the index block to be as
`small as possible. If the index block is too small, however, it will not be
`able to hold enough pointers for a large file, and a mechanism will have to
`be available to deal with this issue:
`
`• Linked scheme. An index block is normally one disk block. Thus, it
`can be read and written directly by itself. To allow for large files, we
`may link together several index blocks. For example, an index block
`might contain a small header giving the name of the file, and a set of
`the first 100 disk-block addresses. The next address (the last word in
`the index block) is nil (for a small file) or is a pointer to another index
`block (for a large file).
`
`• Multilevel index. A variant of the linked representation is to use a
`separate index block to point to the index blocks, which point to the
`file blocks themselves. To access a block, the operating system uses the
`first-level irtdex to find the second-level index to find the desired data
`block. This approach could be continued to a third or fourth level,
`depending on the desired maximum file size. With 4096-byte blocks
`(through clustering), we can get 1024 4-byte pointers into an index block.
`Two levels of indexes allow 1,048,576 data blocks, which allows a file of
`up to 4 gigabytes. This number of bytes currently exceeds the physical
`capacity of most individual disk drives.
`
`• Combined scheme. Another alternative, used in the BSD UNIX system,
`is to keep the first, say, 15 pointers of the index block in the file's
`index block (or inode). (The directory entry points to the inode, as
`discussed in Section 19.7.) The first 12 of these pointers point to direct
`blocks; that is, they contain addresses of blocks that contain data of the
`file. Thus, the data for small (no more thfn 12 blocks) files do not peed
`a separate index block. If the block size· is 4K, then up to 48K of data
`may be accessed directly. The next three pointers point to indirect
`blocks. The first indirect block pointer is the address of a single indirect
`block. The single indirect block is an index block, containing not data,
`but rather the addresses of blocks that do contain data. Then there is a
`double indirect block pointer, which contains the address of a block that
`contains the addresses of blocks that contain pointers to the actual data
`blocks. The last pointer would contain the address of a triple indirect
`block. Under this method; the number of blocks that can be allocated
`to a file exceeds the amount of space addressable by the 4-byte file
`pointers used by the operating s~stem or passed in system calls. The
`32-bit file pointer reaches only 2 2 bytes, or 4 gigabytes. An inode is
`shown in Figure 11.7.
`
`
`
`mode
`
`owners (2)
`
`timestamps (3)
`
`size
`
`block count
`
`Figure 11.7 The UNIX in ode.
`
`Note that indexed allocation schemes suffer from some
`performance problems as does linked allocation. Specifically/
`blocks can be cached in memory, but the data blocks may be Qn,r€'>r:\t1
`over a partition.
`,
`
`11.2.4 Performance
`The allocation methods that we have discussed vary in
`efficiency and data-block access times. Both are
`selecting the proper method or methods
`an
`implement.
`One difficulty in comparing the performance of the
`determining how the systems will
`A
`sequential access should use a method
`of ""'-"·"""''"'
`with mostly random access.
`any
`requires only one access to get a disk block. Since we can
`initial address of the file in memory, we can calculate
`address of the ith block (or the next block) and read it directly.
`the address of the
`For linked allocation, we can also
`memory and
`it directly. This method
`fine for
`direct access, however, an access to the ith block might
`
`
`
`396 • Chapter 11: File-System Implementation
`
`reads. This problem indicates why linked allocation should not be used for
`an application requiring direct access.
`As a result, some systems support direct-access files by using
`contiguous allocation and sequential access by linked allocation. For these
`systems, the type of access to be made must be declared when the file is
`· created. A file created for sequential access will be linked and cannot be
`used for direct access. A file created for direct access will be contiguous
`and can support both direct access and sequential access, but its maximum
`length must be declared when it is created. Notice that, in this case, the
`operating system must have appropriate data structures and algorithms to
`support both allocation methods. Files can be converted from one type to
`another by the creation of a new file of the desired type, into which the
`contents of the old file are copied. The old file may then be deleted, and
`the new file renamed.
`Indexed allocation is more complex. If the index block is already in
`memory, then the access can be made directly. However, keeping the
`index block in memory requires considerable space. If this memory space is
`not available, then we may have to read first the index block and then the
`desired data block. For a two-level index, two index-block reads might be
`necessary. For an extremely large file, a,cessing a block near the end of the
`file would require reading in all the index blocks to follow the pointer
`chain before the data block finally could be read. Thus, the performance of
`indexed allocation depends on the index structure, on the size of the file,
`and on the position of the block desired.
`Some systems combine contiguous allocation with indexed allocation
`by using contiguous allocation for small files (up to three or four blocks),
`and automatically switching to an indexed allocation if the file grows large.
`Since most files are small, and contiguous allocation is efficient for smaJ.l
`files, average performance can be quite good.
`For instance, the version of UNIX from Sun Microsystems was changed
`in 1991 to improve performance in the file-system allocation algorithm.
`The performance measurements
`indicated
`that
`the maximum disk
`throughput on a typical workstation (12-MIPS Sparcstationl) took 50 percent
`of the CPU and produced a disk bandwidth of only 1.5 megabytes per
`second .. To improve performance, Sun made changes to allocate space in
`· very large clusters (56K) whenever possible. This allocation reduced
`external fragmentation, and thus seek and latency times. In addition, the
`disk.-reading routines were optimized to read· in these large clusters. The
`inode structure was left unchanged. These changes, plus the use of read(cid:173)
`ahead and free-behind (discussed in Section 11.5.2) resulted in 25 percent
`less CPU being used for somewhat improved throughput.
`Many other optimizations are possible and are in use. Given the
`disparity between CPU and disk speed, it is not unreasonable to add
`thousands of extra instructions to the operating system to save just a few
`
`
`
`11.3 Free-Space Management • 397
`
`disk head movements. Furthermore, this disparity is increasing over time,
`to the point where hundreds of thousands of instructions reasonably could
`be used to optimize head movements. For example, research is proceeding
`on the use of artificial-intelligence algorithms to analyze past disk-access
`patterns so as to predict future ones.
`
`11.3 • Free-Space Management
`
`Since there is only a limited amount of disk space, it is necessary to reuse
`the space from deleted files for new files, if possible. (Write-once optical
`disks only allow one write to any given sector, and thus such reuse is not
`physically possible.) To keep track of free disk space,
`the system
`maintains a free-space list. The free-space list records all disk blocks that are
`free -
`those not allocated to some file or directory. To create a file, we
`search the free-space list for the required amount of space, and allocate
`that space to the new file. This space is then removed from the free-space
`list. When a file is deleted, its disk space is added to the free-space list.
`The free-space list, despite its name, might not be implemented as a list, as
`we shall discuss.
`
`11.3.1 Bit Vector
`Frequently, the free-space list is implemented as a bit map or bit vector.
`Each block is represented by 1 bit. If the block is free, the bit is 1; if the
`block is allocated, the bit is 0.
`For example, consider a disk where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12,
`13, 17, 18, 25, 26, and 27 are free, and the rest of the blocks are allocated.
`The free-space bit map would be
`
`001111001111110001100000011100000 ...
`
`The main advantage of this approach is that it is relatively simple and
`efficient to find the first free block, or n consecutive free blocks on the
`disk. Indeed, many computers supply bit-manipulation instructions that
`can be used effectively for that purpose. For example, the Intel 80386 and
`80486 and Motorola 68020 through 68040 microprocessors (the processors
`that power
`recent
`IBM PC compatibles and Macintosh systems,
`respectively) have instructions that return the offset in a word of the first
`bit with the value 1. In fact, the Apple Macintosh Operating System uses
`the bit-vector method to allocate disk space. To find the first free block, the
`Macintosh Operating System checks sequentially each word in the bit map
`to see whether that value is not 0, since a: 0-valued word has all 0 bits and
`represents a set of allocated blocks. The first nonO word is scanned for the
`
`
`
`398 • Chapter 11: File-System Implementation
`
`first 1 bit, which is the location of the first free block. The calculation of
`the block number is
`
`(number of bits per word)* (number of 0-value words) + offset of first 1 bit
`
`functionality.