`
`and JAMES L. PETERSON
`ABRAHAM SILBERSCHATZ,
`JOHN S. QUARTERMAN,
`Department of Computer Sciences, University of Texas, Austin, Texas 78712
`
`of the 4.2 Berkeley Software Distribution,
`This paper presents an in-depth examination
`Virtual VAX-11 Version
`(4.2BSD), which
`is a version of the UNIX’”
`Time-Sharing
`System. There are notes throughout
`on 4.3BSD,
`the forthcoming
`system
`from
`the
`University
`of California
`at Berkeley. We trace the historical development of the UNIX
`system from
`its conception
`in 1969 until
`today, and describe
`the design principles
`that
`have guided this development. We then present
`the internal data structures and
`algorithms used by the kernel
`to support
`the user interface.
`In particular, we describe
`process management, memory management,
`the file system, the I/O system, and
`communications.
`These are treated
`in as much detail as the UNIX
`licenses will allow. We
`conclude with a brief description
`of the user interface and a set of bibliographic
`notes.
`
`Networks]:
`Categories and Subject Descriptors: C.2.4 [Computer-Communication
`applications; D.4.0 [Operating Systems]: General-
`Distributed Systems--distributed
`UNIX; D.4.7 [Operating Systems]: Organization
`and Design-interactiue
`systems;
`K.2 [History of Computing]: Software--UNIX
`General Terms: Algorithms, Design, Human Factors, Performance, Reliability,
`
`Security
`
`Additional Key Words and Phrases: Flexibility,
`
`portability,
`
`simplicity
`
`INTRODUCTION
`This paper presents an in-depth examina-
`tion of the 4.2BSD operating system, the
`research UNIX’ system developed for the
`Defense Advanced Research Projects
`Agency (DARPA) by the University of Cal-
`ifornia at Berkeley. We have chosen
`4.2BSD over UNIX System V (the UNIX
`system currently being licensed by AT&T)
`because concepts such as internetworking
`and demand paging are implemented
`in
`4.2BSD but not
`in System V. Where
`4.3BSD,
`the
`forthcoming
`system
`from
`
`1 UNIX
`
`is a trademark
`
`of AT&T Bell Laboratories.
`
`from 4.2BSD
`Berkeley, differs functionally
`in the areas of interest, such differences are
`noted.
`This paper is not a critique of the design
`and implementation of 4.2BSD or UNIX;
`it is an explanation. For comparisons of
`System V and 4.2BSD, see the literature,
`particularly
`the references given in Section
`1.1, p. 380. Such comparisons are mostly
`beyond the scope of this paper.
`The VAX’
`implementation
`is used be-
`cause 4.2BSD was developed on the VAX,
`
`’ VAX, PDP, TOPS-20, and VMS are trademarks
`Digital Equipment Corporation.
`
`of
`
`(0 1985 by
`Chapter 14 of Operating Systems Concepts, Second Edition, by J. L. Peterson and A. Silberschatz
`Addison-Wesley,
`Reading, Massachusetts)
`and this article were both derived
`from an earlier common manu-
`script by J. S. Quarterman. Consequently
`they share some text. Common portions
`are reprinted with
`the
`permission of Addison-Wesley.
`Author’s present address: James L. Peterson, MCC, 9430 Research Blvd., Austin, Texas 78759.
`
`the copies are not made or
`that
`is granted provided
`fee all or part of this material
`to copy without
`Permission
`for direct commercial advantage,
`the ACM copyright
`notice and the title of the publication
`and its
`distributed
`date appear, and notice
`is given that copying
`is by permission of the Association
`for Computing Machinery.
`To
`copy otherwise, or to republish,
`requires a fee and/or specific permission.
`0 1986 ACM 0360-0300/85/1200-0379
`$00.75
`
`Computing Surveys, Vol. 17, No. 4, December 1985
`
`INTEL EX. 1242.001
`
`
`
`380
`
`l
`
`J. S. Quarterman, A. Silberschatz, and J. L. Peterson
`
`CONTENTS
`
`INTRODUCTION
`1. OVERVIEW
`1.1 History
`1.2 Design Principles
`2. PROCESSES
`2.1 user
`Interface
`2.2 Control Blocks
`2.3 CPU Scheduling
`3. MEMORY
`MANAGEMENT
`3.1 Paging
`3.2 Swapping
`4. FILE SYSTEM
`4.1 user
`Interface
`4.2
`Implementations
`4.3 Data Structures on the Disk
`4.4 Layout and Allocation
`Policies
`4.5 Mapping a Pathname
`to an Inode
`4.6 Mapping a File Descriptor
`to an Inode
`I/O SYSTEM
`5.1 Block Buffer Cache
`5.2 Raw Device
`Interfaces
`5.3 C-Lists
`6. COMMUNICATIONS
`6.1 Signals
`6.2
`Interprocess Communication
`6.3 Networking
`Systems
`6.4 Distributed
`7. USER
`INTERFACE
`7.1 Shells and Commands
`7.2 Standard
`I/O
`7.3 Pipelines, Filters, and Shell Scripts
`7.4 The UNIX Philosophy
`8. BIBLIOGRAPHIC
`NOTES
`ACKNOWLEDGMENTS
`REFERENCES
`
`5.
`
`and that machine still represents a conven-
`ient point of reference, despite
`the recent
`proliferation
`of implementations
`on other
`hardware
`(such as the Motorola
`68020
`or National
`Semiconductor
`32032). Also,
`details of
`implementation
`for non-VAX
`systems are usually proprietary
`to the com-
`panies
`that did them. And space does not
`permit examination
`of every
`implementa-
`tion on every kind of hardware.
`This paper is not a tutorial on how to use
`UNIX
`or 4.2BSD.
`It
`is assumed that
`the
`reader knows how to use the UNIX system.
`The presentation
`is closely
`limited
`to a
`technical examination
`of traditional
`oper-
`ating
`system and networking
`concepts,
`in the ker-
`most of which are implemented
`
`Computing Surveys, Vol. 17, No. 4, December 1985
`
`nel. Students of operating systems and nov-
`ice systems programmers
`(the
`intended
`readership)
`should
`find
`the organization
`and content appropriate.
`The novice UNIX user will want to read
`Section 7 on the user interface before delv-
`ing into the sections on kernel details. That
`section
`is as brief as possible, because the
`user interface and user programs
`in general
`are (regardless of their
`importance
`to the
`utility
`and popularity
`of the system) be-
`yond the proper scope of this paper. Read-
`ing one of the several good books on using
`UNIX
`(see Section 8, Bibliographic
`Notes)
`would be good preparation
`for reading
`the
`paper.
`The paper begins with a very brief over-
`view of the history of the system and some
`description of the design philosophy behind
`it. The other sections cover process man-
`agement, memory management,
`the
`file
`system,
`the
`I/O system, communications,
`and certain
`features of the user interface
`that distinguish
`the system. The paper con-
`cludes with a set of bibliographic
`notes.
`
`1. OVERVIEW
`
`the history
`is concerned with
`This section
`and design of the UNIX system, which was
`initially
`developed at Bell Laboratories
`as
`a private
`research project of two program-
`mers. Its original elegant design and devel-
`opments
`of
`the
`past
`fifteen
`years
`have made it an important
`and powerful
`operating
`system. We
`trace
`the history
`of the system E [Compton
`1985; Ritchie
`1978, 1984a, 1984b] and relate
`its design
`principles.
`
`1.1 History
`
`first version of UNIX was developed
`The
`at Bell
`Laboratories
`in 1969 by Ken
`Thompson
`to use an otherwise
`idle PDP-7.
`He was soon joined by Dennis Ritchie, and
`the two of them have since been the largest
`influence on what
`is commonly
`known as
`Research UNIX.
`Ritchie, Thompson, and other early Re-
`search UNIX
`developers had previously
`worked on
`the Multics
`project
`[Peirce
`19851, and Multics
`[Organick 19751 was a
`strong
`influence
`on the newer operating
`
`INTEL EX. 1242.002
`
`
`
`4.2BSD and 4.3BSD as Examples of the UNIX System
`
`l
`
`381
`
`is merely a
`system. Even the name UNIX
`pun on Multics,
`indicating
`that
`in areas
`where Multics
`attempted
`to do many
`things, UNIX
`tries
`to do one thing well.
`The basic organization
`of the file system,
`the idea of the command
`interpreter
`(the
`shell) being a user process,
`the use of a
`process per command,
`the original
`line-
`editing characters # and @, and many other
`things come directly
`from Multics.
`sys-
`Ideas from various other operating
`of
`tems, such as Massachusetts
`Institute
`Technology’s CTSS, have also been used.
`The fork operation comes from Berkeley’s
`GENIE
`(XCS-940) operating system.
`The Research UNIX
`systems
`include
`UNIX Time-Sharing System, Sixth Edition
`(commonly
`known as Version 6), which
`was the first version widely available out-
`side Bell Laboratories
`(in 1976) and ran on
`the PDP-11.
`(These version numbers cor-
`respond
`to
`the edition
`numbers of
`the
`UNIX
`Programmer’s Manual
`that were
`current when the distributions were made.)
`Multiprogramming
`was added before Ver-
`sion 6, and after the system was rewritten
`in a high-level
`programming
`language, C
`[Kernighan
`1978; Ritchie
`et al. 19781.
`C was designed and implemented
`for this
`purpose by Dennis Ritchie.
`It is descended
`[Rosler 19841 from
`the
`language B, de-
`signed and implemented
`by Ken Thomp-
`son. B was itself descended from BCPL. C
`continues
`to evolve
`[Stroustrup
`1984;
`Tuthill
`1985a].
`system was
`The
`first portable UNIX
`UNIX Time-Sharing System, Seventh Edi-
`tion (Version 7), which
`ran on the PDP-11
`and the
`Interdata
`8132, and had a VAX
`variety
`called UNIX/32V
`Time-Sharing
`System Version 1.0 (32V). The system cur-
`rently
`in development
`by
`the Research
`group at AT&T Bell Laboratories
`is UNIX
`Time-Sharing
`System, Eighth
`Edition
`(Version 8).
`of Version 7 in
`After
`the distribution
`1978, the Research group gave external dis-
`tributions over to the UNIX Support Group
`(USG). USG had previously
`distributed
`such systems as UNIX Programmer’s Work
`Bench
`(PWB)
`internally,
`and sometimes
`externally
`as well
`[Mohr 19851.
`Their
`first external
`distribution
`Version 7 was UNIX System
`III
`
`after
`(System
`
`features
`incorporated
`in 1982, which
`III),
`of Version 7,32V, and also of several UNIX
`systems developed by groups other than the
`Reseach group. Features of UNIX/RT
`(a
`real-time UNIX
`system) were included, as
`well as many
`features
`from PWB. USG
`released UNIX System V (System V)
`in
`1983; it is largely derived
`from System III.
`The divestiture
`of the various Bell Oper-
`ating Companies
`from AT&T
`has
`left
`AT&T
`in a position
`to market System V
`[Wilson 19851 aggressively. USG has me-
`tamorphosed
`into
`the UNIX
`System
`Development
`Laboratory
`(USDL), whose
`current
`distribution
`is UNIX System V
`Release 2 (V.2), released in 1984.
`system
`The ease with which
`the UNIX
`can be modified has led to development
`work at numerous organizations
`such as
`Rand, Bolt, Beranek and Newman
`(BBN),
`the University
`of Illinois, Harvard, Purdue,
`and even DEC. But the most influential
`of
`the non-Bell Laboratories
`and non-AT&T
`UNIX
`development
`groups has been
`the
`University
`of California
`at Berkeley
`[McKusick
`19851. UNIX
`software
`from
`Berkeley
`is released
`in so-called BerkeZey
`Software Distributions
`(BSD), hence
`the
`generic numbers 2BSD
`for the later PDP-
`11 distributions
`and 4BSD
`for
`the
`later
`VAX distributions.
`ter-
`Many of the
`features of the 4BSD
`minal drivers are from TENEX/TOPS-20,3
`and efficiency
`improvements
`have been
`made as a result of comparisons with VMS.
`The first Berkeley VAX UNIX work was
`the addition
`to 32V of virtual memory, de-
`mand paging, and page replacement
`in 1979
`by Bill Joy and Ozalp Bagaoglu
`to produce
`3BSD. The
`large virtual memory space of
`3BSD allowed
`the development
`of very
`large programs,
`such as Berkeley’s
`own
`Franz Lisp. This memory management
`work convinced DARPA
`to fund Berkeley
`for
`the
`later development
`of a standard
`UNIX
`system for government use (4BSD).
`One of the goals of this project was to
`provide support
`for the DARPA
`Internet
`networking
`protocols TCP/IP.
`This was
`done in a general manner, and it is possible
`to communicate among diverse network
`fa-
`cilities,
`ranging
`from
`local networks
`(such
`
`3 TENEX
`
`is a registered
`
`trademark
`
`of BBN.
`
`Computing Surveys, Vol. 17, No. 4, December 1985
`
`INTEL EX. 1242.003
`
`
`
`382
`
`l
`
`J. S. Quarterman, A. Silberschatz, and J. L. Peterson
`
`and token rings) to long-haul
`as Ethernets
`networks
`(such as DARPA’s ARPANET).
`It is sometimes convenient
`to refer to the
`Berkeley VAX UNIX
`systems
`following
`3BSD as 4BSD, although
`there were ac-
`tually several releases (indicated by decimal
`points
`in the release numbers), most nota-
`bly 4.1BSD. 4BSD was the operating sys-
`tem of choice for VAXs
`from the beginning
`until
`the release of System III
`(1979-1982)
`and remains so for many research or net-
`working
`installations.
`Most organizations
`would buy a 32V license and order 4BSD
`from Berkeley without
`ever bothering
`to
`get a 32V tape. Many
`installations
`inside
`the Bell System ran 4.1BSD (many still do,
`and many others run 4.2BSD).
`The 4BSD work
`for DARPA was guided
`by a steering committee, which
`included
`many notable people
`from
`the UNIX
`and
`communities. 4.2BSD, first dis-
`networking
`tributed
`in 1983, is the culmination
`of the
`original Berkeley DARPA UNIX
`project,
`although
`further
`research
`proceeds
`at
`Berkeley.
`the only organization
`Berkeley was not
`involved
`in
`the development
`of 4.2BSD.
`Contributions
`(such as autoconfiguration,
`job control, and disk quotas) came from
`numerous universities
`and other organiza-
`tions in Australia, Canada, Europe, and the
`United States. A few ideas, such as the fcntl
`system call, were
`taken
`from System V.
`(Licensing and pricing considerations
`have
`prevented
`the use of any actual code from
`System III or System V in 4BSD.) Not only
`are many contributions
`included
`in the dis-
`tributions
`proper, but
`there
`is an accom-
`panying
`set of user-contributed
`software,
`which
`is carried on the tapes containing
`the
`4BSD distributions.
`The system was tested
`on the M68000-based workstation
`by Sun
`Microsystems,
`Inc., before its initial distri-
`bution.
`This
`simultaneous
`development
`contributed
`to the ease of further ports of
`4.2BSD.
`accepts mail
`Berkeley
`bugs
`about
`and their
`fixes at a well-known
`electronic
`address,
`and
`the
`consulting
`company
`mt.Xinu
`distributes
`a bug
`list compiled
`from such submissions. Many of the bug
`fixes may be incorporated
`in future distri-
`
`4 Ethernet
`
`is a trademark
`
`of Xerox Corporation.
`
`Computing Surveys, Vol. 17, No. 4, December 1985
`
`discussion of
`is constant
`butions. There
`4.2BSD)
`in
`UNIX
`in general
`(including
`list UNIX-
`the DARPA
`Internet mailing
`on
`the
`WIZARDS,
`which
`appears
`USENET
`network
`as
`the news group
`net.unix-wizards;
`both
`the
`Internet
`and
`USENET are international
`in scope. There
`is another USENET
`news group dedicated
`to 4BSD bugs. While
`few ideas appear to
`be accepted by Berkeley directly
`from these
`lists and news groups (probably because of
`the difficulty
`of sifting
`through
`the sheer
`volume
`of submissions),
`discussions
`in
`them sometimes
`lead to new facilities being
`written
`that are later accepted.
`of
`Figure 1 is a sketch of the evolution
`the several main branches of the UNIX
`system, especially
`those leading
`to 4.2BSD
`and System V [Chambers and Quarterman
`1983; Uniejewski
`19851. The dates given
`are approximate,
`and there
`is no attempt
`to show all influences. Some of the systems
`named in the figure are not mentioned
`in
`the text, but are included
`to better show
`the relations among the ones that are dis-
`cussed in the text.
`We are aware at this writing of the im-
`minent
`release of 4.3BSD and System V
`Release 2 Version 4. There are few func-
`tional
`changes
`in
`the kernel
`in 4.3BSD,
`although
`there are many performance
`im-
`provements
`[Cabreral et al. 1985; Leffler et
`al. 1984; McKusick
`et al. 19851. (Some of
`these 4.3BSD changes are noted in sections
`throughout
`this paper.) Although System
`V Release 2 Version 4 does introduce pag-
`ing
`[Jung 1985; Miller
`19841 (including
`copy-on-write
`and shared memory)
`to Sys-
`tem V,
`there are
`few other
`functional
`changes.
`Dozens
`including
`
`of computer manufacturers,5
`almost
`all of
`those
`usually
`
`’ These include at least Altos, Amdahl, Apollo, AT&T,
`Burroughs,
`Callan, Celerity,
`Codata, Convergent
`Technologies, Convex, COSI, Cray, Cromemco, Data
`General, DEC, Denelcor,
`Dual Systems, ELXSI,
`Encore, Flexible, Gould, Heurikon, Hewlett Packard,
`Honeywell,
`IBM,
`Integrated
`Business Computers,
`Integrated Solutions,
`Intel,
`Interactive Systems, Logi-
`cal Microcomputer,
`Medical
`Informatics, NBI, NCR,
`National
`Semiconductor,
`Onyx, Pacific Computer,
`Parallel,
`Perkin-Elmer,
`Plexus, Pyramid,
`R Sys-
`tems,
`Radio
`Shack,
`Ridge,
`Sequent,
`Silicon
`Graphics,
`Sperry,
`Sun Microsystems,
`Tektronix,
`Visual Technology,
`and Wicat.
`
`INTEL EX. 1242.004
`
`
`
`: PDp-i i :
`
`lBSD-2BSD
`
`a 28BSD +
`
`2.9BSD +
`
`3BSD-
`
`4.OBSD
`
`J.lcBSD -+iii+~:i$;+
`
`
`
`Bell Research Bell Research
`
`/
`
`
`
`Bell Cdwnbns Bell Cdwnbns
`
`
`
`USG I USDL USG I USDL
`
`
`
`MERT + UNWRT MERT + UNWRT
`
`/
`
`1969
`
`1973 1976
`
`1977/
`
`1978
`
`1979
`
`1980
`
`1981
`
`1982
`
`1983
`
`1984
`
`:
`
`1985
`
`Figure 1. UNIX history.
`
`INTEL EX. 1242.005
`
`
`
`384
`
`l
`
`J. S. Quarterman, A. Silberschatz, and J. L. Peterson
`
`considered major by market share, have
`either announced or introduced
`computers
`that run the UNIX
`system or close deriva-
`tives, and numerous other companies sell
`related peripherals, software packages, sup-
`port,
`training,
`documentation,
`or combi-
`nations of these. The hardware packages
`involved
`range from micros
`through minis,
`multis, and mainframes
`to supercomputers.
`Most of
`these use ports of System V,
`4.2BSD, or mixtures of the two, although
`there are still a variety of machines running
`software based on System III, 4.1BSD, and
`Version 7. There are even some Version 6
`systems still
`in regular operation.
`for
`UNIX
`is also an excellent
`vehicle
`the
`academic study. For example, both
`Tunis operating
`system
`[Holt 19831 and
`the Xinu operating
`system
`[Comer 19841
`are based on the concepts of UNIX,
`but
`were developed explicitly
`for classroom
`study. Ritchie and Thompson were honored
`in 1983 by the ACM Turing award for their
`work on UNIX.
`
`1.2 Design Principles
`
`ancestor, Mul-
`its most influential
`Unlike
`tics, UNIX was not designed by a joint
`project of several major institutions
`and did
`not have project goals and aspirations
`set
`out beforehand
`in a series of papers pre-
`sented at a prestigious professional
`confer-
`ence. UNIX was, instead, orginated
`first by
`one programmer, Ken Thompson, and then
`another, Dennis Ritchie, as a system
`for
`their personal convenience, with no elabo-
`rate plan spelled out beforehand. This
`flex-
`ibility appears to have been one of the key
`factors
`in the development
`of the system.
`There were some design principles
`in-
`volved, however, even though
`they were not
`spelled out at the outset.
`for
`UNIX was designed by programmers
`programmers.
`It has always been interac-
`tive. Multiple
`processes are supported, and
`it is easy for one process to create another
`process. There are standard and
`flexible
`ways of interconnecting
`the input and out-
`put of processes and otherwise coordinating
`several processes
`to do a task.
`It
`is not
`uncommon
`for a book
`typeset using
`the
`
`UNIX system to include both the source of
`programs written
`in a language such as C,
`FORTRAN, Pascal, or LISP, and the out-
`put of the same programs. The manuscript
`of the book itself may be used as the input
`of such programs, and their output may be
`statistics concerning
`the book. It is trivial
`for the programmer
`to set up a mechanism
`whereby a single word (e.g., “make”)
`typed
`at a terminal
`causes the programs
`to be
`compiled and run and both
`their source
`and output
`to be typeset as part of the
`manuscript.
`is the
`flexibility
`reason for this
`A main
`is only
`lack of numerous
`file
`types: There
`one data file
`type as far as the operating
`system is concerned. This
`is a sequence of
`bytes, which may be accessed either
`ran-
`domly or sequentially. There are no “access
`methods” and no “control
`blocks”
`in
`the
`data space of a UNIX
`user process, and
`there are few limitations
`on what a pro-
`cess’s data space may be used
`for. The
`interface
`to the file system is very simple:
`A file is referred
`to by a character string
`for
`opening
`and by an
`integer
`for
`further
`manipulation.
`in directories, which
`Files are grouped
`essentially
`form a tree-structured
`hierar-
`chy. Users may create directories as easily
`as ordinary
`tiles.
`It is not hard to build elaborate database
`access mechanisms on top of UNIX’s
`one
`simple
`file type, as the half-dozen or more
`readily
`available
`databases
`for UNIX
`INGRES, which
`is distrib-
`(starting with
`uted with 4BSD) attest.
`files are treated as
`Devices and ordinary
`similarly
`as possible. Thus device depen-
`dencies or peculiarities
`are kept in the ker-
`nel as much as possible, and even in the
`kernel most of them are segregated in the
`device drivers.
`of an ex-
`A process may be informed
`ceptional
`condition
`(perhaps by another
`process) by means of a signal. True
`inter-
`process communication
`and networking
`may be accomplished
`in 4.2BSD by the use
`of sockets.
`(and
`The size constraints of the PDP-11
`have
`earlier computers used
`for UNIX)
`forced a certain elegance. Whereas other
`
`Computing Surveys, Vol. 17, No. 4, December 1985
`
`INTEL EX. 1242.006
`
`
`
`4.2BSD and 4.3BSD as Examples of the UNIX System
`
`l
`
`385
`
`(the users)
`
`shells and commands
`compilers and interpreters
`system libraries
`interface lo lhc kernel
`sysrem call
`
`signals
`terminal handling
`character I/O system
`terminal drivers
`
`file system
`swapping
`block l/O system
`disk and tape drivers
`
`CPU scheduling
`page replacement
`demand paging
`virtual memory
`
`kernel
`terminal controllers
`terminals
`
`to the hardware
`interface
`device controllers
`disks and tapes
`
`memory controllers
`physical memory
`
`Figure 2. Layers of the UNIX
`
`system.
`
`for deal-
`systems have elaborate algorithms
`UNIX
`ing with pathological
`conditions,
`crash (a panic), and
`just does a controlled
`tries to prevent
`rather
`than cure such con-
`ditions. Whereas other systems would use
`brute
`force or macro expansion, UNIX
`mostly has had
`to have developed more
`subtle, or at least simpler, approaches.
`In some instances, such as networking,
`PDP-11 size constraints unfortunately
`had
`the opposite effect. The original UNIX
`Version 6 ARPANET
`software was split
`into a kernel part and a part that ran as a
`user process, purely because of size con-
`straints.
`This entailed
`not only perfor-
`mance penalties but also led to a rather
`convoluted design. The 4.2BSD networking
`code does not suffer from this, since it runs
`on processors
`(VAX, M68000, NS16032,
`etc.) that have a reasonably sized address
`space. The PDP-11 ports of this code re-
`quire extensive kernel overlays.
`Virtual memory and paging were not im-
`plemented on the PDP-11 because of the
`small number and huge size of the pages
`allowed by the hardware. Thus early ver-
`sions of the INGRES database system ran
`as multiple
`(six or seven) processes, and
`Franz Lisp, with
`its need
`for huge data
`spaces in a single process, did not develop
`until
`the VAX permitted paging in 3BSD.
`Even
`though some UNIX
`systems now
`require large
`try
`to do some things
`that
`address spaces, the size constraints
`im-
`posed during
`the early development of the
`
`system have had, on the whole, a beneficial
`effect.
`From the beginning, UNIX development
`systems have had all
`the UNIX
`sources
`available on line, and the developers have
`used
`the systems under development
`as
`their primary systems. This has greatly
`fa-
`cilitated discovering deficiencies and their
`fixes, as well as new possibilities
`and their
`implementations.
`Facilities
`for program development have
`always been a high priority. Such facilities
`the program make (which may be
`include
`used to determine which of a collection of
`program source tiles need to be compiled
`the Source
`and then compile
`them) and
`Code Control System (SCCS) (which
`is used
`to keep successive versions of files available
`without having
`to store the entire contents
`of each step).
`of sources for the oper-
`The availability
`ating system has also encouraged
`the pleth-
`ora of UNIX
`variants existing
`today, but
`the benefits have outweighed
`the disadvan-
`tages. If something
`is broken,
`it can be fixed
`at a local site, rather
`than having
`to wait
`for the next
`release of the system. Such
`fixes, as well as new facilities, may be in-
`corporated
`into
`later distributions.
`Binary
`licenses are becoming more popular with
`the growing number of small,
`inexpensive,
`UNIX
`systems, however.
`The UNIX operating system may be con-
`sidered for convenience of exposition
`to be
`layered
`roughly as depicted
`in Figure 2.
`
`Computing Surveys, Vol. 17, No. 4, December 1985
`
`INTEL EX. 1242.007
`
`
`
`386
`
`l
`
`J. S. Quarterman, A. Silberschatz, and J. L. Peterson
`
`below the system call interface
`Everything
`the physical
`hardware
`is the
`and above
`kernel
`[Thompson
`19781. This paper
`is
`mostly concerned with
`the kernel, since
`that
`is where most of the traditional
`oper-
`ating systems issues are addressed:
`
`is the only UNIX code that cannot be
`The kernel
`by a user to his own
`liking. For
`this
`substituted
`reason,
`the kernel should make as few real deci-
`sions as possible. This does not mean to allow
`the
`user a million
`options
`to do
`the same
`thing.
`Rather,
`it means to allow only one way to do one
`thing, but have
`that way be the
`least-common
`divisor of all
`the options
`that might have been
`provided.
`[Thompson
`1978, p. 19311
`
`in the design
`is especially noticeable
`This
`of the system call interface
`[Joy et al. 19831:
`“Throughout,
`simplicity
`has been substi-
`tuted
`for efficency. Complex
`algorithms
`are used only
`if
`their complexity
`can be
`localized”
`[Thompson
`1978, p. 19321.
`
`2. PROCESSES
`
`In this section we describe how user pro-
`cesses are created and manipulated
`by
`other user processes, including
`the layout
`of a process’s address space. Then
`the
`kernel control blocks
`that keep track of
`processes are described. Finally,
`an over-
`view of the CPU scheduler and its event
`mechanism
`is given
`[Thompson
`19781.
`
`2.1 User Interface
`
`in execution. To
`is a program
`A process
`execute a new program, a new process
`is
`first produced by the fork system call, cre-
`ating
`two almost
`identical processes, each
`with a copy of the original data space. Then
`the exec primitive may be used by one pro-
`cess to replace
`its virtual memory space
`with
`that
`for a new program
`(read from a
`file). A process may choose to terminate by
`using
`the exit system call, and its parent
`process may wait
`for that event by using
`the wait system call. Figure 3 shows
`two
`common
`scenarios
`for
`the use of
`these
`system calls.
`their process
`Processes are named by
`identifier
`(pid), which
`is an integer. The
`fork system call returns
`the process identi-
`
`Computing Surveys, Vol. 17, No. 4, December 1985
`
`the parent
`to
`the child process
`fier of
`the child
`in
`process, and
`returns
`zero
`process; this
`is how a program can deter-
`mine in which process it is running after a
`fork. The wait system call provides
`the
`process
`identifier
`of
`the child
`that
`ter-
`minated
`so the parent can tell which of
`possibly several children
`it was.
`From
`the viewpoint
`of the calling pro-
`cess, one may
`liken
`fork and wait
`to a
`subroutine
`call and return, whereas exec is
`more like a goto.
`be-
`The simplest
`form of communication
`tween processes is by pipes. Pipes provide
`a reliably
`delivered byte stream between
`two processes.They may be created before
`the fork, and their end points may then be
`set up between
`the fork and the exec.
`All user processes are descendants of one
`original process, which
`is called
`init and
`has process identifier
`1. On each terminal
`port available
`for interactive
`use, init
`forks
`(with
`the fork system call) a copy of itself,
`which attempts
`to open the port for reading
`and writing. This new process has a new
`process identifier. The open succeeds when
`a directly connected
`terminal
`is turned on
`or a telephone call is accepted by a dial-up
`modem. Then
`the
`init process executes
`(with
`the exec system call) a program called
`getty.
`line parameters
`terminal
`Getty initializes
`and prompts
`the user to type a login name,
`which getty collects. Getty then executes a
`program
`called
`login, passing
`the
`login
`name as an argument. Login prompts
`the
`user for a password, and collects
`it as the
`user
`types
`it. Login determines whether
`the user is allowed
`to log in by encrypting
`the typed password and comparing
`it with
`an encrypted string
`found according
`to the
`login name in the file /etc/passwd.
`If the
`comparison
`is successful,
`login sets the nu-
`meric user identifier
`(uid) of the process to
`that of the user logging
`in and executes a
`shell, or command
`interpreter.
`(The path-
`name of the shell and the user identifier
`are also found
`in /etc/passwd
`according
`to the user’s login name.) This shell is what
`the user ordinarily
`communicates with
`for
`the rest of the login session. The shell itself
`forks subprocesses
`for the commands
`that
`the user tells it to execute.
`
`INTEL EX. 1242.008
`
`
`
`4.2BSD and 4.3BSD as Examples of the UNIX System
`
`l
`
`exec
`init
`
`init
`pld 1
`
`exec
`
`exec
`
`zombie pid 1095
`
`exec
`
`login pid 7623
`f
`
`Figure 3. Fork, exec, exit, and wait.
`
`is used suc-
`The same process identifier
`cessively after
`the
`fork by the child
`init
`process, by getty, by login, and finally by
`the shell. When the user logs out, the shell
`dies and the original
`init process (process
`identifier
`1) waits on
`it. After
`the wait
`succeeds,
`the process
`identifier
`formerly
`used by the shell may be reassigned by the
`kernel
`to a new process.
`The user identifier
`is used by the kernel
`to determine
`the user’s permissions
`for cer-
`tain system calls, especially
`those involving
`file accesses. There is also a group identifier
`(gid ), which
`is used to provide similar priv-
`ileges to a collection of users. In 4.2BSD, a
`user’s processes may be in several groups
`simultaneously.
`The login process puts the
`user’s shell
`in all
`the groups permitted
`to the user by the files
`/etc/passwd
`and
`/etc/group.
`identifiers
`two user
`There are actually
`used by the kernel: The effective user iden-
`
`file access
`tifier is the one used to determine
`permissions, and the real user identifier
`is
`used by some programs
`to determine who
`the original user was before
`the effective
`user identifier was set by a setuid.
`If the
`file being executed by exec has setuid
`indi-
`cated,
`the effectiue user
`identifier
`of the
`process is set to the user identifier
`of the
`owner of the file, while
`the real user iden-
`tifier
`is left as it was. This allows certain
`processes to have more than ordinary priv-
`ileges while still being executable by ordi-
`nary users. This setuid
`idea is patented by
`Dennis Ritchie
`[1979a] and is one of the
`most distinctive
`features of UNIX.
`For
`groups
`there
`is a similar
`distinction
`for
`effective and real group
`identifiers,
`and a
`similar setgid feature.
`Every process has both a user and a
`system phase, which never execute simul-
`taneously. Most ordinary work
`is done by
`the user process, but when a system call is
`
`Computing Surveys, Vol. 17, No. 4, December 1985
`
`INTEL EX. 1242.009
`
`
`
`388
`
`l
`
`J. S. Quarterman, A. Silberschatz, and J. L. Peterson
`
`done, it is the system process, which has a
`different
`stack than
`the user process, that
`performs
`the system call.
`The virtual address space of a user pro-
`cess is divided
`into text (executable
`instruc-
`tions), data, and stack segments. The data
`and stack segments are always
`in the same
`address space, but may grow separately
`and, on most machines,
`in opposite direc-
`tions: On a VAX
`the stack grows down
`as the data grow up toward
`it. The
`text
`segment
`is usually not writable,
`so that
`one copy may be shared among several
`processes.
`How a process’s virtual address space is
`mapped
`into physical main or secondary
`memory varies greatly
`from system to sys-
`tem but is usually
`transparent
`to the user.
`
`2.2 Control Blocks
`
`There are no system control blocks acces-
`sible
`in a user process’s virtual
`address
`space, but there are such control blocks
`in
`the kernel associated with
`the process.
`Some of these control blocks are diagramed
`in Figure 4.
`The most basic data structure associated
`the process struc-
`with processes is called
`ture. There
`is an array of these, whose
`length
`is defined at system link
`time. Each
`p