APPLE 1012 - Page 1

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial capital letters.

The publisher offers discounts on this book when ordered in quantity for special sales.

For more information, please contact:

Corporate & Professional Publishing Group
Addison-Wesley Publishing Company
One Jacob Way
Reading, Massachusetts 01867

Stein, Lincoln D., 1960-
    How to set up and maintain a World Wide Web site : the guide for
  information providers / Lincoln D. Stein.
      p. cm.
    Includes index.
    ISBN 0-201-63389-2 (alk. paper)
    1. World Wide Web (Information retrieval system)  I. Title.
  TK5105.888.S74 1995
  005.75--dc20                                          95-24492
                                                             CIP

Copyright © 1995 by Addison-Wesley Publishing Company, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. Published simultaneously in Canada.

0-201-63389-2

1 2 3 4 5 6 7 8 9-CRW-98979695

First printing, August 1995
`
INTRODUCTION TO THE WEB

information sharing among collaborators, but interest in the system soon spread to other laboratories and academic institutions.

A turning point for the Web came in February 1993, when the U.S. National Center for Supercomputing Applications (NCSA) released an early version of Mosaic, a Web browser for Unix machines running the X Windows system. Mosaic used icons, popup menus, rendered bitmapped text, and color links to display hypertext documents. In addition, Mosaic was capable of incorporating color images directly onto the page along with the text, and provided support for sounds, animation, and other types of multimedia. In mid-November 1993, Mosaic was released simultaneously for three popular platforms: the Apple Macintosh, Microsoft Windows-based machines, and X Windows.

The Web took off explosively. In October 1993, eight months after the release of Mosaic for X Windows, the number of Web servers registered at CERN had increased to 500. A year later there were an estimated 4600 sites, with more being added exponentially. In August 1994, Web network traffic on the National Science Foundation's Internet backbone exceeded that for e-mail, the only service ever to do so. Recent estimates of the Web put the number of servers at more than 12,000, and estimate an annual growth rate of 3000%.

Guided Tour

A short walk through the World Wide Web will show you what it's all about. The screen shots that follow use a Macintosh-based Web browser called MacWeb, produced and distributed freely by EINet (a service run by Microelectronics and Computer Corporation). MacWeb was chosen for the screen shots mainly because it isn't Mosaic. Although Mosaic and the Web have become synonymous in the public perception, Mosaic is only the best known browser; many others are available both freely and commercially.

Figure 1.1: SIPB Main Page. We start our tour at the MIT Student Information Processing Board (SIPB), a Web site maintained by one of MIT's student organizations. The Web has no particular starting point, so this is as good a place to jump in as any. The first thing that grabs your attention is the Web's use of the document metaphor. The Web is organized as a series of pages, each with a distinctly book-like feel. You'll find paragraphs, headings, subheadings, changes of font and emphasis, indented lists, and embedded color graphics. The underlined words and phrases are hypertext links. These links, when selected, take the user to a different page or to a different location on the same page. In this case, we use the mouse to select the link named "IAP Course Guide" to learn more about what's going on during MIT's Independent Activities Period.
`
FIGURE 1.1 MIT SIPB Main Page
`
Figure 1.2: Freshman Fishwrap. This link takes us to another page, this one maintained by the Freshman Fishwrap, a student newspaper. Each page on the Web has a unique address, known as its URL, or Uniform Resource Locator. You can see the URL for this page in the box in the upper right-hand corner of this Web browser's window. URL formats are explained in great depth later, but for now just notice that the URL begins with the text http, indicating that this page is accessed using the Hypertext Transfer Protocol (HTTP), and that the Internet address of the machine on which this page lives is fishwrap-docs.www.media.mit.edu. Also notice that this page lives on a different machine than the SIPB main page, which is hosted by www.mit.edu.

This page contains a graphic calendar with instructions to click on a day in order to see the corresponding class schedule. This is an example of a clickable map. Clicking the mouse on different parts of the image takes us to different pages. In this case, we click on January 9, marked "IAP Start."

Figure 1.3: IAP Schedule for January 9. This link takes us to a course schedule. The schedule itself is made up of more links, any one of which we could select to get a short course description and pointers to other courses of interest. Instead, we'll do some more exploring. We jump back to the main SIPB page (by clicking the browser's left arrow button a few times) and select the link marked "official MIT web server."
`
FIGURE 1.3 Independent Activities Period Schedule
`
UNRAVELING THE WEB: HOW IT ALL WORKS

IP Addresses

TCP/IP uses a static addressing scheme in which each and every machine on the Internet is assigned a unique, unchanging IP address. IP addresses are 32-bit numbers that are usually written out as four 8-bit numbers separated by dots. Examples of IP addresses include 18.157.0.135 and 127.1.18.92.

Although four billion addresses sounds like more than enough to go around, this isn't really the case. For one thing, various ranges of IP addresses are reserved for special purposes such as multicasting. For another, IP addresses are organized in a hierarchical way into a series of networks and subnetworks. The Network Information Center (NIC) allocates blocks of contiguous addresses to organizations and regional networks (Table 2.1). A small organization, such as a privately held company, might receive the block of 255 addresses from 192.66.12.1 to 192.66.12.255 (this is called a class "C" address). It could then divvy the addresses up among its various departments. A large organization, such as a university, might receive the block of approximately 65,000 addresses from 128.15.0.1 to 128.15.255.255 (this is a class "B" address). Even larger entities, such as the U.S. military or the NEARnet regional network, could be granted one or more class "A" addresses, such as the block 18.0.0.1 to 18.255.255.255, encompassing more than 16 million addresses. The advantages of this hierarchical way of dividing the addresses are twofold. Organizationally, it's simpler to give blocks of addresses to organizations and allow them to divide them up as they see fit. Technically, it's much easier for network routers to determine how to get packets of data from one address to another when the Internet is organized into a series of networks and subnetworks.

As a result of its rapid growth, the Internet is close to running out of unallocated addresses. A new system that uses longer addresses will replace the current one over the next few years. The new system will be designed to maintain compatibility with the current addressing scheme.

TABLE 2.1 Networks and Hosts

Class   Example Address   Network Part   Host Part
A       18.155.32.5       18.            155.32.5
B       128.15.32.5       128.15.        32.5
C       192.66.12.56      192.66.12.     56

Domain Names

Raw IP addresses are unfriendly. They are difficult to remember and hard to type. For this reason, IP addresses are usually assigned human readable names using a distributed hierarchical lookup system known as the Domain Name System (DNS). In DNS, each machine has a unique name consisting of multiple parts separated by dots. The first part is the machine's host name, followed by a list of domains. The first domain is usually an identifier for the organization to which the machine belongs, followed by more organizational subtitles if necessary, and finally a label for the top-level domain. In the USA, the top-level domain is usually an identifier for the type of organization: edu for educational institutions, com for commercial organizations, mil for military establishments, net for network providers, and org for organizations that don't fit anywhere else. For the rest of the world, the top-level domain usually identifies the country: jp for Japan, de for Germany (Deutschland), ch for Switzerland, and so on.

The host name and domains together form a fully qualified domain name that uniquely identifies that machine on the Internet. The dots in domain names have no correspondence to the dots in IP addresses. Whereas IP addresses have four parts, domain names may have two, three, or more, depending on how the local naming system happens to have been set up.

For example, one of the Sun workstations inside the Whitehead Institute of Biomedical Research's local network has the IP address 18.157.1.125. Its full domain name is loco.wi.mit.edu. Here's how the name is formed (Figure 2.1): its host name is loco, it belongs to a network maintained by the Whitehead Institute, wi, which in turn is part of MIT's network, mit, which is itself a U.S. educational institution, edu.
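To make the two notations concrete, here is a minimal Python sketch that takes the address and name from the example above and pulls them apart. The function names describe_address and split_domain_name are invented for this illustration, and the class boundaries follow the classful scheme of Table 2.1 only; it is a sketch under those assumptions, not a complete treatment of addressing.

```python
import ipaddress


def describe_address(dotted):
    """Split a dotted-quad address into classful network and host parts."""
    addr = ipaddress.IPv4Address(dotted)   # validates the four 8-bit fields
    octets = dotted.split(".")
    first = int(octets[0])
    # In the classful scheme the leading octet decides how many octets
    # name the network; the remaining octets name the host.
    if first < 128:
        cls, net_octets = "A", 1
    elif first < 192:
        cls, net_octets = "B", 2
    elif first < 224:
        cls, net_octets = "C", 3
    else:
        raise ValueError("multicast or reserved range, not class A, B, or C")
    return {
        "as_integer": int(addr),           # the underlying 32-bit number
        "class": cls,
        "network": ".".join(octets[:net_octets]) + ".",
        "host": ".".join(octets[net_octets:]),
    }


def split_domain_name(fqdn):
    """Return (host name, list of domains) for a fully qualified domain name."""
    host, *domains = fqdn.split(".")
    return host, domains


if __name__ == "__main__":
    print(describe_address("18.157.1.125"))
    print(split_domain_name("loco.wi.mit.edu"))
```

Note that the two splits are unrelated: the address divides by class, while the name divides wherever the local administrators chose to put dots.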
The information in the DNS system is distributed among a large number of DNS databases, each one stored on a name server maintained by the organization responsible for its piece of the network. When a program is given a domain name to connect to, it must first send an inquiry to its local name server in order to find the numeric IP address to which the name corresponds. If the name server doesn't know (and often it doesn't), it queries another name server closer to the destination, and that name server may in turn query a third. For example, a program in Japan wanting to look up the address of loco.wi.mit.edu might first send a query to one of the name servers in the U.S. responsible for the edu names. That machine would then forward the request to the MIT machine responsible for the mit domain, which would in turn defer to a name server at the Whitehead Institute. Physically, the DNS databases are just human-readable tables. To add or modify a machine name, the local DNS administrator makes a simple addition or modification to the table.
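The chain of referrals described above can be imitated with a few in-memory tables. The sketch below is only a model: the server labels (edu-server, mit-server, wi-server) and the table layout are invented for illustration, no network traffic is involved, and real name servers exchange considerably more information than this.

```python
# Hypothetical name-server tables. Each server either holds the address
# itself or knows which server is responsible for a more specific domain.
NAME_SERVERS = {
    "edu-server": {"delegations": {"mit.edu": "mit-server"}, "addresses": {}},
    "mit-server": {"delegations": {"wi.mit.edu": "wi-server"}, "addresses": {}},
    "wi-server": {
        "delegations": {},
        "addresses": {"loco.wi.mit.edu": "18.157.1.125"},
    },
}


def resolve(name, server="edu-server"):
    """Follow delegations until some server holds the address.

    Returns (address, list of servers that were asked).
    """
    asked = []
    while server is not None:
        asked.append(server)
        table = NAME_SERVERS[server]
        if name in table["addresses"]:
            return table["addresses"][name], asked
        # Pass the query to the server responsible for a matching domain.
        server = next(
            (target for domain, target in table["delegations"].items()
             if name.endswith("." + domain)),
            None,
        )
    raise LookupError("no name server knows " + name)


if __name__ == "__main__":
    print(resolve("loco.wi.mit.edu"))
```

Adding a machine to the Whitehead network in this model means adding one entry to the wi-server table, which mirrors the "simple addition to the table" mentioned in the text.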
`
FIGURE 2.1 Anatomy of a Fully Qualified Domain Name

    loco . wi . mit . edu

    loco   host name
    wi     domain: yet another organization
    mit    domain: organization
    edu    domain: organization type
`
`
One of the nice features of the DNS is that a single machine can have one or more "aliases" assigned to it in addition to its true name. This feature is widely used by Web administrators to give descriptive names to their server machines. For example, an organization whose domain name is capricorn.org might run its Web server on a host named toggenberg.capricorn.org. Instead of using this as its publicly known Web name, the organization could create a www alias for the machine, making it known to the world as www.capricorn.org. In addition to being the obvious name for people to guess at when trying to find the organization's Web server, use of the alias makes it easy to move the Web service to a different machine later. The Web administrator just has to let the person who runs the local DNS know that the alias needs to be reassigned to the new machine.
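The benefit of the alias can be seen in a toy lookup table. In this sketch the addresses come from a range reserved for documentation, the second host name saanen.capricorn.org is invented, and the two dictionaries merely stand in for the local DNS database; none of it reflects how a real name server stores its records.

```python
# Hypothetical DNS data for capricorn.org.
ADDRESSES = {
    "toggenberg.capricorn.org": "192.0.2.10",
    "saanen.capricorn.org": "192.0.2.11",
}
ALIASES = {"www.capricorn.org": "toggenberg.capricorn.org"}


def lookup(name):
    """Resolve a name, following an alias to the machine's true name."""
    true_name = ALIASES.get(name, name)
    return ADDRESSES[true_name]


if __name__ == "__main__":
    print(lookup("www.capricorn.org"))
    # Moving the Web service: only the alias entry changes.
    ALIASES["www.capricorn.org"] = "saanen.capricorn.org"
    print(lookup("www.capricorn.org"))
```

Every published link keeps using www.capricorn.org; only the one alias entry has to be edited when the service moves.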
`
Clients and Servers

To establish a communications channel between two programs running on different machines, or even two programs running on the same machine, one program must initiate the connection and the other accept it. This is accomplished using a client/server scheme. The server runs first. When it first starts up it signals the operating system that it wants to accept incoming network connections. Then it waits around for the connections to start rolling in. When a client on a remote machine needs to send or retrieve information from the server, it opens up a connection to the server, passes information back and forth, and closes the connection. Most servers can handle multiple simultaneous incoming connections. They do this either by duplicating themselves in memory each time an incoming connection comes in, or by cleverly interleaving their communications activity.

The distinction between client and server rests on who initiates the connection and who accepts it. Although the server is usually the information provider and the client is usually the information customer, this is not necessarily the case. However, the client usually interacts directly with the user, processing keystrokes and displaying results, while the server skulks unseen in the background.

Ports

When two programs want to communicate with each other, it isn't enough for them to know each other's IP addresses. They also need a way to rendezvous. This is because a single machine often runs multiple types of servers. For example, the typical Unix machine offers a telnet service for network log-ins, a time service for exchanging the time of day, an ftp service for transferring files, and several others. A machine offering Web or Gopher services will run HTTP or Gopher servers as well. When a program connects to a remote machine, how does it ensure that it will connect to the right program?
`
`
This is done through well-known ports. A port is to an IP address what an apartment number is to an apartment building's street address: the IP address identifies the machine, and the port identifies a particular program running on the machine (Figure 2.2). Ports are identified by a number from 0 to 65,535. When a server starts up, it notifies the operating system to reserve a particular port. On Unix systems port numbers between 0 and 1024 are privileged: they can only be reserved by servers run by the root user (also known as the superuser). The other ports are available for anyone's use. (Personal computers don't have this restriction on the use of low-numbered ports.) Well-known ports are those which, by convention, are assigned to be used for particular services (Table 2.2). For example, port 23 is used for Telnet, and port 80 is used for the Web's hypertext transfer protocol, HTTP.
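The rendezvous between a client and a server can be tried out on a single machine. The following Python sketch is an illustration only: it uses the loopback address and lets the operating system pick an unprivileged port rather than a well-known one, and the upper-casing "service" and the function names are made up for the example.

```python
import socket
import threading


def run_server(listener):
    """Accept one connection, send the request back in upper case, close."""
    conn, _client_addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def demo():
    # Server side: reserve a port and start listening before any client
    # tries to connect.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS choose a free port
    listener.listen(1)
    host, port = listener.getsockname()
    worker = threading.Thread(target=run_server, args=(listener,))
    worker.start()

    # Client side: the machine address plus the port number together
    # select the server program to talk to.
    with socket.create_connection((host, port)) as client:
        client.sendall(b"hello")
        reply = client.recv(1024)

    worker.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

A real service would bind to its well-known port instead (80 for HTTP, for instance), which on Unix requires the server to be started by the root user.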
`
FIGURE 2.2 Clients Use Well-Known Port Numbers to Identify Particular Server Programs Running on a Host
`
TABLE 2.2 Well-Known Ports for Common Protocols

Protocol              Port
FTP                   21
Telnet                23
Gopher                70
HTTP                  80
NNTP (Usenet news)    119
WAIS                  210
`
`
`
* This discussion glosses over the fact that there are really two low-level TCP/IP communications protocols: TCP, a reliable protocol suitable for sending long streams of data, and UDP, an unreliable protocol suitable for exchanging brief messages. Although TCP is preferred by most servers, including all the servers discussed in this book, some specialized servers use UDP instead. A TCP and a UDP program can both use the same port number without conflict, because in actuality a network program is uniquely identified by the combination of an IP address, a port number, and a communications protocol.
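The claim in this footnote is easy to check on most systems. The sketch below, with an invented function name, binds one TCP socket and one UDP socket to the same port number on the loopback address; it retries a few times on the assumption that the chosen UDP port might occasionally be in use by some other program.

```python
import socket


def bind_tcp_and_udp_same_port(attempts=5):
    """Bind a TCP and a UDP socket to one port number on the loopback address.

    Returns the (TCP port, UDP port) pair, which should be equal.
    """
    for _ in range(attempts):
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            tcp.bind(("127.0.0.1", 0))       # the OS picks a free TCP port
            port = tcp.getsockname()[1]
            udp.bind(("127.0.0.1", port))    # same number, different protocol
            return port, udp.getsockname()[1]
        except OSError:
            continue                         # UDP port was taken; try again
        finally:
            tcp.close()
            udp.close()
    raise RuntimeError("could not find a port free for both protocols")


if __name__ == "__main__":
    print(bind_tcp_and_udp_same_port())
```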
`
For example, when a Web server starts up, it reserves port 80 for its exclusive use (unless it's been configured to use a different one). Incoming clients know they should use port 80 for connecting to HTTP servers, making the rendezvous successful.

Daemons and Inetd

In Unix systems, servers are run in either of two modes: stand-alone or under the control of a program called inetd. Stand-alone servers, also known as daemons, follow the model described earlier. They start up, listen for incoming connections, service the requests, and then go back to listening. Most daemons can service multiple simultaneous incoming connections. They do this by "forking" a copy of themselves whenever there's a new incoming connection. The copy handles the request, leaving the original free to listen for new requests.

It's possible for a system to support dozens of servers, each one assigned to a different port. At any time, only a fraction of them are actually doing any work; the rest are just hanging around, waiting for a connection, and consuming memory needlessly. To prevent this waste, the "super daemon," inetd, was invented. When inetd starts up it reads a configuration file that gives it a list of ports to listen to and servers to run in response to incoming connections on each port. When a client connects to one of these ports, inetd quickly launches the designated server and hands off the connection to it. When the communication is finished, the server exits, releasing system resources. inetd will launch it again when needed.

Most servers, including the FTP, Telnet, and Gopher servers, run under inetd. Although Web HTTP servers can be configured to run this way as well, they usually aren't. Web servers, large programs with long and complex configuration files, take a significant amount of time to launch, and performance suffers seriously when they are run under inetd. For this reason, Web servers are usually run in stand-alone daemon mode.
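The bookkeeping that inetd performs at startup can be sketched as a small table lookup. The example assumes the classic column layout of an inetd configuration file (service, socket type, protocol, wait flag, user, program, arguments); the program paths in the sample are hypothetical, the port numbers are taken from Table 2.2, and the sketch only builds the table without listening on anything or launching any program.

```python
WELL_KNOWN_PORTS = {"ftp": 21, "telnet": 23, "gopher": 70}   # from Table 2.2

# Hypothetical configuration text; real files and paths vary by system.
SAMPLE_CONFIG = """\
# service  socket  proto  wait    user    server program          arguments
ftp        stream  tcp    nowait  root    /usr/sbin/in.ftpd       in.ftpd
telnet     stream  tcp    nowait  root    /usr/sbin/in.telnetd    in.telnetd
gopher     stream  tcp    nowait  nobody  /usr/local/etc/gopherd  gopherd
"""


def parse_config(text):
    """Map each port number to the program that should be launched for it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()     # drop comments and blanks
        if not line:
            continue
        service, _sock, _proto, _wait, user, program, *args = line.split()
        table[WELL_KNOWN_PORTS[service]] = {
            "user": user,
            "program": program,
            "argv": args,
        }
    return table


def server_for(table, port):
    """Return launch details for a connection on `port`, or None if unlisted."""
    return table.get(port)


if __name__ == "__main__":
    table = parse_config(SAMPLE_CONFIG)
    print(server_for(table, 21))
    print(server_for(table, 80))   # a stand-alone Web server is not listed
```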
`
Uniform Resource Locators

Because browsers speak many different protocols, there has to be some unambiguous way of telling them how and where to find an item of interest on the Internet. This is done through Uniform Resource Locator (URL) notation, a straightforward way of indicating the protocol, host, and location of an Internet resource. If you've used any of the Web browsers, you're already familiar with URLs: they are the "address" of a Web page.

The anatomy of an URL is diagrammed in Figure 2.3. The first part of the URL specifies the communications protocol. It's separated from the rest of the URL by a colon. The second part, beginning with a double slash and ending with a single slash, is the name of the host machine on which the resource resides and optionally the communications port to which you will connect. It's only necessary to specify the port if for some reason the remote server has been configured to use a nonstandard port. Otherwise the default port will be used (see Table 2.2 for a list of default ports). The host can be specified either by name (preferred), or by dotted Internet address. The rest of the URL is the path, a string of characters that tells the server how to locate the resource. Its format is different for each of the protocols: in some cases it will be the path to a file; in others it will be a query used to retrieve a document from a database or other program.

Only some characters are legal within URLs. Uppercase and lowercase letters, numerals, and the characters $ _ @ . - are OK. The characters = ; / # ? : % & + and the space character are also legal but have special meanings. Everything else, including tabs, spaces, carriage returns, newlines, accented characters, and other symbols, is illegal. To include these characters in an URL they must be escaped, using an escape code consisting of the % sign followed by the two-digit hexadecimal code of the character. For example, a carriage return can be entered into a URL with "%0D", a space with "%20", and the percent sign itself with the sequence "%25". You'll find a list of ASCII codes in Table 2.3 as well as in Appendix B.

It can be difficult to remember which characters are legal and which aren't. Fortunately, most browsers are pretty forgiving. Commonly used "illegal" characters, such as the ~ symbol, are automatically translated into the correct escape code by browsers before being sent to the server.
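The escaping rule can be written out in a few lines. The sketch below implements the rule exactly as the text states it, with a hand-written helper named escape (an invented name) and a keep parameter for characters such as the slash that should pass through untouched. It assumes single-byte Latin-1 characters; the standard library's own quoting functions apply somewhat different rules, so they are used here only for decoding.

```python
from urllib.parse import unquote

# Characters the text lists as safe to use unescaped.
SAFE = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789$_@.-"
)


def escape(text, keep=""):
    """Replace every character outside SAFE (and `keep`) with %XX."""
    out = []
    for byte in text.encode("latin-1"):
        ch = chr(byte)
        if ch in SAFE or ch in keep:
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))   # two-digit hexadecimal code
    return "".join(out)


if __name__ == "__main__":
    print(escape("/~fred/price list.html", keep="/"))
    print(unquote("100%25%20goat"))              # decoding reverses the escapes
```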
`
FIGURE 2.3 Anatomy of an URL

    http://www.capricorn.org:8080/expensive_fish/kobi.html

    protocol    http
    host name   www.capricorn.org
    port        8080
    path        /expensive_fish/kobi.html
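The same dissection can be done programmatically. This sketch uses urlsplit from Python's standard urllib.parse module; the helper name url_parts and the DEFAULT_PORTS table (filled in from Table 2.2) are assumptions made for the example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"ftp": 21, "telnet": 23, "gopher": 70, "http": 80}  # Table 2.2


def url_parts(url):
    """Break a URL into the pieces labeled in Figure 2.3."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        # Fall back to the protocol's default port when none is given.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
    }


if __name__ == "__main__":
    print(url_parts("http://www.capricorn.org:8080/expensive_fish/kobi.html"))
    print(url_parts("http://www.capricorn.org/careers/heavy_industry.html"))
```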
`
Complete Versus Partial URLs

URLs can be complete, partial, or relative. Complete URLs contain all parts of the URL, including the protocol part, the host name part, and the document path. A hypertext link containing a complete URL will always point the browser to the correct location. An example of a complete URL is:

http://www.capricorn.org/careers/heavy_industry.html
`
TABLE 2.3 ASCII Character Codes

Dec  Hex  Char     Dec  Hex  Char     Dec  Hex  Char     Dec  Hex  Char
  0  00   NUL       32  20   SPACE     64  40   @         96  60   `
  1  01   SOH       33  21   !         65  41   A         97  61   a
  2  02   STX       34  22   "         66  42   B         98  62   b
  3  03   ETX       35  23   #         67  43   C         99  63   c
  4  04   EOT       36  24   $         68  44   D        100  64   d
  5  05   ENQ       37  25   %         69  45   E        101  65   e
  6  06   ACK       38  26   &         70  46   F        102  66   f
  7  07   BEL       39  27   '         71  47   G        103  67   g
  8  08   BS        40  28   (         72  48   H        104  68   h
  9  09   HT        41  29   )         73  49   I        105  69   i
 10  0A   LF        42  2A   *         74  4A   J        106  6A   j
 11  0B   VT        43  2B   +         75  4B   K        107  6B   k
 12  0C   FF        44  2C   ,         76  4C   L        108  6C   l
 13  0D   CR        45  2D   -         77  4D   M        109  6D   m
 14  0E   SO        46  2E   .         78  4E   N        110  6E   n
 15  0F   SI        47  2F   /         79  4F   O        111  6F   o
 16  10   DLE       48  30   0         80  50   P        112  70   p
 17  11   DC1       49  31   1         81  51   Q        113  71   q
 18  12   DC2       50  32   2         82  52   R        114  72   r
 19  13   DC3       51  33   3         83  53   S        115  73   s
 20  14   DC4       52  34   4         84  54   T        116  74   t
 21  15   NAK       53  35   5         85  55   U        117  75   u
 22  16   SYN       54  36   6         86  56   V        118  76   v
 23  17   ETB       55  37   7         87  57   W        119  77   w
 24  18   CAN       56  38   8         88  58   X        120  78   x
 25  19   EM        57  39   9         89  59   Y        121  79   y
 26  1A   SUB       58  3A   :         90  5A   Z        122  7A   z
 27  1B   ESC       59  3B   ;         91  5B   [        123  7B   {
 28  1C   FS        60  3C   <         92  5C   \        124  7C   |
 29  1D   GS        61  3D   =         93  5D   ]        125  7D   }
 30  1E   RS        62  3E   >         94  5E   ^        126  7E   ~
 31  1F   US        63  3F   ?         95  5F   _        127  7F   DEL
`
`
In contrast, an example of a partial URL is the simpler

/careers/heavy_industry.html

In partial URLs, the protocol and host name parts are left off and the URL begins with the path name part. When browsers encounter links containing this type of URL, they interpret the URL relative to the current page, assuming the same protocol and host name. In the preceding example, if the user is viewing the document

http://www.capricorn.org/heavy_industry.html

and selects a link referring to URL /careers/steel.html, the browser would interpret this partial URL as if it were written out as

http://www.capricorn.org/careers/steel.html

This shorthand notation can be taken even further to create relative URLs. In this type not only are the protocol and host omitted, but part of the path is left out as well, as in the stripped down

strip_mining.html

Everything, including the path itself, is now interpreted relative to the current document. The path names of relative URLs follow the same conventions as relative paths in the Unix and MS-DOS file systems. The directory name "." is used to indicate the current directory and the name ".." is used to indicate the directory above the current one. So the relative URL automotive/openings.html refers to a document in a directory below the current document, whereas ../light_industry.html tells the browser to hop up one level before looking for the document.

Relative URLs are most useful for creating logically linked sets of documents within a site. The documents refer to each other using relative links only, allowing the entire set to be moved from place to place within a site, or even to a new site entirely, without changing all the links. Absolute URLs are usually used to refer to documents located at remote sites. Chapter 5 shows how this works.
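The way a browser expands partial and relative URLs can be reproduced with urljoin from Python's standard urllib.parse module. The sketch below reuses the example URLs from this section; the wrapper name resolve_link is invented for the illustration.

```python
from urllib.parse import urljoin


def resolve_link(current_page, link):
    """Expand a link found on `current_page` into a complete URL."""
    return urljoin(current_page, link)


if __name__ == "__main__":
    page = "http://www.capricorn.org/careers/heavy_industry.html"
    for link in (
        "/careers/steel.html",         # partial: same protocol and host
        "strip_mining.html",           # relative: same directory
        "automotive/openings.html",    # relative: one directory down
        "../light_industry.html",      # relative: one directory up
    ):
        print(link, "->", resolve_link(page, link))
```

Because only the current page's URL feeds into the expansion, a set of documents linked this way keeps working when the whole set is moved to a different directory or host.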
`
Specific URLs

There are as many different kinds of URLs as there are protocols supported by browsers. This section lists the common ones, and Table 2.4 gives a quick summary.

File URLs

These are the most basic of URLs. They specify a file located on the local machine. The general form of a file URL is:

file:///path_to_the_file
`
`
TABLE 2.4 Common URLs

URL                                          Description

Local files
file:///usr/local/birds/emus.gif             A file on the local computer

HTTP protocol
http://a.remote.host/birds/emus.gif          A file on an HTTP server
http://a.remote.host/birds/                  A directory listing on an HTTP server
http://a.remote.host/cgi-bin/search?emu      An executable script on an HTTP server
http://a.remote.host/cgi-bin/search          An executable script without parameters
http://a.remote.host/~fred/tapir.gif         A file in a user-supported HTTP directory
http://a.remote.host/~fred/                  A listing of a user-supported HTTP directory

FTP protocol
ftp://a.remote.host/pub/emus.gif             A file on an anonymous FTP server
ftp://a.remote.host/pub/server               A directory listing on an anonymous FTP server
ftp://fred:xyzzy@a.remote.host/letter.txt    A file on an FTP server that requires a user name

Gopher protocol
gopher://a.remote.host/                      Top-level menu of a Gopher host

Telnet protocol
telnet://a.remote.host/                      Telnet to a remote host

SMTP protocol
mailto:fred@bedrock.capricorn.org            Send mail to user

NNTP protocol
news:comp.infosystems.www.providers          Read recent news in a newsgroup

WAIS protocol
wais://a.remote.host/birds_of_NA?emu         WAIS search on the named document index
`
The host name and port should always be left blank in this type of URL (with one exception, as discussed later). Following this is the full path name to the file of interest using whatever notation is appropriate for the browser's operating system (slash for Unix, backslash for DOS, and colon for Macintosh OS). Most if not all browsers are kind enough to translate the Unix path notation into the local language, so a Unix-style path name, using slashes to separate directories, always works.

File URLs should never be used in documents intended to be served over the Web. Say a user is browsing an HTML document that contains a link to file:///usr/local/games/llama_attack. When the user selects this link the browser will attempt to retrieve a file named llama_attack from the user's local file system, which is probably not what was intended. File URLs are best used during testing of a set of HTML documents, or for documents that are intended for local consumption only. However, a better solution is to use relative URLs during the development of a set of linked pages. Otherwise all the links will have to be revised when you move the finished documents into place.
`
`
It's possible for a file URL to specify a host in the host name section. If it does so, the URL isn't treated as a file URL at all, but as an FTP URL. The browser will attempt to retrieve the file via the anonymous file transfer protocol as described later. This is an archaic feature included for backward compatibility with old documents and should be avoided.
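A short sketch shows the difference between the two forms of file URL. The helper names file_url and is_local_file_url are invented for this example, and the check is a simplification that looks only at whether the host name section is empty.

```python
from urllib.parse import quote, urlsplit


def file_url(unix_path):
    """Build a file URL from an absolute Unix-style path."""
    if not unix_path.startswith("/"):
        raise ValueError("an absolute path is required")
    # The host name section between the slashes is deliberately left empty.
    return "file://" + quote(unix_path)


def is_local_file_url(url):
    """True if the URL is a file URL with an empty host name section."""
    parts = urlsplit(url)
    return parts.scheme == "file" and parts.netloc == ""


if __name__ == "__main__":
    print(file_url("/usr/local/birds/emus.gif"))
    print(is_local_file_url("file:///usr/local/games/llama_attack"))
    print(is_local_file_url("file://a.remote.host/pub/emus.gif"))
```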
`
HTTP URLs

Web servers, by definition, speak HTTP. Naturally enough, HTTP URLs account for the vast majority of URLs that you will see. The format of an HTTP URL is:

http://hostname:port/path/to/the/resource

As with other URLs you need only specify the communications port if the remote HTTP server is configured to something other than the standard port 80. The resource path has exactly the same format as a Unix path name: the slashes separate a hierarchy of directories. Double dots (..) can be used to move up in the directory hierarchy and a single dot (.) indicates the current directory.

Although the path used in an HTTP URL looks like a Unix path, it doesn't usually correspond exactly to a real physical file path on the remote machine. For one thing, the Web server interprets the URL path relative to the document root directory set in the server's configuration (the next chapter describes how this is done). For example, the URL

http://www.capricorn.org/cooking/curry.html

may very well point to a file physically located on host capricorn.org located at

/local/web/cooking/curry.html

The path part of this kind of URL is often called a virtual path.
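The translation from virtual path to physical path can be sketched as follows. The document root /local/web is taken from the example above; the function name physical_path and the check that rejects paths climbing out of the root are assumptions of this illustration and do not describe any particular server's behavior.

```python
import posixpath

DOCUMENT_ROOT = "/local/web"     # value taken from the example in the text


def physical_path(url_path, root=DOCUMENT_ROOT):
    """Translate a URL's virtual path into a file path under the root."""
    # normpath collapses "." and ".." so the result can be checked.
    candidate = posixpath.normpath(posixpath.join(root, url_path.lstrip("/")))
    if candidate != root and not candidate.startswith(root + "/"):
        raise ValueError("path escapes the document root")
    return candidate


if __name__ == "__main__":
    print(physical_path("/cooking/curry.html"))
    print(physical_path("/cooking/../cooking/./curry.html"))
```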
`The response by the HTTP server to the request for a particular URL is
`somewhat different depending on the resource type. If the path name
`points to a file, the server will return its contents. The browser can then do
`Whatever is appropriate for the type. If the path name points to a directory,
`the HTTP server will do one of two things. If the directory contains a wel-
`come page (often named welcome .html or index . html), this document
`will be retrieved and sent to the browser. This is how to drop the user into
`the welcome page when she accesses the site's root directory with an URL
`like http: / /www. Capricorn . org / . If no such file exists, the server will
`construct a directory listing on the fly and send it back to the browser.
`Depending on the server configuration, this listing may contain icons,
`hypertext links, file descriptions, and the contents of any README files
`found in the directory (examples of directory listings are shown in the next
`
`APPLE 1012 - Page 15
`
`APPLE 1012 - Page 15
`
`

`
`38
`
`HOW TO SET UP AND MAINTAIN A WORLD WIDE WEB SITE
`
`chapter, Figures 3.1 and 3.2). Servers can also be configured to ignore cer-
`tain types of files or to give others special treatment. Refer to the next chap-
`ter for full details on configuring your server for the various directory list-
`ing display options.
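The file-versus-directory decision described above might look roughly like the following sketch. The function name respond and the tuple it returns are hypothetical, chosen only for illustration. The welcome page names are the two mentioned in the text, and the listing is a bare sorted list of names rather than the decorated listing a real server would produce.

```python
import os

# Welcome page names mentioned in the text; real servers make this configurable.
WELCOME_NAMES = ("welcome.html", "index.html")

def respond(physical_path):
    """Return a (kind, payload) pair describing what the server would send."""
    if os.path.isfile(physical_path):
        # A plain file: return its contents.
        with open(physical_path, "rb") as f:
            return ("file", f.read())
    if os.path.isdir(physical_path):
        # A directory: prefer a welcome page if one exists.
        for name in WELCOME_NAMES:
            candidate = os.path.join(physical_path, name)
            if os.path.isfile(candidate):
                with open(candidate, "rb") as f:
                    return ("welcome", f.read())
        # No welcome page: construct a simple listing on the fly.
        return ("listing", sorted(os.listdir(physical_path)))
    return ("not found", None)
```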
`HTTP URLs can also point to executable scripts. When an HTTP server
`receives a request for an URL that involves a server script, it invokes the
`program and sends the program's output to the browser. You can't tell
`from looking at it whether an URL points to a regular document or to a
`script, but if you do happen to know that a particular URL points to an
`executable script, you can pass information to it by following the URL
`with a question mark and a query string:
`
`http://www.capricorn.org/cgi-bin/phonebook?giles+goatboy
`
`The format of the query string can get fairly complex and is taken up in
`more detail in Chapter 8.
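As a rough illustration of how the simple keyword form shown above breaks apart, the sketch below separates the script path from the query string and splits the query on plus signs. The helper name split_script_url is hypothetical, and the more complex query formats the text mentions are not handled.

```python
from urllib.parse import urlsplit, unquote

def split_script_url(url):
    """Return (script_path, keywords) for a URL with a '+'-separated query."""
    parts = urlsplit(url)
    if not parts.query:
        return parts.path, []
    # Keywords are separated by "+"; decode any %XX escapes in each one.
    keywords = [unquote(word) for word in parts.query.split("+")]
    return parts.path, keywords

print(split_script_url("http://www.capricorn.org/cgi-bin/phonebook?giles+goatboy"))
```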
`Another common type of HTTP URL looks like
`
`http://www.capricorn.org/~fred
`
`This points to a user-supported directory, a set of pages located in user
`fred's home directory. This feature lets ordinary users of the Web host
`create and maintain their own home pages.
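One way a server could translate such a URL is sketched below. The text does not say where in the home directory the pages live, so both the home directory root /home and the subdirectory name public_html are assumptions made for this example. The function name resolve_user_path is also hypothetical.

```python
import posixpath

def resolve_user_path(virtual_path, home_root="/home", public_dir="public_html"):
    """Map '/~user/rest' to '<home_root>/user/<public_dir>/rest'.

    Returns None when the path is not a user-supported directory.
    home_root and public_dir are illustrative assumptions.
    """
    if not virtual_path.startswith("/~"):
        return None
    user, _, rest = virtual_path[2:].partition("/")
    if not user:
        return None
    # Collapse "." and ".." so the result stays inside the user's web directory.
    normalized = posixpath.normpath("/" + rest).lstrip("/")
    return posixpath.join(home_root, user, public_dir, normalized)

print(resolve_user_path("/~fred"))
```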
`
`FTP URLs
`FTP (file transfer protocol) is one of the oldest and probably still the most
`popular of the methods for moving files around the Internet. The usual
`FTP URL looks like
`
`ftp://hostname/path/to/the/file
`
`The browser will attempt to retrieve the file pointed to by an FTP URL by
`connecting to the specified host via anonymous FTP and issuing the correct
`sequence of commands to download the indicated file. If the URL points to a
`directory rather than a file, the browser constructs a directory listing that can
`be used for selecting files or for navigating to other directories. This means
`that the simple URL ftp://hostname/ can be used to browse an entire FTP site.
`Some FTP sites require a user name and password for access. These
`sites can be handled with the full form of the FTP URL:
`
`ftp://user:password@hostname:port/path/to/the/file
`
`For example, here's an URL that can be used for retrieving a file under the
`user name fred, password bedrock:
`
`ftp://fred:bedrock@www.capricorn.org/strip_mining.html
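To make the parts of the full FTP URL concrete, here is a small sketch that pulls them apart with Python's standard urllib.parse module. The helper name parse_ftp_url and the dictionary keys are hypothetical. The defaults reflect the text's description of anonymous FTP when no user name is given, plus the standard FTP port 21 when no port is given.

```python
from urllib.parse import urlsplit, unquote

def parse_ftp_url(url):
    """Break an FTP URL into user, password, host, port, and path."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an FTP URL: " + url)
    return {
        # With no user name in the URL, fall back to anonymous FTP.
        "user": unquote(parts.username) if parts.username else "anonymous",
        "password": unquote(parts.password) if parts.password else None,
        "host": parts.hostname,
        # 21 is the standard FTP control port.
        "port": parts.port or 21,
        "path": unquote(parts.path),
    }

print(parse_ftp_url("ftp://fred:bedrock@www.capricorn.org/strip_mining.html"))
```

Note that a password placed in a URL is visible to anyone who can read the URL, so this form is best kept to accounts that guard nothing sensitive.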
`