Ian S. Graham
John Wiley & Sons, Inc.
New York • Chichester • Brisbane • Toronto • Singapore


Publisher: Katherine Schowalter
Editor: Paul Farrell
Assistant Editor: Allison Roarty
Managing Editor: Frank Grazioli
Interior Design & Composition: Benchmark Productions, Inc.
Designations used by companies to distinguish their products are often claimed as
trademarks. In all instances where John Wiley & Sons, Inc. is aware of a claim, the
product names appear in Initial Capital or all CAPITAL letters. Readers, however,
should contact the appropriate companies for more complete information regarding
trademarks and registration.
This text is printed on acid-free paper.
Copyright © 1995 Ian S. Graham
Published by John Wiley & Sons, Inc.
All rights reserved. Published simultaneously in Canada.
This publication is designed to provide accurate and authoritative information in regard
to the subject matter covered. It is sold with the understanding that the publisher is not
engaged in rendering legal, accounting, or other professional service. If legal advice or
other expert assistance is required, the services of a competent professional person
should be sought.
Reproduction or translation of any part of this work beyond that permitted by section
107 or 108 of the 1976 United States Copyright Act without the permission of the
copyright owner is unlawful. Requests for permission or further information should be
addressed to the Permission Department, John Wiley & Sons, Inc.
Library of Congress Cataloging-in-Publication Data:
ISBN 0-471-11849-4
Printed in the United States of America


Introduction
Uniform Resource Locators
The Hypertext Transfer Protocol
The Hypertext Markup Language
Overview of the Book
Acknowledgments
Dedication

Chapter 1: Introduction to the Hypertext Markup Language
Basic Outline of the HyperText Markup Language
Example 1: A Simple HTML Document
HTML Element
Head and Title Elements
Body Element
Highlighting Elements
Paragraphs
Unordered Lists
Horizontal Rule Element
Lessons from Example 1
Example 2: Images and Hypertext Links
The Example Document
Example Document Rendered
Anchors
Partial URLs
Creating Links
Hypertext Links: The Good, the Bad, and the Ugly
Lessons from Example 2
Example 3: Home Pages
Title and Heading
Text Portion
Organization
Icons
Uniform Resource Locators
Lessons from Example 3
Example 4: Collections of Hypertext Documents
Pre Element
Organization
Archiving
Lessons from Example 4
Example 5: Images, Movies, and Sound Files
Linking Large Images
Lessons from Example 5
Example 6: Fill-In Forms
Form Element
Form Restrictions
Lessons from Example 6
References


Chapter 2: HTML in Detail
Introduction to HTML
Allowed Characters in HTML Documents
Special Characters
Comments in HTML Documents
HTML as a MIME Type
HTML Elements and Markup Tags
Case-Sensitivity
Empty Elements
Element Nesting
Unknown Elements or Attributes
Overall Document Structure
Hypertext Markup Language Specification: Element by Element
Key to This Section
Head Elements
Body Elements
List Elements
Character-Related Elements
Character Highlighting Elements
Logical Highlighting
Physical Highlighting
HTML+ Elements
References

Chapter 3: Uniform Resource Locators (URLs)
Allowed Characters in URLs
Disallowed Characters
Special Characters
Example of a Uniform Resource Locator
1. Protocol
2. Address and Port Number
3. Resource Location
Partial URLs
URL Specifications
Ftp URLs
Gopher URLs
HTTP URLs
Mailto URLs
News URLs
Telnet/tn3270/rlogin URLs
File URLs
References

Chapter 4: The HTTP Protocol and the Common Gateway Interface
The HTTP Protocol
Example HTTP Client-Server Sessions
User Authentication, Data Encryption, and Access Control
HTTP Methods and Headers Reference
HTTP Methods
HTTP Request Headers
HTTP Response Headers
The Common Gateway Interface
Sending Data from the Client to the Server
Sending Data to the Gateway Program from the Server
Returning Data from the Gateway Program to the Server
The POST Method and Standard Input
Security Considerations
References

Chapter 5: HTML and CGI Tools
Images in HTML Documents
X-Bitmap Images
X-Pixelmaps
GIF Image Files
Reducing Image File Size: The Color Map
Reducing Image Size: Rescaling Images
Transparent GIF Images
Active Images
Creating the Image Database
Icon Archive Sites
Client-Side Executable Programs
Sending the Script to the Client
Configuring the Client
Security Issues
Server-Side Document Includes
Include Command Format
Example of Server-Side Includes
HTML Utility Programs
Dtd2html
HTML Table Converter
Hypermail
MHonArc: Mail to HTML Archive
Table of Contents Generator
TreeLink
CGI Utility Functions
CGI Email Handler
CGI Feedback Form
Determining Client Software
Convert UNIX Man Pages to HTML
Processing Queries and FORM Packages
Database CGI Gateway Programs
WAIS Gateways
WWWAIS
Gateways to Structured Query Language Databases
Macintosh Search Tools: TR-WWW
CGI Archive Sites
Database Gateway References

Chapter 6: HTML Editors and Document Translators
HTML Editors
Simple Text Editors (Macintosh, PC, UNIX)
Alpha (Macintosh)
BBEdit HTML Extensions (Macintosh)
CU_HTML.DOT (PC-Word for Windows)
Emacs (UNIX, PC)
GT_HTML.DOT (PC-Word for Windows)
HoTMetaL (PC-Windows, UNIX)
HTML Assistant (PC-Windows)
HTMLed (PC-Windows)
HTML.edit (Macintosh)
HTML Editor (Macintosh)
HTML for Word 2.0 (PC-Word for Windows)
Htmltext (UNIX)
NextStep HTML-Editor (NeXT)
S.H.E. (Macintosh)
TkHTML (UNIX)
TkWWW (UNIX)
Document Translators and Converters
Asc2html
Charconv
Cyberleaf
FasTag
Frame2html
HLPDK
Hyperlatex
Latex2html
Mif2html
Miftran
Mm2html
Pthtml
RosettaMan
Rtftohtml
RTFTOHTM
Scribe2html
Striphtml
TagWrite
TeX2rtf
Texi2html
WPTOHTML
Wp2x
WebMaker
HTML Verifiers
Sgmls
HTML Validation Site
Link Verifiers
Linkcheck
Verify_Links

Chapter 7: Web Browsers and Helper Applications
Accessing Software
PC Platform Browsers
MSDOS Browser: DosLynx
Windows and OS/2 Browsers
Helper Applications
Macintosh Platform Browsers
MacWeb
Macintosh Helper Applications
UNIX and VAX/VMS Platform Browsers
Batch Mode Browser (UNIX)
Chimera (UNIX)
Emacs-w3 mode (UNIX, VMS, others)
LineMode Browser (UNIX, VMS)
Lynx (UNIX, VMS)
MidasWWW (UNIX, VMS)
Mosaic for X-Windows (UNIX)
Mosaic (TueV) 2.4.2
Quadralay GWHIS Browser (UNIX)
Rashty VMS Client (VMS)
Tkwww (UNIX)
ViolaWWW (UNIX)
UNIX Helper Packages
NeXT Platform Browsers
The CERN NeXT Browser
Omniweb
Amiga Browser: AMosaic
Coming Attractions
IBM OS/2 Browser: Web Explorer
MicroMind SlipKnot
Netscape Communications Corp.

Chapter 8: HTTP Servers and Server Utilities
Basic Server Issues
UNIX Servers
VMS Servers
Windows NT or OS/2
Windows or Macintosh
Behind a Firewall?
List of Server Software
CERN HTTP Server (UNIX, VMS)
CL-HTTP (Symbolics LISP Machines)
CMS HTTPD (VM/CMS)
DECthread HTTP Server (VAX/VMS)
GN Gopher/HTTP Server (UNIX)
GWHIS HTTP Server (UNIX)
HTTPS (Windows NT)
Jungle (UNIX)
MacHTTP (Macintosh)
NCSA HTTPD (UNIX)
OS2HTTPD (OS/2)
Plexus (UNIX)
SerWeb (Windows 3.1)
Web4Ham (Windows 3.1)
WinHTTPD: NCSA HTTPD for Windows (Windows 3.1)
Coming Attractions
BASIS WEBserver
MDMA: Multithreaded Daemon for Multimedia Access
Netscape Netsite
Server Support Programs
Getstats
Gwstat
Webster
Wusage
Wwwstat

Chapter 9: Real-World Examples
Electronic E-Print Servers
OncoLink
Introduction
The OncoLink Implementation
Use of OncoLink
Views of the Solar System
Background
HTML Issues
Summary
NetBoy - Choice of an Online Generation
San Francisco Reservations' World Wide Web Page
Introduction
Why on the World Wide Web?
Designing the WWW Page
Final Notes on SFR

Appendix A: The ISO Latin-1 Character Set
Appendix B: Multipurpose Internet Mail Extensions (MIME)
Appendix C: Finding Software Using Archie
Appendix D: Listening at a TCP/IP Port
Glossary
Index


An HTTP connection has four stages:
1. Open the connection: The client contacts the server at the Internet
address and port number specified in the URL (the default port is 80).
2. The request: The client sends a message to the server, requesting
service. The request consists of HTTP request headers that define the
method requested for the transaction and provide information about
the capabilities of the client, followed by the data being sent to the
server (if any). Typical HTTP methods are GET, for getting an object
from a server, or POST, for posting data to an object on the server.
3. The response: The server sends a response to the client. This consists
of response headers describing the state of the transaction (for example,
the status of the response, successful or not, and the type of data
being sent), followed by the actual data.
4. Close the connection: The connection is closed.
This procedure means that a connection can download only a single
document or process a single transaction, while the stateless nature of the
transaction means that each connection knows nothing about previous
connections. The implications of this are illustrated in the following example
HTTP transactions.
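The four stages can be illustrated with a short Python sketch. It is not taken from the book: the function names build_request, parse_response, and fetch are hypothetical, the HTTP/1.0 version string in the request line is an assumption, and the Accept header is just one example of a header describing the client's capabilities.

```python
import socket


def build_request(method, path, headers=None, body=b""):
    """Stage 2: a request line and header fields, a blank line, then any data."""
    lines = [f"{method} {path} HTTP/1.0"]  # version string is an assumption
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_response(raw):
    """Stage 3: split a response into status code, header fields, and data."""
    head, _, data = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return code, headers, data


def fetch(host, path, port=80):
    """Run one complete transaction: open, request, response, close."""
    with socket.create_connection((host, port)) as conn:      # 1. open
        conn.sendall(build_request("GET", path,
                                   {"Accept": "text/html"}))  # 2. request
        chunks = []
        while True:                                           # 3. response
            chunk = conn.recv(4096)
            if not chunk:  # the server has finished sending and closed its side
                break
            chunks.append(chunk)
    # 4. close: leaving the "with" block closes the client's socket
    return parse_response(b"".join(chunks))
```

Each call to fetch opens and closes its own connection, so nothing learned in one call carries over to the next. That is the stateless behavior described above.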
Suppose HTTP is being used to access an HTML document containing ten
inline images. Composing the entire document then requires 11 distinct
connections to the HTTP server: one to retrieve the HTML document
itself, and ten more to retrieve the ten image files.
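The arithmetic can be checked with a small Python sketch that counts the inline images in a page. The class and function names are hypothetical, and the sketch assumes every IMG element costs one connection, ignoring any caching a browser might do for repeated images.

```python
from html.parser import HTMLParser


class InlineImageCounter(HTMLParser):
    """Collects the SRC attribute of every IMG element in a document."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)


def connections_needed(html_text):
    """One connection for the document plus one per inline image."""
    counter = InlineImageCounter()
    counter.feed(html_text)
    counter.close()
    return 1 + len(counter.image_urls)


# A made-up page with ten inline images.
page = (
    "<html><body>"
    + "".join(f'<img src="pic{i}.gif">' for i in range(10))
    + "</body></html>"
)
print(connections_needed(page))  # prints 11
```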
Suppose a user retrieves a fill-in HTML FORM from a server and enters
his or her username and password information to access a restricted
server-side resource. When the user submits the FORM data to the server, this


Figure 4.6 Mosaic for X-Windows browser rendering of the FORM
example in Figure 4.5. The form shows a "Search string" text field and a
"Search databases in" group of check boxes for Canada, Russia, Sweden,
and U.S.A. (multiple items can be selected).
the URL and separated from it by a question mark. When the HTTP server
receives these data, it forwards the entire query string to the gateway
program form1. These details are discussed in Example 9 in the CGI section of
this chapter.
When an HTML FORM submits data to an HTTP server using the GET
method, the FORM data are appended to the URL as a query string.
Consequently, the FORM data are sent to the server in the query string
part of the locator string in the request header method field. The FORM
data are encoded according to the URL syntax.
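The encoding step can be sketched in Python with the standard urllib.parse module. This is an illustration, not the browser's own code: the function name build_get_url is hypothetical, the /cgi-bin/ directory in the path is an assumption, and the field names and values (srch, srch_type, srvr) are as best they can be read from this chapter's form example.

```python
from urllib.parse import urlencode, parse_qsl

# Field names and values as read from the form example (partly reconstructed).
FIELDS = [
    ("srch", "dogfish"),
    ("srch_type", "Exact Match"),
    ("srvr", "Canada"),
    ("srvr", "Sweden"),  # a multiple selection repeats the same name
]


def build_get_url(action, fields):
    """Append URL-encoded form data to the action URL after a question mark."""
    return action + "?" + urlencode(fields)


url = build_get_url("/cgi-bin/form1", FIELDS)  # path is an assumption
print(url)
# /cgi-bin/form1?srch=dogfish&srch_type=Exact+Match&srvr=Canada&srvr=Sweden

# The receiving side can split the query string back into name/value pairs.
print(parse_qsl(url.split("?", 1)[1]))
```

The space in "Exact Match" is encoded as a plus sign, and the two srvr selections appear as two separate name=value pairs.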


Figure 4.7 Data sent from a client to an HTTP server during a FORM-
based GET request. (The listing shows a GET request line for
/cgi-bin/form1 with the FORM data appended as a query string, a series of
Accept header fields, and a terminating blank line.)
This example again uses the FORM shown in Figure 4.6 but with one
subtle change: It uses the HTTP POST method, not GET. Therefore, this
example sends the same data to the server but by a different method.
As in Example 3, this form has variable names srch, srch_type, and srvr.
These have been assigned values srch=dogfish, srch_type=Exact Match,
srvr=Canada, and srvr=Sweden. Figure 4.8 shows the data sent to the
server, again from the Mosaic for X-Windows browser.
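For comparison with the GET case, a POST request carrying the same form data could be assembled roughly as follows. This Python sketch is not a reproduction of Figure 4.8: the function name build_post_request is hypothetical, the /cgi-bin/form1 path and the HTTP/1.0 version string are assumptions, and a real browser would also send other header fields.

```python
from urllib.parse import urlencode


def build_post_request(path, fields):
    """Build a POST message: headers, a blank line, then the encoded form data."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.0\r\n"  # no query string on the URL
        "Content-type: application/x-www-form-urlencoded\r\n"
        f"Content-length: {len(body)}\r\n"  # size of the data in bytes
        "\r\n"  # blank line separating headers from data
    )
    return head.encode("ascii") + body


fields = [
    ("srch", "dogfish"),
    ("srch_type", "Exact Match"),
    ("srvr", "Canada"),
    ("srvr", "Sweden"),
]
message = build_post_request("/cgi-bin/form1", fields)  # path is an assumption
print(message.decode("ascii"))
```

The form data are encoded exactly as they would be for GET. Only their position changes, from the URL to the message body.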
The POST method sends data in a message body, not in the URL. The
difference is indicated in three places. The first is in the method header, which
now specifies the POST method and which has no data appended to the
URL. The second is in the two new request header fields: Content-type
and Content-length. Last, it is indicated by the line of data following
the headers. This line is the data being sent to the server via the POST

