Apple Inc.
APL1009
U.S. Patent No. 8,724,622

Illustrated Guide to HTTP

Paul S. Hethmon

MANNING
Greenwich
`
For electronic browsing and ordering of this book, see http://www.browsebooks.com.

The publisher offers discounts on this book when ordered in quantity. For more
information, please contact:

Special Sales Department
Manning Publications Co.
3 Lewis Street
Greenwich, CT 06830

Fax: (203) 661-9018
email: orders@manning.com

©1997 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or
transmitted, in any form or by means electronic, mechanical, photocopying, or
otherwise, without prior written permission of the publisher.

Recognizing the importance of preserving what has been written, it is
Manning's policy to have the books it publishes printed on acid-free paper, and
we exert our best efforts to that end.

Many of the designations used by manufacturers and sellers to distinguish their
products are claimed as trademarks. Where those designations appear in the book,
and Manning Publications was aware of a trademark claim, the designations have
been printed in initial caps or all caps.

Library of Congress Cataloging-in-Publication Data
Hethmon, Paul S.
Illustrated guide to HTTP / Paul S. Hethmon.
p. cm.
Includes bibliographical references and index.
ISBN 1-884777-37-6

1. Hypertext systems. 2. HTTP (Computer network protocol)
I. Title.
`QflfifflHfildH‘iEé
`Midi-41:21
`
1997
97-15915
CIP

Manning Publications Co.
3 Lewis Street
Greenwich, CT 06830

Copyeditor: MW Mitchell
Typesetter: Dorothy Marsico
Cover designer: Leslie Haimes

Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 - CR - 00 99 98 97
`
chapter 2

HTTP overview

2.1 What is the World Wide Web?
2.2 General operation
2.3 A bit of history
2.4 HTTP/1.1
2.5 Finishing
`
2.1 What is the World Wide Web?

Just what is the World Wide Web? During the last few years, just about
everybody has defined what it is (and isn't). I'm not going to add another
definition here, but if you are reading this book you should be familiar enough
with it. Disregarding any definition, the World Wide Web has become one of the
most important information technologies of the nineties.
`
2.1.1 The client/server model

From a programmer's viewpoint, the World Wide Web is the largest client/server
system implemented to date. It is made up of innumerable clients and servers,
all exchanging information. In a typical client/server system, a proprietary
client talks to a proprietary server to accomplish some task. The task might be
a sales order system for a mail order firm, or a data mining system for
corporate executives. The Web changes things a bit, making them more
complicated and simple at the same time. The simple part comes from the open,
well-defined protocols used between the clients and the servers. The
complicated part comes from the loss of extensive programmer-defined protocols.

Let me explain the latter a little more thoroughly. If you were given the task
of writing an application to handle order entry, you would typically define the
types of transactions to occur between your client and server. A typical
exchange might be to look up a description of an item in the catalog. The
client would make a connection to the server, send a request which might be
binary or plain text, and then would receive the reply which would typically be
plain text. The reply might contain binary data also, such as a picture. Given
a TCP/IP environment using sockets, the client would make a connection to a
port on which the server is listening. Then it would send a packet of
information to the server. In order to make interpreting the data easier, you
might have defined a structure for request packets that consists of 4 bytes for
a numerical request code. The server then knows to read 4 bytes from the socket
and then interpret accordingly. When the server sends the response to the
client, the client knows to expect a certain type of reply (see Figure 2.1). In
this case, you've defined a header of 4 bytes that contains the length of the
description (in plain text), and the description immediately follows the
header. If data follows the description, then the 4 bytes after it are the
length of the binary data, the picture of the item. Once the binary data has
been received, the server closes the connection and the transaction is
finished.
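
The home-grown exchange described above can be sketched in a few lines. This is
only an illustration under stated assumptions: the request code value, the
function names, and the choice of network byte order for the 4-byte fields are
all hypothetical, since the text does not fix them.

```python
import struct

# Hypothetical request code meaning "look up an item description".
REQ_DESCRIBE_ITEM = 1


def pack_request(code):
    """Encode a request as one 4-byte unsigned integer (network byte order assumed)."""
    return struct.pack("!I", code)


def pack_reply(description, picture=b""):
    """Encode a reply: 4-byte length + plain text, then optionally 4-byte length + binary data."""
    text = description.encode("ascii")
    reply = struct.pack("!I", len(text)) + text
    if picture:
        reply += struct.pack("!I", len(picture)) + picture
    return reply


def unpack_reply(data):
    """Decode a reply built by pack_reply into (description, picture)."""
    (text_len,) = struct.unpack_from("!I", data, 0)
    text = data[4:4 + text_len].decode("ascii")
    rest = data[4 + text_len:]
    picture = b""
    if rest:
        # Anything after the description is a second length-prefixed field.
        (pic_len,) = struct.unpack_from("!I", rest, 0)
        picture = rest[4:4 + pic_len]
    return text, picture
```

Both ends must agree on every one of these details in advance, which is exactly
the flexibility (and the burden) the programmer-defined protocol carries.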
`
Figure 2.1 Client/server transaction
`
In this scenario, you, as the programmer, had the utmost flexibility. You were
able to define the exact messages and the format of the replies to them. Being
able to do this makes your code very efficient. You don't have to interpret the
transactions to any extent. You are able to minimize the amount of network
traffic you generate and maximize the amount of data in each transaction.
Continuing on with your application, you can quickly define and implement all
of the transactions your client and server need to know for proper response.

But a couple of months down the road, the word comes down from the IS
department that your nifty client/server application also needs to run under
Windows 95 and OS/2 as well as the Mac client you originally wrote. So now
you've got to go back and program two new clients and have the possibility of
doing more in the future. It would have been nice to write a single client
which would run on all of the possible operating systems. This is where HTTP
comes into play. Instead of writing clients for every possible operating
system, you can use a Web client such as Netscape Navigator, along with a Web
server, to build your client/server system.
`
Routines are a bit different in the Internet world, however. In your original
client/server application you had the freedom to define your own messaging
standards. Now, someone else is going to give you the blueprint to work from in
the form of an RFC. As mentioned previously, RFC is short for Request For
Comments. RFCs are the technical documents which describe the protocols in use
on the Internet. HTTP is the protocol used to send and receive messages between
Web clients and servers. HTML is the markup language used to create the Web
pages sent as the data resource of the HTTP message. The two are closely
related but distinct. The latest RFCs are on the CD-ROM accompanying this book.
The principal US repository for RFCs is held at the InterNIC, the agency
responsible for domain registrations, among other functions. The Web site is
www.internic.net. From the main page, follow the prompts to the Directory and
Database Services and from there to the RFC information.
`
2.2 General operation

HTTP is a request-response type of protocol. The client application sends a
request to the server and then the server responds to the request. In HTTP/0.9
and HTTP/1.0, this was generally accomplished by making a new connection for
each request. HTTP/1.1 introduces persistent connections as the default
behavior. With persistent connections, the client and server maintain the
connection, exchanging multiple requests and responses until the connection is
explicitly closed by one. Even with persistent connections, HTTP remains a
stateless protocol. No information is retained by the server between requests.

There are three general request-response chains in which HTTP operates. The
first is when a user agent makes a request directly to the origin server as
shown in Figure 2.2. In this scenario, the user agent makes a connection
directly to the origin server on the default port of 80 (unless otherwise
specified) and sends its request. The server will be listening for incoming
connections and start a new thread or process to serve the new request. Once
the request has been processed, the server sends the response back over the
connection.
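
The direct user agent to origin server chain can be tried out on a single
machine. The sketch below is a toy, not a real server: it listens on a
loopback port chosen by the operating system instead of port 80, serves
exactly one request from a helper thread, and returns a fixed page. The names
serve_once, fetch, and demo are made up for this illustration.

```python
import socket
import threading


def serve_once(listener):
    """Accept one connection, read the request, send a fixed response, close."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(4096)  # the request is tiny, so one read is enough here
        body = b"<html>hello</html>"
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n" + body)


def fetch(host, port, path):
    """Connect, send one request, and read until the server closes the connection."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Run one request-response exchange over the loopback interface."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    reply = fetch("127.0.0.1", port, "/index.html")
    worker.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(demo().decode("ascii"))
```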

Figure 2.2 Single client to server HTTP operation (the user agent sends a
request message to the HTTP server on TCP port 80)

The second request-response chain involves a proxy or cache agent as an
intermediary. In this scenario, the user agent makes its request to the proxy
instead of to the origin server (see Figure 2.3). The proxy then makes the
request to the origin server on behalf of the client. The server replies to the
proxy, and then the proxy relays this to the user agent, thus fulfilling the
request. This type of operation is mostly seen in firewall environments where
the local LAN is isolated from the Internet. An alternative to this procedure
is for the intermediate agent to also serve as a caching agent.

Figure 2.3 Client to proxy to server HTTP operation

When making a request through the cache agent, the cache agent tries to serve
the response from its internal cache of resources. The cache itself saves any
response it receives, if the response is a cachable one. This shortens the
request-response chain, improves response time, and reduces network load. Most
proxy agents are also caching agents.

The final scenario is one involving an intermediate agent acting as a tunnel.
A tunnel blindly funnels requests and responses between two HTTP applications.
As shown in Figure 2.4, it is, in essence, providing a path for the user agent
to the server.

A tunnel is different from a proxy in how it operates. A tunnel is simply a
mechanism via which the user agent sends requests and receives responses from
an origin server. The tunnel itself does nothing to the requests, unlike a
proxy which may rewrite certain headers or require authentication from the user
before providing services. A tunnel would be used most often to route HTTP
traffic over a non-TCP/IP link.

Figure 2.4 Client to server via tunnel HTTP operation
`
Past the three basic request-response chains, anyone can put together any
combination of intermediate agents. It is entirely reasonable for a user agent
to send a request to a proxy, which sends it through a tunnel which reaches
another proxy, and finally makes it to the origin server. Through all of this,
the basic idea still maintains the request-response paradigm, although it may
make many contortions along the way. Next, we will need to look in depth at the
specific operation of HTTP.
`
2.3 A bit of history

Before we delve into HTTP/1.1, a bit of background is in order. In this section
we'll examine the previous versions of HTTP: HTTP/0.9 and HTTP/1.0. HTTP/1.1 is
a response to those established previous versions, their strengths and their
shortcomings.
`
2.3.1 HTTP/0.9

The first implementation of HTTP is now known as HTTP/0.9. The entire
description of that protocol encompasses only a few pages. In HTTP/0.9, a
client program makes a connection to the server on TCP port 80. The client then
sends its request in the following form:

GET document.html CRLF

The request starts with the word GET. No other methods are supported. A space
character is then sent, followed by the document name. The document name may be
fully qualified and is not allowed to have any spaces. To end the line, the
client should send a carriage return line feed combination. The specification
mentions that servers should be tolerant of clients that transmit only the line
feed.

One other option is allowed for the document name. The client may send a search
request by appending a question mark, followed by a search term. Multiple
search terms may be specified by putting a plus sign between each. This type of
request should only be generated when the document specified contains the
ISINDEX HTML tag. This allows a request of:

GET document.html?help+me CRLF
`
For the reply, the server returns the contents of the document. There is no
content information, MIME type, or any other information returned to the
client. The protocol is, in fact, restricted to sending only HTML text
documents. When the document has been sent, the server closes the connection to
signify the end of the document. This is necessary since no length information
is exchanged between the server and client. When sending the document, the
server delimits each line by an optional carriage return, which is then
followed by a mandatory line feed character.

As can be seen from this description, implementing the HTTP/0.9 protocol can be
done in a few dozen lines of code. The problem, however, was the limitation it
imposed. Only text documents could be served and there was no method for the
client to submit information to the server.
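
To show how little there is to HTTP/0.9, here is a rough sketch of the request
line and the reply handling described above. The function names are invented
for this example, and the helpers build and split bytes only; opening the
connection to port 80 is left out.

```python
def simple_request(document, search_terms=()):
    """Build an HTTP/0.9 request: GET, a space, the document name, then CRLF."""
    if " " in document:
        raise ValueError("the document name may not contain spaces")
    target = document
    if search_terms:
        # A search is a question mark followed by terms joined with plus signs.
        target += "?" + "+".join(search_terms)
    return f"GET {target}\r\n".encode("ascii")


def split_reply_lines(reply):
    """Split a reply into lines: the carriage return is optional, the line feed is not."""
    lines = reply.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final line feed
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]
```

The end of the reply is not marked in the data at all; the client simply reads
until the server closes the connection.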
`
2.3.2 HTTP/1.0

The HTTP/1.0 protocol was developed from 1992 to 1996. It has only appeared as
an Informational RFC, as recently as May 1996. Before that point, HTTP/1.0 was
based on what the major Web servers and clients did. Since RFC 1945 is only an
informational RFC, it does not actually specify an official standard of the
Internet. It does, however, describe the common usage of HTTP/1.0 and provides
the reference for our server's later implementation via the enclosed CD.

HTTP/1.0 developed from the need to exchange more than simple text information.
It became a way to build a distributed hypermedia information system adapted to
many needs and purposes. From 1994 to 1997, the Web developed from a forum in
which computer science departments could showcase their research into a center
where everyone has a Web page. In fact, half of the television commercials
today include a URL. In order for this to happen, HTTP expanded tremendously
from its original specification.

The first major change from the HTTP/0.9 specification was the use of MIME-like
headers in request messages and in response messages. On the client side, the
request message grew from the one line request to a structured, stable
multi-line request:
`
Full-Request   = Request-Line
                 *( General-Header
                  | Request-Header
                  | Entity-Header )
                 CRLF
                 [ Entity-Body ]

Request-Line   = Method SP Request-URI SP HTTP-Version CRLF
`
The added headers resulted from the need to transmit more information in the
request. For clients, this information included sending preferences for the
type of information desired. This was expressed in terms of MIME media types:
terms such as text/html and image/gif were initiated so clients and servers
could send information each could understand and use. The additional headers
also let clients implement conditional retrievals using the If-Modified-Since
header. This header allows the client to request that the resource be returned
only if it has changed since the given date. With this, clients could cache
frequently requested pages and update them only when necessary, thus saving
valuable time and bandwidth.
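
As a rough illustration of the Full-Request layout and a conditional retrieval,
the sketch below assembles an HTTP/1.0 GET carrying Accept and
If-Modified-Since headers. The helper name full_request, the resource path, and
the cached date are made up for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def full_request(method, uri, headers, body=b""):
    """Assemble Request-Line, header lines, the blank line, and an optional entity body."""
    lines = [f"{method} {uri} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# Hypothetical moment at which the client's cached copy was fetched.
cached_on = datetime(1996, 5, 1, 12, 0, 0, tzinfo=timezone.utc)

request = full_request("GET", "/catalog/index.html", [
    ("Accept", "text/html, image/gif"),
    # format_datetime(..., usegmt=True) renders the date with a trailing "GMT".
    ("If-Modified-Since", format_datetime(cached_on, usegmt=True)),
])
```

If the resource has not changed since that date, the server can answer with a
short status-only response and the client keeps using its cached copy.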
`
On the server side, the server was finally allowed to send back content
information along with the resource. In HTTP/0.9, only the resource was sent.
With the expanded response syntax, the server could now tell the client exactly
what type of information was in the resource and, finally, send substantially
more than HTML documents:
`
Full-Response  = Status-Line
                 *( General-Header
                  | Response-Header
                  | Entity-Header )
                 CRLF
                 [ Entity-Body ]

Status-Line    = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
`
The addition of the Content-Type header allowed the server to include the media
type of the resource. Along with the original HTML documents, images and audio
files became popular and commonplace as forms of information to present on a
Web site.
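
A client reads a Full-Response by splitting off the Status-Line, then the
header lines, then the entity body. The following is a simplified sketch with
an invented function name; it assumes the whole response is already in memory
and ignores details such as repeated or folded header lines.

```python
def parse_response(raw):
    """Split a response into (version, status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


sample = (b"HTTP/1.0 200 OK\r\n"
          b"Content-Type: image/gif\r\n"
          b"Content-Length: 6\r\n"
          b"\r\n"
          b"GIF89a")
version, code, reason, headers, body = parse_response(sample)
```

With the Content-Type value in hand, the client knows to treat the six body
bytes as an image and not as HTML text.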

The next HTTP change was the definition of new request methods. Along with the
original GET request, HEAD and POST were now allowed. The HEAD request allows a
client application to request a resource and receive all of the information
about the resource without actually receiving the resource. This had uses for
Web robots and spiders, which traverse links to gather update information and
detect broken links. The POST method is what brought real interactivity to the
Web. Now clients had a way to send substantial information to a server for
processing. The GET method had been used at first as a way to transmit
information to a server, but was limited by the amount of information a server
would accept as part of the request-URI.* Now with POST, virtually unlimited
entity bodies could be sent in a request message. With this came the use of the
Web for inputting information: order forms, surveys, and requests could be made
from a Web page.

Servers also gained the ability to respond with a status code to the client's
request. The infamous 404 Not Found status code could now be sent whenever the
resource was not present. Beyond this, the server could also respond with 200
to indicate a general success response, 302 to indicate a resource had moved
temporarily to a different location, 401 to indicate authorization was
required, or 500 to indicate a general server error while trying to fulfill the
request.
`
* Uniform Resource Identifiers (URIs) are covered in Chapter 3.

The 401 Unauthorized status code leads us into the final point to make about
HTTP/1.0. It introduced the idea of restricted access to resources. A server
could require a client to supply a username and password before returning
certain resources. The idea of basic authentication allowed someone to build a
Web site with private information. Information could be restricted to a certain
person or group of people. This also allowed a Web site to track a person
throughout his visit. This ability permits a site to create a shopping cart for
a user to track the items he wishes to purchase through multiple pages. At the
end of the visit, the server can supply the complete list of items the user has
selected. Given the stateless nature of HTTP, this allows commerce to flourish
much easier on the Web.
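
In basic authentication the client answers a 401 by resending the request with
an Authorization header holding the username and password, joined by a colon
and base64 encoded. A minimal sketch follows; the function names are invented
here, and the credentials are sample values of the kind used in the HTTP
specifications.

```python
import base64


def basic_authorization(username, password):
    """Build the Authorization header line for basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Authorization: Basic " + token.decode("ascii")


def decode_basic(header_line):
    """Recover (username, password) from the header line: base64 is not encryption."""
    token = header_line.rsplit(" ", 1)[1]
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password
```

Because the encoding is trivially reversible, anyone who can read the request
can read the password.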
`
From these enhancements to the protocol, HTTP developed from a simple
information retrieval system into a general purpose transaction system capable
of building quite complex systems with standard applications across multiple
platforms. With this success came problems. Users demanded faster loading of
pages, which led to clients making multiple connections to a single server. The
higher number of connections led to bandwidth and server overload at times.
Problems also appeared as more vanity servers appeared on the Internet. Servers
which host multiple virtual domains on a single machine required a unique IP
address for each virtual domain to identify each to the software. This has
caused the finite supply of IP addresses to dwindle just a bit faster. Problems
also arose as caching agents were introduced. Servers did not have a good way
to specify what could and could not be safely cached, which led many sites to
use cache-busting techniques, which prohibit a cache agent from being able to
cache a particular response. Throughout 1995 and 1996, the IETF HTTP Working
Group worked to develop HTTP/1.1 to build upon HTTP/1.0, improve HTTP's general
capabilities, and fix some of the problems which had appeared.
`
2.4 HTTP/1.1

In operation, HTTP/1.1 closely resembles HTTP/1.0. It still consists of the
request-response paradigm and is highly compatible with HTTP/1.0 applications.
There are seven areas we'll discuss here about how HTTP/1.1 differs from
HTTP/1.0:

* New request methods
* Persistent connections
* Chunked encoding
* Byte range operations
* Content negotiation
* Digest Authentication
* Caching
`
2.4.1 New request methods

The HTTP/1.1 specification has defined two new methods which are highly
beneficial to the end user: PUT and DELETE. The PUT method allows a user agent
to request a server to accept a resource and store it as the request-URI given
by the client. This method allows a user agent to update or create a new
resource on a server. In use, an HTML editor might implement this as a way for
the user to maintain pages on a Web site. The user could create the pages and
have them automatically updated by the editor. Notice that this behavior is
different from the previously available POST method. Using POST, the user agent
was requesting the resource identified by the request-URI to accept the entity
sent by the client. In essence, it was viewed as subordinate to the
request-URI. The PUT method is asking the server to accept the entity as the
request-URI. Another use of this method might include implementing an HTTP
based revision control system.

The DELETE method is self-explanatory: the user agent is requesting that the
request-URI be removed from the server. Along with PUT, there is now a standard
method to implement Web based editing. The protocol specification specifically
allows the server to defer the actual deletion of a resource when it receives a
request. It should move the resource to a nonaccessible location, however. This
relaxation allows a server to save deleted resources in a safe place for review
before final deletion and should probably be implemented in this way by any
server. Both the DELETE and PUT methods allow a user agent to create, replace,
and delete resources on a server. Because of this, access to both methods
should be controlled in some manner, either using IP address based restrictions
or via one of the authentication methods within HTTP.
`
The OPTIONS method is used to query a server about the capabilities of a
specific resource or about the server in general. A user agent can make an
OPTIONS request against a specific resource to find out which methods the
server supports when accessing the resource. The response returned by the
server should include any communications related information about the
resource. Typical information in the response would include an Allow header
listing the supported methods when requesting the resource. A user agent may
also make a general OPTIONS request of the server and receive the same
information as it applies to the server as a whole.

The final method added, TRACE, is used for debugging purposes at the
application level. A client program can use the method to have its original
request echoed back to it. Using this information, the client can debug
problems which might occur on the way to an origin server when several
intermediate agents handle its request. In use, an HTTP traceroute can be
accomplished by letting the request advance one server at a time, checking the
response back from each.
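
As a small illustration of OPTIONS, the sketch below builds the request and
reads the method list out of an Allow header value. The function names and the
host www.example.com are placeholders. Two details come from the HTTP/1.1
specification and not from the text above: the Host header, and the use of "*"
as the request-URI for a query about the server as a whole.

```python
def options_request(target="*", host="www.example.com"):
    """Build an OPTIONS request; the default target "*" asks about the server in general."""
    return f"OPTIONS {target} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")


def allowed_methods(allow_value):
    """Turn an Allow header value such as "GET, HEAD, PUT" into a list of method names."""
    return [m.strip().upper() for m in allow_value.split(",") if m.strip()]
```

A Web based editor could use the result to decide whether to offer a save
button at all: no PUT in the list, no saving.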
`
2.4.2 Persistent connections

As mentioned a bit earlier, in the quest for user satisfaction, Web browsers
began making multiple connections to origin servers in order to speed up
response times. Unfortunately, this led to some major congestion since a few
clients could quickly bog down a slow link. The practice also suffered from the
inherent mechanisms of making TCP connections, where setup time can usurp a
good portion of the total connection cycle. Starting with HTTP/1.1, the
protocol implements, as a default behavior, the practice of persistent
connections. This means that once a client and server open a connection, the
connection remains open until one or the other specifically requests that it be
closed. While open, the client can send multiple, but separate, requests and
the server can respond to them in order. Clients are also free to send multiple
requests without waiting for the responses, basically pipelining the requests.
In practice, a client might do this when requesting all of the graphic images
from a particular page. It can make the requests for the images, one after the
other, and then finally listen for the responses from the server. Implemented
well, responsiveness to the users will be high, without the inefficiency of
making a separate connection for each request.
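
Pipelining amounts to writing several complete requests to the connection
before reading anything back. The sketch below only builds the bytes a client
would send for a page's images. The function name, host, and paths are
placeholders; the Host header and the Connection: close header on the last
request are details taken from the HTTP/1.1 specification, not from the text
above.

```python
def pipeline(host, paths):
    """Concatenate one GET per path; only the last request asks to close the connection."""
    requests = []
    for index, path in enumerate(paths):
        lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
        if index == len(paths) - 1:
            lines.append("Connection: close")
        requests.append("\r\n".join(lines) + "\r\n\r\n")
    return "".join(requests).encode("ascii")


batch = pipeline("www.example.com", ["/logo.gif", "/banner.gif", "/photo.gif"])
```

The server answers in the same order the requests were sent, so the client can
match the first response to /logo.gif, the second to /banner.gif, and so on.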
`
2.4.3 Chunked encoding

One problem arises for servers when persistent connections become the default
behavior: they must now return a proper Content-Length header with each
response. Previously, servers could signify the end of the entity body by
simply closing the connection. With persistent connections, the server can no
longer do this and must be able to determine the length of any entity it sends
to the client. For most resources, this is not a problem. The length of HTML
and image files can be determined through the operating system. Where trouble
arises is in dynamically generated responses.

Fortunately, HTTP/1.1 also provides a solution: chunked encoding. Using chunked
encoding, a server or CGI process can send back an entity body of unknown
initial length by sending it back in chunks of known length. We'll discuss the
details in a later chapter, but Figure 2.5 shows the basic format.

As shown, the server sends the size of the upcoming chunk in bytes and then the
actual chunk of data. This is repeated until all the data is sent. Once all of
the data is sent, a final size of 0 is sent, indicating the end of the data.
Following this, the server may optionally send footers, or header fields which
are allowed to be sent after the entity body. With this method, it becomes easy
for a server to send dynamically generated data and easy for the client to
decode it.
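
The size-then-data pattern can be sketched as a pair of helpers. This is a
simplified illustration with invented function names: on the wire each chunk
size is written as a hexadecimal number on its own line, and the decoder here
skips any footers that follow the final zero-size chunk.

```python
def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked entity body with no footers."""
    out = bytearray()
    for piece in pieces:
        if piece:  # an empty piece would look like the terminating chunk
            out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # a size of 0 marks the end of the data
    return bytes(out)


def decode_chunked(data):
    """Reassemble the entity body from chunked data."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore anything after ";" on the size line, then read the hex size.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(body)
```

The sender never needs the total length up front; it only has to know the size
of the piece it is about to write.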
`
Figure 2.5 Chunked encoding
`
2.4.4 Byte range operations

Another optimization and convenience introduced is byte range operations. I'm
sure almost everyone has experienced trying to download the latest beta
software from a favorite vendor, only to have the connection fail with 100
bytes to go (out of 5 MB, of course). At that point, download is attempted
again, hoping for the best. Now, the user agent can just ask for the last 100
bytes of the resource instead of asking for the entire resource again. This can
improve both the mood and response time. When requesting a byte range, a client
makes a request as normal, but includes a Range header specifying the byte
range the server is to return. The client may also specify multiple byte ranges
within a single request if it so desires. In this case, the server returns the
resource as a multipart/byteranges media type.

The use of byte ranges is not limited to recovery of failed transfers. Certain
clients may wish to limit the number of bytes downloaded prior to committing a
full request. A client with limited memory, disk space, or bandwidth can
request the first so-many bytes of a resource to let the user decide whether to
finish the download. Servers are not required to implement byte range
operations, but it is a recommended part of the protocol.
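
The Range header itself is short. The helper below, whose name and calling
convention are made up for this sketch, formats the three common shapes: a
closed range, an open-ended range, and a suffix asking for the last so-many
bytes.

```python
def range_header(*ranges):
    """Format a Range header line from (first, last) pairs of byte positions.

    (0, 499)    -> bytes 0 through 499 inclusive
    (500, None) -> from byte 500 to the end of the resource
    (None, 100) -> the final 100 bytes of the resource
    """
    specs = []
    for first, last in ranges:
        if first is None:
            specs.append(f"-{last}")
        elif last is None:
            specs.append(f"{first}-")
        else:
            specs.append(f"{first}-{last}")
    return "Range: bytes=" + ",".join(specs)
```

For the failed download in the story above, range_header((None, 100)) asks for
just the missing tail, and passing several pairs produces a multiple-range
request.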
`
2.4.5 Content negotiation

There are times when a server may hold several different representations of a
single resource in order to serve clients better. The alternate representations
may be national language versions of a page or a resource which is available
both in its regular media type and as a gzipped version. In order to provide to
the client the best representation, content negotiation may be performed. This
can take the form of server-driven, agent-driven, or transparent negotiation.

The first form, server-driven negotiation, is performed on the origin server,
based on the client's request. The server will inspect the various Accept-*
headers a client may send and, using this information plus other optional
information, send the best response to the client. This allows the client to
send Accept, Accept-Charset, Accept-Language, or any combination of the
Accept-* headers, stating their preference for responses. When servers perform
this negotiation, they must then send a Vary header to the client stating over
which parameters the server chose the particular resource. The Vary header is
required to be returned in order to provide caches with enough information to
properly determine which future requests may be satisfied by the response.
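
A much simplified sketch of server-driven negotiation over Accept-Language
follows. It assumes the client ranks its choices with q values, as the HTTP/1.1
specification allows, and it matches language tags exactly; the function names
and the sample languages are invented for the example.

```python
def parse_accept(value):
    """Parse an Accept-* header value into a list of (item, quality) pairs."""
    prefs = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        quality = 1.0  # an item without a q parameter is fully acceptable
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip() == "q":
                quality = float(val)
        prefs.append((parts[0].lower(), quality))
    return prefs


def choose_language(accept_language, available, default):
    """Pick the available language the client rates highest, else the server default."""
    ranked = sorted(parse_accept(accept_language), key=lambda p: p[1], reverse=True)
    for language, quality in ranked:
        if quality > 0 and language in available:
            return language
    return default
```

Having chosen on the basis of Accept-Language, the server would add
Vary: Accept-Language to the response so that caches know the answer depends on
that header.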

The second form of content negotiation is agent-driven. In this approach, the
server provides to the user agent the information needed to pick the best
representation of the resource. This may come in the form of the optional
Alternates header or in the entity body of the initial response. The Alternates
header is mentioned in the appendices to the HTTP protocol, but the exact
definition will be provided in a later specification. Using either approach
allows the server to provide a list of choices to the user agent. The user
agent may then automatically, or with user input, pick the best representation.

The final form is called transparent negotiation. In transparent negotiation,
an intermediate cache provides server-driven negotiation, based on the
agent-driven information from the server. In more concrete terms, the cache has
the agent-driven negotiation information from the server for a particular
resource with multiple representations. Assuming the cache understands all of
the ways in which the representations vary, it may pick the best response when
a client request is received. This allows an off-loading of server duties onto
cache agents and improves response time to clients while providing accurate
responses.
`
2.4.6 Digest Authentication

Digest Authentication is included in HTTP/1.1 as a replacement for Basic
Authentication. Basic Authentication suffers from the problem of passing the
user's password in clear text across the network. With Digest Authentication,
the password is kept as a shared secret between the client and server. The
server and client compute a digest value, using the MD5 (Message Digest 5)
algorithm* over a concatenation of the secret password and a few other values.
This digest is then sent across the network. Since only the client and server
know the secret password, the client can compute the digest value, send it to
the server, and then the server can verify it against the information it holds.
Since no one else knows the secret password, authenticity is more secure. This
algorithm is similar to the POP3 protocol's APOP method of authentication.

* MD5 is detailed in RFC 1321.

Digest Authentication is still only a reasonably secure method, however. It
still requires an outside mode of exchanging the password between clients and
servers. Digest Authentication, therefore, is meant solely as a replacement for
Basic Authentication.
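
The digest computation can be sketched with Python's hashlib. Which values are
concatenated, and in what order, is an assumption here: the layout below
follows the scheme in the original Digest Access Authentication specification
(username, realm, and password hashed together, then combined with the server's
nonce and a hash of the method and URI). The function names and sample values
are invented for the example.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of a string as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce):
    """Compute the digest a client sends in place of its password."""
    secret_part = md5_hex(f"{username}:{realm}:{password}")
    request_part = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{secret_part}:{nonce}:{request_part}")


# The server performs the same computation with the password it holds on file
# and compares the two digests; the password itself never crosses the network.
client_value = digest_response("paul", "staff", "s3cret", "GET", "/private/", "abc123")
server_value = digest_response("paul", "staff", "s3cret", "GET", "/private/", "abc123")
```

Because the server's nonce is part of the input, a digest captured from one
exchange does not match once the server issues a different nonce.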
`
2.4.7 Caching

The caching model in HTTP/1.1 allows the server a great deal of control over
the caching of responses. First, the specification makes it clear what is
cachable and what is not. Generally speaking, only GET or HEAD responses are
cachable; responses to any other method must be explicitly marked as cachable
by the server. The protocol uses the Cache-Control header to transmit caching
instructions from servers and clients to caches.

For servers, the cache control directives can be segregated into five groups:
what is cachable, what is not cachable, how old it can be, don't serve anything
past its age, and don't transform. In the first group are directives which
allow an origin server to explicitly mark something as cachable when it
normally would not be. This can be used to allow caching of authenticated
responses or responses to POST requests. An example of a cachable POST request
might be the results of a search engine on a Web site. Under many
circumstances, the results from a search would remain valid for several hours
or even a few days. If the response is cachable and serves one other client
request, the server has off-loaded some work onto cache agents.

The what is not cachable group of directives includes the no-cache and no-store
directives. Basically, these directives instruct the cache agents to never save
a response which includes the directive. The no-cache applies to responses
only, while the no-store applies to both the request and response messages. The
no-store directive can be thought of as the stronger. It instructs caches to
remove the request/response from volatile storage (i.e., memory) as soon as
possible and to never store it in nonvolatile storage (i.e., hard disk).
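
A cache has to pick the directives out of the Cache-Control header before it
can act on them. The sketch below shows one way to do that; the function names
are invented, and the storage check looks only at no-store, the stronger
directive. The exact handling of every directive is left to the specification.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into {directive: argument or None}."""
    directives = {}
    for item in value.split(","):
        name, sep, argument = item.strip().partition("=")
        if name:
            directives[name.lower()] = argument.strip('"') if sep else None
    return directives


def may_keep_copy(directives):
    """A message marked no-store must not be kept in nonvolatile storage."""
    return "no-store" not in directives


def max_age_seconds(directives):
    """Return the max-age value as an integer, or None if it is absent."""
    value = directives.get("max-age")
    return int(value) if value is not None else None
```

For example, parse_cache_control("no-store") yields a dictionary for which
may_keep_copy returns False, while a header of "max-age=3600" leaves the
response storable and gives a max_age_seconds of 3600.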
`A server who wishes to control how long a response may be cached will use
`the max-age- direct