COMPACT DISC INCLUDED

MANNING

Huawei v. Uniloc - Exhibit No. 1009 - 1/219

Illustrated Guide
to HTTP

PAUL S. HETHMON

MANNING

Greenwich
(74° w. long.)
For electronic browsing and ordering of this book, see http://www.browsebooks.com.

The publisher offers discounts on this book when ordered in quantity. For more
information, please contact:

Special Sales Department
Manning Publications Co.
3 Lewis Street
Greenwich, CT 06830

Fax: (203) 661-9018
email: orders@manning.com

©1997 by Manning Publications Co. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or
transmitted, in any form or by means electronic, mechanical, photocopying, or
otherwise, without prior written permission of the publisher.

Recognizing the importance of preserving what has been written, it is
Manning's policy to have the books it publishes printed on acid-free paper, and
we exert our best efforts to that end.
`
Many of the designations used by manufacturers and sellers to distinguish their
products are claimed as trademarks. Where those designations appear in the book,
and Manning Publications was aware of a trademark claim, the designations have
been printed in initial caps or all caps.
`
Library of Congress Cataloging-in-Publication Data
Hethmon, Paul S.
Illustrated guide to HTTP / Paul S. Hethmon.
p. cm.
Includes bibliographical references and index.
ISBN 1-884777-37-6
1. Hypertext systems. 2. HTTP (Computer network protocol)
I. Title.
QA76.76.H94H484 1997
004.6'2-dc21

97-1596
CIP
`
Manning Publications Co.
3 Lewis Street
Greenwich, CT 06830

Copyeditor: Maggie Mitchell
Typesetter: Dorothy Marsico
Cover designer: Leslie Haimes

Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 - CR - 00 99 98 97
`
CHAPTER 2 HTTP OVERVIEW

2.1 What is the World Wide Web?
`
Just what is the World Wide Web? During the last few years, just about everybody has defined what it is (and isn't). I'm not going to add another definition here, but if you are reading this book you should be familiar enough with it. Disregarding any definition, the World Wide Web has become one of the most important information technologies of the nineties.
`
2.1.1 The client/server model

From a programmer's viewpoint, the World Wide Web is the largest client/server system implemented to date. It is made up of innumerable clients and servers, all exchanging information. In a typical client/server system, a proprietary client talks to a proprietary server to accomplish some task. The task might be a sales order system for a mail order firm, or a data mining system for corporate executives. The Web changes things a bit, making them more complicated and simple at the same time. The simple part comes from the open, well-defined protocols used between the clients and the servers. The complicated part comes from the loss of extensive programmer-defined protocols.
Let me explain the latter a little more thoroughly. If you were given the task of writing an application to handle order entry, you would typically define the types of transactions to occur between your client and server. A typical exchange might be to look up the description of an item in the catalog. The client would make a connection to the server, send a request which might be binary or plain text, and then receive the reply, which would typically be plain text. The reply might also contain binary data, such as a picture. Given a TCP/IP environment using sockets, the client would make a connection to a port on which the server is listening. Then it would send a packet of information to the server. In order to make interpreting the data easier, you might have defined a structure for request packets that consists of 4 bytes for a numerical request code. The server then knows to read 4 bytes from the socket and interpret them accordingly. When the server sends the response to the client, the client knows to expect a
certain type of reply (see Figure 2.1). In this case, you've defined a header of 4 bytes that contains the length of the description (in plain text), and the description immediately follows the header. If data follows the description, then the 4 bytes after it are the length of the binary data, the picture of the item. Once the binary data has been received, the server closes the connection and the transaction is finished.
`
[Figure 2.1 Client/server transaction: the client sends a 4-byte request packet to the server, and the server returns a response packet.]
`
In this scenario, you as the programmer had the utmost flexibility. You were able to define the exact messages and the format of the replies to them. Being able to do this makes your code very efficient. You don't have to interpret the transactions to any great extent. You are able to minimize the amount of network traffic you generate and maximize the amount of data in each transaction. Continuing on with your application, you can quickly define and implement all of the transactions your client and server need to know for proper operation.
But a couple of months down the road, the word comes down from the IS department that your nifty client/server application also needs to run under Windows 95 and OS/2 as well as the Mac client you originally wrote. So now you've got to go back and program two new clients, with the possibility of doing more in the future. It would have been nice to write a single client which would run on all of the possible operating systems. This is where HTTP comes into play. Instead of writing clients for every possible operating system, you can use a Web client such as Netscape Navigator, along with a Web server, to build your client/server system.
`
`
Routines are a bit different in the Internet world, however. In your original client/server application you had the freedom to define your own messaging standards. Now, someone else is going to give you the blueprint to work from in the form of an RFC. As mentioned previously, RFC is short for Request For Comments. RFCs are the technical documents which describe the protocols in use on the Internet. HTTP is the protocol used to send and receive messages between Web clients and servers. HTML is the markup language used to create the Web pages sent as the data resource of the HTTP message. The two are closely related but distinct. The latest RFCs are on the CD-ROM accompanying this book. The principal US repository for RFCs is held at the InterNIC, the agency responsible for domain registrations, among other functions. The Web site is www.internic.net. From the main page, follow the prompts to the Directory and Database Services and from there to the RFC information.
`
2.2 General operation

HTTP is a request-response protocol. The client application sends a request to the server, and the server responds to that request. In HTTP/0.9 and HTTP/1.0, this was generally accomplished by making a new connection for each request. HTTP/1.1 introduces persistent connections as the default behavior. With persistent connections, the client and server maintain the connection, exchanging multiple requests and responses until the connection is explicitly closed by one of them. Even with persistent connections, HTTP remains a stateless protocol: no information is retained by the server between requests.

There are three general request-response chains in which HTTP operates. The first is when a user agent makes a request directly to the origin server, as shown in Figure 2.2. In this scenario, the user agent makes a connection directly to the origin server on the default port of 80 (unless otherwise specified) and sends its request. The server will be listening for incoming connections and will start a new thread or process to serve the new request. Once the request has been processed, the server sends the response back over the connection.
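The direct request-response exchange can be sketched as message construction and parsing; the host name and path below are placeholders:

```python
def build_request(method: str, uri: str, host: str) -> bytes:
    """Compose a minimal HTTP/1.1 request message."""
    return (f"{method} {uri} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"\r\n").encode("ascii")

def parse_status_line(response: bytes) -> tuple[str, int, str]:
    """Split the first line of a response into version, status code, and reason phrase."""
    line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason

req = build_request("GET", "/index.html", "www.example.com")
status = parse_status_line(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
```

The same two helpers would serve in any of the three chains discussed here, since intermediaries relay the same message format.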

The second request-response chain involves a proxy or cache agent as an intermediary. In this scenario, the user agent makes its request to the proxy
[Figure 2.4 Client to server via tunnel HTTP operation: the user agent's request passes through a tunnel to the HTTP server, and the response returns to the user agent along the same path.]
`
which may rewrite certain headers or require authentication from the user before providing services. A tunnel would be used most often to route HTTP traffic over a non-TCP/IP link.

Past the three basic request-response chains, anyone can put together any combination of intermediate agents. It is entirely reasonable for a user agent to send a request to a proxy, which sends it through a tunnel which reaches another proxy, and finally makes it to the origin server. Through all of this, the basic request-response paradigm is maintained, although the message may make many contortions along the way. Next, we will look in depth at the specific operation of HTTP.
`
2.3 A bit of history

Before we delve into HTTP/1.1, a bit of background is in order. In this section we'll examine the previous versions of HTTP: HTTP/0.9 and HTTP/1.0. HTTP/1.1 is a response to those established previous versions, their strengths and their shortcomings.
`
2.3.1 HTTP/0.9

The first implementation of HTTP is now known as HTTP/0.9. The entire description of that protocol encompasses only a few pages. In HTTP/0.9, a
client program makes a connection to the server on TCP port 80. The client then sends its request in the following form:

GET document.html CRLF

The request starts with the word GET. No other methods are supported. A space character is then sent, followed by the document name. The document name may be fully qualified and is not allowed to have any spaces. To end the line, the client should send a carriage return line feed combination. The specification mentions that servers should be tolerant of clients that transmit only the line feed.

One other option is allowed for the document name. The client may send a search request by appending a question mark, followed by a search term. Multiple search terms may be specified by putting a plus sign between each. This type of request should only be generated when the document specified contains the ISINDEX HTML tag. This allows a request of:
`
GET document.html?help+me CRLF
`
For the reply, the server returns the contents of the document. There is no content information, MIME type, or any other information returned to the client. The protocol is, in fact, restricted to sending only HTML text documents. When the document has been sent, the server closes the connection to signify the end of the document. This is necessary since no length information is exchanged between the server and client. When sending the document, the server delimits each line by an optional carriage return, which is then followed by a mandatory line feed character.

As can be seen from this description, implementing the HTTP/0.9 protocol can be done in a few dozen lines of code. The problem, however, was the limitation it imposed. Only text documents could be served and there was no method for the client to submit information to the server.
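The entire HTTP/0.9 request format can be sketched as one small helper; the document name and search terms are hypothetical:

```python
def http09_request(document: str, search_terms=None) -> bytes:
    """Build an HTTP/0.9 request line: GET, the document name, an
    optional '?'-prefixed list of '+'-joined search terms, then CRLF."""
    line = f"GET {document}"
    if search_terms:
        line += "?" + "+".join(search_terms)
    return (line + "\r\n").encode("ascii")

plain = http09_request("document.html")              # b"GET document.html\r\n"
search = http09_request("document.html", ["help", "me"])
```

That one function really is the whole client side of the protocol, which illustrates why a complete implementation fits in a few dozen lines.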
`
2.3.2 HTTP/1.0

The HTTP/1.0 protocol was developed from 1992 to 1996, yet it appeared as an Informational RFC only as recently as May 1996. Before that point,
HTTP/1.0 was based on what the major Web servers and clients did. Since RFC 1945 is only an informational RFC, it does not actually specify an official Internet standard. It does, however, describe the common usage of HTTP/1.0 and provides the reference for the server implementation included on the enclosed CD.

HTTP/1.0 developed from the need to exchange more than simple text information. It became a way to build a distributed hypermedia information system adapted to many needs and purposes. From 1994 to 1997, the Web developed from a forum in which computer science departments could showcase their research into a center where everyone has a Web page. In fact, half of the television commercials today include a URL. In order for this to happen, HTTP expanded tremendously from its original specification.
The first major change from the HTTP/0.9 specification was the use of MIME-like headers in request messages and in response messages. On the client side, the request message grew from the one-line request to a structured, stable multi-line request:

Full-Request = Request-Line
               *( General-Header
                | Request-Header
                | Entity-Header )
               CRLF
               [ Entity-Body ]

Request-Line = Method SP Request-URI SP HTTP-Version CRLF
`
The added headers resulted from the need to transmit more information in the request. For clients, this information included sending preferences for the type of information desired. This was expressed in terms of MIME media types: terms such as text/html and image/gif were introduced so clients and servers could send information each could understand and use. The additional headers also let clients implement conditional retrievals using the If-Modified-Since header. This header allows the client to request that the resource be returned only if it has changed since the given date. With this, clients could cache frequently requested pages and update them only when necessary, thus saving valuable time and bandwidth.

On the server side, the server was finally allowed to send back content information along with the resource. In HTTP/0.9, only the resource was sent. With
the expanded response syntax, the server could now tell the client exactly what type of information was in the resource and, finally, send substantially more than HTML documents:

Full-Response = Status-Line
                *( General-Header
                 | Response-Header
                 | Entity-Header )
                CRLF
                [ Entity-Body ]

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
`
The addition of the Content-Type header allowed the server to include the media type of the resource. Along with the original HTML documents, images and audio files became popular and commonplace as forms of information to present on a Web site.
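Assembling a full response with its media type can be sketched as follows; the status line, header values, and body are illustrative:

```python
def build_response(status: int, reason: str, media_type: str, body: bytes) -> bytes:
    """Compose an HTTP/1.0 response carrying Content-Type and Content-Length."""
    head = (f"HTTP/1.0 {status} {reason}\r\n"
            f"Content-Type: {media_type}\r\n"
            f"Content-Length: {len(body)}\r\n"
            f"\r\n").encode("ascii")
    return head + body

resp = build_response(200, "OK", "text/html", b"<html><body>hi</body></html>")
```

With Content-Length present, the client no longer has to rely on the HTTP/0.9 trick of the server closing the connection to mark the end of the document.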

The next HTTP change was the definition of new request methods. Along with the original GET request, HEAD and POST were now allowed. The HEAD request allows a client application to request a resource and receive all of the information about the resource without actually receiving the resource itself. This had uses for Web robots and spiders, which traverse links to gather update information and detect broken links. The POST method is what brought real interactivity to the Web. Now clients had a way to send substantial information to a server for processing. The GET method had been used at first as a way to transmit information to a server, but was limited by the amount of information a server would accept as part of the request-URI.* Now with POST, virtually unlimited entity bodies could be sent in a request message. With this came the use of the Web for inputting information: order forms, surveys, and requests could be made from a Web page.

Servers also gained the ability to respond with a status code to the client's request. The infamous 404 Not Found status code could now be sent whenever the resource was not present. Beyond this, the server could also respond with 200 to indicate a general success response, 302 to indicate a resource had moved temporarily to a different location, 401 to indicate authorization was required, or 500 to indicate a general server error while trying to fulfill the request.
`
* Uniform Resource Identifiers (URIs) are covered in Chapter 3.

The 401 Unauthorized status code leads us into the final point to make about HTTP/1.0: it introduced the idea of restricted access to resources. A server could require a client to supply a username and password before returning certain resources. The idea of basic authentication allowed someone to build a Web site with private information. Information could be restricted to a certain person or group of people. This also allowed a Web site to track a person throughout his visit. This ability permits a site to create a shopping cart for a user, tracking the items he wishes to purchase through multiple pages. At the end of the visit, the server can supply the complete list of items the user has selected. Given the stateless nature of HTTP, this allows commerce to flourish much more easily on the Web.

From these enhancements to the protocol, HTTP developed from a simple information retrieval system into a general purpose transaction system capable of building quite complex systems with standard applications across multiple platforms. With this success came problems. Users demanded faster loading of pages, which led to clients making multiple connections to a single server. The higher number of connections led to bandwidth and server overload at times. Problems also appeared as more vanity servers appeared on the Internet. Servers which host multiple virtual domains on a single machine required a unique IP address for each virtual domain to identify each to the software. This has caused the finite supply of IP addresses to dwindle just a bit faster. Problems also arose as caching agents were introduced. Servers did not have a good way to specify what could and could not be safely cached, which led many sites to use cache-busting techniques, which prevent a cache agent from caching a particular response. Throughout 1995 and 1996, the IETF HTTP Working Group worked to develop HTTP/1.1 to build upon HTTP/1.0, improve HTTP's general capabilities, and fix some of the problems which had appeared.
`
2.4 HTTP/1.1

In operation, HTTP/1.1 closely resembles HTTP/1.0. It still consists of the request-response paradigm and is highly compatible with HTTP/1.0
applications. There are seven areas we'll discuss here in which HTTP/1.1 differs from HTTP/1.0:

• New request methods
• Persistent connections
• Chunked encoding
• Byte range operations
• Content negotiation
• Digest Authentication
• Caching
`
2.4.1 New request methods

The HTTP/1.1 specification has defined two new methods which are highly beneficial to the end user: PUT and DELETE. The PUT method allows a user agent to ask a server to accept a resource and store it as the request-URI given by the client. This method allows a user agent to update or create a new resource on a server. In use, an HTML editor might implement this as a way for the user to maintain pages on a Web site. The user could create the pages and have them automatically updated by the editor. Notice that this behavior is different from the previously available POST method. Using POST, the user agent was asking the resource identified by the request-URI to accept the entity sent by the client. In essence, the entity was viewed as subordinate to the request-URI. The PUT method asks the server to accept the entity as the request-URI. Another use of this method might include implementing an HTTP-based revision control system.

The DELETE method is self-explanatory: the user agent is requesting that the request-URI be removed from the server. Along with PUT, there is now a standard method to implement Web-based editing. The protocol specification specifically allows the server to defer the actual deletion of a resource when it receives a request. It should, however, move the resource to a nonaccessible location. This relaxation allows a server to save deleted resources in a safe place for review before final deletion, and should probably be implemented in this way by
any server. Both the DELETE and PUT methods allow a user agent to create, replace, and delete resources on a server. Because of this, access to both methods should be controlled in some manner, either using IP address based restrictions or via one of the authentication methods within HTTP.

The OPTIONS method is used to query a server about the capabilities of a specific resource or about the server in general. A user agent can make an OPTIONS request against a specific resource to find out which methods the server supports when accessing the resource. The response returned by the server should include any communications-related information about the resource. Typical information in the response would include an Allow header listing the supported methods for the resource. A user agent may also make a general OPTIONS request of the server and receive the same information as it applies to the server as a whole.
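A client's handling of such a response might be sketched as follows; the method list in the sample Allow header is hypothetical:

```python
def parse_allow(response: bytes) -> list[str]:
    """Extract the methods listed in an Allow header, if one is present."""
    for line in response.split(b"\r\n"):
        if line.lower().startswith(b"allow:"):
            return [m.strip().decode("ascii")
                    for m in line.split(b":", 1)[1].split(b",")]
    return []

resp = (b"HTTP/1.1 200 OK\r\n"
        b"Allow: GET, HEAD, PUT, DELETE, OPTIONS\r\n"
        b"Content-Length: 0\r\n\r\n")
methods = parse_allow(resp)
```

An editor could use such a check before attempting PUT or DELETE, rather than discovering the restriction through an error response.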

The final method added, TRACE, is used for debugging purposes at the application level. A client program can use the method to have its original request echoed back to it. Using this information, the client can debug problems which might occur when several intermediate agents handle its request on the way to an origin server. In use, an HTTP traceroute can be accomplished by letting the request advance one server at a time, checking the response back from each.
`
2.4.2 Persistent connections

As mentioned a bit earlier, in the quest for user satisfaction, Web browsers began making multiple connections to origin servers in order to speed up response times. Unfortunately, this led to some major congestion, since a few clients could quickly bog down a slow link. The practice also suffered from the inherent mechanics of making TCP connections, where setup time can consume a good portion of the total connection cycle. Starting with HTTP/1.1, the protocol implements persistent connections as the default behavior. This means that once a client and server open a connection, the connection remains open until one or the other specifically requests that it be closed. While open, the client can send multiple, but separate, requests and the server can respond to them in order. Clients are also free to send multiple requests without waiting for the responses,
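The request side of a persistent connection can be sketched as several requests serialized onto one stream; the host and paths are placeholders:

```python
def pipeline_requests(host: str, paths: list[str]) -> bytes:
    """Serialize several GET requests back-to-back for one persistent
    HTTP/1.1 connection; the last request asks the server to close it."""
    out = b""
    for i, path in enumerate(paths):
        headers = f"GET {path} HTTP/1.1\r\nHost: {host}\r\n"
        if i == len(paths) - 1:
            headers += "Connection: close\r\n"
        out += (headers + "\r\n").encode("ascii")
    return out

stream = pipeline_requests("www.example.com", ["/a.html", "/b.html"])
```

The server would answer these requests in order on the same connection, avoiding the TCP setup cost for every request after the first.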
2.4.4 Byte range operations

Another optimization and convenience introduced is byte range operations. I'm sure almost everyone has experienced trying to download the latest beta software from a favorite vendor, only to have the connection fail with 100 bytes to go (out of 5MB, of course). At that point, the download is attempted again, hoping for the best. Now, the user agent can just ask for the last 100 bytes of the resource instead of asking for the entire resource again. This can improve both the mood and the response time. When requesting a byte range, a client makes a request as normal, but includes a Range header specifying the byte range of the resource to return. The client may also specify multiple byte ranges within a single request if it so desires. In this case, the server returns the resource as a multipart/byteranges media type.

The use of byte ranges is not limited to recovery of failed transfers. Certain clients may wish to limit the number of bytes downloaded prior to committing to a full request. A client with limited memory, disk space, or bandwidth can request the first so-many bytes of a resource to let the user decide whether to finish the download. Servers are not required to implement byte range operations, but they are a recommended part of the protocol.
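A resumed download as described above might be sketched like this; the URL, host, and offsets are illustrative:

```python
def range_request(uri: str, host: str, first_byte: int, last_byte=None) -> bytes:
    """Ask for part of a resource: bytes from `first_byte` to `last_byte`,
    or from `first_byte` to the end when no last byte is given."""
    end = "" if last_byte is None else str(last_byte)
    return (f"GET {uri} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            f"Range: bytes={first_byte}-{end}\r\n"
            f"\r\n").encode("ascii")

# Resume a download that failed near the end:
resume = range_request("/beta.zip", "www.example.com", 5242780)
# Preview only the first 100 bytes before committing:
preview = range_request("/beta.zip", "www.example.com", 0, 99)
```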
`
2.4.5 Content negotiation

There are times when a server may hold several different representations of a single resource in order to serve clients better. The alternate representations may be national language versions of a page, or a resource which is available both in its regular media type and as a gzipped version. In order to provide the best representation to the client, content negotiation may be performed. This can take the form of server-driven, agent-driven, or transparent negotiation.

The first form, server-driven negotiation, is performed on the origin server, based on the client's request. The server will inspect the various Accept-* headers a client may send and, using this information plus other optional information, send the best response to the client. This allows the client to send Accept, Accept-Charset, Accept-Language, or any combination of the Accept-* headers, stating its preference for responses. When servers perform
this negotiation, they must then send a Vary header to the client stating which request parameters the server used to choose the particular resource. The Vary header is required in order to provide caches with enough information to properly determine which future requests may be satisfied by the response.
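Server-driven selection can be sketched as picking from a variant table; the table and file names are hypothetical, and real servers also weigh q-values, which this sketch omits:

```python
def choose_variant(accept_language: str, variants: dict[str, str]) -> str:
    """Pick the first language the client lists that the server holds;
    fall back to the first available variant otherwise."""
    for lang in [t.strip() for t in accept_language.split(",")]:
        if lang in variants:
            return variants[lang]
    return next(iter(variants.values()))

variants = {"en": "index.en.html", "fr": "index.fr.html"}
page = choose_variant("fr, en", variants)
# The response carrying `page` should also include: Vary: Accept-Language
```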

The second form of content negotiation is agent-driven. In this approach, the server provides to the user agent the information needed to pick the best representation of the resource. This may come in the form of the optional Alternates header or in the entity body of the initial response. The Alternates header is mentioned in the appendices to the HTTP protocol, but its exact definition will be provided in a later specification. Either approach allows the server to provide a list of choices to the user agent. The user agent may then automatically, or with user input, pick the best representation.

The final form is called transparent negotiation. In transparent negotiation, an intermediate cache provides server-driven negotiation, based on the agent-driven information from the server. In more concrete terms, the cache has the agent-driven negotiation information from the server for a particular resource with multiple representations. Assuming the cache understands all of the ways in which the representations vary, it may pick the best response when a client request is received. This off-loads server duties onto cache agents and improves response time to clients while providing accurate responses.
`
2.4.6 Digest Authentication

Digest Authentication is included in HTTP/1.1 as a replacement for Basic Authentication. Basic Authentication suffers from the problem of passing the user's password in clear text across the network. With Digest Authentication, the password is kept as a shared secret between the client and server. The server and client compute a digest value, using the MD5* (Message Digest 5) algorithm, over a concatenation of the secret password and a few other values. This digest is then sent across the network. Since only the client and server know the secret password, the client can compute the digest value, send it to the server, and the server can verify it against the information it holds. Since no one else knows
`
`* MD5 is detailed in RFC1321.
`
the secret password, the authentication is more secure. This algorithm is similar to the POP3 protocol's APOP method of authentication.

Digest Authentication is still only a reasonably secure method, however. It still requires an out-of-band means of exchanging the password between clients and servers. Digest Authentication, therefore, is meant solely as a replacement for Basic Authentication.
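The digest computation can be sketched with the standard hashlib module. The layout below follows the general pattern of the scheme (username, realm, and password hashed together, then combined with the server's nonce and the method and URI), but the real header exchange carries more fields than shown, and the credential values here are hypothetical:

```python
import hashlib

def _md5_hex(data: str) -> str:
    return hashlib.md5(data.encode("ascii")).hexdigest()

def digest_response(user: str, realm: str, password: str,
                    nonce: str, method: str, uri: str) -> str:
    """Compute a Digest-style response value; the shared secret is
    hashed, never sent in clear text."""
    ha1 = _md5_hex(f"{user}:{realm}:{password}")   # the user's credentials
    ha2 = _md5_hex(f"{method}:{uri}")              # the request being made
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")        # bound to the server's nonce

resp = digest_response("mufasa", "testrealm", "secret",
                       "dcd98b7102dd2f0e", "GET", "/dir/index.html")
```

Because the nonce changes from challenge to challenge, an eavesdropper who captures one response value cannot replay it later.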
`
2.4.7 Caching

The caching model in HTTP/1.1 allows the server a great deal of control over the caching of responses. First, the specification makes it clear what is cachable and what is not. Generally speaking, only GET or HEAD responses are cachable; responses to any other method must be explicitly marked as cachable by the server. The protocol uses the Cache-Control header to transmit caching instructions from servers and clients to caches.

For servers, the cache control directives can be segregated into five groups: what is cachable, what is not cachable, how old it can be, don't serve anything past its age, and don't transform. In the first group are directives which allow an origin server to explicitly mark something as cachable when it normally would not be. This can be used to allow caching of authenticated responses or responses to POST requests. An example of a cachable POST request might be the results of a search engine on a Web site. Under many circumstances, the results from a search would remain valid for several hours or even a few days. If the response is cachable and serves even one other client request, the server has off-loaded some work onto cache agents.

The what is not cachable group of directives includes the no-cache and no-store directives. Basically, these directives instruct cache agents never to save a response which includes the directive. The no-cache directive applies to responses only, while no-store applies to both the request and response messages. The no-store directive can be thought of as the stronger of the two. It instructs caches to remove the request/response from volatile storage (i.e., memory) as soon as possible and to never store it in nonvolatile storage (i.e., hard disk).

A server that wishes to control how long a response may be cached will use the max-age directive. This directive sets a time limit from when it is served to
when the response is considered stale. A client may still ask a cache to return a response even though it has become stale. In these situations, the server can include a directive from the don't serve anything past its age group. These directives (must-revalidate and proxy-revalidate) instruct a cache to revalidate a response with the origin server to make certain it is still valid. If the response is not valid, the server will normally supply a fresh response; if the server cannot be contacted to revalidate the response, then the cache will return an error to the requesting client.
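The max-age freshness rule can be sketched as a simple check; ages are in seconds and the header values are illustrative:

```python
def is_fresh(age: int, cache_control: str) -> bool:
    """Return True if a cached response's age is within the max-age limit
    from its Cache-Control header; responses without a max-age directive
    are treated as fresh in this simplified sketch."""
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            return age <= int(directive.split("=", 1)[1])
    return True

fresh = is_fresh(120, "public, max-age=3600")    # well within the limit
stale = is_fresh(7200, "public, max-age=3600")   # past the limit; revalidate
```

When `is_fresh` returns False and the response carries must-revalidate, a conforming cache would contact the origin server before serving it.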

The final category of directives is the don't transform group. The directive here is called no-transform. Its function is to prevent an intermediate agent from transforming a response in any way. The typical example would be a server sending out medical images. Given the nature of medical images, the content authors wish to maintain the images in their original format, perhaps TIFF. An intermediate agent might normally wish to transform all images into JPEG format because of the space savings on disk and in bandwidth. This would result in a loss of information which is unacceptable in the given context, hence the no-transform directive.

The client agents also gain some control in HTTP/1.1 over the responses that caches serve to them. The directives can be broken down into three basic groups: not cachable, how old can it be, and don't make a new request. The not cachable group uses the no-cache and no-store directives as the servers do. Here, the meaning is slightly different. When a client requests no-cache or no-store, it is instructing the cache agent not to send any responses it may have stored, but instead to make a new request to the origin server. It also instructs the cache agent not to cache the response from the server.

In the how old can it be group, the cache control directives permit an agent to control the age of a response which a cache returns to it. It can specify this by the age of the response (how long has it been since the origin server generated it), by specifying how stale it can be (how long past its age is permissible), or by specifying how much longer the response must be f