Akamai Ex. 1037
Akamai Techs. v. Equil IP Holdings
IPR2023-00332
Page 00001

A. Fox, E.A. Brewer / Computer Networks and ISDN Systems 28 (1996) 1445-1456

in compression. This is not sufficient to bridge the order-of-magnitude gap in bandwidth between, e.g., Ethernet and consumer modems.

• A mechanism is proposed in [13] for clients to negotiate for one of several document representations stored at a server. Under the proposed scheme, the server creates a fixed number of representations in advance, possibly with human guidance, and advertises to clients which representations are available. Even if this mechanism were widely deployed (which would require changes to servers), it would not satisfy clients whose connectivity would best be exploited with an intermediate-quality representation not present at the server.
• Caching [9,12,4] and prefetching reduce initial server-client latency and server-cache bandwidth requirements, but do not reduce cache-client bandwidth. We believe that distributed intelligent caching will ultimately be necessary, but not for reducing latency and bandwidth to the client.

The methods described above are ineffective at closing the bandwidth gap because they either require changes at the server (content or control), force additional interaction with the user (e.g. to explicitly select one version of a page), do not allow the user to explicitly manage the available client bandwidth, or require the user to sacrifice graphics altogether. We describe a mechanism that addresses all of these problems: real-time adaptive distillation and refinement.

2. How distillation and refinement can help

2.1. The concept of datatype-specific distillation

Distillation is highly lossy, real-time, datatype-specific compression that preserves most of the semantic content of a document. The concept is best illustrated by example. A large color graphic can be scaled down to one-half or one-quarter length along each dimension, reducing the total area and thereby reducing the size of the representation. Further compression is possible by reducing the color depth or colormap size. The resulting representation, though poorer in color and resolution than the original, is nonetheless still recognizable and therefore useful to the user. Of course there are limits to how severe a degradation of quality is possible before the image becomes unrecognizable, but as we discuss below, we have found that order-of-magnitude size reductions are possible without significantly compromising the usefulness of an image.

Our definition of distillation as lossy compression is independent of the specific encoding of the image. For example, GIF is a lossless image encoding format, but the distillation process throws away information in shrinking the image and quantizing its colormap.

As another example, a PostScript text document can be distilled by extracting the text corresponding to document content and analyzing the text corresponding to formatting information in order to glean clues about the document structure. These clues can be used to compose a "plaintext-plus" version of the document, in which, for example, chapter headings are in all caps and centered, subsection headings are set off by blank lines, etc. The distilled representation is impoverished with respect to the original document, but contains enough semantic information to be useful to the user. Adobe Systems' Distiller Pro package (not to be confused with our use of the term "distillation") performs a similar function, constructing a portable document format (PDF) file from a PostScript file.

Clearly, distillation techniques must be datatype-specific, because the specific properties of a document that can be exploited for semantic-preserving compression vary widely among data types. We say "type" as opposed to "subtype" (in the MIME sense), since, for example, the techniques used for image distillation apply equally well regardless of whether the source image is in GIF or JPEG format.

2.2. Refinement

Although a distilled image can be a useful representation of the original, in some cases the user may want to see the full content of the original. More commonly, the user may want to see the full content of some part of the original; for instance, zooming in on a section of a graphic, or rendering a particular page containing PostScript text and figures without having to render the preceding pages.
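As a rough illustration of the arithmetic in Section 2.1, the following sketch estimates how scaling and color-depth reduction shrink the uncompressed pixel data of an image. The function names and the raw-bitmap size model are assumptions made for illustration only; the sizes of real encoded files (GIF, JPEG) depend on the encoder and on the image content.

```python
import math


def raw_bitmap_bytes(width: int, height: int, colors: int) -> int:
    """Size in bytes of an uncompressed, palette-indexed bitmap.

    Hypothetical size model: each pixel stores an index into a
    colormap of `colors` entries, using ceil(log2(colors)) bits.
    """
    bits_per_pixel = max(1, math.ceil(math.log2(colors)))
    return math.ceil(width * height * bits_per_pixel / 8)


def distilled_dimensions(width: int, height: int, scale: float) -> tuple[int, int]:
    """Scale each dimension by `scale` (for example 0.5 or 0.25)."""
    return max(1, round(width * scale)), max(1, round(height * scale))


original = raw_bitmap_bytes(640, 480, 256)      # 8 bits per pixel
w, h = distilled_dimensions(640, 480, 0.5)      # half length along each side
distilled = raw_bitmap_bytes(w, h, 16)          # 4 bits per pixel

print(original, distilled, original / distilled)
```

In this model, halving each dimension cuts the pixel count by a factor of four, and going from 256 to 16 colors halves the bits per pixel, for an eightfold reduction overall.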
We use the term refinement to refer to the process of fetching some part (possibly all) of a source document at increased quality. We can define a refinement space for a given datatype, whose axes correspond to the properties of the datatype exploited by the corresponding distillation technique. For example, some obvious axes for still graphics are scale (as a fraction of the original) and color depth. The source image corresponds to the tuple (1, 1, ..., 1) in refinement space. Distillation and refinement can then be thought of as parameterized mappings between points in refinement space. The interface by which a user specifies a desired refinement is application-specific; for example, using a mouse to select a subregion of an image for zooming.

Refinement space for a given datatype may be discrete or continuous. For example, the pixel-dimension refinement axis is (nearly) continuous, but for distilling rich text, we may be able to identify only a relatively small fixed number of intermediate quality representations. For PostScript, these would likely consist of "plain text" (ASCII only with minimal formatting clues), structured rich text (such as PDF or HTML), and original PostScript.

[Fig. 1. (top) Soda Hall, distilled image and refinement. (bottom) Distilled to 320 × 200 (17 Kbytes), refinement of writing on building (15 Kbytes).]

2.3. Trading cycles for bandwidth

Because distillation can be performed in real time, it eliminates the need for servers to maintain multiple intermediate-quality representations of a document: Any desired intermediate representation can be
created on demand using an appropriate distiller. The computing resources necessary for real-time distillation are becoming cheaper and more plentiful, and we have found that at least certain kinds of distillation can be done almost trivially in real time using modest desktop-PC hardware. Distillation and refinement allow us to trade cycles, which are cheap and plentiful, for bandwidth, which is expensive and scarce.

2.4. Using refinement for bandwidth management

As an example of refinement in action, consider a user who has downloaded the image of Soda Hall in Fig. 1 to her laptop computer using a 28.8 modem. The 300 × 200 image, which occupies 17K bytes and contains 16 colors, was obtained by distilling the original, which has pixel dimensions 880 × 610, contains 249 colors, and occupies 503K bytes. Although the distilled image is clearly recognizable as the building, the degradation of quality makes the writing on the building unreadable. The user can specify a refinement of the subregion containing the writing, which can then be viewed at full resolution and color depth, as shown in Fig. 1. The refinement requires 15K bytes to represent.

Notice how distillation and refinement have been used to explicitly manage the limited bandwidth available to the user. The distilled image and refinement together occupy 32K bytes, roughly 6% of the size of the original. The total bandwidth required to transmit them is correspondingly a small fraction of what would have been required to transmit the original, which might have been too large to view on the user's screen anyway. The process of distilling the original to produce the smaller representation took about 6 seconds of wall clock time on a lightly loaded SPARC-20 workstation; the process of extracting the subregion from the original for refinement took less than 1 second.

2.5. Optimizing for a target display

Some notebook computers and PDAs have smaller screens and can display fewer colors or grays than their desktop counterparts, in addition to suffering from limited bandwidth. For such devices, we would like to use the scarce bandwidth for transmitting a distilled representation of higher resolution, rather than using it for transmission of color information in excess of the client's display capability. Intelligent distillation will scale the source image down to reasonable dimensions for the client display, and preserve only the color information that the client can display. Distillation thus allows bandwidth to be managed in a way that exploits the client's strengths and limitations. Table 1 gives a sampling of computing devices with typical display and bandwidth characteristics, with the minimum latencies in minutes and seconds to transfer 5K, 50K and 200K bytes. These numbers serve as zeroth-order approximations for a small inline image, a large inline image, and the total amount of inline image data on a page, respectively.

Table 1
Computing device characteristics

Device               CPU/MHz              Typ. bandwidth (bits/s)  Display size                            Minimum xmit latency, 5K/50K/200K bytes
Apple Newton         ARM 610/20           2400                     320 × 240, 1-bit                        0:17/2:50/11:20
Sony MagicLink       Motorola 68340/20    14.4K                    480 × 320, 2-bit gray                   0:03/0:30/1:20
Typical notebook PC  Intel or PPC/60-100  28.8K wireline           640 × 480 to 800 × 600, 8-bit color     0:02/0:15/0:60 wireline
                                          9600 cellular                                                    0:04/0:42/2:48 cellular
Typical desktop PC   Intel or PPC/60-120  56K ISDN, 10M Ethernet   640 × 480 to 1024 × 768, 16-bit color   0:01/0:07/0:29

2.6. Optimizing for rendering on impoverished devices

Some devices, particularly PDAs, have limited onboard computing power and understand only a small number of image formats. It would be painful and slow, for example, for an Apple Newton to receive a GIF image and transcode it to PICT, its native graphics format, for display on the screen. Instead, this transcoding can be done on a more
capable desktop workstation as part of the distillation process, before the image is sent to the client. The idea of using transcoding to address client limitations has been explored in the Wireless World Wide Web experiment performed at DEC WRL [5].

[Fig. 2. Architecture of a "proxied" WWW service. Labels: Web Server; network (Internet); Proxy Server (Pythia); Client Browser; high bandwidth and low latency link (typically wireline); low bandwidth and high latency link (typically wireless or modem).]

3. An implemented HTTP proxy based on real-time distillation

We have shown that distillation and refinement provide the user with a powerful mechanism for management of limited bandwidth, without completely sacrificing bandwidth-intensive nontextual (or richtext) content. Since such a mechanism is sorely needed on the WWW, we have implemented an HTTP proxy [10] based on real-time distillation and refinement. We have observed that using our proxy makes Web surfing with a modem much more bearable, and makes Web surfing over metropolitan-area wireless feasible. (Our work on distillation and refinement was originally done in the context of wireless mobile computing.)

Mosaic, Netscape, and other popular WWW browsers allow the user to designate a particular host as a proxy for HTTP requests. Rather than fetching a URL directly from the appropriate server, the fetch request is passed on to the proxy. The proxy obtains the document from the server on the client's behalf, and forwards it to the client. The proxy mechanism was originally included to allow users behind a corporate firewall to access the WWW via a proxy that had "one foot on either side" of the firewall.

Our proxy, Pythia⁵, is intended to run near the logical boundary between well-connectedness and poorly-connectedness.
As a first-order approximation, if we take the majority of the wired Internet to be well-connected and consider a client using PPP or SLIP to be poorly-connected, Pythia can run anywhere inside the wired Internet. The architecture of a "proxied" WWW service is shown schematically in Fig. 2. The idea of placing a proxy at this boundary has also been explored in the LBX (Low Bandwidth X) project, on which the Berkeley InfoPad's [6] "split X" server is based. The idea of using a proxy to transcode data on the fly was discussed in [8].

⁵ In Greek mythology, Pythia was the intermediary who carried a pilgrim's request to the Oracle at Delphi and conveyed the reply back to the pilgrim.

3.1. Statistical models for real-time distillation

Pythia works by modeling the running time and achieved compression of distillation algorithms for various data formats. For example, given an input GIF or JPEG and a color quantization factor and scale, the model is used to predict how long the distillation will take and approximately how much compression will be achieved. Our current models provide a ballpark first cut for estimating compression and latency, though significant deviations from the model prediction are observed in a substantial fraction of cases. We expect the refinement of this model to be a focus of continuing research.

Pythia uses the model to meet user-specified bounds on inline image size (and therefore latency) while surfing the WWW. For example, suppose the user is using a 28.8 modem and has specified a maximum latency of 5 seconds per inline image, and Pythia encounters an inline image that is 40 Kbytes in size. The maximum traffic that can be carried in 5 seconds at 28.8 Kbits/sec is about 5.6 Kbytes, so Pythia calculates the distillation parameters necessary to produce a representation of the image that is about 5.6 Kbytes in size. In practice, the bound on the image size will be tighter, since Pythia must account for the additional latency introduced by the distillation process itself.

The first graph (Fig. 3) shows a breakdown of server fetch, distillation, and transmission times for a small sample of images found on Berkeley WWW servers, as transmitted to a client on a conventional 14.4 modem using PPP.

[Fig. 3. Breakdown of server fetch, distillation and transmission times for various images. Chart title: Distillation vs. Transmission. Images (source size in bytes): cool.gif (417852), graduation.gif (594939), portrate.gif (49185), soda.gif (503761), fox.gif (211237), eric.gif (19236). Horizontal axis ticks: 0, 10, 20, 30, 40.]

• The number in parentheses following the name of each image is the size of the source image, in bytes, as stored on the server.
• Each bar shows the breakdown of total client latency to receive the image: time for Pythia to retrieve the image from server (svr), time to distill the image (distill), and time to transmit the distilled image to the client (xmit). TCP roundtrip latencies between the client and proxy are absorbed into this last component.

The four different bars for each image represent four different Pythia user profiles, varying in the aggressiveness of distillation. In each case, the final size of the distilled representation is shown as a number to the right of each bar. For example, the image cool.gif, whose undistilled size is 417,852 bytes, was delivered to the client in a distilled form of 12,574 bytes. The delivery latency included about 2 seconds for Pythia to get the image from the local server, about 8 seconds to distill it, and about 32 seconds to send the distilled version to the client. The other three bars for cool.gif show similar latency measures for three different distilled representations, of 7474, 4545, and 3348 bytes.

As the graph shows, distillation time ranges from less than 1 second up to a couple of tens of seconds, on a lightly-loaded SPARCstation-20. The unusually high transmit latencies for the small (3K) images reflect a highly loaded PPP gateway that typically adds up to 0.5 seconds each way per roundtrip; unfortunately, such performance is not unusual when using PPP-based ISPs.

The second graph (Fig. 4) shows the raw transmit latencies, including TCP and PPP-gateway overhead, for transmitting the undistilled originals of the above images to the client using the same modem connection. For reference, the bars in the above graph are also reproduced below. As the graph shows, the total perceived latency at the client is reduced by approximately an order of magnitude when Pythia is used, even though the distillation process takes measurable time.

[Fig. 4. Raw transmit latencies for the undistilled originals. Chart title: Transmission Without Proxy Distillation. Horizontal axis: 0 to 500.]

3.2. Pythia's user interface

Pythia maintains a user profile associated with the IP address of each HTTP client that contacts it, and provides a mechanism for users to "register" if their IP address changes (as is the case, e.g., with ISPs that assign IP addresses dynamically when users dial up). The profile, which is user-settable via an HTML form (see Fig. 5), encodes the user's connection speed, some characteristics of the user's display device, and various other options. The display information is useful because exploiting the display's constraints may allow Pythia to produce a better representation of some graphic within the same latency bound (e.g. it will permit color information to be traded for resolution).

[Fig. 5. HTML form titled "Unregistered User/Change Preferences". Recoverable form text:
Instructions: if already registered, enter your Pythia user ID, select "I'm Here", and click Submit to tell Pythia your new IP address for this session; to register, fill out the entire form and click Submit; if you do not want to register, unset the HTTP proxy in your browser preferences; a demo gives access to the GloMop Project pages without registering.
Registration: Email Address (used as the lookup key in Pythia's registration database); choice of "New Registration or Prefs Change" or "I'm Here, Keep My Existing Prefs".
Set Preferences: "Keep inlines below ... Kbytes"; "My display can show: ... Colors / Grays"; when refinement of a distilled image is requested: "Render refined image only", "Render image within page", "Also re-anchor page around new image (increases latency)"; miscellaneous options: "Omit irritating background patterns", "Provide link to this page on every page I visit".]

To use Pythia, a user specifies Pythia's host and port as the HTTP Proxy in the Preferences dialog of most browsers, and fills out the profile form. Pages delivered by Pythia look like their "unproxied" counterparts, except that some of the inline images have been distilled. Bounding boxes of the original images are preserved, to accommodate pages where the layout has been fine-tuned for viewing on a particular browser.

The user can request a refinement of a distilled image by following an HTTP link next to each distilled image. Depending on the user's profile, the original image will be fetched and displayed on a page by itself, or it will be refined "in place" and the current page re-rendered around it. Pythia adds these "fetch refinement" links to the HTML text on the fly, as described in the next section. For example, Fig. 6 (top) shows a portion of a web page before refinement, and Fig. 6 (bottom) shows the same page after the user has refined the inline image.

If image dimension hints are supplied in the source page's IMG tag, Pythia passes them on to the client; however, Pythia cannot add dimension hints itself, since at the time it sees the referencing HTML tag, it cannot know what the actual image dimensions will be without prefetching part of the image, which might add unacceptable latency. We are experimenting with this tradeoff to determine which method will provide a higher perceived quality of service to the user.

[Fig. 6. Screenshots of a portion of Armando Fox's home page: (top) before refinement, (bottom) after refinement of the inline image.]

Pythia also translates PostScript to HTML, using
software developed in part by DEC SRC [11]. This distillation typically results in a reduction of 5-7× for PostScript text, and has the additional advantage that the text can be rendered on clients for which PostScript previewing is awkward, such as PCs running Windows. This is an example of distillation that provides both of the orthogonal benefits mentioned previously: content size reduction and optimization for rendering on the target display.

4. Implementation and performance

4.1. URL munging and HTML modification

When Pythia returns HTML text to the client, the text is scanned for IMG tags. For each such tag found, Pythia does two things:

• Modify the URL of the image source, so that when the image is requested, Pythia will recognize the tag as belonging to an inline image. This could be omitted if the HTTP "Referer:" field was filled in consistently by all browsers.
• Insert a hyperlink immediately following the image tag. This new link will contain a URL, based on that of the original image, that will cue Pythia to deliver an undistilled representation of the image. If the user has elected to have Pythia re-render the entire page with the image refined in place, the URL is instead based on the name of the referring page concatenated with a bit vector in which each bit position indicates whether the corresponding inline image should be distilled or not.

A more detailed explanation of the munging mechanism, for those who are interested, can be found on Pythia's home page⁶.

⁶ http://www.cs.berkeley.edu/ fox/glomop/pythia.html

4.2. Exploiting URL-level parallelism

Pythia's internal architecture is modular: "distillation servers" for new datatypes can be easily added. The distillation server need only provide a statistical performance model and the functionality to read the source document and write out a distilled representation given some parameters.
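One way to picture the contract just described is the sketch below. It is not Pythia's actual interface, which is not specified here; the class and method names, the parameter set, and the toy cost model are all hypothetical and serve only to show the two responsibilities of a distillation server: a performance model and a distill operation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """Hypothetical distillation parameters for an image distiller."""
    scale: float          # fraction of original length per dimension, 0 < scale <= 1
    bits_per_pixel: int   # target color depth


@dataclass(frozen=True)
class Prediction:
    seconds: float        # predicted distillation running time
    out_bytes: int        # predicted size of the distilled representation


class Distiller(ABC):
    """What a distillation server must supply, per the description above."""

    @abstractmethod
    def predict(self, in_bytes: int, params: Params) -> Prediction:
        """Statistical performance model: estimate time and output size."""

    @abstractmethod
    def distill(self, source: bytes, params: Params) -> bytes:
        """Read the source document and return a distilled representation."""


class ToyImageDistiller(Distiller):
    """Placeholder with a made-up model; it does no real image processing."""

    SECONDS_PER_INPUT_BYTE = 1e-5   # assumed constant, not a measured value
    SOURCE_BITS_PER_PIXEL = 8       # assumed color depth of the source image

    def predict(self, in_bytes: int, params: Params) -> Prediction:
        # Area shrinks with the square of the scale; depth shrinks linearly.
        ratio = params.scale ** 2 * params.bits_per_pixel / self.SOURCE_BITS_PER_PIXEL
        return Prediction(
            seconds=in_bytes * self.SECONDS_PER_INPUT_BYTE,
            out_bytes=max(1, round(in_bytes * ratio)),
        )

    def distill(self, source: bytes, params: Params) -> bytes:
        # Stand-in for real transcoding: truncate to the predicted size.
        return source[: self.predict(len(source), params).out_bytes]


distiller = ToyImageDistiller()
guess = distiller.predict(400_000, Params(scale=0.5, bits_per_pixel=4))
print(guess)
```

Keeping the model separate from the distill operation lets a front end ask for a prediction before committing any cycles, which is the property the statistical models of Section 3.1 rely on.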
Although a distiller can be launched as a standard shell pipeline, we also provide a standard makefile and front-end for building somewhat more efficient distillation servers based on Berkeley sockets.

Distillers can run on the same physical machine as Pythia or on different machines. Pythia keeps track of which distillers are running on which machines, and attempts to do simple load balancing across them. The Berkeley Network of Workstations (NOW) project [1] has provided a job-queue interface for harvesting idle cycles on machines in the NOW; we are retrofitting Pythia to use this mechanism to spawn and destroy distillers dynamically as NOW resource levels fluctuate.

4.3. Refinement cache

After distilling and forwarding an image to the client, Pythia caches a copy of the image locally to minimize latency in case the client requests refinement. The cache is a very simple fully-associative size-limited LRU whose keys are URLs and whose data fields are the original image data.

5. Implementation status, limitations, and future work

Pythia's "front end" currently runs on a lightly-loaded SPARCstation-20 and distributes image distillers to 2-4 other workstations on the same subnet. Because it is a prototype and is not consistently running, its user community is limited to only about a dozen users, and it is rare to see more than two users at one time. Under these light conditions, the workstation console does not suffer noticeable performance degradation due to Pythia usage.

Since Pythia can farm out distillation work to other workstations, the cycles required to perform distillation for clients do not constitute a performance bottleneck. Instead, as with HTTP servers, the limiting factor is the single pipe in and out of the "front end" that receives HTTP requests (i.e. the process listening on the TCP port designated as the HTTP Proxy). The current implementation of Pythia
is in unoptimized Perl; translation to C should increase the number of requests that can be handled by the front end per unit time. Current usage patterns indicate that this metric will not be a bottleneck when only a few users are served by a single front end.

We are planning joint work with the Berkeley Office of Telecommunications Services, which provides dial-up PPP and SLIP services to about 6,000 subscribers on the Berkeley campus, to allow them to provide web proxy service as part of their subscription package. This experiment will stress Pythia and allow us to explore various strategies for scaling the front-end using a "magic router" based on fast IP packet interposition [3].

Pythia currently performs a Unix fork() to handle each new HTTP request. It is well known that the latency of this operation is substantial [14]. Future versions of Pythia will be multithreaded rather than relying on process-level parallelism, and idle worker threads rather than forked processes will handle multiple incoming requests.

The bandwidth of the client connection currently must be filled in on the User Preferences HTML form. Pythia takes the user's word for this quantity, rather than attempting to measure the quality of the connection directly (e.g., by estimating the latency between the transmission of HTML text to the client and reception of an HTTP request for an embedded image). The short lifetimes of HTTP TCP connections and the overhead of TCP slow start make it difficult to measure end-to-end bandwidth accurately.

Pythia cannot hide server-to-proxy latency, though it can mitigate it by distillation and caching. Pythia's distillation estimates are based solely on the proxy-to-client bandwidth stated by each user.
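Because these estimates rest on the user-stated bandwidth, the underlying arithmetic is simple. The sketch below converts between a size and a minimum transmit time, as in Table 1, and derives a size budget from a latency bound, as in Section 3.1. It assumes nominal line rates, no protocol overhead, and 1K = 1024 bytes; the function names are hypothetical and are not taken from Pythia.

```python
def transmit_seconds(size_bytes: int, bandwidth_bps: float) -> float:
    """Minimum time to send size_bytes over a link at its nominal bit rate."""
    return size_bytes * 8 / bandwidth_bps


def size_budget_bytes(bandwidth_bps: float, max_latency_s: float,
                      distill_s: float = 0.0) -> int:
    """Largest representation deliverable within max_latency_s.

    distill_s is the time spent distilling, which eats into the budget.
    Returns 0 if distillation alone would exceed the bound.
    """
    remaining = max(0.0, max_latency_s - distill_s)
    return int(bandwidth_bps * remaining / 8)


def mmss(seconds: float) -> str:
    """Format a duration as minutes:seconds, truncating fractions."""
    whole = int(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


# 50K bytes to a 2400 bit/s device, as in the first row of Table 1.
print(mmss(transmit_seconds(50 * 1024, 2400)))     # 2:50

# Nominal upper bound for a 5-second limit on a 28.8 Kbit/s modem,
# before and after reserving 1 second for distillation.
print(size_budget_bytes(28_800, 5))                # 18000
print(size_budget_bytes(28_800, 5, distill_s=1))   # 14400
```

The nominal figure is only an upper bound. It is larger than the figure of about 5.6 Kbytes quoted in Section 3.1, and the sketch makes no attempt to reproduce whatever effective throughput that figure assumes; a practical budget shrinks further once link overhead and distillation time are subtracted.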
Pythia is fault-tolerant with respect to distillers: it will reap distillers that are killed due to NOW load balancing and will continue to function with degraded performance. If Pythia's front-end crashes, however, the user will see an error that the proxy has stopped accepting connections. We currently do not have a fault-tolerance strategy for the front-end.

Because Pythia works by munging URLs, it may cause cache inconsistency at the client. For example, after a user stops using Pythia, that user's cache will contain some entries whose keys (URLs) are the Pythia-modified URLs rather than the original source URLs. Flushing the client cache fixes this problem at significant inconvenience to the user. URL munging is necessary because HTTP provides no way for Pythia to maintain "session state" describing which inlines on a given page have been distilled and which have not. To circumvent this limitation, Pythia encodes this information into the URLs passed back and forth between client and proxy. HTTP-NG will include some notion of session control, which should allow us to maintain the appropriate state without resorting to URL-munging.

As part of our wireless and mobile computing effort, we are developing a variant of Pythia with a richer client API for building network-adaptive applications. This API will allow negotiation of a wider variety of datatypes, an environment in which agents can run, and distillation services for continuous-media streams such as MPEG (an implemented example is [2]).

6. Conclusions

Pythia provides three important orthogonal benefits to WWW clients:

• Real-time distillation and refinement, guided by statistical models, allow the user to bound latency and exercise explicit control over bandwidth that may be scarce and expensive (e.g. metered cellular phone service).
• Transcoding to a representation understood directly by the client may improve rendering on the client or result in a representation that can be transmitted more efficiently.
• Knowledge of client display constraints allows content to be optimized for rendering on the client.

Users have commented that even the prototype version of Pythia provides a qualitative increase of about 5× when surfing the WWW over PPP with a 14.4 modem. These are the same users that previously turned image loading off completely in order to make surfing bearable. With the continued growth of the WWW, the benefits afforded by proxied services like Pythia will represent increasingly significant added value to end users and content providers alike. Pythia is the first fruit of a comprehensive research agenda aimed at implementin