`
`
`
Web Performance Tuning

O’REILLY

Patrick Killelea
`
`
`
`
`Web Performance Tuning
`
`Patrick Killelea
`
Beijing · Cambridge · Köln · Paris · Sebastopol · Taipei · Tokyo
`
O’REILLY
`
`
`
`
`Web Performance Tuning
`by Patrick Killelea
`
`Copyright © 1998 O’Reilly & Associates, Inc. All rights reserved.
`Printed in the United States of America.
`
`Published by O’Reilly & Associates, Inc., 101 Morris Street, Sebastopol, CA 95472.
`
`Editor: Linda Mui
`
`Production Editor: Madeleine Newell
`
`Printing History:
`
`October 1998:
`
`First Edition.
`
`Nutshell Handbook and the Nutshell Handbook logo are registered trademarks of O’Reilly &
`Associates, Inc. JavaTM and all Java-based trademarks and logos are trademarks or registered
trademarks of Sun Microsystems, Inc., in the United States and other countries. O’Reilly &
Associates, Inc. is independent of Sun Microsystems. The association between the image of a
hummingbird and the topic of web performance tuning is a trademark of O’Reilly &
Associates, Inc.
`
`Netscape, Netscape Navigator, and the Netscape Communications Corporate logos are
`trademarks and tradenames of Netscape Communications Corporation. Internet Explorer and
`the Internet Explorer logo are trademarks and tradenames of Microsoft Corporation. All other
`product names and logos are trademarks of their respective owners. Many of the designations
`used by manufacturers and sellers to distinguish their products are claimed as trademarks.
`Where those designations appear in this book, and O’Reilly & Associates, Inc. was aware of
`a trademark claim, the designations have been printed in caps or initial caps.
`
`Appendix A is Copyright © 1998 Netscape Communications Corporation. Reproduced with
`permission.
`
`While every precaution has been taken in the preparation of this book, the publisher assumes
`no responsibility for errors or omissions, or for damages resulting from the use of the
`information contained herein.
`
`
This book is printed on acid-free paper with 85% recycled content, 15% post-consumer waste.
`O’Reilly & Associates is committed to using paper with the highest recycled content available
`consistent with high quality.
`
ISBN: 1-56592-379-0
`
`
`
`
`
`
In this chapter:
- Parameters of Performance
- Benchmark Specifications and Benchmark Tests
- Web Performance Measuring Tools
- Recommendations

Web Performance Measurement
`
`Parameters of Performance
`
There are four classic parameters describing the performance of any computer system: latency, throughput, utilization, and efficiency. Tuning a system for performance can be defined as minimizing latency and maximizing the other three parameters. Though the definition is straightforward, the task of tuning itself is not, because the parameters can be traded off against one another and will vary with the time of day, the sort of content served, and many other circumstances. In addition, some performance parameters are more important to an organization’s goals than others.
`
Latency and Throughput
`
Latency is the time between making a request and beginning to see a result. Some define latency as the time between making a request and the completion of the request, but this definition does not cleanly distinguish the psychologically significant time spent waiting, not knowing whether your request has been accepted or understood. You will also see latency defined as the inverse of throughput, but this is not useful because latency would then give you the same information as throughput. Latency is measured in units of time, such as seconds.

Throughput is the number of items processed per unit time, such as bits transmitted per second, HTTP operations per day, or millions of instructions per second (MIPS). It is conventional to use the term bandwidth when referring to throughput in bits per second. Throughput is found simply by adding up the number of items and dividing by the sample interval. This calculation may produce correct but misleading results because it ignores variations in processing speed within the sample interval.
`
`43
`Microsoft Corp. Exhibit 1044
`
`Microsoft Corp. Exhibit 1044
`
`
`
`
`44
`Chapter 3: Web Petformance Measurement
`
`The following three traditional examples help clarify the difference between
`latency and throughput:
`
1. An overnight (24-hour) shipment of 1000 different CDs holding 500 megabytes each has terrific throughput but lousy latency. The throughput is (500 × 2^20 × 8 × 1000) bits / (24 × 60 × 60) seconds = about 49 million bits/second, which is better than a T3’s 45 million bits/second. The difference is that the overnight shipment bits are delayed for a day and then arrive all at once, but T3 bits begin to arrive immediately, so the T3 has much better latency, even though both methods have approximately the same throughput when considered over the interval of a day. We say that the overnight shipment is bursty traffic. (A short sketch after this list rechecks the arithmetic.)
`
2. Supermarkets would like to achieve maximum throughput per checkout clerk because they can then get by with fewer of them. One way for them to do this is to increase your latency, that is, to make you wait in line, at least up to the limit of your tolerance. In his book Configuration and Capacity Planning for Solaris Servers (Prentice Hall), Brian Wong phrased this dilemma well by saying that throughput is a measure of organizational productivity while latency is a measure of individual productivity. The supermarket may not want to waste your individual time, but it is even more interested in maximizing its own organizational productivity.
`
3. One woman has a throughput of one baby per 9 months, barring twins or triplets, etc. Nine women may be able to bear 9 babies in 9 months, giving the group a throughput of 1 baby per month, even though the latency cannot be decreased (i.e., even 9 women cannot produce 1 baby in 1 month). This mildly offensive but unforgettable example is from The Mythical Man-Month, by Frederick P. Brooks (Addison Wesley).
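
Here is that quick arithmetic check for the first example, written as a short Python sketch (Python is used here purely for illustration):

# Throughput of the overnight CD shipment from example 1, in bits per second.
BITS_PER_MEGABYTE = 2**20 * 8            # 8,388,608 bits per megabyte
cds = 1000
megabytes_per_cd = 500
seconds_per_day = 24 * 60 * 60

total_bits = cds * megabytes_per_cd * BITS_PER_MEGABYTE
throughput = total_bits / seconds_per_day
print("shipment: %.1f million bits/second" % (throughput / 1e6))   # about 48.5
print("T3 line:  45.0 million bits/second")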
`
Although high throughput systems often have low latency, there is no causal link. You’ve just seen how an overnight shipment can have high throughput with high latency. Large disks tend to have better throughput but worse latency: the disk is physically bigger, so the arm has to seek longer to get to any particular place. The latency of packet network connections also tends to increase with throughput. As you approach your maximum throughput, there are simply more packets to put on the wire, so a packet will have to wait longer for an opening, increasing latency. This is especially true for Ethernet, which allows packets to collide and simply retransmits them if there is a collision, hoping that it retransmitted them into an open slot. It seems obvious that increasing throughput capacity will decrease latency for packet switched networks. While this is true for latency imposed by traffic congestion, it is not true for cases where the latency is imposed by routers or sheer physical distance.
`
`
`Finally, you can also have low throughput with low latency: a 14.4kbps modem
`may get the first of your bits back to you reasonably quickly, but its relatively low
`throughput means it will still take a tediously long time to get a large graphic to
`you.
`
With respect to the Internet, the point to remember is that latency can be more significant than throughput. For small HTML files, say under 2K, more of a 28.8kbps modem user’s time is spent between the request and the beginning of a response (probably over one second) than waiting for the file to complete its arrival (one second or under).
`
`Measuring network latency
`
Each step on the network from client to server and back contributes to the latency of an HTTP operation. It is difficult to figure out where in the network most of the latency originates, but there are two commonly available Unix tools that can help. Note that we’re considering network latency here, not application latency, which is the time the applications running on the server itself take to begin to put a result back out on the network.
`
If your web server is accessed over the Internet, then much of your latency is probably due to the store and forward nature of routers. Each router must accept an incoming packet into a buffer, look at the header information, and make a decision about where to send the packet next. Even once the decision is made, the router will usually have to wait for an open slot to send the packet. The latency of your packets will therefore depend strongly on the number of router hops between the web server and the user. Routers themselves will have connections to each other that vary in latency and throughput. The odd, yet essential thing about the Internet is that the path between two endpoints can change automatically to accommodate network trouble, so your latency may vary from packet to packet. Packets can even arrive out of order. You can see the current path your packets are taking and the time between router hops by using the traceroute utility that comes with most versions of Unix. (See the traceroute manpage for more information.) A number of kind souls have made traceroute available from their web servers back to the requesting IP address, so you can look at path and performance to you from another point on the Internet, rather than from you to that point. One page of links to traceroute servers is at http://www.slac.stanford.edu/comp/net/wan-mon/traceroute-srv.html. Also see http://www.internetweather.com/ for continuous measurements of ISP latency as measured from one point on the Internet.
`
Note that traceroute does a reverse DNS lookup on all intermediate IPs so you can see their names, but this delays the display of results. You can skip the DNS lookup with the -n option and you can do fewer measurements per router (the default is three) with the -q option. Here’s an example of traceroute usage:
`
% traceroute -q 2 www.umich.edu
traceroute to www.umich.edu (141.211.144.53), 30 hops max, 40 byte packets
 1  router.cableco-op.com (206.24.110.65)  22.779 ms  139.675 ms
 2  mv103.mediacity.com (206.24.105.8)  18.714 ms  145.161 ms
 3  grfge000.mediacity.com (206.24.105.55)  23.789 ms  141.473 ms
 4  bordercore2-hssi0-0.SanFrancisco.mci.net (166.48.15.249)  29.091 ms  39.856 ms
 5  bordercore2.WillowSprings.mci.net (166.48.22.1)  62.75 ms  63.16 ms
 6  merit.WillowSprings.mci.net (166.48.23.254)  82.212 ms  76.774 ms
 7  f-umbin.c-ccb2.ummet.umich.edu (198.108.3.5)  80.474 ms  76.875 ms
 8  www.umich.edu (141.211.144.53)  81.611 ms  *
`
If you are not concerned with intermediate times and only want to know the current time it takes to get a packet from the machine you’re on to another machine on the Internet (or on an intranet) and back to you, you can use the Unix ping utility. ping sends Internet Control Message Protocol (ICMP) packets to the named host and returns the latency between you and the named host as milliseconds. A latency of 25 milliseconds is pretty good, while 250 milliseconds is not good. See the ping manpage for more information. Here’s an example of ping usage:
`
% ping www.umich.edu
PING www.umich.edu (141.211.144.53): 56 data bytes
64 bytes from 141.211.144.53: icmp_seq=0 ttl=248 time=112.2 ms
64 bytes from 141.211.144.53: icmp_seq=1 ttl=248 time=83.9 ms
64 bytes from 141.211.144.53: icmp_seq=2 ttl=248 time=82.2 ms
64 bytes from 141.211.144.53: icmp_seq=3 ttl=248 time=80.6 ms
64 bytes from 141.211.144.53: icmp_seq=4 ttl=248 time=87.2 ms
64 bytes from 141.211.144.53: icmp_seq=5 ttl=248 time=81.0 ms

--- www.umich.edu ping statistics ---
6 packets transmitted, 6 packets received, 0% packet loss
round-trip min/avg/max = 80.6/87.8/112.2 ms
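
If you want to fold ping into a monitoring script rather than read its output by eye, something like the following Python sketch works. It is only a sketch: the regular expression assumes a summary line in the style shown above, which varies between ping implementations.

import re
import subprocess

def ping_avg_ms(host, count=6):
    """Run ping and return the average round-trip time in milliseconds."""
    out = subprocess.run(["ping", "-c", str(count), host],
                         capture_output=True, text=True).stdout
    # Matches a summary such as: round-trip min/avg/max = 80.6/87.8/112.2 ms
    match = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)", out)
    return float(match.group(2)) if match else None

print(ping_avg_ms("www.umich.edu"))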
`
`Measuring network latency and throughput
`
When ping measures the latency between you and some remote machine, it sends ICMP messages, which routers handle differently than the TCP segments used to carry HTTP. Routers are sometimes configured to ignore ICMP packets entirely. Furthermore, by default, ping sends only a very small amount of information, 56 data bytes, although some versions of ping let you send packets of arbitrary size. For these reasons, ping is not necessarily accurate in measuring HTTP latency to the remote machine, but it is a good first approximation. Using telnet and the Unix talk program will give you a manual feel for the latency of a connection.
`
The simplest ways to measure web latency and throughput are to clear your browser’s cache and time how long it takes to get a particular page from your server, have a friend get a page from your server from another point on the Internet, or log in to a remote machine and run time lynx -source http://myserver.com/ > /dev/null. This last method is sometimes referred to as the stopwatch method of web performance monitoring.
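
The same stopwatch idea is easy to script. The following Python sketch (the URL is a placeholder for your own server) times the wait for the first byte of the response separately from the complete transfer, which roughly separates latency from throughput:

import time
import urllib.request

url = "http://myserver.com/"        # placeholder; substitute your own server

start = time.time()
with urllib.request.urlopen(url) as resp:
    first_byte = resp.read(1)       # blocks until the first byte of the body arrives
    t_first = time.time() - start
    body = first_byte + resp.read() # drain the rest of the response
    t_total = time.time() - start

print("time to first byte: %.3f s" % t_first)
print("total time:         %.3f s" % t_total)
print("throughput:         %.1f Kbytes/sec" % (len(body) / 1024.0 / t_total))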
`
Another way to get an idea of network throughput is to use FTP to transfer files to and from a remote system. FTP is like HTTP in that it is carried over TCP. There are some hazards to this approach, but if you are careful, your results should reflect your network conditions. First, do not put too much stock in the numbers the FTP program reports to you. While the first significant digit or two will probably be correct, the FTP program internally makes some approximations, so the number reported is only approximately accurate. More importantly, what you do with FTP will determine exactly which part of the system is the bottleneck. To put it another way, what you do with FTP will determine what you’re measuring. To insure that you are measuring the throughput of the network and not of the disk of the local or remote system, you want to eliminate any requirements for disk access which could be caused by the FTP transfer. For this reason, you should not FTP a collection of small files in your test; each file creation requires a disk access.
`
Similarly, you need to limit the size of the file you transfer because a huge file will not fit in the filesystem cache of either the transmitting or receiving machine, again resulting in disk access. To make sure the file is in the cache of the transmitting machine when you start the FTP, you should do the FTP at least twice, throwing away the results from the first iteration. Also, do not write the file on the disk of the receiving machine. You can do this with some versions of FTP by directing the result to /dev/null. Altogether, we have something like this:

ftp> get bigfile /dev/null
`
Try using the FTP hash command to get an interactive feel for latency and throughput. The hash command prints hash marks (#) after the transfer of a block of data. The size of the block represented by the hash mark varies with the FTP implementation, but FTP will tell you the size when you turn on hashing:
`
`ftp> hash
`Hash mark printing on (1024 bytes/hash mark).
`ftp> get ers.27may
`200 PORT command successful.
`150 Opening BINARY mode data connection for ers.27may (362805 bytes).
`#############################################################################
`#############################################################################
`#############################################################################
`#############################################################################
`##############################################
`226 Transfer complete.
`362805 bytes received in 15 secs (24 Kbytes/sec)
`ftp> bye
`221 Goodbye.
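
If you would rather script the FTP test than type it, the sketch below (the host and filename are placeholders) follows the same precautions: it never writes the file to disk, fetches it twice, and reports only the second, cached transfer:

import time
from ftplib import FTP

HOST, FILENAME = "ftp.example.com", "bigfile"   # placeholders

def timed_get(ftp, filename):
    count = [0]
    def sink(block):                 # count the bytes but never touch the disk
        count[0] += len(block)
    start = time.time()
    ftp.retrbinary("RETR " + filename, sink)
    return count[0], time.time() - start

ftp = FTP(HOST)
ftp.login()                          # anonymous login
timed_get(ftp, FILENAME)             # first pass just warms the server's cache
nbytes, secs = timed_get(ftp, FILENAME)
ftp.quit()
print("%d bytes in %.1f secs (%.1f Kbytes/sec)" % (nbytes, secs, nbytes / 1024.0 / secs))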
`
`
You can use the Expect scripting language to run an FTP test automatically at regular intervals. Other scripting languages have a difficult time controlling the terminal of a spawned process; if you start FTP from within a shell script, for example, execution of the script halts until FTP returns, so you cannot continue the FTP session. Expect is designed to deal with this exact problem. Expect is well documented in Exploring Expect, by Don Libes (O’Reilly & Associates).
`
You can of course also retrieve content via HTTP from your server to test network performance, but this does not cleanly distinguish network performance from server performance.
`
`Here are a few more network testing tools:
`
`ttcp
`
ttcp is an old C program, circa 1985, for testing TCP connection speed. It makes a connection on port 2000 and transfers zeroed buffers or data copied from STDIN. It is available from ftp://ftp.arl.mil/pub/ttcp/ and distributed with some Unix systems. Try which ttcp and man ttcp on your system to see if the binary and documentation are already there.
`nettest
`
A more recent tool, circa 1992, is Nettest, available at ftp://ftp.sgi.com/sgi/src/nettest/. Nettest was used to generate some performance statistics for vBNS, the very-high-performance backbone network service (http://www.vbns.net/).
`
`bing
`
bing attempts to measure bandwidth between two points on the Internet. See http://web.cnam.fr/reseau/bing.html.
`
chargen

The chargen service, defined in RFC 864 and implemented by most versions of Unix, simply sends back nonsense characters to the user at the maximum possible rate. This can be used along with some measuring mechanism to determine what that maximum rate is (a short sketch after this list shows one way to do so). The TCP form of the service sends a continuous stream, while the UDP form sends a packet of random size for each packet received. Both run on well-known port 19.
`
`netspec
`
NetSpec simplifies network testing by allowing users to control processes across multiple hosts using a set of daemons. It can be found at http://www.tisl.ukans.edu/Projects/AAI/products/netspec/.
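
As an example of the measuring mechanism mentioned under chargen, here is a minimal Python sketch; it assumes the target host actually runs the TCP chargen service on port 19, which many sites disable:

import socket
import time

def chargen_rate(host, seconds=5):
    """Read from the TCP chargen service on port 19 and return bytes per second."""
    s = socket.create_connection((host, 19), timeout=10)
    total = 0
    deadline = time.time() + seconds
    while time.time() < deadline:
        data = s.recv(65536)
        if not data:                 # server closed the connection
            break
        total += len(data)
    s.close()
    return total / seconds

print("%.1f Kbytes/sec" % (chargen_rate("localhost") / 1024.0))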
`
`Utilization
`
Utilization is simply the fraction of the capacity of a component that you are actually using. You might think that you want all your components at close to 100% utilization in order to get the most bang for your buck, but this is not necessarily how things work. Remember that for disk drives and Ethernet, latency suffers greatly at high utilization. A rule of thumb is that many components can run at their best performance up to about 70% utilization. The perfmeter tool that comes with many versions of Unix is a good graphical way to monitor the utilization of your system.
`
Efficiency

Efficiency is usually defined as throughput divided by utilization. When comparing two components, if one has a higher throughput at the same level of utilization, it is regarded as more efficient. If both have the same throughput but one has a lower level of utilization, that one is regarded as more efficient. While useful as a basis for comparing components, this definition is otherwise irrelevant, because it is only a division of two other parameters of performance.
`
A more useful measure of efficiency is performance per unit cost. This is usually called cost efficiency. Performance tuning is the art of increasing cost efficiency: getting more bang for your buck. In fact, the Internet itself owes its popularity to the fact that it is much more cost-efficient than previously existing alternatives for transferring small amounts of information. Email is vastly more cost-efficient than a letter. Both send about the same amount of information, but email has near-zero latency and near-zero incremental cost; it doesn’t cost you any more to send two emails rather than one. Web sites providing product information are lower latency and cheaper than printed brochures. As the throughput of the Internet increases faster than its cost, entire portions of the economy will be replaced with more cost-efficient alternatives, especially in the business-to-business market, which has little sentimentality for old ways. First, relatively static information such as business paperwork, magazines, books, CDs, and videos will be virtualized. Second, the Internet will become a real-time communications medium.
`
The cost efficiency of the Internet for real-time communications threatens not only the obvious target of telephone carriers, but also the automobile industry. That is, telecommuting threatens physical commuting. Most of the workforce simply moves bits around, either with computers, on the phone, or in face-to-face conversations, which are, in essence, gigabit-per-second, low-latency video connections. It is only these face-to-face conversations that currently require workers to buy cars for the commute to work. Cars are breathtakingly inefficient, and telecommuting represents an opportunity to save money. Look at the number of cars on an urban highway during rush hour. It’s a slow river of metal, fantastically expensive in terms of car purchase, gasoline, driver time, highway construction, insurance, and fatalities. Then consider that most of those cars spend most of the day sitting in a parking lot. Just think of the lost interest on that idle capital. And consider the cost of the parking lot itself, and the office. As data transmission costs continue to accelerate their fall, car costs cannot fall at the same pace. Gigabit connections between work and home will inevitably be far cheaper than the daily commute, for both the worker and employer. And at gigabit bandwidth, it will feel like you’re really there.
`
`Benchmark Specifications and
`Benchmark Tests
`
For clarity, we should distinguish between benchmark specifications and benchmark tests. There are several web benchmarks that may be implemented by more than one test, since there are implementation details that do not affect the results of the test. For example, a well-specified HTTP load is the same regardless of the hardware and software used to generate the load and regardless of the actual bits in the content. On the other hand, some benchmarks are themselves defined by a test program or suite, so that running the test is the only way to run the benchmark. We will be considering both specifications and tests in this section.
`
The point of a benchmark is to generate performance statistics that can legitimately be used to compare products. To do this, you must try to hold constant all of the conditions around the item under test and then measure performance. If the only thing different between runs of a test is a particular component, then any difference in results must be due to the difference between the components.
`
Exactly defining the component under test can be a bit tricky. Say you are trying to compare the performance of Solaris and Irix in running Netscape server software. The variable in the tests is not only the operating system, but also, by necessity, the hardware. It would be impossible to say from a benchmark alone which performance characteristics are due to the operating system and which are due to the hardware. You would need to undertake a detailed analysis of the OS and the hardware, which is far more difficult.
`
It may sound odd, but another valid way to think of a benchmark test is the creation of a deliberate bottleneck at the subject of the test. When the subject is definitely the weakest link in the chain, then the throughput and latency of the whole system will reflect those of the subject. The hard part is assuring that the subject is actually the weakest link, because subtle changes in the test can shift the bottleneck from one part of the system to another, as we saw earlier with the FTP test of network capacity. If you’re testing server hardware throughput, for example, you want to have far more network throughput than the server could possibly need, otherwise you may get identical results for all hardware, namely the bandwidth of the network.
`
`
`
`
`
`
In this chapter:
- Brief History of the Web Browser
- How Browsers Work
- Popular Browsers
- Browser Speed
- Browser Tuning Tips
- Figuring Out Why the Browser Is Hanging
- Recommendations

Client Software
`
`Brief History of the Web Browser
`
The idea of a hypertext browser is not new. Many word processing packages such as FrameMaker and formats such as PDF generate or incorporate hyperlinks. The idea of basing a hypertext browser on common standards such as ASCII text and Unix sockets was an advance first made by the Gopher client and server from the University of Minnesota. Gopher proved to be extremely light and quick, but the links were presented in a menu separate from the text, and Gopher did not have the ability to automatically load images. The first drawback was solved by the invention of HTML, and the second was solved in the first graphical HTML browser, Mosaic, produced in 1993 at the University of Illinois National Center for Supercomputing Applications (NCSA).
`
Many of the original students who developed Mosaic were among the founders of Netscape the following year. An effort by the University of Illinois to commercialize Mosaic led to the founding of Spyglass, which licensed its code to Microsoft for the creation of Internet Explorer. Netscape and IE have been at the forefront of browser advances in the last few years, but the core function of the browser, to retrieve and display hypertext and images, has remained the same.
`
`How Browsers Work
`
The basic function of a browser is extremely simple. Any programmer with a good knowledge of Perl or Java can write a minimal but functional text-only browser in one day. The browser makes a TCP socket connection to a web server, usually on port 80, and requests a document using HTTP syntax. The browser receives an HTML document over the connection and then parses and displays it, indicating in some way which parts of the text are links to other documents or images. When the user selects one of the links, perhaps by clicking on it, the process starts all over again, with the browser requesting another document. In spite of the advances in HTML, HTTP, and Java, the basic functionality is exactly the same for all web browsers.
`
Let’s take a look at the functionality of recent browsers in more detail, noting performance issues. To get the ball rolling, the browser first has to parse the URL you’ve typed into the “Location:” box or recognize which link you’ve clicked on. This should be extremely quick. The browser then checks its cache to see if it has that page. The page is looked up through a quick hashed database mapping URLs to cache locations. Dynamic content should not be cached, but if the provider of the content did not specify an immediate timeout in the HTTP header or if the browser is not clever enough to recognize CGI output from URLs, then dynamic content will be cached as well.
`
If the page requested is in the cache and the user has requested via a preference setting that the browser check for updated versions of pages, then a good browser will try to save time by making only an HTTP HEAD request to the server with an If-modified-since line to check whether the cached page is out of date. If the reply is that the cached page is still current, the browser simply displays the page from the cache. If the desired web page is not in the cache, or is in the cache but is stale, then the browser needs to request the current version of the page from the server.
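
In protocol terms, the check looks something like the sketch below (Python, with a placeholder host and date): the request carries an If-Modified-Since header, and a 304 Not Modified reply means the cached copy can be displayed as is:

import http.client

# Revalidate a cached page: ask whether it has changed since the date we cached it.
conn = http.client.HTTPConnection("www.example.com")        # placeholder host
conn.request("HEAD", "/index.html",
             headers={"If-Modified-Since": "Sat, 01 Aug 1998 00:00:00 GMT"})
resp = conn.getresponse()
if resp.status == 304:
    print("cached copy is still current; display it from the cache")
else:
    print("page has changed (status %d); fetch a fresh copy" % resp.status)
conn.close()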
`
In order to connect to a web server, the client machine needs to know the server’s 4-byte IP address (e.g., 198.137.240.92). But the browser usually has only the fully-qualified server name (e.g., www.whitehouse.gov) from the user’s manual request or the HTML of a previous page. The client machine must figure out which IP address is associated with the DNS name of a web server. It does this via the distributed database of domain name to IP mappings, that is, DNS. The client machine makes a request of its local name server, which either knows the answer or queries a higher-level server for the answer. If an IP answer is found, the client can then make a request directly to the server by using that IP address. If no answer is found, the request cannot proceed and the browser will display “No DNS Entry” or some other cryptic message to the user.
`
The performance problem here is that DNS lookups are almost always implemented with blocking system calls, meaning that nothing else can happen in the browser until the DNS lookup succeeds or fails. If the local DNS server is overloaded, the browser will simply hang until some rather long operating system timeout expires, perhaps one minute. DNS services, like most other Internet services, tend to get exponentially slower under heavy load. The only guaranteed way to avoid the performance penalty associated with DNS is not to use it. You can simply embed IP addresses in HTML or type them in by hand. This is hard on the user, because DNS names are much easier to remember than IP addresses, and because it is confusing to see an IP address appear in the “Location:” box of the browser. Under good conditions, DNS lookup takes only a few tenths of a second. Under bad conditions, it can be intolerably slow.
`
The client-side implementation of DNS is known as the resolver. The resolver is usually just a set of library calls rather than a distinct program. Under Unix, for example, the resolver is part of the libc library that most C programmers use for their applications. Fortunately, most DNS resolvers cache recently requested DNS names, so subsequent lookups are much faster than the first.
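
You can get a feel for how long that blocking lookup takes on your own machine with a quick sketch like this (Python; the hostname is just an example):

import socket
import time

host = "www.whitehouse.gov"          # example hostname from the text

start = time.time()
try:
    addr = socket.gethostbyname(host)   # blocks until DNS answers or the lookup fails
    print("%s -> %s in %.3f seconds" % (host, addr, time.time() - start))
except socket.gaierror:
    print("no DNS entry for %s (gave up after %.3f seconds)" % (host, time.time() - start))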
`
Once a browser client has the IP address of the desired server, it generates the HTTP request describing its abilities and what it wants, and hands it off to the OS for transmission. In generating the HTTP request, the browser will check for previously received cookies associated with the desired page or DNS domain and send those along with the request so that the web server can easily identify repeat customers. The whole request is small, a hundred bytes or so. The OS attempts to establish a TCP connection to the server and to give the server the browser’s request. The browser then simply waits for the answer or a timeout. If no reply is forthcoming, the browser does not know whether it is because the server is overloaded and cannot accept a new connection, because the server crashed, or because the server’s network connection is down.
`
When the response from the server arrives, the OS gives it to the browser, which then checks the header for a valid HTTP response code and a new cookie. If the response is OK, the browser stores any cookie, parses the HTML content or image, and starts to calculate how to display it. Parsing is very CPU-intensive. You can feel how fast your CPU is when you load a big HTML page, say 100K or more, from cache or over a very fast network connection. Remember that parsing text is a step distinct from laying out and displaying it. Netscape, in particular, will delay the display of parsed text until the size of every embedded image is known. If the image sizes are not included in the HTML <IMG> tag, this means that the browser must request every image and receive a response before the user sees anything on the page.
`
The order in which an HTML page is laid out is up to the particular browser. In Netscape 4.x, web pages are rendered in the following order, once all the image sizes are known:

1. The text of the page is laid out. Links in the text are checked against a history database, and if found, are shown in a different color to indicate that the user has already clicked on them.

2. The boundary boxes for images are displayed with any ALT text for the image and with the image icon.
`
`
`
`
`
`88
`Chapter 6: Client Software
`
3. Images are displayed, perhaps with progressive rendering, where the image gains in definition as data arrives rather than simply filling in from top to bottom. It is common for Netscape to load and show an image before showing any text.

4. Subsidiary frames are loaded starting over at step 1.

A browser may open multiple connections to the server. You can clearly