Ninth System Administration Conference (LISA '95)
Monterey, California, September 18-22, 1995
`
`
`
`
Administering Very High Volume Internet Services

Dan Mosedale, William Foss, and Rob McCool
Netscape Communications

ABSTRACT

Providing WWW or FTP service on a small scale is already a well-solved problem. Scaling this to work at a site that accepts millions of connections per day, however, can easily push multiple machines and networks to the bleeding edge. In this paper we give out concrete configuration techniques that have helped us get the best possible performance out of server resources. Our analysis is mostly centered on WWW service, but much of the information applies equally well to FTP service. Additionally, we discuss some of the tools that we use for day-to-day management. We don't have a lot of specific statistics about exactly how much each configuration change helped us. Rather, this paper represents our many iterations through the "watch the load increase, see various failures, fix what's broken" loop. The intent is to help the reader configure a high-performance, manageable server from the start, and then to supply ideas about what to look for when it becomes overloaded.
`
Our Site

Netscape Communications runs what we believe to be one of the highest volume web services on the Internet. Our machines currently take a total of between six and eight million HTTP hits per day, and this number continues to grow. Furthermore, we make our web browser (the Netscape Navigator) available for downloading via FTP and HTTP.

The web site [2] contains online documentation for the Netscape Navigator, sales and marketing information about our entire product line, many general interest pages (including various directory services), as well as home pages for Netscape employees. All of the machines run the Netscape Server, but most of the strategies in this paper should apply to other HTTP (and even FTP) servers also.

At various times we have tried out various configurations of machines running the given operating systems; Figure 1 shows a list. Each of our WWW servers has an identical content tree uploaded to it (more on this later).

Before the Netscape Navigator was released to the Internet community for the first time, we thought about the web pages we intended to serve, and debated how we would spread the load across multiple machines when that became necessary. Because of problems reported using DNS round-robin techniques [3], we chose to instead implement a randomization scheme inside of the Netscape Navigator itself. In short, when accessing home.mcom.com or home.netscape.com, a copy of the Navigator periodically queries the DNS for a hostname of the form homeX.netscape.com, where X is a random number between 1 and 16. Each of our web servers has a number of the homeX aliases pointing to it. Since this strategy is not something that will be available to most sites, we won't spend more time on it here.
`
Another scheme which we have looked into is a nameserver-based load balancing scheme. This depends upon a nameserver periodically polling each content server at a site to find out how loaded they are (though a less functional version could simply use a static weighting). The nameserver then resolves the domain name to the IP address of the server which currently has the least load. This has the added benefit of not sending requests to a machine that is overloaded or unavailable, effectively creating a poor man's failover system. It can, however, leave a dead machine in the server pool for as long a duration as the DNS TTL. Two DNS load balancing schemes that we plan to investigate further are RFC 1794 [4] and lbnamed [5].
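As a rough illustration of the selection step such a nameserver would make (this is not an implementation of RFC 1794 or lbnamed, and the host names and load figures are invented), a few lines of Python:

    # Hypothetical sketch: which address should a load-balancing nameserver
    # answer with?  "Load" numbers would come from periodically polling each
    # content server; static weights are the less functional fallback.
    POLLED_LOAD = {"www1.example.com": 0.42, "www2.example.com": 0.17}
    STATIC_WEIGHT = {"www1.example.com": 2, "www2.example.com": 1}

    def pick_server(candidates):
        alive = [h for h in candidates if h in POLLED_LOAD]
        if alive:
            # Prefer the least-loaded server that answered the last poll;
            # hosts with no poll data are treated as down (poor man's failover).
            return min(alive, key=lambda h: POLLED_LOAD[h])
        # If polling failed everywhere, fall back to the static weighting.
        return max(candidates, key=lambda h: STATIC_WEIGHT.get(h, 0))

    print(pick_server(["www1.example.com", "www2.example.com"]))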
`
`20
`
`Sun uniprocessor 60 Mhz SPARCserver
`SGI 2-processor Challenge-L
`Indy 150 Mhz
`SGI
`SGI Challenge-S 175 Mhz
`HP 9000/8 16
`90 Mhz Pentium PC
`90 Mhz Pentium PC
`
`Solaris 2.3
`2.4
`IRIX 5.2
`IRIX 5.2
`
`5.3
`
`IffiX 5.3
`HP-UX 9.04
`Windows NT 3.5
`BSD/OS 2.0
`
`Figure
`
`Tested hardware/software
`
`configurations
`
`
Figures 2 and 3 summarize some of the performance statistics for the various servers on July 26, 1995, along with their assigned relative load. Our setup allows us to give each machine a fraction of the total traffic to our site, measured in sixteenths of the total.

Note that in addition to assuming 1/16 of the total WWW load, the 150 MHz Indy also wears the aliases www.netscape.com and home.netscape.com, and currently runs all CGI processes for the entire site. Total CGI load for this day accounted for 49887 of its 1548859 HTTP accesses (more on this later).
`
The Tools

The traffic and size of our web site and content grew very quickly, and we collected a number of tools to help us manage the machines and content. We used a number of existing programs, some of which we enhanced, and developed a few internally as well. We'll go over many of these here; the intent is to focus on tools that are directly useful in managing web servers and content. We will avoid discussing HTML authoring programs and utilities; those who are interested, see http://home.netscape.com/home/how-to-create-web-services.html for pointers to many such tools.

Document Control: CVS

Since our content comes from several different sources within the company, we chose to use CVS [6] to manage the document tree. This has worked moderately well, but is not an ideal solution for our environment.

In some ways the creation of content resembles a mid-to-large programming environment; a document revision system became necessary to govern the creation of our web site content as multiple contributing editors added and deleted material from the content tree. CVS provided a reasonably easy method to retrieve older source or detailed logs of changes made to HTML, dating back to the creation of the content tree.

One drawback of CVS is that many of the folks who design our content found it difficult to use and understand due to a lack of experience with UNIX. A cross-platform GUI-based tool would be especially well-suited to this market niche.
`
    Host type                     Load      Hits     Redirects  Server  Unique  Unique  KB
                                  fraction                      errors  URLs    hosts   transferred
    SGI 150 MHz R4400 Indy        1/16      1548859  68962      7253    4558    120712  12487021
    SGI 175 MHz R4400 Challenge   6/16      2306930  47154      23      2722    249791  11574007
    SGI 175 MHz R4400 Challenge   5/16      2059499  43441      43      2681    225055  10626111
    BSD/OS 90 MHz Pentium         4/16      1571804  31919      23      2351    192726  7936917

Figure 2: WWW Server activity for the period between 25/Jul/1995:23:58:04 and 26/Jul/1995:23:58:59
`
    Host type                     Load      Bytes/Hits  Bytes/Hits  Bytes/Hits  Bytes/Hits  Bytes/Hits
                                  fraction
    SGI 150 MHz R4400 Indy        1/16      430600/82   345132/61   227618/61   224849/60   236678/59
    SGI 175 MHz R4400 Challenge   6/16      613621/128  646112/119  656412/110  545699/108  520256/107
    SGI 175 MHz R4400 Challenge   5/16      466244/93   430870/88   358186/84   375964/84   531421/82
    BSD/OS 90 MHz Pentium         4/16      375696/81   256143/77   417655/76   394878/72   298561/70

Figure 3: Five busiest minutes for the tested hosts
`
`96
`
`1995 LISA IX September 17-22 1995 Monterey CA
`
`Petitioner IBM – Ex. 1073, p. 3
`
`
`
`Mosedale Foss
`
`McCool
`
`Administering Very High Volnme Internet Services
`
Content Push

Once we had multiple machines serving our WWW content, it became necessary to come up with a reasonable mechanism for getting copies of our master content tree to all of the servers outside our firewall. (NCSA distributes their documents among server machines by keeping their content tree on the AFS distributed filesystem [3].)

It seemed to us that another natural solution to this problem was rdist, a program specifically designed for keeping trees of files in sync with a master copy. However, we felt we couldn't use this unmodified, as its security depended entirely on a .rhosts file, which is a notoriously thin layer of protection. With the help of some other developers, we worked on incorporating SSL [8] into rdist in order to provide for encryption as well as better authentication of both ends. With SSL we no longer need to rely on a client's IP address for its identity; instead, cryptographic certificates provide that authentication.

In the development of our SSLified rdist, we decided that it would be a good idea to use the latest rdist from USC, in part because it has the option of using rsh for its transport rather than rcmd(). Because it doesn't use rcmd(), it no longer needs to be setuid root, which is a real security win. One side effect of this is that we now have an SSLified version of rsh, which we use to copy log files from our servers back to our internal nets.
`
Monitoring

During the course of our server growth, we wrote and/or borrowed a number of tools for monitoring our web servers. These include a couple of tools to check response time, a log analyzer, and a program to page us if one of the servers goes down.

The tool to check response time is designed to be run from a machine external to the server being monitored. Every so often it wakes up and sends a request for a typical document, such as the home page, to an HTTP server. It measures the amount of time that it took from start to finish; that is, from before it calls connect() to after it gets the zero-length read indicating that the server has closed the connection. If you choose a relatively small document, this time can give you a good general indication of how long people are unnecessarily waiting for documents, since under ideal conditions a small document should come back nearly instantaneously. In our typical monitoring setup we run monitor programs from remote, well-connected sites as well as from locally-networked machines. This allows us to see when problems are a result of network congestion, as opposed to lossage on the server machines.
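A minimal sketch of such a checker, in modern Python and with a placeholder host name (the tool described above is not reproduced here):

    import socket
    import time

    def time_fetch(host, port=80, path="/"):
        # From just before connect() until the zero-length read that tells
        # us the server has closed the connection.
        start = time.time()
        s = socket.create_connection((host, port), timeout=60)
        request = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)
        s.sendall(request.encode("ascii"))
        while s.recv(4096):
            pass
        s.close()
        return time.time() - start

    # Run every few minutes against a small, typical document:
    print("home page fetch took %.2f seconds" % time_fetch("www.example.com"))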
`
The logfile analyzer, which is now a standard part of the Netscape Communications and Commerce server products, provides information about the busiest hours or minutes of the day and about how much data transfer client document caching has saved our site. The analyzer can be very helpful in determining which hours are peak hours and will require the most attention. Because of the high volume of traffic at our site, we designed it to process large log files quickly.
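The analyzer itself ships with the Netscape servers; purely as a toy illustration of the busiest-minutes computation over a common-log-format file (the file name is an assumption), something like the following works:

    from collections import Counter

    minutes = Counter()
    with open("access_log") as log:                  # common log format
        for line in log:
            try:
                # Timestamp looks like [25/Jul/1995:23:58:04 -0700];
                # strip the seconds so hits are bucketed by minute.
                stamp = line.split("[", 1)[1].split()[0]
                minutes[stamp.rsplit(":", 1)[0]] += 1
            except IndexError:
                continue                             # skip malformed lines

    for minute, hits in minutes.most_common(5):      # the five busiest minutes
        print(minute, hits)
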
The program to page us when a server becomes unreachable is similar to our response time program. The difference is that when it finds that a server does not respond within a reasonable time frame for three consecutive tries, it sends an e-mail message to us, along with a message to our alphanumeric-pager gateway, to make sure we know that a server needs attention.
`attention
`
At times when we knew that we wouldn't be able to come in and reboot the server, we used a UNIX box with an RS232-controlled on/off switch to automatically hard boot any system not responding to three sequential GET requests. A small PC with two serial ports is enough to individually monitor 10 systems and provides recovery for most non-fatal system errors (e.g., most problems other than hardware failure or logfile partitions filling up).
Performance

Previous works [10, 11] have explored HTTP performance and have come to the conclusion that HTTP in its current form and TCP are particularly ill-suited to one another when it comes to performance. The authors of some of these articles have suggested a number of ways to improve the situation via protocol changes. For the present, however, we are more interested in making do with what we have.

More practically speaking, most TCP stacks have never been abused in quite this way before, so it's not too surprising that they don't deal well with this level of load. The standard UNIX model of forking a new server each time a connection opens doesn't scale particularly well either; the Netscape Server uses a process-pool model for just this reason.
`
Kernels

The problems that took the most time for us to solve involved the UNIX kernel. Not having sources for most platforms makes it something of a black box. We hope that sharing some hard-won insights in this area will prove especially useful to the reader.
`TCP Tuning
`There are several kernel parameters which one
`can tweak
`that will often improve the performance
`web or FTP sewer significantly
`of
`The size of the listen queue corresponds to the
`maximum number of connections pending in the ker
`connection is considered pending when it has
`nel
`or when it
`has been
`not been fully
`established
`
`
If the queue size is too small, clients will sometimes see "connection refused" or "connection timed out" messages. If it is too big, results are sporadic: some machines seem to function nicely while others of similar or identical configuration become hopelessly bogged down. You will need to experiment to find the right size for your listen queue. Version 1.1 of the Netscape Communications and Commerce Servers will never request a listen queue larger than 128.
`
In kernels that have BSD-based TCP stacks, the size of the listen queue is controlled by the SOMAXCONN parameter. Historically this has been a #define in the kernel, so if you don't have access to your OS source code, you will probably need to get a vendor patch which will allow you to tune it. In Solaris, this parameter is called tcp_conn_req_max and can be read and written using ndd(1M) on /dev/tcp. Sun has chosen to limit the size to which one can raise tcp_conn_req_max using ndd to 32; contact Sun to find out how to raise this limit further.
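On the application side, the backlog argument to listen() is how a server requests this queue depth; a minimal sketch in Python (the port and backlog here are arbitrary):

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("", 8080))
    # Ask for a queue of 128 pending connections, the same ceiling the
    # 1.1 Netscape servers request.  The kernel silently clamps this to
    # its own SOMAXCONN (or tcp_conn_req_max) limit.
    s.listen(128)
    conn, addr = s.accept()   # clients beyond the queue see refused/timed out
    conn.close()
    s.close()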
`
Additionally, the kernel on a server machine needs to have enough memory to buffer all the data that it is sending out. In variants of UNIX that use a BSD-based TCP stack, these buffers are called mbufs. The default number of mbufs in most kernels is way too small for TCP traffic of this nature; reconfiguration is usually required. We have found that trial and error is required to find the right number: if netstat -m shows that requests for memory are being denied, you probably need more mbufs. Under IRIX the parameter you will need to raise is called nm_clusters, and it lives in /var/sysgen/master.d/bsd.
TCP employs a mechanism called keepalive that is designed to make sure that when one host of a TCP connection loses contact with its peer host, and either host is waiting for data from its peer, the waiting system does not wait indefinitely for data to arrive. Under the sockets interface, if the socket the system is waiting for is configured to have the SO_KEEPALIVE option turned on, the system will send a keepalive packet to the remote system after it has been waiting for a certain period of time. It will continue sending a packet periodically, and will give up and close the connection if the system does not respond after a certain number of tries.
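In socket terms the option is set per connection; a minimal sketch in Python (the per-socket TCP_KEEPIDLE knob shown is a later, Linux-style extension and an assumption here; the systems discussed below change the interval with kernel-wide tunables):

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel to probe this connection once it has been idle a while.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Some systems also expose the idle time before the first probe as a
    # per-socket option; elsewhere it is a kernel-wide tunable that often
    # defaults to two hours.
    if hasattr(socket, "TCP_KEEPIDLE"):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 15 * 60)
    s.connect(("www.example.com", 80))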
`
Many systems provide a mechanism for changing the interval between TCP keepalive probes. Typically, the period of time before a system will send a keepalive packet is measured in hours. This is to make sure that the system does not send large numbers of keepalive packets to hosts which, for example, have idle telnet sessions that simply don't have data to send for long periods of time.
`
With a web server, an hour is an awfully long time: if a browser does not send information the server is waiting for within a few minutes, it is likely that the remote machine has become unreachable. In the past, router failures were the typical cause of hosts becoming unreachable. In today's Internet that problem still exists, while at the same time an increasingly large number of users are using a modem with SLIP or PPP as their connection to the Internet. Our experience has shown that these types of connections are unstable and cause the most situations where a host suddenly becomes silent and unreachable. Most HTTP servers have a timeout built in, so that if they have waited for data from a client for a few minutes, they will forcibly close that connection. The situations where a server is not actively waiting for data are the ones most important for keepalive.
If netstat -an shows many idle sockets in the kernel, or idle HTTP servers waiting for them, you should first check whether your server software sets the SO_KEEPALIVE socket option. (The Netscape Communications and Commerce servers do so.) The second thing you should check is whether your system allows you to change the interval between keepalive probes. Many systems, such as IRIX and Solaris, provide mechanisms for changing the keepalive interval to minutes instead of hours. Most systems we've encountered have a default of two hours; we typically truncate it to 15 minutes. If your system is not a dedicated web server system, you should consider keeping the value relatively high so idle telnet sessions don't cause unnecessary network traffic. The third thing you should check with your vendor is whether their TCP implementation allows sockets to time out during the final stages of a TCP close. Certain versions of the BSD TCP code, on which many of today's systems are based, do not use keepalive timeouts during close. This means that in certain situations a connection to a system that becomes unreachable before it has fully acknowledged the close can stay in your machine's kernel indefinitely. If you see this situation, contact your vendor for a patch.
Your Vendor Is Your Friend, or The Value of a Patch

In almost every case we have gotten quite a bit of value from working directly with the vendor. Since very high volume TCP service of the nature we describe is a fairly new phenomenon, OS vendors are only beginning to adapt their kernels for this. There are patches to allow the system administrator to increase the listen queue size for IRIX 5.2 and 5.3, as well as enabling the TCP keepalive timer while a socket is in closing states. These patches also fix other problems, including a few related to multiprocessor correctness and performance. Contact SGI for the current patch numbers (if you are using a WebFORCE system, you should already have them). With these patches, most parameters an administrator will want to edit are in /var/sysgen/master.d/bsd.
`
`98
`
`1995 LISA IX September 17-22 1995 Monterey CA
`
`Petitioner IBM – Ex. 1073, p. 5
`
`
`
`Mosedale Foss
`
`McCool
`
`Administering Very High Volume Internet Services
`
If you are using Solaris 2.3, you will definitely want to get the most recent release of the kernel jumbo patch (number 101318). In general, we've found Solaris 2.4 able to handle much more traffic than even a 2.3 system with scalability patches installed. If upgrading to 2.4 is an option, we highly recommend it, especially when your traffic starts to reach the range of multiple hundreds of thousands of hits per day. The 2.4 jumbo patch (number 101945) is also recommended, both for security as well as stability reasons.
`
Logging

Generally, the less information that you need to log, the better performance you will get. We found that by turning off logging entirely we typically realized a performance gain of about 20%. In the future, many servers will offer alternative log file formats to the current common log format, which will provide better performance as well as a record of only the information most important to the site administrator.

Many servers offer the ability to perform reverse DNS lookups on the IP addresses of the clients that access your server. While it is very useful information, having your server do it at run-time tends to be a performance problem. Since many DNS lookups either time out or are extremely slow, the server then generates extra traffic on the local network, and often devotes some non-trivial amount of networking resources to waiting for DNS response packets.

For high-volume logging, syslogd also causes performance problems; we suggest avoiding it. If one is logging 10 connections per second, and each connection causes two pieces of data to be logged (as for us), this could mean up to 20 context-switches into syslogd and 20 out of it per second. This overhead is in addition to any processing related to logging-related I/O and all actual content service.

On our site the logs are rotated once every 24 hours and compressed into a staging directory. A separate UNIX machine inside of our firewall uses an SSLified rsh to bring the individual logs to a gigabyte partition, where the logs are uncompressed, concatenated, and piped to our analysis software. Reverse DNS lookups are done at this point, rather than at run time on the server, which allows us to do only one reverse lookup per IP address that connected during that day. Processing and lookups on the logs from all of the machines on our site takes approximately an hour to complete.
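A rough sketch of that post-processing pass in modern Python (file names are placeholders): each distinct client address is resolved exactly once and cached, rather than once per hit or at serve time.

    import socket

    cache = {}                                    # one lookup per unique address

    def hostname(ip):
        if ip not in cache:
            try:
                cache[ip] = socket.gethostbyaddr(ip)[0]
            except (socket.herror, socket.gaierror):
                cache[ip] = ip                    # leave unresolvable addresses alone
        return cache[ip]

    with open("access_log") as raw, open("access_log.resolved", "w") as out:
        for line in raw:
            parts = line.split(" ", 1)            # CLF: client address is field one
            if len(parts) == 2:
                line = hostname(parts[0]) + " " + parts[1]
            out.write(line)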
`
A single compressed log file is approximately 70 megabytes, and consequently we end up with over 250 MB of log files daily. Our method of log manipulation allows for an automated system of backing up and processing a month's worth of data with little or no human intervention. Tape backups are generated onto 8mm tape once monthly.

Overall analysis of the log files shows consistent data supporting the following:

- Peak loads occur between 12 and [...] o'clock PM PST.
- Peak connection rates were between 120-140 connections per second per machine.
- A second peak of roughly half the amplitude occurs between [...] and [...] o'clock PM PST.
- Wednesday is the highest load day of the week, generating more than both weekend days combined.
`
Equipment

Networks

We found that one UNIX machine doing high-volume content service was about all that an Ethernet could handle. Putting two such machines on a single Ethernet caused the performance of both machines to degrade badly as the net became saturated with traffic and collisions. More analysis of our network data is still needed. We are finding it to be more cost effective to have one Ethernet per host than to purchase FDDI equipment for all of them.

As an aside, we have found SGI's addition of a -C switch to netstat in IRIX to be extremely useful. It displays the data collected by netstat in a full-screen format which is dynamically updated.
`dynamically
Memory

This is fairly simple: get lots of it. You want to have enough memory both for buffering network data and for your filesystem cache, to keep most of the frequently accessed files that it serves in memory. The filesystem read cache hit-rate percentage on our web servers is almost always 90% or above. Most modern UNIXes automatically pick a reasonable size for the buffer cache, but some may require manual tuning. Many OS vendors include useful tools for monitoring your cache hit rates: System V derivatives have sar; we also found HP/UX's monitor(1M) and IRIX's osview(1) helpful.
Our Typical Configuration

A typical webserver at our site is a workstation-class machine (e.g., Sun SPARC 20, SGI Indy, or Pentium P90) running between 128 and 150 processes. For UNIX machines at least, we have found 128 megabytes of memory to be about the most that our machines can use. With this much memory we have all the network buffer space we need, we get a high filesystem read-cache hit rate, and usually have a few (or even tens of) megabytes to spare, depending on the UNIX version.
`
Generally, one gigabyte or more of disk is necessary; these days each of our servers generates over 200 megs of log data per day before compression, and the amount of HTML content we are housing continues to grow. Data from sar and kernel profiling code suggest that our boxes are spending between 5% and 20% of their time waiting for disk I/O. Given the high read-cache hit-rate, we expect that by moving our log files onto a separate fast/wide SCSI drive and experimenting with filesystem parameters, this percentage will decrease fairly significantly.
`
Miscellaneous Points

In this section we will discuss a few random things that we have learned during our tenure managing web servers.

A Bit About Security
`
In addition to the normal security concerns of sites on the Internet [12], web servers have some unique security opportunities. One of the most notable is CGI programs [13]. These allow the author to add all sorts of interesting functionality to a web site. Unfortunately, they can also be a real security problem, since they generally take data entered by a web user as their input; they need to be very careful about what such data is used for.
If a CGI script takes an email address and hands it off to sendmail on the command line, the script needs to go through and make sure that no unescaped shell characters are given to the shell that might cause it to do something unexpected. Such unexpected interplay between different programs is a common cause of security violations. Since many users who want to provide programmatic functionality on their web pages are not intimately familiar with the ins and outs of UNIX security, one approach to this problem is to simply forbid CGI programs in users' personal web pages.
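Purely as an illustration of that rule (this is not the site's filter, and the sendmail path and address pattern are assumptions), a modern Python CGI might validate the address and hand it to sendmail as an argument vector, so that no shell ever sees the user's bytes:

    import re
    import subprocess

    def mail_user(address, body):
        # Reject anything that does not look like a plain, shell-safe address.
        # This pattern is illustrative, not a full RFC 822 check.
        if not re.match(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+$", address):
            raise ValueError("suspicious address: %r" % address)
        # Passing an argument vector (no shell) means characters such as
        # ;, |, and ` are never interpreted by a shell at all.
        subprocess.run(["/usr/lib/sendmail", "-oi", address],
                       input=body.encode("ascii", "replace"), check=True)
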
However, we suggest an alternative: mandate use of taintperl [14] for CGI programs written by users. Perl is already one of the predominant scripting languages used to write CGI programs; it is extremely powerful for manipulating data of all sorts and producing HTML output. taintperl is a version of perl which keeps track of the source of the data it uses. Any data input by the user is considered by the taintperl interpreter to be tainted and can't be used for dangerous operations unless explicitly untainted. This means that such scripts can be reasonably easily audited by a security officer, by grepping for the untaint command and carefully analyzing the variables on which it is used.
Web Server Heterogeneity

An interesting feature of our site is that it is an ideal testbed for ports of our server to new platforms. Since it was clear to us from the beginning that we would be using it this way, we needed to think about how we would deal with CGI programs and their environment. After some thought we decided that it just wasn't practical to try to port and test all of our CGI content (and force our users to do the same for their home pages) to every new platform that we wanted to test. It turned out that we were very fortunate to have considered this early, as we eventually ended up testing our Windows NT port on our web site.
We designated a single alias to the machine that would run all of our CGI programs. We then created the guideline that all HTML pages should simply point all CGI script references to this alias. A nice side effect is that this type of content is partitioned to its own machine, which can be specifically tailored to CGI service. If the machine pool contains many incompatible machines, this setup avoids having to maintain binaries for CGI programs compiled for each machine. An average day at our site shows 50000 out of 7.5 million requests for CGI scripts; note that this ratio is almost certainly very dependent on the type of content served.
An additional experiment would be to partition other types of content (e.g., graphics) to their own machines in the same way. If necessary, DNS-based load-balancing could be used to spread out load across multiple machines of the same type (e.g., gif.netscape.com could be used to refer to multiple machines whose only purpose in life is to serve GIF files).
FTP vs HTTP

Both FTP and HTTP offer an easy way to handle file transfer, each with relative strengths.

FTP provides a busy signal; that is, feedback to the user indicating that the site is currently processing too many transactions. That limit is easily set by the site administrator. HTTP provides a mechanism for this as well; however, it is not implemented by many HTTP servers, primarily due to the fact that it can be very confusing for users. When a user connects to an FTP site, they are allowed to transfer every document they need in that session. Due to HTTP's stateless nature and the fact that it uses a new connection for each file transfer, a user can easily get an HTML document and then be refused service when asking for the inlined images. This makes for a very confusing user experience. It is hoped that future work in HTTP development will help to alleviate this problem. Further work in the URL or URN arenas will hopefully provide more formal mechanisms for defining alternative distribution machines.

In a system planning sense, FTP should be considered to be a separate service and therefore can be cleanly served from a completely different computer. This offers easier log analysis of file delivery vs. html served, and also aids in security efforts: a system running only one service, correctly configured, is less likely to be breached, and if breached does not mean a loss of security for our entire site.
`
`100
`
`1995 LISA IX September 17-22 1995 Monterey CA
`
`Petitioner IBM – Ex. 1073, p. 7
`
`
`
`Mosedale Foss
`
`McCool
`
`Administering Very High Volume Internet Services
`
Performance gains are also likely. Content served by FTP is typically composed of large files, compared to HTTP-served data, which is typically small and designed to be quickly accessible by modem users (a document and its inlined images are short enough to be delivered in short periods). Mixing the two different types can make it hard to pin down system bottlenecks, especially if FTP and HTTP are being served from a single machine. Many times the two services will compete for the same resources, making it hard to track down problems in both areas.

While running both FTP and HTTP servers on one machine, we found that 128 HTTP daemon processes and an imposed limit of 50 simultaneous FTP connections was about all a workstation-class system would tolerate. Further growth beyond that would cause each service to be periodically denied network resources. Once separated, however, 250 simultaneous FTP connections on a workstation-class machine was handled easily.

HTTP service, on the other hand, offers a method to gather useful information from the requestor (via forms) before allowing file transfer to take place. This became a necessity at Netscape, as the encryption technology in our software required a certain amount of legal documentation to be agreed to prior to download.

Conclusion and Future Directions

Although sites such as ours are currently the exception, we expect that they will soon become the rule as the Internet continues its exceedingly rapid growth. Additionally, we expect content to become vastly more dynamic in the future, both on the front end, using mechanisms such as server push [14] and Java [15], and the backend, where using SQL databases and search engines will become even more common. This promises to provide many new challenges, especially in the area of performance measurement and management.

We hope that the techniques and information in this paper will prove helpful to folks who wish to administer sites providing a very high volume of service.

Acknowledgements

[...] Brendan [...]. We also appreciate the input from the folks who took the time to read drafts of our paper.
`
About the Authors

The authors have been responsible for the design, implementation, and babysitting of the Netscape web site since its inception.

Dan Mosedale (dmose@netscape.com) is Lead UNIX Administrator at Netscape. He has been managing UNIX boxes for long enough to dislike them thoroughly. Dan likes playing around with weird Internet stuff and wrote a FAQ list about getting connected to the MBONE.

William Foss (bill@netscape.com) is Webmaster at Netscape. His current focus is on how to make the site scale even further in an economical fashion. Prior to Netscape, he had the enviable job of playing with large scale UNIX systems and working for Jim Clark at Silicon Graphics.

Rob McCool (robm@netscape.com) is a member of technical staff at Netscape. He designed and implemented the Netsite Communications and Commerce servers. Prior to Netscape, he designed, implemented, documented, tested, and supported NCSA httpd from its inception through version 1.3.

Bibliography

[1] http://home.netscape.com/comprod/mirror/index.html
[2] http://home.netscape.com
[3] Kwan, Reed, McGrath, "User Access Patterns to NCSA's World Wide Web Server," http://www-pablo.cs.uiuc.edu/Papers/WWW.ps.Z
[4] Brisco, "DNS Support for Load Balancing," RFC 1794, USC/Information Sciences Institute, April 1995, ftp://ds.internic.net/rfc/rfc1794.txt
[5] Schemers, Roland, "lbnamed: A Load Balancing Name Server in Perl," LISA IX Conference Proceedings, http://www-leland.stanford.edu/schemers/docs/lbnamed/lbnamed.html
[6] ftp://prep.ai.mit.edu/pub/gnu/cvs-1.5.tar.gz
[7] Cooper, "Overhauling Rdist for the '90s," LISA VI Conference Proceedings, pp. 175-188, ftp://usc.edu/pub/rdist