Fragmentation Considered Harmful
`
`Christopher A. Kent
Jeffrey C. Mogul
`
`Digital Equipment Corporation
`Western Research Lab
`
(Originally published in Proc. SIGCOMM '87, Vol. 17, No. 5, October 1987)
`
`Abstract
`
`Internetworks can be built from many different kinds of networks, with varying limits on
`maximum packet size. Throughput is usually maximized when the largest possible packet is
`sent; unfortunately, some routes can carry only very small packets. The IP protocol allows a
`gateway to fragment a packet if it is too large to be transmitted. Fragmentation is at best a
`necessary evil; it can lead to poor performance or complete communication failure. There are a
variety of ways to reduce the likelihood of fragmentation; some can be incorporated into
existing IP implementations without changes in protocol specifications. Others require new
protocols, or modifications to existing protocols.
`
`1. Introduction
`
Internetworks built of heterogeneous networks are valuable because they insulate
higher-level protocols from changes in network technology, because they allow universal
communication without the expense of constructing a homogeneous universal infrastructure,
and because they allow the use of different network technologies as appropriate to both
local-area and long-haul links. Most datagram networks set a maximum limit on the size of
packets they carry, to simplify packet buffering in the nodes and to limit how long one
packet can tie up the link. In a heterogeneous internet such as the DARPA IP Internet, these
packet-size limits, known as MTUs (for maximum transmission unit), vary widely, from 254
bytes for Packet Radio networks to 2000 bytes for the Wideband Satellite Network [22];
since nobody knows exactly what is connected to the Internet, the range in MTUs may be even
broader.
`
In general, it is better to use a few large packets instead of many small packets to carry
a given amount of data, because much of the cost of packetized communication is per-packet
rather than per-byte. On a high-speed LAN, throughput can increase almost linearly with
packet size over a wide range of sizes. Therefore, we prefer to make our packets as large
as possible.
`
This desire for large packets conflicts with the variation in MTUs across an internet. We
want to send large packets, but some network along the packets' path may not be able to
carry them. One approach to this dilemma is fragmentation: when a node must transmit a
packet that is larger than the MTU of the network, it breaks the packet into several
smaller fragments and sends them instead. If the fragments are all sent along the same data
link and are immediately reassembled at the next node, this is called transparent or
intra-network fragmentation. If the fragments are allowed to follow independent routes, and
are reassembled only upon reaching their ultimate destination, this is called inter-network
fragmentation. A good discussion of both methods, in more detail, may be found in
Shoch [23].
`
In this paper, drawing on experience with a large heterogeneous internetwork, we examine
fragmentation in the context of the IP protocol [18]. IP supports the use of inter-network
fragmentation. (Transparent fragmentation may also be used as long as it is invisible to
the IP layer.) Fragmentation appears at first to be an elegant solution to the problem, but
subtle complications arise in real networks that can result in poor performance or even
total communication failure.
`
Experience with inter-network fragmentation in the IP Internet has convinced us that it is
something to avoid. In section 2 we compare the advantages and disadvantages of
fragmentation, in order to justify this assertion. We then discuss, in section 3, a variety
of schemes for avoiding or recovering from fragmentation.
`
`ACM SIGCOMM
`
`-75-
`
`Computer Communication Review
`
`INTEL EX. 1420.001
`
`2. What is wrong with fragmentation?
`
The arguments in favor of fragmentation are straightforward. Fragmentation allows higher
level protocols to be unconcerned with the characteristics of the transmission channel, and
to send data in conveniently sized pieces. Sending larger quantities of data in each IP
datagram minimizes the bookkeeping overhead associated with managing the data. (See section
3.5 for a specific example.)
`
Fragmentation allows the source host to deal with routes having different MTUs without
having to know what path its packets are taking. The safest strategy is for the source to
send very small datagrams, at a great loss of efficiency. Fragmentation allows the source
to choose a size that is “reasonable” and, when that size proves to be too large, provides
a mechanism that allows data to continue to get through.
`
Finally, fragmentation allows protocols to optimize performance for high bandwidth
connections. Emerging network technologies have larger and larger MTUs. Most local networks
have MTUs large enough to send 1024 bytes of user data plus associated overhead in a single
packet; new technologies will allow ten times that. Fragmentation provides a mechanism for
deciding the actual packet size as late as possible. It especially allows protocols to
avoid choosing to send small datagrams until absolutely necessary. Protocols can choose
large segment sizes to take advantage of the large MTU in a local network, and rely on
fragmentation at gateways to send the segments through networks with small MTUs when
needed. If datagrams must traverse a route consisting of several high-MTU links followed by
a low-MTU link, by delaying the use of small packets until the low-MTU link is reached,
fragmentation allows the use of large packets on the initial high-MTU links, and thus uses
those links more efficiently.
`
The arguments against fragmentation fall into three categories:

• Fragmentation causes inefficient use of resources: Poor choice of fragment sizes can
  greatly increase the cost of delivering a datagram. Additional bandwidth is used for the
  additional header information, intermediate gateways must expend computational resources
  to make additional routing decisions, and the receiving host must reassemble the
  fragments.

• Loss of fragments leads to degraded performance: Reassembly of IP fragments is not very
  robust. Loss of a single fragment requires the higher level protocol to retransmit all of
  the data in the original datagram, even if most of the fragments were received correctly.

• Efficient reassembly is hard: Given the likelihood of lost fragments and the information
  present in the IP header, there are many situations in which the reassembly process,
  though straightforward, yields lower than desired performance.
`
`2.1. An overview of fragmentation in IP
`
IP is a protocol providing unreliable delivery of datagrams. IP datagrams are encapsulated
in network-specific packets. Gateways may fragment an incoming packet if it will not fit in
a single outgoing packet; in this case, each fragment is sent as a separate packet. The IP
header contains several fields that are used to manage fragmentation [18]:

• Identification: A 16-bit field assigned by the sender to aid in assembling the fragments
  of a datagram. The tuple (source, destination, protocol, identification) for a given
  datagram must be unique over all existing datagrams. When a packet is fragmented, the
  value of the Identification field of the original packet is copied into each fragment.

• Time to live (TTL): An 8-bit field that specifies the maximum time, measured in seconds,
  that the packet may remain in the Internet system. If TTL contains the value zero, the
  packet must be discarded. The TTL must be decreased by at least one every time the packet
  passes through a gateway, even if the time required to process the packet is less than a
  second. Thus, the TTL field is an upper bound on packet lifetime.

• Fragment offset: A 13-bit field that identifies the fragment location, relative to the
  beginning of the original, unfragmented datagram. Fragment offsets are in units of 8
  bytes.

• More fragments: A 1-bit field that indicates whether or not this is the last fragment of
  the datagram.
`
The reassembly process consists of matching the protocol and identification fields of
incoming fragments with those of fragments already held, and coalescing the data into
complete datagrams. Fragments must be discarded if their TTL expires while they are held
for reassembly. (For more details of the reassembly algorithm, see [5].)
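
The way the Fragment offset and More fragments fields cooperate can be sketched in a few
lines of Python. This is a minimal illustration, not the reassembly algorithm of [5]: the
names `Fragment`, `fragment` and `reassemble` are hypothetical, a 20-byte header without
options is assumed, and duplicate or overlapping fragments and TTL expiry are ignored.
Because every fragment except the last must carry a multiple of 8 data bytes, the sketch
may cut a datagram at a slightly different point than a simple MTU-sized split would.

```python
from dataclasses import dataclass
from typing import List, Optional

IP_HEADER_LEN = 20  # assumption: IP header without options


@dataclass
class Fragment:
    ident: int    # Identification, copied from the original datagram
    offset: int   # Fragment offset, in units of 8 bytes
    more: bool    # More-fragments bit (False only on the last fragment)
    data: bytes


def fragment(ident: int, payload: bytes, mtu: int) -> List[Fragment]:
    """Split a datagram payload so that each fragment plus header fits in mtu."""
    # Non-final fragments must carry a multiple of 8 bytes of data.
    chunk = (mtu - IP_HEADER_LEN) // 8 * 8
    if chunk <= 0:
        raise ValueError("MTU too small to carry any data")
    frags = []
    for start in range(0, len(payload), chunk):
        piece = payload[start:start + chunk]
        more = start + len(piece) < len(payload)
        frags.append(Fragment(ident, start // 8, more, piece))
    return frags


def reassemble(frags: List[Fragment]) -> Optional[bytes]:
    """Return the payload if every fragment is present, otherwise None."""
    expected = 0
    out = bytearray()
    last_seen = False
    for f in sorted(frags, key=lambda f: f.offset):
        if f.offset * 8 != expected:
            return None  # a hole: some earlier fragment is missing
        out += f.data
        expected += len(f.data)
        last_seen = not f.more
    return bytes(out) if last_seen else None
```

Note that the receiver learns the total length only when the fragment with the
More-fragments bit cleared arrives; until then it cannot tell how much data is outstanding.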
`
Higher level protocols such as TCP (Transmission Control Protocol) [19] use IP as a basis
to implement a reliable connection between two client processes. Portions of the data
stream known as segments are sent in individual IP datagrams, along with control
information used by the cooperating TCP processes to ensure reliable communication. In
particular, TCP uses a sequence number that covers individual bytes in the data stream, and
an acknowledgment mechanism that allows the receiving process to tell the sender “I have
correctly received all data up to and including sequence number n.”
`
2.2. Fragmentation causes inefficient resource usage
`
Consider the costs associated with sending a packet. Each time it passes through a gateway,
there is some constant computational overhead to make routing decisions, modify the packet
header, compute the new checksum, and move the packet between the appropriate incoming and
outgoing queues. In addition, a portion of the available bandwidth on the incoming and
outgoing interfaces is consumed. In many cases, the constant computational overhead
dominates the cost. Input and output may be overlapped using DMA devices; in a typical
uniprocessor gateway, there is no way to parallelize the computational overhead.
`
Fragmenting at an IP gateway, rather than having the host choose the appropriate segment
size to avoid fragmentation, may lead to suboptimal use of gateway resources and network
bandwidth. Consider a TCP process that tries to send 1024 data bytes across a route that
includes the ARPAnet, which has an MTU of 1006 bytes. The IP and TCP headers are at least
40 bytes long, leading to a total unfragmented IP datagram 1064 bytes in length. To cross
the ARPAnet, this will be broken into a 1006 byte fragment, followed by a 78 byte fragment.
These short fragments amortize the fixed overhead per ARPAnet packet over very few bytes of
data, and the total packet count is much higher than needed. If the sending TCP instead
chooses segments that fit in a 1006 byte ARPAnet packet, the total packet count is
minimized, and the total overhead is as low as possible.
`
For example, consider sending 10 Kbytes of data. Sending 1024-byte TCP segments generates
10 IP datagrams, each 1064 bytes long. Each datagram is fragmented into two ARPAnet
packets, one 1006 bytes long and the other 78 bytes, for a total of 20 packets. If the
originating TCP instead sends 966 byte segments (the largest that will fit in a single
ARPAnet packet), only 11 packets are sent.
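
The packet counts in this example can be checked with a short calculation. The sketch below
is illustrative only: the function names are hypothetical, “10 Kbytes” is taken to mean
10 × 1024 bytes, and the header sizes are the minimum values quoted in the text.

```python
import math

IP_HEADER = 20      # bytes, minimum IP header
HEADERS = 40        # bytes, IP + TCP headers (from the text)
ARPANET_MTU = 1006  # bytes (from the text)


def fragments_needed(datagram_len: int, mtu: int) -> int:
    """Number of packets one IP datagram becomes on a link with this MTU."""
    if datagram_len <= mtu:
        return 1  # fits unfragmented
    # Non-final fragments carry a multiple of 8 data bytes.
    chunk = (mtu - IP_HEADER) // 8 * 8
    return math.ceil((datagram_len - IP_HEADER) / chunk)


def packets_sent(total_data: int, segment_size: int, mtu: int = ARPANET_MTU) -> int:
    """Total packets crossing the link when total_data is sent in TCP segments."""
    count = 0
    remaining = total_data
    while remaining > 0:
        seg = min(segment_size, remaining)
        count += fragments_needed(seg + HEADERS, mtu)
        remaining -= seg
    return count


print(packets_sent(10 * 1024, 1024))  # segments that must be fragmented
print(packets_sent(10 * 1024, 966))   # segments sized to fit the ARPAnet MTU
```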
`
Another limit to utilizing available bandwidth lies in the interaction of the TTL and
Identification fields. Assume that a reasonable initial value for the TTL field is 32 (the
maximum hop count from edge to edge of the DARPA Internet is currently estimated to be
between 15 and 20). If we allow fragmentation, we must ensure that all datagrams in flight
have unique values for the Identification field. Since a value cannot safely be reused
until the 32-second lifetime of the datagram that carried it has passed, the maximum
datagram rate is 2^16/32, or 2048 datagrams per second. Current gateways can forward nearly
1000 packets per second; high performance workstations and interfaces can generate packets
much more rapidly, and can probably forward 4000 packets per second. We are certainly
within five years of having commonly available processor and network technology that pushes
against the limit imposed by the 16-bit Identification field.
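
The arithmetic behind this limit is small enough to state as code. The sketch assumes, as
the text does, that the TTL is an upper bound (in seconds) on how long an Identification
value stays in use; the function name is hypothetical.

```python
def max_datagram_rate(ttl_seconds: int, id_bits: int = 16) -> float:
    """Datagrams per second that can be sent without reusing an
    Identification value while an earlier datagram may still be alive."""
    return 2 ** id_bits / ttl_seconds


print(max_datagram_rate(32))  # initial TTL of 32, as assumed in the text
```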
`
This limit implies that, to increase bandwidth in the presence of fragmentation, hosts
should send larger datagrams, so as to carry more data per value of the Identification
field. This is a bad idea, because large datagrams lead to more fragments, and we shall
show that this increases the likelihood of a severe decrease in performance. If we simply
avoid fragmented datagrams, values of the Identification field need not be unique, and
there is no bandwidth limit imposed by its size.
`
`2.3. Poor performance when fragments are lost
`
When segments are sent that are large enough to require fragmentation, the loss of any
fragment requires the entire segment to be retransmitted. This can lead to poorer
performance than would have been achieved by originally sending segments that didn't
require fragmentation.
`
Gateways in the Internet must drop packets when congested. If the gateways are congested,
dropping fragments only makes the situation worse. Dropped fragments mean increased
retransmissions, which leads to more fragments. As the loss rate goes up due to heavy
congestion, the total throughput drops dramatically, since the loss of any one fragment
means that the resources expended in sending the other fragments of that datagram are
entirely wasted.
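
To get a feel for how quickly this effect compounds, consider a simplified model in which
each fragment is lost independently with the same probability. That independence is an
assumption made only for this sketch (under congestion, losses are likely to be
correlated), and the function names are hypothetical.

```python
def delivery_probability(fragment_loss: float, fragments: int) -> float:
    """Probability that all fragments of one datagram arrive,
    assuming independent loss with probability fragment_loss each."""
    return (1.0 - fragment_loss) ** fragments


def expected_transmissions(fragment_loss: float, fragments: int) -> float:
    """Expected number of times the whole datagram must be sent
    before one copy arrives complete (geometric distribution)."""
    return 1.0 / delivery_probability(fragment_loss, fragments)


for n in (1, 2, 8):
    print(n, delivery_probability(0.10, n), expected_transmissions(0.10, n))
```

Under this model a datagram split into more fragments is less likely to arrive complete,
and every failed attempt wastes the bandwidth spent on the fragments that did arrive.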
`
Even when congestion is not the problem, retransmission does not necessarily increase the
likelihood that all the fragments that make up the segment will arrive unscathed. In
particular, network idiosyncrasies may conspire to cause the same fragment or fragments to
be lost on successive retransmission. We call this deterministic fragment loss.
`
An example of deterministic fragment loss occurs in the 4.2BSD Unix implementation of TCP
when datagrams pass between a local network (typically an Ethernet or a Proteon ring, with
MTUs of 1500 or 2046 bytes, respectively) and the ARPAnet. The TCP prefers to send 1024
byte data segments, which are transmitted in 1064 byte IP datagrams. As seen earlier, this
results in two fragments, 1006 and 78 bytes long.
`
The receiving gateway receives both fragments and sends them out over the local Proteon
ring. The Proteon ring interface does not have sufficient buffering to receive back-to-back
packets, so it consistently drops the second fragment. The sending TCP times out, and
retransmits the 1024 byte segment, which will again be fragmented. The second fragment is
again lost, the segment times out, and eventually the connection is broken.
`
In addition, many of the gateways in the Internet today are derived from 4.2BSD Unix. This
implementation of IP does not properly fragment a previously fragmented packet, preventing
some fragments from ever reaching their destination, which might better be called
guaranteed fragment loss.
`
`2.4. Efficient reassembly is difficult
`
Reassembling fragments into datagrams at the IP layer is considerably less robust than
constructing a reliable stream at the TCP layer. The window mechanism in TCP allows the
reassembly process to accurately gauge how much buffer space to allocate for the current
stream of unacknowledged data bytes. Also, because in TCP the data stream is covered by a
sequence number for each data byte, once a contiguous sequence of bytes at the beginning of
the outstanding data stream has been reassembled, it can be acknowledged and handed up to
the next layer. Thus, progress can always be made, even if in small amounts.
`
At the IP layer, there is no indication in the header of a fragmented packet of how many
other fragments follow, or of the length of the entire datagram. The More Fragments bit
tells only if this is the last fragment of the datagram, and the Fragment Offset field
tells only the position of this fragment in the complete datagram. If the total size of the
incoming datagram is too large to fit available buffer space, no progress can be made. The
IP specification requires hosts to be able to reassemble datagrams at least 576 bytes in
length; larger segment sizes must be explicitly negotiated by higher level protocols.
`
Even if there is sufficient buffer space to reassemble a very large datagram, conflicts can
occur. In the Internet, it is possible for fragments of the same datagram to take different
routes to their ultimate destination. Depending on queue management strategies at gateways
along the way, a fragment of a small datagram may arrive intermixed with the fragments of a
large datagram. More concretely, assume two datagrams, L (large) and S (small), are
fragmented as L1 L2 L3 L4 L5 L6 L7 L8 and S1 S2. If there are only eight buffers available,
and the reception order is L1 L2 L3 L4 L5 L6 L7 S1 L8 S2, reassembly of L cannot succeed,
despite adequate buffer space. Upon reception of L8, the reassembly process could discard
L1 through L7, which would leave six free buffers and allow S to be reassembled when S2
arrives. Or, it could discard L8 (and subsequently S2), blocking reassembly of both L and
S; the buffers would be kept full until the fragments expire. In either case, the work done
to transport all the fragments of L is entirely wasted. It is not possible to coalesce a
complete initial string of fragments and partially acknowledge receipt of the datagram in
order to free some of the buffer space. (Dave Mills first pointed out this behavior
in [13].)
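
The two outcomes in this scenario can be reproduced with a toy model of a fixed buffer
pool. The sketch is a simplification under stated assumptions: one buffer per fragment, no
timeouts, and a hypothetical `simulate` function whose `evict_when_full` flag selects
between the two discard policies described above.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def simulate(arrivals: List[Tuple[str, int]], sizes: Dict[str, int],
             buffers: int, evict_when_full: bool) -> Set[str]:
    """Return the names of the datagrams that were fully reassembled.

    arrivals: (datagram name, fragment number) pairs in reception order
    sizes:    number of fragments in each datagram
    """
    held = defaultdict(set)   # datagram name -> fragment numbers in buffers
    done = set()
    for name, idx in arrivals:
        used = sum(len(s) for s in held.values())
        if used >= buffers:
            if not evict_when_full:
                continue  # policy 2: drop the incoming fragment
            # policy 1: free the datagram that holds the most buffers
            victim = max(held, key=lambda n: len(held[n]))
            del held[victim]
        held[name].add(idx)
        if len(held[name]) == sizes[name]:
            done.add(name)    # complete: hand up and release its buffers
            del held[name]
    return done


order = [("L", i) for i in range(1, 8)] + [("S", 1), ("L", 8), ("S", 2)]
sizes = {"L": 8, "S": 2}
print(simulate(order, sizes, buffers=8, evict_when_full=True))
print(simulate(order, sizes, buffers=8, evict_when_full=False))
```

With eight buffers, the first policy completes only S and the second completes neither
datagram; in both cases every fragment of L was transported for nothing.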
`
It is difficult to decide how long to hold on to received fragments. The only firm limit is
the TTL field; the reassembly process must discard fragments as their TTLs expire. Since
each gateway decrements the TTL field, it must be set high enough to traverse the longest
possible route, and thus may still be quite high when the packet arrives at its
destination. Naive use of the received TTL as a reassembly timeout will cause some
fragments to occupy buffer space for a much longer time than necessary. Use of too short a
reassembly timeout will cause fragments to be dropped too quickly, leading to unnecessary
retransmissions.
`
Because IP is a datagram protocol, there is no guarantee that a given fragment will ever
arrive. A higher level protocol may retransmit a lost IP datagram. If a retransmitted
datagram does not have the same value for the IP Identification field, its data will not be
recognized as being the same as that in previously received fragments. The old fragments
will occupy buffer space until timed out or forced out by incoming packets, and cannot fill
holes left by fragments dropped from the second datagram. This suggests that higher level
protocols should attempt to use the same value for the IP Identification on both the
original and retransmitted data. (This idea was proposed by John Shriver [24].)
`
`3. Avoiding fragmentation
`
We believe that, in most circumstances, the potential disadvantages of fragmentation far
outweigh the expected advantages. Thus, hosts should avoid sending datagrams that are so
large that they will be fragmented. The length limit can be determined by a variety of
general approaches:
`
• Always send small datagrams: There is some datagram size that is small enough to fit
  without fragmentation on any network; we could simply send no datagrams larger than this
  limit.

• Guess minimum MTU of path: Use a heuristic to guess the minimum MTU along the path the
  datagram will follow.

• Discover actual minimum MTU of path: Use a protocol to determine the actual minimum MTU
  along the path the datagram will follow.

• Guess or discover MTU and backtrack if wrong: Since an estimate might be wrong, and a
  discovered MTU may change if a route changes, sometimes we may have to adjust the length
  limit. This requires both a mechanism for detecting errors, and a mechanism for
  correcting them.
`
Later in this section we will discuss more specific fragmentation avoidance schemes.
`
All these strategies assume that the route the datagrams will follow is independently
determined. If multiple routes are available between source and destination, one might
instead try to avoid fragmentation by using source-routing to avoid data links with small
MTUs. Suitable alternate routes seldom exist, however, and even when they do we see no
efficient way for an IP host to obtain enough information to choose a good source-route.
`
IP is a layered protocol architecture, and fragmentation avoidance must be done at the
right layer. It makes little sense to build redundant mechanisms into several layers if it
is possible to do it once. This implies that the right place for fragmentation avoidance is
the layer common to all IP communication, the IP datagram layer itself (and its partner,
the ICMP protocol). It would be a poor idea to put the entire fragmentation avoidance
mechanism in, say, the TCP layer, because both the mechanism and any additional protocol
would have to be duplicated in parallel layers, such as UDP [17], NETBLT [6], and VMTP [3],
and because it would be awkward for a TCP-based mechanism to share knowledge with other
layers and across connections.
`
This is not to say that layers above IP should be uninvolved in fragmentation avoidance.
Architectural layering does not mean that higher layers must be kept ignorant of
fragmentation issues. Optimal performance depends upon cooperation between layers; for
example, the TCP layer should not send huge segments if the IP layer knows that they will
be fragmented.
`
Most of the fragmentation-avoidance schemes we will propose depend on keeping some
knowledge about the minimum MTU (MINMTU) on the path a datagram will follow. A MINMTU value
could be associated with a specific destination network, a specific destination host, a
specific route (there may be several routes to one destination, with differing MINMTUs), or
a specific connection (since for different applications, we may want to choose between
optimizing for maximum bandwidth versus minimum delay, and thus might want to accept
different risks of fragmentation for different connections to the same host). The MINMTU
values could be kept in the IP routing database, or in a separate database, especially if
per-connection MINMTUs are wanted. To support per-connection MINMTUs, the IP layer must
obtain a connection identifier from connection-oriented higher layers.
`
It is our belief that a per-connection scheme (degenerating to a
per-route-to-specific-host scheme for connectionless protocols) is the most flexible one.
While it is true that by keeping per-destination-network information one might be able to
pool information about several hosts, this is not necessarily safe. Because many networks
are subnetted [15], because MTUs may vary among the subnets of a given network, and because
one cannot tell whether a remote network is subnetted or not, it is not true that knowing
the MINMTU for one host reliably gives you the MINMTU for all other hosts on the same
network.
`
Routes in a datagram network are not necessarily symmetric; the route a packet takes may
not be the reverse of the route taken by a packet traveling in the opposite direction.
Because of this, it is not safe for a host to assume that it can send a datagram as large
as the one it has received from its peer. An independent MINMTU determination must be made
for each direction, although the peer hosts may assist each other in doing so.
`
When the IP layer has determined the MINMTU for a connection or destination, it can make
this information available to higher layers, such as TCP, that are generating segments to
be sent as IP datagrams. Segment-generating layers should ask the IP layer for a MINMTU
before sending a segment; connection-based layers should either check periodically that the
MINMTU has not changed, or should be able to handle asynchronous notification of a change.
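
One way such an interface between the IP layer and higher layers might look is sketched
below. Everything here is hypothetical and chosen for illustration: the `MinMtuTable` class,
its method names, the use of a host name plus an optional connection identifier as the key,
and the fallback to a caller-supplied default when nothing is known.

```python
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Key = Tuple[str, Optional[Hashable]]  # (destination host, connection id or None)


class MinMtuTable:
    """Per-connection MINMTU store with a per-host fallback."""

    def __init__(self, default_mtu: int) -> None:
        self.default_mtu = default_mtu
        self._entries: Dict[Key, int] = {}
        self._listeners: Dict[Key, List[Callable[[int], None]]] = {}

    def lookup(self, host: str, connection: Optional[Hashable] = None) -> int:
        """Called by a segment-generating layer before it sends."""
        for key in ((host, connection), (host, None)):
            if key in self._entries:
                return self._entries[key]
        return self.default_mtu

    def subscribe(self, host: str, connection: Optional[Hashable],
                  callback: Callable[[int], None]) -> None:
        """Register for asynchronous notification of a change."""
        self._listeners.setdefault((host, connection), []).append(callback)

    def update(self, host: str, mtu: int,
               connection: Optional[Hashable] = None) -> None:
        """Called by the IP layer when it learns a new MINMTU."""
        key = (host, connection)
        changed = self._entries.get(key) != mtu
        self._entries[key] = mtu
        if changed:
            for callback in self._listeners.get(key, []):
                callback(mtu)
```

A connection-based layer could either call `lookup` periodically or register a callback
with `subscribe`; the sketch deliberately keeps no per-network entries, for the reasons
given above.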
`
`3.1. Fragmentation avoidance without protocol
`changes
`
In this section we describe several fragmentation avoidance schemes that can be implemented
without changing existing protocol specifications or creating new protocols. There are
obvious advantages to such approaches, since they can be taken immediately by individual
sites or vendors; further, we have sufficient experience with one of them to believe that
it works fairly well. On the other hand, none of these schemes can make use of exact
knowledge of MINMTUs, and so may not provide optimal performance.
`
`3.1.1. Always send tiny datagrams
`
If a host always sent datagrams no larger than the minimum MTU over the entire internet,
these datagrams would never be fragmented. In the IP Internet the limit is no higher than
254 bytes, and might be lower. Since almost all of the Internet supports larger MTUs, and
since performance depends so strongly on packet size, this approach can't provide
reasonable performance. It is worth invoking only as a temporary diagnostic measure: if
performance actually increases when the datagram size is decreased, this is a clear
indication that inappropriate fragmentation is taking place for larger datagrams.
`
Alternatively, one might assume that a 576-byte limit is small enough to avoid
fragmentation in virtually all cases (we hope that in the future, all new IP network links
would be capable of handling packets of this size). 576 bytes is set forth in the IP
specification [18] as the maximum size a host can send without explicit permission from the
receiving host, so it is reasonable as an arbitrary value.
`
`3.1.2. Send 576-byte datagrams if the route goes via
`a gateway
`
`The IP layer can determine if the route for a connection
`or destination goes via a gateway. If it does, then the
`size limit is set to 576 (our favorite arbitrary value);
`otherwise, any size up to the MTU of the data-link layer
`may be used.
`
This approach provides maximum performance for local connections, and reasonable assurance
that on most non-local connections, datagrams will not be fragmented. It is not perfect,
since:

1. It does not avoid fragmentation on every path.

2. It may unnecessarily limit packet size, especially on subnetted collections of
   high-speed LANs that all support large packets.

3. If proxy ARP is used [14] then the IP layer may be fooled into believing that a
   non-local path is local, and thus use large datagrams when they are not necessarily
   safe.
`
However, it is quite easy to implement and in general provides good performance. A variant
of this scheme, implemented in the TCP layer, has been used for several years at many sites
and is now incorporated in 4.3BSD Unix [12]. This is the method we recommend in the absence
of protocol changes.
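
The rule of this section fits in a few lines. The following is a sketch rather than the
4.3BSD code: the function names are hypothetical, the 40-byte figure is the minimum IP plus
TCP header size used earlier in the paper, and capping the limit at the local link MTU is
an added assumption for links smaller than 576 bytes.

```python
OFF_LINK_LIMIT = 576  # bytes; the "favorite arbitrary value" from the text
MIN_HEADERS = 40      # bytes; minimum IP + TCP headers


def datagram_size_limit(via_gateway: bool, link_mtu: int) -> int:
    """Largest IP datagram to send toward a destination."""
    if via_gateway:
        # Assumption: never exceed what the first hop itself can carry.
        return min(OFF_LINK_LIMIT, link_mtu)
    return link_mtu  # local destination: use the full data-link MTU


def tcp_segment_size(via_gateway: bool, link_mtu: int) -> int:
    """TCP data bytes per segment that keep the datagram within the limit."""
    return datagram_size_limit(via_gateway, link_mtu) - MIN_HEADERS


print(tcp_segment_size(via_gateway=False, link_mtu=1500))
print(tcp_segment_size(via_gateway=True, link_mtu=1500))
```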
`
3.1.3. Send 576-byte datagrams if the route goes off-net
`
Instead of checking whether a destination is behind a gateway, the IP layer can examine the
destination's network number to decide if it is local or non-local. In a subnetted
environment, this trades a higher risk of guessing too high a MINMTU for higher performance
within the local collection of subnets.
`
`3.2. Fragmentation avoidance with protocol changes
`
In this section we describe several fragmentation avoidance schemes that require changes to
existing protocol specifications or the creation of new protocols. Mostly, these involve
changes to gateways and some minor changes to IP-layer software; all are designed so as to
coexist with unmodified gateways and hosts.
`
`3.2.1. Probe mechanisms
`
Ideally, for a host to be able to send the largest possible datagrams that will not be
fragmented, it must have perfect information as to the MINMTU along the path the datagrams
will follow. Since most IP hosts do not even know what that route is, much less what the
MTUs along the route are, we need a mechanism for discovering MINMTU.
`
The most straightforward kind of mechanism is to send a packet along the route, collecting
MTU information as it goes; we call these probe mechanisms. Probe mechanisms require
support from gateways: each gateway along the route must update the probe according to the
MTU of the hop it is about to take. Probe mechanisms also require support from peer hosts:
since paths are asymmetric, once a probe reaches the end of its route, the information it
has collected must be returned to the source host.
`
A probe may either gather a list of all the MTUs along the path (somewhat analogous to the
IP “Record Route” option), with which the host can determine the MINMTU, or the probe may
simply carry only the lowest MTU value seen along the route. The former method provides a
little more information; the latter method is easier to implement and results in shorter
packets.
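
The two styles of probe can be contrasted in a short sketch. The path is modeled simply as
a list of per-hop MTUs; the function names are hypothetical and no packet format is
implied.

```python
from typing import List


def probe_record_all(hop_mtus: List[int]) -> List[int]:
    """Probe that records every MTU, in the manner of Record Route.
    The probe grows by one entry per hop."""
    recorded: List[int] = []
    for mtu in hop_mtus:
        recorded.append(mtu)  # each gateway appends its outgoing MTU
    return recorded


def probe_min_only(hop_mtus: List[int], initial: int) -> int:
    """Probe that carries a single value, lowered by each gateway as needed.
    `initial` is the MTU of the sender's own link."""
    value = initial
    for mtu in hop_mtus:
        value = min(value, mtu)
    return value


path = [1006, 2046, 254, 1500]             # MTUs of the hops after the first link
print(min([1500] + probe_record_all(path)))  # host computes MINMTU from the list
print(probe_min_only(path, initial=1500))    # probe already holds MINMTU
```

Both forms yield the same MINMTU; the list form additionally tells the host where on the
path the smallest link lies.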
`
A probe may be made only once, at the beginning of a connection or the use of a route, or
it may be made periodically. Periodic probes are preferable if the MINMTU is kept
per-destination or per-connection, since the route may change. If MINMTU information is
kept per-route, then it will not change and consequently probes need not be repeated.
`
`Probe mechanisms are useful for discovering other path
`characteristics besides MINMTU. As long as one is
`processing a probe, it makes sense to collect a variety of
`information, since it comes at little additional cost. This
`information could include:
`
Minimum bandwidth
Useful for determining appropriate transmission rates; if a host knows that a 9600-baud
link is part of the path, it should behave differently than if the path is entirely via 100
Mbit fiber networks.

Maximum delay
Useful for determining realistic round-trip times; if a satellite channel is in use, with a
delay of several hundred milliseconds, a host should not retransmit as quickly as if the
end-to-end delay were several milliseconds.

Maximum queue length
A high value implies congestion; if measured using the “fair-queueing” algorithm [16] it
indicates to a host whether it is sending too much. Alternatively, a
“congestion-encountered” flag could be set if any gateway along the path is experiencing
congestion.

Maximum error rate
When a link along the path is experiencing a high error rate, a host might choose to send
shorter packets (so as to reduce the likelihood that an entire datagram is dropped because
of a single error) or use error-correcting codes.

Hop Count
The total number of links traversed along the route may be of interest, for example, in
choosing a value for the “Time To Live” field. (Collection of hop counts was proposed by
Mike Karels [10].)
`
It is not necessary for every gateway along the path to support probing, provided they all
forward the probe. Gaps in the probe information are not fatal; at worst, host behavior is
the same as if no probing is done. A gateway that does support probing can cover up for an
occasional uncooperative gateway by looking at the incoming link as well as the outgoing
link when determining the MINMTU.
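
The covering behavior can be illustrated with a small model, given here as a sketch under
stated assumptions: the path is a list of link MTUs, gateway i joins link i to link i+1,
the sender initializes the probe with the MTU of its own link, and an uncooperative gateway
forwards the probe unchanged. The function and parameter names are hypothetical.

```python
from typing import List


def probe_path(link_mtus: List[int], gateway_cooperates: List[bool],
               check_incoming: bool = True) -> int:
    """Return the MINMTU value the probe holds on arrival."""
    if len(gateway_cooperates) != len(link_mtus) - 1:
        raise ValueError("need exactly one gateway between adjacent links")
    value = link_mtus[0]  # the sender knows its own link
    for i, cooperates in enumerate(gateway_cooperates):
        if not cooperates:
            continue  # probe is forwarded, but not updated
        if check_incoming:
            # Covers for an upstream gateway that did not record this link.
            value = min(value, link_mtus[i])
        value = min(value, link_mtus[i + 1])
    return value


links = [1500, 254, 1006]   # the small link follows an uncooperative gateway
gateways = [False, True]
print(probe_path(links, gateways, check_incoming=True))
print(probe_path(links, gateways, check_incoming=False))
```

In this example the second gateway notices the 254-byte link it received the probe on, so
the estimate is correct; without the incoming-link check the probe would overestimate the
MINMTU. If no gateway cooperates, the host simply learns nothing beyond its own link MTU.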
`
`Since route choices may depend on the IP “Type of
`Service” and perhaps the IP “Security” option, probes
`should carry the same Type of Service and Security as
`the data packets will [4]; gateways should observe Type
`of Service and Security when updating values in probes.
`
`3.2.2. Probing with ICMP messages
`
`A probe can be done using a separate packet; in the IP
`architecture, we would do this using a new ICMP
`“Probe Path” message. This is described in detail in
`appendix I.
`
`Briefly, a host wishing to probe a path sets initial values
`for the fields of the Probe Path message, then sends it to
`the destination host. Each gateway along the route
`updates various fields of the message. When the
`destination host receives the message, it copies the
`recorded information into a different area of the
`message, reinitializes the recording fields, and returns
`the message to the original host. If the second host
`requests, the message may make one more trip, after
`which both hosts will have the path information,
`including MINMTU.
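The exchange just described can be sketched as a small simulation. The actual message layout is given in appendix I and is not reproduced here; the `ProbePathMessage` type, its two fields, and the initial value are hypothetical stand-ins, and the sketch assumes that gateways on the return route update the reinitialized recording field in the same way as on the outbound route.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed initial value: the largest size an IP datagram can have.
INITIAL_MINMTU = 65535


@dataclass
class ProbePathMessage:
    """Hypothetical two-area message: one recording, one returned."""
    recorded_minmtu: int = INITIAL_MINMTU
    returned_minmtu: Optional[int] = None


def forward(msg, link_mtus):
    """Carry the message over a route; gateways lower the recorded value."""
    for mtu in link_mtus:
        msg.recorded_minmtu = min(msg.recorded_minmtu, mtu)


def turn_around(msg):
    """Copy the recorded value to the other area, then reinitialize."""
    msg.returned_minmtu = msg.recorded_minmtu
    msg.recorded_minmtu = INITIAL_MINMTU


# Illustrative routes; the return route need not match the outbound one.
a_to_b_links = [1500, 576, 1500]
b_to_a_links = [1500, 1006, 1500]

msg = ProbePathMessage()
forward(msg, a_to_b_links)   # trip 1: host A to host B
turn_around(msg)             # B copies and reinitializes
forward(msg, b_to_a_links)   # trip 2: B back to A

# Host A now knows the MINMTU in both directions.
a_view = (msg.returned_minmtu, msg.recorded_minmtu)

turn_around(msg)             # optional trip 3, if B requested it
forward(msg, a_to_b_links)

# Host B now holds the B-to-A value plus a fresh A-to-B value.
b_view = (msg.returned_minmtu, msg.recorded_minmtu)
print(a_view, b_view)
```

After two trips only the originating host has both values; the optional third trip is what lets the second host learn the MINMTU of its own outbound route.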
`
`3.2.3. Probes piggybacked on IP headers
`
`It is not necessary to send a separate packet to probe the
`path. Instead, the probe information can be piggybacked
`on the actual data packets, as part of the IP header. In
`appendix II we describe new IP header options for
`recording and returning MINMTU information.
`(Additional options could be defined for recording other
`path characteristics.)
`
`In this case, a host wishing to probe a path sets initial
`values for the “Probe MTU” option in the IP header of
`a datagram it is sending. Each gateway along the route
`may update the value carried in this option. When the
`destination host receives the datagram,
`it copies the
`recorded information into a “MTU Reply” option and
`attaches it to the next datagram going back to the source
`host. When this reply is received, the f
