`
`Reza Rejaie
`Information Sciences Institute
`University of Southern California
`reza@isi.edu
`
`Mark Handley
AT&T Center for Internet Research
`The International Computer Science Institute
mjh@aciri.org
`
`Deborah Estrin
`Information Sciences Institute
`University of Southern California
estrin@isi.edu
`
`Abstract
`
Streaming audio and video applications are becoming increasingly popular on the Internet, and the lack of effective congestion control in such applications is now a cause for significant concern. The problem is one of adapting the compression without requiring video-servers to re-encode the data, and fitting the resulting stream into the rapidly varying available bandwidth. At the same time, rapid fluctuations in quality will be disturbing to the users and should be avoided.

In this paper we present a mechanism for using layered video in the context of unicast congestion control. This quality adaptation mechanism adds and drops layers of the video stream to perform long-term coarse-grain adaptation, while using a TCP-friendly congestion control mechanism to react to congestion on very short timescales. The mismatches between the two timescales are absorbed using buffering at the receiver. We present an efficient scheme for the distribution of buffering among the active layers. Our scheme allows the server to trade short-term improvement for long-term smoothing of quality. We discuss the issues involved in implementing and tuning such a mechanism, and present our simulation results.
`
`1
`
`Introduction
`
The Internet has been experiencing explosive growth of audio and video streaming. Most current applications involve web-based audio and video playback[6, 14] where stored video is streamed from the server to a client upon request. This growth is expected to continue, and such semi-realtime traffic will form a higher portion of the Internet load. Thus the overall behavior of these applications will have a significant impact on Internet traffic.

Since the Internet is a shared environment and does not currently micro-manage utilization of its resources, end systems are expected to be cooperative by reacting to congestion properly and promptly[5]. Deploying end-to-end congestion control results in higher overall utilization of the network and improves inter-protocol fairness. A congestion control mechanism determines the available bandwidth based on the state of the network, and the application should then use this bandwidth efficiently to maximize the quality of the service delivered to the user.
Currently, many of the commercial streaming applications do not perform end-to-end congestion control. This is mainly because stored video has an intrinsic transmission rate. These rate-based applications either transmit data at a near-constant rate or loosely adjust their transmission rates on long timescales, since the required rate adaptation for effective congestion control is not compatible with their nature. Large scale deployment of these applications could result in severe inter-protocol unfairness against TCP-based traffic and possibly even congestion collapse.
This paper is not about congestion control mechanisms, but about a complementary mechanism to adapt the quality of streaming video playback while performing congestion control. However, to design an effective quality adaptation scheme, we need to know the properties of the deployed congestion control mechanism. Our primary assumption is that the congestion control mechanism employs an additive increase, multiplicative decrease (AIMD) algorithm.

We previously designed a simple TCP-friendly congestion control mechanism, the Rate Adaptation Protocol (RAP)[17]. RAP is a rate-based congestion control mechanism and employs an AIMD algorithm in a manner similar to TCP. We assume RAP as the underlying congestion control mechanism because its properties are relatively simple to predict. However, our proposed mechanisms can be applied with any congestion control scheme that deploys an AIMD algorithm.
Figure 1 shows the transmission rate of a RAP source over time. Similar to TCP, it hunts around for a fair share of the bandwidth. However, unlike TCP, RAP is not ACK-clocked and variations of the transmission rate have a more regular sawtooth shape. Bandwidth increases linearly for a period of time, then a packet is lost, an exponential backoff occurs, and the cycle repeats.
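To make the shape of this sawtooth concrete, the following minimal sketch (ours, not the RAP implementation; the hypothetical `fair_share` parameter stands in for the point at which a loss is detected) steps an AIMD rate through the increase/backoff cycle just described:

```python
# Minimal AIMD sketch (ours, not the RAP protocol): the rate climbs by
# one packet per RTT and halves when it crosses a hypothetical loss
# point, producing the sawtooth of Figure 1.

def aimd_rates(fair_share=100.0, step=1.0, rtts=400):
    """Return the per-RTT transmission rate (packets/RTT) of one flow."""
    rates, rate = [], step
    for _ in range(rtts):
        rates.append(rate)
        if rate >= fair_share:   # stand-in for a detected packet loss
            rate /= 2.0          # multiplicative decrease (backoff)
        else:
            rate += step         # additive increase
    return rates
```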
`
• This work was supported by DARPA under contract No. DABT63-95-C0095 and DABT63-96-C-0054 as part of the SPT and VINT projects.
`
`1.1 Target Environment
`
`
Our target environment is a video server that simultaneously plays back different video streams on demand for many heterogeneous clients. As with current Internet video streaming, we expect the length of such streams to range from 30 second clips to full-length movies. The server and clients are connected through the Internet, where the dominant competing traffic is TCP-based. Clients have heterogeneous network capacity and processing power. Users expect startup playback latency to be low, especially for shorter clips played back as part of web surfing. Thus pre-fetching an entire stream before starting its playback is not an option. We believe that this scenario reasonably represents many current and anticipated Internet streaming applications.
`
`189
`
`
`
Figure 1: Transmission rate of a single RAP flow (RAP transmission rate and link bandwidth vs. time)
`
`
`1.2 Motivation
`
If video for playback is stored at a single lowest-common-denominator encoding on the server, high-bandwidth clients will receive poor quality despite the availability of a large amount of bandwidth. However, if the video is stored at a single higher quality encoding (and hence higher data rate) on the server, there will be many low-bandwidth clients that cannot play back this stream. In the past, we have often seen RealVideo streams available at 14.4 Kb/s and 28.8 Kb/s, where the user can choose their connection speed. However, with the advent of ISDN, ADSL, and cable modems to the home, and faster access rates to businesses, the Internet is becoming much more heterogeneous. Customers with higher speed connections feel frustrated to be restricted to modem-speed playback. Moreover, the network bottleneck may be in the backbone, such as at provider interconnects or links to the server itself. In this case, the user cannot know the congestion level, and congestion control mechanisms for streaming video playback are critical.
Given the time-varying bandwidth channel that results from congestion control, the server must be able to adjust the quality of the stream it plays back so that the perceived quality is as high as the available network bandwidth will permit. We term this quality adaptation.
`
`1.3 Quality Adaptation Mechanisms
`
There are several ways to adjust the quality of a pre-encoded stored stream, including adaptive encoding, switching among multiple pre-encoded versions, and hierarchical encoding.

One may re-quantize stored encodings on-the-fly based on network feedback[1, 15, 20]. However, since encoding is CPU-intensive, servers are unlikely to be able to do this for large numbers of clients. Furthermore, once the original data has been stored compressed, the output rate of most encoders cannot be changed over a wide range.
In an alternative approach, the server keeps several versions of each stream with different qualities. As available bandwidth changes, the server plays streams of higher or lower quality as appropriate.

With hierarchical encoding[8, 10, 12, 21], the server maintains a layered encoded version of each stream. As more bandwidth becomes available, more layers of the encoding are delivered. If the average bandwidth decreases, the server may then drop some of the layers being transmitted. Layered approaches usually have the decoding constraint that a particular enhancement layer can only be decoded if all the lower quality layers have been received.

There is a duality between adding or dropping of layers in the layered approach and switching streams in the multiply-encoded approach. However, the layered approach is more suitable for caching by a proxy for heterogeneous clients[18]. In addition, it requires less storage at the server, and it provides an opportunity for selective retransmission of the more important information.

The design of a layered approach for quality adaptation primarily entails the design of an efficient add and drop mechanism that maximizes quality while minimizing the probability of base-layer buffer underflow.
The rest of this paper is organized as follows: first we provide an overview of the layered approach to quality adaptation, and then explain coarse-grain adding and dropping mechanisms in section 2. We also discuss fine-grain inter-layer bandwidth allocation for a single backoff scenario. Section 3 motivates the need for smoothing in the presence of real loss patterns and discusses two possible approaches. In section 4, we sketch an efficient filling and draining mechanism that not only achieves smoothing but is also able to cope efficiently with various patterns of losses. We evaluate our mechanism through simulation in section 5. Section 6 briefly reviews related work. Finally, section 7 concludes the paper and addresses some of our future plans.
`
`2
`
`layered Quality Adaptation
`
Hierarchical encoding provides an effective way for a video playback server to coarsely adjust the quality of a video stream without transcoding the stored data. However, it does not provide fine-grained control over bandwidth, i.e., bandwidth changes at the granularity of a layer. Furthermore, there needs to be a quality adaptation mechanism to smoothly adjust the quality (i.e., the number of layers) as bandwidth changes. Users will tolerate poor quality video, but rapid variations in quality are disturbing.

Hierarchical encoding allows video quality adjustment over long periods of time, whereas congestion control changes the transmission rate rapidly over short time intervals (several round-trip times (RTTs)). The mismatch between the two timescales is made up for by buffering data at the receiver to smooth the rapid variations in available bandwidth and allow a near-constant number of layers to be played.
Figure 2 graphs a simple simulation of a quality adaptation mechanism in action. The top graph shows the available network bandwidth and the consumption rate at the receiver, with no layers being consumed at startup, then one layer, and finally two layers. During the simulation, two packets are dropped and cause congestion control backoffs, when the transmission rate drops below the consumption rate for a period of time. The lower graph shows the playout sequence numbers of the actual packets against time. The horizontal lines show the period between arrival time and playout time of a packet. Thus it indicates the total amount of buffering for each layer. This simulation shows more buffered data for Layer 0 (the base layer) than for Layer 1 (the enhancement layer). After the first backoff, the length of these lines decreases, indicating that buffered data from Layer 0 is being used to compensate for the lack of available bandwidth. At the time of the second backoff, a little data has been buffered for Layer 1 in addition to the large amount for Layer 0. Thus data is drawn from both buffers properly to compensate for the lack of available bandwidth.
`
`190
`
`
`
Figure 3: Filling and draining phase
`
The congestion control mechanism dictates the available bandwidth¹. We cannot send more than this amount, and do not wish to send less². In a real network even the average bandwidth of a congestion controlled flow changes over the session lifetime. Thus a quality adaptation mechanism must continuously evaluate the available bandwidth and adjust the number of active layers accordingly.

In this paper we assume that the layers are linearly spaced - that is, each layer has the same bandwidth. This simplifies the analysis, but is not a requirement. In addition, we assume each layer has a constant consumption rate over time. In practice this is unlikely in a real codec, but to a first approximation it is reasonable. It can be ignored by slightly increasing the amount of receiver buffering for all layers to absorb variations in consumption rate.

Figure 3 shows a single cycle of the congestion control mechanism. The sawtooth waveform is the instantaneous transmission rate. There are n_a active layers, each of which has a consumption rate of C. In the left hand side of the figure, the transmission rate is higher than the consumption rate, and this data will be stored temporarily in the receiver's buffer. The total amount of stored data is equal to the area of triangle abc. Such a period of time is known as a filling phase. Then, at time t_b, a packet is lost and the transmit rate is reduced multiplicatively. To continue playing out n_a layers when the transmission rate drops below the consumption rate, some data must be drawn from the receiver buffer until the transmission rate reaches the consumption rate again. The amount of data drawn from the buffer is shown in this figure as triangle cde. Such a period of time is known as a draining phase.
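As a concrete rendering of this geometry, the sketch below (our code, using the paper's symbols, with R the rate just before the backoff) computes the buffering consumed by a draining phase:

```python
# Sketch of the Figure 3 geometry (our code): after a backoff the rate
# falls from R to R/2 and recovers at slope S, so the draining phase
# lasts (n_a*C - R/2)/S and consumes a triangular area of buffered data.

def draining_phase_buffering(n_a, C, R, S):
    """Receiver buffering needed to keep n_a layers playing after a
    single backoff from rate R (the area of triangle cde)."""
    deficit = n_a * C - R / 2.0      # consumption minus post-backoff rate
    if deficit <= 0:
        return 0.0                   # rate never drops below consumption
    return deficit ** 2 / (2.0 * S)  # triangle area: deficit*(deficit/S)/2
```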
Note that the quality adaptation mechanism can only adjust the number of active layers and their bandwidth share. This paper attempts to derive efficient behavior for these two key mechanisms:
`
• A coarse-grain mechanism for adding and dropping layers. By changing the number of active layers, the server can perform coarse-grain adjustment on the total amount of receiver-buffered data.
`
• A fine-grain inter-layer bandwidth allocation mechanism among the active layers. If there is receiver-buffered data available for a layer, we can temporarily allocate less bandwidth than is being consumed while taking the remainder from the buffer. This smoothes out reductions in the available bandwidth. When spare bandwidth is available, we can send data for a layer at a rate higher than its consumption rate, and increase the data buffered for that layer at the receiver.
`
In the next section, we present coarse-grain adding and dropping mechanisms as well as their relation to the fine-grain bandwidth allocation. Then we discuss the fine-grain bandwidth allocation in the subsequent sections.
`
`2.1 Adding a Layer
`
A new layer can be added as soon as the instantaneous available bandwidth exceeds the consumption rate (in the decoder) of the existing layers. The excess bandwidth could then be used to start buffering a new layer. However, this would be problematic as without knowing future available bandwidth we cannot decide when it will first be possible to start decoding the layer. The new layer's playout is decided by the inter-layer timing dependency between its data and that in the base layer. Therefore we cannot make a reasoned decision about which data from the new layer to actually send³.
A more practical approach is to start sending a new layer when the instantaneous bandwidth exceeds the consumption rate of the existing layers plus the new layer. In this approach the layer can start to play out immediately. In this case there is some excess bandwidth from the time the available bandwidth exceeds the consumption rate of the existing layers until the new layer is added. This excess bandwidth can be used to buffer data for existing layers at the receiver.
`
³ Note that once the inter-layer timing for a new layer is adjusted, it is maintained as long as the buffer does not dry out.
`
Figure 2: Layered Encoding with Receiver Buffering
`
`
¹ Available bandwidth and transmission rate are used interchangeably throughout this paper.
² For simplicity we ignore flow control issues in this paper but implementations should not. However, our final solutions generally require so little receiver buffering that this is not often an issue.
`
`191
`
`
`
In practice, this bandwidth constraint for adding is still not conservative enough, as it may result in several layers being added and dropped with each cycle of the congestion control sawtooth. Such rapid changes in quality would be disconcerting for the viewer. One way to prevent rapid changes in quality is to add a buffering condition such that adding a new layer does not endanger existing layers. Thus, the server may add a new layer when:

1. The instantaneous available bandwidth is greater than the consumption rate of the existing layers plus the new layer, and,

2. There is sufficient total buffering at the receiver to survive an immediate backoff and continue playing all the existing layers plus the new layer.
`
To satisfy the second condition we assume (for now) that no additional backoff will occur during the draining phase, and that the slope of linear increase can be properly estimated.

These are the minimal criteria for adding a new layer. If these conditions hold, a new layer can be kept for a reasonable period of time during the normal congestion control cycles. We shall show later that we normally want to be even more conservative than this. Clearly we need to have sufficient buffering at the receiver to smooth out variations in the available bandwidth so that the number of active layers does not change due to the normal hunting behavior of the congestion control mechanism.
`
Expressing the adding conditions more precisely:

Condition 1: R > (n_a + 1)C

Condition 2: Σ_{i=0}^{n_a−1} buf_i ≥ ((n_a + 1)C − R/2)² / (2S)

where R is the current transmission rate,
n_a is the number of currently active layers,
buf_i is the amount of buffered data for layer i, and
S is the rate of linear increase in bandwidth (typically one packet per RTT).
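A direct transcription of these two conditions into code might look as follows (a sketch with our naming; `bufs[i]` plays the role of buf_i):

```python
def can_add_layer(R, C, S, bufs):
    """Sketch of the two adding conditions; bufs[i] is the buffered
    data for layer i, so len(bufs) is n_a."""
    n_a = len(bufs)
    # Condition 1: instantaneous rate exceeds consumption incl. new layer.
    if R <= (n_a + 1) * C:
        return False
    # Condition 2: total buffering covers an immediate backoff while
    # playing n_a + 1 layers (the deficit triangle has area d^2 / 2S).
    deficit = (n_a + 1) * C - R / 2.0
    required = deficit ** 2 / (2.0 * S) if deficit > 0 else 0.0
    return sum(bufs) >= required
```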
`
`2.2 Dropping a Layer
`
Once a backoff occurs, if the total amount of buffering at the receiver is less than the estimated required buffering for recovery (i.e., the area of triangle cde in figure 3), the correct course of action is to immediately drop the highest layer. This reduces the consumption rate (n_a C) and hence reduces the buffer requirement for recovery. If the buffering is still insufficient, the server should iteratively drop the highest layer until the amount of buffering is sufficient. This rule clearly doesn't apply to the base layer, which is always sent.
`
The dropping mechanism more precisely:

WHILE ( n_a C > R + √(2S Σ_{i=0}^{n_a−1} buf_i) )
DO n_a = n_a − 1
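In code, the same loop might read as follows (our sketch; here R is the transmission rate just after the backoff, and the sum is recomputed each iteration because a dropped layer's buffer no longer counts toward recovery):

```python
import math

def drop_layers(n_a, C, R, S, bufs):
    """Sketch of the dropping rule: shed the highest layer while the
    remaining layers' buffering cannot cover the draining-phase deficit."""
    while n_a > 1 and n_a * C > R + math.sqrt(2.0 * S * sum(bufs[:n_a])):
        n_a -= 1          # base layer (n_a == 1) is never dropped
    return n_a
```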
This mechanism provides a coarse-grain criterion for dropping a layer. However, it may be insufficient to prevent buffer underflow during the draining phase for one of the following reasons:
`
• We may suffer a further backoff before the current draining phase completes.

• Our estimate of the slope of linear increase may be incorrect if the network RTT changes substantially.

• There may be sufficient total data buffered, but it may be allocated among the different layers in a manner that precludes its use to aid recovery.
`
The first two situations are due to incorrect prediction of the amount of buffered data needed to recover, and we term such an event a critical situation. In such events, the only appropriate course of action is to drop additional layers as soon as the critical situation is discovered.

The third situation is more problematic, and relates to the fine-grain bandwidth allocation among active layers during both filling and draining phases. We devote much of the rest of this paper to deriving and evaluating a near-optimal solution to this situation.
`
`2.3
`
`Inter-layer Buffer Allocation
`
Because of the decoding constraint in hierarchical coding, each additional layer depends on all the lower layers, and correspondingly is of decreasing value. Thus a buffer allocation mechanism should provide higher protection for lower layers by allocating a higher share of buffering to them.

The challenge of inter-layer buffer allocation is to ensure that the total amount of buffering is sufficient, and that it is properly distributed among active layers to effectively absorb the short-term reductions in bandwidth that might occur. The following two examples illustrate ways in which improper allocation of buffered data might fail to compensate for the lack of available bandwidth.
`
• Dropping layers with buffered data: A simple buffer allocation scheme might allocate an equal share of buffer to each layer. However, if the highest layer is dropped after a backoff, its buffered data is no longer able to assist the remaining layers in the recovery. The top layer's data will still be played out, but it is not providing buffering functionality. This implies that it is more beneficial to buffer data for lower layers.
`
• Insufficient distribution of buffered data: An equally simple buffer allocation scheme might allocate all the buffering to the base layer. Consider an example where three layers are playing, so a total consumption rate of 3C must be supplied to the receiver's decoder. If the transmission rate drops to C, the base layer (L_0) can be played from its buffer. Since neither L_1 nor L_2 has any buffering, they require transmission from the source. However, the available bandwidth is only sufficient to feed one layer. Thus L_2 must be dropped even if the total buffering were sufficient for recovery.
`
In these examples, although buffering is available, it cannot be used to prevent the dropping of layers. This is an inefficient use of the buffering. In general, we are striving for a distribution of buffering that is most efficient in the sense that it provides maximal protection against dropping layers for any likely pattern of short-term reduction in available bandwidth.

These examples reveal the following tradeoffs for inter-layer buffer allocation:
`
`192
`
`
`
`
• Allocating more buffering for the lower layers not only improves their protection but also increases the efficiency of buffering.

• Buffered data for each layer cannot provide more than its consumption rate (i.e., C). Thus there is a minimum number of buffering layers needed to cope with short-term reductions in available bandwidth for successful recovery. This minimum is directly determined by the reduction in bandwidth that we intend to absorb by buffering.
`
Expressing this more precisely:

n_b = ⌈(n_a C − R/2) / C⌉

where n_b is the minimum number of buffering layers and
R is the transmission rate (before a backoff).
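Consistent with this relation (our reconstruction: the shortfall just after a backoff is n_a C − R/2 and each buffering layer contributes at most C), a small helper:

```python
import math

def min_buffering_layers(n_a, C, R):
    """Minimum number of layers that must hold buffered data to absorb
    one backoff: ceil of the post-backoff shortfall divided by C."""
    deficit = n_a * C - R / 2.0
    return max(0, math.ceil(deficit / C)) if deficit > 0 else 0
```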
`
`2.4 Optimal Inter-layer Buffer Allocation
`
Given a draining phase following a single backoff, we can derive the optimal inter-layer buffer allocation that maximizes buffering efficiency. Figure 4 illustrates an optimal buffer allocation and its corresponding draining pattern for a draining phase. Here we assume that the total amount of buffering at the receiver at time t_b is precisely sufficient for recovery (i.e., the area of triangle afg), with no spare buffering available at the end of the draining phase.
Figure 4: The optimal inter-layer buffer distribution
`
To justify the optimality of this buffer allocation, consider that the consumption rate of a layer must be supplied either from the network, or from the buffer, or a combination of the two. If it is supplied entirely from the buffer, that layer's buffer is draining at consumption rate C. The area of quadrilateral defg in figure 4 shows the maximum amount of buffer that can be drained from a single layer during this draining phase. If the draining phase ends as predicted, there is no preference as to buffer distribution among active layers as long as no layer has more than defg worth of buffered data. However, if the situation becomes critical due to further backoffs, layers must be dropped. Allocating area defg of buffering to the base layer would ensure that the maximum amount of the buffered data is still usable for recovery, and maximizes buffering efficiency.

By similar reasoning, the next largest amount an additional layer's buffer can contribute is quadrilateral bcde, and this portion of buffered data should be allocated to L_1, the first enhancement layer, and so on. This approach minimizes the amount of buffered data allocated to higher layers that might be dropped in a critical situation, and consequently maximizes buffering efficiency.

The optimal amount of buffering for layer i is:

Buf_{i,opt} = (C/(2S)) (C(2n_a − 2i − 1) − R),   for i < n_b − 1
Buf_{i,opt} = (1/(2S)) (n_a C − R/2 − iC)²,      for i = n_b − 1

Although we can calculate the optimal allocation of buffered data for the active layers, a backoff may occur at any random time. To tackle this problem, during the filling phase we incrementally adjust the allocation of buffered data so that the buffer state always remains as close as possible to an optimal state.
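The formula translates directly into a per-layer allocation sketch (our code; it slices the deficit triangle into C-high horizontal bands, lowest band to the base layer):

```python
def optimal_buffer_allocation(n_a, n_b, C, R, S):
    """Optimal buffer share per layer for recovery from one backoff
    (R is the pre-backoff rate). Layers 0..n_b-2 each get a C-high
    band of the deficit triangle; layer n_b-1 gets the triangular tip;
    layers above n_b-1 hold no buffering."""
    shares = []
    for i in range(n_a):
        if i < n_b - 1:
            shares.append(C * (C * (2 * n_a - 2 * i - 1) - R) / (2.0 * S))
        elif i == n_b - 1:
            shares.append((n_a * C - R / 2.0 - i * C) ** 2 / (2.0 * S))
        else:
            shares.append(0.0)
    return shares
```

As a quick sanity check, the shares sum to (n_a C − R/2)²/(2S), the area of triangle afg.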
`
`Figure 5: Optimal Buffer sharing
`
Toward that goal, we assume that a single backoff will occur immediately, and ask the question: "if we keep only the base layer, is there sufficient buffering to survive?". If there is not sufficient buffering, then we fill up the base layer's buffer until it has enough buffering to survive a single backoff. Then we ask the question: "if we keep only two layers, is there enough buffering to survive with those buffers having optimal allocation?". If there is not enough base layer data, we fill the base layer's buffer up to the optimal level. Then we start sending L_1 data until both layers have the optimal amount of buffering to survive. We repeat this process and increase the number of expected surviving layers until all the buffering layers are filled up to an optimal level such that all active layers can survive a single backoff. This approach results in a sequential filling pattern among the buffering layers.
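Combining the sketches above, this sequential filling rule can be written as a search for the next buffer to top up (again our code, reusing min_buffering_layers and optimal_buffer_allocation from the earlier sketches):

```python
def next_layer_to_fill(bufs, C, R, S):
    """Sequential filling sketch: for k = 1, 2, ... surviving layers,
    find the lowest layer still below its optimal share for riding out
    an immediate single backoff, and fill that one next."""
    n_a = len(bufs)
    for k in range(1, n_a + 1):
        n_b = min_buffering_layers(k, C, R)
        targets = optimal_buffer_allocation(k, n_b, C, R, S)
        for i in range(k):
            if bufs[i] < targets[i]:
                return i        # first shortfall, filled from the bottom up
    return None                 # everything optimal: adding may be considered
```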
Figure 5 illustrates the optimal filling and draining scheme for a single backoff. If a backoff occurs exactly at time t_b, all layers can survive the backoff. Occurrence of a backoff earlier than t_b results in dropping one or more active layers. However, the buffer state is always as close as possible to the optimal state without those layers. If no backoff occurs until the adding conditions (section 2.1) are satisfied, a new layer is added and we repeat the sequential filling mechanism.

It is worth mentioning that the server can control the filling and draining pattern by proper fine-grain bandwidth allocation among active layers. Figure 5 illustrates that at each point of time during the draining phase, the bandwidth share plus the draining rate for each layer is equal to its consumption rate. Thus maximally efficient buffering results in the upper layers being supplied from the network during the draining phase while the lower layers are supplied from their buffers. For example, just after the backoff, layer 2 is supplied entirely from the buffer, but the amount supplied from the buffer decreases to zero as data supplied from the network takes over. Layers 0 and 1 are supplied from the buffer for longer periods.
`
`3 Smoothness Constraints
`
In the previous section, we derived an optimal filling and draining scheme based on the assumption that we only buffer to survive a single backoff with all the layers intact. However, examination of Internet traffic indicates that real networks exhibit near-random[2] loss patterns with frequent additional backoffs during a draining phase. Thus, aiming to survive only a single backoff is too aggressive and results in frequent adding and dropping of layers.
`
`3.1 Smoothing
`
To achieve reasonable smoothing of the add and drop rate, an obvious approach is to refine our adding conditions (in section 2.1) to be more conservative. We have considered the following two mechanisms to achieve smoothing:

• We may add a new layer if the average available bandwidth is greater than the consumption rate of the existing layers plus the new layer.

• We may add a new layer if we have a sufficient amount of buffered data to survive K_max backoffs with the existing layers, where K_max is a smoothing factor with value greater than one.
`
Although each of these mechanisms results in smoothing, the latter not only allows us to directly tie the adding decision to the appropriate buffer state for adding, but it can also utilize limited-bandwidth links effectively. For example, if there is sufficient bandwidth across a modem link to receive 2.9 layers, the average bandwidth would never become high enough to add the third layer under the former mechanism. In contrast, the latter mechanism would send 3 layers for 90% of the time, which is more desirable. For the rest of this paper we assume that the only condition for adding a new layer is the availability of optimal buffer allocation for recovery from K_max backoffs.
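The exact buffer requirement for multiple backoffs is derived later in the paper; purely as an illustration, one conservative reading (our assumption, not the paper's formula: K_max back-to-back rate halvings before any linear recovery) might look like this:

```python
def can_add_layer_smoothed(R, C, S, bufs, k_max=2):
    """Illustrative, conservative variant of the adding test (our
    assumption, not the paper's derivation): demand enough buffering
    to cover k_max consecutive rate halvings with the existing layers
    plus the new one. Ignoring linear recovery between backoffs makes
    the estimated deficit, and hence the test, pessimistic."""
    n = len(bufs) + 1                      # layer count after the add
    deficit = n * C - R / (2.0 ** k_max)   # shortfall after k_max halvings
    required = deficit ** 2 / (2.0 * S) if deficit > 0 else 0.0
    return sum(bufs) >= required
```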
Changing K_max allows us to tune the balance between maximizing the short-term quality and minimizing the changes in quality. An obvious question is "What degree of smoothing is appropriate?" In the absence of a specific layered codec and user evaluation, K_max cannot be analytically derived. Instead it should be set based on real-world user perception experiments to determine the appropriate degree of smoothing that is not disturbing to the user. In practice, we probably also want to base K_max on the average bandwidth and RTT since these determine the duration of a draining phase.
`
`3.2 Buffering Revisited
`
`If we delay adding a new layer to achieve smoothing, this affects
`the way we fill and drain the buffers. Figure 6 demonstrates this
`issue.
`
Figure 6: Revised Draining Phase Algorithm
`
Up until time t_3, this is the same as figure 5. The second filling phase starts at time t_3, and at t_4 there is sufficient buffering to survive a backoff. However, for smoothing purposes, a new layer is not added at this point and we continue buffering data until a backoff occurs at t_5.

Note that as the available bandwidth increases, the total amount of buffering increases but the required buffering for recovery from a single backoff decreases. At time t_5, we have more buffering than we need to survive a single backoff, but insufficient buffering to survive a second backoff before the end of the draining phase. We need to specify how we allocate the extra buffering after time t_4, and how we drain these buffers after t_5 while maintaining efficiency.