An Experimental Evaluation of Rate-Adaptation Algorithms in Adaptive Streaming over HTTP
`
`Saamer Akhshabi
`College of Computing
`Georgia Institute of
`Technology
`sakhshab@cc.gatech.edu
`
`Ali C. Begen
`Video and Content Platforms
`Research and Advanced
`Development
`Cisco Systems
`abegen@cisco.com
`
`Constantine Dovrolis
`College of Computing
`Georgia Institute of
`Technology
`dovrolis@cc.gatech.edu
`
`ABSTRACT
`Adaptive (video) streaming over HTTP is gradually being
`adopted, as it offers significant advantages in terms of both
`user-perceived quality and resource utilization for content
`and network service providers.
`In this paper, we focus
`on the rate-adaptation mechanisms of adaptive streaming
`and experimentally evaluate two major commercial players
`(Smooth Streaming, Netflix) and one open source player
`(OSMF). Our experiments cover three important operating
`conditions. First, how does an adaptive video player react
`to either persistent or short-term changes in the underlying
`network available bandwidth? Can the player quickly
`converge to the maximum sustainable bitrate?
`Second,
`what happens when two adaptive video players compete for
`available bandwidth in the bottleneck link? Can they share
`the resources in a stable and fair manner? And third, how
`does adaptive streaming perform with live content? Is the
`player able to sustain a short playback delay? We identify
`major differences between the three players, and significant
`inefficiencies in each of them.
`
`Categories and Subject Descriptors
`C.4 [Computer Systems Organization]: Performance of
`Systems
`
`General Terms
`Performance, Measurement, Algorithms
`
`Keywords
Experimental evaluation, adaptive streaming, rate-adaptation algorithm, video streaming over HTTP
`
1. INTRODUCTION
Video has long been viewed as the “next killer application”. Over the last 20 years, the various instances
`of packet video have been thought of as demanding
`applications that would never work satisfactorily over
`best-effort IP networks. That pessimistic view actually
`led to the creation of novel network architectures and
QoS mechanisms, which, however, were not deployed at large scale. Eventually, over the last three to four years, video-based applications, and video streaming in particular, have become immensely popular, generating more than half of the aggregate Internet traffic. Perhaps surprisingly, though, video streaming today runs over IP without any
`specialized support from the network. This has become
`possible through the gradual development of highly efficient
`video compression methods, the penetration of broadband
`access technologies, and the development of adaptive video
`players that can compensate for the unpredictability of the
`underlying network through sophisticated rate-adaptation,
`playback buffering, and error recovery and concealment
`methods.
`Another conventional wisdom has been that video
`streaming would never work well over TCP, due to the
`throughput variations caused by TCP’s congestion control
`and the potentially large retransmission delays. As a
`consequence, most of the earlier video streaming research
`has assumed that the underlying transport protocol is UDP
`(or RTP over UDP), which considerably simplifies the
`design and modeling of adaptive streaming applications.
`In practice, however, two points became clear in the last
`few years. First, TCP’s congestion control mechanisms
`and reliability requirement do not necessarily hurt the
`performance of video streaming, especially if the video player
`is able to adapt to large throughput variations. Second, the
`use of TCP, and of HTTP over TCP in particular, greatly
`simplifies the traversal of firewalls and NATs.
The first wave of HTTP-based video streaming applications used the simple progressive download method,
`in which a TCP connection simply transfers the entire
`movie file as quickly as possible. The shortcomings of that
`approach are many, however. One major issue is that all
`clients receive the same encoding of the video, despite the
`large variations in the underlying available bandwidth both
`across different clients and across time for the same client.
`This has recently led to the development of a new wave
`of HTTP-based streaming applications that we refer to as
adaptive streaming over HTTP (for a general overview of video streaming protocols and adaptive streaming, refer to [2]). Several recent players, such as Microsoft's Smooth Streaming, Adobe OSMF, as well as the players developed
`or used by Netflix, Move Networks and others, use this
`approach.
`In adaptive streaming, the server maintains
`multiple profiles of the same video, encoded in different
`bitrates and quality levels. Further, the video object is
partitioned into fragments, typically a few seconds long. A
`player can then request different fragments at different
`encoding bitrates, depending on the underlying network
`conditions. Notice that it is the player that decides what
`bitrate to request for any fragment, improving server-side
`scalability. Another benefit of this approach is that the
`player can control its playback buffer size by dynamically
`adjusting the rate at which new fragments are requested.
`Adaptive streaming over HTTP is a new technology. It
`is not yet clear whether the existing commercial players
`perform well, especially under dynamic network conditions.
`Further, the complex interactions between TCP’s congestion
`control and the application’s rate-adaptation mechanisms
`create a “nested double feedback loop” - the dynamics of
`such interacting control systems can be notoriously complex
`and hard to predict. As a first step towards understanding
`and improving such video streaming mechanisms, this paper
`experimentally evaluates two commercial adaptive video
`players over HTTP (Microsoft’s Smooth Streaming and the
`player used by Netflix) and one open source player (Adobe
`OSMF). Our experiments cover three important operating
`conditions. First, how does an adaptive video player react
`to either persistent or short-term changes in the underlying
`network available bandwidth? Can the player quickly
`converge to the maximum sustainable bitrate?
`Second,
`what happens when two adaptive video players compete for
`available bandwidth in the bottleneck link? Can they share
`that resource in a stable and fair manner? And third, how
`does adaptive streaming perform with live content? Is the
`player able to sustain a short playback delay? We identify
`major differences between the three players, and significant
`inefficiencies in each of them.
`
`1.1 Related Work
`Even though there is extensive previous work on
`rate-adaptive video streaming over UDP, transport of
`rate-adaptive video streaming over TCP, and HTTP
`in particular, presents unique challenges and has not
`been studied in depth in the past. A good overview
`of multi-bitrate video streaming over HTTP was given
by Zambelli [17], focusing on Microsoft's IIS Smooth Streaming. Adobe has provided an overview of HTTP Dynamic Streaming on the Adobe Flash platform [1]. Cicco et al. [3] experimentally investigated the performance of
`the Akamai HD Network for Dynamic Streaming for Flash
`over HTTP. They studied how the player reacted to abrupt
`changes in the available bandwidth and how it shared the
`network bottleneck with a greedy TCP flow. Kuschnig et
`al. [9] evaluated and compared three server-side rate-control
`algorithms for adaptive TCP streaming of H.264/SVC
`video. The same authors have proposed a receiver-driven
`transport mechanism that uses multiple HTTP streams and
`different priorities for certain parts of the media stream
[10]. The end result is to reduce throughput fluctuations and, thus, improve video streaming over TCP. Tullimas et
`al. [15] also proposed a receiver-driven TCP-based method
`for video streaming over the Internet, called MultiTCP,
`aimed at providing resilience against short-term bandwidth
`
`fluctuations and controlling the sending rate by using
`multiple TCP connections. Hsiao et al.
`[8] proposed
`a method called Receiver-based Delay Control (RDC) to
`avoid congestion by delaying TCP ACK generation at the
`receiver based on notifications from routers. Wang et al.
`[16] developed discrete-time Markov models to investigate
`the performance of TCP for both live and stored media
`streaming.
`Their models provide guidelines indicating
`the circumstances under which TCP streaming leads to
`satisfactory performance. For instance, they show that TCP
`provides good streaming performance when the achievable
`TCP throughput is roughly twice the media bitrate, with
`only a few seconds of startup delay. Goel et al. [7] showed
`that the latency at the application layer, which occurs as a
`result of throughput-optimized TCP implementations, could
`be minimized by dynamically tuning TCP’s send buffer.
`They developed an adaptive buffer-size tuning technique
that aimed at reducing this latency. Feng et al. [5]
`proposed and evaluated a priority-based technique for the
`delivery of compressed prerecorded video streams across
`best-effort networks. This technique uses a multi-level
`priority queue in conjunction with a delivery window to
`smooth the video frame rate, while allowing it to adapt to
`changing network conditions. Prangl et al.
`[13] proposed
`and evaluated a TCP-based perceptual QoS improvement
`mechanism. Their approach is based on media content
`adaptation (transcoding), applied at the application layer
`at the server. Deshpande [4] proposed an approach that
`allowed the player to employ single or multiple concurrent
`HTTP connections to receive streaming media and switch
`between the connections dynamically.
`
`1.2 Paper Outline
`In Section 2, we describe our experimental approach, the
`various tests we perform for each player, and the metrics we
`focus on. Sections 3, 4 and 5 focus on the Smooth Streaming,
`Netflix, and OSMF players, respectively. Section 6 focuses
`on the competition effects that take place when two adaptive
`players share the same bottleneck. Section 7 focuses on live
`video using the Smooth Streaming player. We summarize
`what we learn for each player and conclude the paper in
`Section 8.
`
`2. METHODOLOGY AND METRICS
`In this section, we give an overview of our experimental
`methodology and describe the metrics we focus on. The
`host that runs the various video players also runs a packet
`sniffer (Wireshark [12]) and a network emulator (DummyNet
[14]). Wireshark allows us to capture the traffic from and to the HTTP server and analyze it offline. DummyNet allows us
`to control the downstream available bandwidth (also referred
`to as avail-bw) that our host can receive. That host is
`connected to the Georgia Tech campus network through a
`Fast Ethernet interface. When we do not limit the avail-bw
`using DummyNet, the video players always select the highest
`rate streams; thus, when DummyNet limits the avail-bw to
`relatively low bitrates (1-5 Mbps) we expect that it is also
`the downstream path’s end-to-end bottleneck.
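As an illustration of this setup (our own sketch, not the authors' scripts), dummynet can be driven from a small Python script that installs a pipe on inbound traffic and reconfigures its bandwidth during an experiment; the rule number, pipe number and schedule below are arbitrary examples, and the exact ipfw syntax may differ slightly across dummynet ports.

# Illustrative sketch: emulating persistent avail-bw changes with dummynet.
# Rule/pipe numbers and the schedule are arbitrary examples.
import subprocess, time

def set_avail_bw(mbps):
    # Reconfigure pipe 1 to the desired downstream bandwidth.
    subprocess.run(["ipfw", "pipe", "1", "config", "bw", f"{mbps}Mbit/s"],
                   check=True)

# One-time rule: send all inbound IP traffic through pipe 1.
subprocess.run(["ipfw", "add", "100", "pipe", "1", "ip", "from", "any",
                "to", "any", "in"], check=True)

# Example schedule: 5 Mbps for 60 s, then 2 Mbps, then back to 5 Mbps.
for mbps, duration in [(5, 60), (2, 60), (5, 60)]:
    set_avail_bw(mbps)
    time.sleep(duration)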
`In the following, we study various throughput-related
`metrics:
`1. The avail-bw refers to the bitrate of the bottleneck that
`we emulate using DummyNet. The TCP connections that
transfer video and audio streams cannot exceed (collectively) that bitrate at any point in time.
`2. The 2-sec connection throughput refers to the download
`throughput of a TCP connection that carries video or audio
`traffic, measured over the last two seconds.
`3. The running average of a connection’s throughput refers
`to a running average of the 2-sec connection throughput
measurements. If $A(t_i)$ is the 2-sec connection throughput in the i-th time interval, the running average of the connection throughput is:
`
`current rate to avoid unnecessary bitrate variations. Due
`to space constraints, we do not show results from all these
`experiments for each player; we select only those results that
`are more interesting and provide new insight.
`All experiments were performed on a Windows Vista
`Home Premium version 6.0.6002 laptop with an Intel(R)
`Core(TM)2 Duo P8400 2.26 GHz processor, 3.00 GB
`physical memory, and an ATI Radeon Graphics Processor
`(0x5C4) with 512 MB dedicated memory.
`
$$\hat{A}(t_i) = \begin{cases} \delta\,\hat{A}(t_{i-1}) + (1-\delta)\,A(t_i), & i > 0 \\ A(t_0), & i = 0 \end{cases}$$

In the experiments, we use δ = 0.8. (A short computational sketch of these throughput metrics follows this list.)
`4. The (audio or video) fragment throughput refers to the
`download throughput for a particular fragment, i.e., the size
`of that fragment divided by the corresponding download
`duration. Note that, if a fragment is downloaded in every
`two seconds, the fragment throughput can be much higher
`than the 2-sec connection throughput in the same time
`interval (because the connection can be idle during part
`of that time interval). As will be shown later, some video
`players estimate the avail-bw using fragment throughput
`measurements.
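To make these definitions concrete, here is a minimal Python sketch (ours, for illustration only; the input format of (timestamp, bytes) samples and per-fragment records is an assumption, not the output of any particular tool) that computes metrics 2-4:

# Illustrative computation of the throughput metrics defined above.
DELTA = 0.8        # smoothing factor used in the experiments
INTERVAL = 2.0     # window for the "2-sec connection throughput", in seconds

def two_sec_throughput(samples, t):
    # Metric 2: bits received by the connection in (t - 2 s, t], over 2 s (Mbps).
    recent = sum(nbytes for ts, nbytes in samples if t - INTERVAL < ts <= t)
    return recent * 8 / INTERVAL / 1e6

def running_average(throughputs, delta=DELTA):
    # Metric 3: running (exponentially weighted) average of the A(t_i) samples.
    avg = None
    for a in throughputs:
        avg = a if avg is None else delta * avg + (1 - delta) * a
    return avg

def fragment_throughput(fragment_bytes, download_seconds):
    # Metric 4: fragment size divided by its download duration (Mbps).
    return fragment_bytes * 8 / download_seconds / 1e6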
`
`3. MICROSOFT SMOOTH STREAMING
`In the following experiments, we use Microsoft Silverlight
`Version 4.0.50524.0.
`In a Smooth Streaming manifest
`file, the server declares the available audio and video
`bitrates and the resolution for each content (among other
`information). The manifest file also contains the duration
`of every audio and video fragment. After the player has
`received the manifest file,
`it generates successive HTTP
`requests for audio and video fragments.
`Each HTTP
`request from the player contains the name of the content,
`the requested bitrate, and a timestamp that points to the
`beginning of the corresponding fragment. This timestamp
`is determined using the per-fragment information provided
`in the manifest. The following is an example of a Smooth
`Streaming HTTP request.
`
`GET (..)/BigBuckBunny720p.ism/
`QualityLevels(2040000)/Fragments(video=400000000)
`HTTP/1.1
`
`In this example, the requested bitrate is 2.04 Mbps and
`the fragment timestamp is 40 s.
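For illustration, the following small Python sketch (our own; the base URL is a placeholder and the 100-ns timestamp unit is inferred from the example above, where 400000000 corresponds to 40 s) constructs a request URL of this form:

# Sketch: building a Smooth Streaming fragment request URL as in the example.
BASE = "http://server.example.com/BigBuckBunny720p.ism"   # placeholder URL

def fragment_url(bitrate_bps, start_seconds, stream="video"):
    ticks = int(start_seconds * 10_000_000)   # seconds -> 100-ns units (assumed)
    return f"{BASE}/QualityLevels({bitrate_bps})/Fragments({stream}={ticks})"

print(fragment_url(2_040_000, 40))
# .../QualityLevels(2040000)/Fragments(video=400000000)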
The Smooth Streaming player maintains two TCP connections with the server. At any point in time, one of the two connections is used for transferring audio and the other
`for video fragments. Under certain conditions, however, the
`player switches the audio and video streams between the two
connections - it is not clear to us when or how the player makes this decision. This way, although at any point in time one
`connection is transferring video fragments, over the course of
`streaming, both connections get the chance to transfer video
`fragments. The benefit of such switching is that neither of
`the connections would stay idle for a long time, keeping the
`server from falling back to slow-start. Moreover, the two
`connections would maintain a large congestion window.
`Sometimes the player aborts a TCP connection and
`opens a new one - this probably happens when the former
`connection provides very low throughput. Also, when the
`user jumps to a different point in the stream, the player
`aborts the existing TCP connections, if they are not idle, and
`opens new connections to request the appropriate fragments.
`At that point, the contents of the playback buffer are
`flushed.
`In the following experiments we watch a sample video
`clip (“Big Buck Bunny”) provided by Microsoft at the IIS
`Web site:
`
`http://www.iis.net/media/experiencesmoothstreaming
`
`The manifest file declares eight video encoding bitrates
`between 0.35 Mbps and 2.75 Mbps and one audio encoding
`bitrate (64 Kbps). We represent an encoding bitrate of r
Mbps as Pr (e.g., P2.75). Each video fragment (except the
`
`We also estimate the playback buffer size at the player
`(measured in seconds), separately for audio and video.
`We can accurately estimate the playback buffer size for
`players that provide a timestamp (an offset value that
`indicates the location of the fragment in the stream) in
their HTTP fragment requests. Suppose that two successive, say video, requests are sent at times $t_1$ and $t_2$ ($t_1 < t_2$) with timestamps $t'_1$ and $t'_2$ ($t'_1 < t'_2$), respectively (all times measured in seconds). The playback buffer duration in seconds for video at time $t_2$ can then be estimated as:

$$B(t_2) = \left[ B(t_1) - (t_2 - t_1) + (t'_2 - t'_1) \right]^{+}$$

where $[x]^{+}$ denotes the maximum of $x$ and 0. This method
`works accurately because, as will be clear in the following
`sections, the player requests are not pipelined: a request for
`a new fragment is sent only after the previous fragment has
`been fully received.
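As a concrete illustration of this estimate (our own sketch; the list-of-pairs input format is an assumption), the update above can be applied to the sequence of request times and fragment timestamps observed in the trace:

# Sketch: playback-buffer estimate from successive fragment requests.
# Input: list of (t, t') pairs, where t is the wall-clock time a request was
# sent and t' is the media timestamp carried in that request (both in seconds).

def estimate_buffer(requests):
    sizes = []
    b = 0.0
    for i, (t, ts) in enumerate(requests):
        if i > 0:
            t_prev, ts_prev = requests[i - 1]
            # B(t2) = [ B(t1) - (t2 - t1) + (t'2 - t'1) ]+
            b = max(b - (t - t_prev) + (ts - ts_prev), 0.0)
        sizes.append(b)
    return sizes

# Example: requesting 2-s fragments every 0.5 s builds the buffer up quickly.
print(estimate_buffer([(0.0, 0.0), (0.5, 2.0), (1.0, 4.0)]))   # [0.0, 1.5, 3.0]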
`We test each player under the same set of avail-bw
`conditions and variations.
`In the first round of tests, we
`examine the behavior of a player when the avail-bw is not
`limited by DummyNet; this “blue-sky” test allows us to
`observe the player’s start-up and steady-state behavior -
`in the same experiments we also observe what happens
`when the user skips to a future point in the video clip.
`In the second round of tests, we apply persistent avail-bw
`variations (both increases and decreases) that last for tens
`of seconds. Such variations are common in practice when
`the cross traffic in the path’s bottleneck varies significantly
`due to arriving or departing traffic from other users. A
`good player should react to such variations by decreasing or
`increasing the requested bitrate. In the third round of tests,
`we apply positive and negative spikes in the path’s avail-bw
that last for just a few seconds - such variations are common in 802.11 WLANs, for instance. For such short-term drops,
`the player should be able to maintain a constant requested
`bitrate using its playback buffer. For short-term avail-bw
`increases, the player could be conservative and stay at its
`
`
`
`Figure 2: Interarrival and download times of video
`fragments under unrestricted avail-bw conditions.
`Each fragment is two seconds long.
`
`content). We estimated the target video playback buffer
`size, as described in Section 2, to be about 30 seconds.
`The time it takes to reach Steady-State depends on the
`avail-bw - as the avail-bw increases,
`it takes less time
`to accumulate the 30-second playback buffer. We have
`consistently observed that the player does not sacrifice
`quality, requesting low-bitrate encodings, to fill up its
`playback buffer sooner. Another interesting observation is
`that the player does not request a video bitrate whose frame
resolution (as declared in the manifest file) is larger than the
`resolution of the display window.
`
`3.2 Behavior of the Audio Stream
Audio fragments have the same duration as video fragments, at least in the movies we experimented with.
`Even though audio fragments are much smaller in bytes
`than video fragments, the Smooth Streaming player does
`not attempt to accumulate a larger audio playback buffer
`than the corresponding video buffer (around 30 s). Also,
`when the avail-bw drops, the player does not try to request
`audio fragments more frequently than video fragments (it
`would be able to do so). Overall, it appears that the Smooth
`Streaming player attempts to keep the audio and video
`stream download processes as much in sync as possible.
`
3.3 Behavior under Persistent avail-bw Variations
`In this section, we summarize a number of experiments
`in which the avail-bw goes through four significant and
`persistent transitions, as shown in Figure 3. First, note that,
`as expected, the per-fragment throughput is never higher
`than the avail-bw.
`Instead, the per-fragment throughput
`tracks quite well the avail-bw variations for most of the
`time; part of the avail-bw, however, is consumed by audio
`fragments and, more importantly, TCP throughput can vary
`significantly after packet loss events.
`We next focus on the requested video bitrate as the
`avail-bw changes. Initially, the avail-bw is 5 Mbps and the
`player requests the P2.04 profile because it is constrained by
`the resolution of the display window (if we were watching
`the video in full-screen mode, the player would request the
`highest P2.75 profile). The playback buffer (shown in Figure
`
`last) has the same duration: τ =2 s. The audio fragments
`are approximately of the same duration.
`
`3.1 Behavior under Unrestricted avail-bw
Figure 1 shows the various throughput metrics, considering only the video stream, in a typical experiment without restricting the avail-bw using DummyNet. t=0 corresponds to the time when the Wireshark capture starts.
`Note that the player starts from the lowest encoding bitrate
`and it quickly, within the first 5-10 seconds, climbs to the
`highest encoding bitrate. As the per-fragment throughput
`measurements indicate, the highest encoding bitrate (P2.75)
`is significantly lower than the avail-bw in the end-to-end
`path. The player upshifts to the highest encoding profile
`from the lowest one in four transitions. In other words, it
`seems that the player avoids large jumps in the requested
`bitrate (more than two successive bitrates) - the goal is
`probably to avoid annoying the user with sudden quality
`transitions, providing a dynamic but smooth watching
`experience.
`
Figure 1: Per-fragment throughput, average TCP throughput and the requested bitrate for video traffic under unrestricted avail-bw conditions. Playback starts at around t=5 s, almost 2 s after the user clicked Play.
`
`Another important observation is that during the initial
`time period, the player asks for video fragments much more
`frequently than once every τ seconds. Further analysis of
`the per-fragment interarrivals and download times shows
`that the player operates in one of two states: Buffering
`and Steady-State. In the former, the player requests a new
`fragment as soon as the previous fragment was downloaded.
`Note that the player does not use HTTP pipelining - it
`does not request a fragment if the previous fragment has
`not been fully received.
`In Steady-State, on the other
`hand, the player requests a new fragment either τ seconds
`after the previous fragment was requested (if it took less
`than τ seconds to download that fragment) or as soon
`as the previous fragment was received (otherwise).
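A minimal sketch of this request-pacing rule (our own reconstruction of the observed behavior, not Microsoft's code) is:

# Reconstruction of the observed fragment request pacing (illustrative only).
TAU = 2.0   # fragment duration in seconds

def next_request_time(prev_request_time, prev_download_done, buffering):
    if buffering:
        # Buffering state: request the next fragment immediately after the
        # previous one has been fully received (no pipelining).
        return prev_download_done
    # Steady-State: one request every TAU seconds, unless the previous
    # download took longer than TAU, in which case request right away.
    return max(prev_request_time + TAU, prev_download_done)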
`In
`other words,
`in the Buffering state the player aims to
`maximize its fragment request rate so that it can build
`up a target playback buffer as soon as possible.
`In
`Steady-State, the player aims to maintain a constant
`playback buffer, requesting one fragment every τ seconds
`(recall that each fragment corresponds to τ seconds of
`
`
`
Again, the switching delay indicates that the Smooth Streaming player is conservative, preferring to estimate the avail-bw reliably (using several per-fragment throughput
`measurements) instead of acting opportunistically based on
`the latest fragment throughput measurement.
`The avail-bw decrease at t=303 s is even larger (from
`5 Mbps to 1 Mbps) and the player reacts by adjusting
`the requested bitrate in four transitions. The requested
`bitrates are not always successive. After those transitions,
the requested bitrate converges to an appropriate value, P0.63,
`much less than the avail-bw. It is interesting that the player
`could have settled at the next higher bitrate (P0.84) - in that
`case, the aggregate throughput (including the audio stream)
`would be 0.94 Mbps. That is too close to the avail-bw (1
`Mbps), however. This implies that Smooth Streaming is
`conservative: it prefers to maintain a safety margin between
`the avail-bw and its requested bitrate. We think that this
`is wise, given that the video bitrate can vary significantly
`around its nominal encoding value due to the variable bitrate
`(VBR) nature of video compression.
`Another interesting observation is that the player avoids
`large transitions in the requested bitrate - such quality
`transitions can be annoying to the viewer. Also, the upward
`transitions are faster than the downward transitions - still,
`however, it can take several tens of seconds until the player
`has switched to the highest sustainable bitrate.
`
3.4 Behavior under Short-term avail-bw Variations
`In this section, we summarize a number of experiments
`in which the avail-bw goes through positive or negative
“spikes” that last for only a few seconds, as shown in Figures 5 and 7. The spikes last for 2 s, 5 s, and 10 s.
`Such short-term avail-bw variations are common in practice,
`especially in 802.11 WLAN networks. We think that a good
`adaptive player should be able to compensate for such spikes
`using its playback buffer, without causing short-term rate
`adaptations that can be annoying to the user.
`
`Figure 3: Per-fragment throughput, average TCP
`throughput and the requested bitrate for the
`video traffic under persistent avail-bw variations.
`Playback starts at around t=10 s, almost 3 s after
`the user clicked Play.
`
`Figure 4: Video playback buffer size in seconds
`under persistent avail-bw variations.
`
`Figure 5: Average TCP throughput and the
`requested bitrate for the video traffic under positive
`avail-bw spikes. Playback starts at around t=7 s,
`almost 4 s after the user clicked Play.
`
`Figure 5 shows the case of positive spikes. Here, we
`repeat the three spikes twice, each time with a different
`
`4) has reached its 30 s target by t=40 s and the player is in
`Steady-State.
`At time t=73 s, the avail-bw is dropped to 2 Mbps - that
`is not sufficient for the P2.04 encoding because we also need
`some capacity for the audio traffic and for various header
`overheads. The player reacts by switching to the next lower
`profile (P1.52) but after some significant delay (almost 25
`seconds). During that time period, the playback buffer
`has decreased by only 3 seconds (the decrease is not large
`because the avail-bw is just barely less than the cumulative
`requested traffic). The large reaction delay indicates that
`the player does not react to avail-bw changes based on the
`latest per-fragment throughput measurements. Instead, it
`averages those per-fragment measurements over a longer
`time period so that it acts based on a smoother estimate
`of the avail-bw variations. The playback buffer size returns
`to its 30 s target after the player has switched to the P1.52
`profile.
`The avail-bw increase at t=193 s is quickly followed by
`an appropriate increase in the requested encoding bitrate.
`
`
`
`Figure 6: Video playback buffer size in seconds
`under positive avail-bw spikes.
`
`Figure 8: Video playback buffer size in seconds
`under negative avail-bw spikes.
`
`4. NETFLIX PLAYER
`The Netflix player uses Microsoft’s Silverlight for media
`representation, but a different rate-adaptation logic. The
`Netflix player also maintains two TCP connections with
`the server, and it manages these TCP connections similarly
to the Smooth Streaming player. As will become clear,
`however, the Netflix player does not send audio and video
fragment requests at the same pace. Also, the formats of the manifest file and the requests are different. Further, most
`of the initial communication between the player and server,
`including the transfer of the manifest file, is done over SSL.
`We decrypted the manifest file using a Firefox plugin utility
`called Tamper Data that accesses the corresponding private
`key in Firefox. Video and audio fragments are delivered
`in wmv and wma formats, respectively. An example of a
`Netflix fragment request follows:
`
`GET /sa2/946/1876632946.wmv
`/range/2212059-2252058?token=1283923056
`_d6f6112068075f1fb60cc48eab59ea55&random
`=1799513140 HTTP/1.1
`
`Netflix requests do not correspond to a certain time
`duration of audio or video. Instead, each request specifies
`a range of bytes in a particular encoding profile. Thus,
`we cannot estimate the playback buffer size as described
`in Section 2. We can only approximate that buffer size
`assuming that the actual encoding rate for each fragment
`is equal to the corresponding nominal bitrate for that
`fragment (e.g., a range of 8 Mb at the P1.00 encoding profile
`corresponds to 8 seconds worth of video) - obviously this is
`only an approximation but it gives us a rough estimate of
`the playback buffer size.
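For illustration (our own sketch, under exactly the nominal-bitrate assumption stated above), a byte range can be converted to seconds of media as follows; the profile used for the example request is hypothetical:

# Sketch: approximate media duration of a byte-range request, assuming the
# range is encoded exactly at the profile's nominal bitrate.

def range_duration_seconds(first_byte, last_byte, nominal_bitrate_mbps):
    range_bits = (last_byte - first_byte + 1) * 8
    return range_bits / (nominal_bitrate_mbps * 1e6)

# The byte range from the example request, if it belonged to the P1.00
# profile (hypothetical), would cover roughly 0.32 s of media:
print(range_duration_seconds(2212059, 2252058, 1.00))   # ~0.32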
`After the user clicks the Play button, the player starts
`by performing some TCP transfers, probably to measure
`the capacity of the underlying path.
`Then it starts
`buffering audio and video fragments, but without starting
`the playback yet. The playback starts either after a certain
`number of seconds, or when the buffer size reaches a target
`point. If that buffer is depleted at some point, the Netflix
`player prefers to stop the playback, showing a message that
`
`increase magnitude. The Smooth Streaming player ignores
`the 2-second spikes and the smaller 5-second spike. On the
`other hand, it reacts to the 10-second spikes by increasing
the requested video bitrate. Unfortunately, it does so too late (sometimes after the end of the spike) and for too long (until almost 40 s after the end of the spike). During the
`time periods that the requested bitrate is higher than the
`avail-bw, the playback buffer size obviously shrinks, making
`the player more vulnerable to freeze events (See Figure 6).
`This experiment confirms that the player reacts, not to the
`latest fragment download throughput, but to a smoothed
`estimate of those measurements that can be unrelated to
`the current avail-bw conditions.
`Figures 7 and 8 show similar results in the case of negative
`spikes. Here, the spikes reduce the avail-bw from 2 Mbps
`to 1 Mbps. The player reacts to all three spikes, even the
`spike that lasts for only 2 s. Unfortunately, the player reacts
`too late and for too long:
`it requests a lower bitrate after
`the end of each negative spike and it stays at that lower
bitrate for 40-80 s. During those periods, the user would
`unnecessarily experience a lower video quality.
`
`Figure 7: Average TCP throughput and the
`requested bitrate for the video traffic under negative
`avail-bw spikes. Playback starts at around t=9 s,
`almost 3 s after the user clicked Play.
`
`
`
`Figure 10: Interarrival and download times of video
`fragments under unrestricted avail-bw conditions.
`
`4.2 Behavior of the Audio Stream
`Audio fragments in the Netflix player are significantly
`larger than the ones in Smooth Streaming. Specifically,
`an audio fragment is typically 30 s long. Thus, after the
`player has reached Steady-State, a new audio fragment
is requested every 30 s. Further, it appears that this player does not attempt to keep the audio and video stream download processes in sync; the audio playback buffer can become significantly larger than the video playback buffer.
`
4.3 Behavior under Persistent avail-bw Variations
`Figure 11 shows the various throughput-related metrics
`in the case of persistent avail-bw variations. As in the
experiment with unrestricted avail-bw, the player first requests a few fragments at all possible