`Standards and Design Principles
`Thomas Stockhammer
`Qualcomm Incorporated
`c/o Nomor Research
`Brecherspitzstraße 8
`81541 Munich, Germany
`+49 89 978980 02
`stockhammer@nomor.de
`
`ABSTRACT
In this paper, we provide some insight and background into the
Dynamic Adaptive Streaming over HTTP (DASH) specifications
as available from 3GPP and in draft version also from MPEG.
Specifically, the 3GPP version provides a normative description
of a Media Presentation, the formats of a Segment, and the
delivery protocol. In addition, it adds an informative description of
how a DASH Client may use the provided information to establish
a streaming service for the user. The solution supports different
service types (e.g., On-Demand, Live, Time-Shift Viewing),
different features (e.g., adaptive bitrate switching, multiple
language support, ad insertion, trick modes, DRM) and different
deployment options. Design principles and examples are provided.
`Categories and Subject Descriptors
`H.4.m [Information Systems Applications]: Miscellaneous.
`General Terms
`Standardization.
`Keywords
`3GPP, video, mobile video, standards, streaming.
`1. INTRODUCTION
`Internet access is becoming a commodity on mobile devices. With
`the recent popularity of smart phones, smartbooks, connected
`netbooks and laptops the Mobile Internet use is dramatically
`expanding. According to recent studies [7], expectations are that
`between 2009 and 2014 the mobile data traffic will grow by a
`factor of 40, i.e., it will more than double every year. Figure 1
`shows that the video traffic will by then account for 66% of the
`total amount of the mobile data. At the same time mobile users
`expect high-quality video experience in terms of video quality,
`start-up time, reactivity to user interaction, trick mode support,
etc., and the whole ecosystem, including content providers,
network operators, service providers, device manufacturers and
technology providers, needs to ensure that these demands can be
met. Affordable and mature technologies are required to fulfil the
users’ quality expectations. One step in this direction is a
common, efficient and flexible distribution platform that scales to the
rising demands. Standardized components are expected to support
the creation of such common distribution platforms.

Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
MMSys’11, February 23–25, 2011, San Jose, California, USA.
Copyright 2011 ACM 978-1-4503-0517-4/11/02...$10.00.
`
`Figure 1 Video Will Account for 66 Percent of Global Mobile
`Data Traffic by 2014 (Source [7], Figure 2)
`Traditional streaming generally uses a stateful protocol, e.g., the
`Real-Time Streaming Protocol (RTSP): Once a client connects to
`the streaming server the server keeps track of the client's state
until the client disconnects again. Typically, the client and the
server communicate frequently. Once a session
`between the client and the server has been established, the server
`sends the media as a continuous stream of packets over either
`UDP or TCP transport. In contrast, HTTP is stateless. If an HTTP
`client requests some data, the server responds by sending the data
`and the transaction is terminated. Each HTTP request is handled
`as a completely standalone one-time transaction.
As an alternative to streaming, progressive download may be used for
media delivery from standard HTTP Web servers. Clients that
support HTTP can seek to positions in the media file by performing
byte range requests to the Web server (assuming that it also
supports HTTP/1.1 [4]). The main disadvantages of progressive
download are that (i) bandwidth may be wasted if the user decides to
stop watching the content after progressive download has started
(e.g., by switching to other content), (ii) it is not really bitrate
adaptive, and (iii) it does not support live media services. Dynamic
Adaptive Streaming over HTTP (DASH) addresses the weaknesses
of RTP/RTSP-based streaming and progressive download.
`
`2. DESIGN PRINCIPLES
`HTTP-based progressive download does have significant market
`adoption. Therefore, HTTP-based streaming should be as closely
`aligned to HTTP-based progressive download as possible, but
`take into account the above-mentioned deficiencies.
`
`Figure 2 Example Media Distribution Architecture
`Figure 2 shows a possible media distribution architecture for
`HTTP-based streaming. The media preparation process typically
`generates segments that contain different encoded versions of one
`or several of the media components of the media content. The
segments are then typically hosted on one or several media origin
servers, along with the media presentation description (MPD).
`The media origin server is preferably an HTTP server such that
`any communication with the server is HTTP-based (indicated by a
`bold line in the picture). Based on this MPD metadata information
`that describes the relation of the segments and how they form a
`media presentation, clients request the segments using HTTP GET
`or partial GET methods. The client fully controls the streaming
`session, i.e., it manages the on-time request and smooth playout of
`the sequence of segments, potentially adjusting bitrates or other
`attributes, for example to react to changes of the device state or
`the user preferences.
`Massively scalable media distribution requires the availability of
`server farms to handle the connections to all individual clients.
`HTTP-based Content Distribution Networks (CDNs) have suc-
`cessfully been used to serve Web pages, offloading origin servers
`and reducing download latency. Such systems generally consist of
`a distributed set of caching Web proxies and a set of request
redirectors. Given the scale, coverage, and reliability of HTTP-based
CDN systems, it is appealing to use them as a base to launch
streaming services that build on this existing infrastructure. This
can reduce capital and operational expenses, and reduces or
eliminates decisions about resource provisioning on the nodes. This
`principle is indicated in Figure 2 by the intermediate HTTP serv-
`ers/caches/proxies. Scalability, reliability, and proximity to the
`user’s location and high-availability are provided by general-
`purpose servers. The reasons that lead to the choice of HTTP as
`the delivery protocol for streaming services are summarized be-
`low:
1. HTTP streaming is spreading widely as a form of delivery of
Internet video.
2. There is a clear trend towards using HTTP as the main protocol
for multimedia delivery over the Open Internet.
3. HTTP-based delivery enables easy and effortless streaming
services by avoiding NAT and firewall traversal issues.
4. HTTP-based delivery provides reliability and deployment
simplicity, as HTTP and the underlying TCP/IP protocols
are widely implemented and deployed.
`5. HTTP-based delivery provides the ability to use standard
`HTTP servers and standard HTTP caches (or cheap servers in
`general) to deliver the content, so that it can be delivered
`from a CDN or any other standard server farm.
`6. HTTP-based delivery provides the ability to move control of
`“streaming session” entirely to the client. The client basically
`only opens one or several or many TCP connections to one or
`several standard HTTP servers or caches.
`7. HTTP-based delivery provides the ability to the client to
`automatically choose initial content rate to match initial
`available bandwidth without requiring the negotiation with
`the streaming server.
`8. HTTP-based delivery provides a simple means to seamlessly
`change content rate on-the-fly in reaction to changes in avail-
`able bandwidth, within a given content or service, without
`requiring negotiation with the streaming server.
`9. HTTP-based streaming has the potential to accelerate fixed-
`mobile convergence of video streaming services as HTTP-
`based CDN can be used as a common delivery platform.
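Items 7 and 8 describe rate selection that happens entirely in the client. A minimal sketch of such a decision is given below, assuming that the client knows the advertised bitrate of each alternative and maintains its own throughput estimate. The function name, the safety margin of 0.8 and the bitrate values are illustrative assumptions and are not taken from any specification.

```python
def select_bitrate(available_bps, throughput_bps, safety=0.8):
    """Return the highest advertised bitrate that fits the bandwidth budget.

    The budget is safety * throughput_bps (the margin is an assumption).
    Falls back to the lowest bitrate when no alternative fits.
    """
    budget = throughput_bps * safety
    candidates = [b for b in available_bps if b <= budget]
    return max(candidates) if candidates else min(available_bps)


bitrates = [250_000, 500_000, 1_000_000, 2_000_000]
print(select_bitrate(bitrates, 1_500_000))  # budget is 1.2 Mbit/s
print(select_bitrate(bitrates, 200_000))    # nothing fits, lowest is used
```

Because the decision only needs local information, it can be re-evaluated before every segment request without any negotiation with the server.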
Based on these considerations, 3GPP identified the need to
provide a specification for a scalable and flexible video distribution
solution that addresses mobile networks, but is not restricted
`to 3GPP radio access networks (RANs). 3GPP has taken the
`initiative to specify an Adaptive HTTP Streaming solution in
`addition to the already existing RTP/RTSP-based streaming solu-
`tions and the HTTP-based progressive download solution.
Specifically, the solution is designed
• to support delivery of media components encapsulated in
ISO base media file format box structure,
• to address delivery, whereas presentation, annotation and user
interaction are largely out of scope,
• to permit integration in different presentation frameworks.
`The 3GPP sub-group SA4 working on codecs and protocols for
`media delivery started the HTTP streaming activity in April 2009
`and completed the Release-9 specification work early March
`2010. The 3GPP Adaptive HTTP Streaming (AHS) has been
`integrated into 3GPP Transparent end-to-end Packet-switched
`Streaming Service (PSS). Specifically, 3GPP TS 26.234 [1] (PSS
`Codecs and Protocols) clause 12 specifies the 3GPP Adaptive
`HTTP Streaming solution, and 3GPP TS 26.244 [2] (3GP File
`Format) clauses 5.4.9, 5.4.10, and 13 specify the encapsulation
`formats for segments. The Release-9 work is now under mainte-
`nance mode and some minor bug fixes and clarifications were
`agreed during the year 2010 and have been integrated into the
`latest versions of 3GPP TS 26.234 and 3GPP TS 26.244.
The solution supports features such as
• fast initial startup and seeking,
• bandwidth-efficiency,
• adaptive bitrate switching,
• adaptation to CDN properties,
• re-use of HTTP-server and caches,
• re-use of existing media playout engines,
• support for on-demand, live and time-shift delivery services,
• simplicity for broad adoption.

`This approach makes the framework defined in 3GPP extensible,
`for example to any other segment formats, codecs and DRM
`solutions.
3GP-DASH supports multiple services, among others:
• On-demand streaming,
• Linear TV including live media broadcast,
• Time-shift viewing with network Personal Video Recording
(PVR) functionalities.
`
`Specific care was taken in the design that the network side can be
`deployed on standard HTTP servers and distribution can be pro-
`vided through regular Web infrastructures such as HTTP-based
`CDNs. The specification also leaves room for different serv-
`er/network-side deployment options as well as for optimized
`client implementations.
The specification also defines provisions to support features such
as
• Initial selection of client- and/or user-specific representations
of the content,
• Dynamic adaptation of the played content to react to
environmental changes such as access bandwidth or processing
power,
• Trick modes such as seeking, fast forward or rewind,
• Simple insertion of pre-encoded advertisement or other
content in on-demand and live streaming services,
• Efficient delivery of multiple languages and audio tracks,
• Content protection and content security, etc.
`
`The remainder of this section provides further background infor-
`mation on the concept of a Media Presentation, the usage of
`HTTP, as well as segment types and formats in the 3GPP instanti-
`ation. A summary of the normative specification is also provided.
`3.2 Media Presentation
`The concept of a Media Presentation is introduced in TS 26.234
`[1], clause 12.2. A Media Presentation is a structured collection of
`encoded data of some media content, e.g., a movie or a program.
`The data is accessible to the DASH Client to provide a streaming
`service to the user. As shown in Figure 4:
• A Media Presentation consists of a sequence of one or more
consecutive non-overlapping Periods.
• Each Period contains one or more Representations from the
same media content.
• Each Representation consists of one or more Segments.
• Segments contain media data and/or metadata to decode and
present the included media content.
Period boundaries permit changing a significant amount of
information within a Media Presentation such as server location,
encoding parameters, or the available variants of the content. The
Period concept has been introduced, among other reasons, for splicing
of new content, such as ads, and for logical content segmentation.
Each Period is assigned a start time, relative to the start of the Media
Presentation.
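The hierarchy described above can be sketched as a small in-memory data model. This is only an illustration of the containment relations; the class and field names are assumptions chosen for readability and do not reproduce the element or attribute names of the MPD schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    url: str  # a Segment is referenced by an HTTP-URL


@dataclass
class Representation:
    rep_id: str
    bandwidth_bps: int
    segments: List[Segment] = field(default_factory=list)


@dataclass
class Period:
    start_s: float  # start time relative to the start of the Media Presentation
    representations: List[Representation] = field(default_factory=list)


@dataclass
class MediaPresentation:
    periods: List[Period] = field(default_factory=list)

    def period_at(self, t_s: float) -> Optional[Period]:
        """Return the Period with the latest start time not after t_s."""
        current = None
        for p in sorted(self.periods, key=lambda p: p.start_s):
            if p.start_s <= t_s:
                current = p
        return current


# Hypothetical presentation: main content, with an ad spliced in at 600 s.
main = Period(0.0, [Representation("v1", 500_000, [Segment("http://example.com/v1/s1")])])
ad = Period(600.0, [Representation("ad1", 500_000, [Segment("http://example.com/ad/s1")])])
mp = MediaPresentation([main, ad])
print(mp.period_at(610.0).representations[0].rep_id)
```

Since Periods are consecutive and non-overlapping, a start time per Period is enough to locate the Period that covers a given presentation time.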
`
3GPP has also sought alignment with other organizations and
industry fora that work in the area of video distribution. For
example, the Open IPTV Forum (OIPF) based its HTTP Adaptive
Streaming (HAS) solution [13] on 3GPP. 3GPP recently also
addressed certain OIPF requirements and integrated appropriate
features in the Release-9 3GPP Adaptive HTTP Streaming
specification. MPEG’s draft DASH solution is also heavily based on
3GPP’s AHS. Finally, 3GPP has ongoing work in Release-10,
now also referred to as DASH. This work will extend the Release-9
3GPP AHS specification in a backward-compatible way. Close
coordination with the ongoing MPEG DASH activities is in place.
`3. 3GPP Adaptive HTTP Streaming
`3.1 Overview
3GPP Adaptive HTTP Streaming, since Release-10 referred to as
3GP-DASH, is the result of a standardization activity in 3GPP
SA4. Figure 3 shows the principle of the 3GP-DASH specification.
The specification provides
• a normative definition of a Media Presentation, with Media
Presentation defined as a structured collection of data that is
accessible to the DASH Client through Media Presentation
Description,
• a normative definition of the formats of a Segment, with a
Segment defined as an integral data unit of a media presentation
that can be uniquely referenced by an HTTP-URL (possibly
restricted by a byte range),
• a normative definition of the delivery protocol used for the
delivery of Segments, namely HTTP/1.1,
• an informative description of how a DASH client may use
the provided information to establish a streaming service for
the user.
`
`Figure 3 Solution overview – 3GP-DASH
`
`DASH in 3GPP is defined in two levels:
`1. Clause 12.2 in TS 26.234 [1] provides a generic frame-
`work for Dynamic Adaptive Streaming independent of
`the data encapsulation format for media segments.
`2. Clause 12.4 in TS 26.234 [1] provides a specific instan-
`tiation of this framework with the 3GP/ISO base media
`file format by specifying the segment formats, partly re-
`ferring to the formats in TS 26.244 [2].
`
`
`ble Segments and their timing. The MPD is a well-formatted
`XML document and the 3GPP Adaptive HTTP Streaming specifi-
`cation defines an XML schema to define MPDs. An MPD may be
`updated in specific ways such that an update is consistent with the
`previous instance of the MPD for any past media. A graphical
`presentation of the XML schema is provided in Figure 5. The
`mapping of the data model to the XML schema is highlighted. For
`the details of the individual attributes and elements please refer to
`TS 26.234 [1], clause 12.2.5.
`
`Figure 5 MPD XML-Schema
DASH also supports live streaming services. In this case, the
generation of segments typically happens on-the-fly. As a result,
clients typically have access to only a subset of the Segments, i.e.,
the most recent MPD describes a time window of accessible
Segments for this instant in time. By providing updates of the
MPD, the server may describe new Segments and/or new Periods
such that the updated MPD is compatible with the previous MPD.
Therefore, for live streaming services a Media Presentation is
typically described by the initial MPD and all MPD updates. To
ensure synchronization between client and server, the MPD
provides access information in Coordinated Universal Time (UTC). As
long as server and client are synchronized to UTC, the
synchronization between server and client can be ensured by the use
of the UTC times in the MPD.
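The time window of accessible Segments can be illustrated with a short calculation. The sketch below assumes a constant segment duration, a known UTC start time of the live service, a fixed window depth, and that a Segment becomes accessible once it has been completely produced. All parameter names are hypothetical and are not attributes of the MPD schema.

```python
from datetime import datetime, timedelta, timezone


def available_segment_indices(start_utc, now_utc, segment_s, window_s):
    """Return (first, last) indices of Segments assumed accessible at now_utc.

    Segment i covers [i * segment_s, (i + 1) * segment_s) after start_utc.
    Returns None while no Segment has been completed yet.
    """
    elapsed = (now_utc - start_utc).total_seconds()
    complete = int(elapsed // segment_s)  # number of fully produced Segments
    if complete <= 0:
        return None
    last = complete - 1
    first = max(0, complete - int(window_s // segment_s))
    return first, last


start = datetime(2011, 2, 23, 12, 0, 0, tzinfo=timezone.utc)
now = start + timedelta(seconds=125)
print(available_segment_indices(start, now, segment_s=10, window_s=60))
```

The calculation only works if client and server share the same notion of time, which is why the access information is expressed in UTC.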
`Time-shift viewing and network PVR functionality are supported
`in a straightforward manner, as segments may be accessible on the
`network over a long period of time.
`3.3 Usage of HTTP
The 3GPP DASH specification is written such that it enables
delivering content from standard HTTP servers to an HTTP-Streaming
client and enables caching content by standard HTTP
caches. Therefore, the streaming server and streaming client
comply with HTTP/1.1 as specified in RFC 2616 [4], and HTTP-Streaming
Clients are expected to use the HTTP GET method or
the partial GET method for downloading media segments. No
further details on caches and proxies are specified, as they are
transparent to the protocol.
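In HTTP/1.1 a partial GET is an ordinary GET request that carries a Range header, to which a server that honours the range replies with status 206 (Partial Content). The sketch below only builds such requests with the Python standard library and does not send them. The URL and the helper names are illustrative assumptions.

```python
import urllib.request


def range_header(first_byte, last_byte=None):
    """Build the value of an HTTP/1.1 Range header for a partial GET."""
    if last_byte is None:
        return f"bytes={first_byte}-"         # open-ended: up to the end of the resource
    if last_byte < first_byte:
        raise ValueError("last_byte must not precede first_byte")
    return f"bytes={first_byte}-{last_byte}"  # both byte positions are inclusive


def build_segment_request(url, first_byte=None, last_byte=None):
    """Return a Request: a plain GET, or a partial GET when a range is given."""
    req = urllib.request.Request(url, method="GET")
    if first_byte is not None:
        req.add_header("Range", range_header(first_byte, last_byte))
    return req


req = build_segment_request("http://example.com/rep1/seg3.3gp", 0, 1023)
print(req.get_method(), req.get_header("Range"))
```

Both request forms are plain HTTP/1.1 requests, so caches and proxies on the path need no knowledge of the streaming service.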
`3.4 Segments based on 3GP File Format
`Beyond the general adaptive streaming framework, 3GPP DASH
`specifies an instantiation that uses segment formats based on the
3GP file format as specified in TS 26.244 [2]. Each Representation
may either consist of
• one Initialisation Segment and at least one Media Segment,
but typically a sequence of Media Segments, or
• one self-Initialising Media Segment.
`
`Figure 4 Media Presentation Data Model
`Each Period itself consists of one or more Representations. A
`Representation is one of the alternative choices of the media
`content or a subset thereof typically differing by the encoding
`choice, e.g., by bitrate, resolution, language, or codecs.
`Each Representation includes one or more media components,
`where each media component is an encoded version of one indi-
`vidual media type such as audio, video or timed text. Each Repre-
`sentation is assigned to a group. Representations in the same
`group are alternatives to each other. The media content within one
`Period is represented by either one Representation from group
`zero, or the combination of at most one Representation from each
`non-zero group.
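The group rule can be read as a simple validity check on the set of Representations that a client selects within one Period. The sketch below encodes one reading of that rule, namely that a selection is either exactly one Representation from group zero or a non-empty set with no two Representations from the same non-zero group. The function name and the input format are assumptions.

```python
from collections import Counter


def is_valid_selection(groups):
    """Check a list of group numbers, one entry per selected Representation."""
    if not groups:
        return False
    if 0 in groups:
        return groups == [0]  # exactly one Representation, taken from group zero
    counts = Counter(groups)
    return all(n == 1 for n in counts.values())  # at most one per non-zero group


print(is_valid_selection([0]))     # a single complete Representation
print(is_valid_selection([1, 2]))  # e.g. one video plus one audio alternative
print(is_valid_selection([1, 1]))  # two alternatives from the same group
```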
`A Representation consists of at most one Initialisation Segment
`and one or more Media Segments. Media components are time-
`continuous across boundaries of consecutive Media Segments
`within one Representation. Segments represent a unit that can be
`uniquely referenced by an HTTP-URL (possibly restricted by a
`byte range). Thereby, the Initialisation Segment contains infor-
`mation for accessing the Representation, but no media data. Me-
`dia Segments contain media data and must fulfil some further
`requirements, namely:
• Each Media Segment is assigned a start time in the media
presentation to enable downloading the appropriate Segments
in regular play-out mode or after seeking. This time is
generally not accurate media playback time, but only approximate
such that the client can make appropriate decisions on when
to download the Segment such that it is available in time for
play-out.
• Media Segments may provide random access information,
i.e., presence, location and timing of Random Access Points
(RAPs).
• A Media Segment, when considered in conjunction with the
information and structure of the MPD, contains sufficient
information to time-accurately present each contained media
component in the Representation without accessing any
previous Media Segment in this Representation provided that
the Media Segment contains a RAP. The time-accuracy enables
seamlessly switching Representations and jointly presenting
multiple Representations.
• Media Segments may also contain information for randomly
accessing subsets of the Segment by using partial HTTP
GET requests.
`
`A Media Presentation is described in a Media Presentation De-
`scription (MPD), and MPDs may be updated during the lifetime
`of a Media Presentation. In particular, the MPD describes accessi-
`
`
`An Initialisation Segment provides the client with the metadata
`that describes the media content and is basically a file conformant
`with the 3GPP file format without any media data. An Initialisa-
`tion Segment consists of the “ftyp” box, the “moov” box, and
`optionally the “pdin” box. The “moov” box contains no samples.
`This reduces the start-up time significantly as the Initialisation
`Segment needs to be downloaded before any Media Segment can
`be processed, but may be downloaded asynchronously before the
`Media Segments.
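In the ISO base media file format every box starts with a 32-bit big-endian size (which includes the header) followed by a four-character type; a size of 1 signals that a 64-bit size follows, and a size of 0 means that the box extends to the end of the data. Under these rules, the top-level structure of an Initialisation Segment can be checked as sketched below. The check is purely structural, it assumes the box order 'ftyp', optional 'pdin', 'moov', and it does not verify that the 'moov' box is free of samples. All function names are illustrative.

```python
import struct


def iter_boxes(data):
    """Yield (type, payload) for each top-level box in a bytes object."""
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:    # a 64-bit size follows the type field
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:  # the box extends to the end of the data
            size = len(data) - pos
        if size < header or pos + size > len(data):
            raise ValueError("malformed box")
        yield box_type.decode("ascii"), data[pos + header:pos + size]
        pos += size


def looks_like_init_segment(data):
    """True if the top-level boxes are 'ftyp', optional 'pdin', then 'moov'."""
    types = [t for t, _ in iter_boxes(data)]
    return types in (["ftyp", "moov"], ["ftyp", "pdin", "moov"])


def make_box(box_type, payload=b""):
    """Demo helper: serialise a box with a 32-bit size field."""
    return struct.pack(">I4s", 8 + len(payload), box_type.encode("ascii")) + payload


init = make_box("ftyp", b"3gp6" + bytes(4)) + make_box("moov")
media = make_box("moof") + make_box("mdat", b"xx")
print(looks_like_init_segment(init), looks_like_init_segment(media))
```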
`Self-Initialising Media Segments comply with the 3GP Adaptive-
`Streaming Profile as specified in TS 26.244, clause 5.4.9. For
`3GP files conforming to this profile the ‘moov’ box is in the
`beginning of the file after the ‘ftyp’ and a possibly present ‘pdin’
`box, all movie data is contained in Movie Fragments, and the
`‘moov’ box is followed by one or more ‘moof’ and ‘mdat’ box
`pairs. In addition 3GP files conforming to this profile may contain
`any of the new boxes specified in TS 26.244 [2], clause 13, name-
`ly the segment type (‘styp’) box, the track fragment adjustment
`(‘tfad’) box and the segment index (‘sidx’) box. Self-Initialising
`media segments are assigned start time 0 relative to the Period
`start time, so no additional information is necessary for each
`segment. However, the additional boxes ‘tfad’ and ‘sidx’ may be
`used for accurate timing of each component within the 3GP file
`after random access and seeking within the 3GP file. More details
`on the boxes for DASH are provided further below.
`A Media Segment may start with a ‘styp’ box and a sequence of
`one or more whole self-contained movie fragments. The ‘styp’
`box is used for file branding of segments. In addition, each ‘traf’
`box may contain a ‘tfad’ box for track alignment to permit ran-
`dom access to the start of the segment or any fragment within the
`segment. Furthermore, each Media Segment may contain one or
`more ‘sidx’ boxes. The ‘sidx’ provides global timing for each
`contained track, time/byte range offsets of the contained movie
`fragments, as well as time offsets of random access points, if any.
`Note that the codecs in 3GPP AHS are identical to 3GPP PSS
`codecs as specified in TS 26.234, clause 7.2 for speech, 7.3 for
`audio, 7.4 for video and 7.9 for timed text. However, there is no
`restriction for use of any other codecs as long as the codecs can be
`encapsulated in ISO base media file format.
`3.5 Segment Indexing
`Segment Indexing is an important concept to permit byte range
`access to subsets of segments. This permits fast access to sub-
`structures in the segment for fast switching, simple random ac-
`cess, etc. Each segment index ‘sidx’ box documents a sub-
`segment defined as one or more consecutive movie fragments,
`ending either at the end of the containing segment, or at the be-
`ginning of a subsegment documented by the next ‘sidx’.
The ‘sidx’ contains timing information to place the segment into
the global time line of the media presentation. In addition, it contains
a loop that provides an index of the subsegment, i.e., the duration
of each sub-segment and the offset from the first byte following
the enclosing ‘sidx’ to the first byte of the referenced box.
`By downloading only a small portion in the beginning of the
`media segment, e.g., by using a byte range request, the segment
`index boxes may be fetched. Segment index boxes may be used
`for several purposes:
1. To provide a mapping of each track contained in the
media segment to the media presentation timeline, such
that synchronous playout of media components within
and across Representations is enabled as well as to
permit switching across Representations.
2. To enable fast navigation through segments, possibly
using byte range requests and to minimize the download
of media data during a seeking process.
3. To locate the position of random access points within
segments without downloading unnecessary media data
before the random access point.
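The second purpose, navigation by byte range, can be sketched with a simplified in-memory index. The example assumes that the index has already been parsed into a list of (duration, size) pairs in presentation order and that offsets are counted from the first byte after the index. It does not reproduce the binary syntax of the ‘sidx’ box, and all names and numbers are illustrative.

```python
def byte_range_for_time(entries, seek_s, first_offset=0):
    """Map a seek time to the byte range of the sub-segment that contains it.

    entries: list of (duration_s, size_bytes) in presentation order.
    first_offset: position of the first byte after the index (an assumption).
    Returns (first_byte, last_byte, subsegment_start_s), or None if the time
    lies outside the indexed range.
    """
    t, offset = 0.0, first_offset
    for duration_s, size_bytes in entries:
        if t <= seek_s < t + duration_s:
            return offset, offset + size_bytes - 1, t
        t += duration_s
        offset += size_bytes
    return None


index = [(2.0, 40_000), (2.0, 55_000), (2.0, 38_000)]
print(byte_range_for_time(index, 3.1, first_offset=120))
```

The returned byte positions can be used directly in a partial GET, so that a seek downloads only the sub-segment that contains the target time.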
`
`Figure 6 shows a simple example of a single segment index. In
`this case, the first loop of the segment index box provides the
`exact timing for each media component starting in fragment F1.
`Furthermore, the second loop provides time-byte offset infor-
`mation as well as random access information for selected frag-
`ments, namely fragment F1, F3, and F5.
`
Figure 6 A Simple Example of Segment Index (Legend: S, in
yellow, is a segment index box; F, in blue, is a movie fragment
with its data; the red arrows above show the fragment
documented by the first loop; the blue arrows below show the
second-loop time-byte index pointers)
`As Segments may be of very different size, the first ‘sidx’ box
`may or may not describe all details of the following movie frag-
`ments in this segment. To avoid large ‘sidx’ boxes in case of large
`segments, the information may be provided in a nested manner, in
`such a way that ‘sidx’ boxes may reference not only the start of a
`movie fragment, but also other ‘sidx’ boxes.
`
`Figure 7 Nested Segment Indices (same legend as in Figure 6)
`Several examples for nested segment indices are provided in
`Figure 7. In the ‘hierarchical’ case, the first loop for S1 points to
`the first movie fragment, but the first pointer in the second loop
`for S1 references a ‘sidx’ box. Also, any other pointer in the
`second loop for S1 references a ‘sidx’ box. Only the second level
`(S2/3/4) points directly to movie fragments. Therefore, with
`downloading the ‘sidx’ box S1, fast coarse navigation through the
`segment is enabled. Further refinements are done in the lower
`level. In the daisy chain case, S1 and S3 reference both movie
`fragments and other ‘sidx’ boxes: Such a construction enables fast
`
`
`ments at the appropriate times and provide the media player with
`data for best user experience.
`
`navigation to the initial fragments of the segments, whereas later
`fragments may require some sequential resolution. The syntax is
`flexible enough to also support any hybrids of strictly hierarchical
`and daisy-chain constructions.
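The difference between the hierarchical and the daisy-chain construction can be illustrated by counting how many index boxes a client has to read before it reaches media data. The sketch below uses a simplified dictionary model in which each entry references either media (as a byte range) or another index. The data layout, the identifiers and the lookup callable, which stands in for a partial GET followed by parsing, are all assumptions made for this illustration.

```python
def resolve(index, seek_s, fetch_index):
    """Follow index references until an entry points at media data.

    Returns (byte_range, number_of_indices_read), or None if seek_s is
    not covered by the index.
    """
    reads = 1
    while True:
        hit = next((e for e in index if e["start_s"] <= seek_s < e["end_s"]), None)
        if hit is None:
            return None
        if hit["kind"] == "media":
            return hit["ref"], reads
        index = fetch_index(hit["ref"])  # descend into the referenced index
        reads += 1


# Hierarchical layout: the top-level index references only other indices.
hier = {
    "S1": [{"start_s": 0, "end_s": 4, "kind": "index", "ref": "S2"},
           {"start_s": 4, "end_s": 8, "kind": "index", "ref": "S3"}],
    "S2": [{"start_s": 0, "end_s": 2, "kind": "media", "ref": (0, 99)},
           {"start_s": 2, "end_s": 4, "kind": "media", "ref": (100, 199)}],
    "S3": [{"start_s": 4, "end_s": 6, "kind": "media", "ref": (200, 299)},
           {"start_s": 6, "end_s": 8, "kind": "media", "ref": (300, 399)}],
}
# Daisy chain: each index references one fragment and the next index.
chain = {
    "D1": [{"start_s": 0, "end_s": 2, "kind": "media", "ref": (0, 99)},
           {"start_s": 2, "end_s": 8, "kind": "index", "ref": "D2"}],
    "D2": [{"start_s": 2, "end_s": 4, "kind": "media", "ref": (100, 199)},
           {"start_s": 4, "end_s": 8, "kind": "index", "ref": "D3"}],
    "D3": [{"start_s": 4, "end_s": 6, "kind": "media", "ref": (200, 299)},
           {"start_s": 6, "end_s": 8, "kind": "media", "ref": (300, 399)}],
}
print(resolve(hier["S1"], 7.0, hier.__getitem__))
print(resolve(chain["D1"], 0.5, chain.__getitem__))
print(resolve(chain["D1"], 7.0, chain.__getitem__))
```

In this toy layout the hierarchical index reaches any fragment after two reads, whereas the daisy chain reaches the first fragment after one read and the last fragment only after three.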
`3.6 Summary
`The 3GP-DASH specification provides a universal and flexible
`solution for Adaptive HTTP Streaming. The solution is based on
`existing technologies, including codecs, encapsulation formats,
`content protection and delivery protocols. 3GP-DASH focuses on
`the specification of the interface between standard HTTP servers
`that host media presentations and the HTTP Streaming client.
Specifically, 3GPP AHS specifies
• syntax and semantics of Media Presentation Description,
• format of Segments, and
• the delivery protocol for segments.
`
`3GP-DASH also provides an informative overview on how a
`DASH Client may use the provided information to establish a
`streaming service to the user.
However, 3GP-DASH permits flexible configurations to address
different use cases and delivery scenarios. Among others,
3GP-DASH does not specify
• Details on content provisioning, for example
  o Size and duration of segments can be selected flexibly
  and individually for each Representation,
  o Number of Representations and the associated bitrates
  can be selected based on the content requirements,
  o Frequency and position of random access points are
  not restricted by the streaming solution,
  o Other attributes and encoding parameters of each
  Representation, etc., are not restricted,
  o Multiplexed components and individual components
  may be used.
• Normative client behaviour to provide streaming service, e.g.
  o Prescriptions of how and when to download segments,
  o Representation selection and switching procedures
  among different Representations,
  o Usage of HTTP/1.1 on how to download segments, etc.
• The transport of the MPD: while delivery through HTTP is
possible, the MPD may also be delivered by other means.
`
`To emphasize the flexibility in terms of use cases and deployment
`options, some example deployments are provided in section 4.
`4. DEPLOYMENT OPTIONS
3GP-DASH provides a significant number of options and
flexibility for deploying Adaptive HTTP Streaming services on both the
service provisioning end and the client side. Figure 8 shows
`a possible deployment architecture. Content preparation is done
`offline and ingested into an HTTP-Web serving cloud with origin
`and cache servers. The ingestion tool may adapt to specifics of
`CDNs, for example adapt the segment duration/size to CDN
`properties, use load-balancing and geo-location information, etc.
`By providing access to the MPD, the access client can access the
`streaming service through any IP network that enables HTTP
`connections. The network may be managed or unmanaged, wired
`or wireless, and multiple access networks may even be used in
`parallel. The major intelligence to enable an efficient and high-
`quality streaming service is in the HTTP Streaming client. With
`the access to the MPD, the client is able to issue requests to seg-
`
`
`Figure 8 Example Deployment Architecture
3GP-DASH provides flexibility and options such that on the
content preparation and ingestion side, the service may be
optimized to support different delivery and/or user experience aspects,
such as
• minimization of service access time, i.e., to ensure that the
client has fast access after tune-in and after any seeking
operations.
• minimization of the end-to-end delay in live services, e.g., by
the adaptation of segment durations.
• maximization of delivery efficiency, e.g., by ensuring that
the client has options for close-to-playout time downloading,
or by providing appropriate bandwidth Representations, etc.
• adjustment to CDN properties, for example the desired file
sizes, the number of files, the handling of HTTP requests,
etc.
• the reuse of encoded legacy content, for example media
stored in MP4 files, media encoded with/without coordination
between encoders of different bitrates, coordinated
random access point placement, etc.
`
The 3GP-DASH solution includes some of the design options
from proprietary Adaptive HTTP Streaming solutions, in
particular Apple HTTP Live Streaming [8] and Microsoft Smooth
Streaming [10]. For example, an Apple HTTP Live Streaming
configuration may typically be mapped to
• Constant segment duration of roughly 10 seconds for each
Representation,
• Each segment is provided in one fragment so only the first
loop in the ‘sidx’ box of the segment may be provided,
• Each Representation is complete and assigned to group 0,
i.e., typically audio and video are multiplexed within a single
segment,
• Playlist-based segment lists with regularly updated MPDs to
address the live generation and publishing of segments.
Typical MS Smooth Streaming deployments may be mapped to the
3GPP AHS specification by providing
• constant segment duration of roughly 1 or 2 seconds for each
Representation,
• each Segment is represented by one movie fragment, and
only the first loop of the ‘sidx’ is used to provide the media
presentation time of each segment,
• media components are provided in separate Representations
with alternatives in same group and complementary
components in separate groups,
• a template-based segment list generation is applied to support
compact MPD/manifest representation.
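Template-based segment list generation keeps the manifest compact because the client derives the individual Segment URLs from one pattern instead of reading an explicit list. A minimal sketch follows. The placeholder syntax is that of Python's `str.format` and is purely hypothetical, since the template syntax of the specification is not reproduced here; the URL and the identifiers are likewise assumptions.

```python
def expand_template(template, rep_id, first_index, count):
    """Return the Segment URLs generated from a single URL template."""
    return [
        template.format(rep=rep_id, index=i)
        for i in range(first_index, first_index + count)
    ]


urls = expand_template(
    "http://example.com/{rep}/seg-{index}.3gs", "video-500k", 1, 3
)
for u in urls:
    print(u)
```

A playlist-based description, by contrast, lists each URL explicitly and therefore needs an MPD update whenever new Segments are published.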
Additional design principles result from implementation
experience of progressive download services, especially the reuse
of existing media content, or DFSplash [9] for which segments
may be easily accessed with HTTP partial GET requests for
optimized user experience, bandwidth efficiency and CDN adaptation.
Table 1 provides a comparison of adaptive streaming solutions
based on a collection in [11], with 3GP-DASH added as a further
column. The comparison indicates that the flexibility of 3GP-DASH
can address the features of proprietary adaptive streaming solutions.
Table 1 Adaptive Streaming Comparison, based on collection
in [11]

Feature                           MS IIS [10]   Apple [8]     3GP-DASH
On-Demand & Live                  Yes           Yes           Yes
Live DVR                          Yes           No            Yes
Delivery Protocol                 HTTP          HTTP          HTTP
Scalability via HTTP Edge Caches  Yes           Yes           Yes
Origin Server                     MS IIS        HTTP          HTTP
Stateless Server Connection       Yes           Yes           Yes
Media Container                   MP4           MP2 TS        3GP/MP4
DRM Support for Live and VOD      PlayReady     No            OMA DRM
Ad Insertion Support              Yes           No            Yes
Supported Video Codecs            Agnostic      H.264 BL      Agnostic
Default Segment Duration          2 sec         10 sec        flexible
End-to-End Latency                >1.5 sec      30 sec        flexible
File Type on Server               contiguous    fragmented    both
3GPP Adaptation                   No            No            in work
Specification                     proprietary   proprietary   standard

`A few service examples are provided in section 5.
`5. SERVICE EXAMPLES
`5.1 On-Demand Adaptive Streaming Service
`Assume that a streaming service provider offers a