`
(19) World Intellectual Property Organization
     International Bureau

(43) International Publication Date: 23 September 2010 (23.09.2010)

(10) International Publication Number: WO 2010/107625 A2
`
`
`(51)
`
`(21)
`
`International Patent Classification:
`HOA4L 12/56 (2006.01)
`G11B 27/031 (2006.01)
`G11B 20/10 (2006.01)
`HO4L 29/06 (2006.01)
`International Application Number:
`PCT/US2010/026707
`
`(22)
`
`InternationalFiling Date:
`
`oe
`(25) Filing Language:
`(26) Publication Language:
`
`9 March 2010 (09.03.2010)
`.
`English
`English
`
`/
`(30) Priority Data:
`US
`16 March 2009 (16.03.2009)
`12/405,215
(71) Applicant (for all designated States except US): MICROSOFT CORPORATION [US/US]; One Microsoft Way, Redmond, Washington 98052-6399 (US).

(72) Inventors: SOOD, Vishal; c/o Microsoft Corporation, International Patents, One Microsoft Way, Redmond, Washington 98052-6399 (US). FREELANDER, Jack E.; c/o Microsoft Corporation, International Patents, One Microsoft Way, Redmond, Washington 98052-6399 (US). ROY, Anirban; c/o Microsoft Corporation, International Patents, One Microsoft Way, Redmond, Washington 98052-6399 (US). LIU, Lin; c/o Microsoft Corporation, International Patents, One Microsoft Way, Redmond, Washington 98052-6399 (US). ZHANG, Geqiang (Sam); c/o Microsoft Corporation, International Patents, One Microsoft Way, Redmond, Washington 98052-6399 (US). DUGGARAJU, Krishna; c/o Microsoft Corporation, International Patents, One Microsoft Way, Redmond, Washington 98052-6399 (US). SIRIVARA, Sudheer; c/o Microsoft Corporation, International Patents, One Microsoft Way, Redmond, Washington 98052-6399 (US). BOCHAROV, John A.; c/o Microsoft Corporation, International Patents, One Microsoft Way, Redmond, Washington 98052-6399 (US).

(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT,
`
`(54) Title: SMOOTH, STATELESS CLIENT MEDIA STREAMING
`
`100
`
`re
`Adaptive Streaming System
`
`110
`
`120
`
`130
`
`140
`
`
`
`
`
`Media
`Playback
`
`
`
`
`
`
`
`
`
`
`
`QoS
`Monitoring
`
`Clock
`Synchroniz-
`ation
`
`STS150 160
`
`
`
`FIG. I
`
`
`[Continued on next page]
`
(57) Abstract: An adaptive streaming system is described herein that provides a stateless connection between the client and server for streaming media playback in which the data is formatted in a manner that allows the client to make decisions and react more quickly to changing network conditions. The client requests uniform chunks of media from the server that include a portion of the media. The adaptive streaming system requests portions of a media file or of a live streaming event in small-sized chunks each having a distinguished URL. This allows streaming media data to be cached by existing Internet cache infrastructure. Each chunk contains metadata information that describes the encoding of the chunk and media content for playback by the client. The server may provide chunks in multiple encodings so that the client can switch quickly to chunks of a different bit rate or playback speed.
`
`
`ae
`a
`ae
`a
`
`
`
`Chunk
`Request
`
`Chunk
`Parse
`
`Manifest
`Assembly
`
`
`
`
`
`
`
`
`
`
`
`
(81) Designated States (continued): HN, HR, HU, ID, IL, IN, IS, JP, KE, KG, KM, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PE, PG, PH, PL, PT, RO, RS, RU, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
`
`(84)
`
`Designated States (unless otherwise indicated, for every
`kind of regional protection available): ARIPO (BW, GH,
`GM,KE, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM,
`ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ,
`TM), European (AT, BE, BG, CH, CY, CZ, DE, DK, EE,
`ES, FL FR, GB, GR, HR, HU,IE, IS, IT, LT, LU, LV,
`MC, MK, MT, NL, NO, PL, PT, RO, SE, SI, SK, SM,
`
`TR), OAPI (BF, BJ, CF, CG, CL, CM, GA, GN, GQ, GW,
`ML, MR,NE, SN, TD, TG).
Declarations under Rule 4.17:
- as to applicant's entitlement to apply for and be granted a patent (Rule 4.17(ii))
- as to the applicant's entitlement to claim the priority of the earlier application (Rule 4.17(iii))

Published:
- without international search report and to be republished upon receipt of that report (Rule 48.2(g))
`
`
`
`
`SMOOTH, STATELESS CLIENT MEDIA STREAMING
`
`BACKGROUND
`
`[0001]
`
`Streaming media is multimedia that is constantly received by, and
`
`normally presented to, an end-user(using a client) while it is being delivered by a
`
`streaming provider(using a server). Several protocols exist for streaming media,
`
`including the Real-time Streaming Protocol (RTSP), Real-time Transport Protocol
`
`(RTP), and the Real-time Transport Control Protocol (RTCP), which streaming
`
`applications often use together. The Real Time Streaming Protocol (RTSP),
`
`developed bythe Internet Engineering Task Force (IETF) and created in 1998 as
`
`10
`
`Request For Comments (RFC) 2326, is a protocol for use in streaming media
`
`systems, whichallowsa client to remotely control a streaming media server,
`
`issuing VCR-like commands such as "play" and "pause", and allowing time-based
`
`access tofiles on a server.
`
`[0002] The sending of streaming dataitself is not part of the RTSP protocol.
`
`15
`
`Most RTSP servers use the standards-based RTP as the transport protocol for the
`
`actual audio/video data, acting somewhat as a metadata channel. RTP defines a
`
`standardized packet format for delivering audio and video overthe Internet. RTP
`
`was developed by the Audio-Video Transport Working Group of the IETF and first
`
`published in 1996 as RFC 1889, and superseded by RFC 3550 in 2003. The
`
`20
`
`protocolis similar in syntax and operation to Hypertext Transport Protocol (HTTP),
`
`but RTSP adds new requests. While HTTP is stateless, RTSP is a stateful
`
`protocol. RTSP usesa session ID to keep track of sessions when needed. RTSP
`
`messagesare sent from client to server, although some exceptions exist where
`
`the serverwill send messagesto the client.
`
`25
`
`[0003]
`
`Streaming applications usually use RTP in conjunction with RTCP. While
`
`RTP carries the media streams (e.g., audio and video) or out-of-band signaling
`
`(dual-tone multi-frequency (DTMF)), streaming applications use RTCP to monitor
`
`transmission statistics and quality of service (QOS) information. RTP allows only
`
`one type of message, one that carries data from the sourceto the destination.
`
`In
`
`30
`
`many cases, there is a need for other messages in a session. These messages
`
`control the flow and quality of data and allow the recipient to send feedbackto the
`
`source or sources. RTCP is a protocol designed for this purpose. RTCP has five
`
`types of messages: senderreport, receiver report, source description message,
`
`
`
`WO 2010/107625
`
`PCT/US2010/026707
`
`bye message, and application-specific message. RTCP provides out-of-band
`
`control information for an RTP flow and partners with RTP in the delivery and
`
`packaging of multimedia data, but does not transport any data itself. Streaming
`
`applications use RTCP to periodically transmit control packets to participants in a
`
`streaming multimedia session. One function of RTCP is to provide feedback on
`
`the quality of service RTP is providing. RTCP gathers statistics on a media
`
`connection and information such as bytes sent, packets sent, lost packets, jitter,
`
`feedback, and round trip delay. An application mayusethis information to
`
`increase the quality of service, perhapsbylimiting flow or using a different codec
`
`10
`
`or bit rate.
`
`[0004] One problem with existing media streaming architecturesis the tight
`
`coupling between server and client. The stateful connection between client and
`
`server creates additional server overhead, because the servertracks the current
`
`state of each client. This also limits the scalability of the server.
`
`In addition, the
`
`15
`
`client cannot quickly react to changing conditions, such as increased packetloss,
`
`reduced bandwidth, user requests for different content or to modify the existing
`
`content (e.g., speed up or rewind), and soforth, withoutfirst communicating with
`
`the server and waiting for the server to adapt and respond. Often, whena client
`
`reports a loweravailable bandwidth (e.g., through RTCP), the server does not
`
`20
`
`adapt quickly enough causing breaks in the media to be noticed by the user on
`
`the client as packets that exceed the available bandwidth are not received and
`
`new lower bit rate packets are not sent from the server in time. To avoid these
`
`problems, clients often buffer data, but buffering introduces latency, whichfor live
`
`events may be unacceptable.
`
`25
`
`[0005]
`
`In addition, the Internet contains many types of downloadable media
`
`content items, including audio, video, documents, and so forth. These content
`
`items are often very large, such as video in the hundreds of megabytes. Users
`
`often retrieve documents overthe Internet using HTTP through a web browser.
`
`The Internet has built up a large infrastructure of routers and proxies that are
`
`30
`
`effective at caching data for HTTP. Servers can provide cached datato clients
`
`with less delay and by using fewer resources than re-requesting the content from
`
`the original source. For example, a user in New York may download a content
`
`item served from a host in Japan, and receive the content item through a routerin
`
`California.
`
`If a user in New Jersey requests the same file, the router in California
`
`
`
`WO 2010/107625
`
`PCT/US2010/026707
`
`maybeable to provide the content item without again requesting the data from
`
`the host in Japan. This reduces the networktraffic over possibly strained routes,
`
`and allows the user in New Jerseyto receive the content item with less latency.
`
`[0006] Unfortunately, live media often cannot be cached using existing protocols,
`
`and eachclient requests the media from the same serveror set of servers.
`
`In
`
`addition, when streaming media can be cached, specialized cache hardwareis
`
`often involved, rather than existing and readily available HTTP-basedInternet
`
`caching infrastructure. The lack of caching limits the number of concurrent
`
`viewers and requests that the servers can handle, and limits the attendance of a
`
`10
`
`live event. The world is increasingly using the Internet to consume up to the
`
`minute live information, such as the record numberof users that watchedlive
`
`events such as the opening of the 2008 Olympicsvia the Internet. The limitations
`
`of current technology are slowing adoption of the Internet as a medium for
`
`consuming this type of media content.
`
`15
`
`SUMMARY
`
`[0007] An adaptive streaming system is described herein that provides a
`
`stateless connection betweenthe client and server for streaming media playback
`
`in which the data is formatted in a manner that allows the client to make decisions
`
`traditionally performed by the server and therefore react more quickly to changing
`
`20
`
`network conditions. The client requests uniform chunks of media from the server
`
`that include a portion of the media. The adaptive streaming system requests
`
`portions of a mediafile or of a live streaming event in small-sized chunks each
`
`having a distinguished URL. This allows existing Internet cache infrastructure to
`
`cache streaming media, thereby allowing more clients to view the same contentat
`
`25
`
`about the same time. As the event progresses, the client continues requesting
`
`chunksuntil the end of the event or media. Each chunk contains metadata
`
`information that describes the encoding of the chunk and media contentfor
`
`playback bythe client. The server may provide chunks in multiple encodings so
`
`that the client can switch quickly to chunksofa different bit rate or playback
`
`30
`
`speed. Thus, the adaptive streaming system provides an improved experience to
`
`the user with fewer breaks in streaming media playback, and an increased
`
`likelihood that the client will receive the media with lower latency from a more local
`
`cache server.
`
`
`
`
`[0008] This Summary is provided to introduce a selection of concepts in a
`
`simplified form that are further described below in the Detailed Description. This
`
`Summaryis not intended to identify key features or essential features of the
`
`claimed subject matter, noris it intended to be usedto limit the scope of the
`
`claimed subject matter.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`[0009]
`
`Figure 1 is a block diagram thatillustrates components of the adaptive
`
`streaming system, in one embodiment.
`
`[0010]
`
`Figure 2 is a block diagram thatillustrates an operating environment of
`
`10
`
`the smooth streaming system using Microsoft Windowsand IIS, in one
`
`embodiment.
`
`[0011]
`
`Figure 3 is a flow diagram thatillustrates the processing of the adaptive
`
`streaming system on a client to playback media, in one embodiment.
`
`[0012]
`
`Figure 4 is a flow diagram thatillustrates the processing of the adaptive
`
`15
`
`streaming system to handle a single media chunk, in one embodiment.
`
`DETAILED DESCRIPTION
`
`[0013] An adaptive streaming system is described herein that provides a
`
`stateless connection betweenthe client and server for streaming media playback
`
`in which the data is formatted in a manner that allows the client to make decisions
`
`20
`
`often left to the server in past protocols, and therefore react more quickly to
`
`changing network conditions.
`
`In addition, the adaptive streaming system operates
`
`in a manner that allows existing Internet cache infrastructure to cache streaming
`
`media data, thereby allowing moreclients to view the same content at about the
`
`same time. The adaptive streaming system requests portions of a mediafile or of
`
`25
`
`a live streaming event in small-sized chunks each having a distinguished URL.
`
`Each chunk maybe a media file in its own right or may be a part of a whole media
`
`file. As the event progresses, the client continues requesting chunksuntil the end
`
`of the event. Each chunk contains metadata information that describes the
`
`encoding of the chunk and media content for playback by the client. The server
`
`30
`
`may provide chunksin multiple encodings so that the client can, for example,
`
`switch quickly to chunks of a different bit rate or playback speed. Because the
`
`chunks adhere to World Wide Web Consortium (W3C) HTTP standards, the
`
`chunks are small enough to be cached, and the system provides the chunksin the
`
`same way to each client, the chunks are naturally cached by existing Internet
`
`
`
`WO 2010/107625
`
`PCT/US2010/026707
`
`infrastructure without modification. Thus, the adaptive streaming system provides
`
`an improved experienceto the user with fewer breaks in streaming media
`
`playback, and an increased likelihood that the client will receive the media with
`
`lower latency from a morelocal cache server. Because the connection between
`
`the client and server is stateless, the same client and server need not be
`
`connected for the duration of a long event. The stateless system described herein
`
`has no server affinity, allowing clients to piece together manifests from servers
`
`that may have begun atdifferent times, and also allowing server administrators to
`
`bring up or shut downorigin servers as load dictates.
`
`10
`
`[0014]
`
`In some embodiments, the adaptive streaming system uses a new data
`
`transmission format between the serverand client. The client requests chunks of
`
`media from a serverthat include a portion of the media. For example, for a 10-
`
`minutefile, the client may request 2-second chunks. Notethat unlike typical
`
`streaming where the server pushesdatato the client, in this case the client pulls
`
`15
`
`media chunks from the server.
`
`In the caseof a live stream, the server may be
`
`creating the media on the fly and producing chunksto respond to client requests.
`
`Thus, the client may only be several chunks behind the server in terms of how fast
`
`the server creates chunks and howfast the client requests chunks.
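
As a rough sketch of this pull model, the following Python fragment walks a 10-minute presentation in 2-second chunks, issuing one HTTP GET per chunk; the host name and chunk-naming scheme are placeholder assumptions, not a format defined by this description.

```python
import urllib.request

# Hypothetical values: a 10-minute presentation split into 2-second chunks.
CHUNK_SECONDS = 2
TOTAL_SECONDS = 10 * 60
BASE_URL = "http://media.example.com/event1"  # placeholder host and path

def chunk_url(start_seconds: int) -> str:
    """Build the distinguished URL for the chunk starting at start_seconds."""
    return f"{BASE_URL}/chunk_{start_seconds:06d}.bin"

def pull_presentation() -> None:
    """Client-driven (pull) download: the client asks the server for each chunk in turn."""
    for start in range(0, TOTAL_SECONDS, CHUNK_SECONDS):
        with urllib.request.urlopen(chunk_url(start)) as response:
            chunk_bytes = response.read()
        # Hand the chunk to the parsing and playback components (not shown here).
        print(f"fetched chunk at {start}s: {len(chunk_bytes)} bytes")

if __name__ == "__main__":
    pull_presentation()
```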
`
`[0015] Each chunk contains metadata and media content. The metadata may
`
`20
`
`describe useful information about the media content, such as the bit rate of the
`
`media content, where the media content fits into a larger media element (e.g., this
`
`chunk representsoffset 1:10 in a 10 minute videoclip), the codec used to encode
`
`the media content, and so forth. The client uses this information to place the
`
`chunk into a storyboard of the larger media element and to properly decode and
`
`25
`
`playback the media content.
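
A minimal sketch of the kind of per-chunk metadata this paragraph describes is shown below; the field names are illustrative assumptions rather than a format specified here.

```python
from dataclasses import dataclass

@dataclass
class ChunkMetadata:
    """Illustrative per-chunk metadata; field names are assumptions."""
    bit_rate: int        # bits per second of the media content in this chunk
    start_offset: float  # seconds into the larger media element (70.0 == offset 1:10)
    duration: float      # seconds of media carried by this chunk
    codec: str           # codec used to encode the media content

# The chunk representing offset 1:10 of a 10-minute clip, encoded at 1.5 Mbps:
meta = ChunkMetadata(bit_rate=1_500_000, start_offset=70.0, duration=2.0, codec="vc-1")
```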
`
`[0016]
`
`Figure 1 is a block diagram thatillustrates components of the adaptive
`
`streaming system, in one embodiment. The adaptive streaming system 100
`
`includes a chunk request component 110, a chunk parsing component 120, a
`
`manifest assembly component 130, a media playback component 140, a QoS
`
`30
`
`monitoring component 150, and a clock synchronization component 160. Each of
`
`these componentsis describedin further detail herein. The adaptive streaming
`
`system 100 as described herein operates primarily at a client computer system.
`
`However, those of ordinary skill in the art will recognize that various components
`
`
`
`WO 2010/107625
`
`PCT/US2010/026707
`
`of the system maybe placedat various locations within a content network
`
`environmentto provide particular positive results.
`
`[0017] The chunk request component 110 makes requests from the client for
`
`individual media chunksfrom the server. As shownin Figure 2, the client’s
`
`request maypassfirst to an edge server (e.g., an Internet cache), then to an
`
`origin server, and then to an ingest server. At each stage, if the requested data is
`
`found, then the request doesnotgo to the next level. For example, if the edge
`
`server has the requested data, then the client receives the data from the edge
`
`server and the origin server does not receive the request. Each chunk may have
`
`10
`
`a Uniform Resource Locator (URL) that individually identifies the chunk.
`
`Internet
`
`cache servers are good at caching server responsesto specific URL requests
`
`(e.g., HTTP GET). Thus, whenthe first client calls through to the server to get a
`
`chunk, the edge servers cache that chunk and subsequentclients that request the
`
`same chunk mayreceive the chunk from the edge server (based on the cache
`
`15
`
`lifetime and servertime to live (TTL) settings). The chunk request component 110
`
`receives the chunk and passesit to the chunk parsing component 120 for
`
`interpretation.
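
The following sketch illustrates how a chunk request component might construct cache-friendly chunk URLs and fetch them with plain HTTP GET; the URL layout (bit rate and start time encoded in the path) and host name are assumptions for illustration only.

```python
import urllib.request

ORIGIN = "http://origin.example.com/live/event1"  # hypothetical origin server URL

def chunk_url(bit_rate: int, start_ms: int) -> str:
    """Every (bit rate, start time) pair gets its own individually cacheable URL."""
    return f"{ORIGIN}/{bit_rate}/{start_ms}.chunk"

def request_chunk(bit_rate: int, start_ms: int) -> bytes:
    """Plain HTTP GET; an edge cache between client and origin may answer it."""
    request = urllib.request.Request(chunk_url(bit_rate, start_ms), method="GET")
    with urllib.request.urlopen(request) as response:
        return response.read()

# Two clients issuing the same GET use the same URL, so the second request can be
# served from an edge cache, subject to the cache lifetime and TTL settings.
data = request_chunk(bit_rate=1_500_000, start_ms=70_000)
```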
`
`[0018] The chunk parsing component 120 interprets the format of a media chunk
`
`received by the chunk request component 110 and separates the chunk into its
`
`20
`
`component parts. Typically, the chunk includes a headerportion containing
`
`metadata, and a data portion containing media content. The chunk parsing
`
`component provides the metadata to the manifest assembly component 130 and
`
`the media content to the media playback component140.
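
This description does not fix a wire format for chunks, so the parser below assumes a simple hypothetical layout (a 4-byte length prefix, a JSON metadata header, then the media payload) purely to show the split into a metadata portion and a media content portion.

```python
import json
import struct
from typing import Tuple

def parse_chunk(chunk: bytes) -> Tuple[dict, bytes]:
    """Split a chunk into (metadata, media content).

    Assumed layout: 4-byte big-endian header length, a UTF-8 JSON header,
    then the raw media payload. This layout is illustrative only.
    """
    (header_len,) = struct.unpack_from(">I", chunk, 0)
    header_bytes = chunk[4:4 + header_len]
    metadata = json.loads(header_bytes.decode("utf-8"))
    media_content = chunk[4 + header_len:]
    return metadata, media_content

# Build a tiny example chunk and parse it back.
header = json.dumps({"bit_rate": 1_500_000, "start_offset": 70.0}).encode("utf-8")
example_chunk = struct.pack(">I", len(header)) + header + b"\x00\x01\x02media-bytes"
meta, media = parse_chunk(example_chunk)
assert meta["bit_rate"] == 1_500_000 and media.endswith(b"media-bytes")
```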
`
`[0019] The manifest assembly component 130 builds a manifest that describes
`
`25
`
`the media element to which received media content belongs. Large mediafiles
`
`that clients download as a whole (i.e., not streamed) often include a manifest
`
`describing the wholefile, the codecs and bit rates used to encode various portions
`
`of the file, markers about meaningful positions with the file, and so forth. During
`
`streaming, particularly live content, a server cannot provide a complete manifest
`
`30
`
`becausethe eventis still ongoing. Thus, the server provides as muchof the
`
`manifest as it can through the metadata in the media chunks. The server may
`
`also provide an application-programming interface (API), such as a predefined
`
`URL, for the client to request the manifest up to the current point in the media
`
`stream. This can be useful when the client joins a live, streamed event after the
`
`
`
`WO 2010/107625
`
`PCT/US2010/026707
`
`eventis already in progress. The manifest allows the client to request previously
`
`streamed portions of the media element (e.g., by rewinding), and the client
`
`continues to receive new portions of the manifest through the metadata of the
`
`streamed media chunks.
`
`[0020] The manifest assembly component 130 builds a manifest similar to that
`
`available for a complete media file. Thus, as the event proceeds if the user wants
`
`to skip backwards in the media (e.g., rewind or jump to a particular position), then
`
`skip forward again, the user can do so and the client uses the assembled manifest
`
`to find the appropriate chunk or chunksto playback to the user. When the user
`
`10
`
`pauses, the system 100 may continue to receive media chunks(or only the
`
`metadata portion of chunks based ona distinguished request URL), so that the
`
`manifest assembly component 130 can continue to build the manifest and be
`
`ready for any user requests (e.g., skip to the current live position or play from the
`
`pause point) after the user is done pausing. The client-side assembled manifest
`
`15
`
`allowsthe client to play the media event back as on-demand content as soon as
`
`the eventis over, and to skip around within the media event as it is going on.
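
One way to picture client-side manifest assembly and the seeking it enables is the sketch below, which accumulates an entry per received chunk and looks up the chunk covering a requested playback position; the entry fields and lookup strategy are illustrative assumptions.

```python
import bisect
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ManifestEntry:
    start_offset: float  # seconds into the media element
    duration: float      # seconds covered by the chunk
    url: str             # distinguished URL of the chunk

class ClientManifest:
    """Manifest assembled on the client from metadata carried in each chunk."""

    def __init__(self) -> None:
        self._entries: List[ManifestEntry] = []
        self._starts: List[float] = []

    def add(self, entry: ManifestEntry) -> None:
        # Keep entries ordered by start offset as chunks arrive.
        index = bisect.bisect_left(self._starts, entry.start_offset)
        self._starts.insert(index, entry.start_offset)
        self._entries.insert(index, entry)

    def chunk_for(self, position: float) -> Optional[ManifestEntry]:
        """Find the chunk to request when the user seeks to `position` seconds."""
        index = bisect.bisect_right(self._starts, position) - 1
        if index < 0:
            return None
        entry = self._entries[index]
        return entry if position < entry.start_offset + entry.duration else None

manifest = ClientManifest()
manifest.add(ManifestEntry(0.0, 2.0, "http://origin.example.com/0.chunk"))
manifest.add(ManifestEntry(2.0, 2.0, "http://origin.example.com/2000.chunk"))
assert manifest.chunk_for(3.1).url.endswith("2000.chunk")
```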
`
`[0021] The media playback component 140 plays back received media content
`
`using the client hardware. The media playback component 140 may invoke one
`
`or more codecsto interpret the container within which the media content is
`
`20
`
`transported and to decompressor otherwise decode the media content from a
`
`compressed formatto a raw format (e.g., YV12, RGBA, or PCM audio samples)
`
`ready for playback. The media playback component 140 maythenprovide the
`
`raw format media content to an operating system API (e.g., Microsoft DirectX) for
`
`playback on local computer system sound and video hardware, such as a display
`
`25
`
`and speakers.
`
`[0022] The QoS monitoring component 150 analyzes the successof receiving
`
`packets from the server and adaptsthe client’s requests based on a set of current
`
`network and other conditions. For example, if the client is routinely receiving
`
`media chunkslate, then the component 150 may determine that the bandwidth
`
`30
`
`between the client and the serveris inadequate for the current bit rate, and the
`
`client may begin requesting media chunksat a lower bit rate. QoS monitoring
`
`may include measurement of other heuristics, such as render frame rate, window
`
`size, buffer size, frequency of rebuffering, and so forth. Media chunks for eachbit
`
`rate may have a distinguished URL so that chunksfor various bit rates are cached
`
`
`
`WO 2010/107625
`
`PCT/US2010/026707
`
`by Internet cache infrastructure. Note that the server doesnottrackclient state
`
`and does not know whatbit rate any particular client is currently playing. The
`
`server can simply provide the same media element in a variety of bit rates to
`
`satisfy potential client requests under a range of conditions.
`
`In addition, the initial
`
`manifest and/or metadata that the client receives mayinclude information about
`
`the bit rates and other encoding properties available from the server, so that the
`
`client can choosethe encoding thatwill provide a good client experience.
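
A QoS monitor along these lines might, for example, compare how long each chunk takes to download with how much playback time it carries and step the requested bit rate down or up accordingly; the thresholds and bit-rate ladder below are illustrative assumptions.

```python
from typing import List

# Hypothetical bit rates advertised by the server, lowest to highest.
AVAILABLE_BIT_RATES: List[int] = [350_000, 700_000, 1_500_000, 3_000_000]

def next_bit_rate(current: int, download_seconds: float, chunk_seconds: float) -> int:
    """Pick the bit rate for subsequent chunk requests.

    If chunks take nearly as long to download as they take to play, the link is
    too slow for the current encoding; if they arrive with lots of headroom, a
    higher bit rate can be tried. The thresholds here are arbitrary examples.
    """
    index = AVAILABLE_BIT_RATES.index(current)
    if download_seconds > 0.8 * chunk_seconds and index > 0:
        return AVAILABLE_BIT_RATES[index - 1]        # step down
    if download_seconds < 0.4 * chunk_seconds and index < len(AVAILABLE_BIT_RATES) - 1:
        return AVAILABLE_BIT_RATES[index + 1]        # step up
    return current                                   # keep the current encoding

assert next_bit_rate(1_500_000, download_seconds=1.9, chunk_seconds=2.0) == 700_000
```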
`
`[0023] Note that when switching bit rates, the client simply begins requesting the
`
`newbit rate and playing back the new bit rate chunksas the client receives the
`
`10
`
`chunks. The client does not have to send control information to the server and
`
`wait for the server to adapt the stream. The client’s request may not even reach
`
`the server due to a cache in between the client and serversatisfying the request.
`
`Thus, the client is much quickerto react than clients in traditional media streaming
`
`systems are, and the burden on the serverof having different clients connecting
`
`15
`
`under various current conditions is reduced dramatically.
`
`In addition, because
`
`current conditions tend to belocalized, it is likely that many clients in a particular
`
`geographic region or on a particular Internet service provider (ISP) will experience
`
`similar conditions and will request similar media encodings (e.g., bit rates).
`
`Because cachesalso tend to belocalized, it is likely that the clients in a particular
`
`20
`
`situation will find that the cache near them is “warm” with the data that they each
`
`request, so that the latency experienced by eachclient will be low.
`
`[0024] Theclock synchronization component 160 synchronizesthe clocks of the
`
`server and the client. Although absolute time is not generally relevant to the client
`
`and server, being able to identify a particular chunk and knowing the rate (i.e.,
`
`25
`
`cadence) at which to request chunksis relevantto the client. For example, if the
`
`client requests data too quickly, the server will not yet have the data and will
`
`respond with error responses(e.g., an HTTP 404 not found error response)
`
`creating many spurious requests that unnecessarily consume bandwidth. On the
`
`other hand, if the client requests data too slowly, then the client may not have data
`
`30
`
`in time for playback creating noticeable breaks in the media played back to the
`
`user. Thus, the client and server work well when the client knowsthe rate at
`
`which the server is producing new chunks and knowswherethe current chunk fits
`
`into the overall timeline. The clock synchronization component 160 providesthis
`
`information by allowing the server and client to have a similar clock value at a
`
`
`
`WO 2010/107625
`
`PCT/US2010/026707
`
`particular time. The server may also mark each media chunk with the time at
`
`which the server created the chunk.
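
The cadence problem can be sketched as follows: given an estimated client/server clock offset and a known chunk duration, the client waits until the server should have produced a chunk before requesting it, and backs off briefly if it still receives a 404; the offset estimate and timing constants are assumptions for illustration.

```python
import time
import urllib.error
import urllib.request

CHUNK_SECONDS = 2.0   # cadence at which the server produces new chunks (assumed)
CLOCK_OFFSET = 0.0    # estimated server_time - client_time from a sync exchange (assumed)
SAFETY_MARGIN = 0.5   # wait slightly past the nominal availability time (assumed)

def server_now() -> float:
    """Approximate the server's clock using the estimated offset."""
    return time.time() + CLOCK_OFFSET

def fetch_when_ready(url: str, chunk_start: float) -> bytes:
    """Request a chunk no earlier than the time the server should have created it.

    chunk_start is the chunk's start time on the server's timeline (epoch seconds).
    """
    available_at = chunk_start + CHUNK_SECONDS + SAFETY_MARGIN
    delay = available_at - server_now()
    if delay > 0:
        time.sleep(delay)                # too early: wait rather than generating spurious 404s
    for _ in range(20):                  # bounded retries if the chunk is still not produced
        try:
            with urllib.request.urlopen(url) as response:
                return response.read()
        except urllib.error.HTTPError as error:
            if error.code != 404:
                raise
            time.sleep(0.25)             # brief back-off, then ask again
    raise TimeoutError(f"chunk never became available: {url}")
```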
`
`[0025] Clock synchronization also gives the server a commonreference across
`
`each of the encoders. For example, the server may encode data in multiple bit
`
`rates and using multiple codecs at the same time. Each encoder mayreference
`
`encodeddatain a different way, but the timestamp can be set in common across
`
`all encoders.
`
`In this way,if a client requests a particular chunk, the clientwill get
`
`media representing the same period regardless of the encoding thatthe client
`
`selects.
`
`10
`
`[0026] The computing device on which the system is implemented may include a
`
`central processing unit, memory, input devices (e.g., keyboard and pointing
`
`devices), output devices (e.g., display devices), and storage devices (e.g., disk
`
`drives or other non-volatile storage media). The memory and storage devices are
`
`computer-readable storage media that may be encoded with computer-executable
`
`15
`
`instructions (e.g., software) that implement or enable the system.
`
`In addition, the
`
`data structures and message structures maybestored or transmitted via a data
`
`transmission medium, such as a signal on a communication link. Various
`
`communication links may be used, such as the Internet, a local area network, a
`
`wide area network, a point-to-point dial-up connection, a cell phone network, and
`
`20
`
`so on.
`
`[0027] Embodiments of the system may be implementedin various operating
`
`environmentsthat include personal computers, server computers, handheld or
`
`laptop devices, multiprocessor systems, microprocessor-based systems,
`
`programmable consumer electronics, digital cameras, network PCs,
`
`25
`
`minicomputers, mainframe computers, distributed computing environmentsthat
`
`include anyof the above systems or devices, and so on. The computer systems
`
`maybecell phones, personal digital assistants, smart phones, personal
`
`computers, programmable consumerelectronics, digital cameras, and so on.
`
`[0028] The system may be described in the general context of computer-
`
`30
`
`executable instructions, such as program modules, executed by one or more
`
`computers or other devices. Generally, program modulesinclude routines,
`
`programs, objects, components, data structures, and so on that perform particular
`
`tasks or implement particular abstract data types. Typically, the functionality of
`
`
`
`WO 2010/107625
`
`PCT/US2010/026707
`
`the program modules may be combined ordistributed as desired in various
`
`embodiments.
`
`[0029]
`
`Figure 2 is a block diagram thatillustrates an operating environment of
`
`the smooth streaming system using Microsoft Windowsand IIS, in one
`
`embodiment. The environment typically includes a source client 210, a content
`
`delivery network 240, and an external network 270. The sourceclient is the
`
`source of the media or live event. The source client includes a media source 220
`
`and one or more encoders 230. The media source 220 may include cameras
`
`each providing multiple camera angles, microphonescapture audio, slide
`
`10
`
`presentations, text (such as from a closed captioning service), images, and other
`
`types of media. The encoders 230 encode the data from the media source 220 in
`
`one or more encoding formats in parallel. For example, the encoders 230 may
`
`produce encoded media in a variety ofbit rates.
`
`[0030] The content delivery network 240 includes one or more ingest servers
`
`15
`
`250 and one or moreorigin servers 260. The ingest servers 250 receive encoded
`
`media in each of the encoding formats from the encoders 230 and create a
`
`manifest describing the encoded media. The ingest servers 250 may create and
`
`store the media chunks described herein or may create the chunksonthe fly as
`
`they are requested. The ingest servers 250 can receive pushed data, such as via
`
`20
`
`an HTTP POST, from the encoders 230, or via pull by requesting data from the
`
`encoders 230. The encoders 230 and ingest servers 250 may be connectedin a
`
`variety of redundant configurations. For example, each encoder may send
`
`encoded media data to each ofthe ingest servers 250, or only to one ingest
`
`serveruntil a failure occurs. The origin servers 260 are the servers that respond
`
`25
`
`to client requests for media chunks. The origin servers 260 may also be
`
`configured in a variety of redundant configurations.
`
`[0031] The external network 270 includes edge servers 280 and otherInternet
`
`(or other network) infrastructure and clients 290. When a client makes a request
`
`for a media chunk, the client addresses the request to the origin servers 260.
`
`30
`
`Because of the design of network caching, if one of the edge servers 280 contains
`
`the data, then that edge server may respond to the client without passing along
`
`the request. However,if the data is not available at the edge server, then the
`
`edge server forwards the requestto one ofthe origin servers 260. Likewise, if one
`
`10
`
`
`
`WO 2010/107625
`
`PCT/US2010/026707
`
`of the origin servers 260 receives a request for data that is not available, the origin
`
`server may request the data from one of the ingest servers 250.
`
`[0032]
`
`Figure 3 is a flow diagram thatillustrates the processing of the adaptive
`
`streaming system on a client to playback media, in one embodiment. Beginning in
`
`block 310, the system selects an initial encoding at which to request encoded
`
`media from the server. For example, the system mayinitially select a lowest
`
`available bit rate. The system may have previously sent a request to the server to
`
`discover the available bit rates and other available encodings. Continuing in block
`
`320, the system requests and plays a particular chunk of the media, as described
`
`10
`
`further with reference to Figure 4. Continuing in block 330, the system determines
`
`a quality of service metric based on the requested chunk. For example, the chunk
`
`may include metadata for as many additional chunks as the server is currently
`
`storing, which the client can use to determine howfastthe client is requesting
`
`chunksrelative to how fast the server is producing chunks. This processis
`
`15
`
`described in further detail herein.
`
`[0033] Continuing in decision block 340, if the system determinesthat the
`
`current QoS metric is too low and the client connection to the server cannot
`
`handle the current encoding, then the system continues at block 350, else the
`
`system loops to block 320 to handle the next chunk. Continuing in block 350, the
`
`20
`
`system selects a different encoding of the media, wherein the system selects a
`
`different encoding by requesting data from a different URL for subsequent chunks
`
`from the server. For example, the system may select an encoding that consumes
`
`half the bandwidth of the current encoding. Likewise, the system may determine
`
`that the QoS metric indicates that the client can handle a higherbit rate encoding,
`
`25
`
`and the client may request a higherbit rate for subsequent chunks.
`
`In this way,
`
`the client adjusts the bit rate up and down based on current conditions.
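
Read as code, the flow of blocks 310 through 350 is roughly the loop below; the fetch and play callables stand in for the chunk request, playback, and QoS pieces described above and are hypothetical, not an interface defined by this description.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Qos:
    too_slow: bool       # current encoding exceeds what the connection can handle
    has_headroom: bool   # connection could handle a higher bit rate

def run_playback(available_bit_rates: List[int],
                 total_chunks: int,
                 fetch,          # callable(bit_rate, index) -> (bytes, Qos)
                 play) -> None:  # callable(chunk_bytes) -> None
    """Control loop corresponding to blocks 310-350 of Figure 3 (sketch only)."""
    rates = sorted(available_bit_rates)
    bit_rate = rates[0]                               # block 310: start at the lowest encoding
    for index in range(total_chunks):
        chunk, qos = fetch(bit_rate, index)           # block 320: request a chunk...
        play(chunk)                                   # ...and play it
        position = rates.index(bit_rate)              # blocks 330/340: act on the QoS metric
        if qos.too_slow and position > 0:
            bit_rate = rates[position - 1]            # block 350: request a cheaper encoding
        elif qos.has_headroom and position < len(rates) - 1:
            bit_rate = rates[position + 1]            # or a higher bit rate when there is headroom

# Tiny dry run with stand-in fetch/play functions.
run_playback([350_000, 700_000, 1_500_000], total_chunks=3,
             fetch=lambda rate, i: (b"", Qos(too_slow=False, has_headroom=True)),
             play=lambda chunk: None)
```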
`
`[0034] Although Figure3illustrates the QoS determination as occurring after
`
`each chunk, those of ordinary skill in the art will recognize that other QoS
`
`implementations are common, such as waiting a fixed number of packets or
`
`30
`
`chunks(e.g., every 10th packet) to make a QoS determination. After block 350,
`
`the system loops to block 320