(12) United States Patent
Chou

(10) Patent No.: US 6,637,031 B1
(45) Date of Patent: Oct. 21, 2003

(54) MULTIMEDIA PRESENTATION LATENCY MINIMIZATION
(75) Inventor: Philip A. Chou, Menlo Park, CA (US)
(73) Assignee: Microsoft Corporation, Redmond, WA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
(21) Appl. No.: 09/205,875
(22) Filed:
(51) Int. Cl. .......... H04N 7/173
(52) U.S. Cl. .......... 725/87; 725/94; 725/98; 725/118; 709/219; 709/247; 348/384.1; 348/395.1; 348/438.1; 375/240.1; 375/240.11
(58) Field of Search .......... 725/94, 98, 118, 725/240.26, 87; 370/468, 236, 230-232, 235; 709/219, 247; 348/394.1, 395.1, 409.1, 410.1, 425.1, 438.1; 375/240.1, 240.11, 240.19, 240.08

(56) References Cited
U.S. PATENT DOCUMENTS
5,262,875 A * 11/1993 Mincer et al. ..........
5,659,539 A 8/1997 Porter et al. .......... 395/200.61
5,742,343 A 4/1998 Haskell et al. .......... 348/415
5,886,733 A * 3/1999 Zdepski et al. .......... 348/13
5,982,436 A * 11/1999 Balakrishnan et al. .......... 348/409
6,014,694 A 1/2000 Aharoni et al. .......... 709/219
6,185,625 B1 * 2/2001 Tso et al. .......... 709/247
6,282,206 B1 8/2001 Hindus et al. .......... 370/468

FOREIGN PATENT DOCUMENTS
EP 0695094 1/1996 .......... H04N/7/26

OTHER PUBLICATIONS
Chiang, T., et al., “Hierarchical Coding of Digital Television”, IEEE Communications Magazine, vol. 32, No. 5, 38-45, (May 1, 1994).
Zheng, B., et al., “Multimedia Over High Speed Networks: Reducing Network Requirements with Fast Buffer Fillup”, IEEE Global Telecommunications Conference, NY, XP000825861, 779-784, (1998).

* cited by examiner

Primary Examiner: Vivek Srivastava
Assistant Examiner: Ngoc Vu
(74) Attorney, Agent, or Firm: Lee & Hayes, PLLC

(57) ABSTRACT
To obtain real-time responses with interactive multimedia servers, the server provides at least two different audio/visual data streams. A first data stream has fewer bits per frame and provides a video image much more quickly than a second data stream with a higher number of bits and hence higher quality video image. The first data stream becomes available to a client much faster and may be more quickly displayed on demand while the second data stream is sent to improve the quality as soon as the playback buffer can handle it. In one embodiment, an entire video signal is layered, with a base layer providing the first signal and further enhancement layers comprising the second. The base layer may be actual image frames or just the audio portion of a video stream. The first and second streams are gradually combined in a manner such that the playback buffer does not overflow or underflow.
`
`18 Claims, 6 Drawing Sheets
`
[Front-page drawing: reference numeral 200; labels VIDEO CLIENT, VIDEO CLIENT, VIDEO CAPTURING TOOLS]
`
`Petitioners' Exhibit 1010
`Page 0001
`
`
`
U.S. Patent    Oct. 21, 2003    Sheet 1 of 6    US 6,637,031 B1
[FIG. 1: diagram of an exemplary computer system; drawing labels not legible in the extracted text]
`
`
`
`
`
`
`
`
U.S. Patent    Oct. 21, 2003    Sheet 2 of 6    US 6,637,031 B1
[FIG. 2: example network architecture 200; legible labels include reference numeral 202 and VIDEO CLIENT]
`
`
`
U.S. Patent    Oct. 21, 2003    Sheet 3 of 6    US 6,637,031 B1
[FIG. 3: block diagram of the data flow for a streaming media system]
`
`
`
U.S. Patent    Oct. 21, 2003    Sheet 4 of 6    US 6,637,031 B1
[FIGS. 4A, 4B, 4C, 4D and 4E: schedules plotted as BITS versus TIME. Legible curve labels: A, B, C, D, E, F. Legible annotations: TRANSMISSION DELAY, INITIAL ENCODER BUFFER EMPTINESS, START-UP DELAY.]
`
`
`
U.S. Patent    Oct. 21, 2003    Sheet 5 of 6    US 6,637,031 B1
[FIGS. 5, 6 and 7: schedules plotted as BITS versus TIME]
`
`
`
U.S. Patent    Oct. 21, 2003    Sheet 6 of 6    US 6,637,031 B1
[FIGS. 8, 9 and 10: schedules plotted as BITS versus TIME]
`
`
`
MULTIMEDIA PRESENTATION LATENCY MINIMIZATION

FIELD OF THE INVENTION
The present invention relates generally to multimedia communications and more specifically to latency minimization for on-demand interactive multimedia applications.

COPYRIGHT NOTICE/PERMISSION
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below and in the drawing hereto: Copyright © 1998, Microsoft Corporation, All Rights Reserved.

BACKGROUND
Information presentation over the Internet is changing dramatically. New time-varying multimedia content is now being brought to the Internet, and in particular to the World Wide Web (the web), in addition to textual HTML pages and still graphics. Here, time-varying multimedia content refers to sound, video, animated graphics, or any other medium that evolves as a function of elapsed time, alone or in combination. In many situations, instant delivery and presentation of such multimedia content, on demand, is desired.
“On-demand” is a term for a wide set of technologies that enable individuals to select multimedia content from a central server for instant delivery and presentation on a client (computer or television). For example, video-on-demand can be used for entertainment (ordering movies transmitted digitally), education (viewing training videos) and browsing (viewing informative audiovisual material on a web page), to name a few.
Users are generally connected to the Internet by a communications link of limited bandwidth, such as a 56 kilobits per second (Kbps) modem or an integrated services digital network (ISDN) connection. Even corporate users are usually limited to a fraction of the 1.544 megabits per second (Mbps) T-1 carrier rates. This bandwidth limitation poses a challenge to on-demand systems: it may be impossible to transmit a large amount of image or video data over a limited bandwidth in the short amount of time required for “instant delivery and presentation.” Downloading a large image or video may take hours before presentation can begin. As a consequence, special techniques have been developed for on-demand processing of large images and video.
A technique for providing large images on demand over a communications link with limited bandwidth is progressive image transmission. In progressive image transmission, each image is encoded, or compressed, in layers, like an onion. The first (core) layer, or base layer, represents a low-resolution version of the image. Successive layers represent successively higher resolution versions of the image. The server transmits the layers in order, starting from the base layer. The client receives the base layer, and instantly presents to the user a low-resolution version of the image. The client presents higher resolution versions of the image as the successive layers are received. Progressive image transmission enables the user to interact with the server instantly, with low delay, or low latency. For example, progressive image transmission enables a user to browse through a large database of images, quickly aborting the transmission of the unwanted images before they are completely downloaded to the client.
Similarly, streaming is a technique that provides time-varying content, such as video and audio, on demand over a communications link with limited bandwidth. In streaming, audiovisual data is packetized, delivered over a network, and played as the packets are being received at the receiving end, as opposed to being played only after all packets have been downloaded. Streaming technologies are becoming increasingly important with the growth of the Internet because most users do not have fast enough access to download large multimedia files quickly. With streaming, the client browser or application can start displaying the data before the entire file has been transmitted.
In a video on-demand delivery system that uses streaming, the audiovisual data is often compressed and stored on a disk on a media server for later transmission to a client system. For streaming to work, the client side receiving the data must be able to collect the data and send it as a steady stream to a decoder or an application that is processing the data and converting it to sound or pictures. If the client receives the data more quickly than required, it needs to save the excess data in a buffer. Conversely, if the client receives the data more slowly than required, it needs to play out some of the data from the buffer. Storing part of a multimedia file in this manner before playing the file is referred to as buffering. Buffering can provide smooth playback even if the client temporarily receives the data more quickly or more slowly than required for real-time playback.
There are two reasons that a client can temporarily receive data more quickly or more slowly than required for real-time playback. First, in a variable-rate transmission system such as a packet network, the data arrives at uneven rates. Not only does packetized data inherently arrive in bursts, but even packets of data that are transmitted from the sender at an even rate may not arrive at the receiver at an even rate. This is due to the fact that individual packets may follow different routes, and the delay through any individual router may vary depending on the amount of traffic waiting to go through the router. The variability in the rate at which data is transmitted through a network is called network jitter.
A second reason that a client can temporarily receive data more quickly or more slowly than required for real-time playback is that the media content is encoded at a variable bit rate. For example, high-motion scenes in a video may be encoded with more bits than low-motion scenes. When the encoded video is transmitted with a relatively constant bit rate, then the high-motion frames arrive at a slower rate than the low-motion frames. For both these reasons (variable-rate source encoding and variable-rate transmission channels), buffering is required at the client to allow a smooth presentation.
Unfortunately, buffering implies delay, or latency. Start-up delay refers to the latency the user experiences after he signals the server to start transmitting data from the beginning of the content (such as when a pointer to the content is selected by the user) before the data can be decoded by the client system and presented to the user. Seek delay refers to the latency the user experiences after he signals to the server to start transmitting data from an arbitrary place in the middle of the content (such as when a seek bar is dragged to a particular point in time) before the data can be decoded and presented. Both start-up and seek delays occur because even after the client begins to receive new data, it must wait until its buffer is sufficiently full to begin playing out of the buffer. It does this in order to guard against future buffer underflow due to network jitter and variable-bit rate compression. For typical audiovisual coding on the Internet, start-up and seek delays between two and ten seconds are common.
Large start-up and seek delays are particularly annoying when the user is trying to browse through a large amount of audiovisual content trying to find a particular video or a particular location in a video. As in the image browsing scenario using progressive transmission, most of the time the user will want to abort the transmission long before all the data are downloaded and presented. In such a scenario, delays of two to ten seconds between aborts seem intolerable. What is needed is a method for reducing the start-up and seek delays for such “on demand” interactive multimedia applications.
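The start-up delay described in the Background can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the patent: it assumes a constant-rate channel with no jitter, a hypothetical list of per-frame sizes, and a fixed frame rate, and it computes the smallest delay a conventional client would have to wait before playback so that its buffer never underflows. The function name `min_startup_delay` and all parameter names are assumptions.

```python
def min_startup_delay(frame_bits, channel_bps, fps):
    """Smallest pre-roll (seconds) so that every frame has fully arrived
    by its playback time, assuming a constant-rate channel with no jitter.

    frame_bits  -- hypothetical list of encoded frame sizes, in bits
    channel_bps -- assumed constant transmission rate, bits per second
    fps         -- playback frame rate, frames per second
    """
    delay = 0.0
    received = 0
    for i, bits in enumerate(frame_bits):
        received += bits
        arrival = received / channel_bps   # time the last bit of frame i arrives
        due = i / fps                      # time frame i is shown if playback starts at 0
        delay = max(delay, arrival - due)  # playback must be postponed at least this long
    return delay
```

With ten 1,000-bit frames, a 10,000 bit/s channel, and 10 frames per second, the result is approximately 0.1 s; enlarging only the first frame to 4,000 bits raises it to approximately 0.4 s, which shows how variable-rate encoding alone lengthens the required wait.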
`SUMMARY OF THE INVENTION
The above-identified problems, shortcomings and disadvantages with the prior art, as well as other problems, shortcomings and disadvantages, are solved by the present invention, which will be understood by reading and studying the specification and the drawings. The present invention minimizes the start-up and seek delays for on-demand interactive multimedia applications, when the transmission bit rate is constrained.
In one embodiment, a server provides at least two different data streams. A first data stream is a low resolution stream encoded at a bit rate below the transmission bit rate. A second data stream is a normal resolution stream encoded at a bit rate equal to the transmission bit rate. The server initially transmits the low resolution stream faster than real time, at a bit rate equal to the transmission bit rate. The client receives the low resolution stream faster than real time, but decodes and presents the low resolution stream in real time.
Unlike previous systems, the client does not need to wait for its buffer to become safely full before beginning to decode and present. The reason is that even at the beginning of the transmission, when the client buffer is nearly empty, the buffer will not underflow, because it is being filled at a rate faster than real time, but is being played out at a rate equal to real time. Thus, the client can safely begin playing out of its buffer as soon as data are received. In this way, the delay due to buffering is reduced to nearly zero.
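The no-underflow argument above can be checked numerically. The following is a minimal sketch, not part of the patent: it assumes perfectly constant rates and no network jitter, and the function name `buffer_levels` and its parameters are hypothetical.

```python
def buffer_levels(low_rate_bps, channel_bps, seconds, step=0.1):
    """Client buffer level (in bits) sampled every `step` seconds when a
    stream encoded at `low_rate_bps` is sent at `channel_bps` and playback
    starts as soon as the first data arrives (no pre-roll).

    Assumes constant rates and no jitter; all names are illustrative.
    """
    levels = []
    n = int(round(seconds / step))
    for k in range(n + 1):
        t = k * step
        arrived = channel_bps * t      # bits received so far (faster than real time)
        consumed = low_rate_bps * t    # bits decoded so far (real time)
        levels.append(arrived - consumed)
    return levels
```

As an invented example, a 20 Kbps low resolution stream sent over the 56 Kbps modem link mentioned in the Background gives `buffer_levels(20000, 56000, 2.0, step=0.5)`, which never goes negative and grows by 18,000 bits every half second.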
When the client buffer has grown sufficiently large to guard against future underflow by the normal resolution stream, the server stops transmission of the low resolution stream and begins transmission of the normal resolution stream. The system of the present invention reduces the start-up or seek delay for interactive multimedia applications such as video on-demand, at the expense of initially lower quality. The invention includes systems, methods, computers, and computer-readable media of varying scope.
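One simple way to estimate when the switch to the normal resolution stream could happen is sketched below. This is an assumption-laden illustration rather than the patent's method: it supposes constant bit rates, no jitter, and a switching threshold expressed as seconds of buffered, not-yet-played content. The name `switch_time` and its parameters are hypothetical.

```python
def switch_time(low_rate_bps, channel_bps, target_seconds):
    """Seconds of fast-start transmission needed before the client holds
    `target_seconds` of not-yet-played content.

    Each second, channel_bps / low_rate_bps seconds of content arrive and
    one second is played, so the backlog grows by (channel/low - 1) per second.
    """
    if low_rate_bps >= channel_bps:
        raise ValueError("low-resolution rate must be below the channel rate")
    growth = channel_bps / low_rate_bps - 1.0
    return target_seconds / growth
```

Under these assumptions a stream encoded at half the channel rate accumulates one second of backlog per second of transmission, so `switch_time(28000, 56000, 5.0)` evaluates to 5.0. A lower encoding rate reaches the threshold sooner, at the cost of lower initial quality.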
Besides the embodiments, advantages and aspects of the invention described here, the invention also includes other embodiments, advantages and aspects, as will become apparent by reading and studying the drawings and the following description.
`BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram of an exemplary computer system in which the invention may be implemented.
FIG. 2 is a diagram of an example network architecture in which embodiments of the present invention are incorporated.
FIG. 3 is a block diagram representing the data flow for a streaming media system for use with the computer network of FIG. 2.
FIGS. 4A, 4B, 4C, 4D, and 4E are schedules illustrating data flow for example embodiments of the streaming media system of FIG. 3.
FIG. 5 is a decoding schedule for multimedia content pre-encoded at a full bit rate.
FIG. 6 is a schedule showing the full bit rate encoding of FIG. 5 advanced by T seconds.
FIG. 7 is a schedule showing a low bit rate encoding of the content shown in FIG. 5.
FIG. 8 is a schedule showing the low bit rate encoding schedule of FIG. 7 advanced by T seconds and superimposed on the advanced schedule of FIG. 6.
FIG. 9 is a schedule showing the transition from the delivery of the low bit rate encoded stream of FIG. 7 to the data stream of FIG. 6, with a gap to indicate optional bit stuffing.
FIG. 10 is a schedule showing the advanced schedule of FIG. 6 with a total of RT bits removed from the initial frames.
`
`DESCRIPTION OF THE EMBODIMENTS
In the following detailed description of the embodiments, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that structural, logical and electrical changes may be made without departing from the scope of the present inventions. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present inventions is defined only by the appended claims.
The present invention is a system for achieving low latency responses from interactive multimedia servers, when the transmission bit rate is constrained. A server provides at least two different data streams. A first data stream is a low resolution stream encoded at a bit rate below the transmission bit rate. A second data stream is a normal resolution stream encoded at a bit rate equal to the transmission bit rate. The server initially transmits the low resolution stream faster than real time, at a bit rate equal to the transmission bit rate. The client receives the low resolution stream faster than real time, but decodes and presents the low resolution stream in real time. When the client buffer has grown sufficiently large to guard against future underflow by the normal resolution stream, the server stops transmission of the low resolution stream and begins transmission of the normal resolution stream. The system of the present invention reduces the start-up or seek delay for interactive multimedia applications such as video on-demand, at the expense of initially lower quality.
The detailed description of this invention is divided into four sections. The first section provides a general description of a suitable computing environment in which the invention may be implemented including an overview of a network architecture for generating, storing and transmitting audio/visual data using the present invention. The second section illustrates the data flow for a streaming media system for use with the network architecture described in the first section. The third section describes the methods of exemplary embodiments of the invention. The fourth section is a conclusion which includes a summary of the advantages of the present invention.
Computing Environment. FIG. 1 provides a brief, general description of a suitable computing environment in which the invention may be implemented. The invention will hereinafter be described in the general context of computer-executable program modules containing instructions executed by a personal computer (PC). Program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Those skilled in the art will appreciate that the invention may be practiced with other computer-system configurations, including hand-held devices, multiprocessor systems, microprocessor-based programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
FIG. 1 employs a general-purpose computing device in the form of a conventional personal computer 20, which includes processing unit 21, system memory 22, and system bus 23 that couples the system memory and other system components to processing unit 21. System bus 23 may be any of several types, including a memory bus or memory controller, a peripheral bus, and a local bus, and may use any of a variety of bus structures. System memory 22 includes read-only memory (ROM) 24 and random-access memory (RAM) 25. A basic input/output system (BIOS) 26, stored in ROM 24, contains the basic routines that transfer information between components of personal computer 20. BIOS 26 also contains start-up routines for the system. Personal computer 20 further includes hard disk drive 27 for reading from and writing to a hard disk (not shown), magnetic disk drive 28 for reading from and writing to a removable magnetic disk 29, and optical disk drive 30 for reading from and writing to a removable optical disk 31 such as a CD-ROM or other optical medium. Hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to system bus 23 by a hard-disk drive interface 32, a magnetic-disk drive interface 33, and an optical-drive interface 34, respectively. The drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for personal computer 20. Although the exemplary environment described herein employs a hard disk, a removable magnetic disk 29 and a removable optical disk 31, those skilled in the art will appreciate that other types of computer-readable media which can store data accessible by a computer may also be used in the exemplary operating environment. Such media may include magnetic cassettes, flash memory cards, digital versatile disks, Bernoulli cartridges, RAMs, ROMs, and the like.
Program modules may be stored on the hard disk, magnetic disk 29, optical disk 31, ROM 24 and RAM 25. Program modules may include operating system 35, one or more application programs 36, other program modules 37, and program data 38. A user may enter commands and information into personal computer 20 through input devices such as a keyboard 40 and a pointing device 42. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 21 through a serial-port interface 46 coupled to system bus 23; but they may be connected through other interfaces not shown in FIG. 1, such as a parallel port, a game port, or a universal serial bus (USB). A monitor 47 or other display device also connects to system bus 23 via an interface such as a video adapter 48. In addition to the monitor, personal computers typically include other peripheral output devices (not shown) such as speakers and printers.
Personal computer 20 may operate in a networked environment using logical connections to one or more remote computers such as remote computer 49. Remote computer 49 may be another personal computer, a server, a router, a network PC, a peer device, or other common network node. It typically includes many or all of the components described above in connection with personal computer 20; however, only a storage device 50 is illustrated in FIG. 1. The logical connections depicted in FIG. 1 include local-area network (LAN) 51 and a wide-area network (WAN) 52. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
When placed in a LAN networking environment, PC 20 connects to local network 51 through a network interface or adapter 53. When used in a WAN networking environment such as the Internet, PC 20 typically includes modem 54 or other means for establishing communications over network 52. Modem 54 may be internal or external to PC 20, and connects to system bus 23 via serial-port interface 46. In a networked environment, program modules depicted as residing within PC 20 or portions thereof may be stored in remote storage device 50. Of course, the network connections shown are illustrative, and other means of establishing a communications link between the computers may be substituted.
FIG. 2 is a diagram of an example network architecture 200 in which embodiments of the present invention are implemented. The example network architecture 200 comprises video capturing tools 202, a video server 204, a network 206 and one or more video clients 208.
The video capturing tools 202 comprise any commonly available devices for capturing video and audio data, encoding the data and transferring the encoded data to a computer via a standard interface. The example video capturing tools 202 of FIG. 2 comprise a camera 210 and a computer 212 having a video capture card, compression software and a mass storage device. The video capturing tools 202 are coupled to a video server 204 having streaming software and optionally having software tools enabling a user to manage the delivery of the data.
The video server 204 comprises any commonly available computing environment such as the exemplary computing environment of FIG. 1, as well as a media server environment that supports on-demand distribution of multimedia content. The media server environment of video server 204 comprises streaming software, one or more data storage units for storing compressed files containing multimedia data, and a communications control unit for controlling information transmission between video server 204 and video clients 208. The video server 204 is coupled to a network 206 such as a local-area network or a wide-area network. Audio, video, illustrated audio, animations, and other multimedia data types are stored on video server 204 and delivered by an application on-demand over network 206 to one or more video clients 208.
The video clients 208 comprise any commonly available computing environments such as the exemplary computing environment of FIG. 1. The video clients 208 also comprise any commonly available application for viewing streamed multimedia file types, including QuickTime (a format for video and animation), RealAudio (a format for audio data), RealVideo (a format for video data), ASF (Advanced Streaming Format) and MP4 (the MPEG-4 file format). Two video clients 208 are shown in FIG. 2. However, those of ordinary skill in the art can appreciate that video server 204 may communicate with a plurality of video clients.
In operation, for example, a user clicks on a link to a video clip or other video source, such as camera 210 used for video conferencing or other purposes, and an application program for viewing streamed multimedia files launches from a hard disk of the video client 208. The application begins loading in a file for the video which is being transmitted across the network 206 from the video server 204. Rather than waiting for the entire video to download, the video starts playing after an initial portion of the video has come across the network 206 and continues downloading the rest of the video while it plays. The user does not have to wait for the entire video to download before the user can start viewing. However, in existing systems there is a delay for such “on demand” interactive applications before the user can start viewing the initial portion of the video. The delay, referred to herein as a start-up delay or a seek delay, is experienced by the user between the time when the user signals the video server 204 to start transmitting data and the time when the data can be decoded by the video client 208 and presented to the user. However, the present invention, as described below, achieves low latency responses from video server 204 and thus reduces the start-up delay and the seek delay.
An example computing environment in which the present invention may be implemented has been described in this section of the detailed description. In one embodiment, a network architecture for on-demand distribution of multimedia content comprises video capture tools, a video server, a network and one or more video clients.
Data Flow for a Streaming Media System. The data flow for an example embodiment of a streaming media system is described by reference to FIGS. 3, 4A, 4B, 4C, 4D and 4E. FIG. 3 is a block diagram representing the data flow for a streaming media system 300 for use with the network architecture of FIG. 2. The streaming media system 300 comprises an encoder 302 which may be coupled to camera 210 or other real time or uncompressed video sources, an encoder buffer 304, a network 306, a decoder buffer 308 and a decoder 310.
The encoder 302 is a hardware or software component that encodes and/or compresses the data for insertion into the encoder buffer 304. The encoder buffer 304 is one or more hardware or software components that stores the encoded data until such time as it can be released into the network 306. For live transmission such as video conferencing, the encoder buffer 304 may be as simple as a first-in first-out (FIFO) queue. For video on-demand from a video server 204, the encoder buffer 304 may be a combination of a FIFO queue and a disk file on the capture tools 202, transmission buffers between the capture tools 202 and the video server 204, and a disk file and output FIFO queue on the video server 204. The decoder buffer 308 is a hardware or software component that receives encoded data from the network 306, and stores the encoded data until such time as it can be decoded by decoder 310. The decoder 310 is a hardware or software component that decodes and/or decompresses the data for display.
In operation, each bit produced by the encoder 302 passes point A 312, point B 314, point C 316, and point D 318 at a particular instant in time. A graph of times at which bits cross a given point is referred to herein as a schedule. The schedules at which bits pass point A 312, point B 314, point C 316, and point D 318 can be illustrated in a diagram such as shown in FIGS. 4A, 4B, 4C, 4D and 4E.
FIGS. 4A, 4B, 4C, 4D and 4E are schedules illustrating data flow for example embodiments of the streaming media system of FIG. 3. As shown in FIGS. 4A, 4B, 4C, 4D and 4E, the y-axis corresponds to the total number of bits that have crossed the respective points (i.e. point A, point B, point C, and point D in FIG. 3) and the x-axis corresponds to elapsed time. In the example shown in FIG. 4A, schedule A corresponds to the number of bits transferred from the encoder 302 to the encoder buffer 304. Schedule B corresponds to the number of bits that have left the encoder buffer 304 and entered the network 306. Schedule C corresponds to the number of bits received from the network 306 by the decoder buffer 308. Schedule D corresponds to the number of bits transferred from the decoder buffer 308 to the decoder 310.
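The schedules defined above can be modeled as cumulative bit counts over time. The sketch below is an illustration under stated assumptions, not code from the patent: the `Schedule` class, the `fullness` helper, and the `(time, bits)` event format are all hypothetical. It relies on an inference from the definitions rather than a statement in this passage, namely that the bits held in a buffer at any instant equal the bits that have entered it minus the bits that have left it (A minus B for the encoder buffer, C minus D for the decoder buffer).

```python
from bisect import bisect_right


class Schedule:
    """Cumulative number of bits that have crossed a point, as a step function of time."""

    def __init__(self, events):
        # events: iterable of (time_seconds, bits) pairs; hypothetical input format
        self.times = []
        self.totals = []
        total = 0
        for t, bits in sorted(events):
            total += bits
            self.times.append(t)
            self.totals.append(total)

    def bits_at(self, t):
        """Total bits that have crossed the point at or before time t."""
        i = bisect_right(self.times, t)
        return self.totals[i - 1] if i else 0


def fullness(inflow, outflow, t):
    """Bits held between two points at time t (for example, schedule C
    minus schedule D gives the decoder buffer occupancy)."""
    return inflow.bits_at(t) - outflow.bits_at(t)
```

For example, if schedule C receives 100 bits at 0 s, 1 s and 2 s and schedule D removes 100 bits at 1.5 s and 2.5 s, then `fullness(c, d, 1.0)` is 200 and `fullness(c, d, 3.0)` is 100. A negative value would indicate that the decoder asked for bits that had not yet arrived, which is the underflow condition.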
In the example shown in FIG. 4B, the network 306 has a constant bit rate and a constant delay. As a result, schedules B and C are linear and are separated temporally by a constant transmission delay.
In the example shown in FIG. 4C, the network 306 is a packet network. As a result, schedules B and C have a staircase form. The transmission delay is generally not constant. Nevertheless, there exist linear schedules B' and C' that provide lower and upper bounds for schedules B and C respectively. Schedule B' is the latest possible linear schedule at which encoded bits are guaranteed to be available for transmission. Schedule C' is the earliest possible linear schedule at which received bits are guaranteed to be available for decoding. The gap between schedules B' and C' is the maximum reasonable transmission delay (including jitter and any retransmission time) plus an allowance for the packetization itself. In this way, a packet network can be reduced, essentially, to a constant bit rate, constant delay channel.
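The idea of bounding a staircase schedule by a linear one can be illustrated for a single observed packet trace. The sketch below is one possible reading of schedule C', offered as an assumption rather than the patent's procedure: given packet arrival times and a chosen slope, it finds the earliest start time of a straight-line schedule that never requests a bit before that bit has arrived. The function name and the `(arrival_time, bits)` trace format are hypothetical, and a real system would need a bound that holds for all reasonable delays, not just one trace.

```python
def earliest_linear_start(arrivals, rate_bps):
    """Earliest start time t0 such that the linear schedule
    rate_bps * (t - t0) never runs ahead of the bits actually received.

    arrivals -- hypothetical list of (arrival_time_seconds, packet_bits),
                sorted by arrival time
    rate_bps -- slope of the linear schedule, bits per second
    """
    t0 = float("-inf")
    before = 0  # bits received strictly before the current packet
    for arrival_time, bits in arrivals:
        # Just before this packet lands only `before` bits are available,
        # so the line must not have passed `before` bits by arrival_time.
        t0 = max(t0, arrival_time - before / rate_bps)
        before += bits
    return t0
```

For packets of 1,000 bits arriving at 0.0 s, 0.1 s and 0.5 s with a 10,000 bit/s slope, the late third packet pushes the result to roughly 0.3 s: the linear schedule has to start that much later to absorb the jitter.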
`Referring now to the example shown in FIG. 4D, for
`real-time applications the end-to-end delay (from capture