by M. H. Willebeek-LeMair
K. G. Kumar
E. C. Snible

Bamba—Audio and video streaming over the Internet

The World Wide Web has become a primary means of disseminating information, which is being presented increasingly through multiple media. The ability to broadcast audio and video information is becoming a reality with the advent of new media-streaming technologies. Most of the emerging streaming systems require high-bandwidth connections in order to deliver audio and video of suitable quality. In this paper we present a media-streaming system, called Bamba, that delivers audio and video over low-bandwidth modem connections with the use of standard compression technologies. Bamba offers high-quality audio and video over low-bit-rate connections and can operate using a standard HTTP server. The Bamba video is enhanced with special provisions for reducing the effect of errors in a lossy-network environment. Bamba adheres to existing standards wherever possible. Finally, Bamba has been fully implemented and deployed both internally at IBM and externally.

1. Introduction
The World Wide Web (WWW) has become a primary means of disseminating information. Initially, the type of information distributed was primarily in the form of text and graphics. Later, images and stored audio and video files emerged. These audio and video files are downloaded from a server and stored at the client before they are played. Most recently, streamed audio and video have become available from both stored and live sources on the Web. Audio and video streaming enables clients to select and receive audio and video content from servers across the network and to begin hearing and seeing the content as soon as the first few bytes of the stream arrive at the client. Streaming technology involves audio and video compression, schemes for stream formatting and transmission packetization, networking protocols and routing, client designs for displaying and synchronizing different media streams, and server designs for content storage and delivery. In this paper we present a system for audio and video streaming (with code name Bamba) developed at the IBM Thomas J. Watson Research Center. Bamba has been deployed within IBM and was demonstrated externally on the official Web site of the 1996 Olympics. It has since been made available for free download from the IBM AlphaWorks* Web site.¹

Today's computer-network infrastructures, including the Internet, were not designed with streaming in mind. Streaming media requires that data be transmitted from a server to a client at a sustained bit rate that is high enough to maintain continuous and smooth playback at the receiving client station. A primary objective in developing Bamba is to stream audio and video across the Web through very-low-bit-rate connections.

¹ http://www.alphaworks.ibm.com

Audio is sufficiently compressed to stream over modem connections at 14.4 Kb/s, and video at 28.8 Kb/s. The system that has been developed not only achieves the low-bit-rate goal, but can also be extended to support higher-bit-rate streams to provide higher-quality streaming over intranets or higher-bandwidth Internet connections. Furthermore, when streaming is not possible because of congestion or insufficient bandwidth availability, the Bamba player (client software) at the receiving client automatically calculates how much data to preload in order to maintain continuous playback. This allows clients connected via low-bit-rate connections to fall back to a download-and-play mode and still receive the higher-bit-rate content.

Existing audio and video streaming technologies
In recent years, there has been much research and development in the areas of audio and video streaming as well as videoconferencing. Videoconferencing differs from audio and video streaming in that the communication is bidirectional, and end-to-end delays must be very low (<200 ms) for interactive communication. In fact, videoconferencing standards are quite mature and have emerged from the International Telecommunication Union (ITU) in the form of the H.3xx standards [1, 2], and from the Internet Engineering Task Force (IETF) in conjunction with the multicast backbone (MBone) [1, 3, 4]. In general, the two camps use the same audio and video compression standards (defined by the ITU) but differ in their networking protocol specifications.

Audio and video streaming differs technically from its videoconferencing counterpart in that it can afford greater flexibility in end-to-end delays when the data is transmitted across a network and in the fact that stored content may be manipulated off-line with additional processing. These begin to merge when one considers live audio and video streaming applications (e.g., Internet radio and TV). The most relevant of the ITU standards is H.323, which defines audio/visual services over LANs for which quality of service cannot be guaranteed [5]. This standard specifies a variety of audio and video coders and decoders (CODECs) as well as signaling protocols to negotiate capabilities and set up and manage connections [6]. The underlying transport specified is the Real-time Transport Protocol (RTP) [7]. This protocol, defined by the IETF, is intended to provide a means of transporting real-time streams over Internet Protocol (IP) networks. A new protocol, the Real Time Streaming Protocol (RTSP), just proposed to the IETF, more directly addresses the issues of delivering and managing multimedia streams [8]. Clearly, this area is still evolving as new protocols are being defined and refined to satisfy a wide range of emerging networked multimedia applications.

There are a large number of audio and video streaming systems available in the market today [9]. These include VDOLive**,² StreamWorks**,³ Vosaic**,⁴ VivoActive**,⁵ InterVU**,⁶ and RealAudio**.⁷ VDOLive, StreamWorks, Vosaic, and RealAudio are based on proprietary client-server systems that transport their audio and video streams by means of User Datagram Protocol (UDP/IP) connections. This unreliable transport does not retransmit lost packets and is blocked by most firewalls unless they are specially reconfigured. The others use HTTP (based on TCP/IP) [10]. VDOLive employs a proprietary hierarchical compression technique that allows the server to adapt the video-stream bandwidth to the available network connection bandwidth. StreamWorks, Vosaic, and InterVU are based on MPEG** [11], while Vivo uses H.263 [12]. In general, these systems are designed to work over higher-bandwidth LAN connections and not at modem speeds. At modem speeds, the MPEG-based systems revert to slide-show-type video.

Bamba is a streaming system that was designed to run over existing computer network infrastructures. In particular, it is versatile in dealing with the heterogeneous nature of this environment and the unpredictable congestion behavior of today's network traffic. In the Bamba system, audio and video are compressed into a Bamba file. This file is specially formatted to interleave the audio and video content and may even be extended to include other data types. The Bamba file is placed on a server. A client equipped with the appropriate Bamba software is able to communicate with the server and receive the Bamba audio/video file. If the network conditions are suitable (sufficient sustained bandwidth is available), this file, streaming across the network, is played at the client immediately. Otherwise, the file is played once uninterrupted playback can be ensured.

The Bamba streaming system has several key features. The first of these is the quality of the audio and video, where the audio is set at a constant 6.3 Kb/s and the video ranges from very low bit rates of tens of kilobits per second to hundreds of kilobits per second. The second is the fact that both the audio and video compression are based on standard algorithms and can be performed by standards-compliant decoders. Third, the Bamba streaming system uses either a standard HTTP server or an enhanced video server running RTP over UDP/IP. In the HTTP case, no special server software is required to store and send Bamba clips, and the transmitted streams can traverse firewalls with no special firewall configuration requirements.

² http://www.vdo.net
³ http://www.xingtech.com
⁴ http://www.vosaic.com
⁵ http://www.vivo.com
⁶ http://www.intervu.com
⁷ http://www.realaudio.com

[Figure 1: Bamba system block diagram. The server side consists of a file server and a network interface; the client side consists of a network interface, HTTP, the Netscape browser with its plug-in interface, and the Bamba plug-in containing a video decoder and renderer and an audio decoder and renderer.]

In the case of the video server running RTP over UDP/IP, additional functionality is provided by means of a control protocol between the client and server. This functionality includes pacing of the transmission stream at a target bit rate as well as specific start and end times of transmission within an audio or video file. Finally, the Bamba player has been implemented as a helper application, which runs outside a Web browser; as a browser plug-in, which enables application developers to embed audio and video clips easily within an HTML document; or as a Java** applet, which can be downloaded directly from a Web server containing Bamba clips without requiring special software installation at the client.

The rest of this paper is organized as follows. In Section 2, we describe the underlying Bamba technology. This includes a description of the video-compression algorithm as well as details related to the overall system design. In Section 3, we describe several enhancements made to the basic Bamba streaming system, such as increased robustness in lossy-network environments. A description of the Live Bamba architecture is given in Section 4. The paper is summarized in Section 5.

2. Bamba technology
A base requirement of the Bamba streaming system is to function within the WWW standard HTTP-based client-server architecture. In this section, we provide a description of the overall client-server architecture and present details concerning the compression algorithms. We also describe the Bamba file format and synchronization technique.

• Bamba streaming architecture
A block diagram of the Bamba streaming system is presented in Figure 1. The system consists of a client and a server component. The server is a standard HTTP Web server, which contains the stored Bamba audio and video files. The client consists of a Web browser and the Bamba audio and video plug-in software.

The Bamba plug-ins are implemented as a set of dynamic link libraries that interface with the Web browser through the Netscape-defined plug-in API. Netscape has defined a set of plug-in routines that are used to communicate between the plug-in and the browser. Each plug-in library contains an initialization routine within which is declared which Netscape plug-in routines are used by the plug-in. These routines include mechanisms to create and delete instances of a plug-in, manage the plug-in display window, control the flow of data streams to the plug-in, etc. In general, the plug-in is tightly integrated with the browser. Note that while Netscape was used in this example, the approach is similar for other browsers.

Bamba files may be embedded in HTML pages by means of a URL pointing to a file on an HTTP or video server. When the URL is requested, the server passes the metadata identifying the Bamba file and containing information about the file type to the client. The file type is used by the browser to launch the appropriate plug-in to play back the Bamba file.

Bamba was designed to stream clips from standard HTTP Web servers without special streaming software on the server. As such, Bamba is limited to the communication mechanisms provided by the HTTP protocol. This approach has certain advantages, the greatest of which is that it is simple and maps gracefully into the existing Web browsing architecture. As a result, content creators can easily produce Bamba audio and video clips and embed them in standard HTML [13] pages, which are then loaded onto and accessed from a standard HTTP server.

[Figure 2: Video-compression algorithms: (a) MPEG I-, P-, and B-frame compression dependencies; (b) H.263 I and P macro-block dependencies.]

Since the underlying transport protocol used by HTTP is TCP/IP, which provides reliable end-to-end network connections, no special provisions are required for handling packet loss within the network. In essence, a Bamba audio or video clip is treated like any other HTTP object, such as an HTML or JPEG [14] file. If selected, the Bamba clip is transferred to the client (browser station) as fast as TCP/IP can move it, and the client begins decoding and displaying the Bamba file as soon as the first few bytes arrive.

Since Bamba uses TCP/IP as the underlying communication protocol, the streams can traverse firewalls with no special configuration requirements. In general, systems based on UDP/IP cannot traverse firewalls without explicit permission changes in the firewall to allow passage to the UDP/IP packets. This is because UDP/IP packets are easier to imitate than TCP/IP packets, since the UDP/IP protocol involves no end-to-end handshakes or sequence numbers [15].

• Bamba audio and video technology
The audio and video technology used in Bamba is based on standard algorithms originally defined within the ITU H.324 standard for video telephony over regular phone lines [16]. The audio standard, G.723, specifies two bit rates: 5.3 Kb/s and 6.3 Kb/s [17]. Bamba uses the higher-bit-rate CODEC, which compresses an 8-kHz input of 16-bit samples to a fixed 6.3-Kb/s stream. This audio algorithm is optimized to represent speech at high quality over low-bit-rate connections. It encodes speech into 30-ms frames by means of linear predictive analysis-by-synthesis coding [17]. The excitation signal for the higher-bit-rate coder is coded using Multipulse Maximum Likelihood Quantization (MP-MLQ) [17].
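
To make the audio numbers concrete, the short sketch below is an illustrative calculation (not part of the Bamba code) that derives the per-frame figures implied by the G.723.1 parameters quoted above: an 8-kHz, 16-bit input and 30-ms frames at 6.3 Kb/s.

```c
#include <stdio.h>

int main(void)
{
    const double sample_rate_hz = 8000.0;   /* 8-kHz input, 16-bit samples */
    const double frame_length_s = 0.030;    /* G.723.1 encodes 30-ms frames */
    const double bit_rate_bps   = 6300.0;   /* higher-rate G.723.1 CODEC   */

    /* Samples consumed per frame: 8000 * 0.030 = 240 samples (480 raw bytes). */
    double samples_per_frame = sample_rate_hz * frame_length_s;

    /* Compressed bits per frame: 6300 * 0.030 = 189 bits, i.e. ~24 bytes.     */
    double bits_per_frame = bit_rate_bps * frame_length_s;

    printf("samples/frame : %.0f (raw %.0f bytes)\n",
           samples_per_frame, samples_per_frame * 2);
    printf("bits/frame    : %.0f (~%.0f bytes compressed)\n",
           bits_per_frame, bits_per_frame / 8);
    printf("compression   : %.1f : 1\n",
           (samples_per_frame * 16) / bits_per_frame);
    return 0;
}
```
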
The Bamba video CODEC complies with the H.263 video compression standard [12], which uses an approach based on the discrete cosine transform (DCT). This is similar to the technology used for MPEG. Unlike MPEG, which uses intrapicture frames (I-frames), predicted frames (P-frames), and bidirectional predicted frames (B-frames), H.263 does not define I- and P-frames, but rather I- and P-blocks, 8-pixel by 8-pixel subregions of a frame. Figure 2(a) illustrates the MPEG dependencies among I-, P-, and B-frames, while Figure 2(b) illustrates the partitioning of H.263 frames into I- and P-blocks and the dependencies between blocks. Representing frames as collections of I- and P-blocks reduces the size variance between frames and adds flexibility in selecting the refresh distance between I-blocks for different regions of the video image. To maximize compression based on temporal redundancy, there may be long intervals between I-blocks for regions in the image that are not changing.

The H.263 algorithm is designed to deliver video over very-low-bit-rate (<64 Kb/s) dedicated connections. In this low-bit-rate range, H.263 has been demonstrated to outperform its predecessor, H.261 [18], by a 2.5:1 ratio [2] (i.e., at the same bandwidth, the signal-to-noise ratio of H.263 is 2.5 times higher than that of H.261). H.263 can also be easily extended to higher bit rates, in the 100–200-Kb/s range. These rates are suitable for streaming over ISDN or intranet LAN-type connections. The H.263 video compression algorithm uses a planar YVU12 format, which contains three components: a luminance plane (Y) and two chrominance planes (V and U). The sizes of these planes vary as a function of the video resolution. Two of the resolutions supported by Bamba are the Common Intermediate Format (CIF) and the Quarter Common Intermediate Format (QCIF) [2]; the formats are presented in Table 1. Smaller and intermediate-size resolutions are also supported. The resolution and target bit rate are selected at compression time. The compression target bit rate may be set anywhere between 10 and 356 Kb/s.
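
As a rough illustration of the planar YVU12 layout described above, the following sketch computes the plane dimensions and the 12-bits-per-pixel storage cost of an uncompressed frame; it is derived from the CIF/QCIF dimensions in Table 1, not taken from the Bamba source.

```c
#include <stdio.h>

/* Print the Y, V, and U plane dimensions for a planar YVU12 frame.
 * The chrominance planes are subsampled by 2 in each direction, so the
 * total cost is width*height*1.5 bytes, i.e. 12 bits per pixel.         */
static void yvu12_planes(const char *name, int width, int height)
{
    int y_bytes = width * height;
    int c_bytes = (width / 2) * (height / 2);      /* one chroma plane */
    int total   = y_bytes + 2 * c_bytes;

    printf("%-5s Y: %3dx%3d  V/U: %3dx%3d  frame: %d bytes (%.1f bits/pixel)\n",
           name, width, height, width / 2, height / 2,
           total, 8.0 * total / (width * height));
}

int main(void)
{
    yvu12_planes("CIF",  352, 288);   /* 152,064 bytes per uncompressed frame */
    yvu12_planes("QCIF", 176, 144);   /*  38,016 bytes per uncompressed frame */
    return 0;
}
```
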
As with many compression standards, the H.263 standard specifies the format of the video so that any standards-compliant decoder can successfully decode the video stream. Typically, this leaves much flexibility in the actual encoding technique and implementation. The H.263 encoding used for Bamba uses an innovative algorithm to trade frame rate for frame quality [19]. The art in video compression lies in the decision of how best to apportion a few bits to different components in the compression process so that the compressed stream, once decoded and displayed, produces the highest quality as perceived by the end user. Quality is highly ambiguous and is perceived differently by different users. A typical tradeoff is between frame rate and frame quality (pixel quantization). For the same number of bits, it is possible to create two very different standards-compliant streams. One stream may have a higher frame rate, while the other may have a finer quantization of the frame pixels, obtaining a sharper image.

The Bamba video implementation incorporates a dynamic frame-rate-control algorithm, which trades frame rate for frame quality (bits per frame) while maintaining a constant average bit rate. This approach allows the video to balance between the two extremes and deliver smoother motion or sharper images as appropriate, depending on the content and scene changes in the video. The algorithm behavior is illustrated in Figure 3. A video sequence with dynamically changing content is used to illustrate the algorithm's adaptable frame rate. The original clip is approximately 30 seconds long, captured at 15 frames per second for a total of 445 frames. It was compressed at a target bit rate of 20 Kb/s and resulted in a total of 332 frames. Typically, larger frames are followed by a drop in frame rate in order to maintain the constant bit rate. The spikes in the figure correspond to larger frames, generated when the scene changes or the amount of motion in a scene is significant. These spikes are typically followed by several frame periods in which no data is transmitted at all.
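
The frame-rate/quality tradeoff can be pictured as a simple bit-budget loop. The code below is a toy model of the behavior just described (a large frame is sent and is then followed by skipped frame periods until the average bit rate recovers); it is not the actual Bamba rate-control algorithm, and the per-frame sizes are invented for illustration.

```c
#include <stdio.h>

/* Toy rate-control loop: the bit budget refills at the target rate once per
 * capture tick; sending a frame spends its bits.  A large frame (e.g. a
 * scene change) drives the budget negative, and capture frames are skipped
 * until the budget recovers, which lowers the frame rate temporarily.      */
int main(void)
{
    const double target_bps    = 20000.0;          /* 20-Kb/s target stream */
    const double capture_fps   = 15.0;             /* 15 frames/s source    */
    const double bits_per_tick = target_bps / capture_fps;

    /* Invented per-frame sizes (bits): a scene change at tick 3. */
    double frame_bits[] = { 900, 950, 870, 6200, 1000, 950, 900, 880 };
    int    n = sizeof(frame_bits) / sizeof(frame_bits[0]);

    double budget = 0.0;
    for (int i = 0; i < n; i++) {
        budget += bits_per_tick;
        if (budget < 0.0) {            /* still paying for a big frame: skip */
            printf("tick %d: skip (deficit %.0f bits)\n", i, -budget);
            continue;
        }
        budget -= frame_bits[i];       /* send; a big frame drives budget < 0 */
        printf("tick %d: send %.0f-bit frame (budget %.0f bits)\n",
               i, frame_bits[i], budget);
    }
    return 0;
}
```
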
The Bamba H.263 implementation includes special motion-estimation techniques [20] and fast DCT algorithms [21, 22], which result in very efficient implementations.

• Framing structure
A simple framing technique for smooth playback was implemented. Audio and video are interleaved into a single file to simplify the server function. Essentially, the server treats a Bamba file as any other data file. Audio and video data are interleaved proportionately to maintain a synchronous playback of both streams at the client. Bamba frames consist of a 240-byte segment of audio and a 240β/α-byte segment of video, where α is the audio rate and β is the video rate.
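
A minimal sketch of the interleaving rule just stated: for every 240 bytes of audio, the file carries 240·β/α bytes of video, so both streams cover the same playback time. The helper and the example rates below are illustrative; the actual Bamba file layout is not reproduced here.

```c
#include <stdio.h>

/* For each 240-byte audio segment, the matching video segment holds
 * 240 * beta / alpha bytes, where alpha is the audio bit rate and beta is
 * the video bit rate, so audio and video advance at the same rate.        */
static long video_segment_bytes(long alpha_bps, long beta_bps)
{
    return 240L * beta_bps / alpha_bps;
}

int main(void)
{
    long alpha = 6300;                  /* G.723.1 audio, 6.3 Kb/s          */
    long beta  = 20000;                 /* example 20-Kb/s video stream     */

    printf("audio segment: 240 bytes\n");
    printf("video segment: %ld bytes per Bamba frame\n",
           video_segment_bytes(alpha, beta));

    /* Each 240-byte audio segment covers 240*8/6300 s of playback, and the
     * paired video segment carries the same span of video.                 */
    printf("frame duration: %.2f s\n", 240.0 * 8.0 / alpha);
    return 0;
}
```
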

[Figure 3: Illustration of Bamba video compression with the dynamic-frame-rate-control algorithm. The number of bits transmitted is shown for each frame of a sample video sequence.]

Table 1  CIF and QCIF planar YVU12 formats.

                               Pixels/line    Lines/frame
  CIF luminance (Y)                352            288
  CIF chrominance plane (V)        176            144
  CIF chrominance plane (U)        176            144
  QCIF luminance (Y)               176            144
  QCIF chrominance plane (V)        88             72
  QCIF chrominance plane (U)        88             72

• Streaming-control algorithm
When the Web is accessed, the actual connection speed between a client and a server in the network varies depending on the access method (e.g., modem or LAN), the network load, the server load, and even the client load. Hence, it is rarely possible to guarantee performance in this "best-effort" environment, where processing and bandwidth resources are typically evenly distributed among all competing applications. Consequently, when an audio and/or video clip is accessed over the network, there is no guarantee that the resources (bandwidth and processing) are available to play the clip smoothly. To handle this situation, Bamba has a built-in rate monitor that dynamically evaluates the effective data-transfer rate (σ) of a selected audio or video clip and compares this to the specified bit rate (α + β) for the clip, which is contained in the clip header. If the specified rate is less than the measured rate, the clip can be played immediately.

[Figure 4: Typical sawtooth TCP bandwidth profile, showing the average TCP bandwidth relative to the available bandwidth over time.]

If, on the other hand, the specified rate exceeds the measured rate (α + β > σ), a fraction of the clip is buffered sufficient for the clip to play to completion smoothly once playback is started. The amount of prebuffering is δ = L[1 − σ/(α + β)], where L is the clip length. This calculation is performed on the basis of the initial download rate and again any time the buffer underflows. In future networks, where quality-of-service mechanisms will be able to guarantee a desired bandwidth, this approach will allow the clips to stream uninterrupted. It will also provide a simple means of characterizing the clips and making the appropriate bandwidth requests.
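
The prebuffering rule can be written directly from the formula above. The sketch below assumes that the clip header supplies the clip length L and the rates α and β, and that the rate monitor supplies the measured rate σ; the function name and sample values are illustrative.

```c
#include <stdio.h>

/* delta = L * (1 - sigma / (alpha + beta))
 * L            clip length (here in bytes)
 * alpha, beta  specified audio and video bit rates (b/s)
 * sigma        measured transfer rate (b/s)
 * Returns 0 when sigma >= alpha + beta, i.e. the clip can stream at once.  */
static double prebuffer_bytes(double clip_len, double alpha, double beta,
                              double sigma)
{
    double specified = alpha + beta;
    if (sigma >= specified)
        return 0.0;
    return clip_len * (1.0 - sigma / specified);
}

int main(void)
{
    double clip_len = 500000.0;          /* 500-KB clip                     */
    double alpha    = 6300.0;            /* audio rate                      */
    double beta     = 20000.0;           /* video rate                      */
    double sigma    = 14400.0;           /* measured 14.4-Kb/s modem link   */

    printf("prebuffer %.0f of %.0f bytes before starting playback\n",
           prebuffer_bytes(clip_len, alpha, beta, sigma), clip_len);
    return 0;
}
```
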

• Synchronization technique
To maintain synchronization between the audio and the video, a video interframe time is calculated as a function of the total number of video frames and the total length (in time) of the audio portion of the uncompressed clip. During compression, not all frames may be compressed, since some may be skipped in order to achieve the target bit rate. As a result, the compressed frames may not have contiguous frame numbers, so the spacing between frames is calculated as the difference in sequence numbers times the interframe time calculated earlier. Video frames are displayed on the basis of the video-frame sequence number, the interframe time, and the actual number of audio samples played. This approach is particularly powerful, since the actual video interframe time tends to vary depending on the capture hardware subsystem used to create the clip. Synchronization points may also be placed in the Bamba file in order to achieve playback at arbitrary points within a clip or to recover from errors during transmission when UDP/IP is used.
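
A sketch of the display-time rule described above: the interframe time is derived from the total frame count and the audio duration, and each frame is scheduled by its sequence number against the audio sample clock. The function names are illustrative, not the Bamba player's own.

```c
#include <stdio.h>

/* Interframe time derived from the uncompressed clip: total audio length
 * divided by the total number of captured video frames.                    */
static double interframe_time_s(double audio_length_s, int total_frames)
{
    return audio_length_s / total_frames;
}

/* A frame with sequence number seq is due at seq * interframe time; it is
 * displayed once the audio clock (samples played / 8000) has reached that
 * point, so skipped frame numbers simply widen the gap between displays.   */
static int frame_is_due(int seq, double interframe_s, long audio_samples_played)
{
    double audio_clock_s = audio_samples_played / 8000.0;
    return audio_clock_s >= seq * interframe_s;
}

int main(void)
{
    double ift = interframe_time_s(30.0, 445);   /* 30-s clip, 445 frames */
    printf("interframe time: %.4f s\n", ift);
    printf("frame 100 due after %ld audio samples\n",
           (long)(100 * ift * 8000.0));
    printf("due now (frame 100, 60000 samples played)? %s\n",
           frame_is_due(100, ift, 60000) ? "yes" : "no");
    return 0;
}
```
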

3. Bamba enhancements and error handling in a lossy environment
The HTTP client-server system has some limitations. First, the HTTP protocol has no explicit mechanisms to perform such sophisticated stream-control functions as seeking to a particular position in the stream. There are ways of carrying customized function calls within the HTTP stream, but this requires special server software to execute those functions. Second, TCP/IP is an inefficient protocol for streaming delay-sensitive data across the network. It was originally designed to transport data files, with a built-in mechanism to alleviate congestion in the network [23]. TCP/IP is based on a "sliding-window" protocol that waits for acknowledgments from the receiver for every packet it sends. Each packet in the sliding window has a timer associated with it. A packet-receipt acknowledgment must be received by the transmitter before the timer expires, or the packet will be retransmitted. The size of the sliding window (number of outstanding packets) is based on the speed with which acknowledgments are received. TCP/IP continues to increase the size of the window (effectively, the bandwidth at which it is sending) until packets start to time out. Once they time out, TCP/IP exponentially backs off (reducing the size of the sliding window) and retransmits these (presumably lost) packets. A typical TCP/IP bandwidth "profile" resembles a sawtooth, as shown in Figure 4, resulting in inefficient usage of the bandwidth.
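
To visualize the sawtooth, here is a deliberately simplified toy model (not real TCP): the window grows while acknowledgments arrive in time and is cut back sharply whenever the implied sending rate exceeds the available bandwidth and packets "time out". All parameter values are invented for illustration.

```c
#include <stdio.h>

/* Toy congestion model producing the sawtooth of Figure 4.  The window (in
 * packets) grows by one per round trip until the implied rate exceeds the
 * bottleneck bandwidth; a timeout then cuts the window back sharply.  This
 * illustrates the description above and is not the actual TCP algorithm.   */
int main(void)
{
    const int packet_bits   = 4000;      /* 500-byte packets                */
    const int rtt_ms        = 500;
    const int available_bps = 128000;    /* capacity of the bottleneck link */

    int window = 1;
    for (int rtt = 0; rtt < 30; rtt++) {
        int rate_bps = window * packet_bits * 1000 / rtt_ms;
        printf("rtt %2d  window %2d  rate %6d b/s\n", rtt, window, rate_bps);

        if (rate_bps > available_bps)
            window = window / 2 > 0 ? window / 2 : 1;  /* back off on loss  */
        else
            window++;                                  /* grow while acked  */
    }
    return 0;
}
```
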
The UDP-based solution is advantageous if sufficient bandwidth is available, since it does not use acknowledgments and allows the server to explicitly control the rate at which the streams are transmitted into the network. The continuous transmission at the server and the elimination of retransmissions make the resource requirements on the server much more predictable and manageable. On the other hand, this approach adds complexity to the server, since it must now pace the transmission of a clip into the network, and it adds complexity to the client side, since the client must be made able to handle packet loss within the network. However, for long clips, this approach reduces the storage requirements at the client and can provide a higher degree of functionality, such as the ability to seek and transmit only specified segments of the clip, or to adapt the transmitted bit rate to the available bit rate (e.g., send audio with no video or send selective portions of the video). Another important merit of UDP/IP is that, given the appropriate routing capability in the network, it can be used to efficiently multicast a stream to multiple clients simultaneously.

In the UDP-based Bamba system, the server stores clips and has data pumps that pace the transmission of the video clips into the network. The system block diagram is similar to that of Figure 1, except that the network interface module, which receives the RTP/UDP packets from the network, makes sure they are presented to the splitter module in the correct order and, upon detection of a lost packet, resynchronizes the stream by searching for a new synchronization point.
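
The client-side behavior just described can be sketched as follows: packets are checked against the expected RTP sequence number, and when a gap is detected the received byte stream is scanned forward to the next synchronization point before decoding resumes. The sync-marker value and data structures here are hypothetical and do not reflect the actual Bamba wire format.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sync marker placed in the stream at resynchronization points
 * (the real Bamba format is not reproduced here).                          */
static const uint8_t SYNC_MARKER[4] = { 0x00, 0x00, 0x01, 0xB2 };

/* Scan forward in the received byte stream for the next sync point and
 * return its offset, or -1 if none is found in the buffer.                 */
static long next_sync_point(const uint8_t *buf, long len)
{
    for (long i = 0; i + 4 <= len; i++)
        if (memcmp(buf + i, SYNC_MARKER, 4) == 0)
            return i;
    return -1;
}

/* Called for each arriving RTP packet.  Out-of-order packets would be held
 * in a small reorder buffer (omitted); a jump in the sequence number means
 * data was lost and the decoder must resynchronize before continuing.      */
static void on_packet(uint16_t seq, uint16_t *expected, int *need_resync)
{
    if (seq != *expected) {
        printf("gap: expected %u, got %u -> search for sync point\n",
               *expected, seq);
        *need_resync = 1;
    }
    *expected = (uint16_t)(seq + 1);
}

int main(void)
{
    uint16_t expected = 100;
    int need_resync = 0;

    uint16_t arrivals[4] = { 100, 101, 104, 105 };  /* packets 102-103 lost */
    for (int i = 0; i < 4; i++)
        on_packet(arrivals[i], &expected, &need_resync);

    uint8_t stream[] = { 0x12, 0x34, 0x00, 0x00, 0x01, 0xB2, 0x77 };
    if (need_resync)
        printf("resume decoding at offset %ld\n",
               next_sync_point(stream, (long)sizeof(stream)));
    return 0;
}
```
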

• Compression technology
The H.263 video compression scheme was enhanced for Bamba in order to provide added robustness to reduce the effect of lost packets in the UDP/IP environment [24]. Since a large percentage of each video frame within an H.263 compressed stream is encoded by means of P-blocks with interframe dependencies, corrupted data may create errors that propagate for extended periods until an I-block refreshes the region. In general, this makes the video more susceptible to errors. To reduce the error effects, a novel scheme was devised for selecting when and where to place I-blocks within the compressed stream. The scheme is based on a two-phase compression strategy. The first phase of the compression strategy constructs a dependence graph based on motion vectors between pixels in successive frames. (A motion vector is a pointer from a P-block in the current frame to an I- or P-block in the previous frame.) These motion vectors are then used to determine the dependence count (the number of future blocks that may depend on a given block) of each block in a sequence of compressed frames. The second phase selects which blocks to compress as I-blocks on the basis of the dependence count of each block in a sequence of compressed frames. This demonstrably improves the ability of the compressed stream to recover from errors and greatly reduces the time required to reconstruct the video image when an error occurs. The approach is standards-compliant, maintains a smooth bandwidth profile of the compressed stream (small variance in size between compressed frames), and causes only a slight increase in the overall bandwidth requirements.

A conventional H.263 encoder first partitions a video image into a set of blocks of 16 pixels by 16 pixels. The coding control function searches for the best match between each block in the current frame and blocks in the previous frame. If a sufficiently close match is found, the block is encoded as a P-block based on the difference between the block and the closest matching block in the previous frame. The closeness of the match is evaluated in terms of the number of bits needed to encode the difference between the block and the closest matching block in the previous frame. If the difference is too great, it is deemed more efficient to encode the block independently, without reference to previous data. Such a block is referred to as an intracoded I-block.
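
The coding-control decision described in this paragraph can be sketched as follows: a sum-of-absolute-differences match is computed between each 16x16 block and its best (motion-compensated) candidate in the previous frame, and if the residual would be too costly to code, the block is intracoded instead. The threshold value and the cost model are invented for illustration and are not those of the Bamba encoder.

```c
#include <stdio.h>
#include <stdlib.h>

#define BLK 16   /* H.263 coding control operates on 16x16 blocks */

enum block_type { P_BLOCK, I_BLOCK };

/* Sum of absolute differences between a block in the current frame and a
 * (motion-compensated) candidate block in the previous frame.             */
static long block_sad(const unsigned char cur[BLK][BLK],
                      const unsigned char prev[BLK][BLK])
{
    long sad = 0;
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++)
            sad += labs((long)cur[y][x] - (long)prev[y][x]);
    return sad;
}

/* If the best interframe match is poor (the residual would be expensive to
 * encode), intracode the block.  INTRA_THRESH stands in for the encoder's
 * real bit-cost comparison.                                                */
static enum block_type choose_block_type(long best_sad)
{
    const long INTRA_THRESH = 3000;
    return best_sad > INTRA_THRESH ? I_BLOCK : P_BLOCK;
}

int main(void)
{
    unsigned char cur[BLK][BLK], prev[BLK][BLK];

    /* A block that barely changed between frames ...                       */
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++) {
            prev[y][x] = (unsigned char)(x + y);
            cur[y][x]  = (unsigned char)(x + y + 1);
        }
    long sad = block_sad(cur, prev);
    printf("small change: SAD=%ld -> %s\n", sad,
           choose_block_type(sad) == P_BLOCK ? "P-block" : "I-block");

    /* ... and one that changed completely (e.g., a scene cut).             */
    for (int y = 0; y < BLK; y++)
        for (int x = 0; x < BLK; x++)
            cur[y][x] = (unsigned char)(255 - prev[y][x]);
    sad = block_sad(cur, prev);
    printf("scene change: SAD=%ld -> %s\n", sad,
           choose_block_type(sad) == P_BLOCK ? "P-block" : "I-block");
    return 0;
}
```
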
The resulting H.263 compressed stream consists primarily of P-blocks, with I-block insertions caused by scene changes or severe motion. To prevent error accumulation, the standard also requires that each block be encoded as an I-block at least once every 132 frames. Although the H.263 standard defines how I-blocks and P-blocks are encoded, it allows considerable flexibility in selecting when to encode a block as either an I- or P-block. We exploit this flexibility to improve the robustness of a standards-compliant stream by carefully choosing when and where to insert I-blocks during the encoding process.

I-block encoding exploits only the spatial redundancy within the block in the compression process, while P-block encoding exploits both the temporal and spatial redundancies of the video. Although interblock encoding generally achieves more compression gain, the encoding dependencies of P-blocks reduce their resilience to errors. If the region referenced by a P-block has been corrupted, the decoding of the P-block will generate incorrect pixel values. If all or a part of this corrupted P-block is again referenced by other P-blocks, the erroneous pixel values will cause errors to propagate from one frame to another. This is known as the error-propagation problem of motion-compensated video compression. The propagation stops when all corrupted regions are updated by I-blocks.

On the Internet, where loss is primarily attributed to network congestion, the loss of a UDP/IP packet that contains video data can result in the loss of several frames in a low-bit-rate video system. For example, with a target bit rate of 20 Kb/s, a typical QCIF compressed video frame may contain 165 bytes. Hence, a 500-byte IP packet contains roughly three frames of compressed video data. If the packet is lost, these frames cannot be recovered, and errors begin to propagate.

For non-real-time applications, knowledge about the interdependence among blocks in a sequence can be obtained from the dependencies reflected by the motion vectors. It is thus possible to assign a measure of importance to a pixel or block by counting the number of pixels or blocks that depend on it. This operation is anticausal, i.e., it traverses backward in time. The higher the dependence on a block, the more critical it is that this block be correct and that it be encoded as an I-block. Furthermore, dependence chains may be broken by encoding intermediate blocks in the chain as I-blocks.
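
A sketch of the first, anticausal phase described above: motion vectors record, for each block, which block it references in the previous frame; traversing the frames backward and accumulating counts yields the number of future blocks that depend on each block. The frame and vector representation is deliberately simplified (a one-dimensional block layout) and is not the Bamba encoder's data structure.

```c
#include <stdio.h>

#define FRAMES 3
#define BLOCKS 4   /* blocks per frame, in a simplified 1-D layout */

/* ref[f][b] = index of the block in frame f-1 that block b of frame f
 * references through its motion vector, or -1 for an intracoded block.    */
static const int ref[FRAMES][BLOCKS] = {
    { -1, -1, -1, -1 },      /* frame 0: all I-blocks                      */
    {  0,  0,  2,  3 },      /* frame 1: blocks 0 and 1 both predict from
                                block 0 of frame 0                         */
    {  1,  1,  2, -1 },      /* frame 2                                    */
};

int main(void)
{
    int depend[FRAMES][BLOCKS] = { 0 };

    /* Anticausal pass: start at the last frame and propagate counts back.
     * A block's count grows by 1 for each block that references it, plus
     * the count already accumulated by that referencing block.            */
    for (int f = FRAMES - 1; f > 0; f--)
        for (int b = 0; b < BLOCKS; b++)
            if (ref[f][b] >= 0)
                depend[f - 1][ref[f][b]] += 1 + depend[f][b];

    for (int f = 0; f < FRAMES; f++) {
        printf("frame %d dependence counts:", f);
        for (int b = 0; b < BLOCKS; b++)
            printf(" %d", depend[f][b]);
        printf("\n");
    }
    /* Blocks with high counts are the best candidates for I-block refresh. */
    return 0;
}
```
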
We illustrate how to construct a dependence graph and calculate dependence counts with the example of three frames in Figure 5. By starting with the last frame
