PROCEEDINGS

ACM Multimedia '95

San Francisco, California, November 5-9, 1995

Sponsored by the ACM SIG MULTIMEDIA, SIGCHI, SIGGRAPH, SIGMIS, SIGBIO, SIGCOMM, SIGIR and SIGOIS, in cooperation with ACM SIGAPP, SIGCAPH, SIGMOD and SIGOPS
The Association for Computing Machinery, Inc.
1515 Broadway
New York, NY 10036

Copyright © 1995 by the Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored.

Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permission to republish from: Publications Dept., ACM, Inc., Fax +1 (212) 869-0481, or <permissions@acm.org>. For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

...In Proceedings of the ACM Multimedia, 1995 (San Francisco, CA, USA, November 5-9, 1995), ACM, New York, 1995, pp. 23-35.

Ordering Information

Nonmembers
Nonmember orders placed within the U.S. should be directed to:

Addison-Wesley Publishing Company
Order Department
Jacob Way
Reading, MA 01867
Tel: +1 800 447 2226

Addison-Wesley will pay postage and handling on orders accompanied by check. Credit card orders may be placed by mail or by calling the Addison-Wesley Order Department at the number above. Follow-up inquiries should be directed to the Customer Service Department at the same number. Please include the Addison-Wesley ISBN number with your order:

A-W ISBN 0-201-87774-0

Nonmember orders from outside the U.S. should be addressed as noted below:

Europe/Middle East
Addison-Wesley Publishing Group
Concertgebouwplein 25
1071 LM Amsterdam
The Netherlands
Tel: +31 20 6717296
Fax: +31 20 6645334

Germany/Austria/Switzerland
Addison-Wesley Verlag Deutschland GmbH
Hildachstrasse 15d
Wachsbleiche 7-12
53111 Bonn
Germany
Tel: +49 228 98 515 0
Fax: +49 228 98 515 99

United Kingdom/Africa
Addison-Wesley Publishers Ltd.
Finchampstead Road
Wokingham, Berkshire RG11 2NZ
United Kingdom
Tel: +44 734 794000
Fax: +44 734 794035

After January 1, 1996:
Addison-Wesley Longman Publishers, Ltd.
Longman House
Burnt Mill
Harlow, Essex CM20 2JE
United Kingdom
Tel: +44 1279 623 623
Fax: +44 1279 431 059

Asia
Addison-Wesley Singapore Pte. Ltd.
15 Beach Road
#05-02/09/10 Beach Centre
Singapore 0718
Tel: +65 339 7503
Fax: +65 339 9709

Japan
Addison-Wesley Publishers Japan Ltd.
Nichibo Building
1-2-2 Sarugakucho
Chiyoda-ku, Tokyo 101
Japan
Tel: +81 33 291 4581
Fax: +81 33 291 4592

Australia/New Zealand
Addison-Wesley Publishers Pty. Ltd.
6 Byfield Street
North Ryde, N.S.W. 2113
Australia
Tel: +61 2 878 5411
Fax: +61 2 878 5830

Latin America
Addison-Wesley Iberoamericana S.A.
Boulevard de las Cataratas #3
Colonia Jardines del Pedregal
Delegacion Alvaro Obregon
01900 Mexico D.F.
Tel: +52 5 660 2695
Fax: +52 5 660 4930

Canada
Addison-Wesley Publishing (Canada) Ltd.
26 Prince Andrew Place
Don Mills, Ontario M3C 2T8
Canada
Tel: +416 447 5101
Fax: +416 443 0948

ACM Members
A limited number of copies are available at the ACM member discount. Send order with payment in US dollars to:

ACM Order Department
P.O. Box 12114
Church Street Station
New York, NY 10257

OR

ACM European Service Center
Avenue Marcel Thiry 204
1200 Brussels
Belgium

Credit card orders from U.S.A. and Canada: +1 800 342 6626
New York Metropolitan Area and outside of the U.S.: +1 212 626 0500
Fax: +1 212 944 1318
Email: acmpubs@acm.org

Please include your ACM member number and the ACM Order number with your order.

ACM Order Number: 433952
ACM ISBN: 0-89791-751-0
compliance of H.261 in a novel scheme which we call Intra-H.261. Intra-H.261 gives significant gain in compression performance compared to nv and substantial improvement in both run-time performance and packet-loss tolerance compared to ivs.

Vic was originally conceived as an application to demonstrate the Tenet real-time networking protocols [14] and to simultaneously support the evolving "Lightweight Sessions" architecture [24] in the MBone. It has since driven the evolution of the Real-time Transport Protocol (RTP) [40]. As RTP evolved, we tracked and implemented protocol changes, and fed back implementation experience to the design process. Moreover, our experience implementing the RTP payload specification for H.261 led to an improved scheme based on macroblock-level fragmentation, which resulted in a revised protocol [44]. Finally, the RTP payload specification for JPEG [13] evolved from a vic implementation.
In the next section, we describe the design approach of the MBone tools. We then discuss the essentials of the vic network architecture. The network architecture shapes the software architecture, which is discussed in the following section. Finally, we discuss signal compression issues, deployment and implementation status.

2 COMPOSABLE TOOLS VS. TOOLKITS

A cornerstone of the Unix design philosophy was to avoid supplying a separate application for every possible user task. Instead, simple, one-function "filters" like grep and sort can be easily and dynamically combined via a "pipe" operator to perform arbitrarily complex tasks. Similarly, we use modular, configurable applications, each specialized to support a particular medium, which can be easily composed via a Conference Bus to support the variety of conferencing styles needed for effective human communication. This approach derives from the framework proposed by Ousterhout in [33], where he claims that large systems are easily composed from small tools that are glued together with a simple communication primitive (e.g., the Tk send command). We have simply replaced his send primitive with a well-defined (and more restrictive) Conference Bus protocol. Restricting the protocol prevents the evolution of sets of tools that rely on the specifics of each other's internal implementations. In addition to vic, our conferencing applications include the Visual Audio Tool (vat) for audio [26], a whiteboard (wb) for shared workspace and slide distribution [25, 15], and the Session Directory (sd) for session creation and advertisement [23].
This "composable tools" approach to networked multimedia contrasts with the more common "toolkit framework" adopted by other multimedia systems [10, 31, 37, 38]. Toolkits provide basic building blocks in the form of a code library with an application programming interface (API) to that library providing high-level abstractions for manipulating multimedia data flows. Each distinct conferencing style requires a different application, but the applications are typically simple to write, consisting mostly of API calls with style-dependent glue and control logic.

The toolkit approach emphasizes the programming model, and many elegant programming mechanisms have resulted from toolkit-related research. To simplify the programming model, toolkits usually assume that communication is application independent and offer a generic, least-common-denominator network interface built using traditional transport protocols.
In 1990 Clark and Tennenhouse [8] pointed out that multimedia applications could be simplified, and both application and network performance enhanced, if the network protocol reflected the application semantics. Their model, Application Level Framing (ALF), is difficult to implement with toolkits (where application semantics are deliberately "factored out") but is the natural way to implement "composable tools". And ALF-based, media-specific tools offer a simple solution to multimedia's biggest problem: high rate, high volume, continuous media data streams. Since the tools are directly involved in processing the multimedia data flows, we can use ALF to tune all the performance-critical multimedia data paths within the application and across the network.
In addition to performance, flexibility is gained by composing simple tools rather than using a monolithic application built on top of some API. Since each tool deals directly with its media stream and sends only low-rate reports like "X has started/stopped sending" on the Conference Bus, the coordination agent necessary to implement a particular conferencing scenario can be written in a simple interpreted language like Tcl [34]. This allows the most volatile part of the conferencing problem, the piece that melds audio, video, etc., into a coordinated unit that meets particular human needs and expectations, to be simple and easy to evolve. It also ensures that the coordination agents are designed orthogonally to the media agents, enforcing a mechanism/policy separation: media tools implement the mechanism by which coordination tools impose the policy structure appropriate for some particular conferencing scenario, e.g., open meeting, moderated meeting, class, seminar, etc.

3 NETWORK ARCHITECTURE

While the freedom to explore the communications protocol design space fosters innovation, it precludes interoperability. Since the MBone was created to study multicast scaling issues, interoperability is especially important. Multicast use at an interesting scale requires that a large group of people spread over a large geographic region have some reason to send and receive data from the group. One good way to achieve this is to develop interoperable applications that encourage widespread use.

3.1 RTP

To promote such interoperability, the Audio/Video Transport Working Group of the Internet Engineering Task Force (IETF) has developed RTP as an application-level protocol for multimedia transport. The goal is to provide a very thin transport layer without overly restricting the application designer. The protocol specification itself states that "RTP is intended to be malleable to provide the information required by a particular application and will often be integrated into the application processing rather than being implemented as a separate layer." In the ALF spirit, the semantics of several of the fields in the RTP header are deferred to an "RTP Profile" document, which defines the semantics according to the given application. For example, the RTP header contains a generic "marker" bit that in an audio packet indicates the start of a talk spurt but in a video packet indicates the end of a frame. The interpretation of fields can be further refined by the "Payload Format Specification". For example, an audio payload might define the RTP timestamp as an audio sample counter, while the MPEG/RTP specification [22] defines it as the "Presentation Time Stamp" from the MPEG system specification.

[Figure 1: RTP and the Protocol Stack. RTP runs over UDP/IP, over RMTP/RTIP, and over AAL5/ATM.]
Because of its ALF-like model, RTP is a natural match to the "composable tools" framework and serves as the foundation for vic's network architecture. Since RTP is independent of the underlying network technology, vic can simultaneously support multiple network protocols. Figure 1 illustrates how RTP fits into several protocol stacks. For IP and IP Multicast, RTP is layered over UDP, while in the Tenet protocols, it runs over RMTP/RTIP [2]. Similarly, vic can run directly over an ATM Adaptation Layer. In all these cases, RTP is realized in the application itself.

RTP is divided into two components: the data delivery protocol and the control protocol, RTCP. The data delivery protocol handles the actual media transport, while RTCP manages control information like sender identification, receiver feedback, and cross-media synchronization. Different media of the same conference-level session are distributed on distinct RTP sessions.

Complete details of the RTP specification are provided in [40]. We briefly mention one feature of the protocol relevant to the rest of the paper. Because media are distributed on independent RTP sessions (and because vic is implemented independently of other multimedia applications), the protocol must provide a mechanism for identifying relationships among media streams (e.g., for audio/video synchronization). Media sources are identified by a 32-bit RTP "source identifier" (SRCID), which is guaranteed to be unique only within a single session. Thus, RTP defines a canonical-name (CNAME) identifier that is globally unique across all sessions. The CNAME is a variable-length, ASCII string that can be algorithmically derived, e.g., from user and host names. RTCP control packets advertise the mapping between a given source's SRCID and variable-length CNAME. Thus, a receiver can group distinct RTP sources via their CNAME into a single, logical entity that represents a given session participant.
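As a concrete illustration of this grouping step, the following sketch (C++; the class and function names are ours, not vic's) shows how a receiver might record the SRCID-to-CNAME bindings advertised by RTCP and fold several per-session sources into one logical participant.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// One logical conference participant, possibly sending on several
// RTP sessions (audio, video, ...), each with its own 32-bit SRCID.
struct Participant {
    std::string cname;                 // globally unique canonical name
    std::set<std::uint32_t> srcids;    // per-session source identifiers
};

class ParticipantTable {
public:
    // Called when an RTCP source-description packet advertises an
    // SRCID -> CNAME binding.
    Participant& onSdes(std::uint32_t srcid, const std::string& cname) {
        Participant& p = byCname_[cname];   // created on first sighting
        p.cname = cname;
        p.srcids.insert(srcid);
        bySrcid_[srcid] = &p;               // map nodes are stable, so the
        return p;                           // pointer stays valid
    }

    // Called on each data packet to find the owning participant, if known.
    Participant* lookup(std::uint32_t srcid) {
        auto it = bySrcid_.find(srcid);
        return it == bySrcid_.end() ? nullptr : it->second;
    }

private:
    std::map<std::string, Participant> byCname_;
    std::map<std::uint32_t, Participant*> bySrcid_;
};
```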
In summary, RTP provides a solid, well-defined protocol framework that promotes application interoperability, while its ALF philosophy does not overly restrict the application design and, in particular, lends itself to efficient implementation.

4 SOFTWARE ARCHITECTURE

The principles of ALF drove more than the vic network architecture; they also determined the overall software architecture. Our central goal was to achieve a flexible software framework which could be easily modified to explore new coding schemes, network models, compression hardware, and conference control abstractions. By basing the design on an object-oriented ALF framework, we achieved this flexibility without compromising the efficiency of the implementation.

ALF leads to a design where data sources and sinks within the application are highly aware of how data must be represented for network transmission. For example, the software H.261 encoder does not produce a bit stream that is in turn packetized by an RTP agent. Instead, the encoder builds the packet stream fragmented at boundaries that are optimized for the semantics of H.261. In this way, the compressed bit stream can be made more robust to packet loss.
At the macroscopic level, the software architecture is built upon an event-driven model with highly optimized data paths glued together and controlled by a flexible Tcl/Tk [34] framework. A set of basic objects is implemented in C++ and coordinated via Tcl/Tk. Portions of the C++ object hierarchy mirror a set of object-oriented Tcl commands. C++ base classes permit Tcl to manipulate objects and orchestrate data paths using a uniform interface, while derived subclasses support specific instances of capture devices, display types, decoder modules, etc. This division of low-overhead control functionality implemented in Tcl and performance-critical data handling implemented in C++ allows for rapid prototyping without sacrificing performance. A very similar approach was independently developed in the VuSystem [29].
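The sketch below illustrates the flavor of this split under stated assumptions: the C++ side exposes a small abstract Decoder interface plus a name-keyed factory, so that interpreter-level (Tcl) code only ever creates and wires objects by string. The class names, the "null" format string, and the registration idiom are illustrative; this is not vic's actual class hierarchy.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Performance-critical per-packet work lives behind this interface.
class Decoder {
public:
    virtual ~Decoder() = default;
    // Called with one received payload; concrete decoders reassemble frames.
    virtual void recv(const std::uint8_t* pkt, std::size_t len) = 0;
};

// Name-keyed factory: the interpreter side asks for a decoder by a
// format string and never touches C++ types directly.
class DecoderFactory {
public:
    using Maker = std::function<std::unique_ptr<Decoder>()>;
    static DecoderFactory& instance() {
        static DecoderFactory f;
        return f;
    }
    void add(const std::string& fmt, Maker m) { makers_[fmt] = std::move(m); }
    std::unique_ptr<Decoder> create(const std::string& fmt) const {
        auto it = makers_.find(fmt);
        return it == makers_.end() ? nullptr : it->second();
    }
private:
    std::map<std::string, Maker> makers_;
};

// A trivial decoder registered under an illustrative format name.
class NullDecoder : public Decoder {
public:
    void recv(const std::uint8_t*, std::size_t) override { /* discard */ }
};

static const bool registered = [] {
    DecoderFactory::instance().add("null",
        [] { return std::unique_ptr<Decoder>(new NullDecoder); });
    return true;
}();
```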
4.1 Decode Path

Figure 2 roughly illustrates the receive/decode path. The elliptical nodes correspond to C++ base classes in the implementation, while the rectangular nodes represent output devices. A Tcl script is responsible for constructing the data paths and performing out-of-band control that might result from network events or local user interaction. Since Tcl/Tk also contains the user interface, it is easy to present control functionality to the user as a single interface element that might invoke several primitive control functions to implement its functionality.

The data flow through the receive path is indicated by the solid arrows. When a packet arrives from the network, the Network object dispatches it to the Demuxer, which implements the bulk of the RTP processing. From there, the packet is demultiplexed to the appropriate Source object, which represents a specific, active transmitter in the multicast session. If no Source object exists for the incoming packet, an upcall into Tcl is made to instantiate a new data path for that source. Once the data path is established, packets flow from
the source object to a decoder object. Hardware and software decoding, as well as multiple compression formats, are simultaneously supported via a C++ class hierarchy. When a decoder object decodes a complete frame, it invokes a rendering object to display the frame on the output device, either an X Window or an external video output port.

[Figure 2: The receive/decode data path. Per-source Source objects feed software or hardware decoders, which render to an X Window or an external video output.]
Note that, in line with ALF, packets flow all the way to the decoder object more or less intact. The decoder modules are not isolated from the network issues. In fact, it is exactly these modules that know best what to do when faced with packet loss or reordering. C++ inheritance provides a convenient mechanism for implementing an ALF model without sacrificing software modularity.
While this architecture appears straightforward to implement, in practice the decode path has been one of the most challenging (and most revised) aspects of the design. The core difficulty is managing the combinatorics of all possible configurations. Many input compression formats are supported, and deciding the best way to decode any given stream depends on user input, the capabilities of the available hardware, and the parameters of the video stream. For example, DEC's J300 adaptor supports hardware decompression of 4:2:2-decimated JPEG, either to an X Window or an external output port. The board can be multiplexed between capture and decoding to a window, but not between capture and decoding to the external port. Also, if the incoming JPEG stream is 4:1:1 rather than 4:2:2-decimated, the hardware cannot be used at all. Finally, only JPEG-compressed streams can be displayed on the video output port since the system software does not support a direct path for uncompressed video. Many other devices exhibit similar peculiarities.

Coping with all hardware peculiarities requires building a rule set describing legal data paths. Moreover, these rules depend intimately on how the application is being used, and therefore are complicated by user configuration. We have found that the Tcl/C++ combination provides a flexible solution for this problem. By implementing only the bare essentials in C++ and exporting a Tcl interface that allows easy creation, deletion, and configuration of C++ objects, the difficulty in managing the complexity of the data paths is greatly reduced.

4.2 Capture Path

We applied a similar architectural decomposition to the video capture/compression path. As with the decoder objects, encoder objects perform both compression and RTP packetization. For example, the H.261 encoder fragments its bit stream into packets on "macroblock" boundaries to minimize the impact of packet loss.
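A rough sketch of this style of packetization is shown below: coded macroblocks are packed into packets without ever splitting one across a packet boundary, so a lost packet costs only whole, independently resynchronizable macroblocks. This is a generic illustration, not the RTP/H.261 payload format; the two-byte address prefix and the 1000-byte payload budget are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One coded macroblock: its bytes plus its position in the frame,
// so a receiver can resynchronize after loss.
struct CodedMacroblock {
    std::uint16_t address;           // macroblock number within the frame
    std::vector<std::uint8_t> bits;  // entropy-coded data for this block
};

// Pack macroblocks into packets of at most maxPayload bytes, never
// splitting a macroblock across packets (ALF-style fragmentation).
std::vector<std::vector<std::uint8_t>>
fragment(const std::vector<CodedMacroblock>& mbs, std::size_t maxPayload = 1000) {
    std::vector<std::vector<std::uint8_t>> packets;
    std::vector<std::uint8_t> cur;
    for (const CodedMacroblock& mb : mbs) {
        // Two-byte address prefix per macroblock (illustrative "header").
        std::size_t need = 2 + mb.bits.size();
        if (!cur.empty() && cur.size() + need > maxPayload) {
            packets.push_back(cur);   // close the current packet
            cur.clear();
        }
        cur.push_back(static_cast<std::uint8_t>(mb.address >> 8));
        cur.push_back(static_cast<std::uint8_t>(mb.address & 0xff));
        cur.insert(cur.end(), mb.bits.begin(), mb.bits.end());
    }
    if (!cur.empty()) packets.push_back(cur);
    return packets;
}
```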
Different compression schemes require different video input formats. For instance, H.261 requires 4:1:1-decimated CIF format video while the nv encoder requires 4:2:2-decimated video of arbitrary geometry. One implementation approach would be for each capture device to export a common format that is subsequently converted to the format desired by the encoder. Unfortunately, manipulating video data in the uncompressed domain results in a substantial performance penalty, so we have optimized the capture path by supporting each format.

A further performance gain was realized by carrying out the "conditional replenishment" [32] step as early as possible. Most of the compression schemes utilize block-based conditional replenishment, where the input image is divided up into small (e.g., 8x8) blocks and only blocks that change are coded and sent. The send decision for a block depends on only a small (dynamically varying) selection of pixels of that block. Hence, if the send decision is folded in with the capture I/O process, most of the read memory traffic and all of the write memory traffic is avoided when a block is not coded.
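The following sketch shows one plausible form of such a send decision: a handful of pixels in each 8x8 block is compared against a reference copy, and the block is coded only if the accumulated difference crosses a threshold. The sample pattern and the threshold are illustrative, not vic's actual parameters.

```cpp
#include <cstdint>
#include <cstdlib>

// Decide whether an 8x8 luminance block has changed enough to be
// (re)coded. Only a small, fixed sample of pixels is inspected, so
// the test can be folded into the capture copy loop.
bool blockChanged(const std::uint8_t* cur, const std::uint8_t* ref,
                  int stride, int threshold = 24) {
    static const int samples[][2] = { {1, 1}, {6, 2}, {3, 4}, {2, 6}, {5, 5} };
    int diff = 0;
    for (const auto& s : samples) {
        int off = s[1] * stride + s[0];
        diff += std::abs(int(cur[off]) - int(ref[off]));
        if (diff > threshold)
            return true;   // changed: block will be coded and sent
    }
    return false;          // unchanged: skip coding, avoid memory writes
}
```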
4.3 Rendering

Another performance-critical operation is converting video from the YUV pixel representation used by most compression schemes to a format suitable for the output device. Since this rendering operation is performed after decompression, on uncompressed video, it can be a bottleneck and must be carefully implemented. Our profiles of vic match the experiences reported by Patel et al. [35], where image rendering sometimes accounts for 50% or more of the execution time.

Video output is rendered either through an output port on an external video device or to an X window. In the case of an X window, we might need to dither the output for a color-mapped display or simply convert YUV to RGB for a true-color display. Alternatively, HP's X server supports a "YUV visual" designed specifically for video, and we can write YUV data directly to the X server. Again, we use a C++ class hierarchy to support all of these modes of operation and special-case the handling of 4:2:2 and 4:1:1-decimated video and scaling operations to maximize performance.
For color-mapped displays, vic supports several modes of dithering that trade off quality for computational efficiency. The default mode is a simple error-diffusion dither carried out in the YUV domain. Like the approach described in [35], we use table lookups for computing the error terms, but we use an improved algorithm for distributing color cells in the YUV color space. The color cells are chosen uniformly throughout the feasible set of colors in the YUV cube, rather than uniformly across the entire cube using saturation to find the closest feasible color. This approach effectively doubles the number of useful colors in the dither. Additionally, we add extra cells in the region of the color space that corresponds to flesh tones for better rendition of faces.

While the error-diffusion dither produces a relatively high quality image, it is computationally expensive. Hence, when performance is critical, a cheap, ordered dither is available. Vic's ordered dither is an optimized version of the ordered dither from nv.

An even cheaper approach is to use direct color quantization. Here, a color gamut is optimized to the statistics of the displayed video and each pixel is quantized to the nearest color in the gamut. While this approach can produce banding artifacts from quantization noise, the quality is reasonable when the color map is chosen appropriately. Vic computes this color map using a static optimization explicitly invoked by the user. When the user clicks a button, a histogram of colors computed across all active display windows is fed into Heckbert's median cut algorithm [21]. The resulting color map is then downloaded into the rendering module. Since median cut is a compute-intensive operation that can take several seconds, it runs asynchronously in a separate process. We have found that this approach is qualitatively well matched to the LCD color displays found on laptop PCs. The Heckbert color map optimization can also be used in tandem with the error diffusion algorithm. By concentrating color cells according to the input distribution, the dither color variance is reduced and quality increased.
Finally, we optimized the true-color rendering case. Here, the problem is simply to convert pixels from the YUV color space to RGB. Typically, this involves a linear transformation requiring four scalar multiplications and six conditionals. Inspired by the approach in [35], vic uses an algorithm that gives full 24-bit resolution using a single table lookup on each U-V chrominance pair and performs all the saturation checks in parallel. The trick is to leverage off the fact that the three coefficients of the Y term are all 1 in the linear transform. Thus we can precompute all conversions for the tuple (0, U, V) using a 64KB lookup table, T. Then, by linearity, the conversion is simply (R, G, B) = (Y, Y, Y) + T(U, V).
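A minimal sketch of this table-lookup conversion appears below. It uses a full 256x256 table of signed RGB offsets and a per-channel clamp; the paper's 64KB table and parallel saturation trick presumably pack the offsets more tightly, so the table size, layout, and BT.601-style coefficients here are assumptions.

```cpp
#include <algorithm>
#include <cstdint>

// Signed RGB offsets contributed by one (U, V) pair, i.e. the
// conversion of the tuple (0, U, V). Because the Y coefficients are
// all 1, (R, G, B) = (Y, Y, Y) + T(U, V).
struct UvOffset { std::int16_t r, g, b; };

static UvOffset T[256][256];

// Build the table once (ITU-R BT.601-style coefficients).
void buildUvTable() {
    for (int u = 0; u < 256; ++u) {
        for (int v = 0; v < 256; ++v) {
            double du = u - 128.0, dv = v - 128.0;
            T[u][v].r = static_cast<std::int16_t>( 1.402 * dv);
            T[u][v].g = static_cast<std::int16_t>(-0.344 * du - 0.714 * dv);
            T[u][v].b = static_cast<std::int16_t>( 1.772 * du);
        }
    }
}

inline std::uint8_t clamp8(int x) {
    return static_cast<std::uint8_t>(std::min(255, std::max(0, x)));
}

// Convert one pixel: the multiplies are gone, leaving one table
// lookup plus three adds and clamps.
void yuvToRgb(std::uint8_t y, std::uint8_t u, std::uint8_t v,
              std::uint8_t& r, std::uint8_t& g, std::uint8_t& b) {
    const UvOffset& o = T[u][v];
    r = clamp8(y + o.r);
    g = clamp8(y + o.g);
    b = clamp8(y + o.b);
}
```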
A final rendering optimization is to dither only the regions of the image that change. Each decoder keeps track of the blocks that are updated in each frame and renders only those blocks. Pixels are rendered into a buffer shared between the X server and the application so that only a single data copy is needed to update the display with a new video frame. Moreover, this copy is optimized by limiting it to a bounding box computed across all the updated blocks of the new frame.

4.4 Privacy

To provide confidentiality to a session, vic implements end-to-end encryption per the RTP specification. Rather than rely on access controls (e.g., scope control in IP Multicast), the end-to-end model assumes that the network can be easily tapped and thus enlists encryption to prevent unwanted receivers from interpreting the transmission. In a private session, vic encrypts all packets as the last step in the transmission path, and decrypts everything as the first step in the reception path. The encryption key is specified to the session participants via some external, secure distribution mechanism.

Vic supports multiple encryption schemes with a C++ class hierarchy. By default, the Data Encryption Standard (DES) in cipher block chaining mode [1] is employed. While weaker forms of encryption could be used (e.g., those based on linear feedback shift registers), efficient implementations of DES give good performance on current hardware (measurements are given in [27]). The computational requirements of compression/decompression far outweigh the cost of encryption/decryption.

4.5 The Conference Bus

Since the various media in a conference session are handled by separate applications, we need a mechanism to provide coordination among the separate processes. The "Conference Bus" abstraction, illustrated in Figure 3, provides this mechanism. The concept is simple. Each application can broadcast a typed message on the bus and all applications that are registered to receive that message type will get a copy. The figure depicts a single session composed of audio (vat), video (vic), and whiteboard (wb) media, orchestrated by a (yet to be developed) coordination tool (ct).

[Figure 3: The Conference Bus.]

A complete description of the Conference Bus architecture is beyond the scope of this paper. Rather, we provide a brief overview of the mechanisms in vic that support this model.

Voice-switched Windows. A feature not present in the other MBone video tools is vic's voice-switched windows. A window in voice-switched mode uses cues from vat to focus on the current speaker. "Focus" messages are broadcast by vat over the Conference Bus, indicating the RTP CNAME of the current speaker. Vic monitors these messages and switches the viewing window to that person. If there are multiple voice-switched windows, the most recent speakers' video streams are shown. Because the focus messages are broadcast on the Conference Bus, other applications can use them for other purposes. For example, on a network that supports different qualities of service, a QoS tool might use the focus message to give more video bandwidth to the current speaker using dynamic RSVP filters [5].
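A small sketch of the receiver-side bookkeeping such a feature needs is shown below: it keeps the most recent distinct speakers, newest first, so that entry i maps to voice-switched window i. Bus message parsing and window plumbing are omitted, and the class is illustrative rather than vic's implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Track the most recent distinct speakers, newest first; the viewer
// assigns entry i of this list to voice-switched window i.
class VoiceSwitch {
public:
    explicit VoiceSwitch(std::size_t windows) : windows_(windows) {}

    // Called for each "focus <cname>" message seen on the Conference Bus.
    void onFocus(const std::string& cname) {
        auto it = std::find(recent_.begin(), recent_.end(), cname);
        if (it != recent_.end())
            recent_.erase(it);          // already shown: move to the front
        recent_.push_front(cname);
        if (recent_.size() > windows_)
            recent_.pop_back();         // oldest speaker loses its window
    }

    // CNAMEs currently mapped to the voice-switched windows.
    std::vector<std::string> shown() const {
        return {recent_.begin(), recent_.end()};
    }

private:
    std::size_t windows_;
    std::deque<std::string> recent_;
};
```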
Floor Control. All of the LBL MBone tools have the ability to "mute" or ignore a network media source, and the disposition of this mute control can be controlled via the Conference Bus. This very simple mechanism provides a means to implement floor control in an external application. One possible model is that each participant in the session follows the direction of a well-known (session-defined) moderator. The moderator can give the floor to a participant by multicasting a takes-floor directive with that participant's RTP CNAME. Locally, each receiver then mutes all participants except the one that holds the floor. Note that this model does not rely on cooperation among all the remote participants in a session. A misbehaving participant cannot cause problems because it will be muted by all participants that follow the protocol.
Synchronization. Cross-media synchronization can also be carried out over the Conference Bus. Each real-time application induces a buffering delay, called the playback point, to adapt to packet delay variations [24]. This playback point can be adjusted to synchronize across media. By broadcasting "synchronize" messages across the Conference Bus, the different media can compute the maximum of all advertised playout delays. This maximum is then used in the delay-adaptation algorithm. In order to assure accurate synchronization, the semantics of the advertised playback points must be the delay offset between the source timestamp and the time the media is actually transduced to the analog domain. The receiver buffering delay alone does not capture the local delay variability among codecs.
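The sketch below shows the receiver-side computation this implies: each tool's advertised playback point is recorded as "synchronize" messages arrive, and every tool adopts the maximum as its common playout delay. The field names, the millisecond units, and the message handling are assumptions.

```cpp
#include <algorithm>
#include <map>
#include <string>

// Track the playback points advertised on the Conference Bus by each
// media tool and adopt their maximum as the common playout delay.
class SyncState {
public:
    // Called when a "synchronize" message arrives from some tool.
    void onAdvertise(const std::string& tool, int playbackPointMs) {
        advertised_[tool] = playbackPointMs;
    }

    // Delay every tool should use so all media play out together.
    int targetPlayoutMs() const {
        int m = 0;
        for (const auto& kv : advertised_)
            m = std::max(m, kv.second);
        return m;
    }

private:
    std::map<std::string, int> advertised_;
};
```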
Device Access. Each active session has a separate conference bus to coordinate the media within that session. But some coordination operations like device access require interaction among different sessions. Thus we use a global conference bus shared among all media. Applications sharing a common device issue claim-device and release-device messages on the global bus to coordinate ownership of an exclusive-access device.

Conference Buses are implemented as multicast datagram sockets bound to the loopback interface. Local-machine IP multicast provides a simple, efficient way for one process to send information to an arbitrary set of processes without needing to have the destinations "wired in". Since one user may be participating in several conferences simultaneously, the transport address (UDP destination port) is used to create a separate bus for each active conference. This simplifies the communication model, since a tool knows that everything it sends and receives over the bus refers to the conference it is participating in, and also improves performance, since tools are awakened only when there is activity in their conference. Each application in the conference is handed the address (port) of its bus via a startup command line argument. The global device access bus uses a reserved port known to all applications.
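A minimal sketch of such a bus endpoint is given below, using standard POSIX sockets: a UDP socket is bound to the per-conference port and joined to a multicast group on the loopback interface, and typed messages are sent to that group. The group address 239.255.0.1 and the text message format are assumptions, not vic's actual values.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <string>

// Open a Conference Bus endpoint: a UDP socket joined to a multicast
// group on the loopback interface, one port per conference.
int openBus(unsigned short port) {
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0) return -1;

    int on = 1;
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);                 // one bus per conference
    if (bind(s, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        close(s);
        return -1;
    }

    // Join the bus group on the loopback interface only.
    ip_mreq mreq{};
    inet_pton(AF_INET, "239.255.0.1", &mreq.imr_multiaddr);
    inet_pton(AF_INET, "127.0.0.1", &mreq.imr_interface);
    setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

    // Keep our own transmissions on the loopback interface.
    in_addr loop{};
    inet_pton(AF_INET, "127.0.0.1", &loop);
    setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF, &loop, sizeof(loop));
    return s;
}

// Broadcast a typed message, e.g. "focus <cname>", to every tool on the bus.
void busSend(int s, unsigned short port, const std::string& msg) {
    sockaddr_in dst{};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    inet_pton(AF_INET, "239.255.0.1", &dst.sin_addr);
    sendto(s, msg.data(), msg.size(), 0,
           reinterpret_cast<sockaddr*>(&dst), sizeof(dst));
}
```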
4.6 User Interface

A screen dump of vic's current user interface, illustrating the display of several active video streams, is shown in Figure 4. The main conference window is in the upper left hand corner. It shows a thumbnail view of each active source next to various identification and reception information. The three viewing windows were opened by clicking on their respective thumbnails. The control menu is shown at the lower right. Buttons in this window turn transmission on and off, select the encoding format and image size, and give access to capture device-specific features like the selection of several input ports. Bandwidth and frame rate sliders limit the rate of the transmission, while a generic quality slider trades off quality for bandwidth in a fashion dependent on the selected encoding format.
This user interface is implemented as a Tcl/Tk script embedded in vic. Therefore, it is easy to prototype changes and evolve the interface. Moreover, the interface is extensible since at run-time a user can include additional code to modify the core interface via a home directory "dot file".
A serious deficiency in our current approach is that vic's user interface is completely independent of the other media tools in the session. While this modularity is fundamental to our system architecture, it can be detrimental to the user interface, and a more uniform presentation is needed. For example, vat, vic, and wb all employ their own user interface element to display the members of the session. A better model would be to have a single instance of this list across all the tools. Each participant in the listing could be annotated to show which media are active
