United States Patent [19]
Baumgartner et al.

[54] METHOD AND APPARATUS FOR SYNCHRONIZING AUDIO AND VIDEO DATA STREAMS IN A MULTIMEDIA SYSTEM

[75] Inventors: Donn M. Baumgartner; Thomas A. Dye, both of Austin, Tex.

[73] Assignee: Dell USA, L.P., Round Rock, Tex.

[21] Appl. No.: 255,604

[22] Filed: Jun. 8, 1994

[51] Int. Cl.6 .......... H04N 5/04
[52] U.S. Cl. .......... 348/515; 395/806
[58] Field of Search .......... 348/515; 395/154, 162-164; 345/122; 375/355; H04N 5/04, 5/12
[56] References Cited

U.S. PATENT DOCUMENTS

Re. 33,535    2/1991   Cooper
4,538,176     8/1985   Haji et al.
4,618,890    10/1986   Kouyama et al.
4,644,400     2/1987   Kouyama et al.
4,679,085     7/1987   Chapelle et al.
4,703,355    10/1987   Cooper
4,750,034     6/1988   Lem
4,851,909     7/1989   Noske et al.
5,170,252    12/1992   Gear et al.
5,420,801     5/1995   Dockter et al. .......... 364/514 R
5,430,485     7/1995   Lankford et al. .......... 348/515
5,471,576    11/1995   Yee .......... 395/154

FOREIGN PATENT DOCUMENTS

2305278   12/1990   Japan .......... 348/515
[Front-page figure: flowchart of the synchronization module, reproduced from FIGS. 5A and 5B; see Drawing Sheets 5 and 6.]
US005642171A
[11] Patent Number: 5,642,171
[45] Date of Patent: Jun. 24, 1997
OTHER PUBLICATIONS

Nicolaou, Cosmos, "An Architecture for Real-Time Multimedia Communication Systems," IEEE Journal on Selected Areas in Communications, vol. 8, no. 3, Apr. 1990.

Little, Thomas D.C. and Arif Ghafoor, "Synchronization and Storage Models for Multimedia Objects," IEEE Journal on Selected Areas in Communications, vol. 8, no. 3, Apr. 1990.

Primary Examiner: Sherrie Hsia
[57] ABSTRACT

A method and apparatus for synchronizing audio and video data streams in a computer system during a multimedia presentation to produce a correctly synchronized presentation. The preferred embodiment of the invention utilizes a nonlinear feedback method for data synchronization. The method of the present invention periodically queries each driver for the current audio and video position (or frame number) and calculates the synchronization error. The synchronization error is used to determine a tempo value adjustment to one of the data streams designed to place the video and audio back in sync. The method then adjusts the audio or video tempo to maintain the audio and video data streams in synchrony. In the preferred embodiment of the invention, the video tempo is changed nonlinearly over time to achieve a match between the video position and the equivalent audio position. The method applies a smoothing function to the determined tempo value to prevent overcompensation. The method of the present invention can operate in any hardware system and in any software environment and can be adapted to existing systems with only minor modifications.

40 Claims, 7 Drawing Sheets
[Drawing Sheet 1 of 7, FIG. 1: drawing not recoverable from the text extraction.]
[Drawing Sheet 2 of 7, FIG. 2: drawing not recoverable from the text extraction.]
[Drawing Sheet 3 of 7: figure label not recoverable from the text extraction.]
[Drawing Sheet 4 of 7, FIG. 4: drawing not recoverable from the text extraction.]
[Drawing Sheet 5 of 7, FIG. 5A: first part of the synchronization module flowchart. Step labels recoverable from the drawing text (reference numerals 504-524): call audio driver to obtain wave rate status; calculate and store audio frame rate; initialize variables; determine current video frame number; determine current audio position; calculate equivalent audio frame number; calculate synchronization error quantity; decision: audio too far ahead and audio playing, if yes stop audio; decision: audio paused and video caught up, if yes restart audio.]
[Drawing Sheet 6 of 7, FIG. 5B: second part of the synchronization module flowchart. Step labels recoverable from the drawing text (reference numerals 526-554): determine tempo; decision: video started, if yes set tempo to slow value; adjust tempo using a smoothing function; decision: audio ahead or audio status reported as bad, if yes set tempo to nominal rate; decision: audio data available, if no exit; decision: sync error = 0 and last tempo not = nominal rate, if yes set tempo to nominal rate and set last tempo to nominal rate; decision: synchronization error > tolerance, if yes adjust tempo; store used tempo for next sync call.]
[Drawing Sheet 7 of 7, FIG. 6: diagram showing an audio data stream (604) and a video data stream (606) proceeding from a common starting point (602).]
METHOD AND APPARATUS FOR SYNCHRONIZING AUDIO AND VIDEO DATA STREAMS IN A MULTIMEDIA SYSTEM

FIELD OF THE INVENTION

The present invention relates generally to multimedia computer systems, and more particularly to a method and apparatus for synchronizing video and audio data streams in a computer system during a multimedia presentation.

DESCRIPTION OF THE RELATED ART

Multimedia computer systems have become increasingly popular over the last several years due to their versatility and their interactive presentation style. A multimedia computer system can be defined as a computer system having a combination of video and audio outputs for presentation of audio-visual displays. A modern multimedia computer system typically includes one or more storage devices such as an optical drive, a CD-ROM, a hard drive, a videodisc, or an audiodisc, and audio and video data are typically stored on one or more of these mass storage devices. In some file formats the audio and video are interleaved together in a single file, while in other formats the audio and video data are stored in different files, many times on different storage media. Audio and video data for a multimedia display may also be stored in separate computer systems that are networked together. In this instance, the computer system presenting the multimedia display would receive a portion of the necessary data from the other computer system via the network cabling.

A multimedia computer system also includes a video card such as a VGA (Video Graphics Array) card which provides output to a video monitor, and a sound card which provides audio output to speakers. A multimedia computer system may also include a video accelerator card or other specialized video processing card for performing video functions, such as compression, decompression, etc. When a computer system displays a multimedia presentation, the computer system microprocessor reads the audio and video data stored on the respective mass storage devices, or received from the other computer system in a distributed system, and provides the audio stream through the sound card to the speakers and provides the video stream through the VGA card and any specialized video processing hardware to the computer video monitor. Therefore, when a computer system presents an audio-visual display, the audio data stream is decoupled from the video data stream, and the audio and video data streams are processed by separate hardware subsystems.

A multimedia computer system also includes an operating system and drivers for controlling the various hardware elements used to create the multimedia display. For example, a multimedia computer includes an audio driver or sound card driver for controlling the sound card and a video driver for controlling the optional video processing card. One example of an operating system which supports multimedia presentations is the Multimedia Extensions for the Microsoft Windows operating system.

Graphic images used in Windows multimedia applications can be created in either of two ways, these being bit-mapped images and vector-based images. Bit-mapped images comprise a plurality of picture elements (pixels) and are created by assigning a color to each pixel inside the image boundary. Most bit-mapped color images require one byte per pixel for storage, so large bit-mapped images create correspondingly large files. For example, a full-screen, 256-color image in 640-by-480-pixel VGA mode requires 307,200 bytes of storage, if the data is not compressed.
Vector-based images are created by defining the end points, thickness, color, pattern and curvature of lines and solid objects comprised within the image. Thus, a vector-based image includes a definition which consists of a numerical representation of the coordinates of the object, referenced to a corner of the image.

Bit-mapped images are the most prevalent type of image storage format, and the most common bit-mapped-image file formats are as follows. A file format referred to as BMP is used for Windows bit-map files in 1-, 2-, 4-, 8-, and 24-bit color depths. BMP files contain a bit-map header that defines the size of the image, the number of color planes, the type of compression used (if any), and the palette used. The Windows DIB (device-independent bit-map) format is a variant of the BMP format that includes a color table defining the RGB (red, green, blue) values of the colors used. Other types of bit-map formats include the TIF (tagged image format file), the PCX (Zsoft Personal Computer Paintbrush Bitmap) file format, the GIF (graphics interchange file) format, and the TGA (Texas Instruments Graphic Architecture) file format.

The standard Windows format for bit-mapped images is a 256-color device-independent bit map (DIB) with a BMP (the Windows bit-mapped file format) or sometimes a DIB extension. The standard Windows format for vector-based images is referred to as WMF (Windows metafile).

Compression

Full-motion video implies that video images shown on the computer's screen simulate those of a television set with identical (30 frames-per-second) frame rates, and that these images are accompanied by high-quality stereo sound. A large amount of storage is required for high-resolution color images, not to mention a full-motion video sequence. For example, a single frame of NTSC video at 640-by-400-pixel resolution with 16-bit color requires 512K of data per frame. At 30 frames per second, over 15 megabytes of data storage are required for each second of full-motion video. Due to the large amount of storage required for full-motion video, various types of video compression algorithms are used to reduce the amount of necessary storage. Video compression can be performed either in real time, i.e., on the fly during video capture, or on the stored video file after the video data has been captured and stored on the media. In addition, different video compression methods exist for still graphic images and for full-motion video.
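As a worked check of the storage arithmetic quoted above, a minimal sketch follows; the resolutions, color depths, and frame rate are the figures from the text, not parameters of the patent's method:

```c
#include <stdio.h>

/* Uncompressed storage arithmetic for the figures quoted above. */
int main(void) {
    /* Full-screen 256-color (1 byte/pixel) image in 640x480 VGA mode. */
    long vga_bytes = 640L * 480L * 1L;      /* = 307,200 bytes */

    /* One NTSC frame at 640x400 with 16-bit (2 bytes/pixel) color. */
    long frame_bytes = 640L * 400L * 2L;    /* = 512,000 bytes (512K) */

    /* Thirty such frames per second of full-motion video. */
    long per_second = frame_bytes * 30L;    /* = 15,360,000 bytes (over 15 MB) */

    printf("VGA image:  %ld bytes\n", vga_bytes);
    printf("NTSC frame: %ld bytes\n", frame_bytes);
    printf("Per second: %ld bytes\n", per_second);
    return 0;
}
```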
Examples of video data compression for still graphic images are RLE (run-length encoding) and JPEG (Joint Photographic Experts Group) compression. RLE is the standard compression method for Windows BMP and DIB files. The RLE compression method operates by testing for duplicated pixels in a single line of the bit map and stores the number of consecutive duplicate pixels rather than the data for the pixel itself. JPEG compression is a group of related standards that provide either lossless (no image quality degradation) or lossy (imperceptible to severe degradation) compression types. Although JPEG compression was designed for the compression of still images rather than video, several manufacturers supply JPEG compression adapter cards for motion video applications.
`30
`
`35
`
`50
`
`
`PAGE 9 OF 19
`
`SONOS EXHIBIT 1014
`IPR of U.S. Pat. No. 8,942,252
`
`

`

Interframe compression begins by digitizing the entire image of a key frame. Successive frames are compared with the key frame, and only the differences between the digitized data from the key frame and from the successive frames are stored. Periodically, such as when new scenes are displayed, new key frames are digitized and stored, and subsequent comparisons begin from this new reference point. It is noted that interframe compression ratios are content-dependent, i.e., if the video clip being compressed includes many abrupt scene transitions from one image to another, the compression is less efficient. Examples of video compression which use an interframe compression technique are MPEG, DVI and Indeo, among others.
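A minimal sketch of the key-frame/difference idea follows. The byte-wise (offset, value) delta encoding is assumed for illustration; production codecs such as MPEG compute block-based, motion-compensated differences:

```c
#include <string.h>

#define FRAME_BYTES (640 * 400 * 2)   /* one 16-bit 640x400 frame */

static unsigned char key_frame[FRAME_BYTES];

/* Store a new key frame: the entire digitized image is kept as reference. */
void store_key_frame(const unsigned char *frame) {
    memcpy(key_frame, frame, FRAME_BYTES);
}

/* Encode a successive frame as differences from the key frame. Only bytes
 * that changed are emitted as (offset, new value) pairs; unchanged regions
 * cost nothing, so low-motion content compresses well and abrupt scene
 * transitions compress poorly, as noted above. The caller supplies arrays
 * large enough for the worst case (FRAME_BYTES entries). */
size_t encode_delta(const unsigned char *frame,
                    long *offsets, unsigned char *values) {
    size_t n = 0;
    for (long i = 0; i < FRAME_BYTES; i++) {
        if (frame[i] != key_frame[i]) {
            offsets[n] = i;
            values[n]  = frame[i];
            n++;
        }
    }
    return n;   /* if large, a new key frame would be stored instead */
}
```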
MPEG (Moving Pictures Experts Group) compression is a set of methods for compression and decompression of full-motion video images that uses the interframe compression technique described above. The MPEG standard requires that sound be recorded simultaneously with the video data, and the video and audio data are interleaved in a single file to attempt to maintain the video and audio synchronized during playback. The audio data is typically compressed as well, and the MPEG standard specifies an audio compression method referred to as ADPCM (Adaptive Differential Pulse Code Modulation) for audio data.

A standard referred to as Digital Video Interactive (DVI) format developed by Intel Corporation is a compression and storage format for full-motion video and high-fidelity audio data. The DVI standard uses interframe compression techniques similar to that of the MPEG standard and uses ADPCM compression for audio data. The compression method used in DVI is referred to as RTV 2.0 (real-time video), and this compression method is incorporated into Intel's AVK (audio/video kernel) software for its DVI product line. IBM has adopted DVI as the standard for displaying video for its Ultimedia product line. The DVI file format is based on the Intel i750 chipset and is supported through the Media Control Interface (MCI) for Windows. Microsoft and Intel jointly announced the creation of the DV MCI (digital video media control interface) command set for Windows 3.1 in 1992.

The Microsoft Audio Video Interleaved (AVI) format is a special compressed file structure format designed to enable video images and synchronized sound stored on CD-ROMs to be played on PCs with standard VGA displays and audio adapter cards. The AVI compression method uses an interframe method, i.e., the differences between successive frames are stored in a manner similar to the compression methods used in DVI and MPEG. The AVI format uses symmetrical software compression-decompression techniques, i.e., both compression and decompression are performed in real time. Thus AVI files can be created by recording video images and sound in AVI format from a VCR or television broadcast in real time, if enough free hard disk space is available.

In the AVI format, data is organized so that coded frame numbers are located in the middle of an encoded data file containing the compressed audio and compressed video. The digitized audio and video data are organized into a series of frames, each having header information. Each frame of the audio and video data streams is tagged with a frame number that typically depends upon the frame rate. For example, at every 33 milliseconds (ms), or a 30th of a second, a frame number is embedded in the header of the video frame, and at every 30th of a second, or 33 ms, the same frame number is embedded in the header of the audio track. The number assigned to the frames is, therefore, coordinated so that the corresponding audio and video frames are originally tagged with the same number. Therefore, since the frames are initially received simultaneously, the frames can actually be preprocessed so that tag codes are placed into the header files of the audio and the video for tracking the frame number and position of the audio and video tracks.
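To illustrate the tagging just described, a hypothetical header layout is shown below; the struct fields and the helper function are illustrative assumptions, not the actual AVI chunk structure:

```c
#include <stdint.h>

/* Hypothetical frame header carrying the shared tag described above.
 * At a 30 frames-per-second rate, frame_number advances every 33 ms in
 * both the audio and the video stream, so matching numbers identify
 * frames that should play together. */
struct frame_header {
    uint32_t frame_number;   /* same value in the matching audio frame   */
    uint32_t payload_bytes;  /* size of the compressed data that follows */
};

/* Frame number for a given presentation time, at 30 frames per second. */
uint32_t frame_for_time_ms(uint32_t time_ms) {
    return time_ms * 30u / 1000u;
}
```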
In the AVI format, the audio and video information are interleaved (alternated in blocks) in the CD-ROM to minimize delays that would result from using separate tracks for video and audio information. Also, the audio and video data are interleaved to synchronize the data as it is stored on the system. This is done in an attempt to synchronize the audio and video data during playback.

The Apple QuickTime format was developed by Apple for displaying animation and video on Macintosh computers, and has become a de facto multimedia standard. Apple's QuickTime and Microsoft's AVI take a parallel approach to the presentation of video stored on CD-ROMs, and the performance of the two systems is similar. The QuickTime format, like AVI, uses software compression and decompression techniques but also can employ hardware devices, similar to those employed by DVI, to speed processing. The Apple QuickTime format became available for the PC under Microsoft Windows in late 1992.

As mentioned above, the audio and video data streams in a multimedia presentation are processed by separate hardware subsystems under the control of separate device drivers. The audio and video data are separated into separate data streams that are then transmitted to separate audio and video subsystems. The video data is transmitted to the video subsystem for display, and the audio data is transmitted to the sound subsystem for broadcast. These two subsystems are addressed by separate drivers, and each driver is loaded dynamically by the operating system during a multimedia presentation. In an operating system that is multi-tasking, has multiple drivers, or has multiple windows, the time period between the servicing of drivers is indeterminate. If a driver is not serviced by the operating system in time for the next frame, a portion of the multimedia system may stall, resulting in the audio not being synchronized with the video. When the audio and video portions of a multimedia presentation become unsynchronized, many times this lack of synchronization is noticeable to the viewer, resulting in a less pleasing display. One result of audio and video data being out of sync is that the viewer may hear words that do not match the lips of the speaker, a situation commonly called "out of lip sync."

Therefore, many times the corresponding audio and video frames of a multimedia presentation are not played synchronously together. The reasons for the audio and video data streams falling out of sync during a presentation include the inherent decoupling of the audio and video data streams in separate subsystems in conjunction with system bottlenecks and performance issues associated with the large amounts of data that are required to be manipulated during a multimedia presentation. As mentioned above, full-motion video clips with corresponding audio require massive amounts of system resources to process. However, a considerably greater amount of processing is required to display the video data than is required for the audio data. First the video data must be decompressed, either in software or in a codec (compression-decompression) device. If the color depth of the video is higher than that of the display, such as when an AVI file with 16-bit video is played on an 8-bit display, the computer must dither colors to fit within the display's color restrictions. Also, if the selected playback window size is inconsistent with the resolution at which the video was captured, the computer is required to scale each frame.

In addition to the greater amount of processing required for video data, the amount of video processing can vary considerably, thus further adversely affecting synchronization. For example, one variable that affects the speed of video playback is the decompression performed on the video data. The performance of software decompression algorithms can vary for a number of reasons. For example, due to the interframe method of compressing data, the number of bytes that comprise each video frame is variable, depending on how similar the prior video frame is to the current video frame. Thus, more time is required to process a series of frames in which the background is moving than is required to process a series of frames containing only minor changes in the foreground. Other variables include whether the color depth of the video equals that of the display and whether the selected playback window size is consistent with the resolution at which the video was captured, as mentioned above.

In addition, a slow CPU adversely affects every stage in the processing of a video file for playback. A sluggish hard disk or CD-ROM controller can also adversely affect performance, as can the performance of the display controller or video card. Also, other demands can be made on the system as a result of something as simple as a mouse movement. While the above processing is being performed on the video and audio data, and while other demands are made on system resources, it becomes very difficult to ensure that the audio and video data remain in synchronization.

Video for Windows includes a method which presumably attempts to maintain the audio and video portions of a multimedia display in sync, i.e., attempts to adapt when the computer system cannot keep up with either the video or audio portions of the display. Video for Windows benchmarks the video hardware when it first begins execution, as well as every time thereafter that the default display is changed. The results of these tests are used to determine a particular system's baseline display performance at various resolutions and color depths. Video for Windows then uses this information regarding the capabilities of the video system to adjust the video frame rate to match the benchmarked performance for the default display. Video for Windows maintains the continuity of the audio at all costs because a halting audio track is deemed more distracting. When the burden of the video playback is such that the system cannot keep up, Video for Windows skips frames during playback or adjusts the frame rate continuously as the system's resource usage patterns change.

However, the method used by Video for Windows in adjusting the video rate to match the benchmarked performance of the default display results in an average frame rate suitable for the benchmark determined at the time the default was last changed. Attempts to display video frames containing an unusually heavy amount of non-repetitive data will slow processing down to the point where the benchmarked frame rate is no longer useful. When this happens, video frames are skipped because the burden of processing the video data becomes too great to preserve lip sync in the display. The result can be "jerky" movement of the images of persons speaking, as noted in Discover Windows 3.1 Multimedia, by Roger Jennings (Que Corp. 1992), pp. 105-106. Thus, the method used by Video for Windows has proven to be inadequate, i.e., the video and audio portions still fall out of sync or exhibit "jerky" movement during a presentation.

Shortcomings inherent in decoupled audio multimedia systems have been a problem for some time, and various efforts have been made to synchronize the audio and video portions of a presentation. There has been a recognized need in the industry for a solution to this problem. However, no satisfactory solution has been found, prior to the present invention.

Therefore, a method and apparatus is desired which provides improved synchronization between digital audio and digital video data streams in a multimedia computer system, i.e., a method is needed to assure that corresponding video and audio frames are played back together. A synchronization method is also desired that does not require the use of an encoding procedure prior to the processing of audio and video digital signals. It is also desirable to provide a multimedia synchronization system that is capable of functioning consistently whether video and audio data are delivered to the system in separate files or interleaved in one file.
SUMMARY OF THE INVENTION

The present invention comprises a method and apparatus for synchronizing separate audio and video data streams in a multimedia system. The preferred embodiment of the invention utilizes a nonlinear feedback method for data synchronization that is independent of the hardware, the operating system, and the video and audio drivers used. The system and method of the present invention does not require that incoming data be time-stamped, or that any timing information exist in the video data stream relative to audio and video data correspondence. Further, the data is not required to be modified in any way prior to the transfer of data to the video and audio drivers, and no synchronization information need be present in the separated audio and video data streams that are being synchronized by the system and method of the present invention. The preferred embodiment of the present invention requires that there be a common starting point for the audio and video data, i.e., that there be a time index of zero where the audio and video are both in synchrony, such that the first byte of audio and video digital data are generated simultaneously.

The synchronization method of the present invention is called periodically during a multimedia display to synchronize the video and audio data streams. In the preferred embodiment, a periodic timer is set to interrupt the multimedia operating system at uniform intervals during a multimedia display and direct the operating system to invoke the synchronization method of the present invention. When the synchronization method is invoked, the method first queries the video driver to determine the current video frame position and then queries the audio driver to determine the current audio position. The current audio position is then used to compute the equivalent audio frame number. The synchronization method compares the video and audio frame positions and computes a synchronization error value, which is essentially the number of frames by which the video frame position is in front of or behind the current audio frame position.
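A minimal sketch of this query-and-compare step is shown below. The driver-query functions, the byte-valued audio position, and the 30 frames-per-second rate are assumptions made for illustration; the patent does not bind the method to a particular driver interface:

```c
#include <stdint.h>

/* Hypothetical driver queries; the actual interface is driver-specific. */
extern uint32_t video_current_frame(void);      /* current video frame number  */
extern uint32_t audio_current_byte(void);       /* current audio position      */
extern uint32_t audio_bytes_per_second(void);   /* from the wave format status */

#define FRAMES_PER_SECOND 30u

/* Convert the audio position to its equivalent frame number, then return
 * the synchronization error: the number of frames by which the video is
 * in front of (positive) or behind (negative) the audio. */
int32_t synchronization_error(void) {
    uint32_t audio_frame = (uint32_t)((uint64_t)audio_current_byte()
                                      * FRAMES_PER_SECOND
                                      / audio_bytes_per_second());
    return (int32_t)video_current_frame() - (int32_t)audio_frame;
}
```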
The synchronization error is used to assign a tempo value meaningful to either the video driver or the audio driver. In the preferred embodiment, the method adjusts the video tempo to maintain video synchronization, but in an alternate embodiment the method adjusts the audio tempo to maintain synchronization. Once a video tempo value has been determined, the preferred method adjusts this video tempo value by applying a smoothing function, i.e., a weighted average of prior tempo values, to the determined tempo value. If the synchronization error is determined to be greater than a defined tolerance, i.e., if the audio and video data streams are more than a certain number of frames out of sync, and if the tempo value is not equal to the last tempo value previously sent to the video driver, then the method adjusts the video frame speed by passing the tempo value to the video driver.
If the synchronization error is approximately 0, i.e., the audio and video data streams are substantially in sync, and the prior determined tempo value passed to the video driver was not the nominal rate, i.e., the rate intended to exactly match the audio rate, the method passes a video tempo value at the nominal rate to the video driver. In other words, if the audio and video data streams are in sync, a tempo value at the nominal rate is passed to the video driver. This removes any effects of the smoothing function, which otherwise would change the tempo value to other than the nominal rate.
The method also determines if the audio is too far ahead of the video and if the audio is playing. If so, the audio is paused to allow the video to catch up. If the method determines that the audio is paused and that the video has caught up, the method restarts the audio. The method saves the video tempo value for comparison during the next call by the periodic timer and surrenders control to the operating system until called again.
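Gathering the preceding paragraphs into one periodic invocation, a sketch might look like the following. The tempo units, thresholds, smoothing weights, nonlinearity, and driver entry points are all illustrative assumptions; FIGS. 5A and 5B give the authoritative control flow:

```c
#include <stdint.h>
#include <stdlib.h>

#define NOMINAL_TEMPO   1000    /* assumed units: 1000 = nominal rate       */
#define TOLERANCE       2       /* frames of error tolerated before acting  */
#define PAUSE_THRESHOLD 8       /* audio this far ahead -> pause the audio  */

extern int32_t synchronization_error(void);   /* video frame - audio frame  */
extern int  audio_is_playing(void);
extern void audio_pause(void), audio_restart(void);
extern void video_set_tempo(int tempo);

static int last_tempo = NOMINAL_TEMPO;

/* Called at uniform intervals by the periodic timer (see FIGS. 5A-5B). */
void sync_module(void) {
    int32_t err = synchronization_error();

    /* Audio too far ahead and playing: pause it so the video catches up. */
    if (err < -PAUSE_THRESHOLD && audio_is_playing()) { audio_pause(); return; }
    /* Audio paused and video caught up: restart the audio. */
    if (!audio_is_playing() && err >= 0) audio_restart();

    /* Nonlinear tempo from the error, then a weighted average with the
     * prior tempo (the smoothing function) to prevent overcompensation. */
    int tempo = NOMINAL_TEMPO - (int)(err * abs(err));  /* assumed nonlinearity */
    tempo = (3 * last_tempo + tempo) / 4;               /* assumed weights      */

    if (err == 0 && last_tempo != NOMINAL_TEMPO)
        tempo = NOMINAL_TEMPO;          /* in sync: undo smoothing drift     */

    if (abs(err) > TOLERANCE && tempo != last_tempo)
        video_set_tempo(tempo);         /* pass the new tempo to the driver  */

    last_tempo = tempo;                 /* store used tempo for next call    */
}
```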
Therefore, the present invention provides an improved method of synchronizing the audio and video data streams during a multimedia presentation to provide a correctly synchronized presentation. The present invention permits the use of existing software drivers and multimedia operating systems. Further, the method of the present invention operates independently of where the audio and video data are stored as well as the type of operating system or drivers being used. Thus the present invention operates regardless of whether the audio and video data are interleaved in one file, stored on different media, or stored in separate computer systems. Also, the present invention does not require any type of time stamping or tagging of data, and thus does not require any modification of the video or audio data. Further, the present invention operates regardless of the type of compression/decompression algorithm used on the video data.
... of the present invention is shown. It is noted that FIG. 1 illustrates only portions of a functioning computer system, and those elements not necessary to the understanding of the operation of the present invention have been omitted for simplicity. As shown, the multimedia computer system includes a CPU 102 coupled to a host bus 106. Main memory 104 is also coupled to the host bus 106. The host bus 106 is coupled to an expansion bus 112 by means of a bus controller 110. The expansion bus may be any of various types, including the AT (advanced technology) bus or industry standard architecture (ISA) bus, the EISA (extended industry standard architecture) bus, a microchannel (MCA) bus, etc. A video card or video adapter such as a VGA (video graphics array) card 120 is coupled to the expansion bus 112 and is adapted to interface to a video monitor 122, as shown. The computer system may also include a video accelerator card 124 for performing compression/decompression (codec) functions. However, in the preferred embodiment the computer system does not include a video accelerator card. An audio card or sound card 130 is also coupled to the expansion bus 112 and interfaces to a speaker 132. The audio board 130 is preferably a Sound Blaster II brand card made by Creative Labs, Inc. of Milpitas, Calif.

Various mass storage devices are also coupled to the expansion bus 112, preferably including a CD-ROM 140 and a hard drive 142, as well as others. One or more of these mass storage devices store video and audio data which is used during presentation of a multimedia display. The audio and video data may ...
