Maturi et al.

US005559999A
[11] Patent Number: 5,559,999
[45] Date of Patent: Sep. 24, 1996

[54] MPEG DECODING SYSTEM INCLUDING
TAG LIST FOR ASSOCIATING
PRESENTATION TIME STAMPS WITH
ENCODED DATA UNITS

[75] Inventors: Greg Maturi, Tracy; David R. Auld;
Darren Neuman, both of San Jose, all
of Calif.

[73] Assignee: LSI Logic Corporation, Milpitas,
Calif.

[21] Appl. No.: 303,444

[22] Filed: Sep. 9, 1994

[51] Int. Cl.6 ........ G06F 13/372
[52] U.S. Cl. ........ 395/550; 348/7; 348/10;
370/94.2; 370/110.1; 395/182.18; 395/200.17
[58] Field of Search ........ 395/200.17, 550,
182.18, 250; 348/7, 512, 10; 370/110.1,
94.2

[56] References Cited

U.S. PATENT DOCUMENTS

4,241,398  12/1980  Carll ............... 395/200.17
5,287,182   2/1994  Haskell et al. ...... 348/500
5,396,497   3/1995  Veltman ............. 370/100.1
5,442,390   8/1995  Hooper et al. ....... 348/7
5,448,568   9/1995  Delpuch et al. ...... 370/94.2

Attorney, Agent, or Firm-Poms, Smith, Lande & Rose

[57] ABSTRACT
`
`A Motion Picture Experts Group (MPEG) multiplexed data
`bitstream includes encoded video and audio data units,
`which are prefixed with headers including Presentation Time
`Stamps (PTS) indicating desired presentation times for the
`respective data units. The data units are decoded, and
`presented at a fixed time after decoding, such that the fixed
`time can be subtracted from the PTS to provide a desired
`decoding time. The bitstream is parsed, the video and audio
`headers are stored in video and audio header memories, and
`the associated video and audio data units are stored in video
`and audio channel memories respectively. A first interrupt is
generated each time a header is stored, and a host
microcontroller responds by storing the PTS from the header and
`the starting address of the corresponding data unit in the
`channel memory as an entry in a list. A second interrupt is
`generated each time a data unit is decoded, and the host
`microcontroller responds by accessing the list using the
`starting address of the data unit to obtain the corresponding
`PTS and thereby the desired decoding time. Decoding and
`presentation are synchronized by comparing the desired
`decoding time with a system clock time. If the desired
`decoding time is later than the system clock time by more
`than one presentation (frame) time period for the data unit,
`presentation of the data unit is skipped. If the desired
`decoding time is earlier than the system clock time by more
`than the presentation time period, presentation of the data
`unit is repeated.
`
`Primary Examiner-Ken S. Kim
`
`26 Claims, 4 Drawing Sheets
`
[Front-page drawing, corresponding to FIG. 3: block diagram of
the decoder 16 (pre-parser 22, post-parser 24, video decoder 26,
audio decoder 28, video and audio presentation controllers 30
and 32, channel controller 34, bus 36, system time counter 38,
clock pulse generator), the host microcontroller 18 with RAM
18a, and the video header, video channel, audio header, audio
channel and frame memory buffers 20a to 20e.]
`SONY EX. 1007
`Page 1
`
[Drawing sheet 1 of 4.
FIG. 1: block diagram of the decoder system 10 (MPEG bitstream
in, demodulator/ECC/decrypt unit 12, channel 14, decoder 16,
host microcontroller 18, video out, audio out).
FIG. 2: bitstream layout (pack header, system header, packet
header, video header, video data, packet header, audio header,
audio data).
FIG. 3: detailed block diagram of the decoder 16, the host
microcontroller 18 with RAM 18a, and the buffers 20a to 20e.]
`
`
`
[Drawing sheet 2 of 4.
FIG. 4: buffer arrangement. Video header buffer 20a and video
channel buffer 20b hold headers 1 to 5 and data 1 to 5 with
TAG/START = 000, 01E, 046, 05A, 071; audio header buffer 20c
and audio channel buffer 20d hold headers 1 to 5 and data 1 to 5
with TAG/START = 000, 008, 010, 018, 020. The host video PTS
list and host audio PTS list pair PTS1 to PTS5 with the same
tags.
FIG. 5: flowchart: parse incoming bitstream; detect header;
generate interrupt; store header and tag in header buffer.]
`
`
`
[Drawing sheet 3 of 4.
FIG. 6: flowchart: receive interrupt; extract PTS and tag from
header; store PTS and tag in host memory list.
FIG. 7: flowchart: detect start code; generate interrupt;
decode data.
FIG. 8 (synchronize presentation): flowchart: receive decode
interrupt; get channel buffer read pointer; access host memory
list to get PTS; compare PTS to adjusted SCR; repeat or skip
frame if necessary; present frame.
FIG. 9 (synchronize STC counter): flowchart: initialize SCR
counter; get SCR code from bitstream; compare SCR code with STC
counter; adjust STC counter or clock frequency if necessary.]
`
`
`
[Drawing sheet 4 of 4.
FIG. 10: block diagram of the first arrangement for
synchronizing the system clock time (pre-parser 22, host
microcontroller 18, system time counter 38, clock pulse
generator, comparator 42, time synchronizer; signals SCR0, SCR1,
SCR2).
FIG. 11: block diagram of the second arrangement for
synchronizing the system clock time (pre-parser 22, system time
counter 38, clock pulse generator; signal SCR1).]
`
`
`
MPEG DECODING SYSTEM INCLUDING
TAG LIST FOR ASSOCIATING
PRESENTATION TIME STAMPS WITH
ENCODED DATA UNITS

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention generally relates to the art of
audio/video data compression and transmission, and more
specifically to a synchronization system for a Motion Picture
Experts Group (MPEG) audio/video decoder.

2. Description of the Related Art

Constant efforts are being made to make more effective
use of the limited number of transmission channels currently
available for delivering video and audio information and
programming to an end user such as a home viewer of cable
television. Various methodologies have thus been developed
to achieve the effect of an increase in the number of
transmission channels that can be broadcast within the
frequency bandwidth that is currently allocated to a single
video transmission channel. An increase in the number of
available transmission channels provides cost reduction and
increased broadcast capacity.

The number of separate channels that can be broadcast
within the currently available transmission bandwidth can be
increased by employing a process for compressing and
decompressing video signals. Video and audio program
signals are converted to a digital format, compressed,
encoded and multiplexed in accordance with an established
compression algorithm or methodology.

The compressed digital system signal, or bitstream, which
includes a video portion, an audio portion, and other
informational portions, is then transmitted to a receiver.
Transmission may be over existing television channels, cable
television channels, satellite communication channels, and
the like. A decoder is provided at the receiver to
de-multiplex, decompress and decode the received system
signal in accordance with the compression algorithm. The
decoded video and audio information is then output to a
display device such as a television monitor for presentation
to the user.

Video and audio compression and encoding are performed
by suitable encoders that implement a selected data
compression algorithm that conforms to a recognized
standard or specification agreed to among the senders and
receivers of digital video signals. Highly efficient
compression standards have been developed by the Moving Pictures
Experts Group (MPEG), including MPEG 1 and MPEG 2.
The MPEG standards enable several VCR-like viewing
options such as Normal Forward, Play, Slow Forward, Fast
Forward, Fast Reverse, and Freeze.

The MPEG standards outline a proposed synchronization
scheme based on an idealized decoder known as a Standard
Target Decoder (STD). Video and audio data units or frames
are referred to as Access Units (AU) in encoded form, and
as Presentation Units (PU) in unencoded or decoded form.
In the idealized decoder, video and audio data presentation
units are taken from elementary stream buffers and instantly
presented at the appropriate presentation time to the user. A
Presentation Time Stamp (PTS) indicating the proper
presentation time of a presentation unit is transmitted in an
MPEG packet header as part of the system syntax.

The presentation time stamps and the access units are not
necessarily transmitted together since they are carried by
different layers of the hierarchy. It is therefore necessary for
the decoder to associate the presentation time stamp found
at the packet layer with the beginning of the first access unit
which follows it.

The situation is further complicated by the fact that in a
real decoder the system has little control over the
presentation times of the presentation units. For example, in the
video decoder, video frames (pictures) must be presented at
an exact multiple of the frame rate for the video to appear
smooth, and the audio frames must be presented at exact
multiples of the audio frame rate for the audio to be free of
clicks.

In the idealized MPEG synchronization scheme, a system
time clock (STC) which maintains a system clock time is
provided in the decoder. The initial value of the system clock
time is transmitted in the system stream by the encoder as a
System Clock Reference (SCR) in an MPEG 1 bitstream, or
as a Program Clock Reference (PCR) in an MPEG 2
bitstream. The decoder sets its local system time clock to the
initial value, and then continues to increment it at a clock
rate of 90 kHz.

Subsequently, the encoder transmits a presentation time
stamp for an audio or video access unit, followed some time
later by the access unit itself. The decoder compares the
presentation time stamp to the local system clock time, and
when they are equal removes the access unit from the
elementary stream buffer, instantly decodes it to produce the
corresponding presentation unit, and presents the
presentation unit.

In a real system, synchronization is complicated by
factors including the following.

1. Presentation units cannot be removed from the
elementary stream buffer instantaneously, nor decoded or presented
instantaneously.

2. Acceptable presentation unit boundaries may not be
under the control of the encoder. For example, if an MPEG
decoder is locked to an external television synchronization
signal, the presentation unit boundaries are controlled by the
synchronization pulse generator, not the decoder itself. This
creates error in the presentation time.

3. Presentation time stamps may contain errors caused by
channel errors, which can prevent a frame from being
decoded indefinitely.
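The idealized scheme described in the background can be sketched in a few lines. This is only an illustrative model: the names `SystemTimeClock`, `ticks_to_seconds` and `idealized_should_present` are hypothetical and do not come from the patent or the MPEG standards, and the 33-bit wrap-around of real time stamps is ignored.

```python
STC_HZ = 90_000  # clock rate of the system time clock, as stated in the text


class SystemTimeClock:
    """Free-running counter standing in for the decoder's local STC."""

    def __init__(self, initial_reference):
        # The first SCR (MPEG 1) or PCR (MPEG 2) value sets the starting count.
        self.count = initial_reference

    def advance(self, ticks):
        # In hardware this happens once per 90 kHz clock pulse.
        self.count += ticks


def ticks_to_seconds(ticks):
    """Convert a 90 kHz tick count to seconds."""
    return ticks / STC_HZ


def idealized_should_present(pts, stc):
    """Idealized rule: decode and present at the instant PTS equals the STC."""
    return pts == stc.count
```

A real decoder cannot rely on exact equality, for the reasons listed above, which is what motivates the tolerance-based scheme of the invention.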
`
`SUMMARY OF THE INVENTION
`
The present invention provides a decoding system for a
Motion Picture Experts Group (MPEG) multiplexed audio/
video bitstream, or a comparable bitstream utilizing a
different compression algorithm. The system incorporates a
host microcontroller, a decoder, and a video/audio
decoding synchronization method that is performed automatically
by the system.

The MPEG bitstream includes encoded video and audio
data or Access Units (AU) in the form of Packetized
Elementary Streams (PES), which are prefixed with headers
including Presentation Time Stamps (PTS) indicating
desired presentation times for the respective access units.
The access units are decoded to produce corresponding
Presentation Units (PU), and presented at a fixed time after
decoding, such that the fixed time can be subtracted from the
presentation time stamps to provide requested decoding
times.

The bitstream is parsed, the video and audio headers are
stored in video and audio header memories, and the
associated video and audio access units are stored in video and
audio channel memories respectively. A first interrupt is
generated each time a header is stored, and a host
microcontroller responds by storing the presentation time stamp
from the header and the starting address (write pointer) of
the corresponding access unit in the channel memory as an
entry in a list.

A second interrupt is generated each time an access unit
is decoded, and the host microcontroller responds by
accessing the list using the starting address (read pointer) of the
access unit to obtain the corresponding presentation time
stamp and thereby the requested decoding time. Decoding
and presentation are synchronized by comparing the
requested decoding time with the system clock time.

If the requested decoding time is later than the system
clock time by more than one presentation (frame) time
period for the corresponding presentation unit, presentation
of the presentation unit is skipped. If the requested decoding
time is earlier than the system clock time by more than the
presentation time period, presentation of the presentation
unit is repeated.

The system further comprises a provision for
synchronizing the system time clock (STC) using System Clock
Reference (SCR) or Program Clock Reference (PCR) time
stamps that are extracted from the bitstream.

More specifically, the present invention implements a
loosely coupled video/audio synchronization scheme. It is
designed to take into account unpredictable system delays
such as externally generated video syncs, and to deal easily
with a variety of error conditions that may occur in the
channel.

The system time clock (STC) is first set. In an MPEG 1
stream, the System Clock Reference (SCR) time stamp
comes from the pack layer. In an MPEG 2 stream the
Program Clock Reference (PCR) can come from various
places, including the transport layer, the program stream
pack header or the Program Elementary Stream (PES) layer.

In each case, the SCR or PCR is trapped by a system
parser and stored in a system stream buffer, and an interrupt is
generated. A host controller reads the SCR or PCR field,
copies it to an SCR register, and sets the system clock timer
counting. If interrupt latency is very long, the SCR or PCR
value can be adjusted to accommodate the latency.

The decoder is then started. The actual start is delayed
until a vertical sync signal is generated in order to
synchronize frame reconstruction and delay.

A host controller responds to each picture start code
interrupt. It examines a video elementary stream buffer read
pointer, and uses this value to associate the picture with a list
of pending presentation time stamps stored in a system
header buffer. The error between the actual presentation time
(the current system clock time) and the requested
presentation time (from the presentation time stamp) can then be
determined.

One of three actions is taken depending on the magnitude
and sense of the error.

1. If the error is less than one presentation frame time, the
audio or video frame is synchronized to the system time
clock, and is decoded as scheduled.

2. If the actual presentation time is earlier than the
requested presentation time by more than one presentation
time period, the decoder repeats one presentation unit
(frame).

3. If the actual presentation time is later than the requested
presentation time by more than one presentation time period,
the decoder skips a frame.

This process is repeated indefinitely. If the decoder loses
synchronization for any reason, an appropriate corrective
action is taken.

These and other features and advantages of the present
invention will be apparent to those skilled in the art from the
following detailed description, taken together with the
accompanying drawings, in which like reference numerals
refer to like parts.

DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating a video/audio
decoding system embodying the present invention;
FIG. 2 is a simplified diagram illustrating a data bitstream
that is decoded by the system of FIG. 1;
FIG. 3 is a more detailed block diagram of the present
decoding system;
FIG. 4 is a diagram illustrating an arrangement of buffer
memories of the present system;
FIG. 5 is a flowchart illustrating the steps of storing a data
header and associated memory tag in accordance with the
present invention;
FIG. 6 is a flowchart illustrating the steps of storing a
presentation time stamp and associated memory pointer tag;
FIG. 7 is a flowchart illustrating a decoding operation;
FIG. 8 is a flowchart illustrating the synchronization of
presentation of a data frame;
FIG. 9 is a flowchart illustrating synchronization of
system clock time;
FIG. 10 is a block diagram illustrating a first arrangement
for synchronizing the system clock time; and
FIG. 11 is a block diagram illustrating a second
arrangement for synchronizing the system clock time.
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`
A video/audio decoder system 10 embodying the present
invention is illustrated in FIG. 1, and comprises a
demodulator/ECC/decryption unit 12 for receiving an MPEG
multiplexed bitstream from an encoder (not shown) via a
communications channel 14. The unit 12 demodulates the
input bitstream, performs error correction (ECC) and
decrypts the demodulated data if it is encrypted for access
limitation or data compression purposes.

The unit 12 applies the demodulated MPEG bitstream as
digital data to a video/audio decoder 16, which
de-multiplexes and decodes the bitstream to produce output video
and audio signals in either digital or analog form.

The system 10 further comprises a host microcontroller
18 that interacts with the decoder 16 via an arrangement of
interrupts as will be described in detail below. The decoder
16 and the microcontroller 18 have access to an external data
storage such as a Dynamic Random Access Memory
(DRAM) 20. It will be noted that the scope of the invention
is not so limited, however, and that the memory 20 can be
provided inside the decoder 16 or the microcontroller 18.

A simplified, generic representation of an MPEG
bitstream is illustrated in FIG. 2. The bitstream includes a
system header that provides housekeeping and other
information required for proper operation of the decoder 16. A
pack header identifies a pack of data that comprises one or
more packets, with each packet having a packet header. Each
packet includes one or more video and/or audio access units
(encoded frames), each of which is preceded by its own
header having a frame Start Code (SC).

The MPEG system syntax governs the transfer of data
from the encoder to the decoder. A system stream typically
comprises a number of Packetized Elementary Streams
(PES), which can be video or audio streams, that are
combined together to form a program stream. A program is
defined as a set of elementary streams which share the same
system clock reference, and so can be decoded synchronously
to each other.

In MPEG 1 there are only two levels of hierarchy in the
system syntax: the elementary stream and the program
stream. In MPEG 2 there are more levels.

MPEG 1              MPEG 2

Program Stream      Transport Stream or Program Stream
                    Program Elementary Stream (PES)
Elementary Stream   Elementary Stream

The Program Elementary Stream (PES) is introduced to
allow multiple programs to be sent over the same transport
stream. An MPEG 2 system may either transmit a program
stream, containing a PES for a single program, or a transport
stream, containing PESs for multiple, possibly unrelated,
programs. An MPEG 2 system decoder therefore must be
able to accept PES data from transport packets or from the
program stream.

The crucial difference between these two scenarios is that
the data in transport packets may split PES packets at
non-packet boundaries, whereas the data in a program
stream will only switch from one PES to another at PES
boundaries.

The present system 10 parses MPEG 1 and MPEG 2
system data in the same way, using a system parser between
the incoming system data and elementary stream buffers.

As illustrated in FIG. 3, the microcontroller 18 comprises
a Random Access Memory (RAM) 18a for storing a list of
memory pointer tag entries as will be described in detail
below. The decoder 16 comprises a pre-parser 22, a
post-parser 24, a video decoder 26, an audio decoder 28, a video
presentation controller 30 and an audio presentation
controller 32. A channel controller 34 controls the operation
of the units 22 to 32.

The units 22 to 28 and 34 have access to the DRAM 20
via a bus 36. The DRAM 20 is preferably a single
continuous block of memory, but is internally partitioned into a
video header buffer 20a, a video channel (data) buffer 20b,
an audio header buffer 20c, an audio channel (data) buffer
20d and a frame memory buffer 20e.

The pre-parser 22 parses the input bitstream and captures
any SCR (MPEG 1) or PCR (MPEG 2) time stamps that are
included in any of the layers of the stream. The pre-parser
22, under control of the channel controller 34, causes PES
video headers to be stored in the video header buffer 20a and
PES audio headers to be stored in the audio header buffer
20c.

The pre-parser 22 causes PES streams of video data
(access) units to be stored in the video channel buffer 20b
and audio data (access) units to be stored in the audio
channel buffer 20d in a First-In-First-Out (FIFO)
arrangement. The starting address of each access unit stored in the
buffer 20b or 20d is the address following the last address of
the previous access unit.

The operation of the pre-parser 22 is illustrated in
flowchart form in FIG. 5. The parsing operation for video and
audio data is essentially similar.

As the pre-parser 22 begins to store a video header or an
audio header in the header buffer 20a or 20c respectively, it
generates a first interrupt to the microcontroller 18. The
pre-parser 22 then stores the access unit following the
header in the appropriate channel buffer 20b or 20d. The
pre-parser 22 also captures the starting address (write
pointer) of the access unit in the channel buffer 20b or 20d,
and appends this starting address as a "tag" to the header
stored in the header buffer 20a or 20c.

As illustrated in the flowchart of FIG. 6, the host
microcontroller 18 receives the first interrupt from the pre-parser
22, and extracts the presentation time stamp from the PES
header stored in the header buffer 20a or 20c, together with
the associated tag. The host microcontroller 18 stores these
two items as an "entry" in a list in the RAM 18a. The entries
in the RAM 18a provide a link between the presentation
time stamps stored in the header buffer 20a or 20c and the
starting addresses of the associated access units stored in the
channel buffer 20b or 20d.

FIG. 4 illustrates a simplified example of five video access
units and five audio access units stored in the DRAM 20, and
the associated entries in the list in the RAM 18a. The
headers for the five video access units stored in the video
header buffer 20a include the presentation time stamps for
the access units, although not explicitly illustrated. The
associated tags for the five video headers indicate the
starting addresses for the five video access units stored in the
video channel buffer 20b, in the illustrated example
hexadecimal addresses 000, 01E, 046, 05A and 071 respectively.

In an essentially similar manner, the headers for the five
audio access units stored in the audio header buffer 20c
include the presentation time stamps for the access units.
The associated tags for the five audio headers indicate the
starting addresses for the five audio access units stored in the
audio channel buffer 20d, in the illustrated example
hexadecimal addresses 000, 008, 010, 018 and 020 respectively.

The tag list in the RAM 18a of the microcontroller 18
includes a video tag list 18b and an audio tag list 18c. Each
entry includes the presentation time stamp for the associated
video or audio access unit and the tag or starting address for
the access unit stored in the buffer 20b or 20d respectively.

The video and audio access units are decoded
asynchronously relative to the operation of the pre-parser 22 by the
decoders 26 and 28 respectively. The decoders 26 and 28
read access units out of the channel buffers 20b and 20d in
synchronism with frame start pulses generated by the
presentation controllers 30 and 32 respectively.

The video and audio presentation units (decoded access
units or frames) are presented at a fixed frame rate, typically
30 frames/second for video. The access units are not
decoded and presented instantaneously as in the idealized
MPEG Standard Target Decoder (STD) model, but are
presented at a fixed time interval after the start of decoding.
This interval is typically 1.5 frames for video.

Thus, the requested decoding time can be calculated
indirectly from the presentation time stamp by subtracting
the fixed time interval from the value of the presentation
time stamp. If the system is designed to also utilize
Decoding Time Stamps (DTS), the desired decoding time is equal
to the DTS and can be obtained directly from the bitstream
without using the PTS.

The decoding operation is illustrated in the flowchart of
FIG. 7. Upon receipt of a frame start pulse from the
controller 30 or 32, the decoder 26 or 28 starts to decode the
next access unit in the buffer 20b or 20d. Upon reading a
Start Code (SC), which indicates the beginning of the
associated access unit, the decoder 26 or 28 generates a
second interrupt, and continues to decode the remainder of
the access unit and apply the decoded presentation unit to the
controller 30 or 32 for presentation on a user's television
monitor or the like.
The post-parser 24, under control of the channel
controller 34, causes video and audio access units to be read out of
the DRAM 20 and applied to the appropriate decoder 26 or
28.
The operation of synchronizing the decoding and
presentation of video and audio data in accordance with the present
invention is illustrated in FIG. 8. In response to a second
interrupt, the host microcontroller 18 captures the starting
address (read pointer) of the access unit being decoded in the
buffer 20b or 20d, and uses this value to access the list 18b
or 18c in the RAM 18a. The microcontroller 18 searches the
appropriate list 18b or 18c until it finds the highest value tag
that is not larger than the captured read pointer. This is the tag
of the access unit being decoded. The other portion of the
entry for this tag is the presentation time stamp of the access
unit being decoded.
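The two interrupt handlers and the tag list can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `TagEntry`, `TagList`, `on_header_interrupt` and `on_decode_interrupt` are hypothetical, the PTS values are invented for the example, tags are assumed to arrive in ascending order (no buffer rollover), and the lookup takes the matching entry to be the one with the highest tag that does not exceed the captured read pointer. The tag addresses are the video values shown in FIG. 4.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class TagEntry:
    tag: int  # starting address (write pointer) of the access unit
    pts: int  # presentation time stamp, in 90 kHz ticks


class TagList:
    """Hypothetical model of list 18b or 18c held in the host RAM 18a."""

    def __init__(self):
        self._entries = []  # assumed to stay in ascending tag order

    def on_header_interrupt(self, tag, pts):
        # First interrupt: store the PTS and the tag as one entry.
        self._entries.append(TagEntry(tag, pts))

    def on_decode_interrupt(self, read_pointer):
        # Second interrupt: find the entry whose tag is the highest one
        # not exceeding the read pointer; None if no such entry exists.
        tags = [entry.tag for entry in self._entries]
        index = bisect_right(tags, read_pointer)
        return self._entries[index - 1] if index else None


# Video tags from FIG. 4; the PTS values are illustrative only.
video_list = TagList()
for k, tag in enumerate([0x000, 0x01E, 0x046, 0x05A, 0x071]):
    video_list.on_header_interrupt(tag, pts=1_000 + 3_000 * k)

entry = video_list.on_decode_interrupt(read_pointer=0x050)
print(hex(entry.tag), entry.pts)
```

A binary search is used here for brevity; a linear scan over a short list, as a small microcontroller might do, gives the same result.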
`As illustrated in FIG. 3, the decoder 16 further comprises
`a System Time Clock (STC) counter 38 that is incremented
`by a 90 kHz clock pulse generator 40. The instantaneous
`count of the counter 38 constitutes the system clock time
`which constitutes a time reference for synchronizing the
`decoding and presentation operations of the system 10.
`The host microcontroller 18, in response to a second
`interrupt, captures the count of the counter 38 and compares
`it with the value of the presentation time stamp minus the
`fixed time interval between decoding and presentation. This
`latter value represents the requested decoding time, or the
`system clock time at which the associated access unit should
`be decoded. If a Decoding Time Stamp (DTS) is provided,
`the DTS is equal to the PTS minus the fixed decoding time
`interval, and can be used instead of the PTS.
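As a worked example of this subtraction, the sketch below converts the fixed decode-to-present interval into 90 kHz ticks. The function name and parameters are hypothetical; the 30 frames/second rate and the 1.5 frame interval are the typical video values quoted in the description, used here as defaults.

```python
STC_HZ = 90_000  # system time clock rate


def requested_decode_time(pts, dts=None, delay_frames=1.5, frame_rate_hz=30.0):
    """Return the requested decoding time in 90 kHz ticks.

    If a DTS is supplied it is used directly; otherwise the fixed
    decode-to-present interval is subtracted from the PTS.
    """
    if dts is not None:
        return dts
    delay_ticks = round(delay_frames * STC_HZ / frame_rate_hz)
    return pts - delay_ticks


# 1.5 frames at 30 frames/second is 4500 ticks of the 90 kHz clock.
print(requested_decode_time(10_000))
```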
`If the requested decoding time is less than one frame time
`interval different from the count of the counter 38 (system
`clock time), the data unit is decoded and presented in the
`normal manner. If, however, the requested decoding time
`differs from the system clock time by more than one frame
time interval, indicating that the system 10 is out of
synchronization by more than one frame, a synchronization
`adjustment is made.
`If the requested decoding time is larger than the system
`clock time, indicating that the associated access unit is being
`decoded too soon, presentation of the frame is repeated to
add one frame time to the presentation operation.
Conversely, if the requested decoding time is smaller than the
system clock time, indicating that the access unit is being decoded
`too late, decoding and presentation of the access unit are
`skipped to subtract one frame time from the presentation.
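The decision rule of the preceding paragraphs can be sketched as a small function. The names `Action` and `sync_action` are hypothetical, and the treatment of an error of exactly one frame period (handled here as normal decoding) is an assumption, since the description only covers the "less than" and "more than" cases.

```python
from enum import Enum


class Action(Enum):
    NORMAL = "decode and present normally"
    REPEAT = "repeat presentation of the frame"
    SKIP = "skip decoding and presentation"


def sync_action(requested_decode_time, system_clock_time, frame_ticks=3_000):
    """Choose the synchronization action; all times are in 90 kHz ticks.

    frame_ticks defaults to 3000, one frame period at 30 frames/second.
    """
    error = requested_decode_time - system_clock_time
    if error > frame_ticks:
        return Action.REPEAT  # decoding too soon: add one frame time
    if error < -frame_ticks:
        return Action.SKIP    # decoding too late: drop one frame time
    return Action.NORMAL      # within one frame period (assumed inclusive)
```

For example, a requested decoding time of 10000 against a system clock time of 6000 is 4000 ticks early, which exceeds one frame period and so yields `Action.REPEAT`.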
The channel buffers 20b and 20d are FIFO or circular
buffers, and are accessed using binary read and write
addresses. Preferably, the decoder 26 or 28 appends one or
more Most Significant Bits (MSB) to the starting addresses
or tags stored in the tag lists 18b and 18c which indicate the
number of times the buffer 20b or 20d has rolled over. This
enables the host microcontroller 18 to search the tag lists
over a substantial number of frames to determine the correct
association in the event that the system 10 gets out of
synchronization by more than one frame.
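One way to picture the rollover bits is sketched below. Everything here is an illustrative assumption: the 12-bit address width (matching the three hexadecimal digits used in FIG. 4), the 2-bit rollover field, and the names `extend_tag` and `PointerTracker` are not taken from the patent.

```python
ADDR_BITS = 12  # assumed width of a channel buffer address
ROLL_BITS = 2   # assumed number of appended most significant bits


def extend_tag(address, rollover_count):
    """Place the rollover count in the bits above the buffer address."""
    if not 0 <= address < (1 << ADDR_BITS):
        raise ValueError("address out of range for the assumed buffer size")
    rollovers = rollover_count % (1 << ROLL_BITS)
    return (rollovers << ADDR_BITS) | address


class PointerTracker:
    """Counts wrap-arounds of a circular buffer pointer."""

    def __init__(self):
        self.last_address = 0
        self.rollovers = 0

    def update(self, address):
        # A pointer that moves backwards is taken to have wrapped once.
        if address < self.last_address:
            self.rollovers += 1
        self.last_address = address
        return extend_tag(address, self.rollovers)
```

With the extra bits, a tag written after the buffer wraps (for example address 008 on the second pass) still compares as larger than a tag from the end of the previous pass (address 071), so ordered searches over several frames remain meaningful until the rollover field itself wraps.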
The system 10 is further provided with a means for
initially setting the system clock time in the counter 38, and
adjusting the system clock time to compensate for
discrepancies between the system clock time and requested system
time clock (STC) times provided by SCR or PCR time
stamps in the MPEG bitstream. These discrepancies are
caused by system errors, deviations in the frequency of the
clock pulse generator 40 from the nominal value of 90 kHz,
or various other factors.
The system clock time is first set to an initial value. In
response to a first interrupt from the pre-parser 22 which
results from storing a PES header in the header buffer 20a
or 20c, the microcontroller 18 reads the counter 38 to obtain
a value SCR0. When an SCR or PCR is parsed from the
transport layer, or decoded from a PES head