`
`Chapter 19
`
process have inverse elements, and most combinations are lossless. For example, variable-length encoding and decoding is lossless, so it need not be considered as part of the system. The combination of DCT and inverse DCT (IDCT) is lossless with sufficient arithmetic precision. This pair of elements contributes round-off errors, but little else.
The lossy process is, of course, quantization. There is a block called dequantization, but it serves only to return the coefficients to the correct reconstruction values; the original information is irretrievably lost.
However, once this step has been performed, we can again look at a lossless system. Assuming there is no signal processing between the output of the decoder and the input to the next encoder, we can follow the values. Quantized coefficients are passed via variable-length encode/decode (lossless), then to the inverse DCT, thus producing a set of sample levels. If DCT/IDCT is nearly lossless, so is IDCT followed by DCT. After the DCT process of the next encoder, we will have the same coefficient set that was produced by the dequantizer. Each of these levels should be right in the center of the decision range for the new quantizer, so that it should produce the same result. Figure 19-1 illustrates.
`
Figure 19-1 Lossless groupings in concatenated compression systems.
[Block diagram. Encode: DCT, Quantize, Variable-length encode. Decode: Variable-length decode, Dequantize, IDCT. Next stage: Decode (IDCT) followed by Encode (DCT). Grouping labels: "Loss on first pass"; "Transparent"; "Transparent"; "if all quantizers identical".]
Considered in this way, the explanation of the observed behavior is simple. Quantization causes the loss observed on the first pass. On subsequent passes the same quantization results should occur every time.
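The argument can be checked with a few lines of arithmetic. The following is only a minimal sketch, assuming a plain uniform quantizer; the step sizes (8 and 6), the coefficient values, and the 0.3 round-off error are illustrative numbers, not values taken from MPEG.

```python
def quantize(value, step):
    """Return the index of the decision range that contains `value`."""
    return round(value / step)

def dequantize(index, step):
    """Return the reconstruction value at the center of the decision range."""
    return index * step

def generation(coeffs, step):
    """One encode/decode pass, ignoring the lossless variable-length stage."""
    return [dequantize(quantize(c, step), step) for c in coeffs]

coeffs = [103.7, -41.2, 18.9, 6.4, -2.6]   # hypothetical DCT coefficients

first = generation(coeffs, 8)              # loss occurs here
second = generation(first, 8)              # same quantizer: nothing changes

# Small DCT/IDCT round-off error (assumed 0.3) stays inside the decision range.
noisy_second = generation([v + 0.3 for v in first], 8)

# A different quantizer (step 6) moves every nonzero value again.
mixed = generation(first, 6)

print(first, second, noisy_second, mixed)
```

With identical quantizers the second generation reproduces the first exactly, even with a little round-off added; with the different step size every nonzero coefficient shifts again.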
`
`IPR2021-00827
`Unified EX1010 Page 324
`
`
`
Closing Thoughts
`
The only issue is the DCT/IDCT match, which is never perfect. Arithmetic precision errors and round-off errors contribute the very gradual deterioration in performance on subsequent generations.
It will be seen that the quantizer is an essential element of this process. The lossless situation exists only if the quantizers are identical. Successive coding by two encoders that use different quantizers will likely result in quite rapid deterioration.
Researchers at Sarnoff Corporation have shown that compression systems can be very sensitive to the artifacts produced by other compression systems. In one experiment, two "light" compression systems intended for studio operations with high-definition television were compared. One was based on MPEG, the other on wavelets. The signals, both of which were free from visible artifacts, were fed to an ATSC transmission encoder that uses MPEG compression.
The transmission encoder performed normally when fed with the output of the MPEG system, but displayed a much higher level of artifacts when fed from the wavelet system.
This should be cause for concern. Already, television plants are full of compressed nonlinear edit systems and tape formats that use compression. We are about to see widespread deployment of MPEG-based delivery systems, yet we see the rapid growth of DV compression, which uses a quantization strategy quite different from MPEG. None of this will cause an overnight disaster, but we may be eroding our quality headroom without even realizing it.
Two parts of a European project are of great interest in this area. In one, developers are achieving essentially lossless recoding, even with a change of bit rate. All the coding decisions (motion vectors, etc.) of the original compression are extracted from the decoder and sent along with the video. (There is a proposal to code this data by using the least significant bit of one of the color difference signals.) The same coding decisions can then be used by the second encoder, a big step toward retaining quality.
The other work specifically concerns changing bit rates, perhaps from a distribution signal to a broadcast delivery signal. The encoder uses a special form of quantizer that recognizes the previous quantization decisions and attempts to minimize the combined effect of the two quantizers. Demonstrated results are spectacular.
Audio compression systems are not immune to the problems of concatenation. As with video, a compression system designed for final delivery to the point of use should be designed so that the effects of quantization are undetectable. If two identical systems are concatenated then, just as with video, the decisions in the second system should match those of the first and result in little or no additional impairment, if the signal has not changed. If the signal does change, the likely result is that a coarsely quantized signal is requantized differently, and it is very likely that the impairment will be unacceptable.
If nonidentical compression systems are concatenated, rapid deterioration is possible, again just like video. The artifacts of one system are likely to disrupt the efficient operation of the other. An interesting and very important example is the interaction of Dolby Surround Pro Logic (DSPL) with Dolby Digital (DD), if wrongly used, as described in the previous chapter.
`
`Switching MPEG
`
MPEG was conceived for a relatively simple scenario in which information is encoded, sent to the decoder, and decoded, without any intervening steps other than transmission and/or storage. It became evident that there were situations where it would be desirable to switch from one program stream to another, or to a different part of the same stream. An obvious example in the context of MPEG's early objectives is the interactive movie: a story that switches to alternative routes and endings according to actions of the viewer.
Because of these considerations, a number of provisions were made within MPEG for switching or splicing of the bit stream. However, these techniques were really designed for the type of situation just described, where there is prior determination of the switch-out points and of the potential switch-in points. There is no mechanism in MPEG for a generalized switch from any point in one bit stream to any point in another.
Some closer thought makes it clear why this cannot be. At the lowest level, a variable-length-encoded bit stream is meaningful only if decoded from the beginning. Arbitrary switching provides no way to know what the data is, or what it means. If we tried to switch bit streams, like those shown in Chapter 7, we could (and likely would) end up interpreting extra bits as Huffman codes; what should have been motion vectors would be wrongly decoded and, perhaps, used as if the values were DCT coefficients.
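A toy example makes the point concrete. This is a sketch only: the four-symbol prefix code below is invented for illustration and is not one of the MPEG code tables, and the function names are hypothetical.

```python
# Toy prefix (Huffman-style) code; the table is illustrative, not MPEG's.
CODE = {"A": "0", "B": "10", "C": "110", "D": "111"}
DECODE = {bits: symbol for symbol, bits in CODE.items()}

def encode(symbols):
    """Concatenate the variable-length code words for each symbol."""
    return "".join(CODE[s] for s in symbols)

def decode(bits):
    """Decode from the first bit; return (symbols, undecoded leftover bits)."""
    symbols, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in DECODE:
            symbols.append(DECODE[buffer])
            buffer = ""
    return symbols, buffer

stream = encode("ABCDAB")           # "010110111010"
from_start = decode(stream)         # decoded from the beginning
mid_stream = decode(stream[2:])     # joined two bits late, inside the code for "B"

print(from_start)
print(mid_stream)
```

Joining two bits late, the decoder has no way to know it is inside a code word: it reports a spurious "A" before the symbols that really follow. This toy code happens to fall back into step; a real MPEG stream is worse off, because the meaning of each code also depends on which syntax element the decoder believes it is parsing.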
`
We noted earlier that one of the important characteristics of the MPEG slice is that it is encoded without any outside references. All predictive encodings are reset at the beginning of a slice, so that we can begin decoding unambiguously at the start of any slice. Actually this demonstrates that we have not yet defined the problem. We could start decoding the new stream at any slice, but it is not productive to begin decoding a picture in the middle. Resynchronization at a slice boundary is useful for recovery from data errors, but not for switching between programs.
Perhaps we can use the picture boundaries for switching, just as we use vertical interval switching of baseband video today. For I-frame-only encoding we can do this, subject to certain conditions. I-frame-only streams, like motion JPEG streams, can be switched with reasonable ease. Even at this point there are complications. If we permit the same number of bits for each frame (the simplest case), a frame is 33.4 ms long in the nominal 60-Hz regions (such as North America) and 40 ms long in 50-Hz areas (such as Europe and Australasia). Neither lines up with the 32 ms used by a frame of AC-3 audio. Typically this means leaving a small gap in the audio whenever a switch is made, and this complication exists however or wherever we switch.
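The mismatch is easy to quantify. The sketch below assumes the usual figures behind the rounded numbers in the text: a video rate of 30000/1001 frames per second for the nominal 60-Hz case, 25 frames per second for the 50-Hz case, and an AC-3 frame of 1536 samples at a 48-kHz sampling rate. The function name is hypothetical.

```python
from fractions import Fraction

# Frame durations in milliseconds, kept exact with Fraction.
VIDEO_60HZ_MS = Fraction(1001, 30)          # 30000/1001 frames/s, about 33.37 ms
VIDEO_50HZ_MS = Fraction(40)                # 25 frames/s
AC3_MS = Fraction(1536 * 1000, 48000)       # 1536 samples at 48 kHz = 32 ms

def alignment(video_ms, audio_ms):
    """Return (video_frames, audio_frames) in the shortest span in which
    a video frame boundary and an audio frame boundary coincide again."""
    ratio = video_ms / audio_ms             # reduced automatically
    return ratio.denominator, ratio.numerator

for name, video_ms in (("50 Hz", VIDEO_50HZ_MS), ("60 Hz", VIDEO_60HZ_MS)):
    v, a = alignment(video_ms, AC3_MS)
    span = float(v * video_ms)
    print(f"{name}: {v} video frames = {a} audio frames ({span:.0f} ms)")
```

Under these assumptions the boundaries coincide only every 4 video frames (160 ms) in the 50-Hz case, and only every 960 video frames (about 32 s) in the 60-Hz case, so a switch at an arbitrary picture boundary almost always falls inside an audio frame.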
With other GOP structures the situation is more complex. Suppose we switch to the new bit stream, and the first picture is a P-frame. Some macroblocks will likely be intracoded, and these will decode successfully. Others, however, will be predictively encoded: the bit stream will carry a motion vector identifying a similar macroblock in the previous I-frame and a set of DCT-coded information representing the changes that should be made to that macroblock. The assumption made by the encoder is, of course, that the I-frame is present in storage at the decoder. Well, there is an I-frame in the decoder store, but it belongs to the previous video sequence, the one before the switch (Figure 19-2)! If other conditions are correct the decoder will use it, but it will produce nonsense.
If the frame following the switch is a B-frame, the situation is only slightly worse. The B-frame may use both forward and backward prediction, on the assumption that both of the surrounding pair of anchor frames (I, I or I, P or P, P) are in the decoder memory. Probably there are two frames in memory, and they may even be a pair of adjacent anchor frames, but both will belong to the previous video stream (Figure 19-3).
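The bookkeeping can be sketched at the level of whole frames. This is a deliberately crude model with assumptions of its own: it ignores intracoded macroblocks within P-frames, treats every B-frame as bidirectionally predicted, and takes the frame types in transmission (coded) order. The function name is hypothetical.

```python
def decodable_after_switch(coded_order):
    """For each frame type after a switch, report whether its references
    belong to the new stream. `coded_order` is a string of I, P, and B
    frame types in transmission order."""
    # The decoder store holds two anchors; both come from the old stream.
    anchors = [False, False]
    result = []
    for frame_type in coded_order:
        if frame_type == "I":
            ok = True                         # no references needed
        elif frame_type == "P":
            ok = anchors[-1]                  # needs the latest anchor
        elif frame_type == "B":
            ok = all(anchors)                 # needs both surrounding anchors
        else:
            raise ValueError(f"unknown frame type {frame_type!r}")
        if frame_type in "IP":
            anchors = [anchors[-1], ok]       # new anchor replaces the oldest
        result.append((frame_type, ok))
    return result

print(decodable_after_switch("PBBIBBPBB"))    # switch in mid-GOP
print(decodable_after_switch("IPBB"))         # switch at an I-frame
```

In the mid-GOP case nothing decodes correctly until the first I-frame arrives, and in this model the B-frames transmitted just after that I-frame are still wrong because one of their anchors belongs to the old stream.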
`
Figure 19-2 Incorrect P-frame reference after a splice.

Figure 19-3 Incorrect references from a B-frame.
`
It appears that we may be able to switch MPEG bit streams only at GOP boundaries, even though this may not provide sufficient resolution for some applications. We will look at some applications in a moment, but there is one more major issue awaiting us, even at the GOP boundary.
In Chapter 9 we looked at MPEG rate control and discussed the concept of the VBV buffer model. Although not based on real buffer designs, VBV provides a means of specifying decoder buffer performance and a means of checking that a bit stream is decodable by a standard decoder. The basic conditions are that VBV should never overflow or underflow.
VBV is filled at a nominally constant rate by the transmission channel. It is emptied at regular (frame) intervals, but by varying amounts depending on whether the frame being extracted is an I-, P-, or B-frame, and on the complexity of the particular frame. VBV represents an important degree of freedom in the battle for constant quality, and a good decoder will use the full gamut of the buffer to help achieve this. At times the buffer will be close to overflow, at other times close to underflow.
It all works because the encoder tracks VBV fullness at all times; except, of course, when we switch bit streams! The encoder for the second stream is managing VBV for hypothetical decoders that have been receiving the bit stream since the beginning of the sequence. If any (or all) of the decoders have been receiving a different sequence, there is no way for the encoder to "know" that the switch has occurred, or that the decoder VBV is in an unknown and unknowable state. We might switch to an almost full VBV at a time when the second stream encoder has calculated that VBV should be almost empty; overflow is inevitable. Similarly, switching to a near-empty VBV when the encoder is tracking VBV as close to full will result in underflow.
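A simplified buffer model shows both failures. This is a sketch under stated assumptions, not the VBV definition from the MPEG standard: bits arrive in one lump per frame interval, each frame is removed instantaneously, and the buffer size, channel rate, and frame sizes are illustrative numbers. The function name is hypothetical.

```python
def run_vbv(start_fullness, frame_sizes, bits_per_interval, buffer_size):
    """Track buffer fullness frame by frame and report the first failure."""
    fullness = start_fullness
    for n, size in enumerate(frame_sizes):
        fullness += bits_per_interval          # channel fills the buffer
        if fullness > buffer_size:
            return f"overflow at frame {n}"
        if size > fullness:                    # frame not fully received yet
            return f"underflow at frame {n}"
        fullness -= size                       # decoder removes the frame
    return "ok"

BUFFER = 1_800_000      # bits (illustrative)
RATE = 200_000          # bits per frame interval (illustrative)

# Encoder believes the buffer is nearly empty: small frames, then a big one.
filling = [80_000] * 8 + [900_000]
# Encoder believes the buffer is nearly full: a big frame straight away.
draining = [900_000, 300_000, 300_000, 300_000]

print(run_vbv(300_000, filling, RATE, BUFFER))      # as the encoder assumed
print(run_vbv(1_600_000, filling, RATE, BUFFER))    # actual decoder nearly full
print(run_vbv(1_500_000, draining, RATE, BUFFER))   # as the encoder assumed
print(run_vbv(300_000, draining, RATE, BUFFER))     # actual decoder nearly empty
```

Each stream is perfectly legal from the starting state its encoder assumed, and fails within a frame or two when the decoder's real state is at the other end of the buffer.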
What happens when VBV overflows or underflows is undefined by MPEG, and depends on the design of a particular decoder. Most likely (and this is probably the least disturbing solution) the picture will freeze until the decoder finds that it again has valid data waiting to be decoded. Unfortunately, the complexity of the process makes it uncertain how quickly this point will be reached. It is also quite possible that the decoder will start decoding again, but with a VBV state that still differs from that assumed by the encoder. In this case another overflow or underflow is likely, but perhaps some considerable time after the switch.
As a final twist, statistical multiplexing might be the issue that makes everything else look easy. It is difficult to imagine how to switch into one program stream (that is, to replace all packets associated with that one program in the multiplexed bit stream) when the number of bits allocated to that program changes dynamically under the control of some device that may be on the other side of the country. A working scenario would have to include rules about bit rate floors, at least at commercial insertion times, and bit rate ceilings for any material to be inserted at the switch.
`
Clearly, switching MPEG bit streams is a complex and uncertain process. In fact, the preceding discussion presents only a gross simplification of the problem. Almost certainly the general problem for two unconstrained bit streams is insoluble. An MPEG expert would suggest that MPEG is not designed for such treatment, so "Don't do that!" Unfortunately, in the world of television broadcasting there are a number of areas where switching of MPEG streams is the only practical solution to an operational requirement. Let's look at some applications and see how they might be handled.
`
`MPEG Applications
`
MPEG is used by recording devices and it would be useful to edit segments together without decoding. On disk recorders particularly, the ability to perform edits merely by using the random access capabilities of the disk is a major convenience. At the time of writing, almost all disk systems use motion JPEG, or some other intracoding scheme where splicing is not a problem. However, there are strong incentives to use MPEG for increased recording time and/or higher image quality. MPEG's increase in efficiency comes mainly from temporal compression, so intracoding is not an option that satisfies these requirements.
The ATSC Digital Television Standard uses MPEG-2 encoding for video and provides for (nominally) one high-definition program or several multiplexed program streams of standard definition. Normally about 18 Mbits/s is available for video in the transmission signal. Initially it was assumed that this signal would be distributed by networks, and that television stations would switch to local commercials by switching the MPEG bit stream. The lack of production values offered to the station, and the difficulties of the switching, have contributed to the unpopularity of this scheme, and most networks are now proposing a different approach (see below).
As we move down the transmission food chain, however, there is a continued need for switching, and although neither is attractive, eventually there is no option but to decode and re-encode, or to switch the transmission signal. The most obvious example is the cable head-end. As cable systems extend the digital signal to the home, economics demand that the head-end operate in pass-through mode, without decoding the video and audio. Yet insertion of local commercials at cable head-ends is now a thriving business, and this requirement will certainly not disappear.
`
Most of these problems are as yet unsolved, and it is clear that there is no universal solution, but potential solutions for various applications are on the horizon.
`
`Some Solutions
`
There are significant developments in the field of re-encoding, based on reusing the original coding decisions. These developments will make a decode/re-encode approach viable in many areas by drastically reducing the losses normally associated with this process.
Another approach requires partial decoding of the signal to change frame types to preserve the integrity of motion compensation. In particular, B-frames may be modified so that vectors are used in only one direction. At the International Broadcasting Convention (IBC) in 1997, the European Atlantic project demonstrated switching of this type.
Conventional wisdom might suggest that the first frame of the new sequence (after a switch) should be an I-frame, requiring no references from previous frames. However, B-frames may be used if they employ only backward prediction, that is, prediction based on an anchor frame later in time. (But remember, the I-frame still must be transmitted first, even though it is for later presentation.) The advantage is that these B-frames can, at the expense of image quality, be squeezed to a very low number of bits if required, and it may be necessary to reduce the number of bits at the beginning of the second sequence to correct the VBV fullness.
Taking bits from the B-frames has two advantages. First, the B-frames are a part of the sequence that can be deprived of bits without affecting the quality of the I-frame or introducing errors that will propagate. Second, a scene change provides a substantial degree of temporal masking in the human visual system. If the low-quality B-frames occur immediately after the switch, it is most unlikely that the quality loss will be noticed.
Certainly the IBC demonstration supported this view. An example switch between two sequences used two unidirectional B-frames immediately after the cut. Examining the result frame by frame revealed that these B-frames were of very poor quality, but at normal play speed the same sequence was quite acceptable, with no apparent quality loss.
It is not certain what impact such developments will have on decisions on signal switching at cable head-ends and similar locations. If a switch can be made by decoding and re-encoding (switching at video) with no significant loss of quality, this mechanism may be attractive for many applications. It may be that, by combining the techniques of reusing coding decisions and modifying frame types, the motion detector could be eliminated from the re-encoder. This would eliminate the most significant cost element and make this approach financially attractive.
In the meantime, most workers assume that some method of direct switching of MPEG bit streams will be required for some applications. A working group of SMPTE, under the chairmanship of the author, investigated possible techniques for some two years. The real work was performed by an ad hoc group chaired by Katie Cornog of Avid, with members from many different industries. Eventually the committee produced a standard for splicing MPEG bitstreams, SMPTE 312M, published in 1999.
The standard specifies how MPEG transport streams may be constructed to permit splicing. Splice points, in-points, and out-points are inserted at the desired points in the bit stream, and messages may be sent in the bit stream to advise splicing devices of upcoming splice points.
The key to necessary VBV buffer management is to relate buffer fullness to a delay. As the "input" to VBV is a constant rate bit stream, each value of buffer fullness corresponds to a certain delay: the time it would take for the buffer to reach that fullness from empty. This is known as the splice decoding delay. A seamless splice is defined as one where there is no discernible artifact when the spliced bit stream is played by a standard decoder. To achieve a seamless splice, both bit streams are constrained to have a splice decoding delay equal to a defined nominal value at the splice point. (There are many other constraints, but this is the essential factor for VBV management.)
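The relationship between fullness and delay is simple division. The sketch below assumes a constant channel rate; the fullness figures, bit rates, and the 0.25-second nominal value are illustrative choices, not numbers quoted from the standard, and the function names are hypothetical.

```python
import math

def splice_decoding_delay(fullness_bits, bit_rate):
    """Seconds needed to fill the buffer from empty to `fullness_bits`
    at a constant `bit_rate` (bits per second)."""
    return fullness_bits / bit_rate

def vbv_allows_seamless_splice(out_delay, in_delay, nominal_delay):
    """Both streams must sit at the nominal delay at the splice point.
    This checks only the VBV condition, not the other constraints."""
    return (math.isclose(out_delay, nominal_delay)
            and math.isclose(in_delay, nominal_delay))

NOMINAL = 0.25  # seconds (assumed for illustration)

old_stream = splice_decoding_delay(1_500_000, 6_000_000)   # 0.25 s
new_stream = splice_decoding_delay(1_000_000, 4_000_000)   # 0.25 s
late_stream = splice_decoding_delay(1_500_000, 4_000_000)  # 0.375 s

print(vbv_allows_seamless_splice(old_stream, new_stream, NOMINAL))
print(vbv_allows_seamless_splice(old_stream, late_stream, NOMINAL))
```

Expressing the constraint as a delay rather than a fullness is what lets streams of different bit rates be spliced: the first two streams hold different numbers of bits at the splice point yet share the same delay.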
To provide for cases where a seamless splice cannot be achieved, the proposed standard also defines a nonseamless splice where VBV may undergo a controlled underflow. A well-designed decoder will play a nonseamless splice with minimal disturbance, usually a freeze of the video for a few frames.
Since the publication of the SMPTE Standard, it has become increasingly apparent that for most applications, seamless splicing is not practical. The bitstreams require considerable preparation to include splice points and to manage the buffers in the vicinity. It appears that other techniques, making use of mezzanine compression, transparent decode/encode, and/or GOP modification will provide practical solutions for studio applications.
`
For lower-budget applications, such as local commercial insertion in cable systems, the cable industry has developed economical approaches that will likely be incorporated into a future version of the SMPTE standard.
`
Mezzanine Compression Systems
`
It is clear from the preceding sections that switching compressed signals is not simple. Furthermore, production effects other than cuts are needed frequently at intermediate points. For example, a local television station must switch local commercials to a network feed but also must key station identification and alert messages over the video and insert voiceovers in the audio. Some stations perform more sophisticated production effects such as squeezing the network image as the credits roll to allow a "promo" for the next program. At present none of these effects can be performed on the compressed signal: to produce spatial effects we need the signal in the spatial domain, not the frequency domain. It is possible that solutions may be found for the simple cases, but in general it is safe to assume that production effects require a baseband (uncompressed) signal, whether in video or audio.
This presents a problem. Although we have seen that identical compression systems concatenate almost losslessly, this is not true if the signal is changed between decoding and re-encoding. In the case of MPEG delivery systems, it is fair to assume that insufficient headroom exists to be able to decode, perform effects, and re-encode without substantial degradation.
The issue arises at many points in a television broadcast chain, in contribution and distribution channels, and in the TV station when recording devices are encountered. Most standard-definition digital video recorders use some form of compression, and it is highly probable that compressed recording will dominate the world of high definition. How do we design a viable system that includes compression at so many points, and where there is a need for production effects at many different points in the chain?
The concept of mezzanine, or in-between, compression is evolving to meet these needs for broadcast systems based on the ATSC Digital Television Standard. (To quote a colleague, a mezzanine is the floor in a hotel you can never find, and where your meeting is!)
`
Requirements for a mezzanine system cover quite a large range. At one extreme, the mezzanine signal must be carried over a single satellite transponder (around 45 Mbits/s maximum) as part of the network distribution, and must have sufficient quality overhead to permit decoding, production effects, and re-encoding to the ATSC emission standard (approximately 18 Mbits/s for video).
At the other extreme, compression used for recording, and possibly in-plant distribution, must be sufficiently robust to permit a significant number of decode/re-encode operations (perhaps 6 to 10) without noticeable degradation. It is also required that this signal be capable of simple switching and editing, so intra-only coding should be used, preferably with a fixed number of bits per frame. Color encoding should be at least 4:2:2 for studio effects.
From the discussion of concatenation, we can draw some conclusions about the required characteristics of the mezzanine systems. Working backward from the emission system defined, we must use the same algorithm (DCT), the same DCT block placement, and the same quantization strategy.
Obviously, there is a scheme that meets these requirements, MPEG-2 itself, but as published there is no MPEG profile/level combination that meets all these needs. Because of the urgent need for a working document, SMPTE is in the process of standardizing a new operating point, the 4:2:2 profile at high level (4:2:2@HL), permitting bit rates up to 300 Mbits/s. Table 19-1 shows some examples of how this might be used. (SDTI is serial data transport interface, a 1998 SMPTE standard for transmission of packetized data over serial digital links in the studio.)
`
TABLE 19-1 Possible scenarios for the use of MPEG for pretransmission encoding of high-definition TV signals.

Application                Possible bit rates   Possible circuits        Possible GOP structures
Contribution (low delay)   155 Mbits/s          OC-3 fiber               I only, or IP or IB
Contribution (other)       60-80 Mbits/s        8-PSK satellite          Long GOP if delay unimportant, otherwise balance between quality and delay requirements
In-plant                   200-270 Mbits/s      Recorders and SDTI       I only
Distribution               40-45 Mbits/s        QPSK satellite or DS-3   Same as emission standard
`
Similar issues arise with audio distribution, but there are some different twists. Like any compression system, AC-3 (Dolby Digital) can be decoded and re-encoded, but this is not desirable when compression has been performed at the final delivery rate. However, production requirements are similar: a station needs to perform voiceovers, and these cannot be done with audio in a compressed form. It is possible to use the same compression scheme at a higher data rate for distribution, and this is certainly one possibility. However, once the need to deviate from the transmission standard is seen, it may be profitable to explore further. In this case requirements in the studio suggest an alternative approach that could also be used for distribution.
Two problems arise in the studio. One is the number of channels required. The six channels could be carried by three AES/EBU circuits, but most recording devices will not handle this many channels. Also, it is difficult if not impossible to carry three AES/EBU circuits in perfect synchronism through all the operations and equipment of a television studio. It is probable that the signals will slip by at least 1 bit. Unfortunately 1 bit represents a phase change of some 35° at 10 kHz, and this is a significant error in a true surround-sound system. The other problem is that there is no convenient mechanism to carry the metadata (discussed in Chapter 17) that is required for the AC-3 encoder.
Dolby Laboratories has designed a new compression scheme to satisfy these requirements. The new system, Dolby-E, uses an algorithm based on that used for AC-3 and specifically designed to be benign when used ahead of AC-3. It will compress 5.1 channels into the payload of a single AES/EBU circuit, together with the metadata needed for the AC-3 encoder. Some versions of Dolby-E provide additional channels within the same bitstream. This can be particularly useful when a program needs 5.1 audio for digital transmission and stereo or DSPL for analog transmission.
Dolby-E also addresses an issue common to all other digital audio systems: the fact that the length of the frame or access unit does not relate easily to the length of the video frame. Dolby-E has versions corresponding to all the major video formats, and in each case the length of the access unit matches that of the video frame. This removes many of the problems traditionally associated with switching and editing digital audio and video.
This audio mezzanine approach has many advantages. The bit rate is acceptable for distribution, yet high enough to permit a useful number of decode/reencode operations; stations need not triple the layers of audio routing, videotape recorders will be able to record the signal, and the metadata will be delivered as part of the package.
`
`A Glimpse into the Future
`
Even though MPEG-2 is just going into widespread service, great things are being planned for the future. The MPEG-4 standard is approved and applications are appearing.
MPEG-7 is well under way and, in conjunction with new technology, promises to greatly increase the real value of multimedia archives. MPEG-21 is intended to provide a complete structure to manage digital assets.
We saw that MPEG-4 can transmit a model of a face, and then animate that face to match dialog. Other researchers are working on the implementation of agents: models that will not only accurately mimic the delivery of a speech, but will do so in whatever language is requested.
That's one of the interesting things about compression and its allied technologies. The progress over the last few years has been dramatic, yet the more we know, the more we realize that we have only scratched the surface. I find the subject fascinating; I just hope that I have managed to convey some of my fascination with this book.
`
`
`GLOSSARY
`
This glossary is based upon that included in the ATSC Digital Television Standard, A/53, and permission to reproduce it here is gratefully acknowledged. The ATSC glossary, and some of my additions, include terms that are not strictly applicable to video compression and will not be found elsewhere in this book. Others refer to advanced topics beyond the scope of this book. All of these terms are, however, relevant to the field of digital television. Some of the definitions are those from the MPEG standards; in these cases I have retained the formal syntax (e.g., group_start_code).
`
8-PSK A variant of QPSK used for satellite links to provide greater data capacity under low-noise conditions.

8 VSB Vestigial sideband modulation with 8 discrete amplitude levels.

16 VSB Vestigial sideband modulation with 16 discrete amplitude levels.

ACATS Advisory Committee on Advanced Television Service.

AC-3 The audio compression scheme invented by Dolby Laboratories and specified for the ATSC Digital Television Standard. In the world of consumer equipment it is called Dolby Digital.
`
`access