Application Issues of MPEG-1/2 Video Coding

during a transition period, both NTSC and DTV service will be simultaneously broadcast on different channels, and DTV can only use the taboo channels. This approach allows a smooth transition to DTV, such that the services of the existing NTSC receivers will remain and gradually be phased out of existence in the year 2006. The simulcasting requirement causes some technical difficulties in DTV design. First, the high-quality HDTV program must be delivered in a 6-MHz channel to make efficient use of spectrum and to fit allocation plans for the spectrum assigned to television broadcasting. Second, a low-power and low-interference signal must be used so that simulcasting in the same frequency allocations as current NTSC service does not cause excessive interference with existing NTSC reception, since the taboo channels are generally unsuitable for broadcasting an NTSC signal due to high interference. In addition to satisfying the frequency spectrum requirement, the DTV standard has several important features, which allow DTV to achieve interoperability with computers and data communications. The first feature is the adoption of a layered digital system architecture. Each individual layer of the system is designed to be interoperable with other systems at the corresponding layers. For example, the square pixel and progressive scan picture format should be provided to allow computers access to the compression layer or picture layer, depending on the capacity of the computers, and the ATM-like packet format should be provided for the ATM network to access the transport layer. Second, the DTV standard uses a header/descriptor approach to provide maximally flexible operating characteristics. Therefore, the layered architecture is the most important feature of DTV standards. The additional advantage of layering is that the elements of the system can be combined with other technologies to create new applications. The DTV standard system includes four layers: the picture layer, the compression layer, the transport layer, and the transmission layer.
`
17.2.2.1 Picture Layer

At the picture layer, the input video formats have been defined. The Executive Committee of the ATSC has approved release of a statement regarding the identification of the HDTV and Standard Definition Television (SDTV) transmission formats within the ATSC DTV standards. There are six video formats in the ATSC DTV standard that are "High Definition Television." These formats are listed in Table 17.1.
The remaining 12 video formats are not HDTV formats. These formats represent some improvements over analog NTSC and are referred to as "SDTV." These are listed in Table 17.2.
These definitions are fully supported by the technical specifications for the various formats as measured against the internationally accepted definition of HDTV established in 1989 by the ITU and the definitions cited by the FCC during the DTV standard development process. These formats cover a wide variety of applications, which include motion picture film, currently available HDTV production equipment, the NTSC television standard, and computers such as personal computers and workstations. However, there is no simple technique which can convert images from one pixel
`
TABLE 17.1
HDTV Formats

Spatial Format                  Aspect Ratio    Temporal Rate
(X x Y active pixels)                           (Hz progressive scan)

1920 x 1080 (square pixel)      16:9            23.976/24, 29.97/30, 59.94/60
1280 x 720 (square pixel)       16:9            23.976/24, 29.97/30, 59.94/60

`IPR2021-00827
`Unified EX1008 Page 395
Image and Video Compression for Multimedia Engineering
`
TABLE 17.2
SDTV Formats

Spatial Format                  Aspect Ratio    Temporal Rate
(X x Y active pixels)                           (Hz progressive scan)

704 x 480 (CCIR601)             16:9 or 4:3     23.976/24, 29.97/30, 59.94/60
640 x 480 (VGA, square pixel)   4:3             23.976/24, 29.97/30, 59.94/60

format and frame rate to another that achieves interoperability among film and the various worldwide television standards. For example, all low-cost computers use square pixels and progressive scanning, while current television uses rectangular pixels and interlaced scanning. The video industry has paid a lot of attention to developing format-converting techniques. Some techniques, such as deinterlacing and down/up-conversion for format conversion, have already been developed. It should be noted that the broadcasters, content providers, and service providers can use any one of these DTV formats. This results in a difficult problem for DTV receiver manufacturers, who have to provide all kinds of DTV receivers to decode all these formats and then to convert the decoded signal to its particular display format. On the other hand, this requirement also gives receiver manufacturers the flexibility to produce a wide variety of products that have different functionality and cost, and gives consumers the freedom to choose among them.
`
17.2.2.2 Compression Layer

The raw data rate of HDTV of 1920 x 1080 x 30 x 16 (16 bits per pixel corresponds to 4:2:2 color format) is about 1 Gbps. The function of the compression layer is to compress the raw data from about 1 Gbps to the data rate of approximately 19 Mbps to satisfy the 6-MHz spectrum requirement. This goal is achieved by using the main profile and high level of the MPEG-2 video standard. Actually, during the development of the Grand Alliance HDTV system, many research results were adopted by the MPEG-2 standard at the same time; for example, the support for interlaced video format and the syntax for data partitioning and scalability. The ATSC DTV standard is the first and most important application example of the MPEG-2 standard. The use of MPEG-2 video compression fundamentally enables ATSC DTV devices to interoperate with MPEG-1/2 computer multimedia applications directly at the compressed bitstream level.
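The arithmetic behind the "about 1 Gbps" figure can be checked with a short calculation. The sketch below uses only the numbers quoted in the text; the function and variable names are illustrative, and 19 Mbps is taken as the approximate channel rate given above.

```python
def raw_video_bit_rate(width, height, frames_per_second, bits_per_pixel):
    """Uncompressed bit rate in bits per second."""
    return width * height * frames_per_second * bits_per_pixel


# 1920 x 1080 pixels, 30 frames/s, 16 bits per pixel (4:2:2 color format)
raw_bps = raw_video_bit_rate(1920, 1080, 30, 16)

# Approximate payload rate of the 6-MHz channel quoted in the text
channel_bps = 19e6

# Compression ratio the compression layer has to deliver
ratio = raw_bps / channel_bps

print(f"raw rate: {raw_bps / 1e9:.3f} Gbps")
print(f"required compression ratio: about {ratio:.0f}:1")
```

Under these figures the compression layer has to reduce the data by a factor of roughly 50.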
`
17.2.2.3 Transport Layer

The transport layer is another important issue for interoperability. The ATSC DTV transport layer uses the MPEG-2 system transport stream syntax. It is a fully compatible subset of the MPEG-2 transport protocol. The basic function of the transport layer is to define the basic format of data packets. The purposes of packetization include:

• Packaging the data into fixed-size cells or packets for forward error correction (FEC) encoding to protect against bit errors due to communication channel noise;
• Multiplexing the video, audio, and data of a program into a bitstream;
• Providing time synchronization for different media elements;
• Providing flexibility and extensibility with backward compatibility.
`
[Figure 17.1: a sequence of fixed-length packets, each beginning with a 4-byte packet header and carrying a single data type (video, audio, or program guide).]
FIGURE 17.1 Packet structure of ATSC DTV transport layer.
`
The transport layer of ATSC DTV uses a fixed-length packet. The packet size is 188 bytes, consisting of 184 bytes of payload and 4 bytes of header. Within the packet header, the 13-bit packet identifier (PID) is used to provide the important capacity to combine the video, audio, and ancillary data streams into a single bitstream, as shown in Figure 17.1. Each packet contains only a single type of data (video, audio, data, program guide, etc.) identified by the PID.
This type of packet structure packetizes the video, audio, and auxiliary data separately. It also provides the basic multiplexing function that produces a bitstream including video, five-channel surround-sound audio, and an auxiliary data capacity. This kind of transport layer approach also provides complete flexibility to allocate channel capacity to achieve any mix among video, audio, and other data services. It should be noted that the selection of the 188-byte packet length is a trade-off between reducing the overhead due to the transport header and increasing the efficiency of error correction. Also, one ATSC DTV packet can be completely encapsulated with its header within four ATM packets by using 1 AAL byte per ATM packet, leaving 47 usable payload bytes per packet, times 4, for 188 bytes. The details of the transport layer are discussed in the chapter on MPEG systems.
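To make the role of the PID concrete, the following is a minimal sketch of PID extraction and demultiplexing for 188-byte packets. It assumes the MPEG-2 transport packet header layout (sync byte 0x47, with the 13-bit PID in the low 5 bits of the second byte and all of the third byte), which the text does not spell out. It also treats every packet as carrying 184 payload bytes, ignoring adaptation fields. The helper `make_packet` and the PID values are hypothetical and exist only to build test data.

```python
PACKET_SIZE = 188   # bytes per transport packet (from the text)
HEADER_SIZE = 4     # bytes of header (from the text)
SYNC_BYTE = 0x47    # first header byte in MPEG-2 transport packets (assumption)


def parse_pid(packet: bytes) -> int:
    """Return the 13-bit packet identifier from a transport packet header."""
    if len(packet) != PACKET_SIZE:
        raise ValueError("transport packets are exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def demultiplex(stream: bytes) -> dict:
    """Group payload bytes by PID (simplified: no adaptation-field handling)."""
    streams = {}
    for offset in range(0, len(stream) - PACKET_SIZE + 1, PACKET_SIZE):
        packet = stream[offset:offset + PACKET_SIZE]
        pid = parse_pid(packet)
        streams.setdefault(pid, bytearray()).extend(packet[HEADER_SIZE:])
    return streams


def make_packet(pid: int, payload: bytes) -> bytes:
    """Hypothetical helper: build a payload-only packet padded to 188 bytes."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + payload.ljust(PACKET_SIZE - HEADER_SIZE, b"\xff")


VIDEO_PID, AUDIO_PID = 0x31, 0x34   # illustrative values only
mux = (make_packet(VIDEO_PID, b"V1")
       + make_packet(AUDIO_PID, b"A1")
       + make_packet(VIDEO_PID, b"V2"))
elementary = demultiplex(mux)

# ATM encapsulation arithmetic from the text: 48-byte cell payload minus
# 1 AAL byte leaves 47 bytes, and four cells carry one transport packet.
assert 4 * (48 - 1) == PACKET_SIZE
```

A real demultiplexer would also have to inspect the adaptation field control bits before treating the remaining bytes as payload.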
`
17.2.2.4 Transmission Layer

The function of the transmission layer is to modulate the transport bitstream into a signal that can be transmitted over the 6-MHz analog channel. The ATSC DTV system uses a trellis-coded eight-level vestigial sideband (8-VSB) modulation technique to deliver approximately 19.3 Mbps in the 6-MHz terrestrial simulcast channel. VSB modulation inherently requires only processing the in-phase signal sampled at the symbol rate, thus reducing the complexity of the receiver, and ultimately the cost of implementation. The VSB signal is organized in a data frame that provides a training signal to facilitate channel equalization for removing multipath distortion. However, several field-test results show that multipath distortion is still a serious problem for terrestrial simulcast reception. The frame is organized into segments, each with 832 symbols. Each transmitted segment consists of one synchronization byte (four symbols), 187 data bytes, and 20 R-S parity bytes. This corresponds to a 188-byte packet, which is protected by a 20-byte R-S code. Interoperability at the transmission layer is required by different transmission media applications. The different media use different modulation techniques now, such as QAM for cable and QPSK for satellite. Even for terrestrial transmission, European DVB systems use OFDM transmission. The ATV receivers will not only be designed to receive terrestrial broadcasts, but also the programs from cable, satellite, and other media.
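The segment figures can be cross-checked with a little arithmetic. The sketch below assumes a rate-2/3 trellis code, so each 3-bit, eight-level symbol carries 2 bits of R-S-coded data. That code rate is not stated in the text and is taken here as an assumption about the 8-VSB system; the constant names are illustrative.

```python
SYNC_SYMBOLS = 4            # one synchronization byte sent as four symbols
DATA_BYTES = 187            # data bytes per segment (from the text)
PARITY_BYTES = 20           # R-S parity bytes per segment (from the text)
BITS_PER_SYMBOL = 3         # eight levels carry 3 bits per symbol
TRELLIS_IN, TRELLIS_OUT = 2, 3   # assumed rate-2/3 trellis code


def segment_symbols() -> int:
    """Number of symbols in one transmitted data segment."""
    rs_coded_bits = (DATA_BYTES + PARITY_BYTES) * 8          # bits into the trellis coder
    trellis_bits = rs_coded_bits * TRELLIS_OUT // TRELLIS_IN  # bits out of the trellis coder
    return SYNC_SYMBOLS + trellis_bits // BITS_PER_SYMBOL


print(segment_symbols())
```

Under this assumption the 207 coded bytes map to 828 symbols, which together with the four synchronization symbols gives the 832 symbols per segment quoted above.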
`
17.3 TRANSCODING WITH BITSTREAM SCALING

17.3.1 BACKGROUND

As indicated in the previous chapters, digital video signals exist everywhere in the format of compressed bitstreams. The compressed bitstreams of video signals are used for transmission and storage through different media such as terrestrial TV, satellite, cable, the ATM network, and the Internet. The decoding of a bitstream can be implemented in either hardware or software. However, for high-bit-rate compressed video bitstreams, specially designed hardware is still the major decoding approach due to the speed limitation of current computer processors. The compressed bitstream as a new format of video signal is a revolutionary change to the video industry since it enables many applications. On the other hand, there is a problem of bitstream conversion. Bitstream conversion or transcoding can be classified as bit rate conversion, resolution conversion, and syntax conversion. Bit rate conversion includes bit rate scaling and the conversion between constant bit rate (CBR) and variable bit rate (VBR). Resolution conversion includes spatial resolution conversion and temporal resolution conversion. Syntax conversion is needed between different compression standards such as JPEG, MPEG-1, MPEG-2, H.261, and H.263. In this section, we will focus on the topic of bit rate conversion, especially on bit rate scaling, since it finds wide application and readers can extend the idea to other kinds of transcoding. Also, we limit ourselves to the problem of scaling an MPEG CBR-encoded bitstream down to a lower CBR. The other kind of transcoding, the down-conversion decoder, will be presented in a separate section.
The basic function of bitstream scaling may be thought of as a black box, which passively accepts a precoded MPEG bitstream at the input and produces a scaled bitstream, which meets new constraints that are not known a priori during the creation of the original precoded bitstream. The bitstream scaler is a transcoder, or filter, that provides a match between an MPEG source bitstream and the receiving load. The receiving load consists of the transmission channel, the destination decoder, and perhaps a destination storage device. The constraint on the new bitstream may be bound by a variety of conditions. Among them are the peak or average bit rate imposed by the communications channel, the total number of bits imposed by the storage device, and/or the variation of bit usage across pictures due to the amount of buffering available at the receiving decoder.
While the idea of bitstream scaling has many concepts similar to those provided by the various MPEG-2 scalability profiles, the intended applications and goals differ. The MPEG-2 scalability methods (data partitioning, SNR scalability, spatial scalability, and temporal scalability) are aimed at providing encoding of source video into multiple service grades (that are predefined at the time of encoding) and multitiered transmission for increased signal robustness. The multiple bitstreams created by MPEG-2 scalability are hierarchically dependent in such a way that by decoding an increasing number of bitstreams, higher service grades are reconstructed. Bitstream scaling methods, in contrast, are primarily decoder/transcoder techniques for converting an existing precoded bitstream to another one that meets new rate constraints. Several applications that motivate bitstream scaling include the following:
`
1. Video-On-Demand. Consider a video-on-demand (VOD) scenario wherein a video file server includes a storage device containing a library of precoded MPEG bitstreams. These bitstreams in the library are originally coded at high quality (e.g., studio quality). A number of clients may request retrieval of these video programs at one particular time. The number of users and the quality of video delivered to the users are constrained by the outgoing channel capacity. This outgoing channel, which may be a cable bus or an ATM trunk, for example, must be shared among the users who are admitted to the service. Different users may require different levels of video quality, and the quality of a respective program will be based on the fraction of the total channel capacity allocated to each user. To accommodate a plurality of users simultaneously, the video file server must scale the stored precoded bitstreams to a reduced rate before they are delivered over the channel to respective users. The quality of the resulting scaled bitstream should not be significantly degraded compared with the quality of a hypothetical bitstream obtained by coding the original source material at the reduced rate. Complexity cost is not such a critical factor because only the file server has to be equipped with the bitstream scaling hardware, not every user. Presumably, video service providers would be willing to pay a high cost for delivering the highest possible quality video at a prescribed bit rate.
   As an option, a sophisticated video file server may also perform scaling of multiple original precoded bitstreams jointly and statistically multiplex the resulting scaled VBR bitstreams into the channel. By scaling the group of bitstreams jointly, statistical gains can be achieved. These statistical gains can be realized in the form of higher and more uniform picture quality for the same channel capacity. Statistical multiplexing over a DirecTv transponder (Isnardi, 1993) is one example of an application of video statistical multiplexing.
2. Trick-Play Track on Digital VTRs. In this application, the video bitstream is scaled to create a sidetrack on video tape recorders (VTRs). This sidetrack contains very coarse quality video sufficient to facilitate trick-modes on the VTR (e.g., FF and REW at different speeds). Complexity cost for the bitstream scaling hardware is of significant concern in this application since the VTR is a mass consumer item subject to mass production.
3. Extended-Play Recording on Digital VTRs. In this application, video is broadcast to users' homes at a certain broadcast quality (~6 Mbps for standard-definition video and ~24 Mbps for high-definition video). With a bitstream scaling feature in their VTRs, users may record the video at a reduced rate, akin to extended-play (EP) mode on today's VHS recorders, thereby recording a greater duration of video programs onto a tape at lower quality. Again, hardware complexity costs would be a major factor here.
`
17.3.2 BASIC PRINCIPLES OF BITSTREAM SCALING

As described previously, the idea of scaling an MPEG-2-compressed bitstream down to a lower bit rate is motivated by several applications. One problem is the criteria that should be used to judge the performance of an architecture that can reduce the size or rate of an MPEG-compressed bitstream. Two basic principles of bitstream scaling are (1) the information in the original bitstream should be exploited as much as possible, and (2) the resulting image quality of the new bitstream with a lower bit rate should be as close as possible to that of a bitstream created by coding the original source video at the reduced rate. Here, we assume that for a given rate the original source is encoded in an optimal way. Of course, the hardware implementation complexity also has to be considered. Figure 17.2 shows a simplified encoding structure of MPEG encoding in which the rate control mechanism is not shown.
In this structure, a block of image data is first transformed to a set of coefficients; the coefficients are then quantized with a quantizer step which is decided by the given bit rate budget, or number of bits assigned to this block. Finally, the quantized coefficients are coded in variable-length coding to the binary format, which is called the bitstream or bits.
`
[Figure 17.2: block diagram with blocks T, Q, VLC, and P between the input source and the output bits.]
FIGURE 17.2 Simplified encoder structure. T = transform, Q = quantizer, P = motion-compensated prediction, VLC = variable length.
`
`
From this structure it is obvious that the performance of changing the quantizer step will be better than cutting higher frequencies when the same amount of rate needs to be reduced. In the original bitstream the coefficients are quantized with finer quantization steps which are optimized at the original high rate. After cutting the coefficients of higher frequencies, the rest of the coefficients are not quantized with an optimal quantizer. In the method of requantization all coefficients are requantized with an optimal quantizer which is determined by the reduced rate; the performance of the requantization method must be better than the method of cutting high frequencies to reach the reduced rate. The theoretical analysis is given in Section 17.3.4.
In the following, several different architectures that accomplish the bitstream scaling are discussed. The different methods have varying hardware implementation complexities; each has its own degree of trade-off between required hardware and resulting image quality.
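The two operations being compared can be illustrated on a single block of quantized coefficient levels. The following is a minimal sketch, assuming a plain uniform quantizer with one step size per block and coefficients listed from low to high frequency. Real MPEG-2 quantization also involves weighting matrices, which are omitted here. All names and sample values are illustrative, and the example shows only what each operation does to the data, not which one gives better quality.

```python
def round_half_away(x: float) -> int:
    """Round to the nearest integer, with ties going away from zero."""
    return int(x + 0.5) if x >= 0 else -int(-x + 0.5)


def cut_high_frequencies(levels, keep):
    """Keep the first `keep` levels and zero all higher-frequency ones."""
    return levels[:keep] + [0] * (len(levels) - keep)


def requantize(levels, old_step, new_step):
    """Dequantize with the original step, then quantize with a coarser step."""
    return [round_half_away(level * old_step / new_step) for level in levels]


def reconstruct(levels, step):
    """Decoder-side reconstruction of coefficient values."""
    return [level * step for level in levels]


# Quantized levels of one block, low to high frequency (made-up values)
levels = [40, -18, 9, 5, -3, 2, 1, 1]
fine_step, coarse_step = 4, 8

cut = cut_high_frequencies(levels, keep=3)
requantized = requantize(levels, fine_step, coarse_step)

print(reconstruct(cut, fine_step))
print(reconstruct(requantized, coarse_step))
```

Cutting leaves the surviving coefficients at their original fine precision and discards the rest entirely, whereas requantization keeps every coefficient but represents each with fewer levels.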
`
17.3.3 ARCHITECTURES OF BITSTREAM SCALING

Four architectures for bitstream scaling are discussed. Each of the scaling architectures described has its own particular benefits that are suitable for a particular application.

Architecture 1: The bitstream is scaled by cutting high frequencies.
Architecture 2: The bitstream is scaled by requantization.
Architecture 3: The bitstream is scaled by reencoding the reconstructed pictures with motion vectors and coding decision modes extracted from the original high-quality bitstream.
Architecture 4: The bitstream is scaled by reencoding the reconstructed pictures with motion vectors extracted from the original high-quality bitstream, but new coding decisions are computed based on reconstructed pictures.

Architectures 1 and 2 are considered for VTR applications such as trick-play modes and EP recording. Architectures 3 and 4 are considered for VOD server applications and other applicable StatMux scenarios.
`
17.3.3.1 Architecture 1: Cutting AC Coefficients

A block diagram illustrating architecture 1 is shown in Figure 17.3(a). The method of reducing the bit rate in architecture 1 is based on cutting the higher-frequency coefficients. The incoming precoded CBR stream enters a decoder rate buffer. Following the top branch leading from the rate buffer, a VLD is used to parse the bits for the next frame in the buffer to identify all the variable-length codewords that correspond to AC coefficients used in that frame. No bits are removed from the rate buffer. The codewords are not decoded, but just simply parsed by the VLD parser to determine codeword lengths. The bit allocation analyzer accumulates these AC bit counts for every macroblock in the frame and creates an AC bit usage profile as shown in Figure 17.3(b). That is, the analyzer generates a running sum of AC DCT coefficient bits on a macroblock basis:

$$PV_N = \sum_{N} \mathrm{AC\_BITS}, \tag{17.1}$$

where $PV_N$ is the profile value of a running sum of AC codeword bits until the macroblock N. In addition, the analyzer counts the sum of all coded bits for the frame, TB (total bits). After all macroblocks for the frame have been analyzed, a target value $TV_{AC}$ of AC DCT coefficient bits per frame is calculated as

$$TV_{AC} = PV_{LS} - \alpha \cdot TB - B_{EX}, \tag{17.2}$$
`
`
[Figure 17.3: (a) block diagram in which the bitstream feeds a top branch (VLD parser, bit allocation analysis) and a bottom branch (delay, VLD parser, rate controller with frequency cut) that takes the new bit rate and produces the output bits; (b) plot of cumulative bits used for AC coefficients vs. block number, showing the profile of original bits and the new target.]
FIGURE 17.3 (a) Architecture 1: cutting high frequencies. (b) Profile map.
`
where TV_AC is the target value of AC codeword bits per frame, PV_LS is the profile value at the last macroblock, α is the percentage by which the preencoded bitstream is to be reduced, TB is the total bits, and B_EX is the amount of bits by which the previous frame missed its desired target. The profile value of AC coefficient bits is scaled by the factor TV_AC/PV_LS. Multiplying each PV_N by that factor generates the linearly scaled profile shown in Figure 17.3(b). Following the bottom branch from the rate buffer, a delay is inserted equal to the amount of time required for the top branch analysis processing to be completed for the current frame. A second VLD parser accesses and removes all codeword bits from the buffer and delivers them to a rate controller. The rate controller receives the scaled target bit usage profile for the amount of AC bits to be used within the frame. The rate controller has memory to store all coefficients associated with the current macroblock it is operating on. All original codeword bits at a higher level than AC coefficients (i.e., all fixed-length header codes, motion vector codes, macroblock type codes, etc.) are held in memory and will be remultiplexed with all AC codewords in that macroblock that have not been excised to form the outgoing scaled bitstream. The rate controller determines and flags in the macroblock codeword memory which AC codewords to keep and which to excise. AC codewords are accessed from the macroblock codeword memory in the order AC11, AC12, AC13, AC14, AC15, AC16, AC21, AC22, AC23, AC24, AC25, AC26, AC31, AC32, AC33, etc., where ACij denotes the ith AC codeword from the jth block in the macroblock, if it is present. As the AC codewords are accessed from memory, the respective codeword bits are summed and continuously compared with the scaled profile value to the current macroblock, less the number of bits for insertion of EOB (end-of-block) codewords. Respective AC codewords are flagged as kept until the running sum of AC codeword bits exceeds the scaled profile value less EOB bits. When this condition is met, all remaining AC codewords are marked for being excised. This process continues until all macroblocks have their kept codewords reassembled to form the scaled bitstream.
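The rate-control procedure of architecture 1 can be sketched in a few functions. This is one possible reading of the description, not the authors' implementation. It assumes that α is given as a fraction, that a fixed number of EOB bits is reserved per macroblock, and that a codeword is kept only if it still fits under the scaled profile. The data layout (each macroblock as a list of blocks, each block as a list of AC codeword lengths in bits) and all names are hypothetical.

```python
def ac_profile(ac_bits_per_macroblock):
    """Running sum of AC codeword bits up to each macroblock (PV_N)."""
    profile, running = [], 0
    for bits in ac_bits_per_macroblock:
        running += bits
        profile.append(running)
    return profile


def target_ac_bits(pv_last, alpha, total_bits, b_ex=0):
    """Target AC bits per frame: TV_AC = PV_LS - alpha * TB - B_EX."""
    return pv_last - alpha * total_bits - b_ex


def scaled_profile(profile, tv_ac):
    """Scale every profile value by the factor TV_AC / PV_LS."""
    return [pv * tv_ac / profile[-1] for pv in profile]


def interleaved_order(blocks):
    """Yield (block, codeword) indices in the order AC11, AC12, ..., AC21, ..."""
    longest = max((len(block) for block in blocks), default=0)
    for i in range(longest):
        for j, block in enumerate(blocks):
            if i < len(block):
                yield j, i


def select_codewords(macroblocks, scaled, eob_bits):
    """Flag the AC codewords to keep in each macroblock; the rest are excised."""
    kept, used = [], 0
    for blocks, budget in zip(macroblocks, scaled):
        limit = budget - eob_bits          # leave room for the EOB codewords
        keep = set()
        for j, i in interleaved_order(blocks):
            if used + blocks[j][i] > limit:
                break                      # all remaining codewords are excised
            used += blocks[j][i]
            keep.add((j, i))
        used += eob_bits
        kept.append(keep)
    return kept


# Two macroblocks with two blocks each; entries are codeword lengths in bits.
macroblocks = [[[6, 4, 3], [5, 2]], [[8, 3], [4, 4, 2]]]
profile = ac_profile([sum(map(sum, mb)) for mb in macroblocks])
tv_ac = target_ac_bits(profile[-1], alpha=0.25, total_bits=60)
kept = select_codewords(macroblocks, scaled_profile(profile, tv_ac), eob_bits=4)
```

Because the codewords are visited in interleaved order, each block loses its highest-frequency codewords first, which matches the intent of cutting high frequencies.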
`
`
[Figure 17.4: block diagram in which the bitstream feeds a top branch (VLD parser, bit allocation analysis) and a bottom branch (delay, VLD parser, rate controller with requantizer, VLC) that takes the new bit rate and produces the output bits.]
FIGURE 17.4 Architecture 2: increasing quantization step.
`
17.3.3.2 Architecture 2: Increasing Quantization Step

Architecture 2 is shown in Figure 17.4. The method of bitstream scaling in architecture 2 is based on increasing the quantization step. This method requires additional dequantizer/quantizer and variable-length coding (VLC) hardware over the first method. Like the first method, it also makes a first VLD pass on the bitstream and obtains a similar scaled profile of target cumulative codeword bits vs. macroblock count to be used for rate control.
The rate control mechanism differs from this point on. After the second-pass VLD is made on the bitstream, quantized DCT coefficients are dequantized. A block of finely quantized DCT coefficients is obtained as a result of this. This block of DCT coefficients is requantized with a coarser quantizer scale. The value used for the coarser quantizer scale is determined adaptively by making adjustments after every macroblock so that the scaled target profile is tracked as we progress through the macroblocks in the frame:

$$Q_N = Q_{NOM} + G \left( \sum_{N-1} BU - PV_{N-1} \right), \tag{17.3}$$

where Q_N is the quantization factor for macroblock N, Q_NOM is an estimate of the new nominal quantization factor for the frame, Σ_{N-1} BU is the cumulative amount of coded bits up to macroblock N - 1, and G is a gain factor which controls how tightly the profile curve is tracked through the picture. Q_NOM is initialized to an average guess value before the very first frame, and updated for the next frame by setting it to Q_LS (the quantization factor for the last macroblock) from the frame just completed. The coarsely requantized block of DCT coefficients is variable-length-coded to generate the scaled bitstream. The rate controller also has provisions for changing some macroblock-layer codewords, such as the macroblock type and coded-block pattern, to ensure a legitimate scaled bitstream that conforms to MPEG-2 syntax.
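The feedback rule of Equation 17.3 can be sketched as follows. Several elements are assumptions added for illustration only: the clamping range of 1 to 31, the rounding to an integer, and the toy bit model in which a macroblock of "complexity" c costs c // Q bits. None of these come from the text.

```python
def next_quantizer(q_nom, gain, bits_used, profile_prev, q_min=1, q_max=31):
    """Q_N = Q_NOM + G * (cumulative bits used - scaled profile value at N-1)."""
    q = q_nom + gain * (bits_used - profile_prev)
    return max(q_min, min(q_max, round(q)))   # clamping range is an assumption


def track_profile(scaled_profile, complexities, q_nom, gain):
    """Run the quantizer feedback over one frame with a toy bit model."""
    used, quantizers = 0, []
    for n, complexity in enumerate(complexities):
        profile_prev = scaled_profile[n - 1] if n > 0 else 0
        q = next_quantizer(q_nom, gain, used, profile_prev)
        quantizers.append(q)
        used += complexity // q               # hypothetical cost of this macroblock
    return quantizers, used


quantizers, used = track_profile(
    scaled_profile=[100, 200, 300],
    complexities=[1200, 1200, 1200],
    q_nom=10,
    gain=0.25,
)

# The quantizer of the last macroblock seeds Q_NOM for the next frame.
q_nom_next = quantizers[-1]
```

In this toy run the first macroblock overshoots its target, so the quantizer is raised for the second macroblock and falls back to the nominal value once the cumulative bit count is back on the profile.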
`
17.3.3.3 Architecture 3: Reencoding with Old Motion Vectors and Old Decisions

The third architecture for bitstream scaling is shown in Figure 17.5. In this architecture, the motion vectors and macroblock coding decision modes are first extracted from the original bitstream, and at the same time the reconstructed pictures are obtained from the normal decoding procedure. Then the scaled bitstream is obtained by reencoding the reconstructed pictures using the old motion vectors and macroblock decisions from the original bitstream. The benefit obtained from this architecture compared with full decoding and reencoding is that no motion estimation and decision computation is needed.
`
`
[Figure 17.5: block diagram in which a VLD parser with a motion vector and coding decision extractor passes motion vectors and macroblock decision modes to a reencoder, while a delay, a VLD and dequantizer, and a reconstruction stage feed the reencoder, which takes the new bit rate and produces the output bits.]
FIGURE 17.5 Architecture 3.
`
17.3.3.4 Architecture 4: Reencoding with Old Motion Vectors and New Decisions

Architecture 4 is a modified version of architecture 3 in which new macroblock decision modes are computed during reencoding based on reconstructed pictures. The scaled bitstream created this way is expected to yield an improvement in picture quality because the decision modes obtained from the high-quality original bitstream are not optimal for reencoding at the reduced rate. For example, at higher rates the optimal mode decision for a macroblock is more likely to favor bidirectional field motion compensation over forward frame motion compensation. But at lower rates, the opposite decision may be true. In order for the reencoder to have the possibility of deciding on new macroblock coding modes, the entire pool of motion vectors of every type must be available. This can be supplied by augmenting the original high-quality bitstream with ancillary data containing the entire pool of motion vectors at the time it was originally encoded. It could be inserted into the user data every frame. For the same original bit rate, the quality of an original bitstream obtained this way is degraded compared with an original bitstream obtained from architecture 3 because the additional overhead required for the extra motion vectors steals away bits for actual encoding. However, the resulting scaled bitstream is expected to show quality improvement over the scaled bitstream from architecture 3 if the gains from computing new and more accurate decision modes can overcome the loss in original picture quality. Table 17.3 outlines the hardware complexity savings of each of the four proposed architectures as compared with full decoding and reencoding.
`
17.3.3.5 Comparison of Bitstream Scaling Methods

We have described four architectures for bitstream scaling which are useful for various applications as described in the introduction. Among the four architectures, architectures 1 and 2 do not require
`
TABLE 17.3
Hardware Complexity Savings over Full Decoding/Reencoding

Coding Method     Hardware Complexity Savings

Architecture 1    No decoding loop, no DCT/IDCT, no frame store memory, no encoding loop,
                  no quantizer/dequantizer, no motion compensation, no VLC, simplified rate control
Architecture 2    No decoding loop, no DCT/IDCT, no frame store memory, no encoding loop,
                  no motion compensation, simplified rate control
Architecture 3    No motion estimation, no macroblock coding decisions
Architecture 4    No motion estimation
`
`
entire decoding and encoding loops or frame store memory for reconstructed pictures, thereby saving significant hardware complexity. However, video quality tends to degrade through the group of pictures (GOP) until the next I-picture due to drift in the absence of decoder/encoder loops. For large scaling, say for rate reduction greater than 25%, architecture 1 produces poor-quality blocky pictures, primarily because many bits were spent in the original high-quality bitstream on finely quantizing the DC and other very low-order AC coefficients. Architecture 2 is a particularly good choice for VTR applications since it is a good compromise between hardware complexity and reconstructed image quality. Architectures 3 and 4 are suitable for VOD server applications
