`EXH. 2010
`Petitioner - Apple, Inc. / Patent Owner - E-Watch, Inc.
`IPR2015-00412
`
`Page 1 of 14
`
`
`
`
`IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 5, NO. 4, AUGUST 1995
`
Fig. 1. Video encoder schematic. [Block diagram; legible block labels: Motion Prediction, Gain Scaling and MV Selection, Motion Compensation, Gain Control, Classified DCT, DCT Selection and Quantization, Inverse Motion Compensation, Local Reconstructed Frame.]
`
interframe modes of operation. In the intraframe mode the
encoder transmits the coarsely quantized block averages for the
current frame, which provides a low-resolution initial frame
required for the operation of the interframe codec, both at
the commencement and during later stages of communications,
in order to prevent encoder/decoder misalignment. The
interframe mode of operation is based on a combination of
gain-controlled motion compensation and gain-controlled DCT
coding, as seen in Fig. 1.
Gain Controlled Motion Detection: At the commencement
of the encoding procedure the motion compensation (MC)
scheme determines a motion vector (MV) for each of the
8 × 8 blocks. The MC search window is fixed to 4 × 4 pels
around the center of each block. Before the actual motion
compensation takes place, the codec tentatively determines
the potential benefit of the compensation in terms of motion
compensated error energy reduction. In order to emphasize
the subjectively more important eye and mouth region of
the videophone images, the potential gains for each motion
compensated block are augmented by a factor of two in the
center of the screen. Then the codec selects the thirty blocks
resulting in the highest scaled gain, and motion compensation
is applied only to these blocks, whereas for all other so-called
passive blocks the codec applies simple frame differencing.
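The gain-controlled selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the extent of the "center of the screen" (here the middle half of the frame in each dimension), the dictionary input format, and the names `in_center` and `select_active_blocks` are assumptions made for illustration only.

```python
BLOCKS_X, BLOCKS_Y = 22, 18   # 176x144 QCIF frame split into 8x8 blocks
N_ACTIVE = 30                 # blocks that receive motion compensation


def in_center(row, col, frac=0.5):
    """True if the block lies in the central region (region size is an assumption)."""
    r0, r1 = BLOCKS_Y * (1 - frac) / 2, BLOCKS_Y * (1 + frac) / 2
    c0, c1 = BLOCKS_X * (1 - frac) / 2, BLOCKS_X * (1 + frac) / 2
    return r0 <= row < r1 and c0 <= col < c1


def select_active_blocks(gains, n_active=N_ACTIVE, center_weight=2.0):
    """gains maps (row, col) to the MC error-energy reduction of that block.

    Central blocks have their gain doubled before ranking; the n_active
    blocks with the highest scaled gain are returned as a set.
    """
    scaled = {
        rc: g * (center_weight if in_center(*rc) else 1.0)
        for rc, g in gains.items()
    }
    ranked = sorted(scaled, key=scaled.get, reverse=True)
    return set(ranked[:n_active])


# With equal raw gains everywhere, the doubling pushes all 30 picks to the center.
uniform = {(r, c): 1.0 for r in range(BLOCKS_Y) for c in range(BLOCKS_X)}
active = select_active_blocks(uniform)
```

All blocks outside the returned set would be treated as passive and coded by frame differencing.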
Gain Controlled Quadruple-Class DCT: Pursuing a similar
approach, gain control is also applied to the DCT-based
compression. Every block is DCT transformed and quantized.
Because of the nonstationary nature of the motion
compensated error residual (MCER), the energy distribution
characteristics of the DCT coefficients vary. Therefore four
different sets of DCT quantizers are available, as exemplified
in Fig. 2. All four bit allocation schemes are tentatively
invoked in order to select the best set of quantizers resulting
in the highest energy compaction gain. Ten bits are allocated
for each quantizer, each of which is a trained Max-Lloyd
quantizer catering for a specific frequency-domain energy
`
`
codecs operate well below the source entropy, the design
philosophy hinges around the principle as to how best the total
distortion is distributed over the source message in the time- or
frequency-domain in order to minimise its subjective effects.
When using Shannon’s ideal source codecs and channel
codecs over memoryless AWGN channels, where bit errors
occur randomly, there is no advantage in treating source and
channel coding jointly. Our nonideal source codecs, however,
produce sequences which still retain correlation and unequal
error sensitivity. Over fading mobile channels this problem is
aggravated by the bursty error statistics, which can only be
randomized using infinite memory channel interleavers inflicting
infinite delays. In this situation source-matched channel coding
[9], [18], [23], which takes account of the source significance
information (SSI) [23], brings substantial advantages in terms
of reducing the required minimum channel SNR.
`Joint coding and modulation in the form of trellis coded
`modulation (TCM) or block coded modulation (BCM) was
`also proposed in the literature in order to reduce the required
channel SNR [51], [52], while in [18] and [25] source-matched
`joint source/channel coding and modulation was introduced. In
`this treatise we will follow a similar design philosophy in order
`to achieve best videophone performance over fading channels.
The schematic of the proposed transceiver is portrayed in
Fig. 7 and this treatise follows the same structure. Speech
source coding issues are not considered here; the reader is
referred to [34] and [33] for the choice of the appropriate
speech codec. Channel coding issues are addressed in [50],
while a detailed discussion of modulation is given in [26].
Section II outlines the design of a variety of programmable,
but fixed-rate video source codecs and analyzes their bit
sensitivity. Section III details modulation and transmission
aspects, which is followed by the description of the
source-matched transceiver in Section IV. The system’s performance
is characterized in Section V, before offering some conclusions
in Section VI.
`
II. VIDEOPHONE CODECS
`
`2.1. Codec I
`
Let us initially focus our attention on the proposed discrete
cosine transform (DCT) [28] based video codec depicted in
Fig. 1, which was designed for hostile mobile channels. The
codec uses 176 × 144 pixel Quarter Common Intermediate
Format (QCIF) images scanned at 10 frames/s. For the sake
of communications convenience and simple networking our
aim was to develop a fixed-rate codec which is able to
dispense with an adaptive feedback-driven bit-rate control
buffer. Therefore a constant bit-rate source codec was required,
which in Codec 1 forced us to avoid using efficient
variable-rate compaction algorithms, such as Huffman coding. This
was achieved by fixing both the number of 8 × 8 blocks to be
motion-compensated and those to be subjected to DCT to 30
out of 22 × 18 = 396. The selection of these blocks is based
on a gain-controlled approach, which will be highlighted next.
`In order to curtail error propagation across image frames
`the codec was designed to switch between intraframe and
`
`
`
`
`HANZO AND STREIT: ADAPTIVE LOW-RATE WIRELESS VIDEOPHONE SCHEMES
`
`
Fig. 2. Quad-class DCT quantization schemes. [Four panels, each labelled "No of Bits".]
`
`distribution class. Again, the energy compaction gain values
`are scaled to emphasise the eye and mouth region of the image
`and the DCT coefficients of the thirty highest-compression
`blocks are transmitted to the decoder.
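The tentative invocation of all four bit allocation schemes can be sketched as follows. This is only an illustration under stated assumptions: the four allocation maps below are hypothetical stand-ins for those of Fig. 2 (each merely sums to the ten bits mentioned in the text), and "energy compaction gain" is approximated by the coefficient energy that falls on positions receiving at least one bit, rather than by the trained quantizers of the actual codec.

```python
def make_alloc(positions):
    """Build an 8x8 bit-allocation map from {(u, v): bits} (hypothetical maps)."""
    alloc = [[0] * 8 for _ in range(8)]
    for (u, v), bits in positions.items():
        alloc[u][v] = bits
    return alloc


# Four illustrative classes, 10 bits each; NOT the allocations of Fig. 2.
ALLOCATIONS = [
    make_alloc({(0, 0): 4, (0, 1): 3, (1, 0): 3}),             # energy near DC
    make_alloc({(0, 0): 3, (0, 1): 3, (0, 2): 2, (0, 3): 2}),  # along index u = 0
    make_alloc({(0, 0): 3, (1, 0): 3, (2, 0): 2, (3, 0): 2}),  # along index v = 0
    make_alloc({(0, 0): 3, (1, 1): 3, (2, 2): 2, (3, 3): 2}),  # diagonal
]


def pick_quantizer_class(coeffs, allocations=ALLOCATIONS):
    """Try every allocation and return (class_index, captured_energy) of the best."""
    best = None
    for k, alloc in enumerate(allocations):
        energy = sum(
            coeffs[u][v] ** 2
            for u in range(8)
            for v in range(8)
            if alloc[u][v] > 0
        )
        if best is None or energy > best[1]:
            best = (k, energy)
    return best


# A block whose energy sits at (0, 0) and (0, 3) is best served by class 1.
block = [[0.0] * 8 for _ in range(8)]
block[0][0], block[0][3] = 1.0, 5.0
chosen = pick_quantizer_class(block)
```

The two-bit class index returned here corresponds to the 2 b per block that signal which of the four quantizer sets was applied.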
`
Partial Forced Update: The disadvantage of interframe
codecs is their vulnerability to channel errors. Every channel
error results in a misalignment between the reconstructed
frame buffer of the encoder and decoder. The errors
accumulate and do not decay, unless a leakage-factor or a
partial forced update (PFU) technique is employed. In our
proposed codec in every frame 22 out of the 396 blocks,
scattered over the entire frame, are periodically updated using
the 4-b quantized block means, which are partially overlaid
on to the contents of the reconstructed frame buffer. The
overlaying is performed such that the block’s contents in the
local buffer are weighted by 0.7 and superimposed on to the
received block average, which is scaled by 0.3. The bit-rate
contribution of this PFU process is a moderate 22 × 4 = 88 bits
per QCIF frame and it refreshes about 5.6% of each frame.
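The PFU overlay and its bookkeeping can be written out as a short Python sketch. The weights, block counts and bit widths are those quoted in the text; the function name `pfu_overlay` and the list-of-lists block representation are assumptions for illustration.

```python
BLOCKS_PER_FRAME = 22 * 18   # 396 blocks of 8x8 pixels in a QCIF frame
PFU_BLOCKS = 22              # blocks refreshed in every frame
PFU_BITS = 4                 # bits per quantized block mean
FRAME_RATE = 10              # frames/s


def pfu_overlay(block, received_mean, w_local=0.7, w_mean=0.3):
    """Blend the local buffer contents (0.7) with the received block mean (0.3)."""
    return [[w_local * p + w_mean * received_mean for p in row] for row in block]


pfu_bits_per_frame = PFU_BLOCKS * PFU_BITS               # 22 * 4 = 88 bits
refreshed_fraction = PFU_BLOCKS / BLOCKS_PER_FRAME       # about 5.6 % of the frame
refresh_period_frames = BLOCKS_PER_FRAME // PFU_BLOCKS   # same block again after 18 frames
refresh_period_s = refresh_period_frames / FRAME_RATE    # 1.8 s at 10 frames/s
```

Because only 30% of the received mean enters the buffer at each update, a corrupted block is pulled back towards the correct average gradually rather than being replaced outright.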
Bit Allocation Strategy: The bit allocation scheme was designed
to deliver 1136 b per frame, which begins with a
22-b frame alignment word (FAW). This is necessary to assist
the video decoder’s operation in order to resume synchronous
operation after loss of frame synchronization over hostile
fading channels. The partial intraframe update refreshes only
22 out of 396 blocks every frame. Therefore every 18 frames
or 1.8 s the update refreshes the same blocks. This periodicity
is signalled to the decoder by transmitting the inverted FAW.
A MV is stored using 13 b, where 9 b are required to identify
one of the 396 block indexes using the enumerative method
and 4 b for encoding the 16 possible combinations of the
X and Y displacements. The 8 × 8 DCT-compressed blocks
use a total of 21 b, again 9 for the block index, 10 for the
DCT coefficient quantizers, and 2 b to indicate which of the
four quantizers has been applied. The total number of bits
becomes 30 · (13 + 21) + 22 · 4 + 22 + 6 = 1136, where
six dummy bits were added in order to obtain a total of 1136
b suitable in terms of bit packing requirements for the specific
forward error correction block codec used. The video codec’s
peak signal-to-noise ratio (PSNR) performance is portrayed in
Fig. 3 for the well-known ‘Miss America’ sequence and for
`
Fig. 3. PSNR performance of the 11.36 kbps Codec 1. [Plot of PSNR (dB) versus frame index; curves: Miss America, Lab.]

Fig. 4. PSNR degradation profile for Bits 2 and 11 of the MV in Codec 1. [Plot of PSNR degradation (dB) versus frame index; curves: Bit No 1, Bit No 11.]
`
Fig. 5. Integrated PSNR bit sensitivities of Codec 1. [Plot of integrated PSNR degradation versus bit index; legend: Bit 1-4 PFU, Bit 5-17 MV, Bit 18-32 DCT.]
`
`effect and averaged these values for all the occurrences of the
`corresponding bit errors. These results are shown in Fig. 5 for
`the 13 MV bits and 21 DCT bits of an 8 x 8 block, as well
`
`as for the 4 partial forced update bits.
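The integration and averaging step can be sketched in a few lines of Python. The degradation profiles below are made-up numbers used purely to exercise the functions, and the names `integrated_sensitivity` and `average_sensitivity`, as well as the optional measurability threshold, are assumptions rather than part of the original method description.

```python
def integrated_sensitivity(degradation_db, threshold_db=0.0):
    """Sum the per-frame PSNR degradations (dB) that are still measurable."""
    return sum(d for d in degradation_db if d > threshold_db)


def average_sensitivity(profiles, threshold_db=0.0):
    """Average the integrated degradation over all occurrences of a bit error."""
    totals = [integrated_sensitivity(p, threshold_db) for p in profiles]
    return sum(totals) / len(totals)


# Hypothetical degradation profiles (dB per consecutive frame) for one bit index.
profiles = [
    [3.0, 2.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
sensitivity = average_sensitivity(profiles)
```

A bit whose errors propagate over many frames accumulates a large integrated value even if its per-frame degradation is modest, which is the effect the single-frame analysis misses.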
`
`2.2. Codec 2
`
In an attempt to improve the bandwidth efficiency of Codec
1 and to explore the range of design trade-offs, we have studied
the statistical properties of the various parameters of Codec 1
in order to identify any persisting residual redundancy.
We found that the motion activity table and the table of
DCT-active blocks were potentially amenable to further data
compression using run length coding (RLC). Therefore we set
out to contrive a range of run length coded video codecs with
bit rates as low as 5, 8, and 10 kbps, which we refer to as
Codec 2.
`
The schematic diagram of Codec 2 is akin to that of Codec 1
shown in Fig. 7, but the above mentioned coding tables are
further compressed by RLC. Similarly to Codec 1, the operation
of Codec 2 is also initialized in the intraframe mode, where
the encoder transmits the coarsely quantized block averages
for the current frame. This provides a low-resolution initial
frame required for the operation of the motion compensated
interframe codec, both at the commencement and during later
stages of communications, in order to prevent encoder/decoder
misalignment. However, for the sake of maintaining a total bit
rate R in the range of 5-10 kb/s for our 176 × 144 pixel CCITT
standard QCIF images at a scanning rate of 10 frames/s, we
now limited the number of encoded bits per frame in Codec
2 to 500, 800, and 1000 b/frame, respectively. In order to
`
TABLE I
BIT ALLOCATION TABLE

| FAW | PFU    | MV Id  | MV     | DCT Id | DCT     | Padding | Total |
| 22  | 22 × 4 | 30 × 9 | 30 × 4 | 30 × 9 | 30 × 12 | 6       | 1136  |
`
a high-activity sequence referred to as the ‘Lab’ sequence.¹
For ‘Miss America’ an average PSNR of about 33 dB was
maintained, which was associated with pleasant videophone
quality. The bit allocation scheme is summarized in Table I and
the complexity of this codec is about 50 Mflops, which can be
reduced to about 25 Mflops without significant performance
penalty. In our further discourse we will refer to the above
scheme as Codec 1. After addressing the bit sensitivity issues
of Codec 1 we will propose a lower bit rate but more error
sensitive arrangement, Codec 2, and analyze their advantages
and disadvantages.
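The per-frame bit budget of Codec 1 can be checked with a few lines of Python. All figures are taken from the text and Table I; only the dictionary layout and the variable names are illustrative.

```python
N_BLOCKS = 22 * 18                         # 396 blocks per QCIF frame
INDEX_BITS = (N_BLOCKS - 1).bit_length()   # 9 bits are enough to address 396 blocks
N_ACTIVE = 30                              # MC blocks and DCT blocks per frame
FRAME_RATE = 10                            # frames/s

budget = {
    "FAW": 22,                        # frame alignment word
    "PFU": 22 * 4,                    # 22 block means, 4 bits each
    "MV index": N_ACTIVE * INDEX_BITS,
    "MV": N_ACTIVE * 4,               # 16 X/Y displacement combinations
    "DCT index": N_ACTIVE * INDEX_BITS,
    "DCT": N_ACTIVE * (10 + 2),       # 10 quantizer bits + 2 class bits
    "padding": 6,                     # dummy bits for FEC block packing
}

total_bits = sum(budget.values())                # 1136 bits per frame
bit_rate_kbps = total_bits * FRAME_RATE / 1000   # 11.36 kbps
```

Every term is fixed in advance, which is what makes the codec constant-rate without a feedback-driven rate-control buffer.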
Source Sensitivity: In order to apply source-sensitivity
matched protection the video bits were subjected to sensitivity
analysis. In [9] we have consistently corrupted a single bit
of a video coded frame and observed the image peak
signal-to-noise ratio (PSNR) degradation inflicted. Repeating this
method for all bits of a frame provided the required sensitivity
figures, and on this basis bits having different sensitivities can
be assigned matching FEC codes. This technique, however,
does not take adequate account of the phenomenon of error
propagation across image frame boundaries. Therefore in this
treatise we propose to use the method suggested in [17],
where we corrupted each bit of the same type in the current
frame and observed the PSNR degradation for the consecutive
frames due to the error event in the current frame. As an
example, Fig. 4 depicts the PSNR degradation profile in case
of corrupting all ‘No 1’ Bits, the most significant bit (MSB)
of the PFU, and all ‘No 11’ Bits, one of the address bits of the
MV, in frame 21. In the first case, the MSBs of all PFU blocks
are corrupted, causing a scattered pattern of artifacts across the
image. Those blocks will be replenished by the PFU exactly
every 18 frames, as revealed by the ‘staircase’ effect in Fig. 4.
The impact of the corrupted MV is randomly distributed across
the frame and hence mitigated continuously by the PFU.
In order to quantify the overall sensitivity of any specific
bit we have integrated (summed) the PSNR degradations over
the consecutive frames, where they have had a measurable
¹ The MA sequence encoded at 11.36 kbps can be viewed under the address
www: /whirligig.ecs.soton.ac.uk/~jss.
`
`
TABLE II
AVERAGE PSNR PERFORMANCE OF CODEC 2 FOR
THE ‘MISS AMERICA’ AND ‘LAB’ SEQUENCES

[Table only partly legible. ‘Lab’ entries: 21.87 dB, 24.34 dB, 26.91 dB. One further legible entry: 33.29 dB.]
`
`
`
transmit all block averages with a 4-b resolution, as in Codec
1, while not exceeding the above stipulated maximum bit rate,
we fixed the initial intraframe block size to 10 × 10, 12 × 12,
or 14 × 14 pixels for the above three target bit rates. The
intraframe block size in Codec 1 was 10 × 10 pixels.
However, in the motion-compensation (MC) we retained the
block-size of 8 × 8 and the search window size of 4 × 4 around
the center of each block. Furthermore, the previously proposed
gain-controlled MC and quad-class DCT quantization was
invoked. This method of classifying the blocks as motion-active
and motion-passive results in an active/passive table,
which consists of a one bit flag for each of the 396 blocks,
marking it as passive or active. These tables are compressed
using the elements of a two stage quad tree (QT) as follows.
First the 396-entry activity table containing the binary flags
is grouped in 2 × 2 blocks and a four bit symbol is allocated
to those blocks which contain at least one active flag. These
four-bit symbols are then run length encoded and transmitted
to the decoder. This concept requires a second active table
containing 396/4 = 99 flags in order to determine which of the
two by two blocks contain active vectors. Three consecutive
flags in this table are packetized to a symbol and then run
length encoded. As a result, a typical 396-b active/passive
table containing 30 active flags can be compressed to less
than 150 b. The motion vectors do not lend themselves to run
length encoding.
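The two-stage grouping of the activity table can be sketched in Python as follows. The bit ordering inside each four-bit symbol, the plain (value, run) form of the run length code, and all function names are assumptions; the text does not specify these details, so the sketch shows the table structure rather than the exact bitstream.

```python
ROWS, COLS = 18, 22   # one activity flag per 8x8 block, 396 in total


def two_stage_table(flags):
    """Group the ROWS x COLS flag table into 2x2 quads.

    Returns (group_flags, symbols): 99 flags marking quads with at least one
    active block, and one four-bit symbol for each such quad.
    """
    group_flags, symbols = [], []
    for r in range(0, ROWS, 2):
        for c in range(0, COLS, 2):
            quad = (flags[r][c], flags[r][c + 1],
                    flags[r + 1][c], flags[r + 1][c + 1])
            active = int(any(quad))
            group_flags.append(active)
            if active:
                symbols.append(quad[0] << 3 | quad[1] << 2 | quad[2] << 1 | quad[3])
    return group_flags, symbols


def pack_triples(group_flags):
    """Pack three consecutive second-level flags into one 3-bit symbol."""
    return [
        group_flags[i] << 2 | group_flags[i + 1] << 1 | group_flags[i + 2]
        for i in range(0, len(group_flags), 3)
    ]


def run_lengths(values):
    """Plain run length coding as (value, run) pairs (coding format assumed)."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


table = [[0] * COLS for _ in range(ROWS)]
table[0][0] = 1      # active block in the top-left quad
table[5][7] = 1      # active block in the quad starting at row 4, column 6
group_flags, symbols = two_stage_table(table)
packed = run_lengths(pack_triples(group_flags))
```

With only 30 active flags most quads are empty, so the second-level table consists mainly of long zero runs, which is where the run length code gains its compression.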
If at this stage of the encoding process the number of bits
allocated to the compressed motion- and DCT-activity tables
as well as to the active MVs exceeds half of the total number
of available bits/frame, some of the blocks satisfying the initial
motion-active criterion will be relegated to the motion-passive
class. This process takes account of the subjective importance
of various blocks and does not ignore motion-active blocks
in the central eye and lip regions of the image, while relegating
those which are closer to the fringes of the frame. The DCT
blocks are handled using a similar procedure. Depending on
the actual fixed-length transmission burst and the free buffer
space, a number of active DCT blocks is chosen and the
corresponding compressed tables are determined. If the total
bit count overspills the transmission burst or if there are too
many bits left unused, a different number of active blocks is
estimated and new tables are determined.
`
The PSNR versus frame index performance of a 5, 8, and 10
kbps RLC scheme is shown for the ‘Miss America’ sequence
in Fig. 6 and the average results are summarized in Table
II. Although due to the low-resolution intraframe mode at
the commencement of communications it takes a few frames
for the image to reproduce fine details, this effect is not
objectionable. This is because the subjectively more important
center of the screen is processed first. Fig. 6 demonstrates that
at 5 kbps the codec operates at its limits and hence it takes
a long time before the steady-state PSNR value is reached.
However, at rates at or above 8 kbps a pleasant quality is
maintained leading to an average PSNR in excess of 30 dB,
which is exceeded in the center of the image. Based on these
findings, in the run length coded System 2 we have opted
for an 8.52 kbps implementation of Codec 2, generating
852 b per frame and maintaining an average PSNR of about
Fig. 6. PSNR versus frame index performance of Codec 2 for the ‘Miss
America’ sequence. [Plot of PSNR (dB) versus frame index, frames 0 to 100.]
`
33.3 dB for the MA sequence. We also note that in some
of the proposed systems an 8 kbps reduced-rate version of
Codec 2 will be invoked, which we refer to as Codec 2a.
Before we continue with the description of the source-matched
transceiver schemes it must be emphasized that, in contrast
to Codec 1 where no RLC is employed, if the RL-coded
activity table bits are corrupted, the rest of that frame will be
completely corrupted. Hence automatic repeat request (ARQ)
techniques are preferred in the systems employing the
RL-coded Codec 2. The sensitivity of the remaining bits is similar
to that of the corresponding Class Two bits of Codec 1.
`
`III. MODULATION AND TRANSMISSION
`
Over mobile channels constant envelope modulation techniques,
such as Gaussian Minimum Shift Keying (GMSK) used in the
Pan-European GSM system [59], have been applied successfully.
In contrast, until quite recently QAM
research was mainly focused on applications over AWGN
channels [35]. However, fuelled by the drive towards ever
higher bandwidth efficiency and facilitated by advances such
as noncoherent star QAM [45], coherent pilot symbol assisted
modulation [55] and the transparent tone in band (TTIB)
technique [56], [57], during the last few years its employment
has also become realistic over mobile channels [36]-[48]. In
order to achieve high bandwidth efficiency, QAM encodes
information on both the phase and magnitude of the complex
transmitted signal and hence it requires a linear transceiver,
which suffers from low power efficiency [53], [54]. However,
in low-power pico- or microcellular applications this is not a
serious limitation, since the power consumption of the
high-complexity digital circuitry is more crucial. In fact, due to
its reduced signalling rate such a transceiver may be able
to operate in a nondispersive scenario, without a channel
equalizer, which reduces the power consumption.
`
`
Fig. 7. System’s schematic. [Block diagram; legible block labels: Video Encoder, BCH Cl. One Encoder, BCH Cl. Two Encoder, MPX, BCH Cl. One Decoder, BCH Cl. Two Decoder, Mapper, Video Decoder, Postprocessing.]
`
Fig. 8. PSNR versus BER degradation of Codec 1 for class one and two. [Plot of PSNR degradation (dB) versus bit error rate; curves: Class 1, Class 2, Average.]
`
`lower than that of the lower quality C2 subchannel. Both
`subchannels support the transmission of two bits per symbol.
`This implies that the 16-PSAQAM scheme inherently caters
`for sensitivity-matched protection, which can be fine-tuned
`using appropriate FEC codes to match the source requirements.
`This property is not retained by the 4QAM scheme, but the
`required different protection for the source coded bits can be
`ensured using appropriately matched channel codecs.
Source Sensitivity: In order to find the appropriate FEC
code for our video codec, its output stream was split in
two equal sensitivity classes, Class One and Two, according
to our findings in Fig. 5. Note that the notation Class One
and Two introduced here for the more and less sensitive
video bits is different from the higher and lower integrity
C1 and C2 modulation channels. Then the PSNR degradation
of both Class One and Two as well as the average PSNR
degradation was evaluated for a range of BER values in
Fig. 8. These results showed that a factor two lower BER
was required by Class One bits than by Class Two bits, in
order to maintain similar PSNR degradations in the range of
1-2 dB. These integrity requirements conveniently coincided
with the integrity ratio of the C1 and C2 subchannels of our
16-PSAQAM modem [26]. Hence we can apply the same FEC
protection to both Class One and Two source bits and direct
Class One bits to the C1 16-PSAQAM subchannel, while Class
Two bits are directed to the C2 subchannel.
`
`
The innate sensitivity of QAM against co-channel interference
in an interference limited scenario is mitigated by the
partitioning walls in indoor pico-cells and can be further
reduced using the channel segregation algorithm proposed
in [49]. Instead of tolerance-sensitive linear-phase Nyquist
filtering, nonlinear filtering (NLF) joining time-domain signal
transitions with a smooth curve can be employed [26]. In case
of coherent detection better performance can be achieved than
using lower-complexity noncoherent differential modems. In
order to phase-coherently recover the orthogonal quadrature
carriers at the receiver, which will assist in recovering the
transmitted data, the Transparent-tone-in-band (TTIB) principle
[56]-[58] or Pilot Symbol Assisted Modulation (PSAM) [55],
[48] can be invoked.
Differentially coded noncoherent QAM modems [45]
typically have lower complexity than their coherent counterparts,
but they inflict a characteristic 3 dB differential coding
SNR penalty over AWGN channels, which persists also
over Rayleigh channels. Hence they require higher SNR and
SIR values than the more complex coherent schemes [26].
Therefore in our video transceiver second-order
switched-diversity assisted coherent Pilot Symbol Assisted Modulation
(PSAM) using the maximum-minimum-distance square QAM
constellation is used [26].
`
IV. SOURCE-MATCHED TRANSCEIVER
`
`4.1. System 1
`
System Concept: The system’s schematic is portrayed in
Fig. 7, where the source encoded video bits generated by
Codec 1 are split in two sensitivity classes and sensitivity
matched channel coding/modulation is invoked. The proposed
system was designed for mobile packet video telephony and
it had two different modes of operation, namely 4-level and
16-level quadrature amplitude modulation (QAM) [26]. Our
intention was to contrive a system where the more benign
propagation environment of indoor cells would benefit from
the prevailing higher signal-to-noise ratio (SNR) by using
bandwidth efficient 16QAM and thereby requiring only half
the number of packets compared to 4QAM. When the portable
station (PS) is handed over to an outdoor microcell or roams
in a lower SNR region towards the edge of a cell, the base
station (BS) instructs the PS to lower its number of modulation
levels to 4 in order to maintain an adequate robustness under
lower SNR conditions. Let us now focus our attention on
specific details of System 1.
Sensitivity-Matched Modulation: Best robustness against
channel errors is achieved if sensitivity-matched forward
error correction coding is used. Similarly to our approach in
[18], Wei [25] has also suggested to use unequal protection
multilevel coded modulation in order to achieve high
bandwidth efficiency. Following similar principles here, in our
proposed videophone schemes we will exploit that 16-level
pilot symbol assisted quadrature amplitude modulation [26]
(16-PSAQAM) provides two independent 2-b subchannels
having different bit error rates (BER). Specifically, the BER
of the higher integrity C1 subchannel is a factor 2-3 times
`
TABLE III
SUMMARY OF SYSTEM FEATURES

[Table only partly legible. Legible column headings: Feature, System 1, System 2, System 3, System 4. Legible rows, entries listed in column order:]

| C1 FEC     | BCH(127,71,9)  | BCH(127,50,13) | BCH(127,71,9)  | BCH(127,50,13) | BCH(127,50,13) |
| C2 FEC     | BCH(127,71,9)  | BCH(127,92,5)  | BCH(127,71,9)  | BCH(127,50,13) | BCH(127,50,13) |
| Header FEC | BCH(127,50,13) | BCH(127,50,13) | BCH(127,50,13) | BCH(127,50,13) | BCH(127,50,13) |

[Row labels whose entries are not legible: Frame Rate; System Signal. Rate (kBd), of which only the value 144 is legible; System Bandwidth (kHz); Eff. User Bandwidth (kHz); Min AWGN SNR (dB).]
`
Forward Error Correction: Both convolutional and block
codes can be successfully used over mobile radio links
[50], but in our proposed scheme we have favored binary
Bose-Chaudhuri-Hocquenghem (BCH) codes. BCH codes
combine a good burst error correction capability with
reliable error detection, a facility useful to invoke image
post-enhancement, to monitor the channel’s quality and
to control handovers between traffic cells. The preferred
R = 71/127 ≈ 0.56-rate BCH(127,71,9) code can correct
9 errors in a block of 127 b, a correction capability of about
7.1%. The number of channel coded bits per image frame
becomes 1136 × 127/71 = 2032, while the bit rate is 20.32
kbps at an image frame rate of 10 frames/s.
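The figures quoted for the BCH(127,71,9) code follow from simple arithmetic, reproduced here as a short Python check. The variable names are illustrative; the numbers are those of the text.

```python
from fractions import Fraction

n, k, t = 127, 71, 9           # BCH(127,71,9): block length, data bits, correctable errors
SOURCE_BITS_PER_FRAME = 1136   # Codec 1 output per frame
FRAME_RATE = 10                # frames/s

rate = Fraction(k, n)                         # 71/127, about 0.56
correction_capability = t / n                 # about 7.1 % of the block
codewords = SOURCE_BITS_PER_FRAME // k        # 1136 = 16 * 71, so 16 codewords
coded_bits = codewords * n                    # 16 * 127 = 2032 bits per frame
coded_rate_kbps = coded_bits * FRAME_RATE / 1000   # 20.32 kbps
```

The 1136-bit frame divides exactly into 16 blocks of 71 data bits, which is the bit packing requirement that motivated the six padding bits.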
Transmission Format: The transmission packets are constructed
using one Class One BCH(127,71,9) code, one Class
Two BCH(127,71,9) code, and a stronger BCH(127,50,13) is
allocated to the packet header, yielding a total of 381 b per
packet. In case of 16QAM these are represented by 96 symbols
and after adding 11 pilot symbols using a pilot spacing of P =
10 as well as 4 ramp symbols to ensure smooth power amplifier
ramping the resulting 111-symbol packets are transmitted over
the radio channel. Eight such packets represent a whole image
frame and hence the signalling rate becomes 111 symb/12.5 ms
≈ 9 kBd. When using a time division multiple access (TDMA)
channel bandwidth of 200 kHz, such as in the Pan-European
second generation mobile radio system known as GSM, and
a modulation excess bandwidth of 38.8%, the signalling rate
becomes 144 kBd. This allows us to accommodate 144/9 = 16
users, which coincides with the number of so-called half-rate
speech users supported by the GSM system [59].
When the prevailing channel SNR does not allow 16QAM
communications, 4QAM must be invoked. In this case the
381-b packets are represented by 191 2-b symbols and after
adding 20 pilot symbols and 4 ramp symbols the packet-length
becomes 225 symb/12.5 ms, yielding a signalling rate of 18
kBd. In this case the number of videophone users supported
by System 1 becomes 8, as in the full-rate GSM speech
channel. The system also facilitates mixed-mode operation,
where 4QAM users must reserve two slots in each 12.5 ms
TDMA frame towards the fringes of the cell, while in the
central section of the cell 16QAM users will only require
one slot per frame in order to maximize the number of users
supported. Assuming an equal proportion of 4 and 16QAM
users the average number of users per carrier becomes 12. The
equivalent user bandwidth of the 4QAM PSs is 200 kHz/8 =
25 kHz, while that of the 16QAM users is 200 kHz/16 = 12.5
kHz.
`
For very high quality mobile channels or for conventional
telephone lines 64-QAM can be invoked, which further
reduces the required bandwidth at the cost of a higher channel
SNR demand. However, the packet format of this mode of
operation is different from that of the 16 and 4QAM modes and
hence requires a different slot length. The 381-b payload of the
packet is represented by 64 6-b symbols, four ramp symbols
are added along with 14 pilot symbols, which corresponds to
a pilot spacing of P = 5. The resulting 82-symbol/12.5 ms
packets are transmitted at a signalling rate of 6.6 kBd, which
allows us to host 22 videophone users. The user bandwidth
becomes 200 kHz/22 ≈ 9.1 kHz.
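The packet lengths and signalling rates of the 16QAM and 64QAM formats can be recomputed with the sketch below. The helper name `packet_symbols` is an assumption, and the pilot and ramp counts are simply the values quoted in the text rather than derived from the pilot spacing.

```python
import math

PAYLOAD_BITS = 3 * 127   # two video codewords plus the header codeword = 381 bits
SLOT_S = 0.0125          # eight packets per 100 ms video frame


def packet_symbols(payload_bits, bits_per_symbol, pilots, ramp=4):
    """Data symbols (rounded up) plus pilot and ramp symbols."""
    return math.ceil(payload_bits / bits_per_symbol) + pilots + ramp


p16 = packet_symbols(PAYLOAD_BITS, 4, pilots=11)   # 96 + 11 + 4 = 111 symbols
p64 = packet_symbols(PAYLOAD_BITS, 6, pilots=14)   # 64 + 14 + 4 = 82 symbols

rate16_kbd = p16 / SLOT_S / 1000   # 8.88 kBd, quoted as about 9 kBd
rate64_kbd = p64 / SLOT_S / 1000   # 6.56 kBd, quoted as 6.6 kBd

system_rate_kbd = 200 / 1.388      # 200 kHz with 38.8 % excess bandwidth, about 144 kBd
users16 = int(system_rate_kbd // rate16_kbd)   # 16 users in the 16QAM mode
```

The higher-order constellation shortens the packet, so more users fit into the same 144 kBd carrier at the cost of a higher SNR requirement.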
The above-mentioned features of the 16/4QAM System 1
along with the characteristics of a range of other systems about
to be introduced in the next section are summarized in Table III.
`
Clearly, the required signalling rate and bandwidth are
comparable to those of most state-of-art mobile radio speech
links, which renders our scheme attractive for mobile video
telephony in the framework of existing mobile radio systems.
Furthermore, this rate can also be readily accommodated by
conventional telephone subscriber loops.
`
`4.2. System 2
`
In order to improve the bandwidth efficiency of Codec 1
we introduced the run-length coded Codec 2. Hence Codec
2 became more vulnerable against transmission errors than
Codec 1 and their effect is particularly objectionable if the
run length coded activity table bits are corrupted. Therefore
in System 2, which was designed to incorporate Codec 2, the
more sensitive run length coded activity table bits are
protected by the powerful binary Bose-Chaudhuri-Hocquenghem
BCH(127,50,13) codec, while the less vulnerable remaining
bits are protected by the weaker BCH(127,92,5) code. Note that the overall
`
`
coding rate of R = (50 + 92)/(127 + 127) ≈ 0.56 is identical
to that of System 1, but the RL-coded Class One bits are
more strongly protected. At a fixed coding rate this inevitably
assumes a weaker code for the protection of the less vulnerable
Class Two bits. The 852 b/100 ms video frame is encoded
using six pairs of such BCH code words, yielding a total of
6 · 254 = 1524 b, which is equivalent to a bit rate of 15.24 kbps.
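The coding rate and the per-frame bit count of System 2 can be verified with exact fractions, as in the following Python check. The variable names are illustrative; the code parameters are those of the text.

```python
from fractions import Fraction

rate_system1 = Fraction(71, 127)                  # BCH(127,71,9) on both classes
rate_system2 = Fraction(50 + 92, 127 + 127)       # BCH(127,50,13) paired with BCH(127,92,5)

VIDEO_BITS_PER_FRAME = 852                        # Codec 2 output per 100 ms frame
pairs = VIDEO_BITS_PER_FRAME // (50 + 92)         # 852 = 6 * 142, so six codeword pairs
coded_bits = pairs * (127 + 127)                  # 6 * 254 = 1524 bits per frame
coded_rate_kbps = coded_bits * 10 / 1000          # 15.24 kbps at 10 frames/s
```

Since 142/254 reduces to 71/127, the unequal protection of System 2 costs no extra redundancy compared with System 1; it only redistributes it between the two classes.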
As in System 1, the more vulnerable run length and
BCH(127,50,13) coded Class One bits are then transmitted
over the higher integrity C1 16QAM subchannel. The less
sensitive BCH(127,92,5) coded Class Two DCT coefficient
bits are conveyed using the lower-integrity C2 16QAM
subchannel. This arrangement is favored in order to further
emphasize the integrity differences of the BCH codecs used,
which is necessitated by the integrity requirements of the
video bits.
`
The transmission burst is constructed by adding an
additional BCH(127,50,13) code word for the packet header
and the resulting 381 b are again converted to 96 16QAM
symbols, and pilot as well as ramp symbols are added. In
System 2 six such packets represent a video frame, hence the
single-user signalling rate becomes 666 symb/100 ms, which
corresponds to 6.66 kBd. This allows us to accommodate now
Integer[144 kBd/6.66 kBd] = 21 such users, if no time slots are
reserved for packet re-transmissions. This number will have
to be reduced in order to accommodate ARQs.
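The System 2 signalling rate and user count can be reproduced as follows. The sketch assumes the same 111-symbol 16QAM burst as in System 1 (96 data, 11 pilot and 4 ramp symbols), which is consistent with the 666 symbols per frame quoted above; the variable names are illustrative.

```python
SYMBOLS_PER_PACKET = 96 + 11 + 4   # assumed identical to the System 1 16QAM burst
PACKETS_PER_FRAME = 6              # six packets carry one 100 ms video frame
FRAME_S = 0.1
SYSTEM_RATE_KBD = 144              # carrier signalling rate

symbols_per_frame = PACKETS_PER_FRAME * SYMBOLS_PER_PACKET   # 666 symbols
user_rate_kbd = symbols_per_frame / FRAME_S / 1000           # 6.66 kBd
max_users = int(SYSTEM_RATE_KBD // user_rate_kbd)            # 21 users without ARQ slots
```

Every time slot earmarked for re-transmissions lowers this figure, so 21 is an upper bound that applies only when no ARQ slots are reserved.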
Automatic Repeat Request: ARQ techniques have been
successfully used in data communications [29]-[32] in order to
render the bit and frame error rate arbitrarily low. However,
due to their inherent delay and the additional requirement
for a feedback channel for message acknowledgement they
have not been employed in interactive speech or video
communications. In our packet video system, however, there
exists a full duplex control link between the BS and PS, which
can be used for acknowledgements, and the short TDMA frame
length ensures a low packet delay, hence ARQ can be invoked.
In System 2 when the more powerful BCH codec conveying
the more sensitive run-length coded Class One bits over the C1
16QAM subchannel is overloaded by channel errors, we
re-transmit these bits only, using robust 4QAM. Explicitly, for the
first transmission attempt (TX1) we use contention-free Time
Division Multiple Access (TDMA). If an ARQ-request occurs,
the re-transmitted packets will have to contend for a number of
earmarked time slots similarly to Packet Reservation Multiple
Access (PRMA) [26]. The intelligent base station (BS) detects
these events of packet corruption and instructs the portable
stations (PS) to re-transmit their packets during the slots
dedicated to ARQ-packets. Reserving slots for ARQ-packets
reduces the number of video users supported depending on the
prevailing channel conditions, as we will show in the Results
Section, Section V.
`
Although the probability of erroneous packets can be
reduced by allowing repeated re-transmissions, there is a clear
trade-off between the number of maximum transmission
attempts and the BCH-coded frame error rate (FER). In order
to limit the number of slots required for ARQ-attempts, which
potentially reduce the number of video users supported, in
System 2 we invoke ARQ only if the more sensitive run-length
coded Class One bits transmitted via the C1 16-PSAQAM
channel and p