throbber
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 7, JULY 2003
`
`637
`
`The SP- and SI-Frames Design for H.264/AVC
`
`Marta Karczewicz and Ragip Kurceren, Member, IEEE
`
`Abstract—This paper discusses two new frame types, SP-frames
`and SI-frames, defined in the emerging video coding standard,
`known as ITU-T Rec. H.264 or ISO/IEC MPEG-4/Part 10-AVC.
`The main feature of SP-frames is that identical SP-frames can be
`reconstructed even when different reference frames are used for
`their prediction. This property allows them to replace I-frames
`in applications such as splicing, random access, and error re-
`covery/resilience. We also include a description of SI-frames,
`which are used in conjunction with SP-frames. Finally, simulation
`results illustrating the coding efficiency of SP-frames are pro-
`vided. It is shown that SP-frames have significantly better coding
`efficiency than I-frames while providing similar functionalities.
`Index Terms—AVC, bitstream switching, error recovery, error
`resiliency, H.264, JVT, MPEG-4, random access, SI-frames,
`SP-frames, splicing.
`
`I. INTRODUCTION
`
`T O INCREASE compression efficiency and network
`
`the emerging ITU-T Recommendation
`friendliness,
`H.264, ISO/IEC MPEG-4/Part 10-AVC [1] video compression
`standard introduces an extensive set of new features. In this
`paper, we describe in detail two of these features, specifically,
`new frame types referred to as SP-frames [1]–[3] and SI-frames
`[1], [4].
`SP-frames make use of motion compensated predictive
`coding to exploit temporal redundancy in the sequence similar
`to P-frames. The difference between SP- and P-frames is that
`SP-frames allow identical frames to be reconstructed even when
`they are predicted using different reference frames. Due to this
`property, SP-frames can be used instead of I-frames in such
`applications as bitstream switching, splicing, random access,
`fast forward, fast backward, and error resilience/recovery. At
`the same time, since SP-frames unlike I-frames are utilizing mo-
`tion-compensated predictive coding, they require significantly
`fewer bits than I-frames to achieve similar quality. In some of
`the mentioned applications, SI-frames are used in conjunction
`with SP-frames. An SI-frame uses only spatial prediction as
`an I-frame and still reconstructs identically the corresponding
`SP-frame, which uses motion-compensated prediction.
`The remainder of the paper is organized as follows. In Sec-
`tion II, a review of frame types being used in the existing standards
`is given. In Section III, we discuss how features of SP-frames can
`be exploited in several example applications. Section IV provides
`a description of SP- and SI-frame decoding and encoding pro-
`cesses. Section V includes details of experiments and a relative
`performance improvement of the SP-frames. Finally, Section V
`offers a summary and conclusions.
`
`Manuscript received December 13, 2001; revised May 9, 2003.
`The authors are with Nokia Research Center, Nokia Inc., Irving, TX 75039
`USA (e-mail: ragip.kurceren@nokia.com and marta.karczewicz@nokia.com).
`Digital Object Identifier 10.1109/TCSVT.2003.814969
`
`Fig. 1. Generic block diagram of decoding process.
`
`II. MOTIVATION
`
`In the existing video coding standards, such as MPEG-2,
`H.263 and MPEG-4, three main types of frames are defined.
`Each frame type exploits a different type of redundancy ex-
`isting in video sequences and consequently results in a different
`amount of compression efficiency and different functionality
`that it can provide.
`An intra-frame (or I-frame) is a frame that is coded ex-
`ploiting only the spatial correlation of the pixels within the
`frame without using any information from other frames.
`I-frames are utilized as a basis for decoding/decompression of
`other frames and provide access points to the coded sequence
`where decoding can begin.
`A predictive-frame (or P-frame) is coded/compressed using
`motion prediction from a so-called reference frame, i.e., a past
`I- or P-frame available in an encoder and decoder buffer. Fig. 1
`illustrates a generic decoding process for P- and I-frames. Fi-
`nally, a bidirectional-frame (or B-frame) is coded/compressed
`using a prediction derived from an I-frame (P-frame) in its past
`or an I-frame (P-frame) in its future or a combination of both.
`B-frames are not used as a reference for prediction of other
`frames.
`Since, in a typical video sequence, adjacent frames are
`highly correlated, higher compression efficiency is achieved
`when using B- or P-frames instead of I-frames. On the other
`hand, temporal predictive coding employed in P- and B-frames
`introduces temporal correlation within the coded bitstream, i.e.,
`B- or P-frames cannot be decoded without correctly decoding
`their reference frames in the future and/or past. In cases when
`a reference frame used in an encoder and a reference frame
`used in a decoder are not identical either due to errors during
`transport or due to some intentional action on the server side,
`
`1051-8215/03$17.00 © 2003 IEEE
`
`1
`
`SAMSUNG-1033
`
`

`

`638
`
`IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 7, JULY 2003
`
`the reconstructed values of the subsequent frames predicted
`from such a reference frame are different in the encoder than
`in the decoder. This mismatch would not only be confined to
`a single frame but would further propagate in time due to the
`motion-compensated coding.
`In H.264/AVC, I-, P-, and B-frames have been extended with
`new coding features, which lead to a significant increase in
`coding efficiency. For example, H.264/AVC allows using more
`than one prior coded frame as a reference for P- and B-frames.
`Furthermore, in H.264/AVC, P-frames and B-frames can use
`prediction from subsequent frames [5]. These new features are
`described in detail elsewhere in this Special Issue.
`Additionally, two new types of frames have been defined,
`namely, SP-frames [2], [3] and SI-frames [4]. The method of
`coding as defined for SP- and SI-frames allows obtaining frames
`having identical reconstructed values even when different refer-
`ence frames are used for their prediction.
`In the following, we describe on how these features of
`SP-frames and SI-frames can be exploited in specific applica-
`tions [6].
`
`A. Bitstream Switching
`Video streaming has emerged as one of the essential appli-
`cations over the fixed internet and in the near future over 3G
`wireless networks. The best-effort nature of today’s networks
`causes variations of the effective bandwidth available to a user
`due to the changing network conditions. The server should then
`scale the bit rate of the compressed video, transmitted to the
`receiver, to accommodate these variations. In case of conversa-
`tional services that are characterized by real-time encoding and
`point-to-point delivery, this can be achieved by adjusting, on the
`fly, source encoding parameters, such as a quantization param-
`eter or a frame rate, based on the network feedback. In typical
`streaming scenarios when an already encoded video bitstream
`is to be sent to a client, the above solution cannot be applied.
`The simplest way of achieving bandwidth scalability in the
`case of pre-encoded sequences is by representing each sequence
`using multiple and independent streams of different bandwidth
`and quality. The server then dynamically switches between the
`streams to accommodate the variations of the bandwidth avail-
`able to the client.
`Assume that we have multiple bitstreams generated inde-
`pendently with different encoding parameters, corresponding
`to the same video sequence. Let
`and
`denote the sequence of the decoded
`frames from bitstreams 1 and 2, respectively. Since the encoding
`parameters are different for each bitstream, the reconstructed
`frames from different bitstreams at the same time instant, for
`example, frames
`and
`, will not be identical. Now
`let us assume that the server initially sends bitstream 1 up to
`time n after which it starts sending bitstream 2, i.e., the decoder
`would have received
`.
`In this case, the frame
`can not be correctly decoded since
`the reference frame
`used to obtain its prediction is not
`received whereas the frame
`, which is received instead
`of
`, is not identical to
`. Therefore, switching
`between bitstreams at arbitrary locations can lead to visual
`artifacts due to the mismatch between the reference frames.
`
`Fig. 2. Switching between bitstreams using SP-frames.
`
`Furthermore, the visual artifacts will not only be confined
`to the frame
`but will further propagate in time due to
`motion-compensated coding.
`(mis-
`In the prior video encoding standards, perfect
`match-free) switching between bitstreams is possible only at
`frames, which do not use any information prior to their location,
`i.e., at I-frames. Furthermore, by placing I-frames at fixed (e.g.,
`1 s) intervals, VCR functionalities, such as random access or
`“Fast Forward” and “Fast Backward” (increased playback rate)
`when streaming a video content, are achieved. The user may
`skip a portion of a video sequence and restart playing at any
`I-frame location. Similarly, the increased playback rate can
`be achieved by transmitting only I-frames. The drawback of
`using I-frames in these applications is that, since I-frames do
`not exploit any temporal redundancy, they require much larger
`number of bits than P-frames at the same quality.
`identical
`From the properties of SP-frames, note that
`SP-frames can be obtained even when they are predicted using
`different reference frames. This feature can be exploited in
`bitstream switching as follows. Fig. 2 depicts an example how
`to utilize SP frames to switch between different bitstreams.
`Again assume that there are two bitstreams corresponding
`to the same sequence encoded at different bit rates and/or at
`different temporal resolutions. Within each encoded bitstream,
`SP-frames are placed at
`the locations at which switching
`from one bitstream to another will be allowed (frames
`and
`in Fig. 2). These SP-frames shall be referred to as
`primary SP-frames in what follows. Furthermore, for each
`primary SP-frame, a corresponding secondary SP-frame is
`generated, which has the same identical reconstructed values as
`the primary SP-frame. Such a secondary SP-frame is sent only
`during bitstream switching. In Fig. 2, the SP-frame
`is the
`secondary representation of
`, which will be transmitted
`only when switching from bitstream 1 to bitstream 2.
`uses
`the previously reconstructed frames from bitstream 2 as the
`reference frames, while
`uses the previously reconstructed
`frames from bitstream 1 as the reference frames. However, due
`to the special encoding of the secondary SP-frame, described
`later, the reconstructed values of these frames are identical.
`
`2
`
`2
`
`

`

`KARCZEWICZ AND KURCEREN: SP- AND SI-FRAMES DESIGN FOR H.264/AVC
`
`639
`
`Fig. 3. Splicing, random access using SI-frames.
`
`Fig. 4. SP-frames in error resiliency/recovery.
`
`If one of the bitstreams has a lower temporal resolution, e.g.,
`1 fps, then this bitstream can be used to achieve fast-forward
`functionality. Specifically, decoding from the bitstream with the
`lower temporal resolution and then switching to the bitstream
`with the normal frame rate would provide such functionality.
`
`B. Splicing and Random Access
`The bitstream-switching example discussed earlier considers
`bitstreams representing the same sequence of images. However,
`this is not necessarily the case for other applications where bit-
`stream switching is needed. Examples include
`• switching between bitstreams arriving from different cam-
`eras capturing the same event but from different perspec-
`tives, or cameras placed around a building for surveillance;
`• switching to local/national programming or insertion of
`commercials in TV broadcast, video bridging, etc.
`Splicing refers to the process of concatenating encoded bit-
`streams and includes the examples discussed earlier.
`When switching occurs between bitstreams representing
`different sequences, that affects encoding of secondary frames,
`i.e., encoding of
`in Fig. 2. Specifically, motion-compen-
`sated prediction of frames from one bitstream using reference
`frames from another bitstream when these bitstreams represent
`different sequences will not be as effective as when both
`bitstreams correspond to the same sequence. In such cases,
`using spatial prediction for the secondary frames could be
`more efficient. This is illustrated in Fig. 3, where the secondary
`frame is denoted as
`to indicate that this is an SI-frame
`encoded, as described later, using spatial prediction and having
`identical reconstructed values as the corresponding SP-frame
`. SI-frames can provide also random access points to the
`bitstream and have further implications in error recovery and
`resiliency, which will be described in Sections II-C and D.
`
`C. Error Recovery
`Multiple representations of a single frame in the form of
`SP-frames predicted from different reference frames, e.g., the
`immediate previously reconstructed frame and a reconstructed
`frame further back in time, as illustrated in Fig. 4, can be used
`
`to increase error resilience and/or error recovery. Consider the
`case when an already encoded bitstream is being streamed and
`there has been a packet loss leading to a frame or slice loss. The
`client signals the lost frame to the server which then responds
`by sending one of the secondary representations of the next
`SP-frame. This secondary representation, e.g.,
`in Fig. 4,
`uses the reference frames that have been correctly received by
`the client.
`Similarly, as argued earlier in the discussion of splicing, an-
`other representation of the SP-frame can be generated without
`using any reference frames, i.e.,
`in Fig. 4. In this case,
`the server can send the SI-frame representation, i.e.,
`in-
`stead of
`to stop error propagation. For slice-based packeti-
`zation and delivery, the server could further estimate the slices
`that would be affected by such a slice/frame loss and update only
`those slices in the next SP-frame with their secondary represen-
`tations.
`
`D. Error Resiliency
`intra macroblock refresh
`For lossy transport networks,
`strategy has been shown to provide significant increase in error
`resiliency/recovery performance [7]–[10]. Furthermore, it has
`been illustrated in [7]–[10] that intra macroblock refresh rate,
`i.e., frequency at which a macroblock is intra-encoded, should
`depend on transport channel conditions, e.g., packet loss and/or
`a bit error rate. In interactive client/server scenarios, the encoder
`on the server side decides to encode the slices/macroblocks in
`the intra mode either based on: the specific feedback received
`from the client, or the expected network conditions calculated
`through negotiation, or the measured network conditions.
`However, when already encoded bitstreams are sent, which is
`the case in typical streaming applications, the above strategy
`cannot be applied directly. Either the sequence needs to be
`encoded with the worst-case expected network conditions or
`additional error resiliency/recovery mechanisms are required.
`From the earlier discussion on SP-frame usage in error re-
`covery and splicing applications, SP-frames or slices can be
`represented as SI-frames/slices that do not use any reference
`frames. This feature can be exploited in the adaptive intra re-
`fresh mechanism discussed above. First, a sequence is encoded
`
`3
`
`3
`
`

`

`640
`
`IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 7, JULY 2003
`
`with some predefined ratio of SP-slices. Then during transport,
`instead of some of the SP-slices their secondary representation,
`that is SI-slices, is sent. The number of SI-slices that should be
`sent can be calculated similarly as in the real-time encoding/de-
`livery approach.
`
`E. Video Redundancy Coding
`SP-frames have other uses in applications in which they do
`not act as replacements of I-frames. Video redundancy coding
`(VRC) can be given as an example. “The principle of the VRC
`method is to divide the sequence of pictures into two or more
`threads in such a way that all camera pictures are assigned to
`one of the threads in a round-robin fashion. Each thread is coded
`independently. In regular intervals, all threads converge into a
`so-called sync frame. From this sync frame, a new thread series
`is started. If one of these threads is damaged because of a packet
`loss, the remaining threads stay intact and can be used to predict
`the next sync frame. It is possible to continue the decoding of
`the damaged thread, which leads to slight picture degradation,
`or to stop its decoding which leads to a drop of the frame rate.
`Sync frames are always predicted out of one of the undamaged
`threads. This means that the number of transmitted I-frames can
`be kept small, because there is no need for complete re-syn-
`chronization.” 1 For the sync frame, more than one representa-
`tion (P-frame) is sent, each one using a reference frame from a
`different thread. Due to the usage of P-frames these representa-
`tions are not identical. Therefore a mismatch is introduced when
`some of the representations cannot be decoded and their coun-
`terparts are used when decoding the following threads. The use
`of SP-frames as sync frames eliminates this problem.
`
`III. DECODING AND ENCODING PROCESSES FOR SP- AND
`SI-FRAMES
`
`In this section, we provide a detailed description of de-
`coding and encoding processes for nonintra blocks in SP- and
`SI-frames. For intra blocks in SP- and SI-frames, the process
`identical to that of I-frames is applied [1]. As noted earlier,
`SP-frames can be further classified as secondary SP-frames,
`e.g.,
`in Fig. 2, and primary SP-frames, e.g.,
`and
`in Fig. 2. We first describe the decoding process for the
`secondary SP-frames and SI-frames [2] to illustrate the basic
`principle. Then the description of the improved decoder [3],
`which is used for decoding primary SP-frames, is provided.
`Finally, an example of an SP- and SI-frame encoder is given.
`
`A. Decoding Process for Secondary SP-Frames and SI-Frames
`Fig. 5 illustrates a general schematic block diagram of
`decoding process for secondary SP-frames and SI-frames. As
`can be observed from Figs. 1 and 5, SP- and SI-frames make
`use of the existing coding modules for P- and I- frames. First,
`a predicted block
`is formed. For SP-frames,
`is formed by motion-compensated prediction from the already
`decoded frames using the motion vectors and the reference
`frame number information that are received from the encoder.
`For SI-frames, it is formed by spatial prediction from the
`already decoded neighboring pixels within the same frame
`
`1The description of VRC is copied from [14].
`
`Fig. 5. Generic block diagram of decoding process for secondary SP- and
`SI-frames.
`
`using the intra prediction mode information received from the
`encoder. Then, forward transform is applied to the predicted
`block
`and the obtained coefficients are quantized. The
`quantized predicted block coefficients denoted by
`are
`added to the received quantized prediction error coefficients
`to calculate the quantized reconstruction coefficients
`.
`The image is reconstructed by inverse transform of
`, which
`are found by dequantizing
`.
`Since during SP-frame (SI-frame) decoding, unlike during
`P-frame (I-frame) decoding (compare Figs. 1 and 5), transform
`is applied to predicted blocks and resulting coefficients are
`quantized, coding efficiency of the SP-frames (SI-frames) is
`expected to be worse than that of P-frames (I-frames), as will
`be illustrated in Section IV. The quantization applied to the
`predicted block coefficients and the prediction error coefficients
`in the secondary SP- and SI-frame decoding scheme described
`above has to be the same. More specifically, both should use the
`same quantization parameter. To improve coding efficiency, the
`following improved decoding structure [3] is defined, which
`gives the flexibility to use different quantization parameters for
`the predicted block coefficients than for the prediction error
`coefficients.
`
`B. Decoding Process for Primary SP-Frames
`Fig. 6 illustrates a block diagram of the improved SP-frame
`decoder used for primary SP-frames. Similar to the earlier case,
`a predicted block
`is formed and forward transform is ap-
`plied. The obtained transform coefficients are denoted by
`.
`Then the quantized prediction error coefficients
`that are re-
`ceived from the encoder are dequantized using a quantization
`parameter PQP and added to the predicted block transform co-
`efficients
`. The sum is denoted by
`. Note that in the
`earlier case, the quantized coefficients were added whereas in
`this case the summation of coefficients is performed. The recon-
`struction coefficients
`are quantized and dequantized using
`a quantization parameter SPQP and inverse transform is applied
`to the resulting coefficients
`, similar to the earlier scheme.
`The quantization parameter used for reconstruction coefficients
`and in turn for the predicted block coefficients, namely SPQP, is
`
`4
`
`4
`
`

`

`KARCZEWICZ AND KURCEREN: SP- AND SI-FRAMES DESIGN FOR H.264/AVC
`
`641
`
`Fig. 6. Generic block diagram of decoding process for primary SP-frames.
`
`not necessarily the same as the quantization parameter PQP used
`for the prediction error coefficients. Therefore, in this case, a
`finer quantization parameter, introducing smaller distortion, can
`be used for the predicted block coefficients than for the predic-
`tion error coefficients, which will result in smaller reconstruc-
`tion error.
`The SP- and SI-frame decoders discussed earlier in this sec-
`tion are general decoders and can easily be incorporated into
`other coding standards. The specific details as to how they are
`implemented in H.264/AVC can be found in [1].
`
`C. SP-Frame and SI-Frame Encoder
`In this section, we first present the encoding process for pri-
`mary SP-frames and later for secondary SP- and SI-frames. The
`following applies to the encoding of nonintra blocks in SP- and
`SI-frames. For intra blocks in SP- and SI-frames, the identical
`process to that used for I-frames is applied [1].
`Fig. 7 illustrates a general schematic block diagram of an ex-
`ample encoder corresponding to the primary SP-frames. First,
`a predicted block
`is formed by motion-compensated
`prediction using the original image and the previously recon-
`structed frames. Then forward transform is applied to both the
`predicted block
`and the corresponding block in the orig-
`inal image. The transform coefficients of
`are quantised
`and dequantized using the quantization parameter SPQP. The
`obtained coefficients
`are then subtracted from the trans-
`form coefficients of the original image. The results of the sub-
`traction represent the prediction error coefficients
`. The pre-
`diction error coefficients are quantized using PQP and the re-
`sulting coefficients
`are sent to the multiplexer together with
`motion vector information. The decoding process follows the
`identical steps as described earlier.
`In the following, we illustrate with an example, how
`SP-frames provide the functionality mentioned earlier, i.e.,
`identical frames are reconstructed even when different refer-
`ence frames are used for their prediction. Let us denote by
`the reconstructed values of the primary SP-frame
`encoded using the predicted frame
`and obtained by
`
`Fig. 7. Generic block diagram of encoding process for nonintra blocks in
`SP-frames.
`
`inverse transform of the quantized reconstructed coefficients
`(see Fig. 6). Now assume that we would like to generate
`secondary representation of this primary SP-frame having
`the identical reconstructed values
`and encoded using
`a different predicted frame
`. The problem becomes
`finding new prediction error coefficients
`that would
`identically reconstruct
`the frame
`using
`instead of
`. The quantized transform coefficients
`are calculated for the predicted frame
`using
`the quantization parameter SPQP. Then the prediction error
`coefficients for the secondary SP-frame are simply computed
`as
`. On the decoder side, according to
`the decoding process for secondary SP-frames, as illustrated in
`Fig. 5, the decoder forms
`and then computes
`by first applying transform to
`and then quantizing
`obtained coefficients using SPQP. The coefficients
`,
`identical to the ones on the encoder side, are added to the
`received prediction error coefficients
`. The resulting sum
`is equal to
`. This
`example illustrates that with the special encoding of secondary
`SP-frames, identical reconstructed values are obtained to a
`corresponding primary SP-frames.
`When
`this case
`is formed by intra-prediction,
`becomes converting a primary SP-frame with motion-com-
`pensated prediction into a secondary SI-frame with only
`spatial prediction. As shown earlier, this property has major
`implications in random access, and error recovery/resilience.
`To achieve identical reconstruction, the quantization param-
`eter used for a secondary SP-frame should be equal to the quan-
`tization parameter SPQP used for the predicted frame in a pri-
`mary SP-frame. That means that using a finer quantization pa-
`rameter value for SPQP although improves the coding efficiency
`of the primary SP-frames placed within the bitstream might re-
`sult in larger frame sizes for the secondary SP-frames. Since
`the secondary representations are sent only during switching
`
`5
`
`5
`
`

`

`642
`
`IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 13, NO. 7, JULY 2003
`
`Illustration of coding efficiencies SP frames using different SPQP
`Fig. 8.
`values, also included are I- and P-frame performances.
`
`or random access, the choice of the SPQP value is application
`dependent. For example when SP-frames are used to facilitate
`random access one can expect that the SP-frames placed within
`a single bitstream will have the major influence on compression
`efficiency and therefore the SPQP value should be small. On the
`other hand, when SP-frames are used for streaming rate control,
`the SPQP value should be kept close to PQP since the secondary
`SP-frames sent during switching from one bitstream to another
`will have large share of the overall bandwidth.
`
`IV. RESULTS
`
`In this section, we provide simulation results to illustrate the
`coding efficiency of SP-frames. First, we compare the coding
`efficiency of SP-frames with I- and P-frames. The results are
`obtained using TML 8.7 software [13] with five reference
`frames in UVLC mode with rate-distortion optimization option
`enabled. Later, a comparison of SP-frames with S-frames [11]
`is provided; these results are repeated from [12].
`The results reported here are for some of the standard se-
`quences used in JVT contributions with QCIF resolution and en-
`coded at 10 fps. Similar results are observed for other sequences
`and further results can be found in [3].
`1) Coding Efficiency of SP-Frames: Fig. 8 gives the
`comparison of coding efficiency of I-, P-, and SP- frames, in
`terms of their PSNR as a function of bit rate. These results are
`generated by encoding each frame of a sequence as either an
`I-, P-, or SP-frame, with the exception of the first frame, which
`is always an I-frame. We also include in Fig. 8 the results
`when SP-frames are coded using different values of SPQP. In
`the first case, SPQP is the same as PQP, then SPQP is equal
`to 3 and in the last case, SPQP is equal to PQP-6. It can be
`observed in Fig. 8, that SP-frames have lower coding efficiency
`than P-frames and significantly higher coding efficiency than
`I-frames. Note, however, that SP-frames provide functionalities
`that are usually achieved only with I-frames. As expected, the
`SP-frames performance improves with decreasing SPQP versus
`PQP. The SP-frames coding efficiency for
`becomes
`very close, slightly worse at higher rates, to the P-frame coding
`efficiency. Further results can be found in [3].
`
`Illustration of coding efficiencies SP frames using different SPQP
`Fig. 9.
`values when inserted periodically, also included is the periodic-intra coding
`approach.
`
`2) Performance Improvement When SP-Frames are Inserted
`Periodically: In the following, we present simulation results
`when SP- and I-frames are introduced at fixed intervals, as
`it will be the case when enabling bitstream-switching or
`random-access functionality. Fig. 9 illustrates the results ob-
`tained under the following conditions: the first frame is encoded
`as an I-frame and at fixed intervals, in this case 1 s, the frames
`are encoded as I- or SP-frames while the remaining frames are
`encoded as P-frames. Also included in Fig. 9 is the performance
`achieved when all the frames are encoded as P-frames. Note
`that in this case, none of the functionalities mentioned earlier
`can be obtained while it provides a benchmark for comparison
`with both SP- and I-frame cases. As can be seen in Fig. 9,
`SP-frames while providing the same functionalities as I-frames
`have significantly higher coding efficiency. Furthermore,
`the performance of SP-frames improves and approaches the
`P-frame performance with decreasing SPQP. Further results
`can be found in [3].
`3) Comparison With S-Frames: Farber et al. [11] introduced
`a specially encoded P-frame, called an S-frame, which is sent
`only when switching from one bitstream to another, similar to
`the secondary SP-frame
`in Fig. 2. More specifically, to en-
`able switching from bitstream 1 to bitstream 2 at
`th frame, an
`S-frame is generated by encoding the th reconstructed frame in
`bitstream 2 as a P-type frame predicted from previously recon-
`structed frames from bitstream 1. Unlike SP-frames, S-frames
`introduce mismatch while switching between bitstreams.
`To minimize drift, the quantization parameter QP used for
`S-frames should be kept small. In the following, we use QP
`equal to 3 to ensure that the initial PSNR difference is below
`0.2 dB, as is recommended in [11].
`In Fig. 10, we present examples illustrating switching
`between bitstreams using S-frames. Fig. 10 illustrates the
`PSNR profiles of bitstreams encoded with
`and with
`and two additional bitstreams created by switching
`between them. In each case, we switch from the bitstream
`encoded with
`to the bitstream encoded with
`but at different frames, namely at frames number 10 and 20.
`
`6
`
`6
`
`

`

`KARCZEWICZ AND KURCEREN: SP- AND SI-FRAMES DESIGN FOR H.264/AVC
`
`643
`
`TABLE I
`COMPARISON OF AVERAGE I-, S-, AND SP-FRAME SIZES THAT WOULD BE
`USED DURING SWITCHING FROM BITSTREAM ENCODED WITH QP
`TO THE BITSTREAM ENCODED WITH QP
`
`Illustration of switching between bitstream encoded by QP = 19
`Fig. 10.
`and 13 with S-frames, QP (S-frame) is equal to 3.
`
`TABLE II
`COMPARISON OF AVERAGE I-, S-, AND SP-FRAME SIZES THAT WOULD BE
`USED DURING SWITCHING FROM BITSTREAM ENCODED WITH QP
`TO THE BITSTREAM ENCODED WITH QP
`
`TABLE III
`PSNR AND TOTAL BITS OVER 100 FRAMES FOR THE MULTIPLE SWITCHING
`EXAMPLE FOR S-, I-, AND SP-FRAME APPROACHES
`
`Fig. 11. Multiple switching between bitstreams encoded with QP = 19 and
`13 when using S-frames. QP for S-frame is equal to 3.
`
`As can be seen from Fig. 10, the PSNR values of the recon-
`structed frames after the switch diverge from the values of the
`frames from the “target” bitstream—the bitstream that we are
`switching to. Moreover, the drift becomes more pronounced
`when multiple switches occur as illustrated in Fig. 11. In
`Fig. 11, the drift becomes larger than 1.5 dB after the last
`switch between bitstreams. Using even smaller quantization
`parameter for S-frames could reduce the drift; however, that
`would increase S-frame sizes which, as will be shown below,
`are already substantial.
`In this section, we present a brief comparison of SP- and
`S- frames. Tables I and II list the average frame sizes of S-
`and SP-frames. Here, SP-frames refer to secondary SP-frames
`that would be used during switching, e.g.,
`in Fig. 2. The
`average is taken over 10 S-frames (SP-frames) used to switch
`between two bitstreams at different locations. We switch from
`the bitstream encoded with
`to the bitstream encoded with
`where the values of
`and
`are given next to the
`sequence name in the first column of Tables I and II. Notice
`that in Table I, the switch always takes place from the lower to
`higher quality bitstream and in Table II from the higher to lower
`quality bitstream. In both cases, the sizes of the S-frames are
`considerably larger than that of the SP-frames. The differences
`
`are larger when the switch occurs from higher to lower quality
`bitstream. For example, for the “news” sequence, the average
`size of the S-frame is 3.4 times larger than that of the SP-frame
`when
`and
`(Table I) and 6.2 times larger
`when
`and
`(Table II). Similar results can
`be observed for other test conditions [12].
`In the following, we measure PSNR and bit rate for a number
`of bitstreams, each one generated by switching four times be-
`tween two different quality bitstreams representing the same se-
`quence. Switching occurs every other second and the first one
`takes place from the higher to lower quality bitstream. S-, I-,
`and SP-frames are used to facilitate the switching

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket