MASTER OF SCIENCE THESIS

Video Coding in H.26L

by

Kristofer Dovstam

April 2000

Work done at Ericsson Radio Systems AB, Kista, Sweden,
Ericsson Research, Department of Audio & Visual Technology.

IPR2018-01413
Sony EX1010 Page 1
Abstract

For the past few years, the capacity of global networks and communication channels has increased considerably. This allows for real-time applications such as video conferencing and video-on-demand using compressed video. State-of-the-art video coding solutions such as H.263, MPEG-2 and MPEG-4 all have one goal in common: to achieve the highest possible image quality at the lowest possible bit-rate.

During 1999, the development of a new video coding standard, H.26L, started. H.26L is intended to replace its predecessor H.263, and one of its goals is a bit-rate saving of 50% compared to H.263. The first proposal for the H.26L standard, presented in August 1999, achieves higher compression efficiency than H.263; however, the goal of 50% is not yet met.

This thesis proposes a method that uses adaptive arithmetic coding (AAC) for entropy coding in an H.26L video codec in order to further improve the compression efficiency. The solution is general and can be applied to any video codec, although implementations have been made for the H.26L video codec only. AAC is based on an entirely different strategy than the variable length entropy coding technique employed in H.26L and many other video codecs. Three experimental models for adaptation to local statistics have been designed and investigated. Results show that significant bit-rate savings can be made using AAC in high bit-rate environments, while less is gained at lower bit-rates.
Contents

1. Introduction and Background .............................................. 9
  1.1 Overview of Video Compression ......................................... 9
    1.1.1 Exploiting Spatial Redundancy – INTRA-coded Compression ......... 9
    1.1.2 Exploiting Temporal Redundancy – INTER-coded Compression ....... 10
    1.1.3 Motion Estimation .............................................. 10
    1.1.4 Entropy Coding ................................................. 11
      Run-Length Coding .................................................. 12
      Variable Length Codes .............................................. 12
      Arithmetic Coding .................................................. 12
    1.1.5 A Typical Video Codec .......................................... 15
  1.2 A Brief Overview of Existing Standards .............................. 16
    1.2.1 H.261 ......................................................... 17
    1.2.2 MPEG-1 ........................................................ 17
    1.2.3 MPEG-2 / H.262 ................................................ 17
    1.2.4 H.263 ......................................................... 18
      Unrestricted Motion Vector mode .................................... 18
      Syntax-based Arithmetic Coding mode ................................ 18
      Advanced Prediction mode ........................................... 18
      PB-frames mode ..................................................... 19
    1.2.5 MPEG-4 ........................................................ 19
    1.2.6 H.26L ......................................................... 19
      Only one regular VLC table used for symbol coding .................. 20
      One-third pixel position used for motion prediction ................ 20
      Seven different block sizes used for motion prediction ............. 21
      Integer transform applied on 4x4 blocks ............................ 21
      Multiple reference frames may be used for motion prediction ........ 21
      Other features ..................................................... 22
  1.3 Areas of Research and Further Aspects of Video Coding .............. 22
    1.3.1 Rate-Distortion Theory ........................................ 23
    1.3.2 Error Resilience .............................................. 24
      Reversible Variable Length Codes ................................... 25
      Data Partitioning .................................................. 25
      Header Extension Code .............................................. 26
    1.3.3 Advanced Motion Estimation .................................... 26
      Gradient Matching .................................................. 27
      Motion Models and Regions of Support ............................... 27
    1.3.4 Adaptive Arithmetic Coding .................................... 28
  1.4 Purpose and Goal ................................................... 29
  1.5 Outline ............................................................ 30
2. Video Codec Evaluation ................................................. 31
  2.1 H.26L versus H.263 ................................................. 31
    2.1.1 The SNR Metric ................................................ 31
    2.1.2 Video Sequences and Formats ................................... 32
    2.1.3 Test Issues and Procedure ..................................... 32
    2.1.4 Example of Trade-Off Between Image Quality and Quantization Step 34
    2.1.5 Test Specifications ........................................... 34
    2.1.6 SNR vs. Bit-Rate for Foreman and Silent ....................... 35
  2.2 Arithmetic Coding versus VLCs for H.263 ............................ 36
    2.2.1 Test Specifications ........................................... 36
    2.2.2 Bit-Rate Reduction Using Arithmetic Coding .................... 36
  2.3 The H.26L Syntax ................................................... 37
    2.3.1 Picture Sync (Psync) .......................................... 37
    2.3.2 Picture Type (Ptype) .......................................... 37
    2.3.3 Macroblock Type (MB_Type) ..................................... 37
    2.3.4 Intra Prediction Mode (Intra_pred_mode) ....................... 38
    2.3.5 Reference Frame (Ref_frame) ................................... 38
    2.3.6 Motion Vector Data (MVD) ...................................... 38
    2.3.7 Transform Coefficients (Tcoeff_luma, Tcoeff_chroma_DC &
          Tcoeff_chroma_AC) ............................................. 38
    2.3.8 Coded Block Pattern (CBP) ..................................... 39
  2.4 Bit Distribution for Different Data Types .......................... 39
    2.4.1 Test Specification ............................................ 40
    2.4.2 Bit Distribution for Different Data Types in H.26L ............ 40
    2.4.3 A Comparison Between H.26L and H.263 .......................... 41
  2.5 Entropy Study of H.26L ............................................. 41
    2.5.1 From Video Sequence to VLCs ................................... 42
    2.5.2 Definition of Entropy ......................................... 42
    2.5.3 Entropy with Respect to a Model ............................... 43
    2.5.4 Average Codeword Length (ACL) ................................. 44
    2.5.5 Average Disjoint Entropy (ADE) ................................ 45
    2.5.6 No Double Scan ................................................ 46
    2.5.7 Entropy, ACL and ADE for H.26L and Foreman .................... 46
    2.5.8 Difference Between Entropy and ACL for Different Data Types ... 47
  2.6 Evaluation Conclusions ............................................. 49
    2.6.1 H.26L versus H.263 ............................................ 49
    2.6.2 Arithmetic Coding versus VLCs for H.263 ....................... 50
    2.6.3 Bit Distribution for Different Data Types ..................... 50
    2.6.4 Entropy Study of the H.26L Output ............................. 50
3. The Adaptive Arithmetic Codec .......................................... 53
  3.1 Adaptive Methods ................................................... 53
    3.1.1 Dynamic Symbol Reordering in H.26L ............................ 53
    3.1.2 Adaptive Arithmetic Coding in Wavelet-Transform Video Coder ... 53
    3.1.3 Multiple VLC Tables for Transform Coefficients in H.263 ....... 54
  3.2 Adaptive Arithmetic Codec Functionality ............................ 54
    3.2.1 Model and Symbols ............................................. 55
    3.2.2 Contexts and Adaptation ....................................... 55
    3.2.3 Escape Symbols ................................................ 56
    3.2.4 Purging Contexts .............................................. 56
    3.2.5 Stopping and Restarting the Adaptive Arithmetic Codec ......... 57
    3.2.6 Manual Modification of the Symbol Distribution in a Context ... 57
  3.3 Adaptive Arithmetic Codec Parameters ............................... 57
    3.3.1 Context Sizes ................................................. 57
    3.3.2 Initial Symbol Distribution ................................... 58
    3.3.3 Re-initialization Frequency ................................... 59
    3.3.4 Adaptation Rate ............................................... 59
4. Models and Results ..................................................... 61
  4.1 Designing Models ................................................... 61
  4.2 Model 1 ............................................................ 62
    4.2.1 No Run-Length Coding .......................................... 62
    4.2.2 Model 1 Contexts .............................................. 63
    4.2.3 Initial Symbol Distribution for Model 1 ....................... 63
    4.2.4 ADAPT_RATE for Model 1 ........................................ 64
    4.2.5 Bit-Rate Reduction for Model 1 ................................ 64
    4.2.6 Conclusions ................................................... 65
  4.3 Model 2 ............................................................ 65
    4.3.1 Model 2 Contexts .............................................. 65
    4.3.2 Initial Symbol Distribution for Model 2 ....................... 66
    4.3.3 ADAPT_RATE for Model 2 ........................................ 66
    4.3.4 Bit-Rate Reduction for Model 2 ................................ 67
    4.3.5 Conclusions ................................................... 67
  4.4 Model 3 ............................................................ 69
    4.4.1 Model 3 Contexts .............................................. 69
    4.4.2 Initial Symbol Distribution for Model 3 ....................... 70
    4.4.3 Improved Weighting of Symbols ................................. 71
    4.4.4 Modified Handling of Psync and Ptype Codeword and Escape Symbols 72
    4.4.5 Bit-Rate Reduction for Model 3 ................................ 73
    4.4.6 Conclusions ................................................... 74
5. Conclusions and Future Work ............................................ 75
  5.1 Conclusions ........................................................ 75
    5.1.1 Amount of Statistics Crucial for Good Adaptation .............. 75
    5.1.2 Complexity .................................................... 75
    5.1.3 Error Resilience .............................................. 76
    5.1.4 A Few Words About Speed ....................................... 76
  5.2 Future Work ........................................................ 76
    5.2.1 Rate-Distortion Optimization .................................. 77
    5.2.2 Model 4 ....................................................... 78
Appendix A – Acronyms and Abbreviations ................................... 79
References ................................................................ 81

1. Introduction and Background

This section includes an overview of video compression, a brief introduction to current video coding standards, and a discussion of video coding research topics. The experienced reader, familiar with concepts such as motion estimation, adaptive arithmetic coding and error resilience, may skip ahead to the purpose and goal in Section 1.4.

1.1 Overview of Video Compression

The key idea of video compression is to exploit spatial and temporal redundancy. Spatial redundancy exists within an image due to similarities between different areas of the image; for instance, adjacent pixels (picture elements) are very likely to have similar characteristics. Temporal redundancy can be found within a sequence of frames (video) and is due to similarities between successive frames. (Throughout this document the words image, frame and picture are used interchangeably.) Procedures for exploiting redundancy take different approaches depending on whether the redundancy is spatial or temporal.

There are two kinds of compression: lossless and lossy, also referred to as reversible and irreversible compression. If data is compressed using a lossless compression technique, the information content of the original data and the compressed data is exactly the same. If, on the other hand, a lossy compression technique is employed, the compressed data contains less information than the original data, i.e. information is lost forever.

It is important to remember that when dealing with a discrete representation of information, such as in a computer, data is always quantized. This means that data values, for example pixels, are always rounded off or truncated to fit a certain number of bits; the fewer the available bits, the coarser the quantization. If, for example, a natural image is quantized, the resulting quantized image always contains less information than the natural image.

An important property of a compressed video signal is its bit-rate: the average number of bits needed to represent each second of a video sequence, measured in bits per second. In other words, the bit-rate of a video sequence is its size in bits divided by its length in seconds.
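
As a minimal illustration of this definition (the numbers below are invented for the example, not measurements from any codec):

```python
def bit_rate(size_bits, duration_seconds):
    """Bit-rate = size of the sequence in bits / its length in seconds."""
    return size_bits / duration_seconds

# A hypothetical 10-second sequence compressed to 640 000 bits:
print(bit_rate(640_000, 10.0))  # 64000.0 bit/s, i.e. 64 kbit/s
```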

In video compression the goal is generally to obtain a lower bit-rate for the compressed video signal than for the same uncompressed signal. It is, however, important to consider factors other than bit-rate as well. If, for instance, video is used in mobile communications, the compressed video signal needs to be insensitive to bit-errors; that is, the video signal needs to be error resilient. It is also preferable to keep the complexity of the encoder and decoder low and to add as little delay to the whole system as possible.

1.1.1 Exploiting Spatial Redundancy – INTRA-coded Compression

The most popular approach for removing spatial redundancy in an image is by means of a decorrelating transform, which decreases the correlation between pixels and hence represents the information in the image in a more compact way. The transform is usually applied to square blocks of the image (4x4, 8x8, 16x16 pixels, etc). After transformation the block consists of transform coefficients containing the same information as the pixels. The transform coefficients correspond to different frequencies.

The transformation does not in itself result in any compression. Instead, compression is achieved when the transform coefficients are quantized: the coarser the quantization, the higher the compression, and the greater the loss of information. If pixels are quantized directly, all frequencies in the image are affected equally. When transform coefficients are quantized, however, the higher frequencies, to which the human eye is less sensitive, are often more coarsely quantized than the lower frequencies. This results in a smaller degradation of the apparent image quality than if the transform coding technique is not employed and the quantization is performed directly on the pixels. When decoding, the transform coding process is reversed and the pixel values are restored using an inverse transform. In video coding, when the encoding of an image in a sequence is independent of other frames in the sequence, the process is called INTRA-coded compression, and the image is referred to as an INTRA-picture or I-picture.

Examples of transforms with good decorrelating performance are the often adopted discrete cosine transform (DCT) and wavelet transforms. As of today the most commonly used transform in video coding is the DCT, while for example the JPEG-2000 [7] image compression standard uses wavelet transforms.
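
To make the transform-then-quantize idea concrete, here is a naive textbook sketch of a 2-D DCT-II followed by uniform quantization on a 4x4 block. It only illustrates the principle: H.26L itself uses an integer transform, and real codecs use fast DCT algorithms and frequency-dependent quantization, neither of which is attempted here.

```python
import math

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block, with orthonormal scaling."""
    n = len(block)
    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):            # vertical frequency
        for v in range(n):        # horizontal frequency
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def quantize(coeffs, step):
    """Uniform quantization; this is where information is lost."""
    return [[round(c / step) for c in row] for row in coeffs]

block = [[128] * 4 for _ in range(4)]   # a flat 4x4 block
coeffs = dct_2d(block)
print(round(coeffs[0][0]))              # 512: all energy in the DC coefficient
print(quantize(coeffs, 16)[0][0])       # 32 after quantization with step 16
```

A flat block compacts into a single nonzero (DC) coefficient, which is exactly the decorrelation property that makes the subsequent quantization and entropy coding effective.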

1.1.2 Exploiting Temporal Redundancy – INTER-coded Compression

Typically, adjacent images in a sequence of frames, for example video with a frame-rate of 30 frames/s, are very similar. If every frame is encoded as an INTRA-picture, no advantage is taken of the strong correlation between consecutive frames (an example of an algorithm taking this approach is the Motion-JPEG algorithm (cf. [1])). The obvious solution is to represent only the difference of each frame with respect to the previous frame. This can be done by simply subtracting the pixel values of the previous frame from those of the frame currently being encoded. In this way the variance of the values to be coded decreases, and stronger compression can be obtained. In the next step, the same transform compression technique as for INTRA-pictures is used on the difference data. Further compression can be achieved, for example, by introducing a threshold: if the difference between corresponding pixels in consecutive frames is below this threshold, the difference is ignored and the previous pixel values are simply reused. In video coding, when an image in a sequence is encoded this way, the process is called INTER-coded compression.
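
The differencing-with-threshold step can be sketched as follows (the threshold value and the tiny 2x2 frames are arbitrary choices for illustration):

```python
def inter_difference(current, previous, threshold=2):
    """Per-pixel difference against the previous frame; differences at or
    below the threshold are ignored, i.e. the previous pixel is reused."""
    result = []
    for cur_row, prev_row in zip(current, previous):
        row = []
        for cur, prev in zip(cur_row, prev_row):
            d = cur - prev
            row.append(d if abs(d) > threshold else 0)
        result.append(row)
    return result

previous = [[10, 10], [25, 20]]
current = [[10, 12], [20, 30]]
print(inter_difference(current, previous))  # [[0, 0], [-5, 10]]
```

Only the large differences survive; the mostly-zero difference data then compresses far better than the raw frame.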

1.1.3 Motion Estimation

INTER-coded compression can be improved by motion estimation. One of the simplest and least complex forms of motion estimation is forward prediction using block matching, which is described below.

In current major video coding standards, images are partitioned into blocks called macroblocks; the size of a macroblock is typically 16x16 pixels. When a video sequence is encoded, the pixel values of each macroblock in each frame are predicted. Here, the macroblock to be predicted is called the target macroblock. The prediction is done by looking at a past frame called the reference picture. When encoding a frame, if no movement has taken place with respect to the reference picture, each target macroblock will be best predicted by the corresponding macroblock in the reference picture. If, however, there is movement, one or more of the target macroblocks may be better predicted by a displaced but equally sized block. This displacement can be found by matching the target macroblock against an appropriately positioned set of blocks of the same size (a search area) in the reference picture (see Figure 1.1). The "best" matching block is then used to predict the target macroblock.

Figure 1.1 Forward prediction using block matching: the target macroblock in the current frame is matched against a search area in the reference picture (the previous frame), and the best match is used for prediction.

A commonly adopted metric when searching for the best match is the well-known mean square error (MSE) (see Section 2.1.1 for a definition of MSE); the smaller the error, the better the match. In practical video coding, however, a simplified version of the MSE called the sum of absolute differences (SAD) [1] is often employed. This decreases the amount of computation considerably, since no multiplications are needed to compute the SAD.
When the best matching block is found, a motion vector is constructed to represent the displacement of the block, and the prediction error is calculated. The prediction error is the difference in pixel values between the target macroblock and the block used for prediction. In order to decode a macroblock, all that is needed is its motion vector and the prediction error.

A frame coded using forward prediction is referred to as a P-picture. Far more advanced and sophisticated methods for motion estimation exist, some of which are discussed in Section 1.3.3.
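
A full-search block-matching loop with the SAD criterion can be sketched as below. The 2x2 blocks, the small search range and the synthetic frame are purely for illustration; real encoders use 16x16 macroblocks, much larger search areas, and usually fast search strategies instead of an exhaustive search.

```python
def sad(block_a, block_b):
    """Sum of absolute differences: no multiplications needed, unlike MSE."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_match(target, reference, tx, ty, n, search_range):
    """Full search in a (2*search_range+1)^2 area around (tx, ty).
    Returns the motion vector (dx, dy) minimizing the SAD, and its cost."""
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x, y = tx + dx, ty + dy
            if 0 <= x <= width - n and 0 <= y <= height - n:
                candidate = [row[x:x + n] for row in reference[y:y + n]]
                cost = sad(target, candidate)
                if cost < best_cost:
                    best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost

# A toy 8x8 reference frame, and a 2x2 target block actually located at (4, 5):
reference = [[(7 * x + 13 * y) % 256 for x in range(8)] for y in range(8)]
target = [row[4:6] for row in reference[5:7]]
print(best_match(target, reference, 3, 3, n=2, search_range=2))  # ((1, 2), 0)
```

The search recovers the displacement (1, 2) with zero prediction error; in general the residual after subtracting the best match is what gets transform coded.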

1.1.4 Entropy Coding

When all frames in the sequence have been INTRA- or INTER-coded, the video signal consists of transform coefficients and motion vectors (a few bits are also needed for quantization information, the type of macroblock encoding, etc). Due to its statistical properties, however, there is still redundancy in the video signal. This redundancy can be exploited by adopting an entropy coding technique. In contrast to the compression techniques discussed in earlier sections, entropy coding is a lossless stage of the video coding procedure.

The video signal can be viewed as a source of information, which has a true entropy or information content (see Section 2.5.2 for a definition of entropy). By modeling the information source, we assign probabilities to the different outcomes, or symbols, of the source. In video coding the symbols are typically the transform coefficients and the motion vectors. After modeling comes coding, which, based on the probabilities assigned by the model, translates the symbols into a sequence of bits, or bit-stream.

Run-Length Coding

An easy way to represent symbols efficiently is run-length coding. If a series of successive symbols in a signal are the same, they are replaced by one symbol plus the length of the run of that symbol. Provided many successive symbols are the same, this results in a considerably decreased number of bits needed to represent the data. The technique is also easily combined with other entropy coding techniques.
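
A minimal sketch of the idea (the sample data is invented; transform coefficient scans, with their long runs of zeros, are the typical input in a video codec):

```python
def run_length_encode(symbols):
    """Collapse runs of identical symbols into (symbol, run length) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((s, 1))               # start a new run
    return runs

def run_length_decode(runs):
    return [s for s, count in runs for _ in range(count)]

data = [0, 0, 0, 0, 5, 0, 0, 3]
runs = run_length_encode(data)
print(runs)                                   # [(0, 4), (5, 1), (0, 2), (3, 1)]
assert run_length_decode(runs) == data        # lossless round trip
```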

Variable Length Codes

By far the most commonly adopted entropy coding technique for video coding purposes is variable length codes (VLCs); the well-known Huffman codes (cf. [3]) are one example. In the VLC approach, compression is achieved by assigning shorter codewords, i.e. short sequences of bits, to symbols with higher probabilities, and longer codewords to symbols with lower probabilities. The codewords and associated symbols are organized in what is called a VLC table or look-up table. There is, however, a limit to how accurately the probabilities can be represented, due to the fact that each VLC is constrained to consist of an integral number of bits.
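
A minimal sketch of VLC encoding and decoding with a hypothetical prefix-free table. The table below is invented for illustration and taken from no standard; note that it is only optimal for probabilities 1/2, 1/4, 1/4, so probabilities like 0.5, 0.2 and 0.3 cannot be matched exactly, which is precisely the integral-bit limitation mentioned above.

```python
# Hypothetical prefix-free VLC table (no codeword is a prefix of another):
VLC_TABLE = {"a": "0", "b": "10", "c": "11"}

def vlc_encode(symbols):
    """Concatenate the codewords; shorter codes go to more probable symbols."""
    return "".join(VLC_TABLE[s] for s in symbols)

def vlc_decode(bits):
    """Prefix-freeness means the first table hit is always a full codeword."""
    inverse = {code: sym for sym, code in VLC_TABLE.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            out.append(inverse[buffer])
            buffer = ""
    return "".join(out)

print(vlc_encode("acab"))       # 011010 -- 6 bits for 4 symbols
print(vlc_decode("011010"))     # acab
```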

Arithmetic Coding

Another approach to entropy coding, quite different from VLCs and Huffman coding, is arithmetic coding (cf. [3] [5] [8]). In this approach, after modeling the information source, it is possible to represent the probabilities exactly. A sequence of symbols, or a message, is, after being encoded with the arithmetic coding technique, represented by an interval of real numbers between 0 and 1. The coding procedure is most easily described using an example (similar to the one found in [3]).

Assume that we have an alphabet of three symbols with corresponding probabilities as in Table 1.1a. First, each symbol is assigned a unique interval between 0 and 1 according to its probability (see Table 1.1b). Let us say that we want to encode the message acab. The procedure is illustrated in Figures 1.2a and 1.2b. The message, i.e. the stream of symbols (acab), is put into the encoder. When the encoder "sees" the first symbol a in the message, a's interval, i.e. [0, 0.5), is chosen to represent the entire message, which so far is only a. However, the message is longer than just a, and when the encoder sees the next symbol c, all
Table 1.1a  Symbols and probabilities

    Symbol    Probability
    a         0.5
    b         0.2
    c         0.3

Table 1.1b  Symbols and intervals

    Symbol    Interval
    a         [0, 0.5)
    b         [0.5, 0.7)
    c         [0.7, 1.0)

the probabilities, and hence all the intervals, are normalized, or scaled, so that they all lie in the interval of a (i.e. between 0 and 0.5, according to Figures 1.2a and 1.2b). By doing so, it is possible to represent c and a with c's newly assigned interval, [0.35, 0.5). As seen in

Figure 1.2a The process of dividing intervals into smaller subintervals: [0, 1) is narrowed to [0, 0.5) for a, then to [0.35, 0.5) for ac, to [0.35, 0.425) for aca, and finally to [0.3875, 0.4025) for acab.

Figure 1.2b Same process as in Figure 1.2a, here each interval is expanded to full height.

Figure 1.2a, the interval [0.35, 0.5) is unique for the message ac. The rest of the symbols are now encoded in the same manner within the interval [0.35, 0.5), and so on. Repeating the same process in each step ensures that the message "seen so far" can always be uniquely represented. More probable symbols reduce the size of the interval less, adding fewer bits to the message. The longer the message, the finer the last interval becomes and the more bits are needed to represent it. When we have reached the last symbol b, the message acab is represented by the interval [0.3875, 0.4025). However, it is not necessary for the decoder to know both ends of the interval; any number lying in the interval will suffice, such as 0.4 or 0.3879 or even 0.401345978. For decoding it is also necessary to have knowledge of the source model, i.e. the symbols and probabilities in Table 1.1a.

The decoding process is essentially a reversed encoding process. During decoding, however, we are faced with a problem: how do we know when to stop decoding? Indeed, any real number between 0 and 1 could represent an infinitely long message, since it is always possible to divide an interval into new subintervals. This problem is solved by adding an end-of-message symbol to the alphabet, used only to terminate messages. When the decoder encounters the end-of-message symbol, it knows that the entire message has been decoded and stops decoding. In the example above, we could choose b to be our end-of-message symbol.
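
The interval narrowing of the example can be written out in a few lines. This is a floating-point sketch for the toy alphabet only: practical arithmetic coders use integer arithmetic with renormalization to avoid precision problems, and here the decoder is simply told the message length instead of using an end-of-message symbol.

```python
# The intervals of Table 1.1b:
INTERVALS = {"a": (0.0, 0.5), "b": (0.5, 0.7), "c": (0.7, 1.0)}

def arithmetic_encode(message):
    """Narrow [0, 1) once per symbol; the final interval identifies the message."""
    low, high = 0.0, 1.0
    for symbol in message:
        s_low, s_high = INTERVALS[symbol]
        width = high - low
        high = low + width * s_high   # note: both lines use the old `low`
        low = low + width * s_low
    return low, high

def arithmetic_decode(value, length):
    """Read back `length` symbols from any number inside the final interval."""
    message = []
    for _ in range(length):
        for symbol, (s_low, s_high) in INTERVALS.items():
            if s_low <= value < s_high:
                message.append(symbol)
                value = (value - s_low) / (s_high - s_low)  # undo the scaling
                break
    return "".join(message)

low, high = arithmetic_encode("acab")
print(low, high)                   # approx. 0.3875 0.4025, as in Figure 1.2a
print(arithmetic_decode(0.4, 4))   # acab -- 0.4 lies inside [0.3875, 0.4025)
```

Because the interval width after each step equals the product of the symbol probabilities, the number of bits needed to pin down the final interval approaches the entropy of the message, with no rounding to an integral number of bits per symbol.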
Another approach to solving the problem of knowing when to stop decoding is to make use of contexts. For a particular information source it may be clear within different contexts how many symbols are to be encoded and decoded. If this is the case, the decoder knows from the context when to stop decoding, and does not need an end-of-message symbol. Let us say, for instance, that the encoder and decoder agree to send m