`fortextandVideoCompression
`DzungTienHoang
`Ph.D.Dissertation
`DepartmentofComputerScience
`BrownUniversity
`Providence,RhodeIsland
`CS- -
`May
`
`IPR2018-01413
`Sony EX1036 Page 1
`
`
`
`IPR2018-01413
`Sony EX1036 Page 2
`
`IPR2018-01413
`Sony EX1036 Page 2
`
`
`
`Fast and Efficient Algorithms for Text and Video
`
`Compression1
`
`Dzung Tien Hoang 2
`Department of Computer Science
`
`Brown University
`
`Providence, RI 02912–1910
`
`Abstract
`
`There is a tradeoff between the speed of a data compressor and the level of compression it can
`
`achieve. Improving compression generally requires more computation; and improving speed gener-
`
`ally sacrifices compression. In this thesis, we examine a range of tradeoffs for text and video.
`
`In text compression, we attempt to bridge the gap between statistical techniques, which exhibit a
`
`greater amount of compression but are computationally intensive, and dictionary-based techniques,
`
`which give less compression but run faster. We combine the context modeling of statistical coding
`
`with dynamic dictionaries into a hybrid coding scheme we call Dictionary by Partial Matching.
`
`In low-bit-rate video compression, we explore the speed-compression tradeoffs with a range
`
`of motion estimation techniques operating within the H.261 video coding standard. We initially
`
`consider algorithms that explicitly minimizes bit rate and combination of rate and distortion. With
`
`insights gained from the explicit minimization algorithms, we propose a new technique for motion
`
`estimation that minimizes an efficiently computed heuristic function. The new technique gives
`
`compression efficiency comparable to the explicit-minimization algorithms while running much
`
`faster. We also explore bit-minimization in a non-standard quadtree-based video coder that codes
`
`motion information hierarchically using variable-sized blocks.
`
`For video coding at medium-to-high bit rates, we propose a framework that casts rate control
`
`as a resource allocation problem with continuous variables, non-linear constraints, and a novel
`
`lexicographic optimality criterion motivated for near-constant-quality video. With this framework,
`
`we redefine the concept of efficiency to better reflect the constancy in quality generally desired from
`
`a video coder. Rigorous analysis within this framework reveals elegant conditions for optimality,
`
`which leads to polynomial-time algorithms. Simulation studies confirm the theoretical analysis and
`
`produce encodings that are more constant in quality than that achieved with existing rate control
`
`methods. As evidence of the flexibility of the framework, we show how to extend it to allocate bits
`
`among multiple variable-bit-rate bitstreams for transmission over a constant-bit-rate channel.
`
`1 A similar version of this report was submitted in fulfillment of the dissertation requirement for Dzung Tien
`Hoang’s Ph.D.
`2Support was provided in part by a National Science Foundation Graduate Fellowship; by Air Force Office of
`Strategic Research grants F49620–92–J–0515 and F49620–94–I–0217; and by Army Research Office grant DAAH04–
`93–G–0076.
`
`IPR2018-01413
`Sony EX1036 Page 3
`
`
`
`IPR2018-01413
`Sony EX1036 Page 4
`
`IPR2018-01413
`Sony EX1036 Page 4
`
`
`
`Fast and Efficient Algorithms for Text and Video
`Compression
`
`by
`
`Dzung Tien Hoang
`
`B.S. Computer Science, Tulane University, 1990
`
`B.S. Electrical Engineering, Tulane University, 1990
`
`Sc.M. Computer Science, Brown University, 1992
`
`Thesis
`
`Submitted in partial fulfillment of the requirements for the degree of
`
`Doctor of Philosophy in the Department of Computer Science
`
`at Brown University.
`
`May 1997
`
`IPR2018-01413
`Sony EX1036 Page 5
`
`
`
`c(cid:13) Copyright 1997
`by
`
`Dzung Tien Hoang
`
`IPR2018-01413
`Sony EX1036 Page 6
`
`
`
`Vita
`
`Dzung Tien Hoang was born on April 20, 1968 in Nha Trang, Vietnam. He immigrated to the
`
`United States of America in 1975 with his parents, Dzuyet D. Hoang and Tien T. Tran, and two
`
`sisters. He now has three sisters and one brother. They have been living in Harvey, Louisiana.
`
`After graduating in 1986 from the Louisiana School for Math, Science and the Arts, a public
`
`residential high school in Natchitoches, Louisiana, he attended Tulane University in New Orleans
`
`with a full-tuition Dean’s Honor Scholarship and graduated in 1990 with Bachelor of Science degrees
`
`in Electrical Engineering and Computer Science, both with Summa Cum Laude honors.
`
`He joined the Computer Science Department at Brown University in Providence, Rhode Island,
`
`in 1990 under a University Fellowship and later under a National Science Foundation Graduate
`
`Fellowship. He received a Master of Science in Computer Science from Brown University in 1992.
`
`From 1993 to 1996, he was a visiting scholar and a research assistant at Duke University
`
`in Durham, North Carolina. From 1991 to 1995, he spent summers working at the Frederick
`
`National Cancer Research Facility, the Supercomputing Research Center, and the IBM T. J. Watson
`
`Research Center.
`
`In August 1996, he joined Digital Video Systems, in Santa Clara, California, as a Senior Software
`
`Engineer.
`
`iii
`
`IPR2018-01413
`Sony EX1036 Page 7
`
`
`
`iv
`
`IPR2018-01413
`Sony EX1036 Page 8
`
`IPR2018-01413
`Sony EX1036 Page 8
`
`
`
`Preface
`
`In today’s information-driven society, text, audio, image, video and other forms of information are
`
`being increasingly generated, manipulated, and transmitted in digital form. This trend is mani-
`
`fested in the increased level of automation in businesses, the ubiquity of personal computers, the
`
`explosive growth of the Internet and the World Wide Web, and the growing library of multimedia
`
`software that incorporates digital audio, images, and video. Within the past decade, we have seen
`
`secondary storage on desktop personal computers mushroom from a mere 20 MB to 2 GB and
`
`beyond. Modem technology has pushed the transmission bandwidth through plain telephone lines
`
`from 300 bits/s to 33.6 kbits/s. Even with technological improvements in transmission bandwidth
`
`and storage capacity, the information explosion is quickly making current technologies seem in-
`
`adequate. For example, application software that once fit onto a few floppy disks now demands
`
`multi-megabytes of disk space.
`
`Crucial to the management of digital information are data compression techniques that help
`
`make more efficient use of the limited transmission and storage resources available. For storage
`
`applications, data compression increases the effective storage space, allowing more data to be stored
`
`on a given storage device. For transmission applications, data compression increases the effective
`
`bandwidth, allowing a higher volume of data to be transmitted over a given transmission medium.
`
`Data compression can be viewed as a logical transformation of the data and is independent of
`
`the underlying transmission or storage technology. Data compression will not be made obsolete
`
`by advances in these technologies, as there will always be a need for even more storage and even
`
`greater bandwidth.
`
`v
`
`IPR2018-01413
`Sony EX1036 Page 9
`
`
`
`vivi
`
`IPR2018-01413
`Sony EX1036 Page 10
`
`IPR2018-01413
`Sony EX1036 Page 10
`
`
`
`Acknowledgements
`
`This dissertation is the culmination of seven years of graduate studies and four years of under-
`
`graduate education. I am grateful to Johnette Hassell, Mark Benard, and Boumediene Belkhouche
`
`from Tulane University for encouraging me to pursue a graduate degree in Computer Science.
`
`At Brown University, my research career began on firm ground under the direction of Daniel
`
`Lopresti, who guided me to my first conference publication and my first summer research position.
`
`I thank John Savage of Brown University for his patience and support during my search for an
`
`academic focus.
`
`Under the steady guidance of Jeff Vitter I have learned much about research and scholarship.
`
`Jeff is a demanding advisor, sparing in approval or praise. A note of “Good Work” and a few
`
`sparse markings on a recent draft research paper was sure sign of my progress. I am proud to have
`
`earned Jeff’s trust.
`
`During my residence at Brown University and Duke University, my life has been enriched by
`
`interaction with fellow students, especially T. Murali (tmax), Swarup Acharya, Dimitrios Michai-
`
`lidis, Darren Vengroff, P. Krishnan (pk), Choh-Man Teng, Thomas Alexander, Min Wang, Hongyan
`
`Wang, and Apostol Natsev. Much of my work at Duke was catalyzed by Phil Long who provided
`
`a patient ear and injected much enthusiasm during our many brainstorming sessions.
`
`I would like to thank Morten Schultz and Bruce Shapiro from the Frederick National Cancer
`
`Research Facility; Duncan Buell, Maya Gokhale, and Jeff Arnold from the Center for Computing
`
`Sciences; and Prasoon Tiwari, Elliot Linzer, and Cesar Gonzales from the IBM T. J. Watson
`
`Research Center for making my summers most stimulating and rewarding.
`
`Research is often a collaborative effort. The work described in Chapters 2, 4, and 5 was
`
`performed jointly with Phil Long and Jeff Vitter. Chapters 6 to 8 resulted from joint work with
`
`Elliot Linzer and Jeff Vitter. Chapter 10 was joint work with Jeff Vitter.
`
`I am indebted to the National Science Foundation for supporting me with an NSF Graduate
`
`Fellowship. This award has afforded me the freedom to pursue a research career in the field of my
`
`choice.
`
`I would be quite lost without the skills of the administrative staff at the Departments of Com-
`
`puter Science at Brown University and Duke University. Special tribute is due to Mary Andrade
`
`at Brown and Cathy Caimano at Duke for their tireless administrative assistance.
`
`I appreciate the kindness of Digital Video Systems for giving me the time and resources to
`
`finish writing my dissertation.
`
`Above all, I thank my parents, my brother, and my sisters for their constant love and support.
`
`vii
`
`IPR2018-01413
`Sony EX1036 Page 11
`
`
`
`viii
`
`IPR2018-01413
`Sony EX1036 Page 12
`
`IPR2018-01413
`Sony EX1036 Page 12
`
`
`
`Contents
`
`Vita
`
`Preface
`
`Acknowledgements
`
`Introduction
`
`I Text Compression
`
`1 Introduction to Text Compression
`
`1.1 Statistical Coding
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
`
`1.1.1 Probability Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
`
`1.1.2 Entropy Coding
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
`
`1.1.2a
`
`Arithmetic Codes
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . .
`
`1.1.2b
`
`Binary and Phased-In Codes . . . . . . . . . . . . . . . . . . . . . .
`
`1.1.3 Context Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
`
`iii
`
`v
`
`vii
`
`1
`
`5
`
`7
`
`7
`
`8
`
`8
`
`8
`
`9
`
`9
`
`1.1.4
`
`Static, Semi-Adaptive, and Adaptive Models
`
`. . . . . . . . . . . . . . . . . . 10
`
`1.1.5 Prediction by Partial Matching . . . . . . . . . . . . . . . . . . . . . . . . . . 10
`
`1.2 Dictionary Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
`
`1.2.1 Lempel-Ziv Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
`
`1.2.2 LZ77-Class Coders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
`
`1.2.3 LZ78-Class Coders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
`
`1.2.4 LZFG Coder
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
`
`1.3 Equivalence of LZ and Statistical Coders . . . . . . . . . . . . . . . . . . . . . . . . . 16
`
`2 Dictionary by Partial Matching
`
`21
`
`2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
`
`2.2 Dictionary by Partial Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
`
`2.3 LZ77-PM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
`
`2.3.1 Baseline Coder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
`
`2.3.2 PM Modification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
`
`ix
`
`IPR2018-01413
`Sony EX1036 Page 13
`
`
`
`2.3.3 Experimental Results
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
`
`2.4 LZW-PM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
`
`2.4.1 Baseline Coder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
`
`2.4.2 PM Modification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
`
`2.4.3 Experimental Results
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
`
`2.5 LZFG-PM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
`
`2.5.1 Baseline Coder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
`
`2.5.2 PM Modification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
`
`2.5.3 Experimental Results
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
`
`2.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
`
`II Video Compression
`
`3 Introduction to Video Compression
`
`33
`
`35
`
`3.1 Digital Video Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
`
`3.1.1 Color Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
`
`3.1.2 Digitization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
`
`3.1.2a
`
`Spatial Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
`
`3.1.2b
`
`Temporal Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
`
`3.1.2c
`
`Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
`
`3.1.3
`
`Standard Video Data Formats
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . 37
`
`3.2 A Case for Video Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
`
`3.3 Lossy Coding and Rate-Distortion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
`
`3.3.1 Classical Rate-Distortion Theory . . . . . . . . . . . . . . . . . . . . . . . . . 40
`
`3.3.2 Operational Rate-Distortion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
`
`3.3.3 Budget-Constrained Bit Allocation . . . . . . . . . . . . . . . . . . . . . . . . 41
`
`3.3.3a
`
`Viterbi Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
`
`3.3.3b
`
`Lagrange Optimization . . . . . . . . . . . . . . . . . . . . . . . . . 43
`
`3.4 Spatial Redundancy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
`
`3.4.1 Vector Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
`
`3.4.2 Block Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
`
`3.4.3 Discrete Cosine Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
`
`3.4.3a
`
`Forward Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
`
`3.4.3b
`
`Inverse Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
`
`3.4.3c
`
`Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
`
`3.4.3d
`
`Zig-Zag Scan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
`
`3.5 Temporal Redundancy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
`
`3.5.1 Frame Differencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
`
`3.5.2 Motion Compensation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
`
`3.5.3 Block-Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
`
`3.6 H.261 Standard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
`
`x
`
`IPR2018-01413
`Sony EX1036 Page 14
`
`
`
`3.6.1 Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
`
`3.6.2 Encoder Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
`
`3.6.3 Heuristics for Coding Control . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
`
`3.6.4 Rate Control
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
`
`3.7 MPEG Standards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
`
`3.7.1 Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
`
`3.7.2 Encoder Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
`
`3.7.3 Layers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
`
`3.7.4 Video Buffering Verifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
`
`3.7.5 Rate Control
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
`
`4 Motion Estimation for Low-Bit-Rate Video Coding
`
`65
`
`4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
`
`4.2 PVRG Implementation of H.261 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
`
`4.3 Explicit Minimization Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
`
`4.3.1 Algorithm M1
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
`
`4.3.2 Algorithm M2
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
`
`4.3.3 Algorithm RD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
`
`4.3.4 Experimental Results
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
`
`4.4 Heuristic Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
`
`4.4.1 Heuristic Cost Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
`
`4.4.2 Experimental Results
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
`
`4.4.2a
`
`Static Cost Function . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
`
`4.4.2b Adaptive Cost Function . . . . . . . . . . . . . . . . . . . . . . . . . 75
`
`4.4.3 Further Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
`
`4.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
`
`4.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
`
`5 Bit-Minimization in a Quadtree-Based Video Coder
`
`85
`
`5.1 Quadtree Data Structure
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
`
`5.1.1 Quadtree Representation of Bi-Level Images . . . . . . . . . . . . . . . . . . . 85
`
`5.1.2 Quadtree Representation of Motion Vectors . . . . . . . . . . . . . . . . . . . 86
`
`5.2 Hybrid Quadtree/DCT Video Coder . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
`
`5.3 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
`
`5.4 Previous Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
`
`5.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
`
`6 Lexicographically Optimal Bit Allocation
`
`93
`
`6.1 Perceptual Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
`
`6.2 Constant Quality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
`
`6.3 Bit-Production Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
`
`6.4 Buffer Constraints
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
`
`xi
`
`IPR2018-01413
`Sony EX1036 Page 15
`
`
`
`6.4.1 Constant Bit Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
`
`6.4.2 Variable Bit Rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
`
`6.4.3 Encoder vs. Decoder Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
`
`6.5 Buffer-Constrained Bit-Allocation Problem . . . . . . . . . . . . . . . . . . . . . . . 99
`
`6.6 Lexicographic Optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
`
`6.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
`
`6.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
`
`7 Lexicographic Bit Allocation under CBR Constraints
`
`105
`
`7.1 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
`
`7.2 CBR Allocation Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
`
`7.2.1 DP Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
`
`7.2.2 Correctness of DP Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
`
`7.2.3 Constant-Q Segments
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
`
`7.2.4 Verifying a Constant-Q Allocation . . . . . . . . . . . . . . . . . . . . . . . . 113
`
`7.2.5 Time and Space Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
`
`7.3 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
`
`7.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
`
`8 Lexicographic Bit Allocation under VBR Constraints
`
`117
`
`8.1 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
`
`8.2 VBR Allocation Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
`
`8.2.1 VBR Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
`
`8.2.2 Correctness of VBR Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 125
`
`8.2.3 Time and Space Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
`
`8.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
`
`9 Implementation of Lexicographic Bit Allocation
`
`129
`
`9.1 Perceptual Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
`
`9.2 Bit-Production Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
`
`9.2.1 Hyperbolic Model
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
`
`9.2.2 Linear-Spline Model
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
`
`9.3 Picture-Level Rate Control
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
`
`9.3.1 Closed-Loop Rate Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
`
`9.3.2 Open-Loop Rate Control
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
`
`9.3.3 Hybrid Rate Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
`
`9.4 Buffer Guard Zones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
`
`9.5 Encoding Simulations
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
`
`9.5.1
`
`Initial Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
`
`9.5.2 Coding a Longer Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
`
`9.6 Limiting Lookahead . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
`
`9.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
`
`xii
`
`IPR2018-01413
`Sony EX1036 Page 16
`
`
`
`9.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
`
`10 Extensions of the Lexicographic Framework
`
`149
`
`10.1 Applicability to Other Coding Domains
`
`. . . . . . . . . . . . . . . . . . . . . . . . . 149
`
`10.2 Multiplexing VBR Streams over a CBR Channel
`
`. . . . . . . . . . . . . . . . . . . . 149
`
`10.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
`
`10.2.2 Multiplexing Model
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
`
`10.2.3 Lexicographic Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
`
`10.2.4 Equivalence to CBR Bit Allocation . . . . . . . . . . . . . . . . . . . . . . . . 153
`
`10.3 Bit Allocation with a Discrete Set of Quantizers
`
`. . . . . . . . . . . . . . . . . . . . 154
`
`10.3.1 Dynamic Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
`
`10.3.2 Lexicographic Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
`
`Bibliography
`
`155
`
`xiii
`
`IPR2018-01413
`Sony EX1036 Page 17
`
`
`
`xivxiv
`
`IPR2018-01413
`Sony EX1036 Page 18
`
`IPR2018-01413
`Sony EX1036 Page 18
`
`
`
`List of Tables
`
`2.1 Distribution of bits for coding paper2 with various threshold values.
`
`. . . . . . . . . 26
`
`2.2 Compression results for files in the Calgary corpus. . . . . . . . . . . . . . . . . . . . 31
`
`4.1 Distribution of bits for intraframe coding of the Miss America sequence . . . . . . . 66
`
`4.2 Results of static heuristic cost function.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . 75
`
`4.3 Results of adaptive heuristic cost function. . . . . . . . . . . . . . . . . . . . . . . . . 77
`
`9.1 Parameters for MPEG-2 Simulation Group software encoder used to encode the
`
`SIF-formatted video clips.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
`
`9.2 Summary of initial coding experiments.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . 138
`
`9.3 Parameters for MPEG-2 Simulation Group software encoder used to encode the IBM
`
`commercial. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
`
`9.4 Summary of coding simulations with IBM Commercial.
`
`. . . . . . . . . . . . . . . . 145
`
`xv
`
`IPR2018-01413
`Sony EX1036 Page 19
`
`
`
`xvi
`
`IPR2018-01413
`Sony EX1036 Page 20
`
`IPR2018-01413
`Sony EX1036 Page 20
`
`
`
`List of Figures
`
`1.1 Example of context models created by the PPM method.
`
`. . . . . . . . . . . . . . . 11
`
`1.2 Snapshot of LZ77 coder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
`
`1.3 Snapshot of LZ78 coder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
`
`1.4 Snapshot of LZFG coder.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
`
`1.5 PATRICIA trie after insertion of the next phrase, caac.
`
`. . . . . . . . . . . . . . . . 18
`
`1.6 Statistical equivalent of a dictionary coder.
`
`. . . . . . . . . . . . . . . . . . . . . . . 19
`
`2.1 Example of a DPM coder with maximum order o = 3.
`
`. . . . . . . . . . . . . . . . . 23
`
`2.2 Distribution of bits versus context threshold Tσ.
`
`. . . . . . . . . . . . . . . . . . . . 27
`
`3.1 Block diagram of a video digitizer.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
`
`3.2 Scanning techniques for spatial sampling of a video image. . . . . . . . . . . . . . . . 37
`
`3.3 Example of uniform quantization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
`
`3.4 Color subsampling formats, as specified in the MPEG-2 standard.
`
`. . . . . . . . . . 39
`
`3.5 Rate-distortion function for a Gaussian source with σ = 1. . . . . . . . . . . . . . . . 41
`
`3.6 Sample operational rate-distortion plot.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . 42
`
`3.7 Comparison of coders in a rate-distortion framework. . . . . . . . . . . . . . . . . . . 42
`
`3.8 Example of a trellis constructed with the Viterbi algorithm. . . . . . . . . . . . . . . 44
`
`3.9 Graphical interpretation of Lagrange-multiplier method.
`
`. . . . . . . . . . . . . . . . 45
`
`3.10 Typical quantization matrix applied to 2D-DCT coefficients. . . . . . . . . . . . . . . 48
`
`3.11 Zig-zag scan for coding quantized transform coefficients
`
`. . . . . . . . . . . . . . . . 49
`
`3.12 Block diagram of a simple frame-differencing coder. . . . . . . . . . . . . . . . . . . . 49
`
`3.13 Block diagram of a generic motion-compensated video encoder.
`
`. . . . . . . . . . . . 50
`
`3.14 Illustration of frames types and dependencies in motion compensation. . . . . . . . . 51
`
`3.15 Reordering of frames to allow for causal interpolative coding.
`
`. . . . . . . . . . . . . 51
`
`3.16 Illustration of the block-translation model. . . . . . . . . . . . . . . . . . . . . . . . . 52
`
`3.17 Structure of a macroblock. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
`3.18 Block diagram of a p × 64 source coder.
`3.19 Heuristic decision diagrams for coding control from Reference Model 8 [7]. . . . . . . 55
`
`. . . . . . . . . . . . . . . . . . . . . . . . . 54
`
`3.20 Block diagram of rate control in a typical video coding system.
`
`. . . . . . . . . . . . 56
`
`3.21 Feedback function controlling quantization scale based on buffer fullness. . . . . . . . 57
`
`3.22 Block diagram of a typical MPEG encoder.
`
`. . . . . . . . . . . . . . . . . . . . . . . 58
`
`xvii
`
`IPR2018-01413
`Sony EX1036 Page 21
`
`
`
`3.23 Block diagram of the MPEG Video Buffering Verifier.
`
`. . . . . . . . . . . . . . . . . 59
`
`3.24 Block diagram of a fixed-delay CBR video transmission system. . . . . . . . . . . . . 60
`
`3.25 Block diagram of a stored-video system using double buffering.
`
`. . . . . . . . . . . . 61
`
`4.1 Distribution of bits for intraframe coding of the Miss America sequence.
`
`. . . . . . . 66
`
`4.2 Comparison of explicit-minimization motion estimation algorithms . . . . . . . . . . 70
`
`4.3 Density plots of DCT coding bits vs. MAD prediction error. . . . . . . . . . . . . . . 72
`
`4.4 Density plots of MSE reconstruction distortion vs. MAD prediction error.
`
`. . . . . . 73
`
`4.5 Results of static heuristic cost function.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . 74
`
`4.6 Results of adaptive heuristic cost function. . . . . . . . . . . . . . . . . . . . . . . . . 77
`
`4.7 Frame 27 of the Miss America sequence as encoded using the PVRG and explicit-
`
`minimization motion estimation algorithms. . . . . . . . . . . . . . . . . . . . . . . . 78
`
`4.8 Frame 27 of the Miss America sequence as encoded using the heuristic motion esti-
`
`mation algorithms.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
`
`4.9 Estimated motion vectors for frame 27 of the Miss America sequence for the PVRG,
`
`RD, H1-WH, and H2-WH coders. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
`
`4.10 Performance of motion estimation algorithms on eight test sequences.
`
`. . . . . . . . 81
`
`4.11 Distribution of bits for coding the Miss America sequence with adaptive heuristics. . 82
`
`5.1 A simple quadtree and corresponding image.
`
`. . . . . . . . . . . . . . . . . . . . . . 86
`
`5.2 Representation of a triangle using a quadtree of depth 5. . . . . . . . . . . . . . . . . 87
`
`5.3 Quadtree representation of a motion field.
`
`. . . . . . . . . . . . . . . . . . . . . . . . 87
`
`5.4 MSE vs. Rate for Trevor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
`
`6.1 Sample plot of buffer fullness for CBR operation.
`
`. . . . . . . . . . . . . . . . . . . . 98
`
`6.2 Sample plot of buffer fullness for VBR operation. . . . . . . . . . . . . . . . . . . . . 99
`
`7.1 Sketch for proof of Lemma 7.2.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
`
`7.2 Illustration of search step in dynamic programming algorithm.
`
`. . . . . . . . . . . . 112
`
`9.1 Several instances of a simple “hyperbolic” bit-production model.
`
`. . . . . . . . . . . 130
`
`9.2 Example of a linear-spline interpolation model.
`
`. . . . . . . . . . . . . . . . . . . . . 132
`
`9.3 Guard zones to safeguard against underflow and overflow of VBV buffer. . . . . . . . 135
`
`9.4 Evolution of buffer fullness for CBR coders.
`
`. . . . . . . . . . . . . . . . . . . . . . . 137
`
`9.5 Evolution of buffer fullness for VBR coders.
`
`. . . . . . . . . . . . . . . . . . . . . . . 138
`
`9.6 Nominal quantization scale for CBR coders. . . . . . . . . . . . . . . . . . . . . . . . 139
`
`9.7 Nominal quantization scale for VBR coders. . . . . . . . . . . . . . . . . . . . . . . . 140
`
`9.8 PSNR for CBR coders. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
`
`9.9 PSNR for VBR coders. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
`
`9.10 Evolution of buffer fullness for coding IBM Commercial. . . . . . . . . . . . . . . . . 145
`
`9.11 Nominal quantization scale for coding IBM Commercial. . . . . . . . . . . . . . . . . 146
`
`9.12 PSNR for coding IBM Commercial. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
`
`xviii
`
`IPR2018-01413
`Sony EX1036 Page 22
`
`
`
`10.1 Example of how three VBR bitstreams can be multiplexed into the same channel as
`
`two CBR bitstreams, for a statistical multiplexing gain of 1.5. . . . . . . . . . . . . . 151
`
`10.2 System for transmitting multiple sequences over a single channel. . . . . . . . . . . . 151
`
`10.3 Block diagram of encoder/multiplexer. . . . . . . . . . . . . . . . . . . . . . . . . . . 152
`
`10.4 Operation of multiplexer.
`
`. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
`
`10.5 Block diagram of demultiplexer/decoder. . . . . . . . . . . . . . . . . . . . . . . . . . 152
`
`xix
`
`IPR2018-01413
`Sony EX1036 Page 23
`
`
`
`Introduction
`
`In this thesis, we restrict our attention to the compression of digital data that consist of a sequence of
`
`symbols chosen from a finite alphabet. In order for data compression to be meaningful, we assume
`
`that there is a standard representation for the uncompressed data that codes each symbol