United States Patent [19]
Normile et al.

US005461679A

[11] Patent Number: 5,461,679
[45] Date of Patent: Oct. 24, 1995

[54] METHOD AND APPARATUS FOR ENCODING/DECODING IMAGE DATA

[75] Inventors: James O. Normile, Sunnyvale; Chia L. Yeh, Saratoga; Daniel W. Wright, Sunnyvale; Ke-Chiang Chu, Saratoga, all of Calif.

[73] Assignee: Apple Computer, Inc., Cupertino, Calif.

[21] Appl. No.: 62,067

[22] Filed: May 14, 1993

Related U.S. Application Data

[63] Continuation of Ser. No. 705,284, May 24, 1991, Pat. No. 5,212,742.

[51] Int. Cl.6 .......................... G06K 9/36
[52] U.S. Cl. ........................... 382/304; 382/305; 395/650; 395/163; 395/474
[58] Field of Search ................... 382/56, 27, 41, 49; 340/752; 364/133; 395/114, 115, 425

[56] References Cited

U.S. PATENT DOCUMENTS

4,174,514  11/1979  Sternberg ............... 382/49
4,484,349  11/1984  McCubbrey ............... 382/49
4,665,556   5/1987  Fukushima et al. ........ 382/49
4,684,997   8/1987  Romeo et al. ............ 382/56
5,070,531  12/1991  Schuerman et al. ........ 382/27

Primary Examiner: Yon J. Couso
Attorney, Agent, or Firm: Blakely, Sokoloff, Taylor & Zafman

[57] ABSTRACT

An apparatus and method for processing video data for compression/decompression in real-time. The apparatus comprises a plurality of compute modules, in a preferred embodiment, for a total of four compute modules coupled in parallel. Each of the compute modules has a processor, dual port memory, scratch-pad memory, and an arbitration mechanism. A first bus couples the compute modules and a host processor. Lastly, the device comprises a shared memory which is coupled to the host processor and to the compute modules with a second bus. The method handles assigning portions of the image for each of the processors to operate upon.

15 Claims, 11 Drawing Sheets

[Front-page drawing: reduced copy of FIG. 4, the system architecture (Host 410, Display Controller 426, Frame Buffers 427 and 430, compute Modules 0-3 on a 32-bit bus, 2 MB Shared Memory).]

Ex. 1045
IPR2016-00923, HTC v. PUMA
Page 1 of 23
[Sheet 1 of 11, FIG. 1 (Prior Art): block diagram of the prior art CCITT encoder 100. Blocks: DCT 101, Quantizer 102, Variable Length Coding 103, Multiplexing 104, Motion Estimation 105, Intra/Inter 106, Frame Memory 107, Inverse Quantizer 108, Inverse DCT 109, Rate Control 111, Buffer 112; Input Image 110 in; Compressed Video Out 120 and Side Information out.]
[Sheet 2 of 11, FIG. 2 (Prior Art): block diagram of the prior art CCITT decoder 200. Blocks: Demultiplexing 201, Variable Length Decoding 202, Inverse Quantizer 203, Inverse DCT 204, Motion Compensation 206, Frame Buffer 207; Compressed Video Input 210 in; Side Information 208; Video Out 220.]
[Sheet 3 of 11, FIG. 3 (Prior Art): prior art scheme for representing motion video as a series of frames.]
[Sheet 4 of 11, FIG. 4: architecture of the video processor of the preferred embodiment. Host 410 on Bus 425 with Display 428, Frame Buffer 427 and Display Controller 426; compute Modules 0-3 (401, 402, 403, 404) on 32-bit bus 420, sharing the 2 MB Shared Memory 405 and Frame Buffer 430; element 440.]
[Sheet 5 of 11, FIG. 5: one compute module. Digital Signal Processor 501, Arbitration Mechanism 502, Local Memory (64K) 503, and Dual Port Memory (16K) 504, with 32-bit connections 520 and 530 to Bus 425.]
[Sheet 6 of 11, FIG. 6: partitioning of an image at time t and time t+1 into strips (elements 602, 603, 606, 607, 608, 611) for processing by the computing modules.]
[Sheet 7 of 11, FIG. 7a: the preferred embodiment's method of encoding motion video along a time axis. Legend: forward facing key frames (intra frames); inter frames; backward facing key frames.]
[Sheet 8 of 11, FIG. 7b: detailed representation of one frame. Fields: Frame ID 751, Forward Pointer 752, Backward Pointer 753, Variable Length Data 754.]
[Sheet 9 of 11, FIG. 8a: the improved CCITT encoder 800 of the preferred embodiment. Blocks: Input Image 801, DCT 802, Quantizer 803, Variable Length Coding 804, Multiplexing 805, Inverse Quantizer 806, Inverse DCT 807, Rate Control 808, Buffer, an Intra/Inter/Still/Add decision block, Frame Difference and Frame Memory blocks (813, 814), and Scene Detector 815; Side Information and Compressed Video Out.]
[Sheet 10 of 11, FIG. 8b: the scene change detector 815. Blocks include Frame Difference, Threshold, u and v Energy Computation, and Decision (elements 850, 851, 852, 853).]
[Sheet 11 of 11, FIG. 9: the enhanced CCITT decoder 900 of the preferred embodiment. Blocks: Compressed Video Input 910, Demultiplexing 901, Variable Length Decoding 902, Inverse Quantizer 903, Inverse DCT, Side Information 908, Frame Differencing 906, Play Control, Frame Buffer 907, Forward/Backward Play Control Signal 911; Video Out 920.]
METHOD AND APPARATUS FOR ENCODING/DECODING IMAGE DATA

This is a continuation of Ser. No. 07/705,284, filed May 24, 1991, now U.S. Pat. No. 5,212,742.

BACKGROUND OF THE INVENTION
The number of minutes of storage available on a 600 megabyte disk, as a function of image size D and frame rate (this table is referenced again at the end of this discussion):

Image Size        Frame Rate per second
D              5        10       15       20       25       30
64           244.20   122.10    81.40    61.06    48.84    40.70
128           61.05    30.52    20.35    15.26    12.21    10.17
256           15.26     7.63     5.08     3.81     3.05     2.54
512            3.81     1.90     1.27     0.95     0.76     0.63
1. Field of the Invention

This invention relates to the field of video imaging systems. More specifically, this invention relates to an improved method and apparatus for video encoding/decoding.

2. Description of the Related Art

Due to the storage requirements, recent demands for full motion video in such applications as video mail, video telephony, video teleconferencing, image database browsing, multimedia, and other applications have required that standards be introduced for video compression. One image of 35 mm slide quality resolution requires 50 megabytes of data to be represented in a computer system (this number is arrived at by multiplying the horizontal resolution by the vertical resolution by the number of bits to represent the full color range, or 4096x4096x8x3 [R+G+B] bits / 8 = 50,331,648 bytes). One frame of digitized NTSC (National Television Standards Committee) quality video comprising 720x480 pixels requires approximately one half megabyte of digital data to represent the image (720x480x1.5 bytes per pixel). In an NTSC system which operates at approximately 30 frames per second, digitized NTSC-quality video will therefore generate approximately 15.552 megabytes of data per second. Without compression, assuming a storage capability of one gigabyte with a two megabytes per second access rate, it is possible to:

a. store 65 seconds of live video on the disk and to play it back at 3 frames per second; or

b. store 21 high quality still images, taking 24 seconds to store or retrieve one such image.

Assuming that a fiber distributed data interface (FDDI) is available with a bandwidth of 200 megabits per second, 1.5 channels of live video can be accommodated, or 35 mm quality still images can be transmitted at the rate of one every two seconds. With currently available technology in CD-ROM, a likely distribution medium for products containing video, the current transfer rate is approximately 0.18 megabytes per second. 0.37 megabytes per second may be attained with CD-ROM in the near future.

For illustration, take the variable parameters to be the horizontal and vertical resolution and the frame rate, and assume that 24 bits are used to represent each pixel. Let D represent the horizontal or vertical dimension and assume an aspect ratio of 4:3. The data rate in megabytes per second as a function of frame rate and image size is tabulated below.
It is obvious from data rate and storage considerations that data compaction is required in order for full motion video to be attained.

In light of these storage and rate problems, some form of video compression is required in order to reduce the amount of storage and increase the throughput required to display full-motion video in a quality closely approximating NTSC. Photographic and, to an even greater degree, moving images generally portray information which contains much repetition, smooth motion, and redundant information. Stated in an equivalent way, areas of an image are often correlated with each other, as are sequences of images over time. Keeping these facts in mind, several techniques have been established which eliminate redundancy in video imaging in order to compress these images to a more manageable size which requires less storage, and may be displayed at a fairly high rate. Some simple compression techniques include:

1. Horizontal and Vertical Subsampling: Sampling only a limited number of pixels horizontally or vertically across an image. The required reduction in resolution provides for poor quality images.

2. Reduction in Number of Bits Per Pixel: This technique, including the use of a Color Look Up Table, is currently used successfully to reduce from 24 to 8 bits per pixel. A reduction of approximately 3-1 is the useful limit of this method.

3. Block Truncation Coding and Color Cell Methods: Block truncation coding (BTC) was developed by Bob Mitchell in the early 1980's, targeted at low compression rate and high quality applications (Robert Mitchell, et al., Image Compression Using Block Truncation Coding, IEEE Trans. Comm., pp. 1335-1342, Vol. Com-27, No. 9, Sept. 1979). In this scheme, the first order statistics (mean) and the second order statistics (variance) of each pixel block are extracted and transmitted. The image is reconstructed using these two quantities. An 8-1 compression ratio with 4x4 block sizes was demonstrated in (Graham Campbell, Two Bit/Pixel Full Color Encoding, pp. 215-223, Proceedings of SIGGRAPH '86, Vol. 20, No. 4, Aug. 1986).

4. Vector Quantization (VQ): A simple VQ maps discrete k-dimensional vectors into a digital sequence for transmission or storage. Each vector (a block of 4x4 or 3x3 pixels) is compared to a number of templates in the code book, and the index of the best matched template is transmitted to the receiver. The receiver uses the index for table look-up to reconstruct the image. A simple VQ could provide about 20-1 compression with good quality. A more complex VQ scheme has been demonstrated to provide similar quality to the CCITT (International Consultative Committee for Telephony & Telegraphy) DCT (Discrete Cosine Transformation) scheme recommendation H.261 (T. Murakami, Scene Adaptive Vector Quantization for Image Coding, Globecom, 1988).
5. Predictive Techniques: The assumption on which this family of methods relies is that adjacent pixels are correlated. As a consequence, data reduction can be accomplished by predicting pixel values based on their neighbors. The difference between the predicted and the actual pixel value is then encoded. An extensive body of work exists on this technique and variations on it (O'Neil, J. B., Predictive Quantization Systems for Transmission of TV Signals, Bell System Technical Journal, pp. 689-721, May/June 1966).

The data rate in megabytes per second, as a function of image size D and frame rate (referenced earlier in this discussion):

Image Size        Frame Rate per second
D              5        10       15       20       25       30
64            0.04     0.08     0.12     0.16     0.20     0.24
128           0.16     0.33     0.49     0.65     0.82     0.98
256           0.65     1.31     1.96     2.62     3.27     3.93
512           2.62     5.24     7.86    10.48    13.10    15.72

Or, formulated in a slightly different way, the number of minutes of storage on a 600 megabyte disk is as given in the first table above.

The compression ratio to be expected from each of these simple methods is between four and eight to one.
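The block-truncation idea described in item 3 above (transmit each block's mean and variance, reconstruct a two-level block) can be sketched as follows. This is a generic illustration, not code from the patent; the level-assignment formula used is the standard one that preserves the block's mean and variance, and the 4x4 block size is the one mentioned in the text.

```python
from statistics import mean, pstdev

def btc_encode(block):
    """Encode one pixel block as (mean, std, bitmap): the first- and
    second-order statistics plus a 1 bit/pixel threshold map."""
    m, s = mean(block), pstdev(block)
    bitmap = [1 if p >= m else 0 for p in block]
    return m, s, bitmap

def btc_decode(m, s, bitmap):
    """Reconstruct a two-level block from the transmitted statistics."""
    q = sum(bitmap)                       # number of pixels at or above the mean
    n = len(bitmap)
    if q in (0, n):                       # flat block: a single level suffices
        return [m] * n
    lo = m - s * (q / (n - q)) ** 0.5     # levels chosen so that the decoded
    hi = m + s * ((n - q) / q) ** 0.5     # block keeps the same mean and variance
    return [hi if b else lo for b in bitmap]

block = [12, 14, 200, 210, 13, 15, 205, 199, 11, 16, 208, 202, 12, 14, 201, 206]
m, s, bits = btc_encode(block)
print(btc_decode(m, s, bits))
```

Transmitting two statistics plus 16 bits for a 4x4 block is what yields compression ratios in the neighborhood of the 8-1 figure quoted above.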
More complex techniques for video compression are also known in the art. It is possible to achieve data compression of between four and eight to one by using some of the simpler techniques mentioned above. To achieve comparable quality at compression ratios from twenty to forty to one involves a superlinear increase in complexity. In this case, it is no longer appropriate to consider the compression process as a simple one-step procedure.

In general, lossless compression techniques attempt to whiten or decorrelate a source signal. Intuitively, this makes sense in that a decorrelated signal cannot be compressed further or represented more compactly. For compression ratios of greater than twenty to one, a lossy element must be introduced somewhere into the process. This is usually done through a temporal or spatial resolution reduction used in conjunction with a quantization process. The quantization may be either vector or scalar. The quantizer should be positioned so that a graceful degradation of perceived quality with an increasing compression ratio results.

Many of the succeeding methods are complex, but may be broken into a series of simpler steps. The compression process can be viewed as a number of linear transformations followed by quantization. The quantization is in turn followed by a lossless encoding process. The transformations applied to the image are designed to reduce redundancy in a representational, spatial and temporal sense. Each transformation is described individually.

DECORRELATION

Although the RGB representation of digitized images is common and useful, it is not particularly efficient. Each one of the red, green and blue constituents is potentially of full bandwidth, although much of the information is correlated from plane to plane. The first step in the compression process is to decorrelate the three components. If an exact transformation were used, it would be image dependent, and as a consequence computationally expensive. A useful approximation which does not depend on image statistics is the following:
    ( Y )   ( .3    .59   .11 ) ( R )
    ( U ) = ( .6   -.28  -.32 ) ( G )
    ( V )   ( .21  -.52   .31 ) ( B )

In the case of NTSC images, the resulting U and V components (the chrominance components containing the color information) are of lower bandwidth than the Y component (the luminance component containing the monochrome information). In general, the U and V components are of less perceptual importance than the Y component. The next stage in the compression process usually consists of subsampling U and V horizontally and vertically by a factor of two or four. This is done by low pass filtering followed by decimation. At this point in the process, much of the interplane redundancy has been removed, and a data reduction by a factor of two has been achieved.

REDUCTION OF TEMPORAL REDUNDANCY

Reduction of temporal redundancy may be achieved simply by taking the difference between successive frames. In the case of no motion and no scene change, this frame difference will be zero. The situation is more complex when there is interframe motion. In this case, some reregistration of the portions of the image which have moved is required prior to frame differencing. This is done by estimating how far pixels have moved between frames. Allowing for this movement, corresponding pixels are again subtracted (Ronald Plompen, et al., Motion Video Coding in CCITT SG XV - The Video Source Coding, pp. 997-1004, IEEE Global Telecommunications Conference, Dec. 1988). The motion vectors are coded and transmitted with the compressed bit stream. These vectors are again used in the decoding process to reconstruct the images. The distinction between frame differencing and differencing using motion estimation may be expressed as follows. In the case of simple differencing, the error between frames is calculated as:

    e(x, y, t) = f(x, y, t+1) - f(x, y, t)

The error using motion estimation may be written as:

    e(x, y, t) = f(x + Δx, y + Δy, t+1) - f(x, y, t)

where Δx and Δy are the calculated displacements in the x and y directions respectively.

REDUCTION OF SPATIAL REDUNDANCY

In most images, adjacent pixels are correlated. Reducing spatial redundancy involves removing this correlation. The removal is achieved using a linear transformation on the spatial data. In the ideal case, this transform should depend on image statistics. Such a transform does exist and is known as the Hotelling or Karhunen-Loeve (KL) transform (N. S. Jayant, Peter Noll, Digital Coding of Waveforms, Prentice Hall, Signal Processing Series, p. 58). As it is computationally expensive and does not have a fast algorithmic implementation, it is used only as a reference against which to evaluate other transforms. A variety of other transforms have been applied to the problem, including: Fourier, Walsh, Slant and Hadamard (Arun N. Netravali, Barry G. Haskell, Digital Pictures: Representation and Compression, Plenum Press). The cosine transform provides the best performance (in the sense of being close to the KL transform). The discrete cosine transform is defined in the following way:

    θ(k, l) = (a(k) a(l) / 2) · Σ_{m=0}^{N-1} Σ_{n=0}^{N-1} x(m, n) · cos(πk(2m+1) / 2N) · cos(πl(2n+1) / 2N)

where x(m, n) is an N×N field (the block size); k, l, m and n all range from 0 to N-1; a(0) = 1/√2; and a(j) = 1 for j ≠ 0.

A range of DCT "block sizes" have been investigated for image compression (A. Netravali, et al., Picture Coding: A Review, Proc. IEEE, pp. 366-406, March 1980), and standards bodies have decided, apparently in an arbitrary fashion, that 8x8 blocks are "best." Adaptive block sizes have also been considered, with the adaptation driven by image activity in the area to be transformed (see Chen, C. T., "Adaptive Transform Coding Via Quad-Tree-Based Variable Block Size," Proceedings of ICASSP '89, pp. 1854-1857). In summary, a combination of the above techniques, as applied to a raw video image, would be performed as follows:

1. Digitize the image;
2. transform RGB to YUV;
3. remove temporal redundancy (through frame differencing and motion compensation);
4. remove spatial redundancy (through a discrete cosine transform); and
5. entropy encode the data (using Huffman coding).

This process yields the maximum compression possible using prior state of the art techniques.
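The discrete cosine transform defined earlier in this section can be implemented directly, term for term, as below. The quadruple loop is O(N^4) per block; practical coders use fast factorizations instead, but this sketch follows the formula as written, with N = 8 as the block size the standards settled on.

```python
from math import cos, pi, sqrt

def dct2d(x):
    """2-D DCT as defined above: theta(k, l) = (a(k)a(l)/2) *
    sum over m, n of x[m][n] cos(pi k(2m+1)/2N) cos(pi l(2n+1)/2N)."""
    N = len(x)
    a = lambda j: 1 / sqrt(2) if j == 0 else 1.0
    theta = [[0.0] * N for _ in range(N)]
    for k in range(N):
        for l in range(N):
            s = sum(x[m][n]
                    * cos(pi * k * (2 * m + 1) / (2 * N))
                    * cos(pi * l * (2 * n + 1) / (2 * N))
                    for m in range(N) for n in range(N))
            theta[k][l] = a(k) * a(l) / 2 * s
    return theta

# For a flat block, only the DC coefficient theta(0, 0) is non-zero.
flat = [[1.0] * 8 for _ in range(8)]
coeffs = dct2d(flat)
```

With this definition's normalization, the flat 8x8 block of ones produces a DC coefficient of 16 and zero in every other position, which is a quick consistency check on an implementation.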
COMPRESSION STANDARDS

Three examples of state of the art compression methods using some of these techniques are known as: CCITT H.261 (from the International Consultative Committee for Telephony and Telegraphy); JPEG (Joint Photographers Experts Group); and MPEG (the Motion Picture Experts Group). The JPEG standard was developed for encoding photographic still images and sets forth a basic technique for encoding video image data. The technique converts 8x8 pixel blocks of the source image using a discrete cosine transformation (DCT) function, with each block of pixels being represented in YUV source format (representing luminance and chrominance information for the block). Blocks of DCT coefficients are then thresholded using psychovisual thresholding matrices to quantize the results of the 8x8 DCT macroblocks of the source image. Finally, each of the blocks is entropy encoded. The decoding process reverses these steps.

The CCITT H.261 standard was developed for use in video teleconferencing and video telephony applications. It can operate at 64 kilobits (Kbits) per second to 1.92 megabits (Mbits) per second, and can operate upon images between 525 and 625 lines based upon a common intermediate format (CIF). It is performed using a method as shown in FIG. 1.

The CCITT encoder 100 consists of a DCT, a zig-zag scanned quantization, and Huffman coding. DCT 101, quantizer 102, and variable length coding 103 blocks perform the coding function. Finally, multiplexer 104 combines the Huffman code from the variable length coding block 103, motion vector data from motion estimation block 105, quantizer data from quantizer block 102, and intra/inter type information from intra/inter block 106, and performs formatting and serializing, video synchronization and block addressing. Frame memory 107 is used to determine differences between the previous frame and the current frame using motion estimation block 105. CCITT encoder 100 further comprises inverse quantizer 108 and inverse DCT function 109 to provide frame difference information. Lastly, information multiplexed by multiplexer 104 is passed to rate control 111 and buffer 112 for output as compressed bit stream 120.
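The motion estimation performed by a block such as 105 can be sketched as an exhaustive block search: for each block of the current frame, find the displacement into the previous frame that minimizes an error measure, then difference against the displaced block. The 4x4 block size, the +/-2 search range, and the sum-of-absolute-differences criterion below are illustrative choices of mine, not values taken from the standard.

```python
def best_motion_vector(prev, curr, bx, by, bs=4, search=2):
    """Exhaustive search for the displacement (dx, dy) minimizing the sum
    of absolute differences (SAD) between the bs x bs block at (bx, by)
    in curr and a displaced block in prev."""
    h, w = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # the displaced block must lie entirely inside the previous frame
            if not (0 <= bx + dx <= w - bs and 0 <= by + dy <= h - bs):
                continue
            sad = sum(abs(curr[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
                      for y in range(bs) for x in range(bs))
            if best is None or sad < best[0]:
                best = (sad, dx, dy)
    return best  # (SAD, dx, dy); the residual e(x, y, t) is formed after displacement
```

The returned vector is what would be handed to the multiplexer as motion vector data, while the displaced difference goes through the DCT and quantizer path.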
The CCITT decoder is shown in FIG. 2 as 200. Demultiplexing block 201 takes the encoded bit stream 210, identifies its constituents and routes them to the relevant parts of decoder 200. The main function of variable length decoding block 202 and inverse quantizer block 203 is to reconstruct the DCT coefficients from their Huffman encoded values, rescale these values and pass them on to inverse DCT block 204. Inverse DCT block 204 takes coefficients in blocks of 8x8 and reconstitutes the underlying spatial video information. If the macro block is intra-coded, no motion estimation is invoked. If it is inter-coded, the output is a difference between the information in this frame and the motion-compensated information in the last frame. A motion vector transmitted from demultiplexer 201 via "side information" signal 208 determines the best block to be used for reconstruction from the last frame. Motion compensation is performed by block 206 from information of the current image in frame buffer 207. This is fed back into the decoded stream 205 and then output as decoded information 220 in CIF format. The Y and UV components share the same motion vector information. A detailed description of the CCITT H.261 standard is given in document No. 584, published on Nov. 10, 1989 by the Specialists Group on Coding For Visual Telephony, entitled Draft Revision of Recommendation H.261, published by the CCITT SG XV Working Party XV/1 (1989).
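The inter-coded reconstruction step just described (add the decoded difference to the motion-compensated block that the transmitted vector selects in the previous frame) can be sketched as below; the block size and the nested-list frame layout are illustrative assumptions, not the standard's data structures.

```python
def reconstruct_block(prev_frame, diff_block, bx, by, mv):
    """Rebuild an inter-coded block at (bx, by): fetch the block the motion
    vector mv = (dx, dy) points at in the previous frame, then add the
    decoded difference to it, pixel by pixel."""
    dx, dy = mv
    bs = len(diff_block)
    return [[prev_frame[by + dy + y][bx + dx + x] + diff_block[y][x]
             for x in range(bs)]
            for y in range(bs)]
```

An intra-coded block would skip this step entirely, since it carries a full description rather than a difference.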
The MPEG standard is the most recent compression specification to use transport methods which describe motion video. Though not fully finalized, the MPEG specification's goal is to obtain VHS quality on reconstruction of the images with a bit rate of approximately 1.15 megabits per second for video. This yields a total compression ratio of about 40-1. The distinguishing feature of MPEG from JPEG and CCITT H.261 is that MPEG provides a higher quality image than CCITT H.261, like JPEG, but allows motion. This is in contrast to JPEG, which only provides still-frame imagery and no audio. In addition, MPEG adds the feature of synchronized sound along with the encoded video data, although it has not been finalized. A detailed description of MPEG may be found in the document entitled MPEG Video Simulation Model 3 (SM3)-Draft No. 1, published by the International Organization for Standardization ISO-IEC/JTC1/SC2/WG8, Coded Representation of Picture and Audio Information ISO-IEC/JTC1/SC2/WG8 N MPEG 90/, published by A. Koster of PTT Research.

Some of the relative advantages and disadvantages of the various coding algorithms are as follows. JPEG provides no description of motion video at all. MPEG, although a full featured standard (it provides forward motion, backwards motion, and still frame), is still under development and undergoing revision. CCITT H.261, because it was developed for teleconferencing and video telephony, provides a moving source but has no provisions for viewing the motion picture images in a reverse direction, nor any means for still frame viewing. Therefore, a system is required which is fairly mature, such as the CCITT H.261 standard, but which also provides all the capabilities (including reverse play and still frame) of a full-featured compression system, such as the MPEG standard.
CCITT H.261 uses a scheme such as that shown in FIG. 3 in order to provide for full-motion video. FIG. 3 shows a series of frames which represents a particular section of moving video. Frames 301 and 302 contain full scene information for the image at the beginning of a series of frames. 301 and 302 are known as "intra" frames or keyframes which are used in CCITT H.261. Each intra frame 301 or 302 contains a full scene description of the frame at the time it appears. Although compressed, intra frames 301 and 302 contain substantial information. Each of the intervening frames between two intra frames 301 and 302 is known as an "inter" frame, shown as 303, 304, and 305. Each inter frame such as 303-305 contains information which should be added to the preceding frame. For example, inter frame 303 only contains information which has moved since intra frame 301. Therefore, the information represented in frames 303-305 may be substantially less than that contained in frames 301 and 302, because the inter frames contain only motion data, and not complete scene information for the entire frame. This provides a fairly high compression ratio for intervening inter frames 303-305. CCITT H.261 as represented in FIG. 3 is incapable of providing reverse motion video because a "key" frame, such as intra frame 301, only establishes information for the inter frames 303-305 which follow intra frame 301 in time. In other words, 303-305 only contain information which has moved since intra frame 301, not motion information from intra frame 302. An attempt to play such a sequence of frames in reverse will generate substantial distortion of the moving image.

Because a decompression rate of approximately 30 frames per second (FPS) is required for real-time moving video, the processor performing the decoding process must have a fairly high bandwidth and be able to handle all the necessary matrix-matrix multiply operations required by the decoding process in a short period of time. To date, no single device possesses the necessary computing power to decompress an incoming compressed bit stream at the necessary rate to make data available for NTSC quality video at the 30 frame per second rate.
SUMMARY AND OBJECTS OF THE INVENTION

One of the objects of the present invention is to provide an architecture and method which has sufficient computing power to allow compressed moving video images to be decompressed and displayed in real time.

This and other objects of the present invention are provided for by an apparatus for processing video data for compression/decompression in real-time which comprises a plurality of compute modules, in a preferred embodiment, for a total of four compute modules coupled in parallel. Each of the compute modules has a processor, dual port memory, scratch-pad memory, and an arbitration mechanism. In a preferred embodiment, the processor is a digital signal processor, and the device comprises 16 kilobytes of dual-port dynamic random access memory and 64 kilobytes of local "scratch pad" dynamic random access memory. A first bus couples the compute modules and a host processor. In a preferred embodiment, the host processor is coupled to a complete computer system comprising a display, memory, and o... instead of that normally connected to the host. This provides increased performance of the system as a whole, especially in video decompression tasks.

These and other objects of the present invention are provided for by a method in a computer system for partitioning an image for processing by a parallel computing system. The parallel computing system comprises N computing units. First, the total length of the image is determined, and is divided by N. The dividend is then stored in a first value; in a preferred embodiment, the first value is the width of the image portion to be assigned to each parallel computing unit. A first region of the image is assigned to a first computing unit, the first region starting at a first position and ending at the first position offset by the first value. Therefore, a portion of the image is assigned which, in a preferred embodiment, is the full image in width and H/N in length, wherein H is the length or height of the image and N is the total number of processors. Each of the N processors is assigned a corresponding section of the image according to its position relative to the first processor, each having a section which is the full width of the image and which is H/N in length. Height and width information is represented, in the preferred embodiment, in blocks containing luminance and chrominance information.

BRIEF DESCRIPTION OF DRAWINGS

The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate like elements and in which:

FIG. 1 shows a prior art video encoder.
FIG. 2 shows a prior art video decoder.
FIG. 3 shows a prior art scheme for representing motion video.
FIG. 4 shows the architecture of the video processor of the preferred embodiment.
FIG. 5 shows one compute module used in the preferred embodiment of the present invention.
FIG. 6 shows the partitioning of an image for processing by each of the computing modules of the preferred embodiment.
FIG. 7a shows the preferred embodiment's method of encoding motion video.
FIG. 7b is a detailed representation of one frame in the preferred embodiment.
FIG. 8a shows the improved CCITT encoder used in the preferred embodiment.
FIG. 8b is a detailed representation of the scene change detector of the preferred embodiment.
FIG. 9 shows the enhanced CCITT decoder used in the preferred embodiment.
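The partitioning method described in the Summary (full-width horizontal strips of height H/N, one per computing unit) can be sketched as follows. How a remainder is handled when N does not evenly divide H is an assumption of mine, since the text does not specify it.

```python
def partition_image(height, n_procs):
    """Assign each of n_procs computing units a full-width strip of the
    image: strip i covers rows [start, end), with heights as equal as
    possible (H/N each, any remainder spread over the first strips)."""
    base, extra = divmod(height, n_procs)
    strips, start = [], 0
    for i in range(n_procs):
        h = base + (1 if i < extra else 0)   # spread the remainder, if any
        strips.append((start, start + h))
        start += h
    return strips

# Four compute modules, as in the preferred embodiment:
print(partition_image(480, 4))   # -> [(0, 120), (120, 240), (240, 360), (360, 480)]
```

Each (start, end) pair identifies the rows one processor operates upon, so the strips tile the image with no overlap and no gaps.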
