`(12) Patent Application Publication (10) Pub. No.: US 2007/0274385 A1
(43) Pub. Date: Nov. 29, 2007
`(54) METHOD OF INCREASING CODING
`EFFICIENCY AND REDUCING POWER
`CONSUMPTION BY ON-LINE SCENE
`CHANGE DETECTION WHILE ENCODING
`INTER-FRAME
`
(76) Inventor: Zhongli He, Austin, TX (US)

Correspondence Address:
HAMILTON & TERRILE, LLP
P.O. BOX 203518
AUSTIN, TX 78720

(21) Appl. No.: 11/441,869

(22) Filed: May 26, 2006
`
`Publication Classification
`
`(51)
`
`Int. C.
`H04N 7/2
`H04N II/04
`
`(2006.01)
`(2006.01)
`
`(52) U.S. Cl. .............................. 375/240.12; 375/240.24
`
(57) ABSTRACT
A system and method for on-the-fly detection of scene changes within a video stream through statistical analysis of a portion of the macroblocks comprising each video frame as they are processed using inter-frame coding. If the statistical analysis of the selected macroblocks of the current frame differs from the previous frame by exceeding predetermined thresholds, the current video frame is assumed to be a scene change. Once a scene change is detected, the remainder of the video frame is encoded as an intra-frame, intra-macroblocks, or intra-slices, through implementation of one or more predetermined or adaptively adjusted quantization parameters to reduce computational complexity, decrease power consumption, and increase the resulting video image quality. As decoding is the inverse of encoding, these improvements are similarly recognized by a decoder as it decodes a resulting encoded video stream.
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
[Cover figure (reproduction of FIG. 3): video stream scene change detection system 300 — motion estimation process and coding cost estimation 310; intra-prediction process and coding cost estimation 312; mode decision with simple scene detection process 316; macroblock statistical analysis 318; predetermined number of coded macroblocks read 320; compute average MAD (or SAD) for all macroblocks so far 322; compute number of intra/inter modes so far 324; scene change detection algorithm 326 ("if ((AvgMAD > AvgMAD_Thres) || (NumIntraMB > NumIntraMB_Thres)) scene_change = 1; else scene_change = 0;"); scene change detected? 328; adjust quantization parameters 332; continue encoding with intra-mode spatial compensation 334; encode with inter-mode? 336; continue encoding with inter-mode motion compensation 338.]
`
`Unified Patents, LLC v. Elects. & Telecomm. Res. Inst., et al.
`
`Ex. 1037, p. 1
`
`
`
[FIGURE 1 (Prior Art), Sheet 1 of 6: video compression system 100 — previous frame 102 with macroblocks 104 serves as a reference for current frame 106; frame segmentation 108 produces macroblocks 110; motion estimation 112 yields matching macroblocks 114 and motion vectors 116, which feed prediction error coding 118 for transmission.]
`
`
`
`
[FIGURE 2, Sheet 2 of 6: prior art video stream encoding system 200; drawing text not recoverable from the extraction.]
`
`
`
`
[FIGURE 3, Sheet 3 of 6: video stream scene change detection system 300; drawing text not recoverable from the extraction.]
`
`
`
`
[Sheets 4 and 5 of 6 (FIGS. 4 and 5): drawing text not recoverable from the extraction.]
`
`
`
[FIGURE 6, Sheet 6 of 6: table of observed coding performance; table text not recoverable from the extraction.]
`
`
`
`
`
`METHOD OF INCREASING CODING
`EFFICIENCY AND REDUCING POWER
`CONSUMPTION BY ON-LINE SCENE
`CHANGE DETECTION WHILE ENCODING
`INTER-FRAME
`
BACKGROUND OF THE INVENTION

1. Field of the Invention

[0002] The present invention relates in general to the field of video stream encoding and, more specifically, to detecting a scene change within a video stream.

2. Description of the Related Art
[0004] The use of digitized video continues to gain acceptance for use in a variety of applications including high definition television (HDTV) broadcasts, videoconferencing with personal computers, delivery of streaming media over a wireless connection to a personal digital assistant (PDA), and interpersonal video conversations via cellular phone. Regardless of how it is used, implementation of digitized video in each of these devices is typically constrained by screen size and resolution, processor speed, power limitations, and the communications bandwidth that is available. Advances in video compression have helped address some of these constraints, such as facilitating the optimal use of available bandwidth. However, computational overhead, power consumption and image quality can still be problematic for some devices when encoding video streams, especially those containing frequent scene changes.
[0005] In general, there is relatively little change from one video frame to the next unless the scene changes. Video compression identifies and eliminates redundancies in a video stream and then inserts instructions in their place for reconstructing the video stream when it is decompressed. Similarities between frames can be encoded such that only temporal changes between frames, or spatial differences within a frame, are registered in the compressed video stream. For example, inter-frame compression exploits the similarities between successive video frames, known as temporal redundancy, while intra-frame compression exploits the spatial redundancy of pixels within a frame. While inter-frame compression is commonly used for encoding temporal differences between successive frames, it typically does not work well for scene changes due to the low degree of temporal correlation between frames from different scenes. Intra-frame coding, which uses image compression to reduce spatial redundancy within a frame, is better suited for encoding video frames containing scene changes.
[0006] However, the encoder must first determine whether the scene has changed before intra-frame encoding can be applied to the frame being processed. Prior art approaches for detecting scene changes within a video stream include comparing the entire contents of a temporal residual frame with a predetermined reference before the frame is coded, which requires additional CPU cycles and decreases encoding efficiency. Another approach processes a set of successive video frames in two passes to determine the ratio of bi-directional (B) and unidirectional (P) motion compensated frames to be encoded. While an impulse-like increase in motion costs can indicate a scene change in the video stream, the computational complexity of the approach is not well suited to wireless video devices. Frequent scene changes within a video stream can further increase the number of processor cycles, consume additional power, and further degrade encoding efficiency. In view of the foregoing, there is a need for improved detection of scene changes in a video stream that does not require pre-processing the entire contents of each video frame before the most appropriate encoding method can be implemented.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
[0007] The present invention may be understood, and its numerous objects, features and advantages obtained, when the following detailed description is considered in conjunction with the following drawings, in which:

[0008] FIG. 1 is a generalized block diagram depicting a prior art system for motion compensated video compression;

[0009] FIG. 2 is a generalized block diagram depicting a prior art system for changing video encoding modes when scenes change within a video stream;

[0010] FIG. 3 is a generalized block diagram of a video stream scene change detection system as implemented in accordance with an embodiment of the invention;

[0011] FIG. 4 is a generalized block diagram of a video stream scene change detection system as implemented in a video encoder system in accordance with an embodiment of the invention;

[0012] FIG. 5 is a generalized block diagram of a video stream scene change detection system as implemented in a video decoder system in accordance with an embodiment of the invention; and

[0013] FIG. 6 is a table depicting observed performance of a video stream scene change detection system as implemented in accordance with an embodiment of the invention.

[0014] Where considered appropriate, reference numerals have been repeated among the drawings to represent corresponding or analogous elements.
`
`DETAILED DESCRIPTION
[0015] A system and method is described for on-the-fly detection of scene changes within a video stream through statistical analysis of a portion of each video frame's macroblocks as they are processed using inter-frame encoding, thereby allowing all or the remainder of the macroblocks in the inter-frame to be encoded as an intra-frame, intra-slices, or intra-macroblocks, using adaptively adjusted or predetermined quantization parameters (QP) to reduce computational complexity, increase video coding efficiency, and improve video image quality.
[0016] Various illustrative embodiments of the present invention will now be described in detail with reference to the accompanying figures. While various details are set forth in the following description, it will be appreciated that the present invention may be practiced without these specific details, and that numerous implementation-specific decisions may be made to the invention described herein to achieve the device designer's specific goals, such as compliance with process technology or design-related constraints, which will vary from one implementation to another. While such a development effort might be complex and time-consuming, it would nevertheless be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure. For example, selected aspects are depicted with reference to simplified drawings in order to avoid limiting or obscuring the present invention. Such descriptions and representations are used by those skilled in the art to describe and convey the substance of their work to
`
`
`others skilled in the art. Various illustrative embodiments of
`the present invention will now be described in detail with
`reference to the figures.
[0017] FIG. 1 is a generalized block diagram depicting a prior art system 100 for performing motion compensated video compression. In this depiction, a previous video frame 102 of a video stream, comprising a plurality of macroblocks 104, serves as a reference frame for current video frame 106. The current video frame 106 is segmented by frame segmentation module 108 into a plurality of macroblocks 110, typically 16x16 pixels in size. The previous frame 102 and the macroblocks 110 are provided to a motion estimation module 112, which performs a search to find macroblocks within previous video frame 102 that correspond to macroblocks 110 in the current frame 106. If found, candidate matching macroblocks 114 in previous video frame 102 are used as a substitute for corresponding macroblocks 110 in current frame 106 when it is reconstructed during decompression.
[0018] If the difference between the target macroblock in current frame 106 and the candidate macroblock at the same position in previous frame 102 is below a predetermined value, it is assumed that no motion has taken place and a zero vector is returned, thereby avoiding the computational expense of a search. If, however, the difference between the target macroblock in the current frame 106 and the candidate macroblock at the same position in the previous frame 102 exceeds the predetermined value, a search is performed to locate the macroblock in the previous frame 102 that best matches the corresponding macroblock in the current frame 106. The motion estimation module 112 then calculates motion vectors 116 that describe the location of the matching macroblocks 114 in previous frame 102 with respect to the position of corresponding macroblocks 110 in current frame 106. Calculated motion vectors 116 may not correspond to the actual motion in the video stream due to noise and weaknesses in the matching algorithm and, therefore, may be corrected by the motion estimation module 112 using techniques known to those of skill in the art. The matching macroblocks 114, motion vectors 116, and corresponding macroblocks 110 are provided to the prediction error coding module 118 for predictive error coding and transmission.
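The zero-vector shortcut described in the preceding paragraph can be sketched as follows. This is a minimal illustration assuming a sum-of-absolute-differences (SAD) metric; the function names, the threshold value, and the candidate-search representation are assumptions for illustration, not details from the source:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized pixel blocks,
    each given as a flat sequence of pixel values."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

def motion_vector_for(target, colocated, search_candidates, zero_thresh=64):
    """Return a motion vector for `target`.

    If the co-located macroblock of the previous frame already matches
    within `zero_thresh`, assume no motion and return the zero vector,
    skipping the search entirely.  Otherwise pick the candidate offset
    with the lowest SAD.  `search_candidates` maps (dx, dy) offsets to
    candidate blocks from the previous frame.
    """
    if sad(target, colocated) < zero_thresh:
        return (0, 0)  # below the predetermined value: no motion assumed
    return min(search_candidates, key=lambda mv: sad(target, search_candidates[mv]))
```

The early return is the point of the paragraph: when the co-located block already matches, the expensive candidate search never runs.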
[0019] FIG. 2 is a generalized block diagram depicting a prior art video stream encoding system 200 for changing video encoding modes when scenes change within a video stream. Previous video frame 202 and current video frame 204 depict a scene change in a video stream that is being encoded. Encoded macroblocks 206 comprising previous video frame 202 serve as a reference for current video frame 204, which is segmented into macroblocks 208, typically 16x16 pixels in size. Macroblocks 208 of current video frame 204 reference macroblocks 206 of previous video frame 202 for inter-frame motion estimation encoding 210 and estimation of computational coding costs, with intra-prediction encoding and associated computational cost estimation 212 taking place thereafter before routing to encoding mode decision module 214. If encoding mode decision module 214, based on the estimated computational costs, determines in step 216 to encode a macroblock in the current video frame 204 using inter-macroblock mode for coding with motion compensation, then that macroblock is encoded using inter-macroblock mode with motion compensation. Otherwise, the macroblock is encoded using intra-macroblock mode for coding with spatial compensation, with the process continuing until all of the macroblocks in the video frame have been encoded.
[0020] FIG. 3 is a generalized block diagram of video stream scene change detection system 300 implemented in accordance with an embodiment of the invention. Previous video frame 302 and current video frame 304 depict a scene change in a video stream that is being encoded. In various embodiments of the invention, a portion (e.g., ~15%) of the encoded macroblocks 306 comprising previous video frame 302 is used for on-the-fly analysis and comparison to a smaller portion (e.g., ~10%) of macroblocks 308 comprising the current video frame 304 to determine whether current video frame 304 contains a scene that is different (i.e., a scene change) from the scene contained in previous video frame 302. In one embodiment of the invention, the portion of the macroblocks 308 used for on-the-fly analysis and comparison is a macroblock row (e.g., a 352x16 pixel portion of a 352x288 video frame), half of a macroblock row, or 1.5 macroblock rows, according to predetermined parameters. In other embodiments of the invention, the portion of the macroblocks 308 used for on-the-fly analysis and comparison is a 64x64 pixel array located in the center of the video frame, a predetermined region of interest within the video frame, or another position within the video frame as determined by flexible-macroblock-order (FMO).
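The sampled-portion choices above can be illustrated with a small index-selection helper. The function and mode names are hypothetical; the fractional-row option follows the half-row and 1.5-row embodiments mentioned in the text:

```python
MB = 16  # macroblock size in pixels

def sample_macroblocks(frame_w, frame_h, mode="row", rows=1.0):
    """Return (col, row) indices of the macroblocks sampled for scene
    change analysis.

    mode="row":      the first `rows` macroblock rows (rows may be 0.5
                     or 1.5, matching the embodiments described above).
    mode="center64": a 64x64 pixel array (4x4 macroblocks) centered in
                     the frame.
    """
    cols, mb_rows = frame_w // MB, frame_h // MB
    if mode == "row":
        n = int(rows * cols)  # e.g. half a row of a 22-MB-wide frame -> 11 MBs
        return [(i % cols, i // cols) for i in range(n)]
    if mode == "center64":
        c0, r0 = cols // 2 - 2, mb_rows // 2 - 2
        return [(c0 + dc, r0 + dr) for dr in range(4) for dc in range(4)]
    raise ValueError("unknown sampling mode: %s" % mode)
```

For a 352x288 frame, mode="row" selects the 22 macroblocks of the first macroblock row, and mode="center64" selects a 4x4 group of macroblocks around the frame center.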
[0021] As macroblocks 308 of current video frame 304 are captured for encoding, macroblocks 306 of previous video frame 302 are used in process step 310 as references for inter-frame motion estimation and estimation of computational coding costs. Next, intra-prediction encoding and associated computational cost calculations are performed in step 312. The processed data is then routed to the scene change detection and mode decision module 316 in the intra/inter mode encoding decision module 314.
[0022] The scene change detection and mode decision module 316 is operable to process macroblocks using a statistical analysis process 318 to optimize detection of a scene change. Once a predetermined number N of macroblocks has been encoded in process step 320, they are processed for statistical analysis in step 322 to compute the average mean-absolute-difference (MAD) or sum-of-absolute-difference (SAD). They are also processed in step 324 by computing the number of intra/inter modes. Since this information is provided as part of the encoding process, no additional computational overhead is incurred. The resulting statistical data is then processed using a scene change detection algorithm in step 326 once the encoded number of macroblocks reaches N, such as:

    if ((AvgMAD > AvgMAD_Thres) ||
        (NumIntraMB > NumIntraMB_Thres))
            scene_change = 1;
    else
            scene_change = 0;

where AvgMAD is the average MAD for the predetermined number N of encoded macroblocks and AvgMAD_Thres is the predetermined threshold value for it, and NumIntraMB is the number of macroblocks encoded in intra mode among the N encoded macroblocks and NumIntraMB_Thres is the predetermined threshold value for it. Scene
`
`
change detection algorithm in step 326 determines whether current video frame 304 contains a scene that is different (i.e., a scene change) from the scene contained in previous video frame 302. The results of scene change detection algorithm 326 are then forwarded by mode decision with scene detection module 316 to decision process 328, where a determination is made of whether a scene change has occurred. If the result of decision process 328 is a determination that a scene change has occurred, the remaining macroblocks (e.g., ~90%) of current video frame 304 are processed by adjusting quantization parameters in process step 332, and encoding continues with intra-frame spatial compensation in step 334. If, however, the result of the decision in process step 328 indicates that a scene change has not been detected, then processing proceeds to step 336, following the conventional coding approach to encode the remaining (e.g., ~90%) macroblocks of current video frame 304 using inter-frame coding techniques to determine whether a macroblock is encoded in intra mode or inter mode based on the mode decision result. If the result of the decision in step 336 is to process using inter mode, processing proceeds to step 338, where inter-mode processing techniques are applied. Otherwise, processing proceeds to step 334, where the macroblocks are processed using intra-mode spatial compensation techniques.
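The threshold test of step 326 can be restated as a short function. This is a minimal sketch: the function name and the per-macroblock input representation are assumptions, while the two-condition test mirrors the pseudocode of step 326:

```python
def detect_scene_change(mad_values, intra_flags, avg_mad_thres, num_intra_thres):
    """Step 326: declare a scene change when either the average MAD of
    the first N coded macroblocks or the number of intra-coded
    macroblocks among them exceeds its predetermined threshold."""
    avg_mad = sum(mad_values) / len(mad_values)
    num_intra = sum(intra_flags)  # intra_flags holds 1 for intra, 0 for inter
    return avg_mad > avg_mad_thres or num_intra > num_intra_thres
```

Either condition alone suffices, mirroring the logical OR in the pseudocode; both inputs are already produced by normal encoding, so no extra pixel work is needed.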
[0023] In different embodiments of the invention, scene change detection and optimal encoding mode selection can be implemented with MPEG/ITU video encoding standards using constant or variable bit rates (CBR/VBR), including, but not limited to, MPEG-4 part 2 (MPEG-4 video), MPEG-4 part 10 (AVC/H.264 video), H.263, MPEG-2, and scalable video coders. In another embodiment of the invention, coding efficiency and video image quality are improved by automatically inserting a key-frame for a video retrieval system, such as MPEG-7, and a video summary.
[0024] FIG. 4 is a generalized block diagram of a video stream scene change detection system 400 as implemented in a video encoder system in accordance with an embodiment of the invention. Encoder 402 converts the uncompressed video input data 403 into a compressed video data bitstream. The uncompressed video input data is provided to intra-prediction module 404, intercoding module 406, and a summer 408. Intercoding module 406 includes a motion estimation module 410 that, in at least one embodiment, operates to produce a motion vector ("MV"). The motion vector is used by inter motion compensation module 412 and is encoded by entropy coding module 420. Summer 408 determines the difference between the uncompressed video data 403 and either intra-prediction data or inter-motion data as selected by intra/inter mode decision module 435, comprising mode decision module 436 and scene change detection module 438.
[0025] Intra/inter mode decision module 435 in the embodiment illustrated in FIG. 4 comprises processing features similar to those described in greater detail hereinabove with regard to intra/inter mode decision module 314 of FIG. 3. In the embodiment of the invention shown in FIG. 4, intra/inter mode decision module 435 counts the number of intra-mode macroblocks among a predetermined number (e.g., ~10%) of encoded macroblocks within current video frame 403. If the number of intra-mode macroblocks exceeds a predetermined threshold, then intra/inter mode decision module 435 determines that current video frame 403 contains a scene change. When a scene change is detected, the remaining macroblocks (e.g., ~90%) of current video frame 403 are encoded using intra-mode coding, which requires no motion estimation or compensation, thus reducing computational overhead and power consumption. At the same time, adaptively adjusted or predetermined quantization parameter values will be applied to favor either spatial or temporal resolution based on the content comprising current video frame 403.
[0026] The difference (or residual) data between the uncompressed video data (original video data) and the predicted data is transformed by forward transform module 414 using, for example, a discrete cosine transform ("DCT") algorithm. The coefficients from the DCT transformation are scaled to integers and quantized by quantization module 416. Coding controller 440 controls the quantization step size via the control quantization parameter QP supplied to quantization module 416. The quantized transform coefficients are scanned by scan module 418 and entropy coded by entropy coding module 420. Entropy coding module 420 can employ any type of entropy encoding, such as Universal Variable Length Codes ("UVLC"), Context Adaptive Variable Length Codes ("CAVLC"), Context-based Adaptive Binary Arithmetic Coding ("CABAC"), or combinations thereof. Entropy coded transform coefficients and intra/inter coding information (i.e., either intra-prediction mode or inter-prediction mode information) are transmitted along with motion vector data for future decoding. When intra-prediction module 404 is associated with the current entropy encoded transform coefficients, the intra-prediction mode, macroblock type, and coded block pattern are included in the compressed video data bitstream. When the intercoding module 406 is associated with the current entropy encoded transform coefficients, the determined motion vector, macroblock type, coded block pattern, and reference frame index are included in the compressed video data.
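The role of QP in the paragraph above can be illustrated with a uniform quantizer sketch. The step-size formula here (step roughly doubling every 6 QP units, as in H.264-style designs) and the function names are assumptions for illustration, not details from the source:

```python
def quantize(coeffs, qp):
    """Map transform coefficients to integer levels.  A larger QP gives
    a larger step size and therefore coarser levels."""
    step = 0.625 * 2 ** (qp / 6)
    return [round(c / step) for c in coeffs]

def dequantize(levels, qp):
    """Inverse quantization: rescale integer levels back toward the
    original coefficient range (with quantization error)."""
    step = 0.625 * 2 ** (qp / 6)
    return [lv * step for lv in levels]
```

For example, a coefficient of 10.0 survives a round trip at QP 6 (step 1.25) exactly, but is represented more coarsely as QP grows; this is the knob the coding controller turns to trade quality against bitrate.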
[0027] Encoder 402 also includes decoder 421 to determine predictions for the next set of image data. Thus, the quantized transform coefficients are inverse quantized by inverse quantization module 422 and inverse transform coded by inverse transform coding module 424 to generate a decoded prediction residual. The decoded prediction residual is added to the predicted data. The result is motion compensated video data 426, which is provided directly to intra-prediction module 404. Motion compensated video data 426 is also provided to deblocking filter 428, which deblocks the video data 426 to generate deblocked video data 430, which is fed into intercoding module 406 for potential use in motion compensating the current image data.
[0028] The compressed video data bitstream produced by entropy coding module 420 is processed by bitstream buffer 434, which is coupled to coding controller 440. Coding controller 440 also comprises a rate control engine, which operates to adjust quantization parameters to optimize the processing of video compression while maintaining a given bitrate. The compressed video data bitstream is ultimately provided to decoder 432, which uses information in the compressed video data bitstream to reconstruct uncompressed video data. In one embodiment of the invention, the encoder 402 and decoder 432 encode and decode video data in accordance with the H.264/MPEG-4 AVC video coding standard.
[0029] FIG. 5 is a generalized block diagram of a video stream scene change detection system as implemented in a video decoder system in accordance with an embodiment of
`
the invention. Video decoding is essentially the inverse of video encoding. A compressed video bitstream is received from encoder 402, described in greater detail hereinabove, which is entropy decoded by entropy decoding module 520 and reordered by inverse scan module 518 to produce a set of quantized coefficients, which are rescaled (inverse quantized) and inverse transformed by decoder 521, comprising inverse quantization module 522 and inverse transform coding module 524. The resulting motion compensated video data 526 is provided to intra-prediction module 504. Motion compensated video data 526 is also provided to deblocking filter 528, which deblocks the video data 526 to generate deblocked video data 530, which is fed into inter motion compensation module 512 for motion compensating the current image data. Video coding benefits from this dynamic computation adjustment, which reduces the amount of processing needed for coding. Since decoder 432 performs the reverse process of encoder 402, computation reductions by encoder 402 are shared by decoder 432, resulting in reduced computational complexity and overhead, lower power consumption, and improved video image quality.
[0030] FIG. 6 is a table depicting observed performance of a video stream scene change detection system as implemented in accordance with an embodiment of the invention. Observed performance table 600 comprises simulated video stream scene change detection tests 602, the frequency ratio of frames containing scene changes within the simulated video test stream 604, the peak signal-to-noise ratio (PSNR) of the simulated video test stream without scene changes 606, the PSNR of the simulated video test stream with scene changes 608, the comparative PSNR ratio 610, the number of coded frames of the simulated video test stream without scene changes 612, the number of coded frames of the simulated video test stream with scene changes 614, and observed improvements in coding efficiency 616. By way of example, these tests are for low delay video encoding, which allows frame dropping.
[0031] Simulated video stream scene change detection tests 602 comprise quarter common intermediate format (QCIF) at 15 frames per second (FPS) processed at 64 kilobits per second (kbps) without implementation of flexible-macroblock-order (FMO) 618, QCIF at 15 FPS processed at 64 kbps with implementation of FMO 620, common intermediate format (CIF) at 30 FPS processed at 256 kbps without implementation of FMO 622, and CIF at 30 FPS processed at 256 kbps with implementation of FMO 624. QCIF video stream scene change detection test 618, conducted at 15 FPS and processed at 64 kbps without implementation of FMO, comprises 315 video frames, of which 5 (1.58%) contain scene changes, with a measured peak signal-to-noise ratio (PSNR) of 29.3 dB without scene changes and 29.2 dB with, resulting in a PSNR ratio of -0.03%, and measured 277 encoded frames without scene detection and 306 with, for a 10.5% increase in encoding efficiency. QCIF video stream scene change detection test 620, conducted at 15 FPS and processed at 64 kbps with implementation of FMO, comprises 315 video frames, of which 5 (1.58%) contain scene changes, with a measured PSNR of 29.2 dB without scene changes and 28.9 dB with, resulting in a PSNR ratio of -0.10%, and measured 262 encoded frames without scene detection and 293 with, for an 11.8% increase in encoding efficiency. CIF video stream scene change detection test 622, conducted at 30 FPS and processed at 256 kbps without implementation of FMO, comprises 630 video frames, of which 5 (0.98%) contain scene changes, with a measured PSNR of 28.8 dB without scene changes and 28.8 dB with, resulting in a PSNR ratio of -0.00%, and measured 581 encoded frames without scene detection and 613 with, for a 5.5% increase in encoding efficiency. CIF video stream scene change detection test 624, conducted at 30 FPS and processed at 256 kbps with implementation of FMO, comprises 315 video frames, of which 5 (1.58%) contain scene changes, with a measured PSNR of 28.9 dB without scene changes and 29.0 dB with, resulting in a PSNR ratio of -0.03%, and measured 586 encoded frames without scene detection and 606 with, for a 3.4% increase in encoding efficiency.
[0032] In accordance with the present invention, a system and method has been disclosed for on-the-fly detection of scene changes within a video stream through statistical analysis of a portion of each video frame's macroblocks as they are processed using inter-frame encoding. In an embodiment of the invention, a method for improving detection of scene changes in a video stream comprises: a) receiving a video data stream comprising a plurality of video data frames, wherein each of said frames comprises a plurality of macroblocks; b) initiating processing of a predetermined portion of the macroblocks in each video data frame in said plurality of data frames; and c) analyzing the processed portions of said macroblocks to determine whether the corresponding video frame should be processed using inter-frame processing protocols.
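Steps a) through c) can be sketched as a simple per-frame loop. The function names, the ~10% default sample fraction, and the pluggable `detector` callable are illustrative assumptions, not elements recited by the source:

```python
def process_stream(frames, detector, sample_fraction=0.1):
    """Steps a)-c): for each frame (a list of macroblocks), process a
    predetermined portion of its macroblocks, then let `detector`
    decide from that portion whether the rest of the frame should
    switch to intra coding (True) or continue inter-frame coding."""
    decisions = []
    for macroblocks in frames:                    # a) receive frames of macroblocks
        n = max(1, int(len(macroblocks) * sample_fraction))
        sample = macroblocks[:n]                  # b) process a predetermined portion
        scene_change = detector(sample)           # c) analyze the processed portion
        decisions.append("intra" if scene_change else "inter")
    return decisions
```

The key property is that the per-frame decision is made after only the sampled portion is processed, so the remaining ~90% of each frame is coded in whichever mode the decision selects.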
[0033] In various embodiments of the invention, when a scene change is detected, the macroblocks in the remaining portion of the frame are encoded as an intra-frame, intra-slices, or intra-macroblocks, using adaptively adjusted or predetermined quantization parameters (QP) to reduce computational complexity, increase video coding efficiency, and improve video image quality. Scene changes within a video stream are detected by statistical analysis of a small percentage (e.g., ~10%) of the macroblocks comprising each video frame as they are processed using inter-frame coding. If the statistical analysis of the selected macroblocks of the current frame differs from the previous frame by exceeding predetermined thresholds, the current video frame is assumed to be a scene change.
[0034] In embodiments of the invention, the statistical information gathered from encoded macroblock samples includes, but is not limited to, the mean-absolute-difference (MAD) or sum-of-absolute-difference (SAD), the average length of motion vectors, and the number of intra/inter modes. As this information is provided as part of the encoding process, no additional computational overhead is incurred. In one embodiment of the invention, the analyzed area of the video frame is a macroblock row (e.g., a 352x16 pixel portion of a 352x288 video frame), half of a macroblock row, or 1.5 macroblock rows, according to predetermined parameters. In other embodiments of the invention, the analyzed area is a 64x64 pixel array located in the center of a video frame, a predetermined region of interest within the video frame, or another position within the video frame as determined by flexible-macroblock-order (FMO).
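The statistics named above can be accumulated from values the encoder already produces, in line with the no-extra-overhead point. The tuple layout and function name here are hypothetical:

```python
import math

def gather_statistics(coded_mbs):
    """Accumulate per-sample statistics from already-encoded macroblocks.
    Each entry is (sad, (mvx, mvy), is_intra); nothing is recomputed
    from pixels.  Returns (average SAD, average motion vector length,
    number of intra-coded macroblocks)."""
    n = len(coded_mbs)
    avg_sad = sum(s for s, _, _ in coded_mbs) / n
    avg_mv_len = sum(math.hypot(mx, my) for _, (mx, my), _ in coded_mbs) / n
    num_intra = sum(1 for _, _, intra in coded_mbs if intra)
    return avg_sad, avg_mv_len, num_intra
```

The three returned values correspond to the MAD/SAD, motion-vector-length, and intra/inter-mode statistics listed in the paragraph above, and feed the thresholded scene-change test.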
[0035] Once a scene change is detected, the remainder of the video frame is encoded as an intra-frame, intra-macroblocks, or intra-slices, through implementation of one or more predetermined or adaptively adjusted quantization parameters to reduce computational complexity, decrease power consumption, and increase the resulting video image quality.
`
`
In a different embodiment of the invention, encoding of the inter-frame is restarted when a scene change is detected, with all macroblocks in the inter-frame being encoded as an intra-frame, intra-slices, or intra-macroblocks. This embodiment of the invention results in higher video image