An objective video quality assessment system based on human perception
Arthur A. Webster, Coleen T. Jones, Margaret H. Pinson, Stephen D. Voran, Stephen Wolf

Institute for Telecommunication Sciences
National Telecommunications and Information Administration
325 Broadway, Boulder, CO 80303
ABSTRACT
The Institute for Telecommunication Sciences (ITS) has developed an objective video quality assessment system that emulates human perception. The system returns results that agree closely with quality judgments made by a large panel of viewers. Such a system is valuable because it provides broadcasters, video engineers, and standards organizations with the capability for making meaningful video quality evaluations without convening viewer panels. The issue is timely because compressed digital video systems present new quality measurement questions that are largely unanswered.
The perception-based system was developed and tested for a broad range of scenes and video technologies. The 36 test scenes contained widely varying amounts of spatial and temporal information. The 27 impairments included digital video compression systems operating at line rates from 56 kbits/sec to 45 Mbits/sec with controlled error rates, NTSC encode/decode cycles, VHS and S-VHS record/play cycles, and VHF transmission. Subjective viewer ratings of the video quality were gathered in the ITS subjective viewing laboratory, which conforms to CCIR Recommendation 500-3. Objective measures of video quality were extracted from the digitally sampled video. These objective measurements are designed to quantify the spatial and temporal distortions perceived by the viewer.
This paper presents the following: a detailed description of several of the best ITS objective measurements, a perception-based model that predicts subjective ratings from these objective measurements, and a demonstration of the correlation between the model's predictions and viewer panel ratings. A personal computer-based system is being developed that will implement these objective video quality measurements in real time. These video quality measures are being considered for inclusion in the Digital Video Teleconferencing Performance Standard by the American National Standards Institute (ANSI) Accredited Standards Committee T1, Working Group T1A1.5.
1. INTRODUCTION
The need to measure video quality arises in the development of video equipment and in the delivery and storage of video and image information. Although the work described in this paper is concerned specifically with NTSC video (the distribution television standard in the United States), the principles presented can be applied to other types of motion video and even still images. The methods of video quality assessment can be divided into two main categories: subjective assessment (which uses human viewers) and objective assessment (which is accomplished by use of electrical measurements). While we believe that assessment of video quality is best accomplished by the human visual system, it is useful to have objective methods available that are repeatable, can be standardized, and can be performed quickly and easily with portable equipment. These objective methods should give results that correlate closely with results obtained through human perception.
Objective measurement of video quality was accomplished in the past through the use of static video test scenes such as resolution charts, color bars, multi-burst patterns, etc., and by measuring the signal-to-noise ratio of the video signal [1]. These objective methods address the spatial and color aspects of the video imagery as well as overall signal distortions present in traditional analog systems. With the development of digital compression technology, a large number of new video services have become available. The savings in transmission and/or storage bandwidth made possible with digital compression technology depend upon the amount of information present in the original (uncompressed) video signal, as well as how much quality the user is willing to sacrifice. Impairments may result when the information present in the video signal is larger than the transmission channel capacity. However, users may be willing to sacrifice quality to achieve a substantial reduction in transmission and storage costs. But how much quality is sacrificed for how much cost savings? We propose a set of measurements that offers a way to begin to answer this question. New impairments can be present in digitally compressed video, and these impairments include both spatial and temporal artifacts [2]. The old objective measurement techniques are not adequate to assess the impact of these new artifacts on quality [3].
After some investigation of compressed video, it becomes clear that the perceived quality of the video after passing through a given digital compression system is often a function of the input scene. This is particularly true for low bit-rate systems. A scene with little motion and limited spatial detail (such as a head-and-shoulders shot of a newscaster) may be compressed to 384 kbits/sec and decompressed with relatively little distortion. Another scene (such as a football game) which contains a large amount of motion as well as spatial detail will appear quite distorted at the same bit rate. Therefore, we directed our efforts toward developing perception-based objective measurements which are extracted from the actual sampled video. These objective measurements quantify the perceived spatial and temporal distortions in a way that correlates as closely as possible with the response of the human visual system. Each scene was digitized (at 4 times the sub-carrier frequency) to produce a time sequence of images sampled at 30 frames per second (in time) and 756 x 486 pixels (in space).
2. DEVELOPMENT METHODOLOGY
Figure 1 presents a graphical depiction of the development process for the ITS quality assessment algorithm. A set of video scene pairs (each consisting of the original and a degraded version) was used in a subjective test. These scene pairs were also processed on a computer that extracted a large number of features. Statistical analysis was used to select an optimal set of quality parameters (obtained from features) that correlated well with the viewing panel results. This optimal set of parameters was then used to develop a quality assessment algorithm that gives results that agree closely with viewing panel results.
[Figure 1 is a block diagram: a Library of Test Scenes supplies Original Video to Impairment Generators, which produce Degraded Video; Objective Testing of the video pairs yields Objective Test Results (Features), while Subjective Testing yields Viewing Panel Results; Statistical Analysis of both produces the Parameters used by the Quality Assessment Algorithm.]

Figure 1. Development Process for Video Quality Assessment Algorithm
2.1 Library of test scenes
Several scenes, exhibiting various amounts of spatial and temporal information content, are needed to characterize the performance of a video system. Even more scenes are needed to guard against viewer boredom during the subjective testing. A set of 36 test scenes was chosen for the experiment. The test scenes spanned a wide range of user applications including still scenes, limited motion graphics, and full motion entertainment video.
2.2 Impairment generators
Twenty-seven video systems (plus the ‘no impairment’ system) were used to produce the degraded video that was used in the tests. The original video for this test was component analog video. The digital video systems included 11 video codecs (coder-decoders) from 7 manufacturers operating at bit rates from 56 kbits/sec to 45 Mbits/sec, including bit error rates of 10^-6 and 10^-5. Also included were analog video systems such as VHS and S-VHS recording and playback, and noisy RF transmission. All video systems except the ‘no impairment’ system included NTSC encoding and decoding.
2.3 Objective testing
Both the original video and the degraded video were digitized and processed to extract a large number of features. The processing included Sobel filtering, Laplace filtering, fast Fourier transforms, first-order differencing, color distortion measurements [4], and moment calculations. Typically, features were calculated from each original and degraded frame of the video sequence to produce time histories. Some features required the entire original and degraded video image (e.g., the variance of the error image calculated from the difference between the original and the degraded images). Other features required only the statistics of the original and degraded video images (e.g., the change in image energy obtained from the differences between the original and the degraded image variances). The time histories of the features were collapsed by various methods, e.g., maximum (MAX), root mean square (RMS), standard deviation (STD), etc., to produce a single scalar value (or parameter) for each test scene. These parameters defined the objective measurements and were used in the statistical analysis step shown in Figure 1.
2.4 Subjective testing
The subjective test was conducted in accordance with CCIR Recommendation 500-3 [5]. A panel of 48 viewers was selected from the U.S. Department of Commerce Laboratories phone book in Boulder, Colorado. Each viewer completed four viewing sessions during a single week, attending one session per day. Each session lasted approximately 25 minutes and required viewing of 38 or 40 30-second test clips. A clip is defined as a test scene pair consisting of the original video and the degraded video. The viewer was first shown the original video for 9 seconds, followed by 3 seconds of grey, and then 9 seconds of the degraded video. Nine seconds were then allowed to rate the impairment on a 5-point scale before the next clip was presented. The viewer was asked to rate the difference between the original video and the degraded video as either (5) Imperceptible, (4) Perceptible but Not Annoying, (3) Slightly Annoying, (2) Annoying, or (1) Very Annoying. This scale covers a wide range of impairment levels and is specified as one of the standard scales in CCIR Recommendation 500-3. Impairment testing was used since we were interested in measuring the change in video quality due to a video system. A mean opinion score was generated by averaging the viewer ratings.
The selection of the 158 clips used in the test (out of the 972 clips available) was made both deterministically and randomly. Random selections were made from a distribution table that paired video teleconferencing systems with more video teleconferencing scenes than entertainment scenes, and entertainment systems with more entertainment scenes than video teleconferencing scenes. The viewers rated 132 unique clips out of the 158 actually viewed, because some clips were repeated for training and consistency checks.
2.5 Statistical analysis and quality assessment system
This stage of the development process utilized joint statistical analysis of the subjective and objective data sets. This step identifies a subset of the candidate objective measurements that provides useful and unique video quality information. The best measurement was selected by exhaustive search. Additional measurements were then selected to reduce the remaining objective-subjective error by the largest amount. The selected measurements complement each other; for instance, a temporal distortion measure was selected to reduce the objective-subjective error remaining after a previous selection of a spatial distortion measure. When combined in a simple linear model, this subset of measurements provides predicted scores that correlate well with the true scores obtained in the subjective tests. In constructing the linear model we looked for p measurements {mi} and p+1 constants {ci} that allowed us to estimate the subjective mean opinion score. The estimated subjective mean opinion score is given by

ŝ = c0 + Σ (i=1 to p) ci mi ,    (1)

where s is the true subjective mean opinion score and ŝ is the estimated score.
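As an illustrative sketch (not the original ITS software), the {ci} constants of Equation (1) can be found by ordinary least squares. The array names m and s below are hypothetical placeholders for a measurement matrix and the corresponding subjective scores.

    # Minimal sketch of fitting Equation (1) by least squares (Python/NumPy).
    # `m` is an (N, p) array of objective measurements for N clips and `s` is a
    # length-N array of subjective mean opinion scores; both are assumed inputs.
    import numpy as np

    def fit_linear_model(m, s):
        """Return [c0, c1, ..., cp] minimizing the squared prediction error."""
        design = np.hstack([np.ones((m.shape[0], 1)), m])  # column of ones for c0
        coeffs, _, _, _ = np.linalg.lstsq(design, s, rcond=None)
        return coeffs

    def predict_score(coeffs, m):
        """Estimated mean opinion score for each clip (Equation 1)."""
        return coeffs[0] + m @ coeffs[1:]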
3. RESULTS
For the results presented here, three complementary video quality measurements (p = 3) were selected. These three complementary measures (m1, m2, and m3) have been used to explain most of the variance in subjective video quality that resulted from the impairments used in this experiment. The investigations and research that produced the m1, m2, and m3 video quality metrics also provided insight into how humans perceive the spatial and temporal information of a video scene.
3.1 Spatial and temporal information features
The difficulty in compressing a given video sequence depends upon the perceived spatial and temporal information present in that video sequence. Perceived spatial information is the amount of spatial detail in the video scene that is perceived by the viewer. Likewise, perceived temporal information is the amount of perceived motion in the video scene. Thus, it would be useful to have approximate measures of perceived spatial and temporal information. These information measures could be used to select test scenes that appropriately stress the video compression system being designed or tested. Two different test scenes with the same spatial and temporal information should produce similar perceived quality at the output of the transmission channel. Measures of distortion could also be obtained by comparing the perceived information content of the video before and after passing through a video system. Although it is recognized that spatial and temporal aspects of vision perception cannot be completely separated from each other, we have found spatial and temporal features that correlate with human quality perception of spatial detail and motion. Both of these features require pixel differencing operations, which seem to be basic attributes of the human visual system. The spatial information (SI) feature differences pixels across space while the temporal information (TI) feature differences pixels across time. Here, both the SI and TI features have been applied to the luminance portion of the video.
3.1.1 Spatial information (SI)
The spatial information feature is based on the Sobel filter [6]. At time n, the video frame Fn is filtered with the Sobel operators. The standard deviation over the pixels in each Sobel-filtered frame is then computed. This operation is repeated for each frame in the video sequence and results in a time series of spatial information values. Thus, the spatial information feature, SI [Fn], is given by

SI [Fn] = STDspace { Sobel (Fn) } ,    (2)
where STDspace is the standard deviation operator over the horizontal and vertical spatial dimensions in a frame, and Fn is the nth frame in the video sequence. Figure 2 shows a time sequence of 3 contiguous video frames for an original scene (top row) and a degraded version of that scene (second row). These images were sampled at the NTSC frame rate of approximately 30 frames per second. The degraded version of the scene was obtained from a 56 kbits/sec codec. The third row of Figure 2 shows the Sobel-filtered version of the original scene and the fourth row shows the Sobel-filtered version of the degraded scene. The highly localized, clearly focused edges in the third row produce a large STDspace, since the standard deviation is a measure of the spread in pixel values. On the other hand, the non-localized, blurred edges shown in the fourth row produce a smaller STDspace, demonstrating that spatial detail has been lost. This is particularly evident for the images in the third column.
Figure 2. Video Processed to Demonstrate Perceived Spatial and Temporal Information
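A minimal sketch of the SI feature of Equation (2) follows, assuming frames are supplied as 2-D luminance arrays; the function names are illustrative rather than part of the ITS system.

    # Sketch of the SI feature (Equation 2) using NumPy and SciPy.
    import numpy as np
    from scipy import ndimage

    def spatial_information(frame):
        """SI[Fn]: standard deviation over space of the Sobel-filtered frame."""
        f = frame.astype(float)
        gx = ndimage.sobel(f, axis=1)        # response to vertical edges
        gy = ndimage.sobel(f, axis=0)        # response to horizontal edges
        magnitude = np.sqrt(gx**2 + gy**2)   # true Sobel edge magnitude
        return magnitude.std()

    def si_time_history(frames):
        """One SI value per frame, as plotted in Figure 4."""
        return np.array([spatial_information(f) for f in frames])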

3.1.2 Temporal information (TI)
The temporal information feature is based upon the motion difference image, ΔFn, which is composed of the differences between pixel values at the same location in space but at successive times or frames. ΔFn, as a function of time (n), is defined as

ΔFn = Fn − Fn-1 .    (3)

The temporal information feature, TI [Fn], is defined as the standard deviation of ΔFn over the horizontal and vertical spatial dimensions, and is given by

TI [Fn] = STDspace [ ΔFn ] .    (4)
More motion in adjacent frames will result in higher values of TI [Fn]. The fifth row of Figure 2 shows the motion difference frames, ΔFn, of the original video (top row), while the sixth row shows the motion difference frames of the degraded video (second row). The motion in the original scene was a smooth camera pan. Two of the motion difference frames for the degraded video shown in the sixth row contain very little motion energy. This resulted because the low bit-rate codec updated the information in the scene at less than 30 frames per second, the NTSC frame rate. In fact, the first degraded image in row 2 had already been repeated for several video frames prior to the time of column 1 (one can see this by comparing the first original image with the first degraded image in Figure 2 and noting that the first degraded image lags in time). When the codec does update the information (last column), the motion appears jerky. Humans perceive this as unnatural motion. Thus, the time history of TI quantifies the perception of this motion. In Figure 2, the flow of motion has been distorted from smooth and continuous in the original video to localized and discontinuous in the degraded video.
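The TI feature of Equation (4) reduces to a one-line frame-differencing operation; the sketch below assumes the same 2-D luminance arrays as the SI sketch above.

    # Sketch of the TI feature (Equations 3 and 4).
    import numpy as np

    def ti_time_history(frames):
        """TI[Fn] = STDspace[Fn - Fn-1]; defined for frames 1..N-1."""
        f = np.asarray(frames, dtype=float)  # shape (N, rows, cols)
        delta = f[1:] - f[:-1]               # motion difference images (Eq. 3)
        return delta.std(axis=(1, 2))        # spatial std per frame (Eq. 4)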
3.1.3 Spatial-temporal matrix
Figure 3 shows how the set of 36 test scenes used in the ITS video quality experiments can be placed on a spatial-temporal information plot. For clarity, only the maximum value over the 9-second time histories of SI and TI for each original scene was plotted. When entire time histories are plotted, most test scenes will produce trajectories in the spatial-temporal space. Along the TI = 0 axis (x-axis) are found the still scenes and those with very limited motion. Along the SI = 0 axis (y-axis) are found scenes with minimal spatial detail. These values of SI and TI can be compared to other test scenes measured using the above equations which have been spatially sampled at 4 times sub-carrier (4fsc), or approximately 756 x 486 pixels, 30 frames per second, and digitized with white set to 235 and black set to 16. Note that no attempt has been made at this point to normalize SI and TI relative to their respective importance to the human visual system. As the distance from the origin increases, the total perceived information increases. This results in increasing coding difficulty and may result in increased distortion for a fixed bit-rate digital system.

Figure 3. Spatial-Temporal Locations of ITS Test Scenes
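Reusing the si_time_history and ti_time_history sketches above, a scene's coordinates on the Figure 3 plot can be approximated by collapsing each time history with a maximum, as described in the text; scene_coordinates is an illustrative name.

    # Sketch: place a test scene on the spatial-temporal plane of Figure 3.
    def scene_coordinates(frames):
        """Return (max SI, max TI) over the scene's time histories."""
        si = si_time_history(frames)   # defined in the SI sketch above
        ti = ti_time_history(frames)   # defined in the TI sketch above
        return si.max(), ti.max()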
3.2 Video quality measures
The three video quality measures presented here involve equational forms of the SI and TI features extracted from the original and degraded video. Results will be presented in Section 3.3 that demonstrate the validity of the three video quality measures presented in this section. Since the range of video quality in the experiment was quite large, and since the linear predictor based on the three video quality measures closely tracked subjective mean opinion scores, we feel that the SI and TI features quantify basic perceptual attributes of the human visual system. The three video quality measurements have been normalized for unit variance so that we can interpret coefficient magnitudes as indications of the relative importance of m1, m2, and m3. These normalization constants are included in the equations for m1, m2, and m3.
3.2.1 Measurement m1
Measurement m1 was the first measurement selected by the statistical analysis. It is a measure of spatial distortion and is obtained from the SI features of the original and degraded video. The spatial distortions measured by m1 include both blurring and false edges. Since the Sobel filter is an edge enhancement filter, edge energy gained or lost in an image after passing through a video system will be measured by m1. The equational form for m1 is given by
m1 = RMStime ( 5.81 · (SI [On] − SI [Dn]) / SI [On] ) ,    (5)
where On denotes the nth frame of the original video sequence, Dn is the nth frame of the degraded video sequence, SI [.] indicates the Spatial Information operator defined in Equation (2), and RMStime denotes the root mean square time-collapsing function. Note that m1 measures the relative change in SI between the original and degraded video. Figure 4 shows the time history of SI [Fn] plotted for an original and a degraded version of a test scene. The degraded version was obtained from a 56 kbits/sec codec. Note that the SI of the degraded video was less than the SI of the original video, indicating spatial blurring.
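Given the SI time histories, m1 of Equation (5) is a normalized difference followed by an RMS time collapse. The sketch below reuses si_time_history from above and assumes the original and degraded sequences are already time-aligned.

    # Sketch of measurement m1 (Equation 5).
    import numpy as np

    def measure_m1(orig_frames, degr_frames):
        si_o = si_time_history(orig_frames)   # SI[On]
        si_d = si_time_history(degr_frames)   # SI[Dn]
        rel = 5.81 * (si_o - si_d) / si_o     # per-frame relative SI change
        return np.sqrt(np.mean(rel**2))       # RMS time collapse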
3.2.2 Measurement m2
The measurements m2 and m3 that were selected next by the statistical analysis are both measures of temporal distortion. These temporal distortion measures complement the spatial distortion measure m1 and account for most of the remaining prediction error that resulted from using just m1. Measurement m2 is given by

m2 = ftime [ 0.108 · MAX { (TI [On] − TI [Dn]), 0 } ] ,    (6)

where

ftime (xt) = STDtime { CONV (xt, [-1, 2, -1]) } ,    (7)
and TI [Fn] is the Temporal Information operator defined in Equation (4), CONV denotes the convolution operator, and STDtime denotes a standard deviation across time. Figure 5 shows TI [Fn] for the same original and degraded video as shown in Figure 4.

Figure 4. SI [Fn] for the Original and Degraded Video Sequences

Figure 5. TI [Fn] for the Original and Degraded Video Sequences

The TI of the degraded video has areas of no motion (values that are approximately zero) and areas of large localized motion (spikes). By examining the frequency of the motion spikes, one can deduce the frame rate of the codec. For low bit-rate systems, the frame rate is normally adaptive and changes depending upon the motion in the original scene. Areas of large motion in the original (such as the pan from frame 20 to frame 150) cause very large temporal distortions in the output. The longer the codec waits to update the video frame, the greater the perceived jerkiness. Note that m2 measures the effect of temporally localized motion distortions in the degraded video that were not present in the original video. The convolution kernel enhances these motion transitions since it is a high-pass filter, and STDtime quantifies the spread in energy of the enhanced motion transitions. The m2 measure is non-zero only when the degraded video has lost motion energy, such as during the frame repetition shown in Figure 2.
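A sketch of m2 (Equations 6 and 7) follows, assuming the TI time histories have already been computed; np.convolve with the [-1, 2, -1] kernel plays the role of the CONV operator.

    # Sketch of measurement m2 (Equations 6 and 7).
    import numpy as np

    def measure_m2(ti_orig, ti_degr):
        """ti_orig, ti_degr: TI time histories of the original/degraded video."""
        lost = 0.108 * np.maximum(ti_orig - ti_degr, 0.0)  # lost motion (Eq. 6)
        highpass = np.convolve(lost, [-1.0, 2.0, -1.0], mode='valid')  # Eq. 7
        return highpass.std()                              # STD over time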
3.2.3 Measurement m3
Measurement m3 is another measure of temporal distortion and is given by

m3 = MAXtime { 4.23 · LOG10 ( TI [Dn] / TI [On] ) } ,    (8)
where MAXtime returns the maximum value of the time history. Measurement m3 selects the video frame that has the largest added motion. This may be the point of maximum jerky motion or the point in the video that has the worst uncorrected block errors (when errors in the digital transmission channel are present). Random or periodic noise is also detected by m3. All of these impairments were included in the experimental data.
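A sketch of m3 (Equation 8) is below; the small epsilon guarding against division by zero on perfectly still frames is an assumption of this sketch, not a detail specified in the text.

    # Sketch of measurement m3 (Equation 8).
    import numpy as np

    def measure_m3(ti_orig, ti_degr, eps=1e-6):
        # eps avoids log/divide trouble when a frame pair has zero motion
        # (assumed here; the paper does not state how this case is handled).
        ratio = (ti_degr + eps) / (ti_orig + eps)
        return np.max(4.23 * np.log10(ratio))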
3.3 Assessment Algorithm Performance
Objective quality assessment is provided by estimating the level of perceived impairment present in a video sequence from the m1, m2, and m3 quality measures. The {ci} constants for Equation (1) were found by applying a least squares error criterion to a training set that was composed of 18 of the 36 test scenes (64 of the 132 clips). The remainder of the data was reserved for testing. The equation used to generate estimates of the subjective scores is

ŝ = 4.77 − 0.992 m1 − 0.272 m2 − 0.356 m3 .    (9)
The correlation coefficient between the subjective scores and the estimated (objective) scores was 0.92 for the training set. With the testing set (68 of the 132 clips), the correlation coefficient was 0.94. This is a highly encouraging result. Figure 6 shows a plot of subjective versus estimated scores for both sets of data; the triangles represent the training data and the squares represent the testing data [7]. The testing data were clipped so that no scene was rated less than 1.

Figure 6. Estimated Scores Versus True Subjective Scores
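The trained predictor of Equation (9) and the reported correlation check can be expressed directly; the lower clip at 1 follows the text, and np.corrcoef computes the Pearson correlation coefficient.

    # Sketch of the trained predictor (Equation 9) and its evaluation.
    import numpy as np

    def estimate_score(m1, m2, m3):
        s_hat = 4.77 - 0.992 * m1 - 0.272 * m2 - 0.356 * m3
        return np.maximum(s_hat, 1.0)   # clip so no estimate falls below 1

    def correlation(true_scores, estimated_scores):
        return np.corrcoef(true_scores, estimated_scores)[0, 1]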
3.4 Extensions of the basic forms
The SI and TI features define a new class of quality metrics, not specifically limited to the above three equational forms. For some applications, collapsing SI and TI to single values per frame may be too limited. One could compute more localized measures by calculating SI and TI for image sub-regions. These sub-regions could be vertical strips, horizontal strips, rectangular regions, or even motion/still segmented regions. For instance, during recent analysis of contribution-quality 45 Mbits/sec codecs from a different experiment, we have found that a more sensitive quality metric can be obtained by applying the TI distortion measures to horizontal strips of the degraded video that did not contain any motion in the original video. This measures the added noise in the (originally) still portion of the video scene.
Note that SI and TI are insensitive to shifts in the image mean (provided the dynamic range of the video system is not exceeded), but they are sensitive to image intensity scaling. In cases where one has reason to suspect that the video system under test has a constant, non-unity gain, the SI and TI of the degraded video can be compensated by dividing by the gain. This gain can be estimated from the original and degraded video images or found by other means.
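The two extensions above can be sketched briefly: a strip-wise TI that localizes temporal distortion to horizontal bands, and the gain compensation for a known constant gain; n_strips and the function names are illustrative choices.

    # Sketch of the sub-region and gain-compensation extensions.
    import numpy as np

    def ti_by_strip(frames, n_strips=8):
        """TI time history computed separately for each horizontal strip."""
        f = np.asarray(frames, dtype=float)
        delta = f[1:] - f[:-1]                            # motion differences
        strips = np.array_split(delta, n_strips, axis=1)  # split along rows
        return np.stack([s.std(axis=(1, 2)) for s in strips], axis=1)

    def compensate_gain(feature_values, gain):
        """Divide degraded SI or TI values by a known constant system gain."""
        return np.asarray(feature_values) / gain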
4. REAL TIME IMPLEMENTATION
We are currently in the process of implementing our quality assessment algorithm in real time using a personal computer (PC). This has been made possible by recent advances in frame sampling and image processing cards for PCs. The goal is the ability to estimate real-time video quality in the field, where the original and degraded video may be spatially separated by a great distance. Figure 7 is a block diagram of the real-time system. The final measurement system will consist of two PCs (one to process the source video and the other to process the destination video), two modems, and a phone line. A primary advantage of the video quality metrics is that they are based on the low bit-rate SI and TI quantities. This means that they can be transmitted over ordinary dial-up phone lines. This makes it possible to conduct economical in-service measurements of video quality when the source video and the destination video are separated by large distances. In-service measurements are important because they allow one to measure the actual video quality that is being delivered by the transmission channel. We have seen that this quality can change depending upon the source video being transmitted. Out-of-service testing requires only a PC at the destination end and a standard set of video scenes at the source end. In this case, the SI and TI features for the original source video are precomputed and stored in the PC at the destination end.
The video link begins with source video input to the system. This may be live video from a camera, video output from a VCR, a laserdisc player, or any other video source. The source video is input to the transmission channel shown in Figure 7. The example transmission service channel shown in Figure 7 is composed of an encoder, a digital communication circuit, and a decoder. The actual calculations of the video measurements and the predicted subjective score are performed in the parameter measurement equipment. This equipment consists of a PC with an image processing circuit board. The image processing board can process data at rates up to 30 MHz and is capable of digitizing video at 30 frames/sec. It performs both convolutions and image differencing. A PC, configured as described, is located at both the source video and the destination video sites.
For in-service measurements, the PCs will calculate the SI and TI features at or near real time for both the source and destination video. Although Figure 7 shows only the PC at the destination end calculating the m1, m2, and m3 quality measures, the system is, in fact, symmetrical, so that quality measurements are available at either the source or the destination end. The SI and TI features can be time tagged. This is particularly useful for calculating the video delay of the transmission channel, since the SI and TI features can be time-aligned with a simple correlation process. Once both the source and destination measurements are available at one PC, they can be combined in the linear predictor described previously to produce the estimated subjective rating of the live video.
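The time alignment mentioned above can be sketched as a cross-correlation of the two TI time histories; estimate_delay is an illustrative name, and the method assumes enough motion variation for a clear correlation peak.

    # Sketch: estimate video delay by correlating SI or TI time histories.
    import numpy as np

    def estimate_delay(ti_source, ti_dest):
        """Frame lag of the destination features relative to the source."""
        a = ti_source - ti_source.mean()
        b = ti_dest - ti_dest.mean()
        xcorr = np.correlate(b, a, mode='full')
        return int(np.argmax(xcorr)) - (len(a) - 1)  # positive = dest lags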
The real-time system is currently in the development stage. It consists of one PC containing two image processing cards. One card processes the source video and the other processes the destination video. Our laboratory digital communication circuit consists of one codec with the coder output looped back to the decoder input. A camera supplies live source video.
[Figure 7 is a block diagram: Source Video passes through the transmission channel (Encoder, Digital Communication Circuit, Decoder) to become Destination Video. Parameter Measurement Equipment at each end computes SI & TI from frame differences (On − On-1 at the source, Dn − Dn-1 at the destination); the features are exchanged over a data channel (≤ 2400 baud), and the destination end computes the alignment and m1, m2, m3, giving ŝ = c0 + c1m1 + c2m2 + c3m3.]

Figure 7. Real-time System Implementation
To make the measurements as efficient as possible, we are examining several alternate computations of SI and TI. These alternate computations are as follows (illustrative sketches of both follow the list):
1) The true Sobel filter is replaced with a pseudo-Sobel filter for the computation of the SI feature in Equation (2). The true Sobel filter convolves the image with two convolution kernels. One detects horizontal edges, and the other detects vertical edges. The final result is the square root of the sum of the squares of the individual convolutions. This calculation requires two passes of the image on the image processing boards we are using. The pseudo-Sobel uses the same two kernels as the true Sobel, but the final result is simply the sum of the absolute values of the individual convolutions. This calculation is twice as fast as the true Sobel calculation because it can be done in a single pass over the image. In addition, we have conducted investigations showing that the SI feature can be temporally sub-sampled at a rate as low as 3 times per second (every 10th NTSC video frame) without appreciably affecting the results.
2) The mean of the absolute value of the difference image (see Figure 7) is substituted for the TI feature in Equation (4). This modified TI measure is simpler to compute in real time since it is easier to accumulate pixel values than it is to accumulate the squares of pixel values. Investigations have shown that this new TI measure contains quality information similar to that of the original TI measure in Equation (4). The new TI measure is also spatially sub-sampled by a factor of 2 in the horizontal direction and 2 in the vertical direction (sampled at 2 times sub-carrier rather than 4 times sub-carrier and applied to only one field of the NTSC frame).
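Both alternates are easy to express; the sketches below use the same hypothetical frame arrays as earlier. First, the pseudo-Sobel variant of SI (note that the single-pass speed advantage applies to the image processing boards described above, not to this NumPy sketch):

    # Sketch of the pseudo-Sobel SI: sum of absolute directional responses
    # instead of the root-sum-of-squares edge magnitude.
    import numpy as np
    from scipy import ndimage

    def si_pseudo_sobel(frame):
        f = frame.astype(float)
        gx = ndimage.sobel(f, axis=1)
        gy = ndimage.sobel(f, axis=0)
        return (np.abs(gx) + np.abs(gy)).std()

Second, the modified TI: mean absolute frame difference with a 2x2 spatial sub-sampling (the field handling of the NTSC frame is omitted from this sketch):

    # Sketch of the modified TI measure.
    def ti_mean_abs(frames):
        f = np.asarray(frames, dtype=float)[:, ::2, ::2]  # 2x sub-sample each axis
        return np.abs(f[1:] - f[:-1]).mean(axis=(1, 2))   # mean |difference|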
The effects of the above changes to SI and TI on the video quality metrics m1, m2, and m3 are currently being investigated. The real-time video quality system provides a means for rapidly testing the performance of these alternate video quality metrics. Future uses for the real-time system include developing and testing other video quality metrics in the laboratory, and measuring end-to-end video quality in the field for applications such as video teleconferencing and the delivery of digital video to the home.
5. CONCLUSIONS
Based on the results of this experiment, it appears that the human visual system may be perceiving spatial and temporal distortions in video by performing differences in space and time, and by comparing these quantities to a reference. The ITS video quality algorithm used the original video scene as a source of reference features. These video quality reference features extracted from the original video scene can be communicated at a low bit rate. This seems to indicate that the perceptual bandwidth of the human visual system is much smaller than the bandwidth of the video signal itself.
