`stream to guide level selection in a predictive quantizer. The data stream is chosen so that the resulting
`image looks like quantization noise. A variation on this scheme is also presented, where a watermark in the
`form of a dithering matrix is used to dither an image in a certain way. There are several drawbacks to these
`schemes. The most important is that they are susceptible to signal processing, especially requantization, and
geometric attacks such as cropping. Furthermore, they degrade an image in the same way that predictive
`coding and dithering can.
`In [TNM90], the authors also propose a scheme for watermarking facsimile data. This scheme shortens
`or lengthens certain runs of data in the run length code used to generate the coded fax image. This
`proposal is susceptible to digital-to-analog and analog-to-digital attacks. In particular, randomizing the
`LSB of each pixel's intensity will completely alter the resulting run length encoding. Tanaka et al also
`propose a watermarking method for "color-scaled picture and video sequences". This method applies the
`same signal transform as JPEG (DCT of 8 x 8 sub-blocks of an image) and embeds a watermark in the
`coefficient quantization module. While being compatible with existing transform coders, this scheme is quite
`susceptible to requantization and filtering and is equivalent to coding the watermark in the least significant
`bits of the transform coefficients.
`In a recent paper, Macq and Quisquater [MQ95] briefly discuss the issue of watermarking digital images
`as part of a general survey on cryptography and digital television. The authors provide a description of
`a procedure to insert a watermark into the least significant bits of pixels located in the vicinity of image
`contours. Since it relies on modifications of the least significant bits, the watermark is easily destroyed.
`Further, their method is restricted to images, in that it seeks to insert the watermark into image regions
`that lie on the edge of contours...
Bender et al [BGM95] describe two watermarking schemes. The first is a statistical method called
"Patchwork" that somewhat resembles the statistical component of our proposal. Patchwork randomly
chooses n pairs of image points, (ai, bi), and increases the brightness at ai by one unit while correspondingly
decreasing the brightness of bi. The expected value of the sum of the differences of the n pairs of points
`is then claimed to be 2n, provided certain statistical properties of the image are true. In particular, it is
`assumed that all brightness levels are equally likely, that is, intensities are uniformly distributed. However,
`in practice, this is very uncommon. Moreover, the scheme may (1) not be robust to randomly jittering the
`intensity levels by a single unit, and (2) be extremely sensitive to geometric affine transformations.
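The Patchwork statistic is easy to simulate. The following sketch is our own illustration, not Bender et al's implementation; the image size, the value of n, and the unit step are assumed for the example. It embeds the ±1 offsets and checks that the pairwise-difference sum shifts by exactly 2n relative to the unmarked image:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(np.int32)

# Choose n disjoint pairs of points (a_i, b_i) pseudorandomly.
n = 500
flat = rng.choice(64 * 64, size=2 * n, replace=False)
points = [divmod(int(p), 64) for p in flat]
pairs = list(zip(points[:n], points[n:]))

# Embed: brightness + 1 at a_i, brightness - 1 at b_i.
marked = img.copy()
for a, b in pairs:
    marked[a] += 1
    marked[b] -= 1

def statistic(image, pairs):
    """Sum of differences over the n pairs; shifts by 2n after embedding."""
    return sum(int(image[a]) - int(image[b]) for a, b in pairs)
```

Note that the shift of 2n is exact only relative to the unmarked image; the absolute value of the detector depends on the image statistics, which is precisely the uniformity assumption criticized above.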
`The second method is called "texture block coding", wherein a region of random texture pattern found in
`
`
`DISH - Blue Spike-408
`Exhibit 1010, Page 0613
`
`
`
`the image is copied to an area of the image with similar texture. Autocorrelation is then used to recover each
`texture region. The most significant problem with this technique is that it is only appropriate for images
`that possess large areas of random texture. The technique could not be used on images of text, for example.
`Nor is there a direct analog for audio.
Digimarc Corporation of Portland, Oregon, describes a method that adds or subtracts small random
`quantities from each pixel. Addition or subtraction is determined by comparing a binary mask of L bits
`with the LSB of each pixel. If the LSB is equal to the corresponding mask bit, then the random quantity
is added, otherwise it is subtracted. The watermark is extracted by first computing the difference between
the original and watermarked images and then by examining the sign of the difference, pixel by pixel, to
`determine if it corresponds to the original sequence of additions and subtractions. The Digimarc method
`does not make use of perceptual relevance and is probably equivalent to adding high frequency noise to the
`image. As such, it may not be robust to low pass filtering.
Koch, Rindfrey and Zhao [KRZ94] propose two general methods for watermarking images. The first
`method, attributed to Scott Burgett, breaks up an image into 8 x 8 blocks and computes the Discrete Cosine
`Transform (DCT) of each of these blocks. A pseudorandom subset of the blocks is chosen, then, in each
`such block, a triple of frequencies is selected from one of 18 predetermined triples and modified so that their
`relative strengths encode a 1 or 0 value. The 18 possible triples are composed by selection of three out of eight
`predetermined frequencies within the 8 x 8 DCT block. The choice of the 8 frequencies to be altered within
`the DCT block is based on a belief that the "middle frequencies . . . have moderate variance", i.e. they have
similar magnitude. This property is needed in order to allow the relative strength of the frequency triples
`to be altered without requiring a modification that would be perceptually noticeable. Superficially, this
`scheme is similar to our own proposal and, in fact, also draws analogy with spread spectrum communication.
`However, the structure of their watermark is different from ours. The set of frequencies is not chosen based
`on any perceptual significance or relative energy considerations. Further, because the variance between
`the eight frequency coefficients is small, one would expect that their technique may be sensitive to noise or
`distortions. This is supported by the experimental results which report that the "embedded labels are robust
`against JPEG compression for a quality factor as low as about 50%". By comparison, we demonstrate that
`our method performs well with compression quality factors as low as 5%. An earlier proposal by Koch and
Zhao [KZ95] used not triples of frequencies but pairs of frequencies, and was again designed specifically for
robustness to JPEG compression. Nevertheless, they state that "a lower quality factor will increase the
likelihood that the changes necessary to superimpose the embedded code on the signal will be noticeably
visible".
`In a second method, designed for black and white images, no frequency transform is employed. Instead,
`the selected blocks are modified so that the relative frequency of white and black pixels encodes the final
`value. Both watermarking procedures are particularly vulnerable to multiple document attacks. To protect
`against this, Zhao and Koch propose a distributed 8 x 8 block created by randomly sampling 64 pixels from
`the image. However, the resulting DCT has no relationship to that of the true image and consequently may
`be likely to cause noticeable artifacts in the image and be sensitive to noise.
`In addition to direct work on watermarking images, there are several works of interest in related areas.
`Adelson [Ade90] describes a technique for embedding digital information in an analog signal for the purpose
`of inserting digital data into an analog TV signal. The analog signal is quantized into one of two disjoint
`ranges, ({0, 2, 4 .. .},{1, 3, 5 .. .}, for example) which are selected based on the binary digit to be transmitted.
`Thus Adelson's method is equivalent to watermark schemes that encode information into the least significant
`bits of the data or its transform coefficients. Adelson recognizes that the method is susceptible to noise and
`therefore proposes an alternative scheme wherein a 2 x 1 Hadamard transform of the digitized analog signal
`is taken. The differential coefficient of the Hadamard transform is offset by 0 or 1 unit prior to computing
`the inverse transform. This corresponds to encoding the watermark into the least significant bit of the
`differential coefficient of the Hadamard transform. It is not clear that this approach would demonstrate
`enhanced resilience to noise. Furthermore, like all such least significant bit schemes, an attacker can eliminate
`the watermark by randomization.
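To see why every least-significant-bit scheme fails under such attacks, consider this sketch (our own illustration; sizes and seed are arbitrary): re-randomizing every LSB changes no pixel by more than one intensity level, yet reduces detection to chance.

```python
import numpy as np

rng = np.random.default_rng(1)
pixels = rng.integers(0, 256, size=10_000, dtype=np.uint8)
bits = rng.integers(0, 2, size=pixels.size, dtype=np.uint8)

# Embed: overwrite each pixel's least significant bit with a watermark bit.
marked = (pixels & 0xFE) | bits

# Attack: re-randomize every LSB -- imperceptible, at most one level of change.
attacked = (marked & 0xFE) | rng.integers(0, 2, size=pixels.size, dtype=np.uint8)

# Detection now does no better than coin flipping.
agreement = float(np.mean((attacked & 1) == bits))
```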
Schreiber et al [SLAN91] describe a method to interleave a standard NTSC signal within an enhanced
`definition television (EDTV) signal. This is accomplished by analyzing the frequency spectrum of the EDTV
`signal (larger than that of the NTSC signal) and decomposing it into three sub-bands (L,M,H for low, medium
`and high frequency respectively). In contrast, the NTSC signal is decomposed into two subbands, L and
M. The coefficients, Mk, within the M band are quantized into m levels and the high frequency coefficients,
`Hk, of the EDTV signal are scaled such that the addition of the Hk signal plus any noise present in the
`system is less than the minimum separation between quantization levels. Once more, the method relies on
`modifying least significant bits. Presumably, the mid-range rather than low frequencies were chosen because
`these are less perceptually significant. In contrast, the method proposed here modifies the most perceptually
`significant components of the signal.
`Finally, it should be noted that many, if not all, of the prior art protocols are not collusion resistant.
`
`
`
`
`
[Figure: Watermarked Image or Sound → Transmission → Typical Distortions or Intentional Tampering → Transmission → Corrupted Watermarked Image or Sound]
`Figure 1: Common processing operations that a media document could undergo
`
`3 Watermarking in the Frequency Domain
`
In this section, we first discuss how common signal distortions affect the frequency spectrum of a signal.
This analysis supports our contention that a watermark must be placed in perceptually significant regions
of a signal if it is to be robust. Section 3.2 proposes inserting a watermark into the perceptually most
significant components of the spectrum using spread spectrum techniques.
`
`3.1 Common signal distortions and their effect on the frequency spectrum of a
`signal
`
`In order to understand the advantages of a frequency-based method, it is instructive to examine the processing
`stages that an image (or sound) may undergo in the process of copying, and to study the effect that these
`stages could have on the data, as illustrated in Figure 1. In the figure, "transmission" refers to the application
`of any source or channel code, and/or standard encryption technique to the data. While most of these steps
`
`
`
`
`
`are information lossless, many compression schemes (JPEG, MPEG etc.) can potentially degrade the data's
`quality, through irretrievable loss of data. In general, a watermarking scheme should be resilient to the
`distortions introduced by such algorithms.
`Lossy compression is an operation that usually eliminates perceptually non-salient components of an
`image or sound. If one wishes to preserve a watermark in the face of such an operation, the watermark
`must be placed in the perceptually significant regions of the data. Most processing of this sort takes place
`in the frequency domain. In fact, data loss usually occurs among the high frequency components. Hence,
`the watermark must be placed in the significant frequency components of the image (or sound) spectrum.
`After receipt, an image may endure many common transformations that are broadly categorized as
geometric distortions or signal distortions. Geometric distortions are specific to images and video, and
`include such operations as rotation, translation, scaling and cropping. By manually determining a minimum
`of four or nine corresponding points between the original and the distorted watermark, it is possible to
`remove any two or three dimensional affine transformation [Fau93]. However, an affine scaling (shrinking)
`of the image leads to a loss of data in the high frequency spectral regions of the image. Cropping, or the
`cutting out and removal of portions of an image, also leads to irretrievable loss of data. Cropping may be a
`serious threat to any spatially based watermark such as [Car95] but is less likely to affect a frequency-based
`scheme, as shown in Section 5.5.
`Common signal distortions include digital-to-analog and analog-to-digital conversion, resampling, re-
`quantization, including dithering and recompression, and common signal enhancements to image contrast
`and/or color, and audio frequency equalization. Many of these distortions are non-linear, and it is difficult
`to analyze their effect in either a spatial or frequency based method. However, the fact that the original
`image is known allows many signal transformations to be undone, at least approximately. For example,
`histogram equalization, a common non-linear contrast enhancement method, may be removed substantially
`by histogram specification [GW93] or dynamic histogram warping [CRH95] techniques.
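The recovery step can be sketched with a simple histogram-matching routine (our minimal illustration; the techniques of [GW93, CRH95] are more refined). Here `equalize` stands in for the attacker's contrast enhancement; because both maps are monotone, specifying the original histogram restores the image almost exactly.

```python
import numpy as np

def match_histogram(source, template):
    """Monotonically remap source so its histogram approximates template's."""
    s_vals, s_idx, s_counts = np.unique(source, return_inverse=True,
                                        return_counts=True)
    t_vals, t_counts = np.unique(template, return_counts=True)
    s_cdf = np.cumsum(s_counts) / source.size
    t_cdf = np.cumsum(t_counts) / template.size
    return np.interp(s_cdf, t_cdf, t_vals)[s_idx].reshape(source.shape)

def equalize(img):
    """Histogram equalization -- the non-linear enhancement to be undone."""
    vals, idx, counts = np.unique(img, return_inverse=True, return_counts=True)
    cdf = np.cumsum(counts) / img.size
    return (255.0 * cdf[idx]).reshape(img.shape)

rng = np.random.default_rng(2)
original = np.clip(rng.normal(128, 25, (64, 64)), 0, 255).astype(np.uint8)
enhanced = equalize(original)
restored = match_histogram(enhanced, original)
```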
`Finally, the copied image may not remain in digital form. Instead, it is likely to be printed, or an analog
`recording made (onto analog audio or video tape). These reproductions introduce additional degradation
`into the image that a watermarking scheme must be robust to.
The watermark must not only be resistant to the inadvertent application of the aforementioned distor-
`tions. It must also be immune to intentional manipulation by malicious parties. These manipulations can
`include combinations of the above distortions, and can also include collusion and forgery attacks.
`
`
`
`
`
`3.2 Spread spectrum coding of a watermark
`
`The above discussion makes it clear that the watermark should not be placed in perceptually insignificant
`regions of the image or its spectrum since many common signal and geometric processes affect these compo-
`nents. For example, a watermark placed in the high frequency spectrum of an image can be easily eliminated
`with little degradation to the image by any process that directly or indirectly performs low pass filtering.
The problem then becomes how to insert a watermark into the most perceptually significant regions of the
spectrum without such alterations becoming noticeable. Clearly, any spectral coefficient may be altered,
`provided such modification is small. However, very small changes are very susceptible to noise.
To solve this problem, the frequency domain of the image or sound at hand is viewed as a communication
`channel, and correspondingly, the watermark is viewed as a signal that is transmitted through it. Attacks
`and unintentional signal distortions are thus treated as noise that the immersed signal must be immune to.
`While we use this methodology to hide watermarks in data, the same rationale can be applied to sending
`any type of message through media data.
`Rather than encode the watermark into the least significant components of the data, we originally con-
`ceived our approach by analogy to spread spectrum communications [PSM82]. In spread spectrum communi-
`cations, one transmits a narrowband signal over a much larger bandwidth such that the signal energy present
`in any single frequency is imperceptible. Similarly, the watermark is spread over very many frequency bins so
`that the energy in any one bin is very small and certainly undetectable. Nevertheless, because the watermark
`verification process knows of the location and content of the watermark, it is possible to concentrate these
`many weak signals into a single signal with high signal-to-noise ratio. However, to destroy such a watermark
`would require noise of high amplitude to be added to all frequency bins.
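This concentration effect is easily illustrated numerically (toy parameters assumed): a mark at 5% of the noise floor in each of 10,000 bins is invisible per bin, yet the correlation detector separates marked from unmarked signals by about five standard deviations.

```python
import numpy as np

rng = np.random.default_rng(3)
n_bins = 10_000
host = rng.normal(0.0, 1.0, n_bins)        # stand-in for spectral coefficients
chip = rng.choice([-1.0, 1.0], n_bins)     # known pseudorandom spreading sequence
alpha = 0.05                               # per-bin mark energy << host energy

received = host + alpha * chip             # watermark spread over all bins

# Correlation concentrates the many weak per-bin contributions:
# without the mark the statistic is ~N(0,1); with it, the mean rises
# to alpha * sqrt(n_bins) = 5.
with_mark = float(received @ chip / np.sqrt(n_bins))
without_mark = float(host @ chip / np.sqrt(n_bins))
```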
`Spreading the watermark throughout the spectrum of an image ensures a large measure of security
`against unintentional or intentional attack: First, the location of the watermark is not obvious. Furthermore,
`frequency regions should be selected in a fashion that ensures severe degradation of the original data following
`any attack on the watermark.
`A watermark that is well placed in the frequency domain of an image or a sound track will be practically
`impossible to see or hear. This will always be the case if the energy in the watermark is sufficiently small
`in any single frequency coefficient. Moreover, it is possible to increase the energy present in particular
`frequencies by exploiting knowledge of masking phenomena in the human auditory and visual systems.
Perceptual masking refers to any situation where information in certain regions of an image or a sound is
`
`
`
`
`
`occluded by perceptually more prominent information in another part of the scene. In digital waveform
`coding, this frequency domain (and, in some cases, time/pixel domain) masking is exploited extensively to
achieve low bit rate encoding of data [PJS93, GG92]. It is clear that both the auditory and visual systems
`attach more resolution to the high energy, low frequency, spectral regions of an auditory or visual scene
[PJS93]. Further, spectrum analysis of images and sounds reveals that most of the information in such data
`is located in the low frequency regions.
`Figure 2 illustrates the general procedure for frequency domain watermarking. Upon applying a frequency
`transformation to the data, a perceptual mask is computed that highlights perceptually significant regions
`in the spectrum that can support the watermark without affecting perceptual fidelity. The watermark
`signal is then inserted into these regions in a manner described in Section 4.2. The precise magnitude of
`each modification is only known to the owner. By contrast, an attacker may only have knowledge of the
`possible range of modification. To be confident of eliminating a watermark, an attacker must assume that
`each modification was at the limit of this range, despite the fact that few such modifications are typically
`this large. As a result, an attack creates visible (or audible) defects in the data. Similarly, unintentional
`signal distortions due to compression or image manipulation, must leave the perceptually significant spectral
`components intact, otherwise the resulting image will be severely degraded. This is why the watermark is
`robust.
`In principle, any frequency domain transform can be used. However, for the experimental results of
`Section 5 we use a Fourier domain method based on the discrete cosine transform (DCT) [Lim90], although
`we are currently exploring the use of wavelet-based schemes as a variation. In our view, each coefficient in the
`frequency domain has a perceptual capacity, that is, a quantity of additional information can be added without
`any (or with minimal) impact to the perceptual fidelity of the data. To determine the perceptual capacity
`of each frequency, one can use models for the appropriate perceptual system or simple experimentation.
`In practice, in order to place a length n watermark into an N x N image, we computed the N x N DCT
`of the image and placed the watermark into the n highest magnitude coefficients of the transform matrix,
excluding the DC component.¹ For most images, these coefficients will be the ones corresponding to the
low frequencies. Reiterating, the purpose of placing the watermark in these locations is that significant
tampering with these frequencies will destroy the image fidelity well before the watermark.
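A simplified, pure-NumPy sketch of this placement (full-frame orthonormal DCT; the image, watermark length n = 50, and scaling factor 0.1 are arbitrary here, and the coefficient update v ← v(1 + 0.1x) anticipates the insertion rules of Section 4.2):

```python
import numpy as np

def dct_matrix(N):
    # Orthonormal DCT-II basis; the inverse transform is the transpose.
    k = np.arange(N)[:, None]
    m = np.arange(N)[None, :]
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * m + 1) * k / (2 * N))
    C[0, :] /= np.sqrt(2.0)
    return C

def embed(image, watermark, alpha=0.1):
    N = image.shape[0]
    C = dct_matrix(N)
    coeffs = (C @ image @ C.T).ravel()
    # n highest-magnitude coefficients, excluding the DC term (flat index 0).
    order = np.argsort(-np.abs(coeffs))
    slots = order[order != 0][:watermark.size]
    coeffs[slots] *= 1.0 + alpha * watermark      # v' = v(1 + alpha * x)
    return (C.T @ coeffs.reshape(N, N) @ C), slots

rng = np.random.default_rng(4)
image = rng.normal(128.0, 20.0, (32, 32))
watermark = rng.normal(0.0, 1.0, 50)
marked, slots = embed(image, watermark)
```

Since the transform is orthonormal, re-taking the DCT of the marked image and dividing by the original coefficients recovers the watermark values exactly.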
`In the next section, we provide a high level discussion of the watermarking procedure, describing the
`
¹More generally, n randomly chosen coefficients could be chosen from the M, M > n most perceptually significant coefficients
`of the transform.
`
`
`
`
`
[Figure: Image X(i,j) → Frequency Transform → Insert Mark → Inverse Frequency Transform → Watermarked Image X'(i,j), with the Watermark Signal w(k) entering at the Insert Mark stage]
`Figure 2: Immersion of the watermark in the frequency domain
`
`structure of the watermark and its characteristics.
`
`4 Structure of the watermark
`
We now give a high-level overview of our basic watermarking scheme; many variations are possible. In its
most basic implementation, a watermark consists of a sequence of real numbers X = x1, ..., xn. In practice,
we create a watermark where each value xi is chosen independently according to N(0, 1) (where N(μ, σ²)
denotes a normal distribution with mean μ and variance σ²). We assume that numbers are represented by a
reasonable but finite precision and ignore these insignificant roundoff errors. Section 4.1 introduces notation
to describe the insertion and extraction of a watermark and Section 4.3 describes how two watermarks (the
`original one and the recovered, possibly corrupted one) can be compared. This procedure exploits the fact
`that each component of the watermark is chosen from a normal distribution. Alternative distributions are
`possible, including choosing xi uniformly from {1, —1}, {0,1} or [0,1]. However, as we discuss in Section 4.5,
`using such distributions leaves one particularly vulnerable to attacks using multiple watermarked documents.
`
`
`
`
`
[Figure: value extraction, value insertion, extraction, differencing, and postprocessing stages relating the documents D, D', D* and the sequences V, V', V*, X*]
`Figure 3: Encoding and decoding of the watermark string
`
`4.1 Description of the watermarking procedure
`
We extract from each document D a sequence of values V = v1, ..., vn, into which we insert a watermark
X = x1, ..., xn to obtain an adjusted sequence of values V' = v'1, ..., v'n. V' is then inserted back into the
document in place of V to obtain a watermarked document D'. One or more attackers may then alter D',
producing a new document D*. Given D and D*, a possibly corrupted watermark X* is extracted and is
compared to X for statistical significance. We extract X* by first extracting a set of values V* = v*1, ..., v*n
from D* (using information about D) and then generating X* from V* and V.
`Frequency-domain based methods for extracting V and V* and inserting V' are given in Section 3. For
`the rest of this section we ignore the manipulations of the underlying documents.
`
`
`
`
`
4.2 Inserting and extracting the watermark
`
When we insert X into V to obtain V' we specify a scaling parameter α which determines the extent to
which X alters V. Three natural formulae for computing V' are:

v'i = vi + αxi    (1)

v'i = vi(1 + αxi)    (2)

v'i = vi e^(αxi)    (3)
`
Equation 1 is always invertible, and Equations 2 and 3 are invertible if vi ≠ 0, which holds in all of our
experiments. Given V* we can therefore compute the inverse function to derive X* from V* and V.
Equation 1 may not be appropriate when the vi values vary widely. If vi = 10⁶ then adding 100 may be
insufficient for establishing a mark, but if vi = 10 adding 100 will distort this value unacceptably. Insertion
based on Equations 2 or 3 is more robust against such differences in scale. We note that Equations 2
and 3 give similar results when αxi is small. Also, when vi is positive, Equation 3 is equivalent to
lg(v'i) = lg(vi) + αxi, and may be viewed as an application of Equation 1 to the case where the logarithms
of the original values are used.
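The three insertion rules and their inverses can be sketched directly (α and the test values are arbitrary); round-tripping recovers X whenever the stated invertibility conditions hold:

```python
import numpy as np

def insert(v, x, alpha, rule):
    if rule == 1:
        return v + alpha * x                  # Equation (1)
    if rule == 2:
        return v * (1.0 + alpha * x)          # Equation (2)
    return v * np.exp(alpha * x)              # Equation (3)

def extract(v, v_prime, alpha, rule):
    if rule == 1:
        return (v_prime - v) / alpha          # always invertible
    if rule == 2:
        return (v_prime / v - 1.0) / alpha    # requires v_i != 0
    return np.log(v_prime / v) / alpha        # requires v_i > 0 here

rng = np.random.default_rng(5)
v = rng.uniform(1.0, 1e6, 1000)               # widely varying magnitudes
x = rng.normal(0.0, 1.0, 1000)
```

Note that rule 1 round-trips numerically even when the vi span several orders of magnitude; the objection to it above is perceptual, not computational.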
`
`4.2.1 Determining multiple scaling parameters
`
A single scaling parameter α may not be applicable for perturbing all of the values vi, since different spectral
components may exhibit more or less tolerance to modification. More generally one can have multiple scaling
parameters α1, ..., αn and use update rules such as v'i = vi(1 + αixi). We can view αi as a relative measure
of how much one must alter vi to alter the perceptual quality of the document. A large αi means that one
can perceptually "get away" with altering vi by a large factor without degrading the document.
There remains the problem of selecting the multiple scaling values. In some cases, the choice of αi may be
based on some general assumption. For example, Equation 2 is a special case of the generalized Equation 1
(v'i = vi + αixi) with αi = αvi. Essentially, Equation 2 makes the reasonable assumption that a large value
is less sensitive to additive alterations than a small value.
In general, one may have little idea of how sensitive the image is to various values. One way of empirically
estimating these sensitivities is to determine the distortion caused by a number of attacks on the original
image. For example, one might compute a degraded image D* from D, extract the corresponding values
v*1, ..., v*n, and choose αi to be proportional to the deviation |v*i − vi|. For greater robustness, one should
`
`
`
`
try many forms of distortion and make αi proportional to the average value of |v*i − vi|. As alternatives to
taking the average deviation one might also take the median or maximum deviation.
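This empirical calibration might be sketched as follows (the attack battery below is hypothetical, chosen only to illustrate averaging the per-component deviations):

```python
import numpy as np

rng = np.random.default_rng(6)
v = rng.uniform(1.0, 100.0, 256)          # values extracted from the original

# Hypothetical battery of distortions, each returning the re-extracted values.
attacks = [
    lambda u: u + rng.normal(0.0, 1.0, u.size),       # additive noise
    lambda u: u * rng.uniform(0.9, 1.1, u.size),      # gain distortion
    lambda u: np.round(u),                            # requantization
]

deviations = np.stack([np.abs(atk(v) - v) for atk in attacks])
alpha = deviations.mean(axis=0)           # alpha_i ~ average |v*_i - v_i|
# alternatives: np.median(deviations, axis=0) or deviations.max(axis=0)
```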
One may combine this empirical approach with general global assumptions about the sensitivity of the
values. For example, one might require that αi ≥ αj whenever vi ≥ vj. One way to combine this constraint
with the empirical approach would be to set αi according to

αi = max_{j : vj ≤ vi} |v*j − vj|.

A still more sophisticated approach would be to weaken the monotonicity constraint to be robust against
occasional outliers.
In all our experiments we simply use Equation 2 with a single parameter α = 0.1. When we computed
`JPEG-based distortions of the original image we observed that the higher energy frequency components
`were not altered proportional to their magnitude (the implicit assumption of Equation 2). We suspect that
`we could make a less obtrusive mark of equal strength by attenuating our alterations of the high-energy
`components and amplifying our alterations of the lower-energy components. However, we have not yet
`performed this experiment.
`
`4.3 Evaluating the similarity of watermarks
`
It is highly unlikely that the extracted mark X* will be identical to the original watermark X. Even the act
of requantizing the watermarked document for delivery will cause X* to deviate from X. We measure the
similarity of X and X* by

sim(X, X*) = (X* · X) / √(X* · X*)    (4)

We argue that large values of sim(X, X*) are significant by the following analysis. Suppose that the creators
of document D* had no access to X (either through the seller or through a watermarked document). Then,
even conditioned on any fixed value for X*, each xi will be independently distributed according to N(0, 1).
The distribution on X* · X may be computed by first writing it as Σi x*i xi, where each x*i is a constant.
Using the well-known formula for the distribution of a linear combination of independent, normally
distributed variables, X* · X will be distributed according to

N(0, Σi (x*i)²) = N(0, X* · X*)

Thus, sim(X, X*) is distributed according to N(0, 1). We can then apply the standard significance tests for
the normal distribution. For example, if X* is created independently from X then it is extremely unlikely
`
`
`
`
`
that sim(X, X*) > 6. Note that slightly higher values of sim(X, X*) may be required when a large number
of watermarks are on file.
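A quick numerical check of this test (sizes and noise level assumed): a noisy extraction of the true mark scores far above the significance threshold of 6, while a mark from an unrelated document stays within the N(0, 1) range.

```python
import numpy as np

def sim(x_star, x):
    return float(x_star @ x) / float(np.sqrt(x_star @ x_star))   # Equation (4)

rng = np.random.default_rng(7)
n = 1000
x = rng.normal(0.0, 1.0, n)                    # the registered watermark

x_noisy = x + rng.normal(0.0, 0.5, n)          # corrupted extraction of the mark
x_unrelated = rng.normal(0.0, 1.0, n)          # extraction from an unmarked image

match = sim(x_noisy, x)         # concentrates near sqrt(n / 1.25), about 28
no_match = sim(x_unrelated, x)  # distributed as N(0, 1)
```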
`
`4.3.1 Robust statistics
`
The above analysis required only the independence of X from X*, and did not rely on any specific properties
of X* itself. This fact gives us further flexibility when it comes to preprocessing X*. We can process X* in a
number of ways to potentially enhance our ability to extract a watermark. For example, in our experiments
on images we encountered instances where the average value of x*i, denoted Ei(X*), differed substantially
from 0, due to the effects of a dithering procedure. While this artifact could be easily eliminated as part of
the extraction process, it provides a motivation for postprocessing extracted watermarks. We found that the
simple transformation x*i ← x*i − Ei(X*) yielded superior values of sim(X, X*). The improved performance
resulted from the decreased value of X* · X*; the value of X* · X was only slightly affected.
In our experiments we frequently observed that x*i could be greatly distorted for some values of i. One
postprocessing option is to simply ignore such values, setting them to 0. That is,

x*i ← 0 if |x*i| > tolerance, and x*i otherwise.

Again, the goal of such a transformation is to lower X* · X*. A less abrupt version of this approach is to
normalize the x*i values to be either −1, 0 or 1, by

x*i ← sign(x*i − Ei(X*)).
`
This transformation can have a dramatic effect on the statistical significance of the result. Other robust
statistical techniques could also be used to suppress outlier effects [Hub81].
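The mean-removal and outlier-zeroing steps can be sketched together (the tolerance of 3 and the injected distortion are assumed for illustration); removing the offset and zeroing gross outliers markedly raises the similarity score:

```python
import numpy as np

def sim(x_star, x):
    return float(x_star @ x) / float(np.sqrt(x_star @ x_star))

def postprocess(x_star, tolerance=3.0):
    centered = x_star - x_star.mean()          # remove the E_i(X*) offset
    return np.where(np.abs(centered) > tolerance, 0.0, centered)

rng = np.random.default_rng(8)
n = 1000
x = rng.normal(0.0, 1.0, n)
x_star = x.copy()
x_star[:20] += 50.0                            # a few grossly distorted components

raw = sim(x_star, x)
cleaned = sim(postprocess(x_star), x)          # outliers no longer inflate X*.X*
```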
A natural question is whether such postprocessing steps run the risk of generating false positives. Indeed,
the same potential risk occurs whenever there is any latitude in the procedure for extracting X* from D*.
However, as long as the method for generating a set of values for X* depends solely on D and D*, our
statistical significance calculation is unaffected. The only caveat to be considered is that the bound on the
probability that one of X*1, ..., X*r generates a false positive is the sum of the individual bounds. Hence, to
convince someone that a watermark is valid, it is necessary to have a published and rigid extraction and
processing policy that is guaranteed to only generate a small number of candidate X*.
`
`
`
`
`
`4.4 Choosing the length, n, of the watermark
`
The choice of n dictates the degree to which the watermark is spread out among the relevant components of
the image. In general, as the number of altered components is increased, the extent to which they must be
altered decreases. For a more quantitative assessment of this tradeoff, we consider watermarks of the form
v'i = vi + αxi and model a white noise attack by v*i = v'i + ri, where the ri are chosen according to independent
normal distributions with standard deviation σ. For the watermarking procedure described above, one can
recover the watermark when α is proportional to σ/√n. That is, by quadrupling the number of components
used one can halve the magnitude of the watermark placed into each component. Note that the sum of
squares of the deviations will be essentially unchanged.
`However, when one increases the number of components used there is a point of diminishing returns at
`which the new components are randomized by trivial alterations in the image. Hence they will not be useful
`for storing watermark information. Thus the best choice of n is ultimately document-specific.
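This tradeoff can be verified empirically (all parameters are assumed for illustration): quadrupling n while halving α leaves the detection statistic essentially unchanged under the same white-noise attack.

```python
import numpy as np

def detection(n, alpha, sigma, rng):
    x = rng.normal(0.0, 1.0, n)
    v = rng.uniform(10.0, 100.0, n)
    v_marked = v + alpha * x                          # v'_i = v_i + alpha x_i
    v_attacked = v_marked + rng.normal(0.0, sigma, n) # white-noise attack
    x_star = (v_attacked - v) / alpha
    return float(x_star @ x) / float(np.sqrt(x_star @ x_star))

rng = np.random.default_rng(9)
trials = 50
s_small = np.mean([detection(1000, 0.4, 1.0, rng) for _ in range(trials)])
s_large = np.mean([detection(4000, 0.2, 1.0, rng) for _ in range(trials)])
```

Both configurations keep n·α² constant, and both average a detection statistic of roughly 12 against a threshold of 6.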
`
`4.5 Resilience to multiple-document (collusion) attacks
`
The most general attack consists of using t multiple watermarked copies D'1, ..., D't of document D to
produce an unwatermarked document D*. We note that most schemes proposed seem quite vulnerable
`to such attacks. As a theoretical exception, Boneh and Shaw [BS95] propose a coding scheme for use in
`situations in which one can insert many relatively weak 0/1 watermarks into a document. They assume that
`if the ith watermark is the same for all t copies of the document then it cannot be detected, changed or
removed. Using their coding scheme the number of weak watermarks to be inserted scales according to t⁴,
which may limit its usefulness in practice.
To illustrate the power of multiple-document attacks, consider watermarking schemes in which v'i is
generated by either adding 1 or −1 at random to vi. Then as soon as one finds two documents with unequal
values for v'i one can determine vi and hence completely eliminate this component of the watermark. With t
documents one can, on average, eliminate all but a 2^(1−t) fraction of the components of the watermark. Note
that this attack does not assume anything about the distribution of vi. While a more intelligent allocation
of ±1 values to the watermarks (following [LM93, BS95]) will better resist this simple attack, the discrete
nature of the watermark components makes them much easier to completely eliminate. Our use of continuous-
valued watermarks appears to give greater resilience to such attacks. Interestingly, we have experimentally
determined that if one chooses the xi uniformly over some range, then one can remove the watermark using
`
`
`
`
`
`only 5 documents.
We assume an idealized scenario in which one can analyze attacks on multiple versions of a watermarked
document. Let xij denote the ith component of the jth document, and let ξi = Σj xij and zi = v*i − vi. We
assume that xij is independent