Audio Watermarking for Monitoring and Copy Protection
`
Jaap Haitsma, Michiel van der Veen, Ton Kalker and Fons Bruekers
Philips Research Laboratories
Prof. Holstlaan 4
5656 AA Eindhoven, The Netherlands
[jaap.haitsma][michiel.van.der.veen][ton.kalker][fons.bruekers]@philips.com
`
ABSTRACT
Based on existing technology used in image and video watermarking,
we have developed a robust audio watermarking technique. The
embedding algorithm operates in the frequency domain, where the
magnitudes of the Fourier coefficients are slightly modified. In the
temporal domain, an additional scale parameter and gain function are
necessary to refine the watermark and achieve perceptual
transparency. Watermark detection relies on the Symmetrical Phase
Only Matched Filtering (SPOMF) cross-correlation approach. Not only
the presence of a watermark, but also its cyclic shift is detected.
This shift supports a multi-bit payload for one particular watermark
sequence. The watermarking technology proved to be very robust to a
large number of signal processing "attacks" such as MP3 (64 kb/s),
all-pass filtering, echo addition, time-scale modification,
resampling, noise addition, etc. It is expected that this approach
may contribute to a wide variety of existing (e.g. monitoring and
copy protection) and future applications.
`
Keywords
audio, broadcast monitoring, copy protection, watermark embedding,
watermark detection
`
1. INTRODUCTION
A digital audio watermark is an information label that is embedded
in an audio signal in an imperceptible manner. During the past few
years a number of new audio watermarking techniques have been
developed to support applications such as copy control [1] [2] or
broadcast monitoring [3]. Most of these operate in the time domain
and employ methods such as echo-hiding [4] or some kind of noise
addition, exploiting temporal and/or spectral masking models of the
human auditory system [5] [6].
`
Based on image and video watermarking techniques [3] [7] we have
developed an alternative approach to audio watermarking. Similar to
the work of Piva et al. [2], watermark embedding is performed in the
frequency domain. The principles of spectral masking are exploited
in a relatively simple manner by slightly modifying the magnitudes
of the Fourier coefficients. The embedding algorithm is complemented
with a detection procedure adapted from cross-correlation techniques
used in image registration [9] and video watermarking [3] [8]. The
combination of both algorithms offers several advantages in terms of
robustness to some trivial signal processing "attacks" (e.g.
all-pass filtering). In this paper, we introduce both the embedding
and the detection algorithm and briefly discuss some key aspects
such as payload, perceptual transparency, robustness and detection
reliability.

Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies
are not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, to republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
ACM Multimedia Workshop Marina Del Rey CA USA
Copyright ACM 2000 1-58113-311-1/00/11...$5.00
`
2. EMBEDDING
A sketch of our watermark embedding algorithm is displayed in
Figure 1. A random watermark sequence $W(k)$ is drawn from a normal
distribution with mean 0 and standard deviation 1. A cyclically
shifted version $W_s(k)$ is used to achieve a multi-bit payload for
one particular watermark sequence $W(k)$. Every possible shift may
be associated with a different information label. Therefore, the
payload is determined directly by the watermark size: $N$ possible
shifts carry at most $\log_2 N$ bits (e.g. a 1024-sample watermark
corresponds to a maximum payload of 10 bits).
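The relation between watermark length, cyclic shift and payload can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names (`make_watermark`, `cyclic_shift`, `payload_bits`), the seed and the shift convention $W_s(k) = W((k - s) \bmod N)$ are assumptions made for the example.

```python
import math
import random


def make_watermark(n, seed=0):
    """Draw n samples from a normal distribution with mean 0 and std 1."""
    rng = random.Random(seed)  # the seed is an arbitrary, hypothetical choice
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def cyclic_shift(w, s):
    """Return W_s with W_s[k] = W[(k - s) mod N] (assumed shift convention)."""
    n = len(w)
    return [w[(k - s) % n] for k in range(n)]


def payload_bits(n):
    """Whole bits that n distinguishable shifts can carry: floor(log2 n)."""
    return int(math.floor(math.log2(n)))


w = make_watermark(1024)
w_s = cyclic_shift(w, 300)    # shift 300 stands for one of 1024 possible labels
print(payload_bits(len(w)))   # 10
```

Each of the 1024 shifts identifies one label, which is why the payload grows with the logarithm of the watermark length.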
`
The dominant part of the perceptually weighted watermark $w(n)$ is
derived in the Fourier domain, where spectral masking is exploited
in a relatively simple manner. First, the audio signal $x(n)$ is
segmented into frames and transformed to the frequency domain. Here,
the magnitudes of its Fourier coefficients are slightly modified by
utilizing the shifted watermark sequence $W_s(k)$:

$$W'_i(k) = W_s(k)\, X_i(k), \tag{1}$$

where $i$ indicates the frame number, $X_i(k)$ the spectral
representation of the frame $x_i(n)$, and $W'_i(k)$ the resulting
frequency domain watermark. Note that the frame size is a trade-off
between perceptual transparency (small frame sizes) and detection
reliability (large frame sizes). Several experiments have
demonstrated that, in general, frame sizes of 2048 samples provide a
good compromise in this trade-off.
`
Inverse Fourier transforms $\mathcal{F}^{-1}$ are used to
reconstruct the time-domain watermark representation $w(n)$. Shaping
the watermark in the frequency domain (Equation 1) is not sufficient
to assure perceptual transparency. Since fixed-length Fourier
transforms do not provide accurate time localization, watermarks
computed in the frequency domain will spread in time over the entire
analysis window. This may result in perceptual distortions such as
pre-echoes. Therefore, an additional scale parameter $\alpha$ and
gain function $g(n)$ are introduced to refine the watermark in the
temporal domain:

$$y(n) = x(n) + \alpha\, g(n)\, w(n), \tag{2}$$

where $\alpha$ is the global scale parameter, $g(n)$ a data
dependent gain function with values between 0 and 1, and $y(n)$ the
watermarked audio.

Sony v. MZ Audio
Sony Exhibit 1041
`
Analogous to the frame size, $\alpha$ is also a parameter that
influences the trade-off between perceptual transparency and
detection reliability: very small values of $\alpha$ give perceptual
transparency but low watermark detection reliability, whereas very
large values give high detection reliability but perceptual
distortions. Several informal adaptive up-down listening tests [10]
were performed on a variety of watermarked audio excerpts to extract
critical values of $\alpha$. We found that perceptual transparency
was achieved by selecting $\alpha$ between 0.15 and 0.25, depending
on the audio excerpt.
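As a rough illustration of Equations 1 and 2, the sketch below embeds a watermark into a single frame. Several things here are assumptions that the paper does not specify: the watermark is applied to the positive-frequency bins and mirrored so that $w(n)$ is real, the gain function is set to $g(n) = 1$, the frame is only 64 samples long to keep the example fast, and a small hand-written radix-2 FFT stands in for a library transform. The names `fft`, `ifft` and `embed_frame` are hypothetical.

```python
import cmath
import random


def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def ifft(a):
    """Inverse FFT via the conjugation identity."""
    n = len(a)
    return [z.conjugate() / n for z in fft([z.conjugate() for z in a])]


def embed_frame(x, w_s, alpha=0.2, gain=None):
    """Return y(n) = x(n) + alpha * g(n) * w(n) for one frame (Equation 2)."""
    n, half = len(x), len(x) // 2
    X = fft([complex(v) for v in x])
    Wp = [0j] * n                          # W'_i(k) = W_s(k) X_i(k), Equation 1
    for k in range(half):
        Wp[k] = w_s[k] * X[k]
        if k > 0:
            Wp[n - k] = Wp[k].conjugate()  # mirror so that w(n) is real
    w_time = [z.real for z in ifft(Wp)]
    g = gain if gain is not None else [1.0] * n   # assumed g(n) = 1
    return [x[i] + alpha * g[i] * w_time[i] for i in range(n)]


rng = random.Random(0)
N = 64                                     # the paper reports 2048 samples
x = [rng.uniform(-1.0, 1.0) for _ in range(N)]
w_s = [rng.gauss(0.0, 1.0) for _ in range(N // 2)]
y = embed_frame(x, w_s, alpha=0.2)
```

With $g(n) = 1$, each positive-frequency coefficient of the frame is scaled by $1 + \alpha W_s(k)$, so the magnitudes change slightly and the phases are preserved as long as that factor stays positive. A data-dependent $g(n)$, as described in the text, would additionally limit the watermark where the audio is quiet.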
`
[Block diagram: Gain Function, g(n), y(n)]

Figure 1: Overview of watermark embedding algorithm for digital
audio. $\mathcal{F}$ and $\mathcal{F}^{-1}$ indicate Fourier and
inverse Fourier transforms, respectively.
`
3. DETECTION
Figure 2 gives an overview of the watermark detection algorithm. It
relies on a cross-correlation procedure between the watermark
sequence $W(k)$ and the audio. Experiments revealed that filtering
prior to cross-correlation may improve detection reliabilities
significantly. In our detection algorithm, $y(n)$ is filtered with
the "equalization" filter $d(n)$ according to:

$$q(n) = y(n) * d(n), \tag{3}$$

with filter coefficients $d(n) = [-1 \;\; 2 \;\; -1]$, where $*$
denotes convolution. This signal is segmented into frames and
transformed to the frequency domain to obtain the magnitudes of the
Fourier coefficients:

$$Y_i(k) = |\mathcal{F}(q_i(n))|, \tag{4}$$
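The effect of the three-tap equalization filter can be checked with a short sketch. The helper names `convolve` and `magnitude_response` are hypothetical; the only value taken from the text is the coefficient set $[-1, 2, -1]$.

```python
import cmath
import math


def convolve(x, d):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(d) - 1)
    for i, xv in enumerate(x):
        for j, dv in enumerate(d):
            out[i + j] += xv * dv
    return out


def magnitude_response(d, omega):
    """|D(e^{j omega})| of an FIR filter with coefficients d."""
    return abs(sum(dv * cmath.exp(-1j * omega * n) for n, dv in enumerate(d)))


d = [-1.0, 2.0, -1.0]
print(convolve([1.0, 1.0, 1.0, 1.0], d))          # [-1.0, 1.0, 0.0, 0.0, 1.0, -1.0]
print(magnitude_response(d, 0.0))                 # 0.0
print(round(magnitude_response(d, math.pi), 6))   # 4.0
```

The response equals $2 - 2\cos\omega$, so the filter removes the constant part of the signal and emphasizes high frequencies before the frames are transformed.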
`
where $\mathcal{F}$ indicates a Fourier transform operation. For
each individual frame, the magnitudes of the Fourier coefficients
$Y_i(k)$ need to be cross-correlated with every possible shifted
version of $W(k)$ to extract the payload. Such a cross-correlation
is calculated most efficiently using Fourier transformed signals:

$$Y_{i,F} = \mathcal{F}(Y_i(k)), \quad \text{and} \quad W_F = \mathcal{F}(W(k))^{*}. \tag{5}$$

The traditional cross-correlation may then be written as:

$$C_i = \mathcal{F}^{-1}(Y_{i,F} \cdot W_F), \tag{6}$$

where $C_i$ is the cross-correlation function. Similar to detection
procedures in video watermarking [3], the detection performance may
be enhanced by using the Symmetrical Phase Only Matched Filtering
approach (SPOMF; [9]). In this cross-correlation procedure, only
phase information of the signals $Y_{i,F}$ and $W_F$ is used:

$$C_i = \mathcal{F}^{-1}(P(Y_{i,F}) \cdot P(W_F)), \tag{7}$$

where $P$ is a phase-only operation with $P(x) = x/|x|$ for
$x \neq 0$ and $P(0) = 1$. To improve detection reliability even
further, $C_i$ is accumulated over a period of time:
$C_{sum} = \sum_i C_i$. Since $C_{sum}$ is distributed normally, its
components may be normalized to the standard deviation $\sigma$:

$$C'_{sum} = \frac{C_{sum}}{\sigma(C_{sum})}, \tag{8}$$

where $C'_{sum}$ is the normalized cross-correlation function. Its
peak value, expressed in standard deviations $\sigma$, is related
directly to the detection reliability, whereas its position
corresponds to the cyclic shift (payload).
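The SPOMF steps described above (phase-only correlation, accumulation over frames, normalization and peak search) can be sketched as follows. This is an illustration under stated assumptions, not the authors' detector: the frame magnitudes are simulated as a random positive spectrum multiplied by $1 + 0.2\,W_s(k)$ instead of being computed from real audio, the watermark has 256 samples, and a hand-written radix-2 FFT keeps the example free of dependencies. All function names are hypothetical.

```python
import cmath
import math
import random


def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def ifft(a):
    """Inverse FFT via the conjugation identity."""
    n = len(a)
    return [z.conjugate() / n for z in fft([z.conjugate() for z in a])]


def phase_only(z):
    """P(x) = x / |x| for x != 0 and P(0) = 1."""
    m = abs(z)
    return z / m if m > 0 else 1 + 0j


def spomf(y_mag, w):
    """Phase-only cross-correlation of one frame with the watermark."""
    y_f = fft([complex(v) for v in y_mag])
    w_f = [z.conjugate() for z in fft([complex(v) for v in w])]
    prod = [phase_only(a) * phase_only(b) for a, b in zip(y_f, w_f)]
    return [z.real for z in ifft(prod)]


def detect(frames, w):
    """Accumulate C_i over frames, normalize by sigma, return (shift, peak)."""
    c_sum = [0.0] * len(w)
    for y_mag in frames:
        c_sum = [a + b for a, b in zip(c_sum, spomf(y_mag, w))]
    mean = sum(c_sum) / len(c_sum)
    sigma = math.sqrt(sum((v - mean) ** 2 for v in c_sum) / len(c_sum))
    c_norm = [v / sigma for v in c_sum]
    peak = max(c_norm)
    return c_norm.index(peak), peak


rng = random.Random(1)
N, true_shift = 256, 37
w = [rng.gauss(0.0, 1.0) for _ in range(N)]
w_s = [w[(k - true_shift) % N] for k in range(N)]
# Simulated frame magnitudes: random spectrum scaled by (1 + alpha * W_s(k)).
frames = [[rng.uniform(0.2, 1.8) * (1 + 0.2 * w_s[k]) for k in range(N)]
          for _ in range(20)]
shift, peak = detect(frames, w)
print(shift, round(peak, 1))
```

The position of the largest normalized value gives the cyclic shift, and its height in units of $\sigma$ is the quantity compared against the detection threshold.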
`
The detection reliability depends strongly on the number of
accumulated frames. In general, cross-correlation functions $C_i$
need to be added over a period of 2 to 5 sec to exceed a detection
threshold of $5\sigma$. This corresponds to a false alarm
probability of $2.9 \cdot 10^{-7}$. Figure 3 displays a typical
cross-correlation function $C'_{sum}$. In this example, a peak value
of $\approx 13\sigma$ (false alarm probability of
$6.3 \cdot 10^{-39}$) is detected at position 512.
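The link between a threshold in standard deviations and a false alarm probability follows from the one-sided tail of a normal distribution, $Q(T) = \tfrac{1}{2}\,\mathrm{erfc}(T/\sqrt{2})$. The sketch below assumes, as the text does, that the correlation values are normally distributed; the function name `false_alarm_probability` is hypothetical.

```python
import math


def false_alarm_probability(threshold_sigma):
    """One-sided Gaussian tail probability Q(T) for a threshold of T sigma."""
    return 0.5 * math.erfc(threshold_sigma / math.sqrt(2.0))


for t in (5, 11, 13):
    print(t, false_alarm_probability(t))
```

For a threshold of 5 this evaluates to about $2.9 \cdot 10^{-7}$, and for a peak of 13 the result is on the order of $10^{-39}$.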
`
[Block diagram: y(n), Equalization, W(k), Payload]

Figure 2: Overview of watermark detection
`
4. EXPERIMENTAL RESULTS
In a number of experiments we have examined the robustness of our
audio watermark to a wide variety of signal "attacks". The following
audio excerpts were used: (i) O Fortuna by Carl Orff, (ii) Success
has made a failure of our home by Sinead O'Connor, (iii) Say what
you want by Texas and (iv) She works hard for the money by Donna
Summer. The 20 sec audio fragments were sampled at 44.1 kHz (16 bit,
mono). Based on up-down listening tests (Section 2) we selected
$\alpha = 0.2$ for watermark embedding (Equation 2). All audio
excerpts were subjected to the following processing "attacks":
`
• MP3 Encoding/Decoding at 64 kb/s and 32 kb/s.

• All-pass Filtering using system function:
  $H(z) = (0.81 z^2 - 1.64 z + 1)/(z^2 - 1.64 z + 0.81)$.

• Amplitude Compression with the following amplitude compression
  ratios: 8.94:1 for $|A| > -28.6$ dB; 1.73:1 for
  $-46.4 < |A| < -28.6$ dB; 1:1.61 for $|A| < -46.4$ dB.

• Equalization with a 10-band equalizer where signals within each
  band are suppressed or amplified by 6 dB.

• Echo Addition with a delay and decay of 100 ms and 50%,
  respectively.

• Band-Pass Filtering using a second order Butterworth filter with
  cut-off frequencies 100 Hz and 6000 Hz.

• Time Scale Modification of +4% or -4%, where the pitch is
  unaffected.

• Resampling consisting of subsequent down- and up-sampling to
  22.05 kHz and 44.10 kHz, respectively.

• Noise Addition with uniform white noise with a maximum magnitude
  of 150 quantization steps.

• D/A-A/D Conversions using a commercial analogue tape recorder.

Table 1: Detection reliabilities expressed in standard deviation
$\sigma$.

Attack              | Detection reliability ($\sigma$)
No Processing       | 14.6
MP3 (64 kbit/s)     |
MP3 (32 kbit/s)     | 6.0, 5.6, 6.5, 6.7
All-pass Filtering  | 17.1
Amp. Compr.         | 17.8
Equalization        | 18.2
Echo Addition       | 16.2
Band-Pass Filter    | 14.3
Time Scale +4%      | 17.1
Time Scale -4%      | 16.5
Resampling          | 12.7
Noise Addition      | 16.4
D/A A/D             | 7

[Plot: detection reliability (standard deviations) versus shift of
watermark (W), axis from 0 to 1000]

Figure 3: Example of cross-correlation function $C'_{sum}$
accumulated over a period of 5 sec. Dashed line indicates detection
threshold of $5\sigma$.
`
Processing was performed in MatLab and CoolEdit Pro 1.2. The
detection results were calculated by accumulating cross-correlation
functions $C_i$ (Equation 7) over periods of 5 sec and averaging the
four detection reliabilities.
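Two of the attacks listed above are simple enough to sketch directly. The Python below is an illustration only (the experiments themselves were run in MatLab and CoolEdit Pro): it implements the all-pass system function as a difference equation and a single echo of 100 ms. Interpreting "decay of 50%" as an echo amplitude of 0.5 is an assumption, and the function names are hypothetical.

```python
import cmath

B = [0.81, -1.64, 1.0]   # numerator of H(z), written in powers of z^-1
A = [1.0, -1.64, 0.81]   # denominator of H(z), written in powers of z^-1


def allpass(x):
    """Apply H(z) as a direct-form difference equation."""
    y = []
    for n in range(len(x)):
        acc = sum(B[k] * x[n - k] for k in range(3) if n - k >= 0)
        acc -= sum(A[k] * y[n - k] for k in range(1, 3) if n - k >= 0)
        y.append(acc)
    return y


def allpass_gain(omega):
    """|H(e^{j omega})|, which should equal 1 at every frequency."""
    z1 = cmath.exp(-1j * omega)
    num = B[0] + B[1] * z1 + B[2] * z1 ** 2
    den = A[0] + A[1] * z1 + A[2] * z1 ** 2
    return abs(num / den)


def add_echo(x, delay_samples, decay=0.5):
    """y(n) = x(n) + decay * x(n - delay): a single echo (assumed model)."""
    return [v + (decay * x[n - delay_samples] if n >= delay_samples else 0.0)
            for n, v in enumerate(x)]


delay = round(0.100 * 44100)          # 100 ms at 44.1 kHz
print(delay)                          # 4410
print(add_echo([1, 0, 0, 0], 2))      # [1.0, 0.0, 0.5, 0.0]
```

Because the numerator coefficients are the denominator coefficients in reverse order, the filter changes only the phase of the signal, which is why it leaves the magnitude-based watermark largely untouched.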
`
The results are displayed in Table 1. Unprocessed watermarked audio
excerpts result in typical detection reliabilities between
$\approx 13\sigma$ and $\approx 17\sigma$. MP3 compression at very
low bit-rates (e.g. 32 kb/s) results in measurements close to the
detection threshold of $5\sigma$. The data reveal that detection
reliability is affected only marginally by other signal attacks,
including MP3 compression at 64 kb/s and all-pass filtering. In
general, reliabilities are in the range $11\sigma$ to $17\sigma$,
corresponding to a false alarm probability of at most
$1.9 \cdot 10^{-28}$.
`
5. CONCLUSIONS
Based on existing technology in image and video watermarking, we
have developed new algorithms for embedding and detecting
watermarks in digital audio. Important characteristics of this new
technique were discussed. Key results of this study are:

1. Embedding: The dominant part of the perceptually weighted
   watermark is derived in the frequency domain by slightly
   modifying the magnitudes of the Fourier coefficients. An
   additional scale parameter and time-domain gain function were
   necessary to refine the watermark. The scale parameter may also
   be utilized to tune system characteristics such as perceptual
   transparency and detection reliability.

2. Detection: The SPOMF cross-correlation approach offered a robust
   technology for blind detection of watermarks in digital audio.

3. Robustness: Our watermark algorithm proved to be robust to a
   wide variety of signal processing "attacks" such as MP3
   (64 kb/s), all-pass filtering, echo addition, speed change,
   resampling, noise addition, etc.

With the accomplishments described in this paper, and possible
future developments, it is expected that our audio watermarking
strategy can support a wide variety of existing (monitoring and copy
control) and future applications.
`
6. REFERENCES
[1] E. Koch and J. Zhao, 1995, "Towards robust and hidden image
    copyright labeling", in Nonlinear Signal Processing Workshop,
    Thessaloniki, Greece, pp. 452-455.

[2] A. Piva, M. Barni, and F. Bartolini, 1998, "Copyright
    protection of digital images by means of frequency domain
    watermarking", Proceedings of SPIE, vol. 3456, pp. 25-35.

[3] T. Kalker, G. Depovere, J. Haitsma, and M. Maes, 1999, "A video
    watermarking system for broadcast monitoring", Proceedings of
    IS&T/SPIE/EI25, Security and Watermarking of Multimedia
    Content, vol. 3657, pp. 103-112.

[4] D. Gruhl, W. Bender, and A. Lu, 1996, "Echo-hiding",
    Information hiding: 1st International Workshop, R.J. Anderson,
    Ed., vol. 1174 of Lecture Notes in Computer Science, Isaac
    Newton Institute, England, pp. 295-315.

[5] P. Bassia and I. Pitas, 1998, "Robust audio watermarking in the
    time domain", 9th European Signal Processing Conference
    (EUSIPCO98), Greece, pp. 25-28.

[6] M.D. Swanson, B. Zhu, A.H. Tewfik, and L. Boney, 1998, "Robust
    audio watermarking using perceptual masking", Signal
    Processing, vol. 66, pp. 337-355.

[7] I. Cox, J. Kilian, F.T. Leighton, and T. Shamoon, 1996, "A
    secure, robust watermark for multimedia", in Proc. of the
    Information Hiding: First Int. Workshop, Lecture Notes in
    Computer Science, vol. 1174, R. Anderson, Ed., Springer-Verlag,
    pp. 183-206.

[8] G.F.G. Depovere, T. Kalker, and J.P.M.G. Linnartz, 1998,
    "Improved watermark detection reliability using filtering
    before correlation", Int. Conf. on Image Processing, ICIP,
    Chicago IL.

[9] L.G. Brown, 1992, "A survey of image registration techniques",
    ACM Computing Surveys, vol. 24, pp. 325-376.

[10] H. Levitt, 1970, "Transformed up-down methods in
     psychoacoustics", The Journal of the Acoustical Society of
     America, vol. 49, pp. 467-477.
`