`
*Changsheng Xu, *Jiankang Wu & **David Dagan Feng
*Kent Ridge Digital Labs
21 Heng Mui Keng Terrace
Singapore 119613
{xucs, jiangkang}@krdl.org.sg
**Department of Computer Science
The University of Sydney
NSW, 2006, Australia
feng@cs.usyd.edu.au
`
`Abstract
`
This paper proposes a method for embedding a digital watermark into compressed digital audio and
extracting it again. The watermark is embedded in a partially uncompressed domain, and the embedding
scheme is highly related to the audio content. The watermark contains owner and user identifications,
and embedding and detection are fast enough to support on-line transactions and distribution.
Experimental results show that the embedded watermark neither degrades the audible quality of the
audio nor changes the bit rate in the compressed domain, and that it survives common signal processing
operations such as D/A and A/D conversion, added noise, filtering, re-sampling, and especially
decoding and re-encoding. The proposed method is useful for copyright protection, tracing illegal
distribution, and other applications.
`
`1. Introduction
`
With the development of Internet technology, audio coding, and digital signal processing,
the distribution of compressed digital audio over the Internet is becoming faster and more convenient.
Compression algorithms for digital audio preserve audio quality while dramatically reducing the bit
rate, which lowers the network bandwidth and storage needed for audio content. Among the
compressed digital audio formats currently in use, MP3 is the most popular and is increasingly
favoured by music users. MP3 audio compression is based on psychoacoustic models of the
human auditory system (HAS). It is well suited to distributing high-quality sound files online
because it offers near-CD quality at a compression ratio of about 11 to 1 (128 kb/s).
`
However, the open environment of the Internet makes it easy to distribute privately owned digital
audio and other multimedia products illegally. Preventing this requires both copyright protection and
a means of tracing the sources of illegal distribution. Digital watermarking is one of the emerging
technologies that address these problems. It embeds the copyright information and user identification
directly into the original audio and keeps that information present after all kinds of manipulations.
In general, a watermark inside the audio should be inaudible and robust to different kinds of attacks,
including collusion. Watermark detection must unambiguously identify the owner and find the sources
of illegal distribution.
`
Current digital audio watermarking techniques mainly focus on uncompressed audio. They can be
classified into time-domain techniques (Pitas, 1996; Wolfgang & Delp, 1996), frequency-domain
techniques (Cox et al., 1995; Swanson et al., 1996), and time-frequency-domain techniques
(Swanson et al., 1998). Some of these techniques survive compression-decompression-recompression
processing. One possible way to protect compressed audio is therefore to decompress it, embed the
watermark into the decompressed audio, and then recompress the watermarked audio. This can probably
ensure the robustness of the watermark, but it is too time-consuming because compression takes a
long time. For example, it takes more than 30 minutes to compress five to six minutes of WAV audio
to MP3 at a bit rate of 128 kb/s, so the approach is not suitable for on-line transaction and
distribution. In order to improve
`
`Sony Exhibit 1040
`Sony v. MZ Audio
`
`
`
the embedding speed while maintaining the robustness of the watermark, fast and robust embedding
schemes for compressed audio must be considered. To our knowledge, however, few prior watermarking
methods address compressed audio. In (Sandford et al., 1997), auxiliary information is embedded as a
watermark into the host signal created by a lossy compression technique. This method has low
robustness, since decompressing the compressed audio removes the watermark easily without affecting
the quality of the host audio signal. In (Petitcolas, 1999), a watermarking method for MP3 files
(MP3Stego) is proposed. MP3Stego hides information in MP3 files during the compression process. The
watermark data is first compressed and encrypted, and then hidden in the MP3 bit stream. The hiding
takes place at the heart of the Layer III encoding process, namely in the inner loop. The inner loop
quantizes the input data and increases the quantizer step size until the quantized data can be coded
with the available number of bits. Another loop checks that the distortion introduced by the
quantization does not exceed the threshold defined by the psychoacoustic model. The part2_3_length
variable contains the number of main_data bits used for scalefactors and Huffman code data in the MP3
bit stream. The bits are encoded by changing the end-loop condition of the inner loop. Only randomly
chosen part2_3_length values are modified, and the selection is made with a pseudo-random bit
generator based on SHA-1. This scheme is very weak in robustness: the author acknowledges that any
attacker can remove the hidden watermark information by uncompressing the bit stream and
recompressing it. Moreover, MP3Stego does not embed the watermark directly in the compressed domain.
It operates on PCM audio and embeds the watermark during compression, so it is time-consuming.
`
This paper presents an effective method for protecting copyright and tracing illegal distribution of
compressed digital audio by embedding a digital watermark in a partially uncompressed domain. The
watermarked audio is robust to various kinds of manipulations and attacks, while the embedded
information does not audibly affect the audio quality. Watermark embedding and detection are fast,
and the detected watermark provides proof of copyright and of the distribution source. For copyright
protection, the watermark must contain the owner identification, which is identical in every copy of
the audio content. For tracing illegal distribution, the watermark must contain the user
identification, which is different for each audio transaction. To balance inaudibility against
robustness, a content-adaptive embedding method based on the human auditory system is proposed. With
this method, the watermark is highly related to the audio content and closely follows the masking
threshold of the human auditory system. Watermark embedding increases the data rate very little, so
it does not cause perceptible distortion. Any attempt to remove or distort the watermark, including
re-encoding the audio content, leads to perceptible distortion of the original audio content. Because
the watermark is embedded in a partially uncompressed domain, embedding is very fast.
`
`2. Watermarking Scheme
`
`2.1 Generic Embedding Scheme
`
There are three generic watermark embedding schemes, shown in Figure 1, Figure 2 and Figure 3.
Figure 1 illustrates the generic procedure for embedding a watermark in the uncompressed domain. In
the ideal case, the plain audio (PCM format) is watermarked before compression. In this scheme the
watermark contains only the copyright information, because no user information is available before
distribution, so the scheme cannot trace illegal distribution of the audio content. In practice, a
large amount of compressed audio content without watermarks already exists on music servers and other
media for on-line distribution. Figure 2 and Figure 3 illustrate two schemes for embedding the
watermark into compressed audio. The scheme of Figure 2 embeds the watermark directly in the
compressed domain. This makes embedding very fast, but the watermark is weak: any
decompression-recompression process can easily remove it. Figure 3 illustrates a scheme that improves
robustness. The compressed audio is first decompressed, the watermark is then embedded in the
uncompressed domain, and finally the watermarked content is recompressed to generate the watermarked
compressed audio. This scheme improves the robustness of the watermark, but it is not suitable for
on-line distribution because compression is time-consuming. To protect the copyright of such content
and trace illegal distribution while still supporting on-line transactions, we propose a novel
content-based embedding scheme in this paper. Our scheme takes the audio coding algorithm fully into
account and is highly related to the audio content, so that it achieves an optimal balance between
audio quality and watermark robustness at an embedding speed suitable for on-line distribution.
`
Figure 1: Embedding scheme 1 (labels: Plain Audio; Watermark (Owner ID); Embed Scheme (uncompress);
Compress; Watermarked Compressed Audio)

Figure 2: Embedding scheme 2 (labels: Compressed Audio; Watermark (Owner ID & User ID); Embed Scheme
(compress); Watermarked Compressed Audio)

Figure 3: Embedding scheme 3 (labels: Watermark)
`
`2.2 Content-Based Embedding Scheme
`
To improve the robustness of a watermark embedded into compressed audio while keeping embedding
fast, a content-based watermark embedding scheme is proposed in this section. In this scheme the
watermark is embedded in a partially uncompressed domain and the embedding scheme is highly related
to the audio content. Figure 4 illustrates the block diagram of the content-based watermark embedding
scheme in the partially uncompressed domain.
`
`
`
Figure 4: Content-Based Watermark Embedding Scheme (labels: Compressed Audio; Frame Segmentation;
Frame 1, Frame 2, ..., Frame n; Decode; Feature Extraction; Psychoacoustic Model; Filter Bank;
Selected Frames (Decoded); Embedding Scheme; Watermark; Embedded Frames (Decoded); Re-Encode;
Embedded Frames (Coded); Non-embedded Frames (Coded); Frame Reconstruction; Watermarked Compressed
Audio)
`
The incoming compressed audio is first segmented into frames according to the coding algorithm. All
frames are decoded from the compressed domain to the uncompressed domain. The feature extraction
model and the psychoacoustic model are then applied to each decoded frame to calculate the features
of the audio content and the masking threshold in that frame. Based on the features and the masking
threshold, a pre-designed filter bank selects the candidate frames suitable for watermark embedding.
The watermark is embedded into the selected frames using the adaptive multiple-bit hopping and hiding
scheme depicted in Figure 5. The embedded frames are re-encoded with the coding algorithm to generate
coded frames. Finally, the re-encoded frames and the non-embedded frames are reassembled to generate
the watermarked compressed audio. Compared with embedding in the wholly uncompressed domain, this
scheme not only achieves the same audibility and robustness but also embeds the watermark much
faster, because only the embedded frames are re-encoded. It is suitable for on-line embedding and
distribution.
`
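The frame flow described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the callables `decode`, `encode`, `is_candidate` and `embed_bit` are hypothetical stand-ins for the audio codec, the feature and psychoacoustic selection step, and the bit hiding step.

```python
from typing import Callable, List, Sequence


def embed_partially_uncompressed(
    coded_frames: Sequence[bytes],
    decode: Callable[[bytes], List[float]],            # hypothetical frame decoder
    encode: Callable[[List[float]], bytes],            # hypothetical frame encoder
    is_candidate: Callable[[List[float]], bool],       # hypothetical frame selector
    embed_bit: Callable[[List[float], int], List[float]],  # hypothetical bit hider
    watermark_bits: Sequence[int],
) -> List[bytes]:
    """Re-encode only the frames that receive a watermark bit."""
    out: List[bytes] = []
    next_bit = 0
    for coded in coded_frames:
        pcm = decode(coded)  # every frame is decoded so it can be analyzed
        if next_bit < len(watermark_bits) and is_candidate(pcm):
            marked = embed_bit(pcm, watermark_bits[next_bit])
            out.append(encode(marked))  # only embedded frames are re-encoded
            next_bit += 1
        else:
            out.append(coded)  # non-embedded frames keep their original coding
    return out
```

Because the non-embedded frames are passed through in coded form, the cost of re-encoding grows with the number of watermark bits and not with the length of the audio.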
Figure 5 illustrates the detailed watermark embedding scheme for frames decoded from the compressed
audio. Since audio coding is a lossy process, the embedded watermark must survive audio compression.
Furthermore, the embedded watermark must not perceptibly affect the audio quality. To satisfy these
requirements, the embedding scheme takes the human auditory system and the features of the audio
content fully into account. Feature parameters are extracted from each decoded frame selected for
embedding to represent the characteristics of the audio content in that frame. In parallel, each
selected frame passes through a psychoacoustic model to determine the ratio of the signal energy to
the masking threshold. The embedding scheme for each selected frame is designed from the feature
parameters and the masking threshold. The watermark is embedded into these frames using a
multiple-bit hopping and hiding method. The watermarked audio frame is then compressed to generate
the compressed audio frame.
`
`
`
Figure 5: Watermark Embedding Scheme for Single Frame (labels: Original Audio Frame; Threshold)

2.3 Extraction Scheme
`
To detect the watermark correctly from compressed audio, the frames carrying the watermark must
first be extracted. Figure 6 illustrates how the watermarked frames are extracted from compressed
audio. The process mirrors the selection of candidate frames in the embedding scheme. The watermarked
compressed audio is first segmented into frames according to the coding algorithm. These frames are
decoded, and each decoded frame is analyzed by the feature extraction model and the psychoacoustic
model. Based on the calculated feature parameters and masking threshold, a filter bank selects the
frames that contain watermark information. The watermark is detected from these frames using the
extraction scheme depicted in Figure 7.

Figure 7 illustrates the block diagram of watermark extraction from the selected frames. For each
incoming frame, we examine the magnitude (at the relevant locations in the frame) of the
autocorrelation of the embedded signal's cepstrum. In the autocorrelation of the cepstrum, the bits
of the watermark in each frame appear as a "power spike" at the delay of each embedded bit. Because a
multiple-bit hopping method is used for embedding, the detected bits in each frame pass through a
matched filter bank that maps them to the actual code (1 or 0). Finally, the watermark is recovered
by correlating the detected codes with the original watermark.
`
Figure 6: Frames and Watermark Extraction Scheme

Figure 7: Watermark Extraction in Uncompressed Domain (labels: Watermarked; Watermark)
`
`3. Technical Description
`
In this section, the feature extraction, embedding algorithm and extraction algorithm are described
in detail. Descriptions of the psychoacoustic model and filter bank design can be found in
perceptual audio coding books (Moore, 1997; Kahrs & Brandenburg, 1998).
`
`3.1 Feature Extraction
`
To select candidate frames from the compressed audio and to design the content-based embedding
scheme, features must be extracted from the audio content. The extracted features are an important
reference for both tasks. With the content-based embedding scheme, an optimal balance between audio
quality and watermark robustness can be obtained: the embedded watermark better matches the host
audio content, so it is perceptually negligible. The content-based method couples the audio content
with the embedded watermark, so it is difficult to remove the watermark without destroying the host
audio signal.
`
According to psychophysical studies, human perception of the frequency content of sounds, whether
pure tones or music signals, does not follow a linear scale. Many non-linear frequency scales
approximate the sensitivity of the human ear. The mel scale (Robinson, 1998) is widely used because
it has a simple analytical form:

$$m = 1125 \ln(0.0016 f + 1), \qquad f > 1000\ \mathrm{Hz} \tag{1}$$

where $f$ is the frequency in Hz and $m$ is the mel-scaled frequency. For $f < 1000$ Hz, the scale is
linear.
`
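As a small illustration, Equation (1) and its algebraic inverse can be written as below. The function names `hz_to_mel` and `mel_to_hz` are illustrative and not from the paper, and only the logarithmic branch is coded because the text gives no formula for the linear part below 1000 Hz.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Eq. (1): logarithmic branch, stated in the text for f > 1000 Hz."""
    return 1125.0 * math.log(0.0016 * f_hz + 1.0)


def mel_to_hz(m: float) -> float:
    """Algebraic inverse of Eq. (1)."""
    return (math.exp(m / 1125.0) - 1.0) / 0.0016


if __name__ == "__main__":
    # Equal steps on the mel axis map to progressively wider steps in Hz.
    base = hz_to_mel(1000.0)
    for step in range(4):
        print(round(mel_to_hz(base + 200.0 * step), 1))
```

Equal spacing in mel therefore gives narrow bands just above 1 kHz and wider bands toward the top of the spectrum, which is the property the band layout in the procedure relies on.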
An example procedure of feature extraction is described as follows:

(1) Segment the audio signal into m fixed-length frames;
(2) For each audio frame $s_i(n)$, a Fast Fourier Transform (FFT) is applied:

$$S_i(j\omega) = F(s_i(n)) \tag{2}$$

(3) Define a frequency band in the spectrum, from $f_{\min}$ to $f_{\max}$;
(4) Determine the channel numbers $n_1$ and $n_2$, where $n_1$ is for $f < 1$ kHz and $n_2$ is for $f > 1$ kHz;
(5) For $f < 1$ kHz, calculate the bandwidth of each band:

$$b = \frac{1000 - f_{\min}}{n_1} \tag{3}$$

(6) For $f < 1$ kHz, calculate the center frequency of each band:

$$f_i = i\,b + f_{\min} \tag{4}$$
`
(7) For $f > 1$ kHz, calculate the maximum and minimum mel scale frequency:

$$m_{\max} = 1125 \ln(0.0016 f_{\max} + 1), \qquad m_{\min} = 1125 \ln(0.0016 \times 1000 + 1) \tag{5}$$

(8) For $f > 1$ kHz, calculate the mel scale frequency interval of each band:

$$\Delta m = \frac{m_{\max} - m_{\min}}{n_2} \tag{6}$$

(9) For $f > 1$ kHz, calculate the center frequency of each band:

$$f_i = \frac{\exp((i\,\Delta m + 1000)/1125) - 1}{0.0016} \tag{7}$$

(10) For $f > 1$ kHz, calculate the bandwidth of each band (Equation (8), not legible in the source);
`
(11) For each center frequency and bandwidth, determine a triangle window function such as that
shown in Figure 8:

Figure 8: A Triangle Window Function

$$w(f) = \begin{cases} \dfrac{f - f_l}{f_c - f_l}, & f_l \le f \le f_c \\[2ex] \dfrac{f - f_r}{f_c - f_r}, & f_c \le f \le f_r \end{cases} \tag{9}$$

where $f_c$, $f_l$, $f_r$ are the center frequency, minimum frequency and maximum frequency of
each band;
(12) For each band, calculate its spectral power $P_j$ (Equation (10), not legible in the source),
where $S_j$ is the spectrum of each frequency band;
(13) For bands satisfying $f_c < 1000$ Hz, calculate their power summation:

$$P_{f<1\mathrm{kHz}} = \sum_j P_j \tag{11}$$

(14) For bands satisfying $f_c > 1000$ Hz, calculate their power summation:

$$P_{f>1\mathrm{kHz}} = \sum_j P_j \tag{12}$$
`
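Steps (11) to (14) can be sketched in a few lines of Python. Several details here are assumptions and not taken from the paper: each band is passed in as a hypothetical `(f_l, f_c, f_r)` tuple, the band power is taken to be the triangle-weighted sum of the power spectrum (the printed band power equation is not legible), a direct DFT stands in for the FFT, and a band centered exactly at 1000 Hz is counted with the upper group.

```python
import cmath
import math
from typing import List, Sequence, Tuple

Band = Tuple[float, float, float]  # (f_l, f_c, f_r) in Hz, assuming f_l < f_c < f_r


def triangle_weight(f: float, f_l: float, f_c: float, f_r: float) -> float:
    """Eq. (9): rises from 0 at f_l to 1 at f_c, then falls back to 0 at f_r."""
    if f_l <= f <= f_c:
        return (f - f_l) / (f_c - f_l)
    if f_c < f <= f_r:
        return (f - f_r) / (f_c - f_r)
    return 0.0


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """|S(k)|^2 for k = 0..N/2 via a direct DFT (adequate for short frames)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def band_power_sums(
    frame: Sequence[float], sample_rate: float, bands: Sequence[Band]
) -> Tuple[float, float]:
    """Return (power summed over bands below 1 kHz, power summed over the rest)."""
    spec = power_spectrum(frame)
    n = len(frame)
    p_low = p_high = 0.0
    for f_l, f_c, f_r in bands:
        # Assumed band power: triangle-weighted sum of the power spectrum.
        p_j = sum(
            triangle_weight(k * sample_rate / n, f_l, f_c, f_r) * p
            for k, p in enumerate(spec)
        )
        if f_c < 1000.0:
            p_low += p_j
        else:
            p_high += p_j
    return p_low, p_high
```

For example, a 64-sample frame holding a 500 Hz sine sampled at 8000 Hz puts essentially all of its power into a band centered at 500 Hz and almost none into a band centered at 2000 Hz.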
3.2 Embedding Algorithm
`
To obtain an optimal balance between audio quality and watermark robustness, a multiple-bit hopping
and hiding technique is applied in the embedding scheme. Essentially, the bit hiding technique embeds
a watermark into the host audio signal by adding a delayed copy of the host signal. The embedded
watermark itself is a predefined binary code, and each binary bit of the code is encoded as a time
delay relative to the original audio signal. A typical bit hiding method uses two time delays, one
for a binary one and the other for a binary zero. Both delays are chosen to remain below the
threshold at which the human ear can resolve them, so most listeners cannot perceive the original
signal and its delayed copy as coming from different sources. In addition to keeping the time delay
short, the amplitude of the delayed copy must be set below the audible threshold of the human ear so
that the distortion is not perceivable.

To enhance the robustness and tamper resistance of the embedded watermark, a multiple-bit hopping
technique is employed. Instead of embedding one bit into an audio frame, multiple bits with different
time delays are embedded into the sub-frames of each frame. In other words, one watermark bit is
encoded with multiple bits. For the authorized detector, the amplitude of each bit can consequently
be reduced at the same detection rate. For attackers, who do not know the parameters, this
significantly reduces the possibility of unauthorized bit detection and removal of the watermark.
`
The embedding schemes are established from the spectral characteristics obtained by feature
extraction, as follows:

(1) For each decoded frame, extract the features using the mel scale method:

$$F = \{P_{f<1\mathrm{kHz}},\ P_{f>1\mathrm{kHz}}\} \tag{13}$$

(2) Establish an embedding table for bit zero and bit one according to the psychoacoustic model.
Time delay and energy are the major parameters (the table itself is not legible in the source),
where $\alpha$ represents the energy and $\delta$ is the delay;
(3) Select the embedding scheme for each frame by comparing the two power summations obtained from
feature extraction:
If $P_{f<1\mathrm{kHz}} \ge 2P_{f>1\mathrm{kHz}}$, then apply scheme 1 to this frame;
If $P_{f>1\mathrm{kHz}} \le P_{f<1\mathrm{kHz}} < 2P_{f>1\mathrm{kHz}}$, then apply scheme 2 to this frame;
If $P_{f<1\mathrm{kHz}} < P_{f>1\mathrm{kHz}} < 2P_{f<1\mathrm{kHz}}$, then apply scheme 3 to this frame;
If $P_{f>1\mathrm{kHz}} \ge 2P_{f<1\mathrm{kHz}}$, then apply scheme 4 to this frame.
`
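One reading of the four conditions in step (3) is a split of the low-to-high power ratio at 2, 1 and 1/2. The sketch below codes that reading. It is an assumption about the intended thresholds, since the printed inequalities are only partly legible, and the tie at equal powers is resolved in favour of scheme 2.

```python
def select_scheme(p_low: float, p_high: float) -> int:
    """Map the two band-power sums of a frame to an embedding scheme index (1-4).

    p_low is the power summed over bands below 1 kHz, p_high the power summed
    over bands above 1 kHz. The thresholds are an assumed reading of step (3).
    """
    if p_low >= 2.0 * p_high:
        return 1  # low band dominates strongly
    if p_low >= p_high:
        return 2  # low band dominates mildly (ties land here)
    if p_high < 2.0 * p_low:
        return 3  # high band dominates mildly
    return 4      # high band dominates strongly


if __name__ == "__main__":
    for pair in [(10.0, 1.0), (3.0, 2.0), (2.0, 3.0), (1.0, 10.0)]:
        print(pair, "-> scheme", select_scheme(*pair))
```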
In the bit embedding process, each audio frame is first segmented into fixed-length sub-frames and
each sub-frame is encoded with one bit. Thus, for the ith frame, the embedded audio signal
$s'_{ij}(n)$ can be expressed as follows:

$$s'_{ij}(n) = s_{ij}(n) + \alpha_{ij}\, s_{ij}(n - \delta_{ij}) \tag{14}$$

$$s_{ij}(k) = 0 \quad \text{if } k < 0 \tag{15}$$

where $s_{ij}(n)$ is the original audio signal of the jth sub-frame in the ith frame, $\alpha_{ij}$
is the amplitude factor and $\delta_{ij}$ is the time delay corresponding to bit 'one' or bit 'zero'.
`
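Equations (14) and (15) translate directly into code. The sketch below is illustrative only: the parameter names `delay_zero`, `delay_one` and `alpha` are placeholders for values that the paper takes from its embedding table, and a single amplitude is used for every sub-frame for brevity.

```python
from typing import List, Sequence


def embed_bit(subframe: Sequence[float], delay: int, alpha: float) -> List[float]:
    """Eqs. (14)-(15): s'(n) = s(n) + alpha * s(n - delay), with s(k) = 0 for k < 0."""
    return [
        s + (alpha * subframe[n - delay] if n - delay >= 0 else 0.0)
        for n, s in enumerate(subframe)
    ]


def embed_frame(
    frame: Sequence[float],
    bits: Sequence[int],
    sub_len: int,
    delay_zero: int,
    delay_one: int,
    alpha: float,
) -> List[float]:
    """Split a frame into fixed-length sub-frames and hide one bit in each."""
    out: List[float] = []
    for j, bit in enumerate(bits):
        sub = frame[j * sub_len:(j + 1) * sub_len]
        out.extend(embed_bit(sub, delay_one if bit else delay_zero, alpha))
    # Samples beyond the last coded sub-frame are copied unchanged.
    out.extend(frame[len(bits) * sub_len:])
    return out
```

With a unit impulse as the sub-frame, the delayed copy shows up as a second, smaller impulse `delay` samples later, which is the spacing the detector later looks for.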
3.3 Extraction Algorithm
`
Watermark extraction consists of bit detection, bit mapping and watermark recovery. Of these steps,
bit detection is the most important. Bit detection involves detecting the spacing between the bits.
To do this, the magnitude (at the relevant locations in each audio frame) of the autocorrelation of
the embedded signal's cepstrum is examined. Cepstral analysis uses a homomorphic system that converts
convolution into addition, which makes it useful for detecting the presence of embedded bits. In the
autocorrelation of the cepstrum, the embedded bits in each audio frame appear as a "power spike" at
the delay of each bit. Because a multiple-bit hopping method is used to embed the bits, the detected
bits in each frame are mapped to the actual code (1 or 0). The watermark is recovered by correlating
the detected codes with the original watermark.
`
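The final recovery step, correlating the detected codes with the original watermark, can be illustrated with a normalized correlation. The paper does not spell out the correlation it uses, so the mapping of bits to -1/+1 and the function name `correlate_codes` are assumptions made for this sketch.

```python
from typing import Sequence


def correlate_codes(detected: Sequence[int], original: Sequence[int]) -> float:
    """Normalized correlation in [-1, 1] after mapping bits {0, 1} to {-1, +1}."""
    if len(detected) != len(original) or len(original) == 0:
        raise ValueError("bit sequences must be non-empty and of equal length")
    total = sum((2 * d - 1) * (2 * o - 1) for d, o in zip(detected, original))
    return total / len(original)


if __name__ == "__main__":
    original = [1, 0, 1, 1]
    print(correlate_codes([1, 0, 1, 1], original))  # all bits agree
    print(correlate_codes([1, 0, 1, 0], original))  # one bit differs
```

A value near 1 indicates that the detected code matches the original watermark, a value near 0 indicates no relation, and a value near -1 indicates an inverted code.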
The bit detection method using cepstral analysis is described as follows.

(1) For each audio frame $s_i(n)$, calculate the Fourier transform:

$$S_i(e^{j\omega}) = F(s_i(n)) \tag{16}$$

(2) Take the complex logarithm of $S_i(e^{j\omega})$:

$$\log S_i(e^{j\omega}) = \log F(s_i(n)) \tag{17}$$

(3) Take the inverse Fourier transform (cepstrum):

$$\tilde{s}_i(n) = F^{-1}(\log F(s_i(n))) \tag{18}$$

(4) Take the autocorrelation of the cepstrum:

$$R_{\tilde{s}\tilde{s}}(n) = \sum_{m=-\infty}^{\infty} \tilde{s}_i(n + m)\, \tilde{s}_i(m) \tag{19}$$

(5) Search for the time point ($\delta_i$) corresponding to a "power spike" of $R_{\tilde{s}\tilde{s}}(n)$;

(6) Determine the code corresponding to $\delta_i$.
`
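The six steps can be sketched in plain Python as below. This is a simplified illustration and not the authors' detector: it uses the real cepstrum (logarithm of the magnitude) instead of the complex logarithm, a direct DFT instead of an FFT, a circular autocorrelation over one frame, and it simply compares the spike magnitude at two candidate delays, `delay_zero` and `delay_one`, which are hypothetical parameters.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Direct O(N^2) DFT; adequate for a short illustrative frame."""
    n = len(x)
    return [
        sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(x))
        for k in range(n)
    ]


def real_cepstrum(frame: Sequence[float]) -> List[float]:
    """Steps (1)-(3), using log|S| in place of the complex logarithm (assumption)."""
    n = len(frame)
    log_mag = [math.log(abs(s) + 1e-12) for s in dft(frame)]
    # log_mag is real and even in k, so its inverse DFT equals its forward DFT / n.
    return [c.real / n for c in dft(log_mag)]


def cepstrum_autocorrelation(cep: Sequence[float], lag: int) -> float:
    """Step (4), with the infinite sum replaced by a circular sum over one frame."""
    n = len(cep)
    return sum(cep[(m + lag) % n] * cep[m] for m in range(n))


def detect_bit(frame: Sequence[float], delay_zero: int, delay_one: int) -> int:
    """Steps (5)-(6): choose the candidate delay with the larger spike magnitude."""
    cep = real_cepstrum(frame)
    spike_zero = abs(cepstrum_autocorrelation(cep, delay_zero))
    spike_one = abs(cepstrum_autocorrelation(cep, delay_one))
    return 1 if spike_one > spike_zero else 0
```

A convenient sanity check is a unit impulse followed by a half-amplitude copy at one of the two delays: the cepstrum then contains only the contribution of the delayed copy, so the spike appears at the embedded delay. On real audio the host signal adds its own cepstral content, which is why the paper combines several bits per frame and a matched filter bank.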
`4. Experimental Results
`
To further illustrate the performance of our method, the results of listening and robustness tests
are given. Inaudibility and robustness are two conflicting requirements in audio watermarking
technology. Our method tries to strike an optimal balance between them.
`
`4.1 Listening Test
`
To estimate the audio quality after watermark embedding, we employ the "Perceptual Audio Quality
Measure (PAQM)" (Beerends & Stemerdink, 1992) for the listening test. PAQM derives an estimate of
the signals on the cochlea and compares the representation of the reference signal with that of the
signal under test. The weighted difference of these representations is mapped to the five-grade
impairment scale used in testing speech and audio coders. The Subjective Grade (SG) (Sporer, 1996)
is shown in Table 1.
`
SG     Description
5.0    Imperceptible
4.0    Perceptible, but not annoying
3.0    Slightly annoying
2.0    Annoying
1.0    Very annoying

Table 1: Five-Grade Impairment Scales Used in Listening Test
`
Eight subjects with audio experience were selected and eight MP3 samples were used for testing. Each
test pair contains an original sample and a watermarked sample. The subjects were seated in a music
lab and provided with high-quality headphones. Before the tests, the level of the stimuli was set to
each subject's preference. The subjects listened to each pair 10 times and then graded the pair. The
final grade of each pair is the average of the grades from all subjects. Table 2 gives the results of
the listening test. All samples score between 4.84 and 4.94, close to the "Imperceptible" grade of
5.0, which indicates that the watermark scheme introduces no perceptible distortion to the original
audio.
`
`
`
Test sample    Grade/PAQM
Sample-1       4.88
Sample-2       4.84
Sample-3       4.91
Sample-4       4.86
Sample-5       4.92
Sample-6       4.90
Sample-7       4.94
Sample-8       4.88

Table 2: Results of Listening Test
`
`4.2 Robust Test
`
Generally, the most effective way to evaluate whether watermarked compressed audio is robust is to
decompress it and then recompress it. If the watermark can still be detected correctly after such a
process, we can conclude that the watermarking method is robust. We use this process to test the
robustness of our watermarking method. The compressed audio used in the test is stereo MP3 music
with 16 bits/sample, a 44.1 kHz sample rate, and a 256 kb/s bit rate. The watermark length is 48
bits. One bit of the watermark is embedded in each selected decoded frame. To ensure robustness, the
watermark is embedded five times in the decoded frames. Therefore, 48 x 5 = 240 decoded frames are
selected for watermark embedding in each compressed audio file.

To evaluate the detection accuracy after the decompression-recompression manipulation, the detecting
accuracy rate of the watermarked audio is defined as follows:

$$\text{detecting accuracy rate} = \frac{\text{number of bits correctly detected}}{\text{number of bits embedded}} \times 100\% \tag{20}$$

This index reflects the robustness of the watermarked audio and the reliability of watermark
detection in the presence of the above manipulation.

We embed the watermark into the test audio using our embedding scheme. The watermarked audio is
decompressed, and the decompressed audio is recompressed at bit rates of 256 kb/s, 128 kb/s and
64 kb/s respectively. We then use our watermark extraction scheme to detect the watermark from the
recompressed audio. The corresponding detecting accuracy rates are shown in Figure 9. The figure
shows that our watermarking scheme remains robust even at a low bit rate.
`
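Equation (20) and the frame count stated in this section are simple enough to check in code. The function name `detecting_accuracy_rate` and the constant names are illustrative only, and the example bit patterns are made up for the demonstration and are not measured results.

```python
from typing import Sequence


def detecting_accuracy_rate(detected: Sequence[int], embedded: Sequence[int]) -> float:
    """Eq. (20): percentage of embedded bits that were detected correctly."""
    if len(detected) != len(embedded) or len(embedded) == 0:
        raise ValueError("bit sequences must be non-empty and of equal length")
    correct = sum(1 for d, e in zip(detected, embedded) if d == e)
    return 100.0 * correct / len(embedded)


WATERMARK_BITS = 48  # watermark length given in the text
REPETITIONS = 5      # number of times the watermark is embedded
FRAMES_NEEDED = WATERMARK_BITS * REPETITIONS  # one bit per selected frame

if __name__ == "__main__":
    print(FRAMES_NEEDED)
    # Illustrative bit patterns only: three of four bits agree.
    print(detecting_accuracy_rate([1, 0, 1, 1], [1, 0, 0, 1]))
```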
`
`
Figure 9: Watermark Detecting Rates
`
`5. Conclusion
`
Compared with digital image and video watermarking, digital audio watermarking poses a special
challenge because the human auditory system is far more sensitive than the human visual system. In
this paper, we propose a novel content-based digital watermarking method for compressed audio. The
watermark embedding is highly related to the audio content and based on the human auditory system.
To improve the robustness and security of the embedded watermark, a multiple-bit hopping technique
is applied in watermark embedding and extraction. The listening tests show that the watermarking
scheme does not affect the original audio quality after embedding the additional information. The
robustness experiments demonstrate that the watermark survives decompression and recompression.
According to the listening and robustness test results, our watermarking method attains an optimal
balance between inaudibility and robustness in the embedded audio. Furthermore, it is not necessary
to recompress all the frames when embedding the watermark, so our method provides a fast embedding
speed and is suitable for on-line distribution.
`
`References
`
Pitas, I. (1996). A Method for Signature Casting on Digital Images. In Proc. of IEEE Int. Conf. on
Image Processing, Vol.3, (pp. 215-218).
Wolfgang, R.B. & Delp, E.J. (1996). A Watermark for Digital Images. In Proc. of IEEE Int. Conf. on
Image Processing, Vol.3, (pp. 219-222).
Cox, I.J., Kilian, J., Leighton, T. & Shamoon, T. (1995). Secure Spread Spectrum Watermarking for
Multimedia. NEC Research Institute, Technical Report 95-10.
Swanson, M.D., Zhu, B. & Tewfik, A.H. (1996). Transparent Robust Image Watermarking. In Proc. of
IEEE Int. Conf. on Image Processing, Vol.3, (pp. 211-214).
Swanson, M.D., Zhu, B., Tewfik, A.H. & Boney, L. (1998). Robust Audio Watermarking Using
Perceptual Masking. Signal Processing, 66, 337-355.
Sandford, S. et al. (1997). Compression Embedding. US Patent 5,778,102.
Petitcolas, F. (1999). http://www.cl.cam.ac.uk/~fapp2/steganography/mp3stego/,
University of Cambridge, UK.
Moore, B.J.C. (1997). An Introduction to the Psychology of Hearing. Academic Press, Fourth edition.
Kahrs, M. & Brandenburg, K. (1998). Applications of Digital Signal Processing to Audio and
Acoustics. Kluwer Academic Publishers.
Robinson, T. (1998). Speech Analysis, http://svr-www.eng.cam.ac.uk/~ajr/SpeechAnalysis/.
Beerends, J. & Stemerdink, J. (1992). A Perceptual Audio Quality Measurement Based on a
Psychoacoustic Sound Representation. Journal of AES, 40(12), 963-972.
Sporer, T. (1996). Evaluating Small Impairments with Mean Opinion Scale - Reliable or just a Guess.
AES 101st Convention, Los Angeles, USA.
Delaigle, J.F., Vleeschouwer, C.D. & Macq, B. (1996). Digital Watermarking. In Proc. of SPIE,
Optical Security and Counterfeit Deterrence Techniques, Vol.2659, (pp. 99-110).
Turner, L.F. (1989). Digital Data Security System. Patent IPN WO 89/08915.
Bassia, P. & Pitas, I. (1998). Robust Audio Watermarking in the Time Domain. IX European Signal
Processing Conference (EUSIPCO'98), Vol.1, (pp. 13-16), September 8-11, Rhodes, Greece.
Gruhl, D., Lu, A. & Bender, W. (1996). Echo Hiding. In Proc. of Information Hiding Workshop, Univ.
of Cambridge, (pp. 295-315).
Bender, W., Gruhl, D., Morimoto, N. & Lu, A. (1996). Techniques for Data Hiding. IBM Systems
Journal, 35(3&4), 313-336.
Yardimci, Y., Cetin, A.E. & Ansari, R. (1997). Data Hiding in Speech Using Phase Coding. ESCA,
Eurospeech97, Greece, (pp. 1679-1682).
Boney, L., Tewfik, A.H. & Hamdy, K.N. (1996). Digital Watermarks for Audio Signal. In Proc. of
IEEE Int. Conf. on Multimedia Computing and Systems. Hiroshima, Japan.
Cox, I.J., Kilian, J., Leighton, T. & Shamoon, T. (1997). Secure Spread Spectrum Watermarking for
Multimedia. IEEE Trans. on Image Processing, 6(12), 1673-1687.
`
`