`a2) Patent Application Publication (10) Pub. No.: US 2003/0076245 Al
`(43) Pub. Date: Apr.24, 2003
`
`Gibsonet al.
`
`US 20030076245A1
`
`(54) TYPES-BASED, LOSSY DATA EMBEDDING
`
`Related U.S. Application Data
`
`(75)
`
`Inventors: Jerry Don Gibson, Dallas, TX (US),
`Mark Gavin Kokes, Dallas, TX (US);
`Geoffrey Charles Orsak, Dallas, TX
`(US); Victor James Stolpman, Dallas,
`TX (US)
`
`Correspondence Address:
`OBLON SPIVAK MCCLELLAND MAIER &
`NEUSTADT PC
`FOURTH FLOOR
`1755 JEFFERSON DAVIS HIGHWAY
`ARLINGTON, VA 22202 (US)
`
`(73) Assignee: Southern Methodist University, Dallas,
`TX
`
`(21) Appl. No.:
`
`10/143,959
`
`(22) Tiled:
`
`May 14, 2002
`
`(60) Provisional application No. 60/294,603, filed on Jun.
`1, 2001. Provisional application No. 60/294,268,filed
`on May 31, 2001.
`Seog
`‘
`.
`Publication Classification
`(51) Ute C17 caccssssscsssssstsssesnetsstseeete HO03M 7/00
`(82) US.) sscsssisnmaernremnrnans 341/50
`(57)
`ABSTRACT
`A new approach to data embedding within ITU G.722 and
`ITU G.711 based upon the method of types and universal
`classification is disclosed. A secondary data sequence is
`embedded in the original (host) data stream using the
`method of types. The embedded data is extracted using a
`type-based universal receiver, with or without the use of a
`key. The choice of type and rate for the embedded data is
`based uponananalysis of portions of the original ITU G.722
`or ITU G.711 coded data stream. The universal receiver
`learns the type from the received data alone, and hence, there
`is no side information required as in previous data embed-
`ding techniques. ‘The embedding process and the receiver
`may both be data adaptive, so the original data stream can
`be reconstructed at the decoder without error.
`
`702
`
`ORIGINAL
`
`STREAM
`
`704
`FRANE
`
`ANALYSIS
`
`708
`
`SECONDARY
`
`706
`FRAME
`TYPING
`
`He
`/
`MASTER
`ye
`STREAM
`TYPE
`
`EMBEDDED
`714
`ADAPTIVE PRECISION
`UNIVERSAL
`STREAM
`
`
`
`AND MODULATION
`RECEIVER
`
`
`1)
`740
`
`
`
`
`
`FRAME
`FRAME
`UNIVERSAL
`ADAPTIVE PRECISION
`
`RECEIVER
`ANALYSIS
`AND MODULATION
`TYPING
`
`EMBEDDED
`
`722 a
`724
`ORIGINAL MASTER
`seconpary 69
`\
`STREAM
`"y
`oy
`720
`
`FRAME ANALYSIS PARAMETERS
`
`RECEIVER DECISION
`
`RECEIVER DECISION
`
`728
`738
`
`
`
`
`FRAME ANALYSIS PARAMETERS
`
`
`
`
`
`T36—“|
`
`734
`
`732
`
`730
`
`Sony Exhibit 1038
`Sony Exhibit 1038
`Sony v. MZ Audio
`Sony v. MZ Audio
`
`
`
`
`
`102
`/
`ORIGINAL
`STREAM
`
`FRAME
`ANALYSIS
`
`704
`
`106
`FRAME
`TYPING
`
`108
`/
`SECONDARY
`STREAM
`
`710
`ADAPTIVE PRECISION
`AND MODULATION
`
`712
`/
`MASTER
`TYPE
`
`714
`UNIVERSAL
`RECEIVER
`
`Th
`é
`EMBEDDED
`TREAL
`
`FRAME ANALYSIS PARAMETERS
`
`RECEIVER DECISION
`
`718
`
`gC
`
`196
`
`740 <>
`
`
`
`TLJOTWINSE007‘PT“AdVY|-WoNRDTIGqngUONesT|ddyju91e
`
`
`
`RECEIVER DECISION
`
`FRAME ANALYSIS PARAMETERS
`
`136
`
`UNIVERSAL
`RECEIVER
`
`ADAPTIVE PRECISION
`AND MODULATION
`
`FRAME
`TYPING
`
`FRAME
`ANALYSIS
`
`ORIGINAL MASTER
`STREAM
`TYPE
`
`seconpary 6)
`me
`
`124
`
`722
`
`720
`
`TVSt79200/€007SN
`
`EMBEDDED
`STREAM
`
`
`
`Patent Application Publication Apr. 24,2003 Sheet 2 of 12
`
`US 2003/0076245 Al
`
`HISTOGRAM OF ORIGINAL G.711 CODEWORD DATA
`
`PROBABILITY
`
`0.12
`0.1
`0.08
`0.06
`0.04
`0.02
`0
`
`(
`
`50
`
`100
`
`150
`
`200
`
`250
`
`8-BIT CODEWORD
`
`FIC. 26a)
`
`HISTOGRAM OF MAPPED G.711 CODEWORD DATA (i.e. MASTER TYPE)
`
`0
`
`20
`
`100
`
`150
`
`200
`
`250
`
`8-BIT CODEWORD
`
`FIC. 266)
`
`PROBABILITY
`
`moOOODODwe
`
`2DODDFE+DT&NHco
`
`
`
`ZLJOEWIGSE007‘PT“AdyY|-HONBDTTGngUONesT|ddyju91eg
`
`
`
`
`
`ANTYATHOMECOD
`
`04
`
`
`
`
`
`“Oddd“THY“WYON
`
`=a—>—a — on
`= =a>con
`
`MODULATED 7-BIT CODEWORD DATA“
`
`902
`
`20
`
`40
`
`100
`80
`60
`TIME (SAMPLE NUMBER)
`FIG. 3(a)
`
`120
`
`140
`
`
`
`> TYPE ZERO
`
` |
`
`0
`
`20
`
`
`
`«60
`40
`80
`100
`120
`CODEWORD VALUE
`
`
`
`
`
`‘OWdd“THY“WYON
`
`TYPE ONE
`
`906
`
`D>mao
`
`
`
`—=_ F*~Focncm
`
`>
`
`aS
`
`20
`
`80
`60
`40
`CODEWORD VALUE
`
`100
`
`TVSt79200/€007SN
`
`
`
`ZLJOpbWINSE007‘PT“AdVY|-WONRDTIGqngUONesT|ddyju91e
`
`ENCODER
`
`102
`
`yi
`
`FRAME STE
`
`4
`
`ful
`SECONDARY
`SOURCE
`BITSTREAM
`
`ng
`
`MODULE MODULE
`
`COMPRESSED
`
`BITSTREAM
`
`
`
`
`
`
`
`
`
`
`UNCOMPRESSED
`
`
`
`
`j= mm|fares|_| on
`
`
`
`
`
`MODULE
`MODULE
`TYPE
`
`
`
`
`118
`120
`16
`
`
`
`128
`
`FIG. 4(a)
`
`TVSt79200/€007SN
`
`
`
`ZLJOSWINSE007‘PT“AdyY| WONBDTIGqngUONesTddyju91eg
`
`DECODER
`
`UNCOMPRESSED
`
`COMPRESSED
`
`BITSTREAM
`
`
`
`
`
`pi
`ps6
`EXTRACTED
`FRAME SIZE
`SOURCE
`BITSTREAM
`
`
` 142 144
`
`
`
`G.122
`
`ENCODER
`
` MODULE
`
`MODULE
`MODULE
`
`
`
`
`
`i SPEECH
`ORECTSION
`HYPOTHESIS
`MINIMUM
`
`
`MODULE | TESTING
`[~| ENTROPY
`140
`
`
`
`
`
`
`MODULE
`TYPR
`
`150
`152
`154
`
`
`
`
`
`164
`
`FIG. 4(6)
`
`TVSt79200/€007SN
`
`
`
`Od
`
`
`
`daTHe“WYON
`
`6-Bit)
`ORIGINAL CODEWORD HISTOGRAM (64 kbps,
`
`
`}a+\
`
`ur
`
`—_=>a
`
`DoCOCOile
`
`EPRGE5:.eee
`
`
`
`
`
`atent Application Publication Apr. 24,2003 Sheet 7 of 12
`
`US 2003/0076245 Al
`
`FREQ.
`
`MAPPED CODEWORD HISTOGRAM (56 kbps, 9-BIT)
`y
`
`64 kbps, 6-Bit) NORM.REL.
`FREQ.
`NORM.REL.
`FREQ,
`NORM.REL.
`
`MAPPED CODEWORD HISTOGRAM (48 kbps, 4-BIT)
`g
`
`FIC.5S(f)
`
`
`
`amS=a+a-a<=SSs=o2==iw=S==o==<==eo=e}uy
`
`vo=wi
`
`=vo
`
`>==—is#)==“nN
`
`TD_:=.—~s=-=5Seem°gCe
`
`\o
`co
`\Oiu
`=r
`—r
`cp©Ea
`oyJeS===a2oOaf=Qus
`CQKi
`
`
`- ~
`
`EN2%ESR
`
`,
`
`a==aaS|
`
`‘Oma
`
`
`
`
`CODEWORD
`
`CODEWORD
`
`
`
`
`
`PatentApplicationPublicationApr.24,2003Sheet9of12US2003/0076245Al
`SS\,=EEEaiw=zSSW=:=:
`oy™oy=o™=See="8faQOBerXN
`EB\e\_=°SE
`as0Byy=~=NS
`cS‘*m5.ceNXBS
`FSXQRNYQ
`~_\SEERA’Y\fa=WW
`LSSI.&Ry
`
`
`
`
`
`7139OLPONSC007“PZAdyY|MoNRDTTqngUONnesT|ddyju91ed
`
`DISTANCE MEASURE
`
`13
`
`10
`
`)
`
`-30
`
`-20
`
`0
`
`EMBEDDED BITS:3/3,
`
`ORIGINAL SYMBOL: 6,
`
`
`
`
`
`FIC. 7(g)
`
`10
`
`20)
`
`30
`
`DECODED SYMBOL: 6
`
`TVSt79200/€007SN
`
`
`
`Patent Application Publication Apr. 24,2003 Sheet 11 of 12
`
`US 2003/0076245 Al
`
`5900
`
`S510
`
`5520
`
`5530
`
`5540
`
`Seat
`
`$560
`
`OTART
`
`COLLECT A FRAME OF CODEWORD SAMPLES FROM THE OUTPUT
`
`OF A SOURCE COMPRESSION MECHANISM (FIG. 7a)
`
`
`
`FROM THE SAMPLES/CODEWORDS COLLECTED IN (a), FORM
`THE TYPE i.e, EMPIRICAL HISTOGRAM, OF THE
`FRAME
`
`
` THIS IS THE TYPING STAGE (FIG. 7d)
`
`
`
`TAKE A MEASUREMENT OF THE ENTROPY OF THE
`NEWLY FORMED DATA TYPE
`
`FROM A LOOKUP TABLE (WHICH HAS BEEN CALCULATED OFFLINE),
`LOOKUP THE NUMBER OF BITS TO EMBED IN THE CURRENT FRAME.
`
`WITH THIS NUMBER IN HAND, ACQUIRE THAT NUMBER
`OF BITS FROM A SECONDARY DIGITAL SOURCE
`
`WITH THIS SECONDARY BITS IN HAND,
`FORM A SYMBOL TO EMBED
`
`
`
`WITH KNOWLEDGE OF THE SYMBOL TO EMBED, CIRCULARLY SHIFT
`THE DATA 'TYPE'
`SO THAT IT IS CENTERED ON THE VALUE WHICH |
`
`
`CORRESPONDS TO THE VALUE OF THE SYMBOL TO EMBED (FIG. 4e)
`
`LLG. ES
`
`
`
`Patent Application Publication Apr. 24,2003 Sheet 12 of 12
`
`US 2003/0076245 Al
`
`5600
`
`5610
`
`FORM THE 'TYPE'
`
`TRANSMITTED SAMPLE DOMAIN VALUES (FIG. 7g)
`
`FROM THE PREVIOUSLY
`
`
`
`THIS NEW DATA 'TYPE’ DETERMINED, CALCULATE A DISTANCE
`MEASURE BETWEEN THE CIRCULARLY SHIFTED ‘TYPE’,
`JUST
`
`
`DETERMINED, AND ALL POSSIBLE VARIATIONS/SHIFTS OF A
`
`"MASTER TYPE' ORIGINALLY CENTERED AT THE ORIGIN, i.e. THE
`
`
`
`ORIGIN 1S ZERO,
`INDICATING THE EMBEDDED SYMBOL VALUE
`
`
`5620
`
`
`
`INVERSELY
`WITH THE EMBEDDED SYMBOL VALUE KNOWN,
`
`CIRCULARLY SHIFT THE PREVIOUSLY TRANSMITTED SHITED
`"TYPE' UNDOING THE EMBEDDING PROCESS (FIG. 7f)
`
`$630
`
`5640
`
`DETECT THAT 'TYPE' WHICH IS EQUEVALENT
`TO THE 'TYPE'
`FORMED IN STEP 510
`
`IS EQUEVAVLENT
`DETECT ‘FRAME ' WHICH
`TO THE 'FRAME'
`IN STEP 500
`
`END
`
`LIC. 9
`
`
`
`US 2003/0076245 Al
`
`Apr. 24, 2003
`
`TYPES-BASED, LOSSY DATA EMBEDDING
`
`CROSS-REFERENCE TO RELATED
`APPLICATIONS
`
`[0001] This application claims the benefit of and priority
`to provisional application Serial Nos. 60/294,268, filed May
`31, 2001, and 60/294,603, filed Jun. 1, 2001, each of which
`is incorporated herein by reference in their entirety.
`
`STATEMENT REGARDING FEDERALLY
`SPONSORED RESEARCH
`
`[0002] The U.S. Government has a paid-up license in this
`invention and the right in limited circumstances to require
`the patent owner to license others on reasonable terms as
`provided for by the terms of Grant Nos. NCR-9796255 and
`CCR-0093859 awarded by the National Science Foundation
`(NSF).
`
`BACKGROUND OF THE INVENTION
`
`[0003]
`
`1. Field of the Invention
`
`[0004] This invention relates generally to systems, meth-
`ods, and computer products for data embedding.
`
`[0005]
`
`2. Discussion of the Background
`
`[0006] The present invention relates to technologies ref-
`erenced and described in the references identified in the
`appended APPENDIX and cross-referenced throughout the
`specification by reference to the number, in parentheses, of
`the respective reference listed in the APPENDIX,the entire
`contents of which are also incorporated herein by reference.
`
`[0007] The field of information hiding contains several
`subfields,
`including steganography, where a message is
`concealed in another data stream, and watermarking, where
`ownership data is includedin a digital object to be protected.
`A third subfield of information hiding is the field of data
`embedding, wherein additional information is incorporated
`in the transmitted data stream by using a key and distorting
`(slightly) the original object. The embedded information
`cannot be reconstructed without the key.
`
`[0008] Over the last decade, and concurrent with the
`growth of the Internet, digital media has sprung to the
`forefront of consumer interest. Already offering several
`distinct advantages overits analog counterpart, digital media
`has presented itself more recently as a candidate for yet
`another new technology, data embedding. Data embedding,
`as its name implies, suggests that digital information (i.e.
`data, text, audio, or video) can be inserted into the content
`of another digital signal (i.e. data, text, audio, or video).
`
`To date, there are numerous applications for data
`[0009]
`embedding. One of the most important applications is copy-
`right protection of digital information. In the businesssector,
`there is growinginterest in a reliable, transparent mechanism
`to identify ownership and distribution channels for particular
`digital data sequences. In addition, many distributors of
`digital content are also looking for a cost effective solution
`for the transport of various control, reference, and descrip-
`tive signals which in turn can be usedto differentiate as well
`as track access to their products and services. Many believe
`that data embedding is the answer to these proposed prob-
`lems. One application, which is synonymous with data
`embedding,is the communication of secondary data sources
`
`through so-called covert channels. In this scenario, data
`embedding algorithms are used to securely hide relatively
`small amounts of potentially encrypted (i.e. secret) infor-
`mation within a host digital signal.
`
`[0010] Most of the background art in the area of data
`embedding has concentrated on image and video applica-
`tions. In the following, audio data embedding background
`art is Summarized and commented on.
`
`[0011] Scalar quantization refers to a process of identify-
`ing a numberof contiguous value ranges within a data set
`sufficient to accommodateall data values within the data set,
`assigning integer values to each value range, and then
`replacing each datum with an integer corresponding to the
`value range in which the datum’s value was found. Quan-
`tization requires a selection of the size of each value range,
`or “bin.”
`
`[0012] One of the first data embedding techniques used
`was least significant bit replacement. (See references (6)-
`(7)). Such techniques lead to problemsas the precision ofthe
`host signal decreases toward 1 bit/sample. Other techniques
`have been devised based on a phase coding approach. (See
`reference (8)). In these algorithms, the phase of the Fourier
`transform coefficients of a frame of the host signalis altered
`in a meaningful way. Echo coding has also been proposed
`for audio data embedding.
`(See reference (8)).
`In this
`method, multiple decaying echoes are placed in the spec-
`trum of the host signal such that by using cepstral analysis,
`one can locate and decode the nature of the embedded
`symbol. Many spread-spectrum approaches have also been
`proposed for audio data embedding applications. (See ref-
`erences (8)-(13)). Some authors propose embedding infor-
`mation as spread-spectrum (i.e. “colored”) noise. Several
`other methods (see references (14)-(16)) use spectral com-
`ponent replacement to embed data transparently into digital
`audio signals. Even simpler techniques have been attempted
`where signal peaks are modified within a segment of host
`audio in order to force the signal to fall within embedded
`data-specified quantization ranges. (See reference (16)). In
`this way, the embedded information is surmised by observ-
`ing trends in the quantization patterns of the host signal.
`
`[0013] Many of these techniques are already present in
`commercial products. The common factor among most of
`these techniques, is that they are limited in their ability to
`achieve significant embedded throughput. Background art
`techniques achieve embedded bitrates of 8-50 bps with
`corresponding error
`rates
`in the embedded bitstream
`between 107° and 10-. (See references (8)-(16)).
`
`SUMMARYOF THE INVENTION
`
`[0014] The present invention has been madein view ofthe
`above-mentioned and other problems and addresses the
`above-discussed and other problems.
`
`[0015] The present invention includes a types-based, lossy
`data embedding encoder and decoder, which may function
`independently, and a system including both a types-based,
`lossy data embedding encoder and a types-based, lossy date
`embedding decoder. As used herein, a “type” (i.e. empirical
`histogram) captures the essential statistical properties of a
`given data sequence.
`
`lossy data embedding encoder
`[0016] The types-based,
`includes a data precision module and a data embedding
`
`
`
`US 2003/0076245 Al
`
`Apr. 24, 2003
`
`module. The data precision module determines the number
`of bits to embed in an input (host) data stream, where the
`input data stream could be, for example, an ITU G.711 or
`G.722 data stream, or alternatively, the numberofbits to be
`embedded may be fixed. The data embedding module is
`coupled to the data precision module and receives a sec-
`ondary data input, which may beuser data or other data, and
`modulates the type of the data stream according to the
`secondary data symbolto be transmitted and the precision of
`the secondary data. The method used by the types-based,
`lossy data embedding encoder to encode data in a data
`stream includes framing input code words, mapping the
`framed code words into base master types, determining the
`number of bits to be embedded, forming secondary bit
`sequences into embedded data symbols, and modulating a
`frame based on the embedded data symbols and current
`frame type.
`
`lossy data embedding decoder
`[0017] The types-based,
`includes a data precision module which determines the
`numberof bits embedded in an incoming data stream, and a
`data extraction module coupled to the data precision module
`which produces a secondary data output by demodulating
`the data frame input to produce a secondary data symbol and
`a secondary data bit precision by determining the secondary
`data symbol using M-ary hypothesis testing of the input data
`frame. The host data stream could be, for example, an ITU
`G.711 or G.722 data stream. The types-based, lossy data
`embedding decoder decodesthe host data stream by framing
`the received code words, adaptively determining the number
`of bits that are embedded in the host data stream, demodu-
`lating the frame based on the embedded data symbols and on
`the current frame type, reverse mapping the base master
`types into framed code words, and forming embedded data
`symbols into secondary data bit sequences.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`[0018] Amore complete appreciation of the invention and
`many of the attendant advantages thereof will be readily
`obtained as the same becomes better understood by refer-
`ence to the following detailed description when considered
`in connection with the accompanying drawings, wherein:
`
`[0019] FIG. 1 is an exemplary data embedding system
`block diagram according to the present invention;
`
`FIGS. 2(a) and 2(4) depict examples of codeword
`[0020]
`histograms according to the present invention;
`
`illustrate an exemplary “type”
`FIGS. 3(a)-3(c)
`[0021]
`according to the present invention.
`
`FIGS. 4(a2) and 4(6) are block diagrams for an
`[0022]
`example data embedding encoder and decoder pair accord-
`ing to the present invention;
`
`FIGS. 5(@)-5(f) are diagrams of exemplary lower
`[0023]
`band codewords according to the present invention;
`
`FIGS. 6(a) and 6(b) are diagrams of exemplary
`[0024]
`upper band codewords according to the present invention;
`
`FIGS. 7(a@)-7(g) are diagrams of an exemplary
`[0025]
`encoding/decoding procedure according to the present
`invention;
`
`[0026] FIG. 8 is a flow chart of an exemplary encoding
`procedure according to the present invention; and
`
`[0027] FIG. 9 is a flow chart of an exemplary decoding
`procedure according to the present invention.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENTS
`
`[0028] A novel approach to data embedding, within the
`ITU G.722 standard (see reference (1)) and within the ITU
`G.711 standard (see reference (25)), and based upon the
`method of types and universal classification, is disclosed.
`The ITU G.722 standard, the ITU G.711 standard, and all
`related references and all references cited therein are incor-
`porated herein by reference in their entirety. In the present
`invention, a secondary data sequence is embedded in the
`original (host) data stream using the method of types. The
`embedded data is extracted using a type-based universal
`receiver, without the use of a key. The choice of type and rate
`for the embedded data is based upon an analysis of portions
`of the original coded data stream. The universal receiver
`learns the type from the received data alone, and hence, there
`is no side information as in previous data embedding tech-
`niques. The embedding process and the receiver may both be
`data adaptive, so that the tradeoff between embedded data
`rate and errors in the reconstructed host data can be specified
`by the user.
`
`[0029] Referring now to FIG.1, the underlying principles
`of this new data embedding scheme lie in work based on
`universal receiver 714 and classifier design. Similar to the
`wireless communications problem, embedding data into
`digital signals can be thoughtof as transmitting information
`716 over a communication channel 718 that is corrupted by
`strong interference and channeleffects. Such a model for the
`case of a binary communication system is given as,
`Ay: YO=sotn(t), Symbol “0” Transmitted
`Ay: YO=s,+n(0, Symbol “1” Transmitted
`
`(1)
`
`In this model, a data symbol is hypothesized (i.e.
`[0030]
`H,,) to be transmitted from one of two sources. The binary
`data symbol to be transmitted, s., corresponds to the data
`symbol that is to be embeddedinto the host signal, y(t). The
`strong interference is representative of the host signal.
`Channel effects correspond to any pre- or post-processing
`done to the combined signal (i.e. y()=s,+n(t)).
`
`[0031] The present inventors’ novel method is to apply
`universal classification techniques toward embedded bit
`detection using an M-ary hypothesis testing procedure. Note
`that the statistical properties of the host signal vary signifi-
`cantly from frame-to-frame and thus in order to achieve
`reliable detection of embedded content, it is useful have a
`detector which is robust to the changing characteristics of
`both the host and embeddedsignals. By using the method of
`types and an information theoretic distance measure, the
`minimum distance between observed empirical data distri-
`butions and distributions based ontraining data sequences of
`sufficient length is looked for. In the method of the present
`invention,
`the minimum distance between the empirical
`distribution of the test data sequence and that of the M
`empirical distributions derived from the training data
`sequences indicate the presence of one particular type of
`embedded symbol.
`
`In the apparatus and method of the present inven-
`[0032]
`tion, some observed empirical distributions of host signal
`frames can be quite different from any empirical distribution
`
`
`
`US 2003/0076245 Al
`
`Apr. 24, 2003
`
`derived by observing the host signal over a long period of
`time. This could lead to a false detection of embedded
`symbols at the decoder. If this happens, it is sometimes
`possible to adapt the embedding schemeto counteract such
`events. At other times,it is difficult to alter the content of a
`segment in a way that can be surmised by the decoder and
`produce the correct embedded symbolat the decoder with-
`out affecting the original content of the data frame. In these
`cases, the method of the present invention may incur embed-
`ded bit detection errors. In any event, the decoder is intel-
`ligent enoughto adapt to the changing characteristics of both
`the host and embedded signals while working with only
`limited knowledge (i.e. with knowledge of only the com-
`bined signal, y(t)). Using this algorithm, the probability of
`getting an unworkable frame of data decreases as the frame-
`size of the data segmentincreases. Consequently, as the size
`of the data frame increases, the rate at which data can be
`embedded into the host signal decreases. Thus, there is a
`tradeoff that is balanced in order to achieve the desired data
`embedding goal (i.e. maximized embedded throughput with
`minimal error probability).
`
`[0033] With the general principles of the present inven-
`tors’ data embedding approach stated, examination is now
`made of the mathematical building blocks of the novel
`method, according to the present invention, for embedding
`information into digital audio signals.
`
`[0034] The data embedding problem is transformedinto a
`signal classification problem that can be cast as a M-ary
`hypothesis testing problem in which each hypothesis repre-
`sents a different random source from whichit is assumed any
`one embedded symbolis derived. It is noted that the channel
`model (i.e. y(t)) is rarely ever stationary and thus it varies
`with time depending on the characteristics of the host signal.
`Thus, there is an inherent need to use an adaptive detector
`to extract the embedded information. If the channel model
`could somehow be parameterized, then the General Likeli-
`hood Ratio Test could be used to detect
`the embedded
`content. However, such a solution produces mediocre results
`at best. (See reference (17)).
`
`increased numbersof observations(i.e. samples of test data).
`Moreover, with the definition of a rejection region, the decay
`rate can be controlled by the user. Furthermore, it has been
`shownthat no detector based solely on the test and training
`data sequences has a larger asymptotic rejection probability
`decay rate for the same exponential error decay rate. (See
`reference (19)). This result
`implies that with increasing
`numbersof test data samples, the type (i.e. empirical histo-
`gram) of the observed test data frame is more likely to be
`(less likely not to be) differentiated from any one of the M
`types derived from the training data.
`
`[0036] Consider the following M-ary hypothesis testing
`problem:
`
`Ho:
`
`Ay:
`
`X"~P;
`
`X"~Po
`
`Source 1
`
`Source 2
`
`)
`
`Ay:
`
`X"~ Pry
`
`Source M
`
`Hy+1: Rejection Region
`
`[0037] where the test vector X”is of length n. It is assumed
`that under hypothesis H,, the test vector, X”, is generated by
`a source with probability measure P; (unknown to the
`detector). In addition, due to the absence of an accurate
`statistical model for the M sources, it is assumedthat there
`exist training vectors T,N, i=1,2, ... M of length N from each
`of the M possible data sources. Therefore, the classification
`between source types is made onthe basis of the test vector,
`X", and the training vectors, T,‘, i=1,2,... M.
`
`[0038] The mathematical quantities used to differentiate
`the correct source density from those which lead to false
`detections of embedded data symbols are now further dis-
`closed. It has been shown that the asymptotically optimal
`Generalized Likelihood Ratio Test (GLRT) for determining
`if a finite alphabet test sequence, X, arose from the same
`source as a finite alphabettraining sequence, T,is:
`
`[0035] The present inventors disclose herein how to solve
`the problem of robust detection of the embedded content. It
`has been shown that under general circumstances,
`type-
`based detectors have asymptotic performance measures
`3)
`{84D9, 9) Q(X" )O2(TH)
`Ft1
`comparable to those of the clairvoyant detector. (See refer-
`A(X, T;) = —log:sup,0k”EN)
`Po
`of;
`ences
`(18)-(19)). The type characterizing the various
`hypotheses can be estimated from only the sample (i.e.
`observed) data. Because the optimal clairvoyant detector
`depends only on the true probability distributions,
`it
`is
`apparent that empirical histograms(i.e. types) are calculated
`from training data and compared to the empirical histogram
`of the observed test data in order to differentiate between
`hypotheses. When faced with classifying observed data
`frames based on the training data types, optimal perfor-
`manceis not guaranteed by merely calculating the empirical
`likelihood ratio. (See reference (18)). Rather, it has been
`shown that better performance can be obtained by concat-
`enating the training data for each hypothesis with the
`observed test data. (See reference (17)). How different the
`type derived from these longer sequences is from the types
`of: (a) the training data, and (b) the observation, can be
`assessed by utilizing the Kullback-Leibler distance measure.
`The rather surprising form of the hypothesis test leads to an
`exponential increase in the probability of detection with
`
`[0039] where Q,, Q., and Q denote source densities. (See
`reference (20)).
`
`[0040] From anintuitive point of view, it can be seen that
`if the data sequences X" and T;™ arise from the same source,
`then h, will converge to zero in the limit. Alternatively,if the
`data originated from different sources, then h; will converge
`to some constant greater than zero which will allow for
`discrimination between the proposed M hypotheses. It was
`originally shown by Gutman (see reference (19)) for the
`classification problem that
`this test offers asymptotically
`optimal performance over a very wide range of source
`statistics.
`
`[0041] Due to the requirement of the supremum calcula-
`tions in (3),
`the detector is not practical
`to implement.
`
`
`
`US 2003/0076245 Al
`
`Apr. 24, 2003
`
`through the use of the method of types,
`However,
`log-likelihood ratio is reduced to
`
`the
`
`N
`Ay(X, Ti, A) = der(Qcyr), QAxn,7)) + =, deelQin) pxr)) —A
`
`(4)
`
`[0042] The quantities, Q.p%), Qexs, and Q.x-py) represent
`the types of the data vectors, T;“, X", and the concatenated
`vectors (X", T,%). These types represent the empirical (his-
`togram) estimates of the statistics and jointstatistics of the
`data vectors. The distance metric is the functional d,,, the
`well known divergence or relative entropy between the
`probability mass functions in its argument. i is a positive
`constant chosen to satisfy some design criterion (i.e. rejec-
`tion region). In addition to the above, the present inventors
`disclose an alternative interpretation in terms of the entro-
`pies of the types,
`
`bX, Ti, Y=
`
`
`N+n
`N
`
`N
`S—H[Qpxnpy) — (Query) — [ H(Oepny) 2
`
`QS)
`
`[0043] The above expression for the discriminant function
`in terms of the entropies is computationally preferable for
`on-line processing, as the entropies of the training sequences
`can be pre-computed. Note that the joint type of X" and T,%
`in terms of the marginals is defined as
`
`28,7") =
`
`AQyn) + NQeN)
`TaN
`
`(6)
`
`the type
`[0044] With only a few general assumptions,
`based detector of the present invention has been shown to
`have asymptotic performance measures comparable to those
`of the clairvoyant detector. (See reference (17)). In addition,
`in (see reference (21)),
`the behavior of the type-based
`detector of the present invention relative to the amount of
`training data used has been explicitly shown. These dem-
`onstrations provide evidence that the type-based detector of
`the present invention can in fact achieve globally optimum
`performance even with limited amounts of training data.
`These results are particularly applicable to the experiments
`conducted by the present inventors.
`
`[0045] Returning now to FIG. 1, an exemplary embodi-
`mentof the present invention that features ITU G.711 is now
`described.
`
`[0046] Since the present approach is data-adaptive, each
`sequence of host data 702 is analyzed to determine if an
`embedded data stream 708 can be accommodated without
`substantially compromising the host data 702. Thus,
`the
`presentclassification problem is the M-ary hypothesis prob-
`lem with rejection (see reference (18)), where the rejection
`zone is used for the “no embedded data” case. The number
`of bits embedded per host data sequence is log,{M}.
`
`[0047] Because it is advantageous not to send any side
`information, the first issue addressed by the present inven-
`tion is determining under what conditions can an embedded
`data stream 708 be successfully decoded from the received
`
`if log,{M} bits are
`data stream 720. More specifically,
`embeddedin the host sequence 720 such that the probabili-
`ties of falsely decoding embedded hypothesis H,; as one of
`the other hypotheses, H; (j=1,2, . .. , M, j#i), exponentially
`decreases in n (the host data sequence length) with param-
`eter A, the following holds for the probability of correctly
`decoding under the M hypotheses. From (see reference(18)),
`it is knownthat if the training sequence N is of insufficient
`length with respect to n, then there exists an hypothesis H,
`such that the probability of choosing rejection given H,
`(decoding no embedded data given H,) approaches 1 as
`no. However, for a sufficiently long training sequence
`(length N) with respect to n as no, the probability of
`choosing the rejection region under H;
`is bounded away
`from 1.
`
`[0048] Since the method of the present invention is host
`data sequence adaptive, these results imply that by adap-
`tively varying the numberof bits embedded (log,{M}) per
`host sequence, the receiver will be able to track the data
`embedding processat the encoder with high probability, and
`without the transmission of side information.
`
`[0049] A-secondissue addressed by the present invention
`concerns modifying the data type of the host data such that
`data can be embedded and in such a waythat the receiver can
`determine the modified data type from the received data
`stream 720 only; that is, without side information. For a
`given host data sequence 702 to be transmitted, the case is
`considered where the minimum entropy data type is deter-
`mined and this minimum entropy data type is modified by
`shifting within the region of support of the class of data
`types. Justification for selecting the minimum entropy data
`type as the type to be modified will be made clear by the
`following disclosure. The process of data embedding via
`simple shifts of this type is analyzed. Note that the number
`of shifts corresponds to the number of hypotheses that must
`be detected with the universal receiver. Only symmetrized,
`unimodal type classes will be considered in this develop-
`ment.
`
`It is known that the optimal likelihood ratio test
`[0050]
`from the Neyman-Pearson lemma can be written as the
`difference between tworelative entropies(see reference (2)).
`Thus, if data is embedded by shifting the symmetrized data
`type, the number of hypotheses that can be distinguished
`will be dependent upon the spread of the type class and on
`the region of support. For M=2, there are three different
`errors that can occur: (i) Given that H, or H, has been sent,
`the detector may decide “no embedded data” and reject both;
`(ii) Given that H, is sent, the detector decides H,; and(iii)
`Giventhat H,is sent, the detector decides H,. Stein’s lemma
`discloses that one of these error probabilities can be fixed at
`some suitably small value and the others can be made to
`approach zero exponentially with respect to the relative
`entropy between hypotheses (see reference (2)). However,in
`the present situation, all of these errors may be of equal
`significance. What is needed is to select the shifts of the
`minimum entropy data type to obtain equal probabilities of
`making an error given that any hypothesis or the “no data”
`case is sent. Thus, a Bayesian approach can be used with
`specified a priori probabilities on the hypotheses, say 1, 1,
`and tz, and Sanov’s Theorem can be used to boundthe error
`probabilities with respect to the nearest neighbor regions
`(see reference (2)). Once the minimum entropy type has
`been determined, data can be embedded by constructing
`
`
`
`US 2003/0076245 Al
`
`Apr. 24, 2003
`
`hypotheses other than shifts of this data type (see reference
`(29)). These cases are also being investigated by the present
`inventors.
`
`[0051] A feature of type-based data embeddingisthatit is
`a lossy approach. By removing the many constraints(ie.
`perceptual) in the typical data embedding problem, infor-
`mation can be embedded in a host signal in such a way that
`the throughput of the channel
`is increased without also
`increasing the transmitted data rate. In order for the present
`invention to achieve this additional throughput, a small
`numberoferr