`571-272-7822
`
`Paper 37
`Date: April 8, 2024
`
`UNITED STATES PATENT AND TRADEMARK OFFICE
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`
`VERANCE CORP.,
`Petitioner,
`v.
`MZ AUDIO SCIENCES, LLC,
`Patent Owner.
`
`IPR2022-01544
`Patent 7,289,961 B2
`
`Before KARL D. EASTHOM, DAVID C. MCKONE, and
`IFTIKHAR AHMED, Administrative Patent Judges.
`EASTHOM, Administrative Patent Judge.
`
`
`JUDGMENT
`Final Written Decision
`Determining No Challenged Claims Unpatentable
`35 U.S.C. § 318(a)
`
`I. INTRODUCTION
`The Board instituted inter partes review of claims 1–10 of U.S. Patent
`No. 7,289,961 B2 (Ex. 1001, “the ’961 patent”) based on a Petition filed by
`Sony Group Corp. (Japan), Sony Corp. of America, Sony Interactive
`Entertainment LLC, Sony Pictures Entertainment Inc., Sony Electronics Inc.
`(now terminated as petitioners, Paper 17), and Verance Corp. (sole
`remaining “Petitioner”). Paper 7 (“Pet.”).
`
`After the Institution Decision (Paper 12), MZ Audio Sciences, LLC
`(“Patent Owner”) filed a Response to the Petition (Paper 27, “PO Resp.”),
`Petitioner filed a Reply (Paper 28, “Reply”), and Patent Owner filed a Sur-
`reply (Paper 29, “Sur-reply”). The parties participated in an oral hearing and
`a copy of the transcript is in the record. Paper 36.
`We have jurisdiction under 35 U.S.C. § 6 to enter this Final Written
`Decision under 35 U.S.C. § 318(a). Petitioner has the burden of proving
`unpatentability of the challenged claims by a preponderance of the evidence.
`35 U.S.C. § 316(e). Having reviewed the parties’ arguments and cited
`evidence based on the full record, for the reasons discussed below, we
`determine that Petitioner has not demonstrated by a preponderance of the
`evidence that claims 1–10 of the ’961 patent are unpatentable. To the extent
`this Final Written Decision may conflict with the Institution Decision, “the
`Board has an obligation to assess the [validity] question anew after trial
`based on the totality of the record.” In re Magnum Oil Tools Int’l, Ltd., 829
`F.3d 1364, 1377 (Fed. Cir. 2016); see also Trivascular, Inc. v. Samuels, 812
`F.3d 1056, 1068 (Fed. Cir. 2016) (“The Board is free to change its view of
`the merits after further development of the record, and should do so if
`convinced its initial inclinations were wrong.”).
`
`A. Related Matters
`The parties indicate that Patent Owner asserted the ’961 patent in
`district court lawsuits, including MZ Audio Sciences, LLC v. Sony Group
`Corp. (Japan), No. 1:21-cv-0166 (D. Del.), and MZ Audio Sciences, LLC v.
`Sony Group Corp. (Japan), No. 2:22-cv-00866 (C.D. Cal.). Pet. xi; Paper 9,
`1. The parties identify no other related proceedings.
`
`B. The Asserted Grounds
`Petitioner asserts the following grounds of unpatentability (Pet. 2):
`
`Claim(s) Challenged    35 U.S.C. §¹    Reference(s)/Basis
`1–10                   103(a)          Srinivasan², Cabot³, Kudumakis⁴
`2, 3, 5, 7, 8, 10      103(a)          Srinivasan, Cabot, Kudumakis, Hobson⁵
`1–10                   103(a)          Kudumakis, Tilki⁶, Cabot
`
`1 The Leahy-Smith America Invents Act, Pub. L. No. 112-29, 125 Stat. 284
`(2011) (“AIA”), amended 35 U.S.C. § 103 (effective Mar. 16, 2013).
`Petitioner points out that “[t]he application from which U.S. Patent No.
`7,289,961 issued claims priority to U.S. Provisional Application No.
`60/479,438, filed June 19, 2003.” Pet. xi. Because the earliest possible
`effective filing date for the ’961 patent precedes the effective date of the
`applicable AIA amendment, the pre-AIA version of § 103 applies.
`2 Srinivasan, US 6,272,176 B1, issued Aug. 7, 2001. Ex. 1005.
`3 R. C. Cabot et al., Detection of Phase Shifts in Harmonically Related
`Tones, J. AUDIO ENG. SOC., VOL. 24, NO. 7 (Sept. 1976). Ex. 1006.
`4 Kudumakis et al., Int. Pub. WO 01/58063, published Aug. 9, 2001.
`Ex. 1007.
`5 Hobson et al., US 6,633,653 B1, issued Oct. 14, 2003, filed Feb. 4, 2000.
`Ex. 1042.
`6 J.F. Tilki et al., Encoding a Hidden Auxiliary Channel onto a Digital Audio
`Signal Using Psychoacoustic Masking, PROCEEDINGS IEEE
`SOUTHEASTCON ’97, “Engineering the New Century,” Apr. 12–14, 1997.
`Ex. 1008; see also Pet. 52–53 (arguing that Tilki is prior art under
`§§ 102(a)–(b)) (citing Ex. 1025, 1–8; Ex. 1026, 1–2; Ex. 1027, 2; Ex. 1030,
`1–2; Ex. 1032, 48; Ex. 1033, 2719).
`
`In support, Petitioner relies on the testimony of Dr. Michael Scordilis
`(Ex. 1003).
`
`C. The ’961 Patent
`The ’961 patent relates to embedding data in an audio signal for
`watermarking, steganography, or other purposes. Ex. 1001, code (57). “The
`present invention is directed to a system and method for insertion of hidden
`data into audio signals and retrieval of such data from audio signals and is
`more particularly directed to such a system and method using a phase
`encoding method.” Id. at 1:20–24. The process divides the audio signal into
`time frames that contain frequency bands representing the audio signal. Id.
`at code (57). Then, “the relative phases of one or more frequency bands are
`shifted to represent the data to be embedded.” Id. at code (57).
`The invention exploits the randomness of the relative phases of
`frequency components in typical audio speech or music. Ex. 1001, 3:47–53
`(“So far, however, the apparent randomness of the phase has not been
`exploited for data hiding purposes.”).
`The method involves dividing an audio signal into time frames,
`sampling the time frames, and transforming the representation of the signal
`into its frequency components. See Ex. 1001, 5:30–67. Then, the method
`involves selecting at least two frequency components, the first of which is a
`fundamental tone, and the other(s) of which is/are an overtone or harmonic
`of the fundamental tone, obtaining the relative phase difference(s) between
`the at least two frequency components, altering the phase of at least one of
`the overtones or harmonics to embed the desired hidden data into the signal,
`and inverse transforming the frequency components back into a digital
`representation of the time varying signal. Id. at 5:30–6:28.
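
For illustration only, the sequence of steps summarized above (transform a time frame, select a fundamental and an overtone, alter the overtone's phase relative to the fundamental, inverse transform) can be sketched in Python. This is a minimal sketch under stated assumptions, not the patent's implementation and not a construction of any claim term. The function names, the naive DFT, the use of a simple difference of DFT phases as the "relative phase," and the two quantization levels (0 for a 0 bit, π for a 1 bit) are all hypothetical choices made here.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one time frame (O(N^2), illustration only)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns real samples (spectrum is kept conjugate-symmetric)."""
    n = len(spectrum)
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]


def wrap(angle):
    """Wrap an angle in radians into the range [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def embed_bit(frame, fundamental_bin, overtone_bin, bit):
    """Quantize the overtone's phase relative to the fundamental: 0 for bit 0, pi for bit 1.

    The two levels are an assumption of this sketch, not taken from the patent.
    overtone_bin must lie strictly between 0 and len(frame) / 2.
    """
    spectrum = dft(frame)
    n = len(frame)
    target = cmath.phase(spectrum[fundamental_bin]) + (math.pi if bit else 0.0)
    magnitude = abs(spectrum[overtone_bin])  # magnitude is left unchanged
    spectrum[overtone_bin] = cmath.rect(magnitude, target)
    # Mirror the change into the negative-frequency bin so the output stays real.
    spectrum[n - overtone_bin] = spectrum[overtone_bin].conjugate()
    return idft(spectrum)


def extract_bit(frame, fundamental_bin, overtone_bin):
    """Recover the bit by finding which quantization level the relative phase is nearer."""
    spectrum = dft(frame)
    diff = wrap(cmath.phase(spectrum[overtone_bin]) - cmath.phase(spectrum[fundamental_bin]))
    return 1 if abs(diff) > math.pi / 2 else 0


# Demo: a 64-sample frame holding a fundamental (bin 4) and its third harmonic (bin 12).
N = 64
frame = [
    math.sin(2 * math.pi * 4 * t / N + 0.3) + 0.5 * math.sin(2 * math.pi * 12 * t / N + 1.1)
    for t in range(N)
]
recovered = [extract_bit(embed_bit(frame, 4, 12, b), 4, 12) for b in (0, 1)]
```

In this sketch only the overtone's phase changes; its magnitude and every other frequency component are left as they were, which is the property that makes phase a candidate carrier for hidden data.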
`
`The specification states that “in addition to steganography and
`watermarking, any suitable use for hidden data falls within the present
`invention.” Ex. 1001, 8:66–67.
`
`D. Challenged Claims
`Petitioner challenges all ten claims of the ’961 patent. Of these,
`claims 1, 4, 6, and 9 are independent. For purposes of this Final Written
`Decision, claim 1 is representative. Claim 1 follows (information added to
`conform to Petitioner’s nomenclature):
`1.
`[1PRE] A method for embedding data in an audio signal,
`the method comprising:
`
`[1A] (a) dividing the audio signal into a plurality of time
`frames and, in each time frame, a plurality of frequency
`components;
`
`[1B] (b) in each of at least some of the plurality of time
`frames, selecting at least two of the plurality of frequency
`components; and
`
`[1C] (c) altering a phase of at least one of the plurality of
`frequency components in accordance with the data to
`be embedded, wherein:
`
`[1C-1] step (b) comprises selecting a fundamental tone
`and at least one overtone; and
`
`[1C-2] step (c) comprises quantizing a phase difference of
`the at least one overtone relative to the fundamental tone to
`embed at least one bit of the data to be embedded.
`
`II. ANALYSIS
`A. Legal Standards
`“In an [inter partes review], the petitioner has the burden from the
`onset to show with particularity why the patent it challenges is
`unpatentable.” Harmonic Inc. v. Avid Tech., Inc., 815 F.3d 1356, 1363 (Fed.
`Cir. 2016) (citing 35 U.S.C. § 312(a)(3) (requiring inter partes review
`petitions to identify “with particularity . . . the evidence that supports the
`grounds for the challenge to each claim”)). This burden of persuasion never
`shifts to the patent owner. Dynamic Drinkware, LLC v. Nat’l Graphics, Inc.,
`800 F.3d 1375, 1378 (Fed. Cir. 2015).
`Resolving the legal question of obviousness requires resolving
`underlying factual considerations including (1) the scope and content of the
`prior art; (2) any differences between the claimed subject matter and the
`prior art; (3) the level of ordinary skill in the art; and (4) when in evidence,
`objective evidence of nonobviousness. Graham v. John Deere Co. of Kan.
`City, 383 U.S. 1, 17–18 (1966).
`Often, it will be necessary for a court to look to interrelated
`teachings of multiple patents; the effects of demands known to
`the design community or present in the marketplace; and the
`background knowledge possessed by a person having ordinary
`skill in the art, all in order to determine whether there was an
`apparent reason to combine the known elements in the fashion
`claimed by the patent at issue. To facilitate review, this analysis
`should be made explicit.
`KSR Int’l Co. v. Teleflex Inc., 550 U.S. 398, 418 (2007) (citing In re Kahn,
`441 F.3d 977, 988 (Fed. Cir. 2006) (“[R]ejections on obviousness grounds
`cannot be sustained by mere conclusory statements; instead, there must be
`some articulated reasoning with some rational underpinning to support the
`legal conclusion of obviousness”)).
`
`B. Level of Ordinary Skill in the Art and Dr. Scordilis’s Testimony
`Petitioner asserts that a person of ordinary skill in the art “would have
`had a bachelor’s degree in electrical engineering or a related field with
`coursework in signal processing, plus two years of academic and/or industry
`experience in signal processing or a related field. More education could
`substitute for experience, and vice versa.” Pet. 7. “Patent Owner does not
`dispute Petitioners’ definition of a [person of ordinary skill in the art].” PO
`Resp. 2.
`Based on a review of the record, we adopt Petitioner’s proposed level
`of ordinary skill in the art because it is consistent with the evidence of
`record, including the asserted prior art, other references of record, and the
`’961 patent specification.
`
`
`C. Claim Construction
`In inter partes reviews, the Board interprets claim language using the
`district-court-type standard, as described in Phillips v. AWH Corp., 415
`F.3d 1303 (Fed. Cir. 2005) (en banc). See 37 C.F.R. § 42.100(b) (2020).
`Under this standard, claim terms have their ordinary and customary
`meaning, as would be understood by a person of ordinary skill in the
`art at the time of the invention, in light of the language of the claims, the
`specification, and the prosecution history. See Phillips, 415
`F.3d at 1313–14.
`Petitioner relies on the plain and ordinary meaning and contends that
`it is unnecessary to construe any claim terms explicitly. See Pet. 8 (citing 37
`C.F.R. § 42.100(b)). Patent Owner “agrees with Petitioner that, for purposes
`of this Response only, all terms of the ’961 patent have their plain and
`ordinary meaning.” PO Resp. 23. As Petitioner contends, it is not necessary
`to construe any claim terms explicitly. See Nidec Motor Corp. v. Zhongshan
`Broad Ocean Motor Co., 868 F.3d 1013, 1017 (Fed. Cir. 2017) (stating that
`“we need only construe terms ‘that are in controversy, and only to the extent
`necessary to resolve the controversy’” (quoting Vivid Techs.,
`Inc. v. Am. Sci. & Eng’g, Inc., 200 F.3d 795, 803 (Fed. Cir. 1999))).
`
`D. Summary of Asserted Prior Art References
`1. Srinivasan (Ex. 1005)
`Srinivasan relates to an encoder that adds an inaudible binary code to
`an audio signal, and a decoder for retrieving that code, using a phase
`modulation scheme. See Ex. 1005, code (57), 1:5–7, 3:16–19, 11:25–30.
`The code “may be used . . . in order to identify a broadcast program.” Id. at
`1:8–10.
`Srinivasan’s Figure 1 follows:
`
`Figure 1 is a schematic block diagram of an audience measurement
`system employing signal coding and decoding arrangements, and illustrates
`encoder 12 for adding codes to a digital representation of audio signal
`derived from audio signal portion 14 of a broadcast signal transmitted to
`receiver 20 for decoding the codes via decoder 26 and replaying the audio in
`speaker 24. See Ex. 1005, 5:52–54, 7:29–41.
`After sampling an audio block to provide 512 samples and computing
`a Fast Fourier Transform (FFT) of the block (see steps 40–44, Fig. 2),
`Srinivasan’s “method comprises the following steps,” which result in phase
`encoding of frequency components of the audio signal:
`a) selecting, within the block, (i) a reference frequency within the
`predetermined signal bandwidth, (ii) a first code frequency
`having a first predetermined offset from the reference frequency,
`and (iii) a second code frequency having a second predetermined
`offset from the reference frequency; b) comparing the spectral
`amplitude of the signal near the first code frequency to the
`spectral amplitude of the signal near the second code frequency;
`c) selecting a portion of the signal at one of the first and second
`code frequencies at which the corresponding spectral amplitude
`is smaller to be a modifiable signal component, and selecting a
`portion of the signal at the other of the first and second code
`frequencies to be a reference signal component; and
`d) selectively changing the phase of the modifiable signal
`component so that it differs by no more than a predetermined
`amount from the phase of the reference signal component.
` Ex. 1005, 3:3–19.
`
`To select the frequencies to encode, Srinivasan’s method
`“determine[s] a frequency index Imax at which the spectral power of the audio
`signal, as determined as the step 44 [of Figure 2], is a maximum in the low
`frequency band extending from zero Hz to two kHz.” Ex. 1005, 8:51–54.
`Then, “[t]he code frequency indices I1 and I0 are chosen relative to . . . Imax
`so that they lie in a higher frequency band at which the human ear is
`relatively less sensitive.” Id. at 8:59–63. Srinivasan states that the less
`sensitive frequency ranges (for embedding codes) are frequencies in the 4.8–
`6 kHz range “in order to exploit the higher auditory threshold in this band.”
`Id. at 7:64–67. “[E]ach successive bit of the code may use a different pair of
`code frequencies f1 and f0 denoted by corresponding code frequency indexes
`I1 and I0.” Id. at 7:67–8:3. “[T]wo preferred ways of selecting the code
`frequencies f1 and f0 . . . create an inaudible wide-band noise like code.”
`Id. at 8:3–5.
` Srinivasan’s Figure 3 follows:
`
`
`
`Srinivasan’s Figure 3 above is a spectral plot illustrating the result of
`taking the FFT (Fast Fourier Transform) of a time block, where as a result of
`amplitude modulation coding, “spectrum 52 shows the audio block after
`coding of a ‘1’ bit, and a spectrum 54 shows the audio block before coding.”
`See Ex. 1005, 9:48–53. In other words, there is a visual difference in the
`plot between encoded audio waveform and the original audio waveform.
`
`Srinivasan describes one phase encoding technique as follows:
`In order to encode a binary number, the phase angle of one
`of these components, usually the component with the lower
`spectral amplitude, can be modified to be either in phase (i.e., 0°)
`or out of phase (i.e., 180°) with respect to the other component,
`which becomes the reference. In this manner, a binary 0 may be
`encoded as an in-phase modification and a binary 1 encoded as
`an out-of-phase modification.
`Ex. 1005, 11:26–32.
`
`To minimize the audible perception of coding the phase shifts,
`Srinivasan explains that “it is not essential to perform phase modulation to
`th[e] extent of [a maximum phase change of 180°], as it is only necessary to
`ensure that the two components are either ‘close’ to one another in phase or
`‘far’ apart.” Id. at 11:44–47. Therefore, Srinivasan assigns two phase
`neighborhoods, one ±45° (±π/4 radians) for a reference phase, and another
`±45° of 180° out of phase with the reference phase, to represent a “0” and
`“1,” and then modifies the phase angle of “[t]he modifiable spectral
`component . . . into one of these phase neighborhoods depending upon
`whether a binary ‘0’ or a binary ‘1’ is being encoded.” Id. at 11:26–54.
`With this variation, “approximately 30% of the segments are ‘self-coded’
`. . . and no modulation is required.” Id. at 11:56–57.
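
The two-neighborhood scheme described above can be sketched as follows. This is a hedged illustration, not Srinivasan's code: the function names are hypothetical, angles are in degrees, and the rule that an out-of-neighborhood phase is moved only as far as the nearest edge of the target neighborhood is an assumption made here (the passage states only that the phase is modified "so as to fall into" the neighborhood).

```python
import math

NEIGHBORHOOD_DEG = 45.0  # half-width of each phase neighborhood, per the passage above


def wrap_deg(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def pick_modifiable(amp_f1, amp_f0):
    """Choose the component to modify: the one with the smaller spectral amplitude."""
    return "f1" if amp_f1 < amp_f0 else "f0"


def encode_phase(phase_ref, phase_mod, bit):
    """Return the new phase (degrees) of the modifiable component for the given bit.

    Bit 0 targets the neighborhood centered on phase_ref; bit 1 targets the one
    centered on phase_ref + 180. A phase already inside the target neighborhood
    is left alone ("self-coded"). Otherwise it is moved to the nearest edge of
    the target neighborhood, which is an assumption of this sketch.
    """
    center = phase_ref + (180.0 if bit else 0.0)
    offset = wrap_deg(phase_mod - center)
    if abs(offset) <= NEIGHBORHOOD_DEG:
        return phase_mod  # self-coded: no modulation required
    return wrap_deg(center + math.copysign(NEIGHBORHOOD_DEG, offset))


def decode_phase(phase_ref, phase_mod):
    """Decode 0 if the components are "close" in phase, 1 if they are "far" apart."""
    return 0 if abs(wrap_deg(phase_mod - phase_ref)) < 90.0 else 1


# Demo: reference at 10 degrees, modifiable component initially at 100 degrees.
as_zero = encode_phase(10.0, 100.0, 0)  # pulled in to the edge of the "close" neighborhood
as_one = encode_phase(10.0, 100.0, 1)   # pushed out to the edge of the "far" neighborhood
```

Because the decoder needs only to distinguish "close" from "far," the encoder in this sketch never moves a phase farther than the neighborhood edge, which is the motivation the passage gives for not modulating all the way to 0° or 180°.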
`2. Kudumakis (Ex. 1007)
`Kudumakis relates to “a method of labelling . . . an audio or video
`signal prior to broadcast or distribution to provide an audit trail.” Ex. 1007,
`1:4–8. Because prior art systems employed watermarking techniques using
`frequency notches at predictable frequencies to embed an inaudible code,
`Kudumakis “appropriately select[s] the part of the frequency spectrum
`where each watermark code is inserted, providing improved audio quality
`and extra security in the form of frequency hopping.” Id. at 2:23–25. The
`location in the frequency spectrum that includes the embedded code “is
`chosen adaptively with regard to the frequency content of the signal.” Id. at
`2:29–30. Therefore, Kudumakis’s frequency selection techniques introduce
`unpredictability as to the frequency location of the embedded codes. Id. at
`3:17–18.
`Kudumakis’s method embeds the codes using the same “notch”
`filtering as the prior art, which places “notches” in the audio signal to
`remove the original signal, and then replaces the removed signal with a
`frequency carrying the desired code using either amplitude or phase
`modulation. Ex. 1007, 1:17–18, 3:4–9, 6:1–2.
`
`For each input block, Kudumakis’s system “find[s] both the
`fundamental and its harmonics” of the audio input signal using known
`techniques such as FFT. Ex. 1007, 4:26–30. Using the fundamental and its
`harmonics to determine where to embed the codes enhances security against
`malicious attacks because those frequencies vary throughout the audio
`signal. Id. at 4:26–5:9. Kudumakis’s method embeds the notch filters and
`codes at the “edges” of “these harmonics.” Id. at 4:30–31. Codes are more
`perceptible if they “coincide with the main frequency component of the
`signal,” yet “they have to be placed in a part of the spectrum with sufficient
`energy so that frequent masking conditions can be met.” Id. at 3:5–9.
`3. Cabot (Ex. 1006)
`Cabot relates to testing “the audibility of phase shifts in two
`component octave complexes.” Ex. 1006, 568. The tests involved groups
`listening to “headphones with fundamental and third-harmonic signals” of an
`audio signal. Id.
`“The experiment shows phase shifts of harmonic complexes to be
`detectable, but judging from the difficulty experienced by the subjects, the
`effect appear[s] to be small.” Ex. 1006, 570. In one experiment, subjects
`correctly identified a phase shift of 22.5 degrees between a fundamental and
`a harmonic with a 60% corrected rate P. Id. (Table II).
`Cabot’s Figure 2a follows:
`
`
`Figure 2a above illustrates a time-varying waveform (with time along
`the horizontal axis and amplitude along the vertical axis), which is equal to
`the sum of its fundamental harmonic at 400 Hz plus its third harmonic at
`1200 Hz, as depicted.
`
`Cabot’s Figure 2b follows:
`
`
`Figure 2b above illustrates a time-varying composite signal (with time
`along the horizontal axis and amplitude along the vertical axis), which is
`equal to the sum of its fundamental harmonic at 400 Hz plus its third
`harmonic at 1200 Hz, which is shifted by a phase of 90 degrees relative to
`the fundamental harmonic, as depicted. Notice that the composite signal in
`Figure 2a is different from the composite signal in Figure 2b, due to the
`relative phase shift of the third harmonic.
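
The difference between the two composite signals can be reproduced numerically. The following is a rough sketch, not a reconstruction of Cabot's figures: the equal unit amplitudes, the sine phase reference, the 48 kHz sample rate, and all names are assumptions made here for illustration. It samples one period of a 400 Hz fundamental plus a 1200 Hz third harmonic, with and without a 90 degree shift of the harmonic.

```python
import math

F0_HZ = 400.0            # fundamental, per the figure description
F3_HZ = 3 * F0_HZ        # third harmonic, 1200 Hz
SAMPLE_RATE_HZ = 48000   # assumed; chosen so one period is exactly 120 samples


def composite(phase_shift_deg, third_amplitude=1.0, periods=1):
    """Sample fundamental + third harmonic, with the harmonic shifted by phase_shift_deg."""
    shift = math.radians(phase_shift_deg)
    n = int(SAMPLE_RATE_HZ / F0_HZ) * periods
    return [
        math.sin(2 * math.pi * F0_HZ * t / SAMPLE_RATE_HZ)
        + third_amplitude * math.sin(2 * math.pi * F3_HZ * t / SAMPLE_RATE_HZ + shift)
        for t in range(n)
    ]


def peak(samples):
    """Largest absolute sample value (waveform shape indicator)."""
    return max(abs(s) for s in samples)


def energy(samples):
    """Sum of squared samples (unaffected by the relative phase over whole periods)."""
    return sum(s * s for s in samples)


in_phase = composite(0.0)   # analogous to the unshifted composite
shifted = composite(90.0)   # third harmonic shifted by 90 degrees
```

Under these assumptions the two sample lists have the same energy but different peak values, so the waveform shape changes even though the strength of each component does not.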
`Cabot states that its “results, both quantitative and qualitative,
`correlate well with those of previous researchers using both similar and very
`different experimental techniques.” Ex. 1006, 571. Cabot states that
`“[a]lthough differences were detectable, they were subtle,” and “[t]his raises
`the question of its audibility compared to the more familiar forms of
`distortion.” Id.
`
`E. Obviousness Ground 1: Srinivasan, Cabot, and Kudumakis
`Petitioner contends that the subject matter of claims 1–10 would have
`been obvious over Srinivasan, Cabot, and Kudumakis. Pet. 2. Patent Owner
`argues that Petitioner’s showing is insufficient. PO Resp. 23–47.
`1. Independent Claim 1
`a. Step 1PRE: “A method for embedding data in an audio signal, the
`method comprising:”
`Petitioner contends that “Srinivasan’s differential phase encoding
`method that ‘add[s] an inaudible code to an audio signal’ as ‘bit[s]’” satisfies
`the preamble. Pet. 29 (citing Ex. 1003 ¶ 139; quoting Ex. 1005, 1:5–7,
`7:67).
`b. Step 1A: “(a) dividing the audio signal into a plurality of time
`frames and, in each time frame, a plurality of frequency components”
`Petitioner contends that Srinivasan discloses this step. Pet. 29–31.
`According to Petitioner, Srinivasan discloses sampling an audio signal,
`dividing the digitized audio signal into a plurality of time frames called
`“blocks,” and then dividing each block into a plurality of frequency
`components by taking the FFT (Fast Fourier Transform). See id.
`(reproducing Ex. 1005, Figs. 2, 5; quoting id. at 7:33–42 (“a first block v(t)
`of jNc samples is derived from the audio signal portion 14 . . . where v(t) is
`the time-domain representation of the audio signal within the block”)).
`c. Step 1B: “(b) in each of at least some of the plurality of time
`frames, selecting at least two of the plurality of frequency components; and”
`Petitioner contends that Srinivasan’s method selects “two of the
`plurality of frequency components as the code frequencies f1 and f0 (also
`represented by frequency indices I1 and I0, respectively).” Pet. 32 (citing
`Ex. 1005, 7:64–8:3; Ex. 1003 ¶ 143). Petitioner quotes Srinivasan as
`teaching that “[t]he code frequencies fi used for coding a block may be
`chosen from the Fourier Transform ℑ{v(t)} at a step 46” of Figure 2. Id.
`(quoting Ex. 1005, 7:64–66).
`Srinivasan’s Figure 2 follows:
`
`Srinivasan’s Figure 2 is a flow chart of steps performed by an
`encoder. Ex. 1005, 5:56–57. Petitioner relies on Srinivasan’s Figure 2
`above (step 46) to illustrate that Srinivasan’s technique involves selecting
`two frequencies f1 and f0 in each audio block of 512 samples (step 40). See
`Pet. 32–33 (citing Ex. 1003 ¶¶ 143–144).
`d. Step 1C: “altering a phase of at least one of the plurality of
`frequency components in accordance with the data to be embedded,
`wherein,”
`Petitioner contends that Srinivasan’s system modifies “the phase of
`one of the spectral components f0 or f1 . . . with respect to the other,” where
`Srinivasan refers to modified and reference phases as ɸM and ɸR,
`respectively. Pet. 34 (citing Ex. 1005, 11:26–39). Petitioner explains that
`Srinivasan’s method
`assign[s] two phase neighborhoods (one being ±45° (±π/4
`radians) of reference phase ɸR and another being ±45° of . . . 180°
`(π radians) out of phase with ɸR) to represent, e.g., a “0” and “1,”
`respectively, and by modifying the phase angle ɸM of the
`modifiable spectral component “at the step 56 so as to fall into
`one of these phase neighborhoods depending upon whether a
`binary ‘0’ or a binary ‘1’ is being encoded.”
`Id. (citing Ex. 1005, 11:26–54; Ex. 1003 ¶ 145; Pet. § VI.A).
`e. Step 1C-1: “wherein . . . step (b) comprises selecting a fundamental
`tone and at least one overtone” and Step 1C-2: “and step (c) comprises
`quantizing a phase difference of the at least one overtone relative to the
`fundamental tone to embed at least one bit of the data to be embedded.”
`
`For step 1C-1, Petitioner contends that “[t]he
`Srinivasan/Cabot/Kudumakis Combination selects a fundamental tone and
`the third harmonic (an overtone) as the code frequencies based on
`Kudumakis’s and Cabot’s teachings.” Pet. 35 (citing Pet. §§ VI.E, VI.E.3;
`Ex. 1003 ¶ 147). For step 1C-2, Petitioner similarly contends that “[t]he
`Srinivasan/Cabot/Kudumakis Combination uses Srinivasan’s method
`that quantizes a phase difference between code frequencies f1 and f0 to
`embed a bit of data.” Id. (citing Ex. 1005, 11:26–54; Ex. 1003 ¶ 148). In
`other words, as discussed further below, Petitioner relies primarily on Cabot
`and Kudumakis, together with its stated reasons for combining the three
`references, to address encoding the phase difference of an overtone relative
`to a fundamental tone as recited in steps 1C-1 and 1C-2.
`
`Further as to step 1C-2, Petitioner relies on Srinivasan’s method as
`described above, which quantizes phase differences between code
`frequencies f0 and f1 to represent either a binary 1 or 0. According to
`Petitioner, a person of ordinary skill in the art
`would have understood that establishing one phase difference
`value (0°) or neighborhood of values (within 45° of in-phase) to
`represent one binary value, and another phase difference value
`(180°) or neighborhood of values (within 45° of out-of-phase) to
`represent a different binary value is quantizing the phase
`difference between the two frequency components as claimed.
`Pet. 35–36 (citing Ex. 1003 ¶ 149).
`
`In other words, Petitioner employs Srinivasan’s method of phase
`encoding wherein a phase difference shift may be about 45°, because a
`phase for f1 relative to f0 may fall anywhere between 45° and 135° (e.g.,
`90°), and when Srinivasan’s method shifts that phase so that it falls in the
`closest neighborhood, the phase shift is as high as 45°. See Ex. 1003
`¶¶ 145–149.
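
The 45° figure can be checked with a short worked calculation, assuming (as the paragraph above does) that the phase is moved only as far as the edge of the closest neighborhood. Here Δ denotes the magnitude of the phase difference between f1 and f0 before encoding and s(Δ) the resulting shift; both symbols are introduced here for illustration and do not appear in the record.

```latex
s(\Delta) =
\begin{cases}
0, & 0^\circ \le \Delta \le 45^\circ \ \text{or}\ 135^\circ \le \Delta \le 180^\circ,\\
\Delta - 45^\circ, & 45^\circ < \Delta \le 90^\circ,\\
135^\circ - \Delta, & 90^\circ < \Delta < 135^\circ,
\end{cases}
\qquad
\max_{\Delta} s(\Delta) = s(90^\circ) = 45^\circ .
```

The shift is largest when the initial phase difference sits midway between the two neighborhoods, at 90°, which is the example given above.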
`
`Petitioner explains that a person of ordinary skill in the art “would
`have been motivated by Cabot and Kudumakis to modify Srinivasan to use
`the fundamental and third harmonic as the coding frequencies f0 and f1 to
`improve robustness and expand the applications for which Srinivasan’s
`system was suitable.” Pet. 22. Petitioner advances other reasons as to why
`an artisan of ordinary skill would have selected the fundamental and third
`harmonic as the coding frequencies. See id. at 21–28.
`
`For example, Petitioner contends that it would have been obvious to
`implement Cabot’s and Kudumakis’s suggestions to use the fundamental and
`third harmonic frequencies to modify Srinivasan’s system so that it can
`protect spoken speech and word recordings, which involve lower
`frequencies of the human audio spectrum than music. See Pet. 26–27.
`According to Petitioner, “[b]ecause such audio [speech] content commonly
`has frequencies up to about only 4 kHz, Srinivasan’s embedding band from
`4.8 kHz–6 kHz would not work with such narrowband voice applications.”
`Id. (citing Ex. 1003 ¶ 132). Petitioner explains that “[c]hoosing the
`fundamental and third harmonic as the coding frequencies would have
`expanded the useful applications of Srinivasan’s watermarking system to
`narrowband voice applications.” Id. at 27.
`
`Petitioner contends that a person of ordinary skill in the art would
`have “understood from Cabot that there was an alternative way to achieve
`low-visibility differential phase encoding.” Pet. 22 (citing Ex. 1003 ¶ 120).
`According to Petitioner,
`Cabot confirmed that modifying the phase of the third harmonic
`relative to the fundamental is inaudible for small phase changes
`and “subtle” at larger phase changes, even under pristine
`listening conditions where (1) the fundamental and third
`harmonic are isolated from all other sounds that might mask the
`phase change and (2) the listener, wearing headphones, is
`familiarized with the sound of the difference and given unlimited
`opportunities to try to detect the difference.
`Id. at 22–23 (citing Ex. 1006, Abstract, 570–71 (Observations & Discussions
`at ¶ 4, Conclusions); Ex. 1003 ¶ 120).
`
`Petitioner explains that a person of ordinary skill in the art
`would have understood that using the third harmonic and
`fundamental as Srinivasan’s code frequencies f1 and f0 . . . would
`have provided the low visibility Srinivasan desires, particularly
`under real-world conditions where the listener is unaware of the
`phase shift and the audio signal has other sounds masking the
`phase shift.
`Pet. 23 (emphasis added) (citing Ex. 1003 ¶¶ 121–124; Ex. 1006, 570–71;
`Ex. 1028 (Risset), 114 (noting there is a “remarkable insensitivity to phase”
`between harmonics of periodic tones and stating further that although
`changing the phase between the harmonics of a periodic tone can alter the
`timbre of audio under certain conditions, “this effect is quite weak, and it is
`generally inaudible in a normally reverberant room where phase relations are
`smeared.”); Ex. 1031 (Lipshitz), 584 (noting symmetrical waveforms like
`those resulting from Cabot’s use of the fundamental and third harmonic give
`rise to “much less pronounced phase effects” than those displayed by
`asymmetrical signals)).
`
`With respect to security, Petitioner explains that “because the
`fundamental frequency and its harmonics change unpredictably throughout
`the audio signal, using them to determine where to embed a code
`‘[e]nhance[s] security against malicious attacks’ seeking to remove
`the code.” Pet. 24 (quoting Ex. 1007, 5:3; citing id. at 3:3–18, 6:1–2;
`Ex. 1003 ¶ 127). Kudumakis teaches that “[i]f the code frequencies change
`frequently, then it becomes more difficult for an attacker to remove all the
`codes without introducing significant distortion to the original signal
`content.” Ex. 1007, 5:5–7. Therefore, Petitioner contends that “[p]lacing
`the watermark in perceptually less-significant regions (like Srinivasan did)
`leaves it susceptible to low-pass filtering that would remove the code by
`only retaining lower frequencies of the signal perceptible to the listener.”
`Pet. 25.
`
`With respect to robustness, Petitioner contends that “the modification
`would have resulted in using perceptually more significant lower frequencies
`less susceptible to common signal distortions and malicious attacks.”
`Pet. 24 (citing Ex. 1003 ¶ 128; Ex. 1020, Abstract (“a watermark must be
`placed in perceptually significant components of a signal if it is to be robust
`to common signal distortions and malicious attack.”); Ex. 1018, 4:32–37
`(“The higher the level of embedded signal, the more corrupted a copy can be
`and still be identified.”)).
`
`Petitioner explains that for audio files including speech, “the
`perceptually significant frequencies with the most energy were well below
`the 4.8–6 kHz range Srinivasan used and include the fundamental and its
`harmonics.” Pet. 24 (citing Ex. 1003 ¶ 128; Ex. 1008 (Tilki), 331 (“lower
`frequencies (below say 2 kHz) . . . typically contain the most energy in
`common audio signals”); Ex. 1020 (“Cox-1995”), 11 (“spectrum analysis of
`images and sounds reveals that most of the information in such data
`is located in the low frequency regions”); Ex. 1055 (“Cox-SWWW”), 1667).
`
`Therefore, according to Petitioner, “[p]lacing the watermark in
`perceptually less-significant regions (like Srinivasan did) leaves it
`susceptible to low-pass filtering that would remove the code by only
`retaining lower frequencies of the signal perceptible to the listener.”
`Pet. 25 (citing Ex. 1003 ¶ 129; Ex. 1020, Abstract; Ex. 1021 (“Wu”), Table
`4, 390 (“[L]ow pass filtering . . . could effectively eliminate the embedded
`watermark. However, since our watermark is embedded in the frequency
`bands with the highest energy, filtering out the inserted watermark also
`greatly effects the sound quality.”)).
`
`Petitioner also explains that “by placing the embedded data in
`frequency components that will typically be of relatively low energy,
`Srinivasan risks having the data be non-recoverable due to common signal
`distortions like additive noise.” Pet. 25 (citing Ex. 1003 ¶ 129; Cox-1995,
`Abstract; Ex. 1021, Table 4, 390 (revealing that white noise is detrimental to
`watermarks in weaker frequencies)).
`
`Petiti