throbber

`
`
`
`
`
`
`
`
`
`Petitioner Apple Inc.
`Petitioner Apple Inc.
`Ex. 1012, Cover
`Ex. 1012, Cover
`
`

`

`Discrete-Time
`Processing of
`Speech Signals_
`
`John R. Deller, Jr. Michigan State University
`John G. Proakis Northeastern University
`John H. L. Hansen Duke university
`
`Macmillan Publishing Company
`New York
`
`Maxwell Macmillan Canada
`Toronto
`
`Maxwell Macmillan International
`New York Oxford Singapore Sydney
`
`Petitioner Apple Inc.
`Ex. 1012, Cover-2
`
`

`

`Editor: John Griffin
`Production Supervisor: Elaine W. Wetterau
`Production Manager: Roger Vergnes
`Text Designer: Natasha Sylvester
`Cover Designer: Cathleen Norz
`Illustrations: Academy ArtWorks, Inc.
`This book was set in Times Roman by Graphic Sciences Corporation,
`printed and bound by Book Press.
`
`Copyright © 1993 by Macmillan Publishing Company, a division of Macmillan, Inc.
`
`Printed in the United States of America
`All rights reserved. No part of this book may be reproduced or
`transmitted in any form or by any means, electronic or mechanical,
`including photocopying, recording, or any information storage and
`retrieval system, without permission in writing from the publisher.
`
`Macmillan Publishing Company
`113 Sylvan Avenue, Englewood Cliffs, NJ 07632
`
`Library of Congress Cataloging in Publication Data
`
`Deller, John R.
`Discrete-time processing of speech signals I John R. Deller, John
`G. Proakis, John H. L. Hansen.
`p. em.
`Includes index.
`ISBN 0-02-328301-7
`1. Speech processing systems. 2. Signal processing-Digital
`techniques. 3. Discrete-time systems.
`I. Proakis, John G.
`II. Hansen, John H. L.
`III. Title.
`TK7882.S65D44 1993
`621.382'2-dc20
`
`Printing:
`
`2 3 4 5 6 7 8
`
`Year:
`
`5 6 7 8 9 0 1 2
`
`92-38644
`CIP
`
`Petitioner Apple Inc.
`Ex. 1012, Cover-3
`
`

`

`~ER ]1_--------------------------------------~
`
`Propaedeutic
`
`Read.Me: If you are someone who never reads Chapter 1, please at least
`read Sections 1.0.2 and 1.0.3 before proceeding!
`
`1.0 Preamble
`
`1.0.1 The Purpose of Chapter 1
`If the reader learns nothing more from this book, it is a safe bet that
`he or she will learn a new word. A propaedeutic1 is a "preliminary body
`of knowledge and rules necessary for the study of some art or science"
`(Barnhart, 1964). This chapter is just that-a propaedeutic for the study
`of speech processing focusing primarily on two broad areas, digital signal
`processing (DSP) and stochastic processes, and also on some necessary
`topics from the fields of statistical pattern recognition and information
`theory.
`The reader of this book is assumed to have a sound background in the
`first two of these areas, typical of an entry level graduate course in each
`field. It is not our purpose to comprehensively teach DSP and random
`processes, and the brief presentation here is not intended to provide an
`adequate background. There are many fine textbooks to which the reader
`might refer to review and reinforce prerequisite topics for these subjects.
`We list a considerable number of widely used books in Appendices 1.A
`and 1.B.
`What, then, is the point of our propaedeutic? The remainder of this
`chapter is divided into four main sections plus one small section, and the
`tutorial goals are somewhat different in each. Let us first consider the
`two main sections on DSP and stochastic processes. In the authors' expe(cid:173)
`rience, the speech processing student is somewhat more comfortable with
`"deterministic" DSP topics than with random processes. What we will do
`in Section 1.1, which focuses on DSP, therefore, is highlight some of the
`key concepts which will play central roles in our speech processing work.
`Where the material seems unfamiliar, the reader is urged to seek help in
`
`1 Pronounced "pro' -pa-doo' -tic."
`
`3
`
`Petitioner Apple Inc.
`Ex. 1012, p. 3
`
`

`

`4 Ch. 1 I Propaedeutic
`
`one or more of the DSP textbooks cited in Appendix l.A. Our main ob(cid:173)
`jective is to briefly outline the essential DSP topics with a particular in(cid:173)
`terest defining notation that will be used consistently throughout the
`book. A second objective is to cover a few subtler concepts that will be
`important in this book, and that might have been missed in the reader's
`first exposure to DSP.
`The goals of Section 1.2 on random processes are somewhat differ(cid:173)
`ent. We will introduce some fundamental concepts with a bit more for(cid:173)
`mality, uniformity, and detail than the DSP material. This treatment
`might at first seem unnecessarily detailed for a textbook on speech pro(cid:173)
`cessing. We do so, however, for several reasons. First, a clear understand(cid:173)
`ing of stochastic process concepts, which are so essential in speech
`processing, depends strongly on an understanding of the basic probability
`formalisms. Second, many engineering courses rely heavily on stochastic
`processes and not so much on the underlying probability concepts, so
`that the probability concepts become "rusty." Emerging technologies in
`speech processing depend on the basic probability theory and some re(cid:173)
`view of these ideas could prove useful. Third, it is true that the mastery
`of any subject requires several "passes" through the material, but engi(cid:173)
`neers often find this especially true of the field of probability and ran(cid:173)
`dom processes.
`The third and fourth major divisions of this chapter, Sections 1.3 and
`1.4, treat a few topics which are used in the vast fields of statistical pat(cid:173)
`tern recognition and information theory. In fact, we have included some
`topics in Section 1.3 which are perhaps more general than "pattern rec(cid:173)
`ognition" methods, but the rubric will suffice. These sections are con(cid:173)
`cerned with basic mathematical tools which will be used frequently, and
`in diverse ways in our study, beginning in Part IV of the book. There is
`no assumption that the reader has formal coursework in these topics be(cid:173)
`yond the normal acquaintance with them that would ordinarily be de(cid:173)
`rived from an engineering education. Therefore, the goal of these sections
`is to give an adequate description of a few important topics which will be
`critical to our speech work.
`Finally, Section 1.5 briefly reviews the essence and notation of phasors
`and steady-state analysis of systems described by differential equations.
`A firm grasp of this material will be necessary in our early work on ana(cid:173)
`log acoustic modeling of the speech production system in Chapter 3.
`As indicated above, the need for the subjects in Sections 1.3-1.5 is not
`immediate, so the reader might wish to scan over these sections, then re(cid:173)
`turn to them as needed. More guidance on reading strategy follows.
`
`1.0.2 Please Read This Note on Notation
`The principal tool of engineering is applied mathematics. The lan(cid:173)
`guage of mathematics is abstract symbolism. This book is written with a
`conviction that careful and consistent notation is a sign of clear under-
`
`Petitioner Apple Inc.
`Ex. 1012, p. 4
`
`

`

`1.0 1 Preamble 5
`
`standing, and clear understanding is derived by forcing oneself to com(cid:173)
`prehend and use such notation. Painstaking care has been taken in this
`book to use information-laden and consistent notation in keeping with
`this philosophy. When we err with notation, we err on the side of exces(cid:173)
`sive notation which is not always conventional, and not always necessary
`once the topic has been mastered. Therefore, the reader is invited (with
`your instructor's permission if you are taking a course!) to shorten or
`simplify the notation as the need for the "tutorial" notation subsides.
`Let us give some examples. We will later use an argument m to keep
`track of the point in time at which certain features are extracted from a
`speech signal. This argument is key to understanding the "short-term"
`nature of the processing of speech. The ith "linear prediction" coefficient
`computed on a "frame" of speech ending at time m will be denoted
`ii(i; m). In the development of an algorithm for computing the coeffi(cid:173)
`cients, forexample, the index m will not be very germane to the develop(cid:173)
`ment and the reader might wish to omit it once its significance is clear.
`Another example comes from the random process theory. Numerous ex(cid:173)
`amples of sloppy notation abound in probability theory, a likely reason
`why many engineers find this subject intractable. For example, some(cid:173)
`thing like "f(x)" is frequently used to denote the probability density func(cid:173)
`tion (pdf) for the random variable ~· There are numerous ways in which
`this notation can cause misunderstandings and even subtle mathematical
`traps which can lead to incorrect results. We will be careful in this text to
`delineate random processes, random variables, and values that may be
`assumed by a random variable. We will denote a random variable, for
`example, by underscoring the variable name, for example, x. The pdf for
`~ will be denoted fx(x), for example. The reader who has a clear under(cid:173)
`standing of the underlying concepts might choose to resort to some slop(cid:173)
`pier form of notation, but the reader who does not will benefit greatly by
`working to understanding the details of the notation.
`
`1.0.3 For People Who Never Read Chapter 1 (and Those Who Do)
`To be entitled to use the word "propaedeutic" at your next social en(cid:173)
`gagement, you must read at least some of this chapter. 2 If for no other
`reason than to become familiar with the notation, we urge you to at least
`generally review the topics here before proceeding. However, there is a
`large amount of material in this chapter, and some people will naturally
`prefer to review these topics on an "as needed" basis. For that reason, we
`provide the following guide to the use of Chapter 1.
`With a few exceptions, most of the topics in Sections 1.1 and 1.2 will
`be widely used throughout the book and we recommend their review be(cid:173)
`fore proceeding. The one exception is the subsection on "State Space Re-
`
`21f you have skipped the first part of this chapter, you will be using the word without
`even knowing what it means.
`
`Petitioner Apple Inc.
`Ex. 1012, p. 5
`
`

`

`6 Ch. 1 1 Propaedeutic
`
`alizations" in Section 1.1.6, which will be used in a limited way in
`Chapters 5 and 12. The topics in Sections 1.3 and 1.4, however, are
`mostly specialized subjects which will be used in particular aspects of our
`study, beginning in Part IV of the book. Likewise the topic in Section 1.5
`is used in one isolated, but important, body of material in Chapter 3.
`These latter topics and the "state space" topic in the earlier section
`will be "flagged" in Reading Notes at the beginning of relevant chapters,
`and in other appropriate places in the book.
`
`1.1 Review of DSP Concepts and Notation
`
`1.1.1 "Normalized Time and Frequency"
`Throughout the book, we will implicitly use what we might call nor(cid:173)
`malized time and frequency variables. By this we mean that a discrete
`time signal (usually speech), say s(n), will be indexed by integers only. 3
`Whereas s(n) invariably represents samples of an analog waveform, say
`sa<t), at some sample period, T,
`
`( 1.1)
`
`n = ... , -1, 0, 1, 2, ... ,
`the integer n indexes the sample number, but we have lost the absolute
`time orientation in the argument. To recover the times at which the sam(cid:173)
`ples are taken, we simply need to know T.
`To understand the "physical" significance of this mathematical con(cid:173)
`vention, it is sometimes convenient to imagine that we have scaled the
`real-world time axis by a factor of T prior to taking the samples, as illus(cid:173)
`trated in Fig. 1.1. "Normalized time," say t', is related to real time as
`t' =.!... T
`and the samples of speech are taken at intervals which are exactly4 "nor(cid:173)
`malized seconds (norm-sec)." In most cases it is perfectly sufficient to
`refer to the interval between samples as the "sample period," where the
`conversion to the real-world interval is obvious. However, on a few occa(cid:173)
`sions we will have more than one sampling process occurring in the same
`problem (i.e., a resampling of the speech sequence), and in these in(cid:173)
`stances the concept of a "normalized second" is useful to refer to the
`basic sampling interval on the data.
`Of course, the normalization of time renders certain frequency quanti(cid:173)
`ties invariant. The sample period in normalized time is always unity, and
`therefore the sample frequency is always unity [dimensionless, but some-
`
`( 1.2)
`
`3Note that we have referred to s(n) as "discrete time" rather than "digital." Throughout
`most of this book, we will ignore any quantization of amplitude.
`4The reader should note that the normalized time axis is actually dimensionless.
`
`Petitioner Apple Inc.
`Ex. 1012, p. 6
`
`

`

`1.1 f Review of DSP Concepts and Notation 7
`
`Speech waveform
`
`0 1 2 3
`
`9
`8
`4 5 6 7
`"Normalized" time, t' (norm-sec)
`
`18
`
`0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8
`Real time, t (msec)
`
`FIGURE 1.1. Segment of a speech waveform used to illustrate the concept
`of "normalized time." Suppose that samples are to be taken at a rate F. =
`1 0 kHz so that the sample period is T = 0.1 msec. The lower time axis
`represents real time measured in milliseconds, while the upper represents a
`normalization of the time axis such that the sample times fall at integers.
`Normalized time, t', is related to real time, t, as t' = tfT. We will on a few
`occasions refer to the sample period in the scaled case as a "normalized
`second (norm-sec)."
`
`times "normalized Hertz (norm-Hz)"], and the sample radian frequency
`is always 2n [dimensionless or "normalized radians per second (norm(cid:173)
`rps)"]. Accordingly, the Nyquist frequency is always 0.5 norm-Hz, or n
`norm-rps. In general, the conversions between "real" frequencies, say F
`(Hz) and n (rps) and their normalized counterparts, say f and w, are
`given by
`
`f=FT
`w=QT.
`
`( 1.3)
`
`( 1.4)
`
`We can easily verify this by examining a single sinusoid at real frequency
`n,
`
`xa(t) =A sin(Ot + q7) =A sin(QT; + q7).
`
`( 1.5)
`
`The rightmost term can be regarded as a sinusoid at a different fre(cid:173)
`quency, w = QT, on a different time axis t' = t /T,
`x~(t') =A sin(wt' + q7).
`
`(1.6)
`
`Petitioner Apple Inc.
`Ex. 1012, p. 7
`
`

`

`8 Ch. 1 1 Propaedeutic
`
`Clearly, we obtain the same samples if we sample xa(t) at t = nT, or
`x~(t') at t' = n. A magnitude spectrum is plotted against real and nor(cid:173)
`malized frequencies in Fig. 1.2 to illustrate this concept.
`In spite of this rather lengthy explanation of "normalized time and
`frequency," we do not want to overemphasize this issue. Simply stated,
`we will find it convenient to index speech waveforms by integers (espe(cid:173)
`cially in theoretical developments). This is an accepted convention in
`. DSP. The point of the above is to remind the reader that the resulting
`"normalized" time and frequency domains are simply related to the
`"real" time and frequency. When the "normalized" quantities need to be
`converted to "real" quantities, this is very easily accomplished if the
`sample frequency or period is known. While using the DSP convention
`for convenience in many discussions, we will always keep the sampling
`information close at hand so the "real" quantities are known. To do oth(cid:173)
`erwise would be to deny the physical nature of the process with which we
`are working. In many instances in which practical systems and applica(cid:173)
`tions are being discussed, it will make perfect sense· to simply work in
`terms of "real" quantities.
`
`60
`
`a:) .,
`"""
`"--,
`~
`v "0
`
`3 ·= OJJ "' E
`
`-;;;
`b
`u
`<()
`0..
`CI"J
`
`00
`
`0.1
`
`0.3
`0.2
`0.4
`"Normalized" frequency, f(norm-Hz)
`
`0.5
`
`4
`
`5
`
`0
`
`2
`3
`Frequency, F (kHz)
`FIGURE 1.2. Magnitude spectrum of a typical speech waveform. This
`spectrum is based on the DFT of samples of the waveform taken at
`1 0 kHz. The lower frequency axis represents real frequencies measured in
`kHz, while the upper represents a normalization of the frequency axis
`concomitant to the time normalization. Normalized frequency, f, is related to
`real frequency, F, as f= FT, where Tis the nominal sample period in a given
`analysis. Accordingly, the "normalized" sampling, say t •• and Nyquist, fN,
`frequencies are invariant with the sample rate, with t. = 1 and fN = 1/2. We
`will sometimes refer to the units of normalized frequencies as "normalized
`Hertz (norm-Hz)" or "normalized radians per second (norm-rps)."
`
`Petitioner Apple Inc.
`Ex. 1012, p. 8
`
`

`

`1.1 1 Review of DSP Concepts and Notation 9
`
`1.1.2 Singularity Signals
`In the continuous time domain, a singularity signal is one for which
`one or more derivatives do not exist at one or more time points. Al(cid:173)
`though the concept of a derivative is no longer meaningful in discrete
`time, we often borrow the term singularity to describe analogous se(cid:173)
`quences in sampled time. The two for which we will have the most use
`are
`
`• The unit sample sequence or discrete-time impulse, defined by
`if n = 0
`
`J(n) d,;,f
`
`1'
`
`{
`
`0,
`
`otherwise
`
`( 1. 7)
`
`• The unit step sequence, defined by 5
`
`u(n) d~r
`
`1'
`
`0,
`
`{
`
`if n > 0
`
`otherwise
`
`(1.8)
`
`The unit step sequence is much more analogous to its analog counter(cid:173)
`part than the discrete-time impulse in the sense that u(n) simply repre(cid:173)
`sents samples of the analog unit step function (discounting problems that
`may arise due to different definitions at time zero). On the other hand,
`recall that the analog (Dirac) impulse function, say Ja(t), is defined such
`that it apparently has infinite height, zero width, and unity area. Al(cid:173)
`though the discrete-time impulse plays an analogous role to that played
`by the analog impulse, it may not be interpreted as its samples.
`
`1.1.3 Energy and Power Signals
`There are many ways in which a discrete time signal can be classified.
`One useful grouping is into the categories energy signal, power signal, or
`neither. Recall that the energy of a discrete time signal is defined as 6
`Ex d~r L jx(n)j2.
`
`cc
`
`n=-co
`
`( 1.9)
`
`A signal x(n) IS called an energy signal if
`
`O<Ex<oo.
`
`( 1.1 0)
`
`The power in a discrete-time sequence is
`
`5The notation u(n) is widely used to indicate the unit step sequence, but u(n) will also
`refer to a very important waveform, the ~glottal volume velocity," throughout the book. Be(cid:173)
`cause of the context, there will be no risk of confusion.
`6The absolute value signs are included because, in general, x(n) is a complex sequence.
`
`Petitioner Apple Inc.
`Ex. 1012, p. 9
`
`

`

`10 Ch. 1 1 Propaedeutic
`
`P
`x
`
`ct~r lim
`N-co
`
`1
`
`N
`
`2N + 1 L lx(n)l2·
`
`n=-N
`
`(1.11)
`
`A power signal has finite but nonzero power,
`0 < Px <OJ.
`A signal cannot be both a power signal and an energy signal simultane(cid:173)
`ously, since if Ex < ro, then Px = 0. A signal can, however, be neither
`when Px = ro or Ex= 0.
`For our purposes in speech processing, it is sufficient to associate the
`energy category with two broad classes of signals. These are
`
`(1.12)
`
`• Transients, those which decay (usually exponentially) with time. Ex(cid:173)
`amples are
`
`lal< 1
`
`( 1.13)
`
`(1.14)
`/a/< 1.
`• Finite sequences, those which are zero outside a finite time duration.
`An example is
`IPI < ro.
`x 3(n) = ePn[ u(n + 3)- u(n- 246)],
`Whereas the energy signals either decay out sufficiently fast or "stop"
`completely, the power signals neither decay nor increase in their enve(cid:173)
`lopes. The power signals can be associated with three broad classes of
`signals. These are
`
`(1.15)
`
`• Constant signals. An example is
`
`-ro < a < ro.
`( 1.16)
`• Periodic signals, those for which x(n) = x(n + N) for some finite N
`and for all n. Examples are
`
`-ro <a< ro
`x6(n) = [x3(n)Jmodulo 512 = I x3<n + i512).
`
`00
`
`i=-co
`
`( 1.17)
`
`( 1.18)
`
`• Realizations of stationary, ergodic stochastic processes (see Section
`1.2.3).
`
`The signals which fall into neither category are the trivial zero signal
`and those which "blow up" with time. Examples of the latter are x 1(n)
`and x 2(n) above with the magnitude of a taken to be greater than unity.
`
`1.1.4 Transforms and a Few Related Concepts
`At the heart of much of engineering analysis are various frequency do(cid:173)
`main transforms. Three transforms on discrete-time data will be used ex-
`
`Petitioner Apple Inc.
`Ex. 1012, p. 10
`
`

`

`1.1 1 Review of DSP Concepts and Notation 11
`
`tensively throughout this book, and it will be assumed that the reader is
`familiar with their properties and usage.
`The first is the discrete-time Fourier transform (DTFT), which, for the
`sequence x(n), is defined by
`X(w) = I x(n)e-jwn.
`
`( 1.19)
`
`00
`
`The inverse DTFT (IDTFT) is given by
`
`n=-oo
`
`fTC
`X(w)ejwn dw.
`-n
`
`( 1.20)
`
`1
`x(n) = -
`2
`rr
`The DTFT bears a useful relationship to the continuous-time Fourier
`transform in the case in which x(n) represents samples of the analog sig(cid:173)
`naF xa(t'). In this case X(w) will be a periodic (with period 2rr), poten(cid:173)
`tially aliased version of Xa(w),
`X(w) = I Xa(w- 2rri).
`
`00
`
`i=-co
`
`( 1.21)
`
`The existence of the DTFT is not a trivial subject, and we will review
`only a few important details. A sufficient condition for the DTFT of a
`sequence x(n) to exist is that the sequence be absolutely summable,
`00 I \x(n)\ <co.
`
`( 1.22)
`
`n=-co
`
`This follows immediately from ( 1.19). Moreover, absolute summability of
`x(n) is tantamount to absolute convergence of the series I:-oo x(n)e-jwn
`implying that this series converges uniformly to a continuous function of
`w (Churchill, 1960, Sees. 59 and 60). A sequence that is absolutely sum(cid:173)
`mabie will necessarily be an energy signal, since
`
`( 1.23)
`
`There are, however, energy signals that are not absolutely summable (see
`Problem 1.2). These energy signals will still have DTFTs, but ones whose
`series converge in a weaker (mean square) sense. This can be seen by
`viewing ( 1.19) as a conventional Fourier series for the periodic function
`X(w) whose coefficients are x(n). One of the properties of Fourier series
`is that if the energy in a single period of the fur..ction is finite, then the
`
`7Note the use of "normalized time" here. If "real" time is used, (1.21) becomes
`
`Petitioner Apple Inc.
`Ex. 1012, p. 11
`
`

`

`12 Ch. 1 1 Propaedeutic
`
`series will converge in mean square (Churchill, 1963). In the present case
`(using the Parseval relation),
`jX(w)j 2 dw = 2rc I x 2(n) = 2rcEx < ro,
`
`f 1t
`
`-1t
`
`00
`
`n=-oo
`
`(1.24)
`
`so the OTFT will converge in the mean square sense. Practically, this
`means that the OTFT sum will converge to X(w) at all points of continu(cid:173)
`ity, and at points of discontinuity it will converge to the "average" value
`("halfway" between the values on either side of the discontinuity).
`Properties of the OTFT are detailed in the textbooks cited in Appen(cid:173)
`dix l.A, and some are reviewed in Problem 1.3. The reader should also
`recall the numerous symmetry properties of the transform relation which
`can be useful in simplifying various computations and algorithms.
`Whereas the OTFT is most useful for theoretical spectral analysis, it is
`not computable on a digital computer because it is a function of a con(cid:173)
`tinuous argument. In principle, it also works with a sequence of doubly
`infinite length, which also precludes any practical computation. If we re(cid:173)
`strict ourselves to the practical situation in which a sequence of finite
`length is being studied, then the discrete Fourier transform (OFT) pro(cid:173)
`vides a mapping between the sequence, say
`
`x(n), n=O,l,2, ... ,N-1
`
`( 1.25)
`
`and a discrete set of frequency domain samples, given by
`
`X(k) =
`
`{
`
`N-1
`'\:"' X(n)e-;(2rr/N)kn
`L
`'
`
`n~o
`
`k = 0, I, ... , N- l
`
`0,
`
`other k
`
`The Inverse DFT (IDFT) is given by
`_!_ I X(k)e)(2n/N)kn
`
`N-1
`
`x(n) = N k~o
`{
`0,
`
`'
`
`n == 0, l, ... , N- l
`
`other n
`
`( 1.26)
`
`( 1.27)
`
`The OFT represents exact samples of the OTFT of the finite sequence
`x(n) at N equally spaced frequencies, wk = (2rc/N)k, for k E [O,N- 1].
`The discrete Fourier series (OFS) is closely related to the OFT compu(cid:173)
`tationally, but is quite different philosophically. The OFS is used to rep(cid:173)
`resent a periodic sequence (hence a power signal) with period, say N,
`using the set of basis functions eJ<Zn/N)kn for k = 0, ... , N- l. These rep(cid:173)
`resent the N harmonic frequencies that may be present in the signal. For
`a periodic sequence y(n), the expansion is
`y(n) = I C(k)eJ(2n/N)kn,
`
`( 1.28)
`
`N-1
`
`k~O
`
`Petitioner Apple Inc.
`Ex. 1012, p. 12
`
`

`

`1.1 1 Review of DSP Concepts and Notation 13
`
`where the coefficients are computed as
`
`C(k) = k LY(n)e-J(27r/N)kn.
`
`N-I
`
`n~o
`
`( 1.29)
`
`[In principle, the C(k)'s may be computed over any period of y(n).]
`It is occasionally convenient to use an "engineering DTFT" for a peri(cid:173)
`odic signal that technically has no DTFT. The contrived DTFT com(cid:173)
`posed of analog impulse functions at the harmonic frequencies weighted
`by the DFS coefficients is
`
`(1.30)
`
`Such a construction is not always palatable to a mathematician, but it
`works for most engineering purposes in the sense that it can be used any(cid:173)
`where that a DTFI is needed for y(n), as long as the rules for continuous(cid:173)
`time impulse functions are carefully followed. Note that this DTFT cor(cid:173)
`rectly asserts, for example, that y(n) has infinite energy at the harmonic
`frequencies. Consistency with conventional Fourier transform computa(cid:173)
`tions is obtained by defining the magnitude spectrum of such a DTFT by
`
`One more contrived quantity is sometimes used. The power density spec(cid:173)
`trum (PDS) for a periodic signal y(n), say ~,(w), is a real-valued function
`of frequency such that the average power in y(n) on the frequency range
`WI to w 2 with 0:::; wi < w 2 < 2n is given by
`
`(1.31)
`
`1 fw2
`average power in y(n) on wE [wl' w2 ] =-
`
`7{
`
`Wi
`
`~.(w) dw
`
`.
`
`(1.32)
`
`k2
`
`= 2 L I C(k)l2,
`
`k~kl
`
`where ki and k 2 represent the integer indices of the lowest and highest
`harmonic of y(n) in the specified range. 8 It is not difficult to show that a
`suitable definition is
`
`f- I k 12
`(
`2n)
`T;,(w)=2nk~oo C() oaw-kN.
`def
`
`(1.33)
`
`"If w 1 and hence k 1 are zero, then the lower expression in ( 1.32) should read
`k,
`C(O) + 2 .2:1C(k)l 2
`
`•
`
`k~l
`
`Petitioner Apple Inc.
`Ex. 1012, p. 13
`
`

`

`14 Ch. 1 J Propaedeutic
`
`[The reader can confirm that this is consistent with ( 1.32).] By comparing
`with ( 1.30), it is clear why some authors choose to write I Y(wW as a no(cid:173)
`tation for the PDS. The advantage of doing so is that it gives the con(cid:173)
`trived DTFT of (1.30) yet another "DTFT-like" property in the following
`sense: IX(w)j 2 is properly called the energy density spectrum for an energy
`signal x(n) and can be integrated over a specified frequency range to find
`total energy in that range. I Y(w)j 2 is thus an analogous notation for an
`analogous function for a power sequence. The disadvantage is that it in(cid:173)
`troduces more notation which can be easily confused with a more
`"valid" spectral quantity. We will therefore use only 1(w) to indicate the
`.
`y
`PDS of a periodic signal y(n).
`The similarity of the DFT to the DFS is apparent, and this similarity
`is consistent with our understanding that the IDFT, if used outside the
`range n E [0, N- 1], will produce a periodic replication of the finite
`sequence x(n). Related to this periodic nature of the DFT are the proper(cid:173)
`ties of "circular shift" and "circular convolution" of which the reader
`must beware in any application of this transform. A few of these notions
`are reviewed in Problem 1.4.
`For interpretive purposes, it will be useful for us to note the following.
`Although the DTFT does not exist for a periodic signal, we might con(cid:173)
`sider taking the limit9
`
`1
`.
`-
`y ( w) = J~ 2N + 1
`
`N L y(n)e-Jwn
`
`n~-N
`
`( 1.34)
`
`in the hope of making the transform converge. A moment's thought will
`reveal that this computation is equivalent to the same sum taken over a
`single period, say
`
`( 1.35)
`
`We shall refer to Y(w), 0 < w < 2n, as the complex envelope spectrum for
`a periodic signal y(n). The theoretical significance of the complex enve(cid:173)
`lope is that it can be sampled at the harmonic frequencies to obtain the
`DFS coefficients for the sequence. The reader will recall that a similar
`phenomenon occurs in the analog domain where the FT of one period of
`a periodic signal can be sampled at the harmonics to obtain the FS
`coefficients.
`Finally, with regard to Fourier techniques, recall that the fast Fourier
`transform (FFT) is a name collectively given to several classes of fast al(cid:173)
`gorithms for computing the DFT. The literature on this subject is vast,
`
`9The operator notation .L'{ · ) will be used consistently in the text to denote a time aver(cid:173)
`age of this form. We will formally define the operator when we discuss averages in Section
`1.2.3.
`
`Petitioner Apple Inc.
`Ex. 1012, p. 14
`
`

`

`1 .1 1 Review of DSP Concepts and Notation 15
`
`but the textbooks cited above provide a general overview of most of the
`fundamental treatments of the FFT. Some advanced topics are found in
`(Burris, 1988).
`The final transform that will be used extensively in this book is the
`(two-sided) z-transform (ZT), defined by
`X(z) = I x(n)z-n,
`
`00
`
`n=-co
`
`(1.36)
`
`where z is any complex number for which the sum exists, that is, for
`which
`
`00 I ix(n)zl-n <co.
`
`n=-co
`
`( 1.37)
`
`x(n) =
`
`2!1
`
`The values of z for which the series converges comprise the region of
`convergence (ROC) for the ZT. When the series converges, it converges
`absolutely (Churchill, 1960, Sec. 59), implying that the ZT converges uni(cid:173)
`formly as a function of z everywhere in the ROC. Depending on the time
`sequence, the ROC may be the interior of a circle, the exterior of a cir(cid:173)
`cle, or an annulus of the form rin < lzl <rout' where rin may be zero and
`rout may be infinite. The ROC is often critical in uniquely associating a
`time sequence with a ZT. For details see the textbooks in Appendix l.A.
`The ZT is formally inverted by contour integration,
`J. X(z)zn-l dz,
`J::
`where e is a counterclockwise contour through the ROC and encircling
`the origin in the z-plane, but several useful computational methods are
`well known, notably the partial fraction expansion method, and the resi(cid:173)
`due method.
`The ZT plays a similar role in DSP to that which the Laplace trans(cid:173)
`form does in continuous processing. A good speech processing engineer
`will learn to "read the z-plane" much the same as the analog systems en(cid:173)
`gineer uses the s-plane. In particular, the reader should be familiar with
`the correspondence between angles in the z-plane and frequencies, and
`between z-plane magnitudes and "damping." The interpretation of pole(cid:173)
`zero plots in the z-plane is also an essential tool for the speech processing
`engineer.
`Finally, we recall the relationships among the two Fourier transforms
`and the ZT. From the definitions, it is clear that
`DTFT X(w) = ZT X(eJw)
`
`( 1.38)
`
`( 1.39)
`
`for any w, so that the DTFT at frequency w is obtained by evaluating
`the ZT at angle w on the unit circle in the z-plane. This is only valid, of
`
`Petitioner Apple Inc.
`Ex. 1012, p. 15
`
`

`

`16 Ch. 1 1 Propaedeutic
`
`course, when the ROC of the ZT includes the unit circle of the z-plane. 10
`The periodicity with period 2n of the DTFf is consistent in this regard.
`Since the DFT represents samples of the DTFT at frequencies wk,
`k = 0, l, ... , N- 1, it can be obtained by evaluating the ZT at equally
`spaced angles around the unit circle in the z-plane. Therefore,
`
`DIT X(k) = DTIT x(wk =;; k) = ZT X(ej(2n/N)k).
`
`(1.40)
`
`Since we use the same "uppercase," for example, X, notation to indicate
`all three transforms, it is occasionally necessary in DSP work to explic(cid:173)
`itly denote the particular transform with, for example, a presuperscript as
`in (1.40).
`
`1.1.5 Windows and Frames
`In all practical signal processing applications, it is necessary to work
`with short terms or frames of the signal, unless the signal is of short du(cid:173)
`ration.11 This is especially true if we are to use conventional analysis
`techniques on signals (such as speech) with nonstationary dynamics. In
`this case it is necessary to select a portion of the signal that can reason(cid:173)
`ably be assumed to be stationary.
`Recall that a (time domain) window, say w(n), is a real, finite length
`sequence used to select a desired frame of the original signal, say x(n), by
`a

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket