throbber

`
`  
` 

`
`OLYMPUS EX. 1021 -1/15
`
`OLYMPUS EX. 1021 - 1/15
`
`

`

`SIGNAL PROCESSING
`
`CONTENTS
`
`Volume 6 (1984) NO. 4
`
`Regular Papers
`
`M. Vetterli and H. Nussbaumer, Simple FFT and DOT algorithms with reduced number
`
`of operations
`
`J. B. Martens, Convolution algorithms, based on the CRT (Chinese Remainder
`Theorem)
`
`T. Gtingen and N. C. Geckini, An algorithm for formant tracking
`
`E.
`
`I. Plotkin,‘ L. M. Roytman and N. M. S. Swamy, Unbiased estimation of an initial phase
`
`F. F. Liedtke, Computer simulation of an automatic classification procedure for
`digitally modulated communication signals with unknown parameters
`
`Author Index
`
`Short Communications
`
`G. H. Allen, Programming an efficient radix-four FFT algorithm
`
`J. Le Roux, Non symmetric lattice structure for pole zero filters
`
`Book Review
`
`Announcements + Call for Papers
`
`OLYMPUS EX. 1021 - 2/15
`
`

`

`NAL PROCESSING
`
`A
`I
`
`I
`
`‘opean Journal devoted to the methods and applications of signal processing
`alication of the European Association for Signal Processing (EURASIP)
`
`M. Nagao (Kyoto, Japan)
`O. D. Faugeras (Paris, France)
`ir-in—Chiet
`H. Niemann (Erlangen, West Germany)
`A. L. Fawe (Sart Tilman, Belgium)
`‘
`t Kunt
`F. Pellandini (Neuchatel, Switzerland)
`A. Fettweis (Bochum, West Germany)
`ratoire de traitement de signaux
`8. Picinbono (Gif-sur-Yvette, France)
`J. G. Gander (Solothurn, Switzerland)
`3 polytechnique federale de Lausanne
`B. S. Ramakrishna (Bangalore, India)
`C, Gazanhes (Marseille, France)
`lhemin de Bellerive
`F. Rocca (Milano, Italy)
`.
`D. Godard (La Gaude, France)
`1007 Lausanne Switzerland
`8. Sankur (Istanbul, Turkey)
`G. H. Granlund (Linkoeping, Sweden)
`)2147 26 26 Telex: 24478
`L. L. Scharf (Kingston, USA)
`C. Gueguen (Paris, France)
`)rial Board
`J. C. Simon (Paris, France)
`T. S. Huang (West Lafayette, USA)
`ellanger (Le Plessis Robinson, France)
`C. Y. Suen (Montreal, Canada)
`Z. Kulpa (Warsaw, Poland)
`3ite (Mons, Belgium)
`Ju—Wei Tai (Peking, China)
`J. L. Lacoume (Grenoble, France)
`raccini (Genova, Italy)
`H. Tiziani (Heerbrugg, Switzerland)
`M. A. Lagunan-Hernandes
`appellini (Florence, Italy)
`D. Wolf (Frankfurt, West Germany)
`(Barcelona, Spain)
`arayannis (Athens, Greece)
`G. Winkler (Karlsruhe, West Germany)
`D. S. Lebedev (Moscow, USSR)
`.Constantinides (London, UK.)
`I. T. Young (Delft, The Netherlands)
`S. Levialdi (Napoli, Italy)
`acoulon (Lausanne, Switzerland)
`P. Zamperoni (Braunschweig, West
`C. E. Liedtke (Hannover, West Germany)
`Durrani (Glasgow, UK.)
`Germany)
`J. Max (Grenoble, France)
`lndres (Darmstadt, West Germany)
`scudié (Lyon, France)
`R. De Mori (Montreal, Canada)
`Secretary: I. Bezzi (Miss)
`
`orial Policy. SIGNAL PROCESSING is a European Journal
`Data PlOCeSSlng
`Remote Sensing
`.enting the theory and practice of signal processing.
`Its
`Communication Signal Processing
`iary objectives are:
`Biomedical Signal Processing
`lemination of research results and of engineering develop-
`Geophysical and Astrophysical Signal Processing
`its to European signal processing groups and individuals;
`Earth Resources Signal Processing
`sentation of practical solutions to current signal processing
`Acoustic and Vibration Signal Processing
`stems in engineering and science.
`Signal Processing Technology
`editorial policy andthetechnical content ofthe Journal are
`Speech Processing
`responsibility ofthe Editor—in-Chief and the Editorial Board.
`i Journal is self—supporting from subscription Income and
`Radar Signal PFOCGSSWQ
`Industrial Applications
`tains a minimum amount of advertisements. Advertisements
`subject to the prior approval of the Editor—in—Chief.
`
`Sonar Signal Processing
`SPECIE' Signal PiOCGSSlng
`New Applications
`
`upe. SIGNAL PROCESSING incorporates all aspects of the
`Dry and practice of signal processing (analogue and digital).
`iatures original research work, tutorial and review articles,
`I accounts of practical developments.
`It
`is intended for a
`id dissemination of knowledge and experience to engineers
`I scientists working in signal processing research, develop—
`nt or practical application.
`
`Jject coverage. Subject areas covered by the Journal include:
`nal Theory
`Signal Processing Systems
`ichastic Prooesses
`Software Developments
`tection and Estimation
`Image Processing
`sctral Analysis
`Pattern Recognition
`ering
`Optical Signal Processing
`
`Membership and Subscription Information.
`SignaIProcessing (ISSN 0165—1684) is publishedintwovolumes
`(Bissues) ayear.The subscription pricefor1984(Volumes 6, 7) is
`Dfl. 350.00 + DfI. 42.00 p.p.h. : Dfl. 392.00. Our p.p.h. (postage,
`packing and handling) charge includes surface delivery of all
`Issues, except to subscribers in the USA/Canada and India, who
`receive all issues by airdelivery (S.A.L. a Surface Airl Lifted) at no
`extracost. Forthe restlofthe world, airmailand S.A.L.charges are
`available upon request. Claims for missing issues will be
`honouredfree ofchargewithinthree months afterthe publication
`date of the issues. Mail orders and inquiries to: Elsevier Science
`Publishers B.V., Journals Department, PO. Box 211, 1000 AE
`Amsterdam, The Netherlands. For full membership information
`of the Association, including a subscription at a reduced rate,
`please contact: EURASIP. PO. Box 134, CH—1000 Lausanne13,
`Switzerland.
`
`
`1984, Elsevier Science Publishers B.V. (North—Holland)
`
`stem or transmitted in any form or by any means, electronic, mechanical,
`e Publishers B.V. (Norttholland), PO. Box 1991, 1000 BZ
`
`rights reserved. No part of this publication may be reproduced, stored in a retrieval sy
`)tocopying, recording or otherwise, without the prior permission of the publisher, Elsevier Scienc
`sterdam, The Netherlands.
`Jmission of an article for publication implies the transfer of the copyright from the author(s) to the publisher and entails the author's irrevocable and exclusive
`and/ortoactinoroutofCourtin
`horization oi the publisher to collect any sums or considerations for copying or reproduction payable by third parties (as mentioned in article 17 paragraph 2 of the
`Ich Copyright Act of 1912 and in the Royal Decree of June 20, 1974 (S. 351) pursuant to article 16b ofthe Dutch Copyright Act of 1912)
`inection therewith.
`scial regulations for readers in the U.S.A. A This journal has been registered with the Copyright Clearance Center. Inc. Consent Is given for copying of articles for
`is This consent is given on the condition that the copier pays through the Center the per-copy tee stated in
`'sonal or internal use, or for the personal use of specific clien .
`1
`code on the first page of each article for copying beyond that permitted by Sections 107 or 108 oi the US. Copyright Law. The appropriate fee should be (orwarded with
`opy ofthefirst page otthe articletothe Copyright Clearance Center, Inc,, 21 Congress Street, Salem,
`tgiven broad consent to copy and permission to copy must be obtained directly from the author. All articles published priorto 1981 may be copied fora perscopy tee oi
`$2.25, also payable through the Center. (NB. For review journals this fee is $0.25 per copy per page.) This consent does not extend to other kinds oi copying, such as
`general distribution, resale, advertising and promotion purposes, or forcreating new collective works. Special written permission must be obtained from the publisher
`such copying.
`will be asked to transfer copyright of the article to the publisher.
`eclal regulations [or authors in the USA. 7 Upon acceptance of an article by the journal, the author(s)
`is transfer will ensure the widest possible dissemination of intormation under the US. Copyright Law.
`
`ibiished bimonihiy
`
`Printed in The Netherlands
`
`OLYMPUS EX. 1021 - 3/15
`
`OLYMPUS EX. 1021 - 3/15
`
`

`

`This material may be protected by Copyright law (Title 17 U.S. Code)
`
`Signal Processing 6 (1984) 267—278
`North-Holland
`
`SIMPLE FFT AND DCT ALGORITHMS WITH REDUCED NUMBER OF
`OPERATIONS
`
`/
`Martin VETTERLI, member EURASIP, and Henri JKNNU§SBAUMER
`Laboratoire d’Informatique Technique, Ecole Polytechnique Fédérale de Lausanne, 16 Chemin de Bellerive, CH-1007 Lausanne,
`Switzerland
`
`Received 2 November 1983
`Revised 20 February 1984
`
`Abstract. A simple algorithm for the evaluation of‘discrete Fourier transforms (DFT) and discrete cosine transforms (DCT)
`is presented. This approach, based on the divide and conquer technique, achieves a substantial decrease in the number of
`additions when compared to currently used FFT algorithms (30% for a DFT on real data, 15% for a DFT on complex data
`and 25% for a DCT) and keeps the same number of multiplications as the best known FFT algorithms. The simple structure
`of the algorithm and the fact that it is best suited for real data (one does not have to take a transform oftwo real sequences
`simultaneously anymore) should lead to efficient implementations and to a wide range of applications
`
`Zusammenfassung. Ein einfacher Algorithmus zur Berechnung von diskreten Fourier Transformationen (DFT) und diskreten
`Cosinus Transformationen (DCT) wird vorgeschlagen. Diese Methode, basierend aufder “Teilen und Losen” Technik, erlaubt
`eine Verkleinerung der Anzahl Additionen gegeniiber gebraiichlichen FFT Algorithmen (30% fiir eine DFT von einem reellen
`Signal, 15% Hit eine DFT von einem complexen Signal und 25% fiir eine DCT) und braucht gleichviel Multiplikationen wie
`die besten bekannten FFT Algorithmen. Die einfache Struktur des Algorithmus und der Fakt dass er am besten fiir reelle
`Signale geeignet
`ist (man braucht nicht mehr gleichzeitig zwei reelle Signale zu transformieren) sollten zu effizienter
`Implementierung und zu zahlreichen Applikationen fiihren.
`
`Résumé. Un algorithme simple pour l’évaluation de la transformée de Fourier discrete (DFT) et de la transformée en cosinus
`discrete (DCT) est proposé. Cette approche, basée sur la méthode de la “division et solution”, permet une diminution
`Substantielle du nombre d’additions par rapport aux algorithmes de FFT courants (30% pour une DFT de signaux réels,
`15% pour une DFT de signaux complexes et 25% pour une DCT) tout en gardant un nombre de multiplications ‘égal a celui
`des meilleurs algorithmes de FFT connus. La structure simple de l’algorithme ainsi que le fait qu’il s’applique bien aux
`signaux réels (i1 n’y a plus besoin de prendre la transformée de deux signaux réels simultanément) devraient conduire a une
`implantation efficace ainsi qu’a un large champ d’applications.
`
`Ol65-1684/84/$3.00 © 1984, ‘Elsevier Science Publishers B.V. (North—Holland)
`
`transform (WFTA) [6], although a beautiful result in complexity theory, did not bring the expected
`improvements once implemented on real life computers [7], essentially due to the large total number of
`operations and to the structural complexity of the algorithm.
`The fact that most FFT’s are taken on real data is seldom fully taken into account. The algorithm using
`a FFT of half dimension for'the computation of a DFT on a real sequence [8] uses substantially more
`operations than the method of computing a single FFT on two real sequences simultaneously [9]. The
`
`Keywords. Fast Fourier transform, fast cosine transform, transforms of real data.
`
`1. Introduction
`
`Since the rediscovery of the fast Fourier transform (FFT) algorithm [1, 2] for the evaluation of discrete
`Fourier transforms, several improvements have been made to the basic divide and conquer scheme as for
`example the mixed radix FFT [3] and the real factor FFT [4, 5]. The introduction of the Winograd Fourier
`
`OLYMPUS EX. 1021 - 4/15
`
`

`

`
`
`3
`
`
`
`l
`
`268
`
`M. Verterli, H. J. Nilssbaumer/ Simple FFT and DCT Algorithms
`
`latter method has the disadvantage that one has to take two DFT’s at once and that the sorting of the
`output uses additional adds. The fact that the input and output sequences are real is used explicitly in a
`real convolution algorithm [10] where the DFT and inverse DFT are computed with a single complex FFT.
`Another transform that
`is mostly applied to real data is the discrete cosine transform. Since the
`introduction of the DCT [11], the search for a fast algorithm followed two main different approaches.
`One was to compute the DCT through a FFT of same dimension [12], where one is bound to take two
`transforms simultaneously. The other was a direct approach, leading to rather involved algorithms [13].
`It should be noted that the former technique outperforms all the latter ones when using optimal FFT’s,
`a fact often left in the dark [14].
`_
`Recently, evaluation of signal processing algorithms has shifted away from multiplication counts alone
`to the counting of the total number of operations, including data transfers [15]. This is due to the fact
`that the ratios (multiplication time)/(addition time) and (multiplication time)/(load time) are close to one
`on most computers and signal processors. Another growing concern has been the generation of time
`efficient software [16], and finally, the efficiency of an algorithm turns out to be a non-trivial combination
`of the various operation counts as well as of its structural complexity [17].
`In this communication, we address an old problem, namely, the efficient evaluation of DFT’s and DCT’s
`of real data. Efficiency is meant in the sense of m‘inimal number of multiplications and additions as well
`as in the sense of structural simplicity. As it turns out, the two problems are closely related, since a DFT
`of dimension N can be evaluated with two DCT’s of size N/4 and since a DCT of size N can be evaluated
`with a DFT of size N and additional operations. The same technique can be applied again to the reduced
`DCT of size N/4 and to the DFT of size N, and this until only trivial transforms are left over (N z 1, 2).
`This leads to an elegant recursive formulation ofthe two algorithms and to a number of multiplications
`identical
`to the best FFT’s while diminishing substantially the number of additions (typically 30%).
`Interestingly, this last saving is partly kept when computing complex DFT’s, and as an example, the total
`number of operations for a 1024-point transform is nearly 10% below the number of operations required
`for a lOO8—point WFTA. The prime factor FFT (PFA) requires about the same number of operations [18],
`but has a more complex structure.
`Note that the algorithms below were developed while searching for an efficient way to compute DCT’s
`of real data. The derived FFT algorithm for real data that follows immediately requires a number of
`multiplications identical to the one found in [19] (which is a variation of the Rader—Brenner algorithm),
`and a total number of operations that can be found in [20]. While obtaining an identical complexity, the
`derivations are quite different and the algorithm below seems more suitable for programming.
`Section 2 is used to derive the general algorithm and Section 3 evaluates its computational complexity.
`In Section 4, the results are compared to other algorithms and some implementation considerations are
`addressed.
`
`2. Derivation of the algorithms
`
`Let us define the following transforms ofthe length-N real vector x with elements x(0), x(l) ‘
`
`-
`
`- x(N — 1):
`
`Discrete Fourier transform
`Nil
`DFT(k,N,x):= z x(n)-e1'2"""/N, k=0,...,N—1,
`11:0
`
`(1)
`
`where j =
`Signal Processing
`
`OLYMPUS EX. 1021 - 5/15
`
`OLYMPUS EX. 1021 - 5/15
`
`

`

`n20
`
`where (8) or (9) are applied when necessary.
`Hence, the cosine DFT of dimension N has been mapped into a cosine DFT of N/2 and a DCT of
`N/4, at a cost of N/4 input and N/2 output additions (note that cos-DFT(k, N, x) is evaluated for
`k =0, .
`.
`.
`, N/2 but that k: N/4 does not require an output addition). The cosine DFT of N/2 can be
`handled similarly (and this until the transform becomes trivial) and the case of the DCT will be treated
`below.
`
`Turning to the sine DFT, we take a similar approach, using the fact that the sine function is odd.
`N/Zil
`
`_
`
`.
`
`21T(2n+l)k
`
`(5)
`
`(6)
`
`(7)
`
`(8)
`
`(9)
`
`(10)
`
`(11)
`
`(12)
`
`(13)
`
`M, Verterli, H. JfNussbaumer/ Simple FFT and DCT Algorithms
`
`Discrete cosine transform
`
`DCT(k, N, x) I: Z x(n) - cos<%)—
`
`2
`
`2 +1 k
`
`N"
`n:0
`
`Cosine DFT
`
`Sine DFT
`
`sin—DFT(k, N, x) I: X x(n) - sin
`n:0
`
`...
`
`.
`
`(4)
`
`Note that in the definition of the DCT, the normalizing factor l/VE at k=0 is omitted for simplicity.
`While the transforms are usually only defined for k = 0,. .
`.
`, N — 1, one will see that other values are useful
`as well and can easily be obtained from the ones defined above.
`Obviously, the following relations hold:
`
`DFT(k, N, x) = cos-DFT(k, N, x) —j sin-DFT(k, N, x),
`
`DCT( N, N, x) = 0,
`
`DCT(—k, N, x) = DCT(k, N, x),
`
`DCT(ZN ~ k, N, x) = * DCT(k, N, x),
`
`cos-DFT(N k, N, x) = cos—DFT(k, N, x),
`
`sin-DFT(N ~ k, N, x) = —sin-DFT(k, N, x).
`
`Looking at the evaluation of cos-DFT(k, N, x), we note that since the cosine function is even:
`
`C05-DFT(1<,N,x)= M2271x(2n)'COS<2fink>+NE71(X(2n +1) +x(N—2n — 1)) . cos<w),
`
`nv—O
`
`N/2
`
`":0
`
`4- N/4
`
`or, in a more succinct form:
`
`cos-DFTUc, N, x): c0s-DFT(k, N/2, x.) +DCT(k, N/4, x2),
`
`k = O, .
`
`.
`
`.
`
`, N — l,
`
`with x1(n)=x(2n),
`
`n=0,...,N/2—1,
`
`x2(n)=x(2n+l)+x(N—2n—l), n=0,...,N/4—l,
`
`sin—DFT(k, N, x): Z x(2n) - sin<
`
`217nk
`
`N/2>+ E0 (x(2n+1)—x(N—2n—-1))
`
`N/AH
`
`sm(———4IN/4
`
`Vol. 6, No. 4, August I984
`
`OLYMPUS EX. 1021 - 6/15
`
`

`

`2I
`
`}2
`
`/
`
`
`
`memmwmwWWowmemmwafififwflfiéfinmfiW§AkkwlfllWfkwfléwmy
`
`i
`
`I
`
`5
`
`,
`
`3
`
` sgé
`
`,Eti
`
`g
`
`270
`
`M. Vetterli, H. J. Nussbaumer/ Simple FFT and DCT Algorithms
`
`Using the following identity:
`
`Sin<217(2n]\[+1)k> :(“1)“‘COS<27T(2n +1])\([N/4—k)>,
`we can rewrite (13), using (8) or (10) when necessary, as:
`
`sin-DFT(k, N, x) = sin-DFT(k, N/2, x,) +DCT(N/4— k, N/4, x3),
`with k3(n)=(—1)"-(x(2n+l)—x(N~2n—l)).
`
`k = 0, .
`
`.
`
`.
`
`, N — l,
`
`(14)
`
`(15)
`
`Therefore, the sine DFT of dimension N has been mapped into a sine DFT of N/2 and a DCT of N/4,
`at the cost of N/4 input and (N/2)—2 output additions (note that sin-DFT(k, N, x) is evaluated for
`k = 0, .
`.
`.
`, N/2 but that k : 0, N/4 and N/2 do not require any output additions). The sine DFT of N/2
`is handled in a similar fashion until the length is reduced so that the transform becomes trivial.
`
`We now focus our attention on the computation of the DCT. Using the following mapping [12]:
`
`x4(n)=x(2n),
`x4(N—n—1)=x(2n+l), n=0,...,N/2—l,
`the DCT can be evaluated as:
`N—1
`2
`4 +1 k
`DCT(k, N, x): Z x4(n)-cos(—1—n——i), k:0,..., N—l.
`
`11:0
`
`i
`
`Using basic trigonometry, (17) becomes:
`
`21Tk
`.
`21Tk
`.
`DCT(k, N, x) 2 cos TN ~cos-DFT(k, N, x4)~sm W -s1n—DFT(k, N, x4),
`
`k = 0, .
`
`.
`
`.
`
`(16)
`
`(17)
`
`, N — l.
`(18)
`
`With the symmetries of trigonometric functions, (18) can be computed as follows:
`(2%)
`. (wk)
`cos — —s1n ~—
`4N
`4N
`(217k)
`(217k)
`
`‘ [cos—DFT(k, N, 3%)]
`~
`sin~DFT(k, N, x4)
`
`’
`
`k_0
`
`"H’
`
`N/2_1
`
`] _
`[ DCT(k, N, x)
`DCT(N— k, N, x) _
`
`_
`
`sm — cos m
`4N
`4N
`
`and
`
`DCT(N/Z, N, x) = cos(1-r/4) - cos-DFT(N/2, N, x4).
`
`(19)
`
`The matrix product in (19) can be evaluated with 3 multiplications and 3 additions (instead of 4
`multiplications and 2 additions) [9] as follows:
`
`pl—COS
`
`Sln
`
`,
`
`a . (my, (age)
`
`0s 4N ,
`
`pz—sm 4N
`
`s, : cos-DFT(k, N, x4) +sin-DFT(k, N, x4),
`
`217k
`
`m1=sl - cos m,
`Signal Processing
`
`wmmmmmmmmmmmmwmmmwmmwmwm
`
`OLYMPUS EX. 1021 - 7/15
`
`OLYMPUS EX. 1021 - 7/15
`
`

`

`M. Vetterli, H. J. Nussbaumer/ Simple FFT and DCT Algorithms
`
`~ m2 = Sin-DFT(k, N, x4) - 171,
`
`m3 : cos—DFT(k, N, x4) ' P2,
`
`DCT(k, N; x) = ml — m2,
`
`DCT(N — k, N, x) x ml +m3,
`
`I
`
`'
`
`(20)
`
`where p. and p; are precomputed. Using all simplifications, the computation of (19) requires therefore a
`total of (3N/2)72 multiplications and (3 N/2)~3 additions.
`Thus, we have shown how to map a N dimensional DFT into two DCT’s of N/4 (5, 12 and 15) and
`how to map a DCT of N into a DFT of same size (16 and 19). In other words, since the DCT is computed
`through a DFT, it is shown how to compute a DFT of N with 2 DFT’s of N/4 plus auxiliary operations}
`and this operation can be repeated until N has been reduced sufficiently so as to lead to trivial transforms.
`Figures 1
`to 4 try to visualize schematically the interaction of the various transforms. Combination of
`these figures in appropriate order show how to compute a transform of any dimension.
`
`Vol. 6, No.4, August 1984
`
`DFT(N)
`
`cos-DFT(N>
`
`sm—DFT(N)
`
`Fig.
`
`l. Decomposition ofa DFT of size N into a cosine DFT of size N and a sine DFT of size N.
`
`cos—DFT(N)
`
`cos—DFTtN/2)
`
`DCT(N/LD
`
`Fig. 2. Decomposition iof a cosine DFT of size N into a cosine DFT of size, N/2 and a DCT of size N/k4.
`
`OLYMPUS EX. 1021 - 8/15
`
`

`

`272
`
`M. Venerli, H. J. Nussbaumer/ Simple FFT and DCT Algorithms
`
`sm—DFHN)
`
`SIN—DFT(N/2)
`
`
`
`DCT(N/Ll)
`
`
`Fig. 3. Dccomposition of a sinc DFT of size N into a sine DFT of size N/2 and a DCT of size N/4.
`
`DCT(N
`
`cos—DFT(N)
`
`SIN-DFT(N)
`
`Fig. 4. Decomposition of a DCT of size N into a cosine DFT of size N and a sine DFT of size N plus auxiliary operations.
`
`The algorithm is best suited for DFT’s on real data, but if the input is complex, one simply takes the
`real and imaginary parts separately (thus doubling the computational load) and evaluates the output with
`2N—4 auxiliary adds (k = 0 and N/2 require no additions).
`
`’3. Computational complexity
`
`Even if the derivation above used only the fact that N was a multiple of 4, the algorithm, as other
`divide and conquer schemes, performs best when N is highly composite, typically a power of 2.
`Below, we restate the number of operations required for the various steps needed in the evaluation of
`the transforms, where OM[-] and OA[-] stand for the number of multiplies and adds respectively.
`Signal Processing
`
`OLYMPUS EX. 1021 - 9/15
`
`OLYMPUS EX. 1021 - 9/15
`
`

`

`M. Vetterli, H. J. Nussbaumer/ Simple FFT and DCT Algorithms
`
`DFT on length—N real data:
`
`OM[R-DFT( N)] = OM[cos-DFT(N)] + 0M [sin-DFT(N)],
`
`0A[R—DFT(N)] = OA[cos-DFT(NI)] + OA[sin—DFT(N)].
`
`DFT on length-N complex data:
`
`OM[C-DFT(N)] = 2- (OM[cos-DFT(N)]+ OM[sin-DFT(N)]),
`
`OA[C-DFT(N)] = 2 - (OA[cos-DFT(N)] + OA[sin-DFT(N)]) +2N —4.
`
`DCT on length-N real data:
`
`OM [DCT(N)] = OM[cos-DFT(N)] + OM[sin—DFT(N)] +(3 N/2)~2,
`
`OA[DCT(N)] = OA[cos-DFT(N)] + OA[sin-DFT(N)] +(3 N/2) — 3.
`
`Cosine DFT on length-N real data:
`
`OM[cos—DFT(N)] = OM[DCT(N/4)] + on; [cos‘DFT(N/ 2)],
`
`OA[cos—DFT(N)] : OA[DCT(N/4)] + OA[cos—DFT(N/2)] +(3 N/4).
`
`Sine DFT on length-N real data:
`
`OM[sin-DFT(N)] = OM[DC'I‘(N/4)] + OM[sin-DFT(N/2)],
`OA[sin-DFT(N)] = OA[DCT(N/4)]+ OA[sin-DFT(N/2)] +(3 N/4) —2.
`
`25
`
`(
`
`)
`
`OM[R-DFT(2)] = OM[C-DFT(2)] = OM[cos-DFT(2)] = OM[sin-DFT(2)] = 0,
`
`OM[DCT(2)] : 1,
`
`0A[R-DFT(2)] = OA[DCT(2)] : OA[cos-DFT(2)] = 2,
`
`OA[C-DFT(2)] : 4,
`
`OA[sin-DFT(2)] : 0.
`
`But, when N is a power of 2 and that the recursions are therefore applied Logz N times, the operation
`counts for the DCT reduces simply to:
`
`Vol. 6, No. 4, August I984
`
`, From (21)—(25) one can compute recursively the number of operations needed for the various transform
`types and sizes greater than 2. For N = 1, no operations are required, and the values for N = 2 are given
`in (26), thus defining the initial conditions for the above recursions.
`
`0M[DCT(N)]= N/2 ‘ L0g2(N),
`
`OA[DCT(N)] = N/2 - (3 Log2(N)—2) + 1.
`
`From (27) it follows immediately with (21)—(23) that:
`
`OM[R-DFT(N)]= N/2 - (Log2(N) —3) +2,
`
`OA[R-DFT(N)] = N/2 - (3 Log2(N)—5)+4,
`
`(27)
`
`(28)
`
`OLYMPUS EX. 1021 - 10/15
`
`

`

`274
`
`and
`
`M. Verterli, H. J. Nussbaumer/ Simple FFT and DCT Algorithms
`
`OM[C—DFT(N)] = N- (Log2(N)—3) +4,
`.
`OA[C-DFT(N)]= 3N- (L0g2(N)— 1) +4.
`
`The above closed form expressions can now easily be compared to existing algorithms.
`
`
`
`29
`
`(
`
`)
`
`4. Comparison with existing algorithms
`
`First, the introduced algorithm (which is called in the following fast Fourier-cosine transform or FFCT)
`is compared to other algorithms which work directly on real data. Then we look at the case where
`two transforms on real data are taken simultaneously with an optimal FFT algorithm. The case of the
`FFT on complex data is investigated as well, and the issue of algorithm structure is addressed. The
`operation counts for FFT’s are taken from [9] unless otherwise specified.
`For the computation of a DFT on real data, one can use the algorithm based on an FFT of N/2 [8].
`Using again the 3 mult/3 adds approach for the auxiliary output operations, one gets the following
`complexity:
`
`OM[FFT(N)] = OM[FFT(N/2)] +(3N/2)_2,
`
`OA[FFT(N)] = 0A[FFT(N/2)] +(9N/2)—2.
`
`(30)
`
`Table 1 compares this result with the FFCT and shows the substantial savings that are obtained.
`
` .mmmm.&mwmmawWWMWMWtMM.WMgMW)WWWa.
`
`Table 1
`
`Comparison of direct computation of a DFT on real data with a FFT of N/2 or
`the FFCT algorithm (the FFT of N/2 is computed with the Rader—Brenner algorithm)
`
`size
`FFT of N/2 +ops.
`FFCT algorithm
`
`
`N
`
`real mults
`
`real adds
`
`real mults
`
`real adds
`
`20
`2
`42
`10
`8
`60
`10
`122
`26
`16
`164
`34
`290
`66
`32
`420
`98
`710
`162
`64
`1 028
`258
`1 678
`386
`128
`2 436
`642
`3 870
`898
`256
`5636
`1538
`8766
`2050
`512
`12804
`3586
`19582
`4610
`1024
`
`
`
`
`43 262 8 19410 2422 048 28 676
`
`,
`
`Turning to the DCT, one can compare the FFCT to fast discrete cosine algorithms which perform the
`transform directly, without going to the FFT (and therefore to the need to perform 2 transforms on real
`data at the same time). The various operation counts are shown in Table 2. The number of multiplications
`for the FFCT is always below the other algorithms, and even if the number of additions is slightly above
`the Chen et a]. version, the total number of operations is always substantially less.
`Signal Processing
`
`mamxmmmmemwmxmmmmwmwmmmmmwmm“888mm
`
`OLYMPUS EX. 1021 - 11/15
`
`OLYMPUS EX. 1021 - 11/15
`
`

`

`M. Vetlerli, H. J. Nussbaumer/ Simple FFT and DCTAIgorirhms
`
`Table 2 '
`
`Comparison of direct computation of a DCT on real data with the algorithms of Chen et a]. [13], Wang er a1. [14]
`and the FFCT (the operation counts for the two first algorithms are taken from [14])
`
`
`
`
`
`
`
`size FFCT Chen et a1. . Wang et a1.
`
`
`
`
`
`
`
`
`
`r. mults r. adds r. adds r. multsN r. adds
`
`29
`12
`29
`13
`26
`16
`8
`81
`32
`83
`35
`74
`44
`16
`209
`80
`219
`9|
`194
`116
`32
`513
`192
`547
`227
`482
`292
`64
`1217
`448
`1315
`547
`1154
`708
`128
`2 817
`1 024
`3 075
`l 283
`2 690
`l 668
`256
`6 401
`2 304
`7 043
`2 947
`6146
`3 844
`512
`14 337
`5 I20
`15 875
`6 659
`13 826
`8 708
`1024
`
`
`
`
`
`
`19460 30 722 14 851 35 331 112642048 31745
`
`Vol. 6, No. 4, August 1984
`
`
`
`
`
`The other general approach to transforms on real data is to compute simultaneously the transform of
`two real sequences. Beside the drawback that one has always to compute two transforms (otherwise the
`approach is suboptimal), the sorting of the output requires about N output additions. The operation count
`for a DFT on real data are given below:
`
`OM[R-DFT(N)] = 1/2 - OM[FFT(N)],
`
`OA[R-DFT(N)]=1/2' OA[FFT(N)] + N —2.
`
`(31)
`
`Table 3 compares (31) with the FFCT, and, as can be seen, the multiplication count is identical while a
`saving of about 30% is made with regard to additions.
`The situation is quite similar when computing a DCT with an FFT [12]. It is assumed that 2 DCT’s on
`real data are evaluated at the same time. Thus, the operation count is:
`
`OM[R-DFT(N)]=1/2 - OM[FFT(N)]+(3N/2)—2,
`OA[R~DFT(N)]= 1/2- OA[FFT(N)]+(5N/2)—5.
`
`Table 3
`
`Comparison of the computation of a DFT on real data with an FFT of N on two
`data sequences simultaneously and output additions or the FFCT algorithm (the
`FFT of N is computed with the Radar—Brenner algorithm)
`
`(32)
`
`
`
`size FFCT algorithm FFT of N+adds
`
`
`
`
`
`
`
`
`
`real mults real adds real multsN real adds
`
`
`
`32
`88
`
`2
`10
`
`98
`
`20
`60
`
`
`
`OLYMPUS EX. 1021 - 12/15
`
`

`

`
`
`276
`
`M. Vellerli, H. J. Nussbaumer/ Simple FFT and DCT Algorithms
`
`A technique similar to the one used in (23) was used in the computation of the output of the DCT. The
`results are compared in Table 4 where it is seen that the additions are reduced in the FFCT by about
`25% with an identical number of multiplies.
`
`Table 4
`
`Comparison of the computation of a DCT on real data with the algorithm using
`an FFT on two data sequences simultaneously or the FFCT (the FFT is the
`Rader—Brenner version)
`
`
`
`size
`
`N
`
`FFT of N +ops.
`
`FFCT algorithm
`
`real mults
`
`real adds
`
`real mults
`
`real adds
`
`29
`12
`41
`12
`8
`81
`32
`109
`32
`16
`209
`80
`287
`80
`32
`513
`192
`707
`192
`64
`1 217
`448
`1 675
`448
`128
`2 817
`1 024
`3 867
`1 024
`256
`6401
`2304
`8763
`2304
`512
`14337
`5120
`19 579
`5120
`1024
`31 745
`11 264
`43 259
`11 264
`2 048
`
`
`Turning to the computation ofa DFT on a complex input, one can use (29) for the FFCT. The comparison
`to the Rader—Brenner FFT is given in Table 5 where the number of multiplies turns out to be identical
`while the number of additions is reduced by nearly 20%.
`Looking at last at the Winograd Fourier transform, one sees that even if the number of multiplies is
`larger in the FFCT case, the total number of operations is smaller for large transforms (for example 5%
`for the WFTA(504)/FFCT(512) and 9% for the WFTA(1008)/FFCT(1024) comparison). Note that the
`gain is less in the PFA case (5% and 2% respectively).
`Concerning the WFTA and the PFA, one recalls that its main drawback is the involved structure of the
`algorithm. Even if the structure of the FFCT is not as straightforward as the radix-2 FFT,
`it remains
`simple (it is similar to the structure of the Rader—Brenner algorithm). In the FFCT, the reduction in the
`number of operations is obtained at the cost of additional permutations and data transfers. Even if these
`
`Table 5
`
`Comparison ofthe Rader—Brenner FFT with the FFCT when applied to complex data
`
`
`
`size
`
`N
`
`FFT of N
`
`FFCT algorithm
`
`real mults
`
`real adds
`
`real mults
`
`real adds
`
`52
`4
`52
`4
`8
`148
`20
`148
`20
`16
`388
`68
`424
`68
`32
`964
`196
`l 104
`196
`64
`2308
`516
`2720
`516‘
`128
`5 380
`1 284
`6 464
`,1 284
`256
`12292
`3076
`14976
`3076
`512
`27 652
`7 172
`34 048
`7 172
`1024
`
`
`
`
`16 38876 28816 3882 048 61444
`Signal Processing
`
`OLYMPUS EX. 1021 - 13/15
`
`OLYMPUS EX. 1021 - 13/15
`
`

`

`M. Velterli, H. J. Nussbaumer/ Simple FFT and DCTAIgurillnns
`
`277
`
`permutations can be grouped with previous ones (for example at the entry of the DCT), the resulting
`structure is more complex. But this increase in topological complexity is annihilated by the substantial
`decrease in arithmetic complexity.
`Looking at the implementation, one sees the importance of using very eflicient small transforms. Together
`With the explicit coding technique [16], this should lead to fast code for real transforms, especially on
`micro and signal processors, where the number of arithmetic and data registers is small (leading to
`uneflicient implementations of the complex transform on two real sequences version).
`
`5. Concluding remarks
`
`We have introduced a simple, recursive algorithm for the computation of the discrete Fourier transform
`and the discrete cosine transform. First, this algorithm can be applied directly to real data, providing
`therefore an attractive alternative to the method using a complex transform oftwo real sequences. Secondly,
`it uses the same number of multiplies as the best structured FFT algorithms but decreases the number of
`additions by about 25 to 30% for DFT’s and DCT’s on real data and by nearly 20% for DFT’s on complex
`data when compared to currently used algorithms. At last, its structure is simple enough so that it should
`lead to efficient implementations.
`It is worth noting that the proposed algorithm, while leading to the same computational complexity
`for the FFT as the one in [20], is not isomorphic. Interestingly, the algorithm in [20] has been around for
`15 years but was seldom referenced and even less used. Meanwhile, literature was published on amelior-
`ations of the Cooley—Tukey FFT which leads to less efficient algorithms than the one in [20]... . More
`recently, actually just as the final copy was being sent to the publisher, yet another algorithm appeared
`that leads to the same complexity for the complex FFT [21].
`Investigations are under way in order to generalize the above ideas to other sinusoidal transforms and
`to higher dimensions, as well as to prove the efficiency of an implementation.
`I
`
`Vol. 6, No. 4, August 1984
`
`Acknowledgements
`
`The authors would like to thank one of the anonymous reviewers for bringing [20] and [21] to their
`attention as well as the “Fonds National Suisse de la Recherche Scientifique” for supporting this research.
`
`References
`
`[l] C. Runge, and H. Konitz, Vorlesungen iiber Numerisches Reclinen, Springer Verlag, Berlin, 1924.
`[2] J.W. Cooley, and J.W. Tukey, “An algorithm for the machine calculation of complexfourier series", Math. of Comput., Vol.
`19, pp. 297—301, April 1965.
`[3] R. Singleton, “An algorithm for computing the mixed radix fast fourier transform“, IEEE Trans. on Audio. and Electraacoustics,
`Vol. AU—l7, pp. 93—103, June 1969.
`[4] CM. Rader, and N.M. Brenner, “A new principle for fast fouricr transfoimation”, IEEE Trans. Acoust., Speech, Signal Processing,
`Vol. ASSP-24, pp. 264—265, June 1976.

`[5] G. Bruun, “z—transform DFT filters and FFT’s”, IEEE Trans. Acoust, Speech, Signal Processing, Vol. ASSP-26, pp. 56—63, Feb.
`1978.
`[6] S. Winograd, “On computing the discrete fouricr transform”, Proc.

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket