throbber
 
` 

`
`OLYMPUS EX.1021 - 1/15
`
`OLYMPUS EX. 1021 - 1/15
`
`

`

`SIGNAL PROCESSING
`
`CONTENTS
`
`Volume 6 (1984) No. 4
`
`Regular Papers
`
`M. Vetterli and H. Nussbaumer, Simple FFT and DCTalgorithms with reduced number
`of operations
`
`J. B. Martens, Convolution algorithms, based on the CRT (Chinese Remainder
`Theorem)
`
`T. Gdingen and N. C. Geckini, An algorithm for formant tracking
`
`E.
`
`|. Plotkin, L. M. Roytman and N. M.S. Swamy, Unbiased estimation of an initial phase
`
`F. F. Liedtke, Computer simulation of an automatic classification procedure for
`digitally modulated communication signals with unknown parameters
`
`Short Communications
`
`G.H. Allen, Programming an efficient radix-four FFT algorithm
`
`J. Le Roux, Non symmetric lattice structure for pole zerofilters
`
`Book Review
`
`Announcements + Call for Papers
`
`Author Index
`
`267
`
`279
`
`293
`
`301
`
`311
`
`325
`
`331
`
`337
`
`339
`
`343
`
`
`
`seaeecaspentememeeNath—etenaptnt
`
`
`
`
`
`OLYMPUS EX.1021 - 2/15
`
`OLYMPUS EX. 1021 - 2/15
`
`

`

`NAL PROCESSING
`
`S
`t
`
`a
`
`“opean Journal devoted to the methods and applications of signal processing
`Stication of the European Association for Signal Processing (EURASIP)
`M. Nagao (Kyoto, Japan)
`r-in-Chief
`O. D. Faugeras (Paris, France)
`H. Niemann (Erlangen, West Germany)
`.
`t Kunt
`A. L. Fawe (Sart Tilman, Belgium)
`F. Pellandini (Neuchatel, Switzerland)
`ratoire de traitement de signaux
`A. Fettweis (Bochum, West Germany)
`B. Picinbono (Gif-sur-Yvette, France)
`2 polytechnique fédérale de Lausanne
`J. G.-Gander (Solothurn, Switzerland)
`B.S. Ramakrishna (Bangalore, India)
`‘hemin de Bellerive
`C, Gazanhes(Marseille, France)
`F. Rocca (Milano,Italy)
`1007 Lausanne Switzerland
`D. Godard (La Gaude, France).
`B. Sankur (Istanbul, Turkey)
`21 47 26 26 Telex: 24478
`G.H. Granlund (Linkoeping, Sweden)
`L. L. Scharf (Kingston, USA)
`wial Board
`C. Gueguen (Paris, France)
`J. C. Simon (Paris, France)
`ellanger (Le Plessis Robinson, France)
`T.S. Huang (West Lafayette, USA)
`C. Y. Suen (Montreal, Canada)
`dite (Mons, Belgium)
`Z. Kulpa (Warsaw, Poland)
`Ju-Wei Tai (Peking, China)
`raccini (Genova, Italy)
`J. L. Lacoume (Grenoble, France)
`H. Tiziani (Heerbrugg, Switzerland)
`appellini (Florence,Italy)
`M. A. Lagunan-Hernandes
`D. Wolf (Frankfurt, West Germany)
`arayannis (Athens, Greece)
`(Barcelona, Spain)
`G. Winkler (Karlsruhe, West Germany)
`_ Constantinides (London,U.K.}
`D. S. Lebedev (Moscow, USSR)
`|. T. Young (Delft, The Netherlands)
`» Coulon (Lausanne, Switzerland)
`S. Levialdi (Napoli, Italy)
`P. Zamperoni (Braunschweig, West
`Durrani (Glasgow, U.K.)
`C. E. Liedtke (Hannover, West Germany)
`Germany)
`‘ndres (Darmstadt, West Germany)
`J. Max (Grenoble, France)
`scudié (Lyon, France)
`R. De Mori (Montreal, Canada)
`Secretary: |. Bezzi (Miss)
`
`orial Policy. SIGNAL PROCESSING is a European Journal
`Data Processing
`,enting the theory and practice of signal processing.Its
`Remote Sensing

`Communication Signal Processing
`lary objectives are:
`Biomedical Signal Processing
`semination of research results and of engineering develop-
`Geophysical and Astrophysical Signal Processing
`its to European signal processing groups andindividuals;
`Earth Resources Signal Processing
`sentation of practical solutions to current signal processing
`Acoustic and Vibration Signal Processing
`slems in engineering and science.
`editorial policy and the technical contentof the Journal are
`Signal Processing Technology
`SonarSignal Processing
`responsibility of the Editor-in-Chief and the Editorial Board.
`Speech Processing
`Special Signal Processing
`. Journal is self-supporting from subscription income and
`Radar Signal Processing
`New Applications
`tains a minimum amountof advertisements. Advertisements
`Industrial Applications
`subject to the prior approvalof the Editor-in-Chief.
`Membership and Subscription Information.
`Signal Processing (ISSN 0165-1684)is published intwo volumes
`(8issues) ayear. Thesubscription price for 1984 (Volumes6,7) is
`Dfl. 350.00 + Dfl. 42.00 p.p.h. = Dff. 392.00. Our p.p.h. (postage,
`packing and handling) charge includes surface delivery of all
`issues, except to subscribers inthe USA/Canada and India, who
`receive allissues byair delivery (S.A.L.- Surface Air| Lifted) atno
`extra cost. For the rest of the world, airmail and S.A.L. charges are
`available upon request. Claims for missing issues will be
`honoured free of charge within three months after the publication
`date of the issues. Mail orders and inquiries to: Elsevier Science
`Publishers B.V., Journals Department, P.O. Box 211, 1000 AE
`Amsterdam, The Netherlands.Forfull membership information
`of the Association, including a subscription at a reduced rate,
`please contact: EURASIP,P.O. Box 134, CH-1000 Lausanne13,
`Switzerland.
`
`ipe. SIGNAL PROCESSING incorporates all aspects of the
`ory and practice of signal processing (analogue anddigital).
`yatures original research work, tutorial and review articles,
`| accounts of practical developments. It is intended for a
`id dissemination of knowledge and experience to engineers
`{ scientists working in signal processing research, develop-
`nt or practical application.
`
`aject coverage. Subject areas covered by the Journalinclude:
`nal Theory
`Signal Processing Systems
`;chastic Processes
`Software Developments
`tection and Estimation
`Image Processing
`actral Analysis
`Pattern Recognition
`ering
`Optical Signal Processing
`
`
`1984, Elsevier Science Publishers B.V. (North-Holland)
`ublication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical,
`rights reserved. No part of this p'
`stocopying, recording or otherwise, without the prior permission of the publisher, Elsevier Science Publishers B.V. (North-Holland), P.O. Box 1991, 1000 BZ
`sterdam, The Netherlands.
`mission of an article for publication implies the transfer of the copyright from the author(s) to the publisher and entaiis the author's irrevocable and exclusive
`horization of the publisher to collect any sums or considerations for copying or reproduction payable by third parties (as mentionedin article 17 paragraph2 of the
`tch Copyright Act of 1912 and inthe Royal Decree of June 20, 1974 (S. 351) pursuantto article 16b of the Dutch Copyright Act of 1912) and/orto actin or out of Courtin
`inection therewith.
`s been registered with the Copyright Clearance Center, inc. Gonsentis given for copying of articles for
`acial regulations for readers in the U.S.A. - This journal ha
`ts. This consentis given onthe condition that the copier pays through the Center the per-copy fee stated in
`‘sonal orinternal use,or for the personal use of specific clien
`code onthefirst page of eacharticle for copying beyondthat permitted by Sections 107 or 108 of the U.S. Copyright Law. The appropriate fee should be forwarded with
`opy of thefirst page of the article to the Copyright Clearance Center,inc., 21 Congress Street, Salem, MA 01970, U.S.A.If no code appearsin anarticle, the author has
`t given broad consentto copy and permission to copy mustbe obtained directly from the author.All articles published prior to 1981 may be copied for a per-copy fee of
`. $2.25, also payable through the Center.(N.B. For review journals this fee is $0.25 per copy per page.) This consent does not extend to otherkinds of copying, such as
`generaldistribution, resale, advertising and promotion purposes, or for creating new collective works. Special written permission must be obtained from the publisher
`such copying.
`ecialregulations for authors in the U.S.A. - Upon acceptance of anarticle by the journal, the author(s) will be asked to transfer copyrightof the article to the publisher.
`is transfer will ensure the widest possible dissemination of information under the U.S. Copyright Law.
`iblished bimonthly
`
`Printed in The Netherlands
`
`OLYMPUS EX.1021 - 3/15
`
`OLYMPUS EX. 1021 - 3/15
`
`

`

`
`This material may be protected by Copyright law (Title 17 U.S. Code)
`
`
`
`Signal Processing 6 (1984) 267-278
`North-Holland
`
`.
`
`267
`
`SIMPLE FFT AND DCT ALGORITHMS WITH REDUCED NUMBER OF
`OPERATIONS
`
`{
`Martin VETTERLI, member EURASIP, and Henri J..NUSSBAUMER
`Laboratoire d’Informatique Technique, Ecole Polytechnique Fédérale de Lausanne, 16 Chemin de Bellerive, CH-1007 Lausanne,
`Switzerland
`
`Received 2 November 1983
`Revised 20 February 1984
`
`Abstract. A simple algorithm for the evaluation of discrete Fourier transforms (DFT): and discrete cosine transforms (DCT)
`is presented. This approach, based on the divide and conquer technique, achieves a substantial decrease in the numberof
`additions when compared to currently used FFT algorithms (30% for a DFT on real data, 15% for a DFT on complex data
`and 25% for a DCT) and keeps the same numberof multiplications as the best known FFT algorithms. The simple structure
`of the algorithm and the fact that it is best suited for real data (one does not have to take a transform of two real sequences
`simultaneously anymore) should lead to efficient implementations and to a wide range of applications.
`
`Zusammenfassung. Ein einfacher Algorithmus zur Berechnung von diskreten Fourier Transformationen (DFT) und diskreten
`Cosinus Transformationen (DCT) wird vorgeschlagen. Diese Methode, basierend auf der “Teilen und Lésen”’ Technik, erlaubt
`eine Verkleinerung der Anzahi Additionen gegeniiber gebraiichlichen FFT Algorithmen (30%fiir eine DFT von einem reellen
`Signal, 15% fir eine DFT von einem complexen Signal und 25% fiir eine DCT) und braucht gleichviel Multiplikationen wie
`die besten bekannten FFT Algorithmen. Die einfache Struktur des Algorithmus und der Fakt dass er am besten fiir reelle
`Signale geeignet
`ist (man braucht nicht mehr gleichzeitig zwei reelle Signale zu transformieren) sollten zu effizienter
`Implementierung und zu zahlreichen Applikationen fiihren.
`
`Résumé. Un algorithme simple pour l’évaluation de la transformée de Fourier discréte (DFT) et de la transformée en cosinus
`discréte (DCT) est proposé. Cette approche, basée sur la méthode de la ‘division et solution’, permet une diminution
`substantielle du nombre d’additions par rapport aux algorithmes de FFT courants (30% pour une DFT de signaux réels,
`15% pour une DFT de signaux complexes et 25% pour une DCT)tout en gardant un nombre de multiplications égal 4 celui
`des meilleurs algorithmes de FFT connus. La structure simple de l’algorithme ainsi que le fait qu’il s’applique bien aux
`signaux réels (il n’y a plus besoin de prendre la transformée de deux signaux réels simultanément) devraient conduire 4 une
`implantation efficace ainsi qu’a un large champ d’applications.
`
`Keywords, Fast Fourier transform, fast cosine transform, transformsofreal data.
`
`1. Introduction
`
`Since the rediscovery of the fast Fourier transform (FFT) algorithm [1, 2] for the evaluation of discrete
`Fourier transforms, several improvements have been madeto the basic divide and conquer schemeasfor
`example the mixed radix FFT [3] and the real factor FFT [4, 5]. The introduction of the Winograd Fourier
`transform (WFTA) [6], although a beautiful result in complexity theory, did not bring the expected
`improvements once implemented on real life computers [7], essentially due to the large total number of
`operations and to the structural complexity of the algorithm.
`The fact that most FFT’s are taken on real data is seldom fully taken into account. The algorithm using
`a FFT of half dimension forthe computation of a DFT on a real sequence [8] uses substantially more
`operations than the method of computing a single FFT on two real sequences simultaneously [9]. The
`
`0165-1684/84/$3.00 © 1984, Elsevier Science Publishers B.V. (North-Holland)
`
`OLYMPUS EX.1021 - 4/15
`
`OLYMPUS EX. 1021 - 4/15
`
`

`

`
`
`AURAioe,
`
`
`|
`
`|
`
`268
`M. Vetterli, H. J. Nussbaumer / Simple FFT and DCT Algorithms
`latter method has the disadvantage that one has to take two DFT’s at once and that the sorting of the
`output uses additional adds. The fact that the input and output sequencesare real is used explicitly in a
`real convolution algorithm [10] where the DFT and inverse DFT are computed with a single complex FFT.
`Another transform that
`is mostly applied to real data is the discrete cosine transform. Since the
`introduction of the DCT [11], the search for a fast algorithm followed two main different approaches.
`One was to compute the DCT through a FFT of same dimension [12], where one is bound to take two
`transforms simultaneously. The other was a direct approach, leading to rather involved algorithms[13].
`It should be noted that the former technique outperformsall the latter ones when using optimal FFTs,
`a fact often left in the dark [14].
`Recently, evaluation of signal processing algorithms has shifted away from multiplication counts alone
`to the counting of the total number of operations, including data transfers [15]. This is due to the fact
`that the ratios (multiplication time)/(addition time) and (multiplication time)/(load time) are close to one
`on most computers and signal processors. Another growing concern has been the generation of time
`efficient software [16], andfinally, the efficiency of an algorithm turns out to be a non-trivial combination
`of the various operation counts as well as of its structural complexity [17].
`In this communication, we address an old problem, namely, the efficient evaluation of DFT’s and DCT’s)
`of real data. Efficiency is meantin the sense of minimal numberof multiplications and additions as well
`as in the sense ofstructural simplicity. As it turns out, the two problemsare closely related, since a DFT
`of dimension N can be evaluated with two DCT’sofsize N/4 andsince a DCTof size N can be evaluated
`with a DFTof size N and additional operations. The same technique can be applied again to the reduced
`DCTofsize N/4 and to the DFT ofsize N, andthis until only trivial transformsare left over (N = 1, 2).
`This leads to an elegant recursive formulation of the two algorithms and to a numberof multiplications
`identical
`to the best FFT’s while diminishing substantially the number of additions (typically 30%).
`Interestingly, this last saving is partly kept when computing complex DFT’s, and as an example, the total
`numberof operations for a 1024-point transform is nearly 10% below the numberof operations required
`for a 1008-point WFTA. The prime factor FFT (PFA) requires about the same numberof operations [18],
`but has a more complex structure.
`Note that the algorithms below were developed while searching for an efficient way to compute DCT’s
`of real data. The derived FFT algorithm for real data that follows immediately requires a number of
`multiplications identical to the one found in [19] (which is a variation of the Rader—Brenner algorithm),
`and a total number of operations that can be found in [20]. While obtaining an identical complexity, the
`derivations are quite different and the algorithm below seems more suitable for programming.
`Section 2 is used to derive the general algorithm and Section 3 evaluates its computational complexity.
`In Section 4, the results are compared to other algorithms and some implementation considerations are
`addressed.
`
`2. Derivation of the algorithms
`
`Let us define the following transformsofthe length-N real vectorx with elements x(0), x(1)- +> x(N—1):
`
`Discrete Fourier transform
`N-I
`;
`DFT(k, Nx)i= Y x(np-eP/N, k=0,...,N—-1,
`n=0
`
`(1)
`
`where j= +V=1,
`Signal Processing
`
`OLYMPUS EX.1021 - 5/15
`
`OLYMPUS EX. 1021 - 5/15
`
`

`

`Discrete cosine transform
`
`M.Vetterli, H. J. Nussbaumer / Simple FFT and DCTAlgorithms
`
`DCT(k, N,x):= ¥ x(n): coo(BOE k=0,...,N-1.
`
`2u(2n +1)k
`
`4N
`
`No!
`
`n=0
`
`Cosine DFT
`
`cos-DFT(k, N, x)= } x(n): cos( yy ),
`
`N-tI
`n=
`
`
`2
`k
`
`k=0,...,N-—-l1.
`
`269
`
`(2)
`
`(3)
`
`Sine DFT
`
`
`2
`k
`
`N-1!
`n=0
`
`(4)
`sin-DFT(k, N, x)= ) x(n): sin( Ny ), k=0,...,N-—-l.
`Note that in the definition of the DCT, the normalizing factor 1/V2 at k=0 is omitted for simplicity.
`While the transforms are usually only defined for k =0,..., N—1, one will see that other values are useful
`as well and can easily be obtained from the ones defined above.
`Obviously, the following relations hold:
`
`DFT(k, N, x) =cos-DFT(k, N, x) —jsin-DFT(k, N, x),
`
`DCT(N, N, x)=0,
`
`DCT(-k, N, x) =DCT(k, N, x),
`
`DCT(2N — k, N, x) =~ DCT(k, N, x),
`
`(5)
`
`(6)
`
`(7)
`
`(8)
`
`
`
`cos-DFT(N~-k, N, x)= cos-DFT(k, N, x), (9)
`
`sin-DFT(N — k, N, x) =—sin-DFT(k, N, x).
`
`Looking at the evaluation of cos-DFT(k, N, x), we note that since the cosine function is even:
`
`
`cos-DFT(k, N, x)= “SI x(2n)> cos(2) Oy (xQn4+1)+x(N—-2n—1))- co(zen),
`
`Oo
`
`n=0
`
`.
`
`or, in a more succinct form:
`
`cos-DFT(k, N, x)= cos-DFT(k, N/2, x,)+DCT(k, N/4,x,.), k=0,...,N-1,
`
`with x,(n)=x(2n),
`
`n=
`
`.,N/2-1,
`
`x,(n)=x(2n+1)+x(N-2n-1), n=0,...,N/4-1,
`
`(10)
`
`(11)
`
`12
`
`0%)
`
`where (8) or (9) are applied when necessary.
`Hence, the cosine DFT of dimension N has been mapped into a cosine DFT of N/2 and a DCT of
`N/4, at a cost of N/4 input and N/2 output additions (note that cos-DFT(k, N, x) is evaluated for
`kK=0,..., N/2 but that k= N/4 does not require an output addition). The cosine DFT of N/2 can be
`
`handledsimilarly (and this until the transform becomestrivial) and the case of the DCT will be treated
`
`below.
`Turning to the sine DFT, we take’a similar approach, using the fact that the sine function is odd.
`N/2-1
`
`sin-DFT(k, N,x)= ¥ x(2n)- sin(
`
`n=0
`
`Nia)
`
`+“S@Qn+1)-x(N —2n—1))+sin(a) (13)
`
`Vol. 6, No. 4, August 1984
`
`OLYMPUS EX.1021 - 6/15
`
`OLYMPUS EX. 1021 - 6/15
`
`

`

`OSS
`
`222 5
`
`
`
`snieeeAEUSEESSgSCT
`
`|
`
`270
`
`M. Vetterli, H. J. Nussbaumer / Simple FFT and DCT Algorithms
`
`Using the following identity:
`
`sin(228 a) =(-1)" cos(20" + Ne 2),
`we can rewrite (13), using (8) or (10) when necéssary, as:
`sin-DFT(k, N, x) =sin-DFT(k, N/2, x,)+DCT(N/4—k, N/4,%3), k=0,...,N-1,
`(15)
`xs(n) =(<L)"* (x(n +1) —x(N ~2n -1)),
`with
`Therefore, the sine DFT of dimension N has been mapped into a sine DFT of N/2 and a DCTof N/4,
`at the cost of N/4 input and (N/2)—2 output additions (note that sin-DFT(k, N, x) is evaluated for
`k=0,..., N/2 but that k=0, N/4 and N/2 do not require any output additions). The sine DFT of N/2
`is handled in a similar fashion until the length is reduced so that the transform becomestrivial.
`We now focus our attention on the computation of the DCT. Using the following mapping [12]:
`x4(n) = x(2n),
`
`(14)
`
`x4a(N-—n-1)=x(2n+1), n=0,...,N/2~-1,
`the DCT can be evaluated as:
`DCT(k, N.x)= ¥ x(n) eosOE), k=0,...,N-—l1.
`Nxt
`
`n=0
`
`4N
`
`.
`
`2n(4n+I)k
`
`Using basic trigonometry, (17) becomes:
`
`(16)
`
`|
`(7)
`
`i
`
`2ak
`_
`f2rk
`.
`DCT(k, N, x) = cos aN) cos-DFT(k, N, x4) ~sin 4N)° sin-DFT(k, N, x4), k=0,...,N-1.
`|
`(18)
`|
`
`With the symmetries of trigonometric functions, (18) can be computedasfollows:
`(=)
`; (=)
`4N
`4N/
`DCT(k, N, x)
`_
`| ee N, 9]
`DCT(IN-k,N,x)} |. (=)
`(=) -L
`sin-DFT(k, N, xa}
`
`cos{ ——
`
`—sin{ ——
`
`os| ——~
`in{ ———
`iN “\4N
`
`k=0
`
`uo
`
`N/2-1
`
`i
`
`i
`
`|
`|
`
`and
`
`DCT(N/2, N, x) =cos(1/4) - cos-DFT(N/2, N, x4).
`
`(19)
`
`The matrix product in (19) can be evaluated with 3 multiplications and 3 additions (instead of 4
`multiplications and 2 additions) [9] as follows:
`
` |:.i|g /2
`
`g
`
`s, = cos-DFT(k, N, x4) +sin-DFT(k, N, x4),
`
`2Qak
`m= s,- cos\ 7],
`Signal Processing
`
`OLYMPUS EX.1021 - 7/15
`
`Pp, = Cos 4N
`sin 4N >
`-sin(224) cos(224
`
`P2=sin\ 7,
`
`o8\ TN
`
`
`
`RAOnenreeeenOUME
`
`OLYMPUS EX. 1021 - 7/15
`
`

`

`M. Vetterli, H. J. Nussbaumer / Simple FFT and DCTAlgorithms
`
`271
`
`- my =sin-DFT(k, N, x4) ° Pi,
`
`m,=cos-DFT(k, N, x4) ° pa,
`DCT(k, N, x)= m,— m5,
`DCT(N-k,Nx)=m+m,
`
`:
`
`(20)
`
`where p, and p, are precomputed. Usingall simplifications, the computation of (19) requires therefore a
`total of (8N/2)—2 multiplications and (3 N/2)—3 additions.
`Thus, we have shown how to map a N dimensional DFT into two DCT’s of N/4 (5, 12 and 15) and
`how to map a DCTof N into a DFT of samesize (16 and 19). In other words, since the DCT is computed
`through a DFT,it is shown how to compute a DFT of N with 2 DFT’s of N/4 plus auxiliary operations,
`and this operation can be repeated until N has been reduced sufficiently so asto leadto trivial transforms.
`Figures |
`to 4 try to visualize schematically the interaction of the various transforms. Combination of
`these figures in appropriate order show how to compute a transform of any dimension.
`
`DFT CN)
`
`cos-D FT (N)
`
`Fig.
`
`|. Decomposition of a DFT of size N into a cosine DFT of size N and a sine DFTofsize N.
`
`sin-D FT ON)
`DCT CNA4)
`
`cosD FT CN)
`
`cosDFT(N2)
`
`Fig. 2. Decomposition of a cosine DFT of size N into a cosine DFT of size N/2 and a DCTofsize N/4.
`Val. 6, No. 4, August 1984
`
`OLYMPUS EX.1021 - 8/15
`
`OLYMPUS EX. 1021 - 8/15
`
`

`

`272
`
`M.Vetterli, H. J. Nussbaumer / Simple FFT and DCT Algorithms
`
`swDFT CN)
`
`sin-D FT (N/2)
`
`
`
`DCT CNA)
`
`
`Fig. 3. Decomposition of a sine DFT of size N into a sine DFTof size N/2 and a DCT of size N/4.
`
`DCT(N)
`
`cos-D FT CN)
`
`sinDFT ON)
`
`Fig. 4. Decomposition of a DCT of size N into a cosine DFT of size N and a sine DFTof size N plus auxiliary operations.
`
`The algorithm is best suited for DFT’s on real data, but if the input is complex, one simply takes the
`real and imaginary parts separately (thus doubling the computational load) and evaluates the output with
`2N —4 auxiliary adds (k=0 and N/2 require no additions).
`
`‘3. Computational complexity
`
`Even if the derivation above used only the fact that N was a multiple of 4, the algorithm, as other
`divide and conquer schemes, performs best when N is highly composite, typically a power of2.
`Below, we restate the number of operations required for the various steps needed in the evaluation of
`the transforms, where Oy[-] and O,[-] stand for the number of multiplies and adds respectively.
`Signal Processing
`
`OLYMPUS EX.1021 - 9/15
`
`OLYMPUS EX. 1021 - 9/15
`
`

`

`M.Vetterli, H. J, Nussbaumer / Simple FFT and DCT Algorithms
`
`DFT on length-N real data:
`
`Ou[R-DFT(N)]= Oy[cos-DFT(N)] + Oy, [sin-DFT(N)],
`
`O,4[R-DFT(N)]= O,{cos-DFT(N)] + Og[sin-DFT(N)].
`DFT on length-N complex data:
`|
`
`Ou{C-DFT(N)]=2 + (Oy [cos-DFT(N)]+ Ou[sin-DFT(N)]),
`
`O,[C-DFT(N)]= 2 - (O4[cos-DFT(N)] + Oa[sin-DET(N)]) +2.N —4.
`
`DCT on length-N real data:
`
`Om[DCT(N)] = Ox [cos-DFT(N)] + Ox [sin-DFT(N)]+3.N/2)- 2,
`
`O,[DCT(N)]= O4[cos-DFT(N)] + O,[sin-DFT(N)]+(3.N/2)—3.
`
`Cosine DFT on length-N real data:
`
`Oy[cos-DFT(N)] = Oy [DCT(N/4)] + Oy [cos-DFT(N/2)],
`O,[cos-DFT(N)] = O,[DCT(N/4)]+ O,[cos-DFT(N/2)]+(3.N/4).
`
`Sine DFT on length-N real data:
`
`273
`
`@)
`
`(22)
`
`(23)
`
`24
`
`°)
`
`On [sin-DET(N)] = Oy[DCT(N/4)] + Oy[sin-DFT(N/2)],
`
`>)
`O,[sin-DFT(N)] = O,[DCT(N/4)] + O,[sin-DFT(N/2)]+(N/4)—2.
`_ From (21)-(25) one can compute recursively the number of operations needed for the various transform
`types and sizes greater than 2. For N =1, no operations are required, and the values for N =2 are given
`in (26), thus defining the initial conditions for the above recursions.
`
`25
`
`Oy [R-DFT(2)] = Ou[C-DFT(2)] = Oy [cos-DFT(2)] = Ox, [sin-DFT(2)] = 0,
`
`Oy[DCTQ)]= 1,
`
`O,[R-DFT(2)] = O4[DCT(2)] = O4[cos-DFT(2)] = 2,
`
`(26)
`
`O,[C-DFT(2)] = 4,
`
`O,[sin-DFT(2)]=0.
`
`But, when N is a powerof 2 and that the recursions are therefore applied Log, N times, the operation
`counts for the DCT reduces simply to:
`
`Om[DCT(N)]= N/2- Log, (N),
`
`O,[DCT(N)]= N/2- (3 Logo(N)—2)+1.
`
`From (27) it follows immediately with (21)-(23) that:
`Om[R-DFT(N)]= N/2- (Log.(N)-3) +2,
`O,[R-DFT(N)]= N/2: (3 Log,(N)—5) +4,
`
`(27)
`
`(28)
`
`Vol. 6, No. 4, August 1984
`
`OLYMPUS EX.1021 - 10/15
`
`OLYMPUS EX. 1021 - 10/15
`
`

`

`274
`
`and
`
`M. Vetterli, H. J. Nussbaumer / Simple FFT and DCT Algorithms
`
`Om[C-DFT(N)]= N- (Log.(N)— 3) +4,
`O,[C-DFT(N)]=3N> (Logo(N)— 1) +4.
`The above closed form expressions can noweasily be comparedto existing algorithms.
`
`29
`
`@)
`
`4. Comparison with existing algorithms
`
`
`
`First, the introduced algorithm (whichis called in the following fast Fourier-cosine transform or FFCT)
`is compared to other algorithms which work directly on real data. Then we look at the case where
`two transforms on real data are taken simultaneously with an optimal FFT algorithm. The case of the
`FFT on complex data is investigated as well, and the issue of algorithm structure is addressed. The
`operation counts for FFT’s are taken from [9] unless otherwise specified,
`For the computation of a DFT on real data, one can use the algorithm based on an FFT of N/2 [8].
`Using again the 3 mult/3 adds approach for the auxiliary output operations, one gets the following
`complexity:
`
`Om[FFT(N)] = Ou[FFT(N/2)]+(3.N/2)—2,
`
`O,[FFT(N)] = O4[FFT(N/2)]+(9N/2)—2.
`
`00
`
`Table | compares this result with the FFCT and showsthe substantial savings that are obtained.
`
`aeeseanataVSait
`
`‘SSNSacerseeHeeler
`
`Table |
`
`Comparison of direct computation of a DFT on real data with a FFT of N/2 or
`the FFCTalgorithm (the FFT of N/2 is computed with the Rader—-Brenner algorithm)
`
`
`size
`FFT of N/2 +ops.
`FFCTalgorithm
`
`
`N
`real mults
`real adds
`real mults
`real adds
`
`
`20
`2
`42
`10
`8
`60
`10
`122
`26
`16
`164
`34
`290
`66
`32
`420
`98
`710
`162
`64
`1028
`258
`1 678
`386
`128
`2 436
`642
`3.870
`898
`256
`5 636
`1 538
`8 766
`2050
`512
`12 804
`3 586
`19 582
`4610
`1024
`
`
`
`
`10 242 43 262 81942048 28 676
`
`-
`
`i
`/
`|
`/
`|
`:
`:
`|
`
`Turning to the DCT, one can compare the FFCTto fast discrete cosine algorithms which perform the
`transform directly, without going to the FFT (and therefore to the need to perform 2 transforms on real
`data at the same time). The various operation counts are shown in Table 2. The numberof multiplications
`for the FFCTis always below the other algorithms, and even if the numberofadditionsis slightly above
`the Chen ef al. version, the total numberof operationsis always substantially less.
`Signal Processing
`
`OLYMPUS EX.1021 - 11/15
`
`OLYMPUS EX. 1021 - 11/15
`
`

`

`M. Vetterli, H. J. Nussbaumer / Simple FFT and DCT Algorithms
`
`275
`
`Table 2 —
`
`Comparison of direct computation of a DCT on real data with the algorithms of Chen et al. [13], Wang et al. [14]
`and the FFCT (the operation counts for the two first algorithms are taken from [14])
`
`
`size
`Cheneral.
`.
`Wangetal.
`FFCT
`
`
`N
`r. mults
`r. adds
`r. mults
`r. adds
`r. mults
`r adds
`
`8
`16
`26
`13
`29
`12
`29
`16
`44
`74
`35
`83
`32
`81
`32
`116
`194
`91
`219
`80
`209
`64
`292
`482
`227
`547
`192
`$13
`128
`708
`1154
`547
`1315
`448
`1217
`256
`1 668
`2 690
`1 283
`3 075
`1 024
`2817
`$12
`3 844
`6 146
`2947
`7 043
`2 304
`6 401
`1024
`8 708
`13 826
`6 659
`15 875
`5 120
`14 337
`2 048
`19 460
`30 722
`14851
`35 331
`11 264
`31745
`
`
`The other general approach to transforms on real data is to compute simultaneously the transform of
`two real sequences. Beside the drawback that one has always to compute two transforms (otherwise the
`approachis suboptimal), the sorting of the output requires about N output additions. The operation count
`for a DFT on real data are given below:
`
`Oy[R-DFT(N)]= 1/2 + Oy[FFT(N)],
`
`O,[R-DFT(N)]= 1/2 - O,[FFT(N)]+ N —2.
`
`(31)
`
`Table 3 compares (31) with the FFCT, and, as can be seen, the multiplication count is identical while a
`saving of about 30% is made with regard to additions.
`The situation is quite similar when computing a DCT with an FFT [12]. It is assumed that 2 DCT’s on
`real data are evaluated at the same time. Thus, the operation countis:
`
`Om[R-DFT(N)] = 1/2 + Ou [FFTCN)]+GN/2)-2,
`
`OL R-DFT(N)]= 1/2 + O,[ FFT(N)]+(5N/2)—5.
`
`Table 3
`
`Comparison of the computation of a DFT on real data with an FFT of N on two
`data sequences simultaneously and output additions or the FFCT algorithm (the
`FFT of N is computed with the Rader—Brenner algorithm)
`
`(32)
`
`
`
`size FFCTalgorithm FFT of N+adds
`
`
`
`
`
` N real mults real adds real mults real adds
`
`
`
`
`
`
`
`20
`2
`32
`2
`8
`60
`10
`88
`10
`16
`164
`34
`242
`34
`32
`420
`98
`614
`98
`64
`1 028
`258
`1 486
`258
`128
`2 436
`642
`3 486
`642
`256
`5 636
`1 538
`7998
`1 538
`512
`12 804
`3 586
`18 046
`3 586
`1024
`
`
`
`
`8 194 40.190 8 1942 048 28 676
`
`Vol. 6, No. 4, August 1984
`
`OLYMPUS EX.1021 - 12/15
`
`OLYMPUS EX. 1021 - 12/15
`
`

`

`276
`M. Vetterli, H. J. Nussbaumer/ Simple FFT and DCTAlgorithms
`A technique similar to the one used in (23) was used in the computation of the output of the DCT. The
`results are compared in Table 4 whereit is seen that the additions are reduced in the FECT by about
`25% with an identical number of multiplies.
`
`Table 4
`
`Comparison of the computation of a DCT onreal data with the algorithm using
`an FFT on two data sequences simultaneously or the FFCT (the FFT is the
`Rader~Brenner version)
`
`size
`FFT of N +ops.
`FFCTalgorithm
`
`N
`real adds
`real adds
`
`real mults
`
`
`‘aRaeceteeeaeertelr
`
`Turning to the computation of a DFT on a complex input, one can use (29) for the FFCT. The comparison
`to the Rader—Brenner FFTis given in Table 5 where the numberof multiplies turns out to be identical
`while the numberof additions is reduced by nearly 20%.
`Lookingat last at the Winograd Fourier transform, one sees that even if the number of multiplies is
`larger in the FFCT case, the total number of operations is smaller for large transforms (for example 5%
`for the WFTA(504)/FFCT(512) and 9% for the WFTA(1008)/FFCT(1024) comparison). Note that the
`gain is less in the PFA case (5% and 2% respectively).
`Concerning the WFTAand the PFA,onerecalls that its main drawbackis the involved structure of the
`algorithm. Even if the structure of the FFCT is not as straightforward as the radix-2 FFT,
`it remains
`simple (it is similar to the structure of the Rader-Brenner algorithm). In the FFCT, the reduction in the
`numberof operations is obtained at the cost of additional permutations and data transfers. Even if these
`
`:
`
`|
`;
`|
`|
`|
`|
`|
`|
`
`||
`
`|
`
`OLYMPUS EX.1021 - 13/15
`
`real mults
`
`29
`12
`Al
`12
`8
`81
`32
`109
`32
`16
`209
`80
`287
`80
`32
`513
`192
`707
`192
`64
`1217
`448
`1675
`448
`128
`2817
`1 024
`3 867
`1.024
`256
`6 401
`2304
`8 763
`2304
`512
`14337
`$120
`19 579
`5 120
`1024
`2048
`11 264
`43 259
`11 264
`31 745
`
`
`Tabie 5
`
`Comparison ofthe Rader—Brenner FFT with the FFCT whenapplied to complex data
`
`
`size
`FFT of N
`FFCTalgorithm
`
`
`N
`real mults
`real adds
`real mults
`real adds
`
`
`52
`4
`52
`4
`8
`148
`20
`148
`20
`16
`388
`68
`424
`68
`32
`964
`196
`1 104
`196
`64
`2 308
`516
`2720
`516°
`128
`5 380
`1 284
`6 464
`1 284
`256
`12292
`3 076
`14976
`3 076
`512
`27 652
`7172
`34 048
`7172
`1 024
`61 444
`16 388
`76 288
`16 388
`2 048
`
`Signal Processing
`
`OLYMPUS EX. 1021 - 13/15
`
`

`

`
`
`M. Vetterli, H. J. Nussbaumer / Simple FFT and DCT Algorithms
`
`277
`
`permutations can be grouped with previous ones (for example at the entry of the DCT), the resulting
`structure is more complex. But this increase in topological complexity is annihilated by the substantial
`decrease in arithmetic complexity.
`Looking at the implementation, one sees the importanceofusingveryefficient small transforms. Together
`with the explicit coding technique [16], this should lead to fast code for real transforms, especially on
`micro and signal processors, where the number of arithmetic and data registers is small (leading to
`unefficient implementations of the complex transform on two real sequencesversion).
`
`5. Concluding remarks
`
`We have introduced a simple, recursive algorithm for the computation of the discrete Fourier transform
`and the discrete cosine transform. First, this algorithm can be applied directly to real data, providing
`therefore an attractive alternative to the method using a complex transform oftwo real sequences. Secondly,
`it uses the same numberof multiplies as the best structured FFT algorithms but decreases the numberof
`additions by about 25 to 30% for DFT’s and DCT’sonreal data and by nearly 20% for DFT’s on complex
`data when comparedto currently used algorithms. At last, its structure is simple enough so that it should
`lead to efficient implementations.
`It is worth noting that the proposed algorithm, while leading to the same computational complexity
`for the FFT as the one in [20], is not isomorphic. Interestingly, the algorithm in [20] has been aroundfor
`15 years but was seldom referenced and even less used. Meanwhile, literature was published on amelior-
`ations of the Cooley-Tukey FFT which leadsto less efficient algorithms than the one in [20].... More
`recently, actually just as the final copy was being sent to the publisher, yet another algorithm appeared
`that leads to the same complexity for the complex FFT [21].
`Investigations are under way in order to generalize the above ideas to other sinusoidal transforms and
`to higher dimensions, as well as to prove the efficiency of an implementation.
`,
`
`Acknowledgements
`
`The authors would like to thank one of the anonymous reviewers for bringing [20] and [21] to their
`attention as well as the “Fonds National Suisse de la Recherche Scientifique” for supportingthis research.
`
`References
`
`[1] C. Runge, and H. Konitz, Vorlesungen iiber Numerisches Rechnen, Springer Verlag, Berlin, 1924.
`[2] J.W. Cooley, and J.W. Tukey, “An algorithm for the machine calculation of complex fourier series”, Math. of Comput., Vol.
`19, pp. 297-301, April 1965.
`[3] R. Singleton, “An algorithm for computing the mixed radix fast fourier transform”, IEEE Trans. on Audio. and Electroacoustics,
`Vol. AU-17, pp. 93-103, June 1969.
`[4] C.M. Rader, and N.M. Brenner, “A new principle for fast fourier transformation”, [EEE T

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket