throbber
HUAWEI EX. 1021 -“I/“I5
`
`HUAWEI EX. 1021 -1/15
`
`

`
`SIGNAL PROCESSING
`
`CONTENTS
`
`Volume 6 (1984) No. 4
`
`Regular Papers
`
`M. Vetterli and H. Nussbaumer, Simple FFT and DCT algorithms with reduced number
`of operations
`
`J. B. Martens, Convolution algorithms, based on the CRT (Chinese Remainder
`Theorem)
`
`T. Gilingen and N. C. Geckini, An algorithm f.or formant tracking
`
`267
`
`279
`
`293
`
`E. I. Plotkin, L. M. Roytman and N. M. S. Swamy, Unbiased estimation of an initial phase 301
`
`F. F. Liedtke, Computer simulation of an automatic classification procedure for
`digitally modulated communication signals with unknown parameters
`
`Short Communications
`
`(3. H. Allen, Programming an efficient radix-four FFT algorithm
`
`J. Le Roux, Non symmetric lattice structure for pole zero filters
`
`Book Review
`
`Announcements + Call for Papers
`
`Author Index
`
`311
`
`325
`
`331
`
`387
`
`339
`
`343
`
`
`
`.:.t=.=........,._........7.».,._.c.M_,...~.m_....,..w...,.-.sA
`
`I-ISUAWEI EX. 1021 -2/15
`
`HUAWEI EX. 1021 -2/15
`
`

`
`SIGNAL PROCESSING
`
`i
`
`I
`
`A European Journal devoted to the methods a
`A publication of the European Association for
`
`nd applications of signal processing
`Signal Processing (EURASIP)
`
`Editor-in-Chiei
`Murat Kunt
`Laboratoire de traitoment de signaux
`Ecole polytechnique ledcrale de Lausanne
`I6, Chemin de Bellerive
`CH -1007 Lausanne Switzerland
`Tel: 021 47 26 26 Telex: 24478
`Editorial Board
`M. Bellanger (Le Plessis Robinson, France)
`Ft. Boitc (Mons, Belgium)
`C. Braccini (Genova, Italy)
`V. Cappellini (Florence, Italy)
`G. Carayannis (Athens, Greece)
`A. G. Constantinides (London, U.K.)
`F. de Coulon (Lausanne, Switzerland)
`T. S, Durrani (Glasgow. UK.)
`W. Endres (Darmstadt, West Germany)
`B. Escudié (Lyon, France)
`
`O. D. Faugeras (Paris, France)
`A. L. Fawe (Sart Tilman, Belgium)
`A. Fettweis (Bochum, West Germany)
`J. G. Gander (Soiothurn, Switzerland)
`0, Gazanhcs (MarseilIe, France)
`D. Godard {La Gaude, France)
`G. H. Granlund (Linkoepirrg, Sweden)
`C. Gueguen (Paris, France)
`T. S, Huang (West Lafayette, USA)
`Z. Kulpa (Warsaw, Poland)
`J. L. Lacoume (Grenoble, France)
`M. A. Lagunan—Hernan(ies
`(Barcelona, Spain)
`D. S. Lebedev (Moscow, USSR)
`S Leviaidi (Napoli, Italy)
`C. E. Liedtke (Hannover, West Germany)
`J. Max (Grenoble, France)
`R. De Mori (Montreal, Canada)
`
`M Nagao (Kyoto, Japan)
`H. Niemann (Erlangcn, West Germany)
`F. Peltandini (Neuchatel, Switzerland)
`8. Picinbono (Gif—sur~Yvette, France)
`8.8. Ramakrishna (Bangalore, India)
`F. Rocca (Milano, Italy)
`B, Sankur (Istanbul, Turkey)
`L. L, Schari (Kingston, USA)
`J. C. Simon (Paris, France)
`C. Y. Suen (Montreal, Canada)
`Ju—Wei Tai (Peking, China)
`H. Tiziani (Heerbrugg, Switzerland)
`D. Woll (Frankfurt, West Germany)
`G. Winkler (Karlsruhe, West Germany’)
`I. T. Young (Delit, The Netherlands)
`P. Zamperoni (BraunsChweig, West
`Germany)
`Secretary: I. Bezzi (Miss)
`
`Editorial Policy. SIGNAL PROCESSlNG is a European Journal
`presenting the theory and practice of signal processing.
`its
`primary objectives are:
`Dissemination of research results and oi engineering develop-
`ments to European signal processing groups and individuals;
`Presentation oi practical solutions to current signal processing
`problems in engineering and science.
`The editorial policy and the technical content of the Journal are
`the responsibility of the Editor—in—Chief and the Editorial Board,
`The Journal lS selfvsupporting from subscription income and
`contains a minimum amount oi advertisements. Advertisements
`are subject to the prior approval of the Editor—in—Chiei
`
`Scope. SIGNAL PROCESSING incorporates all aspects of the
`theory and practice of signal processing (analogue and digital).
`It teatures original research work, tutorial and review articles,
`and accounts of practical developments.
`It
`is intended for a
`rapid dissemination of knowledge and experience to engineers
`and scientists working in signal processing research, develop-
`ment or practical application.
`
`Subject coverage. Subject areas covered by the Journal include:
`Signal Theory
`Signal Processing Systems
`Stochastic Processes
`Software Developments
`Detection and Estimation
`image Processing
`Spectral Analysis
`Pattern Recognition
`Filtering
`Optical Signal Processing
`
`Data Processing
`Remote Sensing
`Communication Signal Processing
`Biomedical Signal Processing
`Geophysical and Astrophysical Signal Processing
`Earth Resources Signal Processing
`Acoustic and Vibration Signal Processing
`Signal Processing Technology
`Speech Processing
`Radar Signal Processing
`industrial Applications
`
`Sonar Signal Processing
`Special Signal Processing
`New Applications
`
`Membership and Subscription Intormation.
`SignalProcessing (ISSN 01654684) is publishedin two volumes
`(8issues)ayear.Thesubscription pricefor1984 (Vo|umes6,7)is
`Dtl. 350.00 + Dfl, 42,00 p.p.h : Dtl. 392.00. Our p,p.h. (postage,
`packing and handling) charge includes surface delivery of all
`issues, exceptto subscribers in the USA/Canada and India, who
`receive allissues by airdelivery(S,A L.~Sur1‘ace AirlLifted)atno
`extracost Fortherestoitheworld,airrriailandS_A.L chargesare
`available upon request. Claims for missing issues will he
`honouredfreeofchargewithinthrcc monthsafterthe publication
`date of the issues. Mail orders and inquiries to: Elsevier Science
`Publishers B.V., Journals Department, PO. Box 211,1000 AE
`Amsterdam, The Netherlands. For trill membership information
`of the Association. including a subscription at a reduced rate,
`please contact: EURASIP, PO. [Box 134, CH—1000 Lausanne13,
`Switzerland.
`
`©1984, Elsevier Science Publishers B.V. (North—Holland)
`All rights reserved. No part oi this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means. electronic, mechanical,
`photocopying, recording or otherwise, without the prior permission of the publisher, Elsevier Science Publishers B.V, INorthvHoI|and), PO Box 1991, 1000 DZ
`Amsterdam. The Netherlands.
`Submission of an article tor publication implies the transier oi the copyright from the authnr(s) to the publisher and entails the author's irrevocable and exclusive
`authorization of the publisher to collect any sums or considerations tor copying or reproduction payable by third parties (as mentioned in article 17 paragraph 2 oi the
`Dutch Copyright Act oi 1912 and in the Royal Decree of June 20, 1974 (S 351) pursuant to article 16b oi the Dutch Copyright Act of 1912) and/orto act in or out of Court in
`connection therewith,
`Special regulations for readers in the U.S.A.
`this journal has been registered with the Copyright Clearance Center. Inc. Consent is given for copying oi articles for
`ri on the condition that the copier paysthrouqh the Centerthe per-copy tee stated in
`personal orirrternal rise, or for the personal use of specific clients This consent is give
`107 or 108oi the U 8. Copyright Law. the approprlateteeshould be (orwarrted with
`the code on the first page oi each article for copying beyond that permrtted by Sections
`gress Street, Salem, MA 01970, U.S./\. if no code appears in an article, the author has
`a copy of the first page of the article to the Copyright Clearance Center. Inc , 21 Con
`not given broad consent to copy and permission to copy must be obtained directly trom the author All articles published prior to I981 may be copied for a per—copy tee oi
`US $2 25, also payable through the Center [N Ft Fnr review journals this tee is $0.25 per copy per page.) This consent does not extenrl to other kinds oi copying, such as
`for general distribution, resale, advertising and promotion purposes, or torcreating new collective works. Special written permission must be obtained from the publisher
`for such copying.
`Specialregu/atiorrs [or authors in the L/.S.A. — Upon acceptance 01 an article by the journal, the author(s) will be asked to transfer copyright ot the article to the publisher
`This transter will ensure the widest possible dissemination oi intorniation under the U.S. Copyright Law
`
`Published bimonthly
`
`Printed in The Netherlands
`
`HUAWEI EX. 1021 -3/15
`
`
`
`HUAWEI EX. 1021 -3/15
`
`

`
`This material may be protected by Copyright law (Title 17 US. Code)
`
`Signal Processing 6 (1984) 267—278
`North—l-lolland
`
`V
`
`267
`
`SIMPLE FFT AND DCT ALGORITHMS WITH REDUCED NUMBER OF
`OPERATIONS
`
`/
`Martin VETTERLI, mcmber EURASIP, and Henri J.lNU§SBAUMER
`Luboraluire d’In/"urmariqzte Technique, Ecole Polyleclmique Fédérale de Lausanne, 16 Chemin de Bellerive, CH-1007 Lausanne,
`Switzerland
`
`Received 2 November 1983
`Revised 20 February 1984
`
`Abstract. A simple algorithm for the evaluation ofdiscrete Fourier transforms (DFT) and discrete cosine transforms (l,)('/1')
`is presented. This approach, based on the divide and conquer technique, achieves a substantial decrease in the number of
`additions when compared to currently used FFT algorithms (30% for a DFT on real data, 15% for a DFT on complex data
`and 25% fora DCT) and keeps the same number of multiplications as the best known FFT algorithms. The simple structure
`of the algorithm and the fact that it is best suited for real data (one does not have to take a transform of two real sequences
`simultaneously anymore) should lead to efficient implementations and to a wide range of applications.
`
`Zusammenfassung. Ein einfacher Algorithmus zur Berechnung von diskreten Fourier Transformationen (DFT) und diskreten
`Cosinus Transformationen (DCT) wird vorgeschlagen. Diese Methode, basierend aufdcr “Tcilen und Losen” Technik, crlaubt
`eine Verkleinerung der Anzahl Additionen gegeniiber gebraiichlichen FFT Algorithmen (30% fiir eine DFT von einem reellen
`Signal, 15% fiir eine DFT von einem complcxcn Signal und 25% fur cinc DCT) und braucht gleichvicl Multiplikationcn wic
`die bcsten bckanntcn FFT Algorithmcn. Die einfache Struktur des Algorithmus und der Fakt dass er am besten fiir reelle
`Signale geeignct
`ist (man braucht nicht mehr gleichzeitig zwei reelle Signale 7u trarisformieren) sollten 7u eflizienter
`lmplementierung und zu zahlreichen Applikationen fiihren.
`
`Résumé. Uri algoritlune simple pour l’évaluation de la transfortnée de Fourier discrete (DFT) et de la transformée en cosinus
`discrete (DCT) est propose. Cette approche, basée sur la méthode de la “division et solution”, permet une diminution
`substantielle du nombre d’additions par rapport aux algorithmes de FFT courants (30% pour une DFT de signaux réels,
`15% pour une DFT de signaux complexes et 25% pour une DCT) tout en gardant un nombre de multiplicationségal a celui
`des meilleurs algorithmes de FFT connus. La structure simple de l’algorithme ainsi que le fait qu‘il s’applique bien aux
`signaux réels (il n’y a plus besoin de prendre la transformée de deux signaux réels simultanément) devraient conduire a mic
`implantation eflicace ainsi qu’2‘i un large champ d’applications.
`
`Keywords. Fast Fourier transform, fast cosine transform, transforms of real data.
`
`1. Introduction
`
`Since the rediscovery of the fast Fourier transform (FFT) algorithm [1,2] for the evaluation of discrete
`Fourier transforms, several improvements have been made to the basic divide and conquer scheme as for
`example the mixed radix FFT [3] and the real factor FFT [4, 5]. The introduction of the Winograd Fourier
`transform (WFTA) [6], although a beautiful result in complexity theory, did not bring the expected
`improvements once implemented on real life computers [7], essentially due to the large total number of
`operations and to the structural complexity of the algorithm.
`The fact that most FFT’s are taken on real data is seldom fully taken into account. The algorithm using
`a FFT of half dimension forthe computation of a DFT on a real sequence [8] uses substantially more
`operations than the method of computing a single FFT on two real sequences simultaneously [9]. The
`
`016,5-1684/84/$3.00 © 1984, ‘Elsevier Science Publishers B.V. (North—Hol1and)
`
`HIUAWEI EX. 1021 -4/15
`
`HUAWEI EX. 1021 -4/15
`
`

`
`t
`
`268
`
`M. Venerli, H. J. Nusshamner/ Simple FFT and DCT Algorithms
`
`latter method has the disadvantage that one has to take two DFT’s at once and that the sorting of the
`output uses additional adds. The fact that the input and output sequences are real is used explicitly in a
`real convolution algorithm [10] where the DFT and inverse DFT are computed with a single complex EFT.
`Another transform that is mostly applied to real data is the discrete cosine transform. Since the
`introduction of the DCT [11], the search for a fast algorithm followed two main different approaches.
`One was to compute the DCT through a FFT of same dimension [12], where one is bound to take two
`transforms simultaneously. The other was a direct approach, leading to rather involved algorithms [13].
`It should be noted that the former technique outperforms all the latter ones when using optimal FFT’s,
`a fact often left in the dark [14].
`,
`Recently, evaluation of signal processing algorithms has shifted away from multiplication counts alone
`to the counting of the total number of operations, including data transfers [15]. This is due to the fact
`that the ratios (multiplication time)/(addition time) and (multiplication time)/(load time) are close to one
`on most computers and signal processors. Another growing concern has been the generation of time
`efficient software [16], and finally, the efficiency of an algorithm turns out to be a non—trivial combination
`of the various operation counts as well as of its structural complexity [17].
`In this communication, we address an old problem, namely, the efficient evaluation of DFT’s and DCT’s
`of real data. Efficiency is meant in the sense of mlinimal number of multiplications and additions as well
`as in the sense of structural simplicity. As it turns out, the two problems are closely related, since a DFT
`of dimension N can be evaluated with two DCT’s of size N/4 and since a DCT of size N can be evaluated
`with a DFT of size N and additional operations. The same technique can be applied again to the reduced
`DCT of size N/4 and to the DFT of size N, and this until only trivial transforms are left over (N = 1, 2).
`This leads to an elegant recursive formulation of the two algorithms and to a number of multiplications
`identical
`to the best FFT‘s while diminishing substantially the number of additions (typically 30%).
`Interestingly, this last saving is partly kept when computing complex DFT’s, and as an example, the total
`number of operations for a 1024-point transform is nearly 10% below the number of operations required
`for a 1008-point WFTA. The prime factor FFT (PFA) requires about the same number of operations [18],
`but has a more complex structure.
`Note that the algorithms below were developed while searching for an efficient way to compute DCT’s
`of real data. The derived FFT algorithm for real data that follows immediately requires a number of
`multiplications identical to the one found in [19] (which is a variation of the Rader—Brenner algorithm),
`and a total number of operations that can be found in [20]. While obtaining an identical complexity, the
`derivations are quite difierent and the algorithm below seems more suitable for programming.
`Section 2 is used to derive the general algorithm and Section 3 evaluates its computational complexity.
`In Section 4, the results are compared to other algorithms and some implementation considerations are
`addressed.
`
`2. Derivation of the algorithms
`
`Let us define the following transforms ofthe length-N real vector x with elements x(0), x(l) -
`
`-
`
`- x(N -1):
`
`Discrete Fourier transform
`N—l
`
`DFT(k,N,x)'.= 2 xtn)-e“i2""“”, k=0,...,N—l,
`11:0
`
`(1)
`
`where j 2
`Signal Processiiig
`
`HUAWEI EX. 1021 -5/15
`
`HUAWEI EX. 1021 -5/15
`
`

`
`M. Verterli, H. J. Nussbaumer / Simple FFT and DC‘TAIgm'1'thms
`
`Discrete cosine transform
`
`DCT(I<,N,x):= Z x(n)-cos<&L), k:o,...,N—1.
`
`2
`
`2 +1 k
`
`~—'
`
`n*0
`
`Cosine DFT
`
`cos-DFT(k, N, x)I= 2 x(n)‘cos< "A:
`
`2
`
`k
`
`k=0,...,N—— 1.
`
`N4
`n=0
`
`Sine DPT
`
`sin-DFT(k, N, x)Z= 2 x(n)-sin< “L”
`
`2
`
`k
`
`k:0,...,N—l.
`
`1"/—-I
`11:0
`
`269
`
`(2)
`
`(3)
`
`(4)
`
`Note that in the definition of the DCT, the normalizing factor 1/x/2 at k=0 is omitted for simplicity.
`While the transforms are usually only defined for k = 0, .
`.
`.
`, N — 1, one will see that other values are useful
`as well and can easily be obtained from the ones defined above.
`Obviously, the following relations hold:
`
`DFT(k, N, x) = cos-DFT(k, N, x) —j sin-DFT(k, N, x),
`
`DCT( N, N, x) 2 0,
`
`DCT(—k, N, x) = DCT(k, N, x),
`
`DCT(2N~ k, N, x) = - DCT(k, N, x),
`
`cos-DFT(N — k, N, x) = cos-DFT(k, N, x),
`
`sin-DFT(N~k, N, x)=—sin-DFT(k, N, x).
`
`(5)
`
`(6)
`
`(7)
`
`(8)
`
`(9)
`
`(10)
`
`Looking at the evaluation of cos-DFT(k, N, x), we note that since the cosine function is even:
`
`cos
`
`-DFTk,N,
`
`(
`
`x)
`
`: V
`
`2
`
`-
`
`N/‘H
`2-nnk
`N/H
`":0 x( n) cos(N/2)+ "20 (x( n+ )+x(
`
`2
`
`1
`
`N—2 -1 1 ea 11
`
`n
`
`21T(2n +1)k>
`)) cos< 4. N/4
`
`(
`
`)
`
`or, in a more succinct form:
`
`cos-DFT(k, N, x)= cos-DFT(k, N/2, x.) +DCT(k, N/4, x2),
`
`k 2 O,
`
`.
`
`.
`
`.
`
`, N — l,
`
`with x1(n)=x(2n),
`
`n=0,...,N/2—1,
`
`x2(n)=x(2n+l)+x(N—2n—l), n=O,...,N/4—l,
`
`(12)
`
`where (8) or (9) are applied when necessary.
`Hence, the cosine DFT of dimension N has been mapped into a cosine DFT of N/2 and a DCT of
`N/4, at a cost of N/4 input and N/2 output additions (note that cos-DFT(k, N, x) is evaluated for
`k =0, .
`.
`.
`, N/2 but that k: N/4 does not require an output addition). The cosine DFT of N/2 can be
`handled similarly (and this until the transform becomes trivial) and the case of the DCT will be treated
`below.
`I
`,
`Turning to the sine DPT, We take a similar approach, using the fact that the sine function is odd.
`N/2—1
`sin—DFT(k, N, x)= Z x(2n) - sin 2T‘-nk +N/izl (x(2n + l)—x(N—2n —- l)) - sin
`n:
`11:0
`0
`
`,3
`
`2L<2n_+m<)
`)
`(
`'
`4-N/4
`Vol. 6, No. 4, August 1984
`
`HIUAWEI EX. 1021 -6/15
`
`HUAWEI EX. 1021 -6/15
`
`

`
`
`‘.'§¢K’..'\¥'§"~.\.:\k§..2
`
`
`
`t
`
`2
`‘
`
`
`
`2
`
`€31
`
`E:7
`
`3t Es
`
`DCT(N/2, N, x) 2 cos(~rr/4) - cos-DFT(N/2, N, x4).
`
`(19)
`
`The matrix product in (19) can be evaluated with 3 multiplications and 3 additions (instead of 4
`multiplications and 2 additions) [9] as follows:
`
`21-rk
`_
`21rk
`p12cos W +s1n W,
`
`211k
`21rk
`_
`p22s1n W —cos W,
`
`s, 2 cos-DFT(k, N, x4) +sin-DFT(k, N, x4),
`
`2TFk
`
`ml2s, ' COS W ,
`Signal Processing
`
`HUAWEI EX. 1021 -7/15
`
`270
`
`M. Verrerlz, H.
`
`.1. Nussbrmmer/ Simple FFT and DCTAlgoriII1ms
`
`Using the following identity:
`
`Sin( > __ (._1)" . C0S< )’
`
`we can rewrite (13), using (8) or (10) when necessary, as:
`
`sin-DFT(k, N, x)2sin-DFT(k, N/2, x,)+DCT(N/4- k, N/4, x3),
`
`k 2 0, .
`
`.
`
`.
`
`, N— I,
`
`with x3(n)2(—1)"-(x(2n+1)—x(N~2n—l)).
`
`(14)
`
`(15)
`
`Therefore, the sine DFT of dimension N has been mapped into a sine DFT of N/2 and a DCT of N/4,
`at the cost of N/4 input and (N/2)—2 output additions (note that sin-DFT(k, N, x) is evaluated for
`k 2 0, .
`.
`.
`, N/2 but that k 2 0, N/4 and N/2 do not require any output additions). The sine DFT of N/2
`is handled in a similar fashion until the length is reduced so that the transform becomes trivial.
`We now focus our attention on the computation of the DCT. Using the following mapping [I2]:
`
`X4(")= x(2"),
`
`x4(N—n—l)2x(2n+l), n20,...,N/2~l,
`the DCT can be evaluated as:
`N"‘
`2
`4 +1 k
`DCT(k, N, x)2 2 x4(n) - cos<—T11—)>,
`Using basic trigonometry, (17) becomes:
`2 k
`l)CT(k, N, x) : cos(i—) -cos-DFT(k, N, x4) —sin<L) -sinmDFT(Ic, N, x4),
`4N
`
`n2O
`
`/«:0, .
`
`i
`
`. ., N— 1.
`
`With the symmetries of trigonometric functions, (18) can be computed as follows:
`
`2 k
`4N
`
`(16)
`
`(17)
`
`k = 0, .
`
`.
`
`.
`
`, N -1.
`
`(18)
`
`0- T —''m ——
`
`C
`
`[ DCT(k, N, x)
`DCT(N—k, N, x)
`
`]_
`
`and
`
`(Zwk)
`5 4N
`_ (wk)
`
`Sln 7 cos
`4N
`
`_ <2'rrk)
`4N
`<21'rk>
`

`
`_ [cos-DFT(k, N, W]
`-
`sin»DFT(k, N, x4)
`
`’
`
`kfo
`
`""’
`
`N/2_1
`
`4N
`
`HUAWEI EX. 1021 -7/15
`
`

`
`M. Venerli, H. J. Nusshaumer/ Simple FFT and DCT Algorirlmzs
`
`271
`
`m2 = sin-DFT(k, N, x4) - pl,
`
`m5 : cos—DFT(k, N, x4) - P2,
`
`DCT(k, N," x) = m, — ml,
`
`DCT( N —— k, N, x) : m1+ m3,
`
`'
`
`(20)
`
`where p. and p2 are precomputed. Using all simplifications, the computation of (19) requires therefore a
`total of (3 N/2)—2 multiplications and (3 N/2)e~/3 additions.
`Thus, we have shown how to map a N dimensional DFT into two DCT’s of N/4 (5, l2 and 15) and
`how to map a DCT of N into a DFT of same size (16 and l9). In other words, since the DCT is computed
`through a DFT, it is shown how to compute a DFT of N with 2 DFT’s of N/4 plus auxiliary operations,»
`and this operation can be repeated until N has been reduced sufficiently so as to lead to trivial transforms.
`Figures l
`to 4 try to visualize schematically the interaction of the various transforms. Combination of
`these figures in appropriate order show how to compute a transform of any dimension.
`
`DFT(N)
`
`cos-DFT(N)
`
`Fig. 1. Decomposition ofa DFT of size N into a cosine DFT of size N and a sine L)l<T of size N.
`
`sm-DFT(N)
`DCT(N/ll)
`
`cos—lJFT(N)
`
`cos—lJFl(N/2)
`
`Fig. 2. Decomposition of a cosine DFT of size N into a cosine I)FT of size, N/2 and a DCT of size N/M4.
`Vol. 6, No. 4, August 1984
`
`HIUAWEI EX. 1021 -8/15
`
`HUAWEI EX. 1021 -8/15
`
`

`
`272
`
`M. Venerli, H. J. Nu.vsbamner/ Simple FFT and DCT /llgorirhms
`
`siN—DFT(N)
`
`stN—DT(N/2)
`
`DCT(N/ll)
`s1N-DFT(N)
`
`Fig. 3. Decomposition of a sine DFT of size N into a sine DFT of size N/2 and a DCT of size N/4,
`
`DCT(N
`
`cos4DFT(N)
`
`Fig. 4. Decomposition of a DCT of‘ size N into a cosine DFT of size N and a sine DFT of size N plus auxiliary operations.
`
`The algorithm is best suited for DFT’s on real data, but if the input is complex, one simply takes the
`real and imaginary parts separately (thus doubling the computational load) and evaluates the output with
`2N—4 auxiliary adds (k =0 and N/2 require no additions).
`
`*3. Computational complexity
`
`Even if the derivation above used only the fact that N was a multiple of 4, the algorithm, as other
`divide and conquer schemes, performs best when N is highly composite, typically a power of 2.
`Below, we restate the number of operations required for the various steps needed in the evaluation of
`the transforms, where ()M[-] and OAL-] stand for the number of multiplies and adds respectively.
`Signal Processing
`
`HUAWEI EX. 1021 -9/15
`
`HUAWEI EX. 1021 -9/15
`
`

`
`M. Vetlerli, H. J. Nu:s'baumer/ Simple FFT and DCT Algorithms
`
`DFT on length-N real data:
`
`OM[R-DFT(N)] = OM[cos—DFT(N)] + OMLsin—DFT(N)j,
`-
`—
`_
`_
`OA[R»DFT(N)] : OA[cos-DFT(N_)] + OA[s1n—DFT(N)j.
`DFT on length-N complex data:
`2
`
`0M[C-DFT(N)] = 2- (OM[c0s-DFT(N)] +OM[sin-DFT(N)]),
`OA[C-DFT(N)] = 2 - (OA[cos-l’)FT(N)] + O,4[sin-DFT(N)]) +2N -4.
`
`DCT on length-N real data:
`
`OM [DCT(N)] = OM[cos-DFT(N)j + Ol\1”_Sin~DFT(N):| +(3 N/2) — 2,
`
`OA[DCT(N)] = OA[c0s-DFT(N)] + OA[sin-DFT(N)] +(3 N/2) — 3.
`
`Cosine DPT on length-N real data:
`
`OM[cos-DFT(N)] = OM[DCT(N/4)] + 0,‘; [cos-DFT(N/2)],
`OA[cos-DFT(N)] = OA[DCT(N/4)] + OA[c0s-DFT(N/2)] +(3N/4).
`
`Sine DFT on length-N real data:
`
`OM[sin—DFT(N)] = 0M[DCT(N/4)] + ()M[sin-DFT(N/2)],
`OA[sin—DFT(N)] = OA[DCT(N/4)] + OA[sin-DFT(N/2)] +(3 N/4) — 2.
`
`273
`
`(21)
`
`(22)
`
`(23)
`
`24
`
`(
`
`25
`
`(
`
`)
`
`)
`
`From (21)—(25) one can compute recursively the number of operations needed for the various transform
`types and sizes greater than 2. For N = 1, no operations are required, and the values for N =2 are given
`in (26), thus defining the initial conditions for the above recursions.
`
`OM[R~DFT(2)] = OMLC-DFT(2)] = OM[cos-Dl<T(2)] : OM[sin-DFT(2)]= 0,
`
`()M[DCT(2)] : 1,
`
`OA[R-DFT(2)] = OA{DCT(2)] = OA[cos-DFT(2)] = 2,
`
`(26)
`
`OA[C-DFT(2)] : 4,
`
`OA[sin-DFT(2)] = 0.
`
`But, when N is a power of 2 and that the recursions are therefore applied Log; N times, the operation
`counts for the DCT reduces simply to:
`
`0M[DCT(N)] = N/2 ' Logz (N),
`
`O,\[DCT(N)]= N/2 - (3 Log2(N)—2) + 1.
`
`From (27) it follows immediately with (21)—(23) that:
`
`OM[R-DFT(N)]= N/2 - (I.og2(N)—3)+2,
`OA[R-DFT(N)]= N/2 - (3 Log2(N)—5)+4,
`
`(27)
`
`(28)
`
`Vol. 6, No. 4, August I984
`
`HUAWEI EX. 1021 -10/15
`
`HUAWEI EX. 1021 -10/15
`
`

`
` 32,
`
`4‘l
`'4
`
`274
`and
`
`M. Verterli, H. J. Nussbaumer/ Simple FFT and DCT Algorithms
`
`OM[C-DFT(N)] = N~ (Log;(N) — 3) +4,
`OA[C-DFT(N)] ; 3N- (Log2(N) — 1) +4.
`
`(29)
`
`The above closed form expressions can now easily be compared to existing algorithms.
`
`4. Comparison with existing algorithms
`
`First, the introduced algorithm (which is called in the following fast Fourier-cosine transform or FFCT)
`is compared to other algorithms which work directly on real data. Then we look at the case where
`two transforms on real data are taken simultaneously with an optimal FFT algorithm. The case of the
`FFT on complex data is investigated as well, and the issue of algorithm structure is addressed. The
`operation counts for FFT’s are taken from [9] unless otherwise specified.
`For the computation of a DFT on real data, one can use the algorithm based on an FFT of N/2 [8].
`Using again the 3 mult/3 adds approach for the auxiliary output operations, one gets the following
`complexity:
`
`0M[FFT(N)] = OM[FFT(N/2)] +(3 N/2) _ 2,
`0A[FFT(N)] = O,\[FFT(N/2)] +(9N/2)—2.
`
`(30)
`
`Table l compares this result with the FFCT and shows the substantial savings that are obtained.
`
`Table I
`
`Comparison of direct computation of a DFT on real data with a FFT of N/2 or
`the FFCT algorithm (the FFT of N/2 is computed with the Rader—Brenner algorithm)
`
`size
`
`N
`
`8
`I6
`32
`64
`I28
`256
`512
`I024
`2 048
`
`FFT of N/2 +ops.
`
`FFCT algorithm
`
`real mults
`
`real adds
`
`real mults
`
`real adds
`
`10
`26
`66
`162
`386
`898
`2050
`4610
`10 242
`
`_
`
`42
`I22
`290
`710
`l 678
`3 870
`8766
`I9 582
`43 262
`
`2
`10
`34
`98
`258
`642
`1538
`3586
`8 194
`
`20
`60
`164
`420
`l 028
`2 436
`5636
`12804
`28 676
`
`Turning to the DCT, one can compare the FFCT to fast discrete cosine algorithms which perform the
`transform directly, without going to the FFT (and therefore to the need to perform 2 transforms on real
`data at the same time). The various operation counts are shown in Table 2. The number of multiplications
`for the FFCT is always below the other algorithms, and even if the number of additions is slightly above
`the Chen et al. version, the total number of operations is always substantially less.
`Signal Processing
`
`HUAWEI EX. 1021 -11/15
`
`HUAWEI EX. 1021 -11/15
`
`

`
`M. Vetlerli, H. J. Nussbaumer/ Simple FFT and DCT Algorithms
`
`275
`
`Table 2
`
`Comparison of direct computation of a DCT on real data with the algorithms of Chen el al. [13], Wang er al. [14]
`and the FFCT (the operation counts for the two first algorithms are taken from [14])
`
`size
`
`N
`
`8
`16
`32
`64
`128
`256
`512
`1024
`2 048
`
`Chen et a1.
`
`.
`
`Wang er (11.
`
`FFCT
`
`r. mults
`
`I‘. adds
`
`1‘. mults
`
`I‘. adds
`
`r. mults
`
`r. adds
`
`16
`44
`116
`292
`708
`1668
`3844
`8 708
`19 460
`
`26
`74
`194
`482
`1154
`2 690
`6146
`13 826
`30 722
`
`13
`35
`91
`227
`547
`1283
`2947
`6 659
`14 851
`
`29
`83
`219
`547
`1315
`3 075
`7043
`15 875
`35 331
`
`12
`32
`80
`192
`448
`1024
`2304
`5120
`11264
`
`29
`81
`209
`513
`1217
`2 817
`6401
`14 337
`31 745
`
`The other general approach to transforms on real data is to compute simultaneously the transform of
`two real sequences. Beside the drawback that one has always to compute two transforms (otherwise the
`approach is suboptimal), the sorting of the output requires about N output additions. The operation count
`for a DFT on real data are given below:
`
`OM[R-DFT(N)]: 1/2 - OM[FFT(N)],
`OA[R-DFT(N)]:1/2- OA[FFT(N)]+ N -2.
`
`(31)
`
`Table 3 compares (31) with the FFCT, and, as can be seen, the multiplication count is identical while a
`saving of about 30% is made with regard to additions.
`The situation is quite similar when computing a DCT with an FFT [12]. It is assumed that 2 DCT’s on
`real data are evaluated at the same time. Thus, the operation count is:
`
`OM[R-DFT(N)]= 1/2 - OM[FFT(N)] +(3 N/2)~2,
`OA[R-DFT(N)]=1/2- OA[FFT(N)]+(5N/2)— 5.
`
`Table 3
`
`Comparison of the computation of a DFT on real data with an FFT of N on two
`data sequences simultaneously and output additions or the FFCT algorithm (the
`FFT of N is computed with the Rader—Brenner algorithm)
`
`size
`
`N
`
`8
`16
`32
`64
`128
`256
`512
`1024
`2 048
`
`FFT ofN4 adds
`
`Fl"-CT algorithm
`
`real mults
`
`real adds
`
`real mulls
`
`real adds
`
`2
`10
`34
`98
`258
`642
`1538
`3 586
`8194
`
`32
`88
`242
`614
`1486
`3 486
`7 998
`18 046
`40.190
`
`2
`10
`34
`98
`258
`642
`1538
`3 586
`8194
`
`20
`60
`164
`420
`1028
`2 436
`5 636
`12 804
`28 676
`
`(32)
`
`Vol. 6, No. 4, August 1984
`
`HUAWEI EX. 1021 -12/15
`
`HUAWEI EX. 1021 -12/15
`
`

`
`276
`
`M. Vellerli, H. J. Nu.v.vbaumer/ Simple FFT and DCTAlguri(Im1s
`
`A technique similar to the one used in (23) was used in the computation ofthe output of the DCT. The
`results are compared in Table 4 where it is seen that the additions are reduced in the FFCT by about
`25% with an identical number of multiplies.
`Table 4
`
`Comparison of the computation of a DCT on real data with the algorithm using
`an FFT on two data sequences simultaneously or the FFCT (the FFT is
`the
`Rader—-Brenner version)
`
`size
`
`N
`
`8
`16
`32
`64
`128
`256
`S12
`1024
`2 048
`
`FFT of N Hips.
`
`FFCT algorithm
`
`real mults
`
`real adds
`
`real mults
`
`real adds
`
`12
`32
`80
`192
`448
`I 024
`2304
`5120
`11 264
`
`41
`109
`287
`707
`1675
`3 867
`8763
`19579
`43 259
`
`12
`32
`80
`192
`448
`1 024
`2304
`5120
`11 264
`
`29
`81
`209
`513
`1217
`2 817
`6401
`14 337
`31 745
`
`Turning to the computation ofa DFT on a complex input, one can use (29) for the FFCT. The comparison
`to the Rader—Brenner FFT is given in Table 5 where the number of multiplies turns out to be identical
`while the number of additions is reduced by nearly 20%.
`Looking at last at the Winograd Fourier transform, one sees that even if the number of multiplies is
`larger in the FFCT case, the total number of operations is smaller for large transforms (for example 5%
`for the WFTA(504)/FFCT(5l2) and 9% for the WFTA(l008)/FFCT(1024) comparison). Note that the
`gain is less in the PFA case (5% and 2% respectively).
`Concerning the WFTA and the PFA, one recalls that its main drawback is the involved structure of the
`algorithm. Even if the structure of the FFCT is not as straightforward as the radix-2 FFT,
`it remains
`simple (it is similar to the structure of the Rader—Brenner algorithm). ln the FFCT, the reduction in the
`number of operations is obtained at the cost of additional permutations and data transfers. Even if these
`
`Table 5
`
`Comparison ofthe Rader—Brenner FFT with the FFCT when applied to complex data
`
`FFT of N
`
`FFCT algorithm
`
`real mults
`
`real adds
`
`real mults
`
`real adds
`
`4
`20
`68
`196
`516‘
`,1 284
`3 076
`7172
`16388
`
`52
`148
`424
`l 104
`2720
`6 464
`14 976
`34 048
`76 288
`
`4
`20
`68
`196
`516
`l 284
`3 076
`7 172
`16388
`
`52
`I48
`388
`964
`2308
`5 380
`12 292
`27 652
`61444
`
`size
`
`N
`
`8
`16
`32
`64
`128
`256
`512
`1024
`2048
`
`Signal Processing
`
`HUAWEI EX. 1021 -13/15
`
`HUAWEI EX. 1021 -13/15
`
`

`
`
`
`M. Vetterli, H. J. Nuxsbaumer/ Simple FFTant1 DCTAIgari1}m1s
`
`277
`
`permutations can be grouped with previous ones (for example at the entry of the DCT), the resulting
`structure is more complex. But this increase in topological complexity is annihilated by the substantial
`decrease in arithmetic complexity.
`Looking at the implementation, one sees the importance of using very efficient small transforms. Together
`with the explicit coding technique [16], this should lead to fast code for real transforms, especially on
`micro and signal processors, where the number of arithmetic and data registers is small (leading to
`unetficient implementations of the complex transform on two real sequences version).
`
`5. Concluding remarks
`
`We have introduced a simple, recursive algorithm for the computation of the discrete Fourier transform
`and the discrete cosine transform. First, this algorithm can be applied directly to real data, providing
`therefore an attractive alternative to the method using a complex transform oftwo real sequences. Secondly,
`it uses the same number of multiplies as the best structured FFT algorithms but decreases the number of
`additions by about 25 to 30% for Dl~"l"s and DCT’s on real data and by nearly 20% for DFT’s on complex
`data when compared to currently used algorithms. At last, its structure is simple enough so that it should
`lead to efficient implementations.
`It is worth noting that the proposed algorithm, while leading to the same computational complexity
`for the FFT as the one in [20], is not isomorphic. Interestingly, the algorithm in [20] has been around for
`15 years but was seldom referenced and even less used. Meanwhile, literature was published on amelior-
`ations of the Cooley—Tukey FFT which leads to less efficient algorithms than the one in [20]. .
`.
`. More
`recently, actually just as the linal copy was being sent to the publisher, yet another algorithm appeared
`that leads to the same complexity for the complex FFT [21].
`Investigations are under way in order to generalize the above ideas to other sinusoidal transforms and
`to higher dimensions, as well as to prove the efficiency of an implementation.
`'
`
`Acknowledgements
`
`The authors would like to thank one of the anonymous reviewers for bringing [20] and [21] to their
`attention as well as the “Fonds National Suisse de la Recherche Scientifique” for supporting this research.
`
`References
`
`[1] C. Runge, and H. Konilz, Varlesungen fiber Numerisches Rerlmen, Springer Verlag, Berlin, 1924.
`[2] J.W. Cooley, and .I.W. Tukey, “An algorithm for th

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket