`
`Aldana et al.
`In re Patent of:
`8,416,862
`
`U.S. Pat. No.:
`April 9, 2013
`Issue Date:
`Appl. Serial No.: 11/237,341
`Filing Date:
`September 28, 2005
`Title:
`EFFICIENT FEEDBACK OF CHANNEL INFORMATION IN
`A CLOSED LOOP BEAMFORMING WIRELESS
`COMMUNICATION SYSTEM
`
`Attorney Docket No.: 35548-0097IP1
`
`Declaration of Jacob Robert Munford
`
`1
`
`LG 1021
`
`
`
`1. My name is Jacob Robert Munford. I am over the age of 18, have personal
`
`knowledge of the facts set forth herein, and am competent to testify to the
`
`same.
`
`2. I earned a Master of Library and Information Science (MLIS) from the
`
`University of Wisconsin-Milwaukee in 2009. I have over ten years of
`
`experience in the library/information science field. Beginning in 2004, I
`
`have served in various positions in the public library sector including
`
`Assistant Librarian, Youth Services Librarian and Library Director. I have
`
`attached my Curriculum Vitae as Appendix A.
`
`3. During my career in the library profession, I have been responsible for
`
`materials acquisition for multiple libraries. In that position, I have cataloged,
`
`purchased and processed incoming library works. That includes purchasing
`
`materials directly from vendors, recording publishing data from the material
`
`in question, creating detailed material records for library catalogs and
`
`physically preparing that material for circulation. In addition to my
`
`experience in acquisitions, I was also responsible for analyzing large
`
`collections of library materials, tailoring library records for optimal catalog
`
`2
`
`
`
`
`
`
`
`
`
`search performance and creating lending agreements between libraries
`
`during my time as a Library Director.
`
`4. I am fully familiar with the catalog record creation process in the library
`
`sector. In preparing a material for public availability, a library catalog record
`
`describing that material would be created. These records are typically
`
`written in Machine Readable Catalog (herein referred to as “MARC”) code
`
`and contain information such as a physical description of the material,
`
`metadata from the material’s publisher, and date of library acquisition. In
`
`particular, the 008 field of the MARC record is reserved for denoting the
`
`date of creation of the library record itself. As this typically occurs during
`
`the process of preparing materials for public access, it is my experience that
`
`an item’s MARC record indicates the date of an item’s public availability.
`
`
`5. I have reviewed Exhibit EX1008, an article by B. Yang and J.F. Bohme
`
`entitled “Reducing The Computations of the Singular Value Decomposition
`
`Array Given By Brent and Luk” (hereto referred to as ‘Yang’) as presented
`
`in SIAM Journal On Matrix Analysis and Applications Volume 12, Issue 4.
`
`
`6. Attached hereto as YA01 is a true and correct copy of the spine, publication
`
`data, title page and complete ‘Yang’ from SIAM Journal On Matrix Analysis
`
`3
`
`
`
`
`
`
`
`and Applications from the University of Pittsburgh library. In comparing
`
`YA01 to Exhibit EX1008, it is my determination that Exhibit EX1008 is a
`
`true and correct copy of ‘Yang’.
`
`
`7. Attached hereto as YA02 is a true and correct copy of the MARC record
`
`describing SIAM Journal On Matrix Analysis and Application from the
`
`University of Pittsburgh’s library. I secured this record myself from the
`
`library’s online catalog. The 008 field of this MARC record indicates SIAM
`
`Journal On Matrix Analysis and Application was first cataloged by the
`
`University of Pittsburgh library as of September 9, 1987. The item holdings
`
`indicate this journal was held in perpetuity since September 1987. This item
`
`record also indicates the library’s collection includes the Volume 12, Issue 4
`
`publication of SIAM Journal On Matrix Analysis and Application containing
`
`“Yang”.
`
`
`8. The date stamp on page 4 of YA01 indicates this journal was processed by
`
`library staff as of November 1991. Considering this information in concert
`
`with the record data from YA02, it is my determination that the Volume 12,
`
`Issue 4 edition of SIAM Journal On Matrix Analysis and Application was
`
`made available and accessible to the public by the University of Pittsburgh
`
`library shortly after initial publication and certainly no later than November
`
`
`
`4
`
`
`
`1991. Based on journal availability, it is my determination that ‘Yang’ was
`
`made available and accessible to the public shortly after initial publication
`
`via SIAM Journal On Matrix Analysis and Application.
`
`9. I have been retained on behalf of the Petitioner to provide assistance in the
`
`above-illustrated matter in establishing the authenticity and public
`
`availability of the documents discussed in this declaration. I am being
`
`compensated for my services in this matter at the rate of $100.00 per hour
`
`plus reasonable expenses. My statements are objective, and my
`
`compensation does not depend on the outcome of this matter.
`
`10. I declare under penalty of perjury that the foregoing is true and correct. I
`
`hereby declare that all statements made herein of my own knowledge are
`
`true and that all statements made on information and belief are believed to
`
`be true; and further that these statements were made the knowledge that
`
`willful false statements and the like so made are punishable by fine or
`
`imprisonment, or both, under Section 1001 of Title 18 of the United States
`
`Code.
`
`5
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Dated: 7/31/19
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Jacob Robert Munford
`
`
`
`6
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`APPENDIX A
`APPENDIX A
`
`7
`
`
`
`Appendix A - Curriculum Vitae
`
`Education
`
`University of Wisconsin-Milwaukee - MS, Library & Information Science, 2009
`Milwaukee, WI
`● Coursework included cataloging, metadata, data analysis, library systems,
`management strategies and collection development.
`● Specialized in library advocacy and management.
`
`Grand Valley State University - BA, English Language & Literature, 2008
`Allendale, MI
`● Coursework included linguistics, documentation and literary analysis.
`● Minor in political science with a focus in local-level economics and
`government.
`
`Professional Experience
`
`Library Director, February 2013 - March 2015
`Dowagiac District Library
`Dowagiac, Michigan
`● Executive administrator of the Dowagiac District Library. Located in
`Southwest Michigan, this library has a service area of 13,000, an annual
`operating budget of over $400,000 and total assets of approximately
`$1,300,000.
`● Developed careful budgeting guidelines to produce a 15% surplus during the
`2013-2014 & 2014-2015 fiscal years.
`● Using this budget surplus, oversaw significant library investments including
`the purchase of property for a future building site, demolition of existing
`buildings and building renovation projects on the current facility.
`● Led the organization and digitization of the library's archival records.
`● Served as the public representative for the library, developing business
`relationships with local school, museum and tribal government entities.
`
`8
`
`
`
`● Developed an objective-based analysis system for measuring library services
`- including a full collection analysis of the library's 50,000+ circulating
`items and their records.
`
`
`November 2010 - January 2013
`Librarian & Branch Manager, Anchorage Public Library
`Anchorage, Alaska
`● Headed the 2013 Anchorage Reads community reading campaign including
`event planning, staging public performances and creating marketing
`materials for mass distribution.
`● Co-led the social media department of the library's marketing team, drafting
`social media guidelines, creating original content and instituting long-term
`planning via content calendars.
`● Developed business relationships with The Boys & Girls Club, Anchorage
`School District and the US Army to establish summer reading programs for
`children.
`
`June 2004 - September 2005, September 2006 - October 2013
`Library Assistant, Hart Area Public Library
`Hart, MI
`● Responsible for verifying imported MARC records and original MARC
`cataloging for the local-level collection as well as the Michigan Electronic
`Library.
`● Handled OCLC Worldcat interlibrary loan requests & fulfillment via
`ongoing communication with lending libraries.
`
`Professional Involvement
`
`Alaska Library Association - Anchorage Chapter
`● Treasurer, 2012
`
`Library Of Michigan
`● Level VII Certification, 2008
`● Level II Certification, 2013
`
`9
`
`
`
`Michigan Library Association Annual Conference 2014
`● New Directors Conference Panel Member
`
`Southwest Michigan Library Cooperative
`● Represented the Dowagiac District Library, 2013-2015
`
`Professional Development
`
`Library Of Michigan Beginning Workshop, May 2008
`Petoskey, MI
`● Received training in cataloging, local history, collection management,
`children’s literacy and reference service.
`
`Public Library Association Intensive Library Management Training, October 2011
`Nashville, TN
`● Attended a five-day workshop focused on strategic planning, staff
`management, statistical analysis, collections and cataloging theory.
`
`Alaska Library Association Annual Conference 2012 - Fairbanks, February 2012
`Fairbanks, AK
`● Attended seminars on EBSCO advanced search methods, budgeting,
`cataloging, database usage and marketing.
`
`10
`
`
`
`YA01
`YAOI
`
`11
`
`
`
`IM)
`
`5 050 777 956
`
`12
`
`
`
`STAM
`BUa)
`Meee
`ANALYSIS AND
`
`APPLICATIONS
`
`13
`
`
`
`R@8-M16-S24-Te6
`31735050777956
`AACAAA
`Hillman Gr.Fl. Lending
`Request IDi 432140
`Pull Date: 2619-87723 113032
`Call No.
`
`Title? SIAM journal on matrix analysi
`
`RED
`
`MUNFORD, JACOB R
`ULSertsyB
`2L666268035853sS
`Req. Date: 2019/07/23 10:07:43
`
`Do Not Remove This Wrapper
`
`
`
`University of Pittsburgh
`University Library System
`Storage
`Facility
`
`DO NOT
`CIRCULATE
`DO NOT
`
`14
`
`eRhiStSlhdBSih
`
`
`
`14
`
`
`
`
`
`15
`
`
`
`SIAM JOURNAL ON
`
`Matrix Analysis
`and Applications
`
`VOLUME 12
`
`
`Cc
`
`NI
`
`. Meyer
`. K. Nichols
`Olkin
`S. Pang
`. G. Rothblum
`K. Sigmon
`G. Strang
`P. van Dooren
`C. Van Loan
`R. S. Varga
`A. Watson
`H. Weinberger
`
`.J
`
`.U
`
`Managing Editor
`Associate
`Managing Editor
`Editorial Board
`
`G. H. Golub
`
`R. J. Plemmons
`
`T. Ando
`A. Berman
`R. Brualdl
`J. Bunch
`A. Bunse-Gerstner
`P. J. Courtois
`G. Cybenko
`B. N. Datta
`Y. Genin
`J. R. Gilbert
`L. J. Gleser
`M. Goldberg
`W.B. Gragg
`
`A. Greenbaum
`. Gutknecht
`. Hammarling
`. J. Higham
`A. Horn
`. Kagstrém
`. Kautsky
`. Lancaster
`W. H. Liu
`T. Luk
`. A. Manteuffel
`S. Maybee
`. L. Merris
`
`DOANVEDDZMES
`
`oo—sS—
`Copyright 1991 by the Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania.
`
`16
`
`16
`
`
`
`AN
`JOURNAL ON
`
`
`
`
`Matrix Analysis
`and Applications
`
`OCTOBER 1991
`Volume 12, Number 4
`
`
`Managing Editor
`Associate
`Managing Editor
`
`Editorial Board
`
`G. H. Golub
`
`R. J. Plemmons
`
`1. Ando
`A. Berman
`R. Brualdi
`J. Bunch
`A. Bunse-Gerstner
`P. J. Courtois
`G. Cybenko
`B. N. Datta
`Y. Genin
`J. R. Gilbert
`L. J. Gleser
`M. Goldberg
`W. B. Gragg
`
`A. Greenbaum
`M. Gutknecht
`S. Hammarling
`N. J. Higham
`R. A. Horn
`B. Kagstr6m
`J. Kautsky
`P. Lancaster
`J. W. H. Liu
`F. T. Luk
`T. A. Manteuffel
`J. S. Maybee
`R. L. Merris
`
`C. Meyer
`N. K. Nichols
`|. Olkin
`J. S. Pang
`U. G. Rothblum
`K. Sigmon
`G. Strang
`P. van Dooren
`C. Van Loan
`R. S. Varga
`A. Watson
`H. Weinberger
`
`nn
`
`Publisher: Vickie H. Kearn. Production Manager: J. Corey Gray. Managing Editor: Tricia Manning. Senior
`Production Editor: Crystal G. Norris. Copy Editor: Beth Gallagher. Editorial Assistant: David Livewell. Pro-
`duction Coordinator: Nancy H. Abbott. Reprints Coordinator: Jill M. Davis.
`
`SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS(ISSN 0895-4798)is published quarterly in January,
`April, July, and October by the Society for Industrial and Applied Mathematics, 3600 University City Science
`Center, Philadelphia, PA 19104-2688. Second-class postage paid at Philadelphia, Pennsylvania and additional
`mailing offices. POSTMASTER: Send address change to S/AM Journal on Matrix Analysis and Applications,
`3600 University City Science Center, Philadelphia, PA 19104-2688.
`
`Subscriptions: Annual subscriptionlist price: $130.00 (domestic), $155.00 (overseas); annual individual member
`subscription price: $40.00 (domestic), $43.00 (overseas). Subscriptions are available on a calendar-year basis
`only. Send orders to Customer Service, SIAM, 3600 University City Science Center, Philadelphia, PA 19104-2688.
`In Japan send orders to USACO Corporation, 13-12, Shimbashi, 1-chome, Minato-ku, Tokyo 105, tel. 03(502)-
`6471. All back volumesare available; prices will be provided on request. Manuscript Submissions:See Instructions
`to Authors on pageiv of this journal. Advertising: Accepted forall publications. Inquiries should be directed to
`the Marketing Manager, SIAM Publications, 3600 University City Science Center, Philadelphia, PA 19104-2688.
`Authorization to photocopyitemsfor internal and personaluse,or the internal use of specific clients,is granted
`by the Society for Industrial and Applied Mathematicsforlibraries and other users registered with the Copyright
`Clearance Center (CCC) Transactional Reporting Service provided that the base fee of $1.50 per copy plus $0.10
`per pageis paid directly to CCC, 21 Congress Street, Salem, Massachusetts 01970.
`
`TAPSCO,Inc., Akron, Pennsylvania, compositor; Lancaster Press, Lancaster, Pennsylvania,printer.
`The SIAM Journal on Matrix Analysis and Applications is a continuation of the SIAM Journal on Algebraic and
`Discrete Methods (ISSN 0196-5212).
`
`Copyright 1991 by the Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania.
`Siamisa registered trademark.
`
`Me
`
`17
`
`17
`
`
`
`SIAM J. MATRIX ANAL. APPL.
`Vol. 12, No. 4, pp. 713-725, October 1991
`
`© 1991 Society for Industrial and Applied Mathematics
`009
`
`REDUCING THE COMPUTATIONS OF THE SINGULAR VALUE
`DECOMPOSITION ARRAY GIVEN BY BRENT AND LUK*
`
`B. YANG+ AND J. F. BOHME+
`
`Abstract. A new,efficient, two-plane rotation (TPR) method for computing two-sided rotations involved
`insingular value decomposition (SVD) is presented. It is shown that a two-sided rotation can be evaluated by
`only two plane rotations and a few additions. This leads to significantly reduced computations. Moreover,if
`coordinate rotation digital computer (CORDIC) processors are usedfor realizing the processing elements(PEs)
`of the SVD array given by Brent and Luk, the computational overhead of the diagonal PEs due to angle
`calculations can be avoided. The resulting SVD array has a homogeneousstructure with identical diagonal and
`of-diagonal PEs. Similar results can also be obtained if the TPR method is applied to Luk’s triangular SVD
`aray and to Stewart’s Schur decomposition array.
`
`Key words. singular value decomposition, systolic arrays, CORDIC, two-sidedrotations, VLSI
`
`AMS(MOS)subjectclassification. 15A18
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`1. Introduction. One important problem in linear algebra and digital signal pro-
`cessing is the singular value decomposition (SVD). Typical applications arise in beam-
`forming and direction finding, spectrum analysis, digital image processing,etc. [1]. Re-
`cently, there has been a massive interest in parallel architectures for computing SVD
`because of the high computational complexity of SVD, the growing importanceofreal-
`time signal processing, and the rapid advancesin very large scale integration ( VLSI) that
`make low-cost, high-density and fast processing memory devices available.
`There are different numerically stable methods for computing complete singular
`value and singular vector systems of dense matrices, for example, the Jacobi SVD method,
`the OR method, and the one-sided Hestenes method. Forparallel implementations, the
`Jacobi SVD method is far superior in terms of simplicity, regularity, and local com-
`munications. Brent, Luk, and Van Loan have shown how the Jacobi SVD method with
`parallel ordering can be implemented by a two-dimensional systolic array [2], [3]. Various
`coordinate rotation digital computer (CORDIC) realizations ofthe SVD array have been
`_teported by Cavallaro and Luk [4] and Delosme[5], [6].
`The Jacobi SVD method is based on, as commonforall two-sided approaches,
`applying a sequence of two-sided rotations to 2 X 2 submatrices ofthe original matrix.
`The computational complexity is thus determined by how to compute the two-sided
`rotations. In most previous works, a two-sided rotation is evaluated in a straightforward
`manner by four plane rotations, where two of them are applied from left to the two
`column vectors of the 2 * 2 submatrix and the other onesare applied from right to the
`row vectors, respectively. In the diagonal processing elements (PEs), additional operations
`for calculating rotation angles are required. This leads to an inhomogeneousarray ar-
`chitecture containing two different types of PEs.
`In this paper, we develop a two-plane rotation (TPR) method for computing two-
`sided rotations. We show that the above computational complexity can be reducedsig-
`nificantly because each two-sided rotation can be evaluated by only two plane rotations
`and a few additions. Moreover, the SVD array given by Brent and Luk becomes ho-
`mogeneouswith identical diagonal and off-diagonal PEs when CORDICprocessors are
`
`
`
`* Received by the editors September 28, 1989; accepted for publication (in revised form) August 2, 1990.
`+ DepartmentofElectrical Engineering, Ruhr-Universitat Bochum, 4630 Bochum, Germany.
`713
`
`18
`
`
`
`18
`
`
`
`714
`
`B. YANG AND J. F. BOHME
`
`
`used. In a recent work [6], Delosme has also indicated this possibility in connection
`with “trough rotations” independently. He has taken, however,a different approachthat
`is based on encoding the rotation angles. He hasstill required four plane rotations on
`
`the off-diagonal PEs while diagonal and off-diagonal operations can be overlapped.
`
`Ourpaperis organized asfollows. In § 2, we briefly reexamine Jacobi’s SVD method
`and Brent and Luk’s SVDarray. Then, we develop the TPR methodin § 3. The CORDIC
`
`algorithm is described in § 4, where in particular CORDICscaling correction techniques —
`
`are discussed and examplesof scaling-corrected CORDIC sequencesare given.In § 5,a
`
`unified CORDIC SVD module for all PEs of the SVD array is presented. This module
`is comparedto those proposed by Cavallaro, Luk, and Delosmein§6. Finally, westress
`the applicability of the TPR method to several other problems.
`
`2. Jacobi SVD method. In this paper, we considerreal, square, and nonsymmetric
`matrices. Let M € R‘*” be a matrix of dimension N. The SVDis given by
`
`(1)
`
`M=U3V",
`
`
`where U € R**” and V € R*** are orthogonal matrices containing theleft and right
`singular vectors, and € R‘** is a diagonal matrix ofsingular values, respectively. The
`superscript 7 denotes matrix transpose. Based on an extension of the Jacobieigenvalue
`algorithm [7], Kogbetliantz [8] and Forsythe and Henrici [9] proposed to diagonalize
`M bya sequenceof two-sided rotations,
`
`(2)
`
`Mo=M,
`
`My.41= ULM, V;
`
`(k=0,1,2, +++).
`
`
`
`
`
`
`
`U,, and V; describe tworotations in the (/, /)-plane (1 S i <j S N), wheretherotation
`angles are chosen to annihilate the elements of MM, at the positions (i, /) and (J,i),
`
`Usually, several sweeps are necessary to complete the SVD, where a sweepis a sequence
`
`of N(N — 1)/2 two-sided rotations according to a special ordering of the N(N — 1)/2
`
`Bo
`different index pairs (7, /).
`For sequential computing on a uniprocessor system, possibly the most frequently
`used orderings are the cyclic orderings, namely, the cyclic row ordering
`
`d
`
`(¢
`
`di
`
`(2
`el
`ge
`
`ar
`ar
`sti
`
`ees
`
`(3)
`
`(2,J)=(1,2),C1,3), +++ C1), (2,3), +++ (25.4), +** CGV —-1,N)
`
`or the equivalent cyclic column ordering. Sameh [10] and Schwiegelshohnand Thiee
`[11] have shown how to implementthe cyclic row ordering on a ring-connected ora
`mesh-connected processor array. Recently, a variety of parallel orderings havebeende.
`veloped. Luk and Park [12] have shownthatthese parallel orderings are essentially equi.
`alent to the cyclic orderings and thus share the same convergenceproperties.
`Brent and Luk havesuggested a particular parallel ordering and developed a squat
`systolic array consisting of [N/2]X [N/21PEs for implementing the Jacobi SVD method
`(Fig. 1). To do this, the matrix
`is partitioned into 2 X 2 submatrices. Each PEcontains
`one submatrix and performs a two-sided rotation
`
`
`
`
`
`(4)
`
`where
`
`(3)
`
`B= R(0;)"AR(62),
`
`A=
`
`Qa;
`a2;
`
`a)2
`An?
`
`= bi bi
`and B=
`br,
`br»
`
`19
`
`R(
`
`the
`
`
`
`19
`
`
`
`TAS
`
`
`
`
`Fic. 1. The SVD array given by Brent and Luk.
`
` REDUCED SVD COMPUTATIONS AND HOMOGENEOUS SVD ARRAY
`
`describes a plane rotation through theangle 6. Atfirst, the diagonal PEs (symbolized by
`adouble square in Fig. 1 ) generate the rotation angles to diagonalize the 2 x 2 submatrices
`(bi) = by; = 0) stored in them. This meansthat 6, and 6, are first calculated from the
`dements of A and then relation (4) is used to compute 5,; and b2.. Wecall this the
`generation mode. Then,the rotation anglesare senttoall off-diagonal PEs in the following
`way: the angles associated to theleft-side rotations propagate along the rows while the
`angles associated to the right-side rotations propagate along the columns. Once these
`angles are received, the off-diagonal PEs perform the two-sided rotations (4) on their
`stored data. Wecall this the rotation mode. Clearly, if we compute the rotation mode
`straightforwardly, we require four plane rotations. For the generation mode, additional
`operations for calculating 6; and @2 are required.
`3. TPR method for computing two-sided rotations. In order to develop the TPR
`method for computing two-sided rotations more efficiently, we first discuss the com-
`mutative properties of two special types, the rotation-type and the reflection-type, of
`2X 2 matrices. We define
`
`denote the submatrix before and after the two-sided rotation, respectively, and
`
`(6)
`
`Ro) =(
`
`cos@ sin@
`
`—sin@ cosé
`
`ols
`
`x.ver|
`
`
`
`and Mal" ’)
`
`y =x
`
` x.yer,
`
`The formeris called rotation-type because it has the same matrix structure as a 2 X 2
`plane rotation matrix. Similarly, the latter is called reflection-type because it has the
`same matrix structure as a 2 X 2 Givensreflection matrix [13]. Note that x and y must
`not be normalized to x? + y? = 1. Using the above definitions, the following results can
`be shown by some elementary manipulations.
`LEMMA1. IfA, €.@™ and A, € M™, then A\ Az = A2A\ € aM,
`LEMMA 2. IfA,€.@™ and A, € M™, then A\Ay = ATA, E M™,
`In particular, if we consider two plane rotations, we know the following.
`LEMMA 3. If R(0,) and R(@2) are plane rotations described by (6),
`R(0;)R(82) = R(O; + 62) and R(6,)"R(62) = R62 — 41).
`Now,we give a theorem describing the rotation mode of the TPR method.
`THEOREM. If the 2 X 2 matrix A andthe two rotation angles 6, and 6are given,
`then the two-sided rotation (4) can be computed by two planerotations, ten additions,
`
`then
`
`
`
`20
`
`20
`
`
`
`716
`
`B. YANG AND J. F. BOHME
`
`andfour scalings by 3:
`(8)
`
`Di = (22+ a,;)/2,
`G1 = (2, — ay2)/2,
`6_=6.—6,,
`
`Dz = (@z2 — ay1)/2,
`G2 = (21 + ay2)/2,
`0, =6.+61,
`
`(Senn) (mel)
`
`ty
`
`1
`
`ly
`
`a
`
`bp=-t +b,
`bi =ri—h,
`12
`1
`2
`11
`1
`2
`by =ryit+n.
`by =ti th,
`Proof. Using (8), the matrix A can be reformulated as
`
`(9)
`
`wy
`
`(11)
`
`A=Ata(?! See oy
`N
`Pi
`42
`P2
`Clearly, R(6;), R(62) in (4) and A, are elements of .@'while Az belongs to .a@™, This
`leads to the following reformulation of the matrix B by using Lemmas 1-3:
`B= R(6)"AR(92)
`
`= R(6,)7A, R(62)+ R(6,)7A>R( 2)
`
`= R(6,)7R(62)A:+R(6,)7R(82)7Az
`
`= R(0.—6,)A; + R(02+6;)7Ay
`
`=R0(?" eRe( a
`
`Pi
`
`1
`
`1
`
`1
`
`2
`
`2
`
`
`
`This completes the proof.
`The generation mode of the TPR methodfollows directly from the abovetheorem,
`COROLLARY. If the 2 X 2 matrix A is given, we can diagonalize A andcalculate
`the corresponding rotation angles 0; and 0) by two Cartesian-to-polar coordinates con-
`versions, eight additions, andfour scalings by }:
`Di =(a22+ ay,)/2,
`P2=(a22—a1,)/2,
`41 = (421 — Qy2)/2,
`G2 = (21 + ay2)/2,
`n=sign(p,)Vpt+qi,
`m=sign (pr) Vp3+q3,
`6_ = arctan (q:/p1),
`94 = arctan (42/2),
`6, = (0, —6_)/2,
`62 = (0, + 0_)/2,
`
`
`21
`
`(12)
`
`(13)
`
`(14)
`
`(15)
`
`by =rjt+n.
`bi =r,
`Proof. Regarding (11), bi. = 6, = 0 is equivalent to t; = t = 0. Equation (13)
`follows then from (10). This completes the proof.
`In equation (13), we choose the rotation through the smaller angle. All vectors
`lyingin the first or the fourth quadrantare rotated onto the positive x-axis, andall vectors.
`lying in the second and the third quadrant are rotated onto the negative x-axis. For
`vectors on the y-axis, the rotation direction is arbitrary. Thus, the generatedrotation
`
`18)
`inw
`19)
`a
`Equ
`4 (20
`
`
`
`
`21
`
`
`
`REDUCED SVD COMPUTATIONS AND HOMOGENEOUS SVD ARRAY
`
`717
`
`angles 0_ and 6, satisfy |6_|, |6,| < 90°. This results in
`(16)
`|0,|S90°
`and
`|6,| $909,
`
`
`
`
`
`due to (14).
`Equation (16) is important with respect to the convergence of the Jacobi SVD
`method. Forsythe and Henrici [9] have proven the convergencefor cyclic orderings if
`the rotation angles 6, and 6, are restricted to a closed interval inside the open interval
`-90°, 90°). They have also demonstrated that this condition mayfail to hold,i.e., 0;
`and 6. may be +90°, if the off-diagonal elements b,. and bp; in (5) have to be exactly
`annihilated. As a remedy, they suggested an under- or overrotation by computing the
`
`two-sided rotation (4) with angles (1 — y)@, and (1 — y)@2 (—1 < y < 1) and proved
`
`its convergence. In practice, however, the finite machine accuracy in the real arithmetic
`
`allows only an approximative computation of the rotation angles and implies under- or
`
`overrotations. So the Jacobi SVD method converges without using under- or overrotations
`
`asshown by the experimental results of Brent, Luk, and Van Loan [3]. In case ofCORDIC
`implementations, the effect of implicit under- or overrotations is more apparent. The
`
`angles +90° can neverbe exactly calculated because of the limited angle resolution arc-
`tan(2~”) of the CORDICalgorithm, where p denotes the mantissa length.
`
`4. The CORDICalgorithm. In the previous section, we have seen that the main
`
`operations of the TPR-methodare plane rotations and Cartesian-to-polar coordinates
`conversions. These operations can be carried out by multiplier-adder-based processors
`
`supported by software or special hardware units. An alternative approach is the use of
`
`dedicated processors that usually map algorithms moreeffectively to hardware. The
`
`CORDIC processoris such a powerful onefor calculating trigonometric functions.
`
`The CORDICalgorithm was originally designed by Volder [14] as an iterative pro-
`
`cedure for computing plane rotations and Cartesian-to-polar coordinates conversions.It
`was later generalized and unified by Walther [15], enabling a CORDIC processor to
`
`calculate more functions, including hyperbolic functions, as well as multiplications and
`
`divisions. In the following, we consider Volder’s CORDICalgorithm because only trig-
`onometric functions are involved in SVD applications.
`The CORDICalgorithm consists of iterative shift-add operations on a three-com-
`ponent vector,
`
`
`
`(17) fr + ’ 7 (® — o6iVi\ |
`
`
`
`
`
`
`
`1
`
`cos(a;) —a; Sin (@;) a)
`
`
`
`
`
`
`
`Vi+1 cos (a;)\o; sin (a;)yi + 0;6;x;} cos (a) /\y;)’
`
`(18)
`
`Zis1 = 2) — €0;0;
`
`(0 <6;< ljo,=+lje=+1;1=0,1,°::,n-—1),
`
`in which the iteration stepsize 6, is defined by
`
`(19)
`
`6; = tan (a;)=2-.
`
`The set of integers {.S(i)} parametrizing the iterations is called CORDIC sequence.
`Equation (17) can be interpreted, except for a scaling factor of
`
`
`
`=V1+6?,
`
`
`
`
`
`a cos (a;)
`yo)
`
`asa rotation of (x;, y;)’ through the angle a;, where the sign o; = +1 gives the rotation
`direction. After 7 iterations, the results are given by
`
`a
`
`
`
`(22)
`
`x
`
`Vn
`
`fglane
`
`cosa —sina)\/
`
`sin a
`
`cos a/\
`
`Xo
`
`eee)
`
`yo
`
`Zn = Zo — €Q,
`
`22
`
`22
`
`
`
`718
`
`B. YANG AND J. F. BOHME
`
`with the overall scaling factor K = |], k; andthetotal rotation angle a = 2; o;a;. Now,
`if the CORDIC sequencesatisfies the following convergence condition
`n~—l
`a- D>, wSan-s
`j=itl
`
`(23)
`
`(i=0,1, +++ ,n—-2),
`
`we can choosethe sign parameter
`
`(24)
`
`Cj
`
`|—sign (x;y)
`
`;
`sign (e€Z;)
`
`fory,—>0,
`
`for z,—> 0
`
`to force y, or Z, to zero, provided that the input data x9, yo, and Zo lie in the conver-
`gence region
`
`(25)
`
`c=> a|
`
`nol
`i=0i=
`
`|Zo|
`
`arctan (yo/Xo)|
`
`oT
`
`fory,—0,
`
`for z,—> 0.
`
`
`
`
`
`
`
`In this way, two different types of CORDIC trigonometric functions can be computed
`(Table 1). In the mode y,, > 0, the Cartesian coordinate (Xo, yo) of a plane vectoris —
`converted to its polar representation, where the parameter e = +1 determinesthesign
`of the phase angle calculated. When z,, > 0, a given plane vectoris rotated through the
`angle zy, where e = +1 controls the rotation direction.
`In Table 1, the principal value | arctan (¥o/Xo)| S 90° of the inverse tangent function
`is calculated when computing Cartesian-to-polar coordinates conversions. Correspond-
`
`ingly, x,, may be positive or negative according to the sign of xo. So,it is guaranteedthat
`
`a vector is always rotated through the smaller angle onto the x-axis in accordancewith
`
`(13). In this case, a convergence region of C 2 90° is sufficient for the generation mode
`of the two-sided rotation.
`One main drawback of the CORDIC algorithm is the need of correcting thescal-
`ing factor K that arises during the iterations (17). For example, if we use Volder’s
`CORDICsequence
`
`
`
`
`
`
`
`with n = p + 1 CORDICiterations for a mantissa accuracy of 2”, the scaling factoris
`K = 1.64676. Compensating this undesired scaling effect with a minimum numberof
`computationsis of particular importance.
`Clearly, multiplying x, and y, in Table 1 by K' will degrade the algorithm per-
`formancesubstantially. Most of the scaling correction issues are based onshift-add oper-
`ations. For a two-sided rotation that is implemented by four plane rotations, each matrix
`
`element undergoes two plane rotations so that the total scaling factor to be corrected
`is K*. In this case, Cavallaro and Luk [16] have pointed out that there is a simple
`systematic approach for scaling correction when using the CORDIC sequence(26).
`They proposed to use [p/4] scaling iterations of the type x < x — 2-74x with je J=
`{1, 3, 5, +--+, 2.p/41— 1} and oneshift operation 27', The remainingscaling erroris
`
`(26)
`
`£S@)} = (0, 1,2,3, **: ,p- Lp},
`
`TABLE1
`
`CORDICtrigonometric functions (e = +1).
`
`
`
`
`
`
`Xn = K sign (xo) Vx0+ yo
`Zn = Zo + € arctan ()o/X)
`
`Xn = K(Xo COS Zo — EYSIN Zo)
`Yn = K(eXo Sin Zo + Yo COS Zo)
`
`23
`
`
`
`amdeadges
`
`23
`
`
`
`
`
`REDUCED SVD COMPUTATIONS AND HOMOGENEOUS SVD ARRAY
`
`719
`
`bounded by 2°? ',!
`
`(7)
`
`{1-27 J] (1-2-%)- kK?
`jes
`
` =[1-2" Ma -2)-Tpa +2 aoe!
`
`i=0
`
`jeJ
`
`
`
`a
`
`Doaa8
`
`j=1
`
`p=16:
`
`{S(i)}={0123 --- 1516},
`
`This approach, however, fails in the TPR method. Here, each matrix element un-
`dergoes only oneplanerotation. The scaling factor to be corrected is thus K rather than
`K°In order to solve this more difficult problem, different techniques have been developed
`inthe literature. Haviland and Tuszynski [17] used similar scaling iterations as Cavallaro
`and Luk. Ahmed [18] repeated some CORDICiterations to force K to a powerof the
`machine radix. Delosme [19] combined both methods of Haviland, Tuszynski, and
`_} Ahmed for minimizing the number of computations. Deprettere, Dewilde, and Udo [20]
`
`suggested the double-shift concept.
`for a systematic search of CORDIC
`We designed a computer program [21]
`
`sequences. We allow shifts parameters S(/) (i = 0, 1,--:, nm — 1) with differences
`S(i+ 1) — S(i) €
`{0, 1, 2} to provide more optimization freedom. For an efficient
`
`scaling correction, we require that the scaling factor K be corrected by a sequence of nz
`
`shift-add operations,
`a are I (1+m(j)2°7™)-K=1+AK—(T(j)integers, n(j)=+1).
`
`These additional scaling iterations are parametrized by the set of signed integers
`{T(0), n(1)T(1), +++ , n(n) T(n,)}. The total numberofiterations is L = n + ny.
`In (28), AK denotes the remainingrelative scalingerror after the scaling correction.
`
`_) We emphasize that this is a systematic error with a constantsign. By contrast, the other
`two types of CORDICerrors, the angular error due to the limited angle resolution and
`
`the rounding error, are of statistical nature because they may be positive or negative.
`
`The scaling error is thus morecritical with respect to error accumulation when repeated
`
`CORDIC operations on the same data have to be computed as in SVD applications.
`
`Roughly speaking,thetotal scaling error after k CORDIC functioncalls increases linearly
`
`with k, a fact that has been verified by our numerical experiments. For this reason, we
`
`require | AK | to be much smaller than 2°”.
`We found catalogues of CORDIC sequences with complexity comparable to those
`
`of Cavallaro and Luk.In the following, five examples for different mantissa lengths p =
`16, 20, 24, 28, and 32, including the total numberofiterations L = n + mj