`
`Aldana et al.
`In re Patent of:
`8,416,862
`
`U.S. Pat. No.:
`April 9, 2013
`Issue Date:
`Appl. Serial No.: 11/237,341
`Filing Date:
`September 28, 2005
`Title:
`EFFICIENT FEEDBACK OF CHANNEL INFORMATION IN
`A CLOSED LOOP BEAMFORMING WIRELESS
`COMMUNICATION SYSTEM
`
`Attorney Docket No.: 35548-0097IP1
`
`Declaration of Jacob Robert Munford
`
`1
`
`HUAWEI 1021
`
`
`
`1. My name is Jacob Robert Munford. I am over the age of 18, have personal
`
`knowledge of the facts set forth herein, and am competent to testify to the
`
`same.
`
`2. I earned a Master of Library and Information Science (MLIS) from the
`
`University of Wisconsin-Milwaukee in 2009. I have over ten years of
`
`experience in the library/information science field. Beginning in 2004, I
`
`have served in various positions in the public library sector including
`
`Assistant Librarian, Youth Services Librarian and Library Director. I have
`
`attached my Curriculum Vitae as Appendix A.
`
`3. During my career in the library profession, I have been responsible for
`
`materials acquisition for multiple libraries. In that position, I have cataloged,
`
`purchased and processed incoming library works. That includes purchasing
`
`materials directly from vendors, recording publishing data from the material
`
`in question, creating detailed material records for library catalogs and
`
`physically preparing that material for circulation. In addition to my
`
`experience in acquisitions, I was also responsible for analyzing large
`
`collections of library materials, tailoring library records for optimal catalog
`
`2
`
`
`
`
`
`
`
`
`
`search performance and creating lending agreements between libraries
`
`during my time as a Library Director.
`
`4. I am fully familiar with the catalog record creation process in the library
`
`sector. In preparing a material for public availability, a library catalog record
`
`describing that material would be created. These records are typically
`
`written in Machine Readable Catalog (herein referred to as “MARC”) code
`
`and contain information such as a physical description of the material,
`
`metadata from the material’s publisher, and date of library acquisition. In
`
`particular, the 008 field of the MARC record is reserved for denoting the
`
`date of creation of the library record itself. As this typically occurs during
`
`the process of preparing materials for public access, it is my experience that
`
`an item’s MARC record indicates the date of an item’s public availability.
`
`
`5. I have reviewed Exhibit EX1008, an article by B. Yang and J.F. Bohme
`
`entitled “Reducing The Computations of the Singular Value Decomposition
`
`Array Given By Brent and Luk” (hereto referred to as ‘Yang’) as presented
`
`in SIAM Journal On Matrix Analysis and Applications Volume 12, Issue 4.
`
`
`6. Attached hereto as YA01 is a true and correct copy of the spine, publication
`
`data, title page and complete ‘Yang’ from SIAM Journal On Matrix Analysis
`
`3
`
`
`
`
`
`
`
`and Applications from the University of Pittsburgh library. In comparing
`
`YA01 to Exhibit EX1008, it is my determination that Exhibit EX1008 is a
`
`true and correct copy of ‘Yang’.
`
`
`7. Attached hereto as YA02 is a true and correct copy of the MARC record
`
`describing SIAM Journal On Matrix Analysis and Application from the
`
`University of Pittsburgh’s library. I secured this record myself from the
`
`library’s online catalog. The 008 field of this MARC record indicates SIAM
`
`Journal On Matrix Analysis and Application was first cataloged by the
`
`University of Pittsburgh library as of September 9, 1987. The item holdings
`
`indicate this journal was held in perpetuity since September 1987. This item
`
`record also indicates the library’s collection includes the Volume 12, Issue 4
`
`publication of SIAM Journal On Matrix Analysis and Application containing
`
`“Yang”.
`
`
`8. The date stamp on page 4 of YA01 indicates this journal was processed by
`
`library staff as of November 1991. Considering this information in concert
`
`with the record data from YA02, it is my determination that the Volume 12,
`
`Issue 4 edition of SIAM Journal On Matrix Analysis and Application was
`
`made available and accessible to the public by the University of Pittsburgh
`
`library shortly after initial publication and certainly no later than November
`
`
`
`4
`
`
`
`1991. Based on journal availability, it is my determination that ‘Yang’ was
`
`made available and accessible to the public shortly after initial publication
`
`via SIAM Journal On Matrix Analysis and Application.
`
`9. I have been retained on behalf of the Petitioner to provide assistance in the
`
`above-illustrated matter in establishing the authenticity and public
`
`availability of the documents discussed in this declaration. I am being
`
`compensated for my services in this matter at the rate of $100.00 per hour
`
`plus reasonable expenses. My statements are objective, and my
`
`compensation does not depend on the outcome of this matter.
`
`10. I declare under penalty of perjury that the foregoing is true and correct. I
`
`hereby declare that all statements made herein of my own knowledge are
`
`true and that all statements made on information and belief are believed to
`
`be true; and further that these statements were made the knowledge that
`
`willful false statements and the like so made are punishable by fine or
`
`imprisonment, or both, under Section 1001 of Title 18 of the United States
`
`Code.
`
`5
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Dated: 7/31/19
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Jacob Robert Munford
`
`
`
`6
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`APPENDIX A
`
`APPENDIX A
`
`7
`
`
`
`Appendix A - Curriculum Vitae
`
`Education
`
`University of Wisconsin-Milwaukee - MS, Library & Information Science, 2009
`Milwaukee, WI
`● Coursework included cataloging, metadata, data analysis, library systems,
`management strategies and collection development.
`● Specialized in library advocacy and management.
`
`Grand Valley State University - BA, English Language & Literature, 2008
`Allendale, MI
`● Coursework included linguistics, documentation and literary analysis.
`● Minor in political science with a focus in local-level economics and
`government.
`
`Professional Experience
`
`Library Director, February 2013 - March 2015
`Dowagiac District Library
`Dowagiac, Michigan
`● Executive administrator of the Dowagiac District Library. Located in
`Southwest Michigan, this library has a service area of 13,000, an annual
`operating budget of over $400,000 and total assets of approximately
`$1,300,000.
`● Developed careful budgeting guidelines to produce a 15% surplus during the
`2013-2014 & 2014-2015 fiscal years.
`● Using this budget surplus, oversaw significant library investments including
`the purchase of property for a future building site, demolition of existing
`buildings and building renovation projects on the current facility.
`● Led the organization and digitization of the library's archival records.
`● Served as the public representative for the library, developing business
`relationships with local school, museum and tribal government entities.
`
`8
`
`
`
`● Developed an objective-based analysis system for measuring library services
`- including a full collection analysis of the library's 50,000+ circulating
`items and their records.
`
`
`November 2010 - January 2013
`Librarian & Branch Manager, Anchorage Public Library
`Anchorage, Alaska
`● Headed the 2013 Anchorage Reads community reading campaign including
`event planning, staging public performances and creating marketing
`materials for mass distribution.
`● Co-led the social media department of the library's marketing team, drafting
`social media guidelines, creating original content and instituting long-term
`planning via content calendars.
`● Developed business relationships with The Boys & Girls Club, Anchorage
`School District and the US Army to establish summer reading programs for
`children.
`
`June 2004 - September 2005, September 2006 - October 2013
`Library Assistant, Hart Area Public Library
`Hart, MI
`● Responsible for verifying imported MARC records and original MARC
`cataloging for the local-level collection as well as the Michigan Electronic
`Library.
`● Handled OCLC Worldcat interlibrary loan requests & fulfillment via
`ongoing communication with lending libraries.
`
`Professional Involvement
`
`Alaska Library Association - Anchorage Chapter
`● Treasurer, 2012
`
`Library Of Michigan
`● Level VII Certification, 2008
`● Level II Certification, 2013
`
`9
`
`
`
`Michigan Library Association Annual Conference 2014
`● New Directors Conference Panel Member
`
`Southwest Michigan Library Cooperative
`● Represented the Dowagiac District Library, 2013-2015
`
`Professional Development
`
`Library Of Michigan Beginning Workshop, May 2008
`Petoskey, MI
`● Received training in cataloging, local history, collection management,
`children’s literacy and reference service.
`
`Public Library Association Intensive Library Management Training, October 2011
`Nashville, TN
`● Attended a five-day workshop focused on strategic planning, staff
`management, statistical analysis, collections and cataloging theory.
`
`Alaska Library Association Annual Conference 2012 - Fairbanks, February 2012
`Fairbanks, AK
`● Attended seminars on EBSCO advanced search methods, budgeting,
`cataloging, database usage and marketing.
`
`10
`
`
`
`YA01
`
`YAOl
`
`11
`
`
`
`H||\||\\l|l\l|\ll||||WIN|I||||||l\Ill“Ml.Ill“lllillllllllllllilillll
`
`3 1735 050 7779
`
`12
`
`
`
`JOURNAL 0N
`
`MATRIX
`
`SIAM
`
`ANALYSIS AND
`
`APPLICATIONS
`
`13
`
`
`
`R08-M16-824-T06
`
`31735050777956
`WWWWWWWWWWWWMWWW
`Hillman GLFI. Lending
`Request ID: 432148
`PuH Date: 2819/87/23 113832
`Can Nod
`
`Title: SIGN journal on matrix analysi
`
`RED
`
`MUNFORD,JRCUB R
`ULScrtsyB
`2LBBB2BBB35853S
`Req.Date:2618/6P/23 1B:B?:43
`
`Do Not Remove This Wrapper
`
`
`
`University of Pittsburgh
`University Library System
`
`
`
`;
`{
`5
`
`:
`
`Storage
`Facility
`DO NOT
`
`14
`
`CIRCULATE
`_»
`Do NOT
`‘
`iIIIIIIIIIIIII‘aIII‘ilI‘lfliiliI-llllllllllhi-i
`
`14
`
`
`
`
`
`
`
`15
`
`
`
`SIAM JOURNAL ON
`
`Matrix Analysis
`and Applications
`
`VOLUME12
`
`1991
`
`
`Managing Editor
`Associate
`Managing Editor
`Editorial Board
`
`G. H. Golub
`
`R. J. Plemmons
`
`T. Ando
`A. Berman
`R. Brualdr
`J. Bunch
`A. Bunse-Gerstner
`P. J. Courtois
`G. Cybenko
`B. N. Datta
`Y. Genin
`J. R. Gilbert
`L. J. Gleser
`M. Goldberg
`W. B. Gragg
`
`A. Greenbaum
`M. Gutknecht
`S. Hammarling
`N. J. Higham
`R. A. Horn
`B. Kégstrém
`J. Kautsky
`P. Lancaster
`J. W. H. Liu
`F. T. Luk
`T. A. Manteuffel
`J. S. Maybee
`R. L. Merris
`
`U. G. Rothblum
`K. Sigmon
`G. Strang
`P. van Dooren
`C. Van Loan
`R. S. Varga
`A. Watson
`H. Weinberger
`
`_\
`
`Copyright 1991 by the 800er for Industrial and Applied Mathematics, Philadelphia. Pennsylvanla.
`
`16
`
`16
`
`
`
`
`
`SIAM
`
`JOURNAL ON
`
`
`Matrix Analysis
`and Applications
`
`OCTOBER 1 991
`Volume 12, Number 4
`
`
`Managing Editor
`Associate
`Managing Editor
`Editorial Board
`
`G. H. Golub
`
`R. J. Plemmons
`
`T. Ando
`A. Berman
`R. Brualdi
`J. Bunch
`A. Bunse-Gerstner
`P. J. Courtois
`G. Cybenko
`B. N. Datta
`Y. Genin
`J. R. Gilbert
`L. J. Gleser
`M. Goldberg
`W. B. Gragg
`
`A. Greenbaum
`M. Gutknecht
`S. Hammarling
`N. J. Higham
`R. A. Horn
`B. Kégstrom
`J. Kautsky
`P. Lancaster
`J. W. H. Liu
`F. T. Luk
`T. A. Manteulfel
`J. S. Maybee
`R. L. Merris
`
`C. Meyer
`N. K. Nichols
`l. Olkin
`J. S. Pang
`U. G. Rothblum
`K. Sigmon
`G. Strang
`P. van Dooren
`C. Van Loan
`R. S. Varga
`A. Watson
`H. Weinberger
`
`——-————_
`
`Publisher: Vickie H. Kearn. Production Manager: J. Corey Gray. Managing Editon Tricia Manning. Senior
`Production Editor: Crystal G. Norris. Copy Editor: Beth Gallagher. Editorial Assistant: David Livewell. Pro-
`duction Coordinator: Nancy H. Abbott. Reprints Coordinator: Jill M. Davis.
`
`SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS (ISSN 0895-4798) is published quarterly in January.
`April. July, and October by the Society for Industrial and Applied Mathematics. 3600 University City Science
`Center. Philadelphia. PA 191042688. Second-class postage paid at Philadelphia. Pennsylvania and additional
`mailing offices. POSTMASTER: Send address change to SIAM Journal on Matrix Analysis and Applications.
`3600 University City Science Center. Philadelphia, PA 19104-2688.
`
`Subscriptions: Annual subscription list price: $130.00 (domestic), $155.00 (overseas); annual individual member
`subscription price: $40.00 (domestic). $43.00 (overseas). Subscriptions are available on a calendaryear basis
`only. Send orders to Customer Service, SIAM. 3600 University City Science Center. Philadelphia. PA 19104-2688.
`In Japan send orders to USACO Corporation. 13—12. Shimbashi. 1-chome. Minato»ku. Tokyo 105. tel. O3(502)-
`6471 . All back volumes are available; prices will be provided on request. Manuscript Submissions: See Instructions
`to Authors on page iv of this journal. Advertising: Accepted for all publications. Inquiries should be directed to
`the Marketing Manager. SIAM Publications, 3600 University City Science Center. Philadelphia. PA 19104-2688.
`
`Authorization to photocopy items for internal and personal use. or the internal use 0t specific clients. is granted
`by the Society lor Industrial and Applied Mathematics for libraries and other users registered with the Copyright
`Clearance Center (CCC) Transactional Reporting Service provided that the base tee of $1 .50 per copy plus $0.10
`per page is paid directly to CCC. 21 Congress Street. Salem. Massachusetts 01970.
`
`TAPSCO. Inc. Akron. Pennsylvania. compositor; Lancaster Press. Lancaster. Pennsylvania. printer.
`The SIAM Journal on Matrix Analysis and Applications is a continuation of the SIAM Journal on Algebraic and
`Discrete Methods (ISSN 0196-5212).
`
`Copyright 1991 by the Society for Industrial and Applied Mathematics. Philadelphia. Pennsylvania.
`513111. is a registered trademark.
`
`%
`
`17
`
`17
`
`
`
`
`
`
`
`
`
`
`SIAM J. MATRIX ANAL. APPL.
`Vol, 12. No, 4. pp. 713—725. October I991
`
`© l99l Society for Industrial and Applied Mathematics
`009
`
`REDUCING THE COMPUTATIONS OF THE SINGULAR VALUE
`DECOMPOSITION ARRAY GIVEN BY BRENT AND LUK*
`
`B. YANGT AND J. F. BOHMET
`
`Abstract. A new, efficient, two-plane rotation (TPR) method for computing two-sided rotations involved
`in singular value decomposition (SVD) is presented. It is shown that a two-sided rotation can be evaluated by
`only two plane rotations and a few additions. This leads to significantly reduced computations. Moreover, if
`coordinate rotation digital computer (CORDIC) processors are used for realizing the processing elements (PEs)
`of the SVD array given by Brent and Luk, the computational overhead of the diagonal PEs due to angle
`calculations can be avoided. The resulting SVD array has a homogeneous structure with identical diagonal and
`oil-diagonal PEs. Similar results can also be obtained if the TPR method is applied to Luk’s triangular SVD
`array and to Stewart’s Schur decomposition array.
`
`Key words. singular value decomposition, systolic arrays, CORDIC, two-sided rotations, VLSI
`
`AMS(MOS) subject classification. 15Al8
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`1. Introduction. One important problem in linear algebra and digital signal pro-
`cessing is the singular value decomposition (SVD). Typical applications arise in beam-
`forming and direction finding, spectrum analysis, digital image processing, etc. [1]. Re-
`cently, there has been a massive interest in parallel architectures for computing SVD
`because of the high computational complexity of SVD, the growing importance of real-
`time signal processing, and the rapid advances in very large scale integration (VLSI) that
`make low-cost, high-density and fast processing memory devices available.
`There are different numerically stable methods for computing complete singular
`value and singular vector systems of dense matrices, for example, the Jacobi SVD method,
`the QR method, and the one-sided Hestenes method. For parallel implementations, the
`Jacobi SVD method is far superior in terms of simplicity, regularity, and local com-
`munications. Brent, Luk, and Van Loan have shown how the Jacobi SVD method with
`parallel ordering can be implemented by a two-dimensional systolic array [2] , [3]. Various
`coordinate rotation digital computer (CORDIC) realizations of the SVD array have been
`,. reported by Cavallaro and Luk [4] and Delosme [5], [6].
`The Jacobi SVD method is based on, as common for all two-sided approaches,
`
`applying a sequence of two-sided rotations to 2 X 2 submatrices of the original matrix.
`The computational complexity is thus determined by how to compute the two-sided
`rotations. In most previous works, a two-sided rotation is evaluated in a straightforward
`manner by four plane rotations, where two of them are applied from left to the two
`column vectors of the 2 X 2 submatrix and the other ones are applied from right to the
`row vectors, respectively. In the diagonal processing elements (PEs), additional operations
`for calculating rotation angles are required. This leads to an inhomogeneous array ar-
`chitecture containing two different types of PBS.
`In this paper, we develop a two-plane rotation (TPR) method for computing two-
`sided rotations. We show that the above computational complexity can be reduced sig-
`nificantly because each two-sided rotation can be evaluated by only two plane rotations
`and a few additions. Moreover, the SVD array given by Brent and Luk becomes ho-
`mogeneous with identical diagonal and off-diagonal PEs when CORDIC processors are
`
`
`
`
`
`‘ Received by the editors September 28, 1989; accepted for publication (in revised form) August 2, 1990.
`T Department of Electrical Engineering, Ruhr-Universitat Bochum, 4630 Bochum, Germany.
`713
`
`18
`
`18
`
`
`
`714
`
`B. YANG AND J. F. BéHME
`
`
`used. In a recent work [6], Delosme has also indicated this possibility in connection
`with “rough rotations” independently. He has taken, however, a different approach that
`is based on encoding the rotation angles. He has still required four plane rotations on
`
`the off-diagonal PEs while diagonal and off-diagonal operations can be overlapped.
`
`Our paper is organized as follows. In § 2, we briefly reexamine J acobi’s SVD method
`and Brent and Luk’s SVD array. Then, we develop the TPR method in § 3. The CORDIC
`
`algorithm is described in § 4, where in particular CORDIC scaling correction techniques .
`
`are discussed and examples of scaling-corrected CORDIC sequences are given. In § 5,a
`
`unified CORDIC SVD module for all PEs of the SVD array is presented. This module
`is compared to those proposed by Cavallaro, Luk, and Delosme in § 6. Finally, we stres
`the applicability of the TPR method to several other problems.
`
`2. Jacobi SVD method. In this paper, we consider real, square, and nonsymmetric
`matrices. Let M 6 RN" N be a matrix of dimension N. The SVD is given by
`
`( 1 )
`
`M = U2 V T,
`
`where U e RNXN and V 6 R” X” are orthogonal matrices containing the left and right
`singular vectors, and 23 6 R” x N is a diagonal matrix of singular values, respectively. The
`
`superscript Tdenotes matrix transpose. Based on an extension of the Jacobi eigenvalue
`algorithm [7], Kogbetliantz [8] and Forsythe and Henrici [9] proposed to diagonalize
`M by a sequence of two-sided rotations,
`
`
`
`(2)
`
`M0=M,
`
`Mk+1=UliMka
`
`(k=0,l,2,--').
`
`
`
`
`
`
`
`
`Uk and Vk describe two rotations in the (i, j )-plane ( 1 § 1' < j é N), where the rotation
`angles are chosen to annihilate the elements of Mk at the positions (i, j) and (j, 1‘).
`Usually, several sweeps are necessary to complete the SVD, where a sweep is a sequence
`
`of N(N - l )/ 2 two-sided rotations according to a special ordering of the N(N ~ l)/2
`
`different index pairs (i, j).
`,0;
`For sequential computing on a uniprocessor system, possibly the most frequently
`used orderings are the cyclic orderings, namely, the cyclic row ordering
`
`(‘
`
`d4
`
`(1
`cl
`
`8C
`
`at
`at
`st<
`
`5’88
`
`
`
`
`
`
`(3)
`
`(i,j)=(l,2),(1,3),
`
`,(1,N).(2,3),
`
`,(2.N),
`
`,(N— lJV)
`
`or the equivalent cyclic column ordering. Sameh [10] and Schwiegelshohn and Thiele
`[1 l] have shown how to implement the cyclic row ordering on a ring-connected ora
`mesh-connected processor array. Recently, a variety of parallel orderings have beende
`veloped. Luk and Park [ 12] have shown that these parallel orderings are essentially equiv-
`alent to the cyclic orderings and thus share the same convergence properties.
`Brent and Luk have suggested a particular parallel ordering and developed a square
`systolic array consisting of [N/ 2] X [N/ 2] PBS for implementing the Jacobi SVD method
`(Fig. 1 ). To do this, the matrix M is partitioned into 2 X 2 submatrices. Each PE contains
`one submatrix and performs a two-sided rotation
`
`(4)
`
`where
`
`5
`
`(
`
`)
`
`B=R(01)TAR(02).
`
`A =
`
`a“
`(C121
`
`an
`022)
`
`and B:
`
`bl]
`(b2:
`
`blZ)
`bzz
`
`19
`
`R(
`
`
`
`19
`
`
`
`
`
`REDUCED SVD COMPUTATIONS AND HOMOGENEOUS SVD ARRAY
`
`715
`
`
`
`
`FIG. 1. The SVD array given by Brent and Luk.
`
`denote the submatrix before and after the two-sided rotation, respectively, and
`
`(6)
`
`R(0)=(
`
`cos 0
`
`sin 6
`
`—sin 6
`
`cos 6
`
`describes a plane rotation through the angle 6. At first. the diagonal PEs (symbolized by
`adouble square in Fig. l ) generate the rotation angles to diagonalize the 2 X 2 submatrices
`(bn = b2, = 0) stored in them. This means that 0. and 62 are first calculated from the
`elements of A and then relation (4) is used to compute b“ and bzz. We call this the
`generation mode. Then, the rotation angles are sent to all off-diagonal PBS in the following
`way: the angles associated to the left-side rotations propagate along the rows while the
`angles associated to the right-side rotations propagate along the columns. Once these
`angles are received, the off—diagonal PEs perform the two-sided rotations (4) on their
`stored data. We call this the rotation mode. Clearly, if we compute the rotation mode
`straightforwardly, we require four plane rotations. For the generation mode, additional
`operations for calculating 61 and 02 are required.
`
`3. TPR method for computing two-sided rotations. In order to develop the TPR
`method for computing two-sided rotations more efficiently, we first discuss the com-
`mutative properties of two special types, the rotation-type and the reflection-type, of
`2X 2 matrices. We define
`
`(7)
`
`M“°={( x y)x,y€R]
`
`
`—y x
`
`and mwf=[(x
`
`y —x
`
`y)
`
` x,yeR].
`
`The former is called rotation-type because it has the same matrix structure as a 2 X 2
`plane rotation matrix. Similarly, the latter is called reflection-type because it has the
`same matrix structure as a 2 X 2 Givens reflection matrix [13 ]. Note that x and y must
`not be normalized to x2 + y2 = 1. Using the above definitions, the following results can
`be shown by some elementary manipulations.
`LEMMA 1. IfAl e of!“ and A2 6 all“, then AA; = A2Al 6 Jim.
`LEMMA 2. IfA. e ear“ and A2 6 all“, then AlAz = AzTA. e all”.
`In particular, if we consider two plane rotations, we know the following.
`LEMMA 3. If R(0.) and R(62) are plane rotations described by (6),
`R(01)R(02) = R(01 + 02) and R(01)TR(02) = R(02 — 01)-
`Now, we give a theorem describing the rotation mode of the TPR method.
`THEOREM. If the 2 X 2 matrix A and the two rotation angles 0. and 62 are given,
`then the two-sided rotation (4) can be computed by two plane rotations, ten additions,
`
`then
`
`
`
`20
`
`20
`
`
`
`716
`
`B. YANG AND J. F. BoHME
`
`andfour scalings by %:
`
`(8)
`
`(9)
`
`(11)
`
`P1=(022+an)/2,
`€11=(021—6112)/2,
`
`p2=(022—an)/2,
`42=(azr+aiz)/2,
`
`0—=02—0I9
`
`6+=02+012
`
`(“new (“W“),
`
`(11
`
`[2
`
`£12
`
`t]
`
`b=r—r,
`ll
`1
`2
`
`b=—-t+t,
`12
`1
`2
`
`b21=ti+t2,
`
`b22=rl+r2-
`
`Proof Using (8), the matrix A can be reformulated as
`
`(12).
`-q.)+(—pz
`AZA1+A2=(p1
`Clearly, R (01 ), R(02) in (4) and A, are elements of a” '°‘ while A2 belongs to 11'“. This
`leads to the following reformulation of the matrix B by using Lemmas 1—3:
`
`‘1]
`
`P1
`
`42
`
`P2
`
`B=R(0,)TAR(02)
`
`= RU),)TA,R(62)+R(0,)TA2R(02)
`
`= R(0.)TR(02)Al +R(01)TR(62)TA2
`
`=R(02—01)A1+R(62+01)TA2
`
`=R(0_)(pl
`
`41
`
`_(I1)+R(0+)T(—P2
`
`I71
`
`42
`
`(12)
`
`172
`
`r
`
`= ( l
`
`t 1
`
`—l
`
`1 ) + (
`
`r 1
`
`—
`
`r2
`
`12
`
`t
`
`2).
`
`’2
`
`
`
`This completes the proof.
`The generation mode of the TPR method follows directly from the above theorem. ‘
`COROLLARY. If the 2 X 2 matrix A is given, we can diagonalize A and calculate '
`the corresponding rotation angles 0. and 02 by two Cartesian-to-polar coordinates con~
`versions, eight additions, andfour scalings by %:
`
`(12)
`
`(13)
`
`(14)
`
`(15)
`
`Pi:(022+an)/2,
`
`P2=(022_a11)/2,
`
`412(021‘012)/2,
`r1=sign(p1)Vpi+qi,
`0— = arctan (41/111),
`01=(0+—0—)/2,
`
`bllzr1_r2a
`
`q2=(a2,+a12)/2,
`rz=sign(p2)Vp%+q%,
`0+ = arctan (612/122),
`02=(0++0_)/2,
`
`b22:rl+r2-
`
`(18)
`finw
`(19)
`Th
`.'Equ
`'-(20‘
`
`‘
`
`Proof. Regarding (11), bn = b2] = 0 is equivalent to [1 = 12 = 0. Equation (13):
`follows then from ( 10). This completes the proof.
`:
`In equation (13), we choose the rotation through the smaller angle. All vectors
`lying in the first or the fourth quadrant are rotated onto the positive x—axis, and all vectors,
`lying in the second and the third quadrant are rotated onto the negative x—axis. For.
`vectors on the y-axis, the rotation direction is arbitrary. Thus, the generated rotation.
`
`
`
`
`
`21
`
`21
`
`
`
`REDUCED SVD COMPUTATIONS AND HOMOGENEOUS SVD ARRAY
`
`717
`
`mgles 6_ and 6+ satisfy |6_ | ,
`
`|6+| é 90°. This results in
`
`(16)
`
`l0.| §90°
`
`and
`
`l02| é90°,
`
`
`
`
`
`due to ( 14).
`Equation (16) is important with respect to the convergence of the Jacobi SVD
`method. Forsythe and Henrici [9] have proven the convergence for cyclic orderings if
`the rotation angles 0, and 02 are restricted to a closed interval inside the open interval
`(-90°, 90°). They have also demonstrated that this condition may fail to hold, i.e., 01
`1and 02 may be :90°, if the off-diagonal elements blz and 1721 in (5) have to be exactly
`annihilated. As a remedy, they suggested an under- or overrotation by computing the
`
`two-sided rotation (4) with angles (1 — 7)01 and (l — 7)02 (—1 < 7 < 1) and proved
`
`itsconvergence. In practice, however, the finite machine accuracy in the real arithmetic
`
`allows only an approximative computation of the rotation angles and implies under— or
`
`overrotations. So the Jacobi SVD method converges without using under- or overrotations
`asshown by the experimental results of Brent, Luk, and Van Loan [3]. In case of CORDIC
`implementations, the effect of implicit under— or overrotations is more apparent. The
`angles i90° can never be exactly calculated because of the limited angle resolution arc—
`tan (2””) of the CORDIC algorithm, where p denotes the mantissa length.
`
`
`
`4. The CORDIC algorithm. In the previous section, we have seen that the main
`
`operations of the TPR-method are plane rotations and Cartesian-to-polar coordinates
`conversions. These operations can be carried out by multiplier—adder—based processors
`
`supported by software or special hardware units. An alternative approach is the use of
`
`dedicated processors that usually map algorithms more effectively to hardware. The
`
`CORDIC processor is such a powerful one for calculating trigonometric functions.
`
`The CORDIC algorithm was originally designed by Volder [14] as an iterative pro-
`
`cedure for computing plane rotations and Cartesian-to-polar coordinates conversions. It
`was later generalized and unified by Walther [15], enabling a CORDIC processor to
`calculate more functions, including hyperbolic functions, as well as multiplications and
`divisions. In the following, we consider Volder’s CORDIC algorithm because only trig-
`onometric functions are involved in SVD applications.
`The CORDIC algorithm consists of iterative shift-add operations on a three-com-
`ponent vector,
`
`
`
`
`
`
`
`
`
`
`
`(18)
`
`xi + 1
`
`yi + 1
`
`(x,- — ciéiyi) =
`
`y; + aiéixi
`
`
`
`1
`
`cos (01,-)
`
`(
`
`cos(a,-) —a,- sin (ai) )(x[)
`
`cos (at)
`
`y,-
`
`a,- sin ((1,)
`
`z,-+.=z,-—ea,-a,-
`
`(0<6,-<l;a,-=:l;e=il;i=0,l,---,n—~1),
`
`5,-=tan(a,-)=2—S(”.
`
`k,=
`
`=V1+5,2,
`
`
`
`(22)
`
`x"
`
`yn
`
`(
`
`cos a —sin oz
`
`)=K(-
`
`sm :1
`
`
`in which the iteration stepsize 6,- is defined by
`
`(19)
`
`The set of integers {S(i)} parametrizing the iterations is called CORDIC sequence.
`Equation ( 17) can be interpreted, except for a scaling factor of
`
`
`(20)
`1
`
`cos (01,)
`
`asa rotation of (xi, y,)T through the angle a., where the sign a,- = i1 gives the rotation
`direction. After n iterations, the results are given by
`
`<2”
`
`
`)(
`
`x0
`
`y0
`
`)’
`
`cos a
`
`2,. = 20 — ea,
`
`22
`
`22
`
`
`
`l
`
`718
`
`B. YANG AND J. F. Bot—{ME
`
`with the overall scaling factor K = H,- k,- and the total rotation angle a = Z,- 0.01;. Now,
`if the CORDIC sequence satisfies the following convergence condition
`"—1
`
`(23)
`
`(Xi_ Z a,-§a,,-l
`j= i+ l
`
`(i=0,l,"',n_2),
`
`we can choose the sign parameter
`
`fOFJ/n‘*0,
`
`for 2,, -> 0
`
`[—sign (xiyl)
`
`ign (52,-)
`
`,S
`
`(24)
`
`0i =
`
`to force y" or 2,, to zero, provided that the input data x0, yo, and 20 he in the conver-
`gence region
`
`(25)
`
`C= Z on;
`(=0
`
`'1“
`
`{larctan (yo/x0)|
`
`IZol
`
`fern—>0,
`
`forzn—>0.
`
`
`
`
`
`
`
`In this way, two different types of CORDIC trigonometric functions can be computed
`(Table 1). In the mode y,, —> 0, the Cartesian coordinate (x0, yo) of a plane vector is
`converted to its polar representation, where the parameter c = i1 determines the sign
`of the phase angle calculated. When z,I —> 0, a given plane vector is rotated through the
`angle 20, where e = i1 controls the rotation direction.
`In Table l, the principal value larctan (yo/xo)| é 90° of the inverse tangent function
`is calculated when computing Cartesian-to-polar coordinates conversions. Correspond-
`
`ingly, x" may be positive or negative according to the sign of x0. So, it is guaranteed that
`
`a vector is always rotated through the smaller angle onto the x—axis in accordance with
`
`(13). In this case, a convergence region of C 2 90° is sufficient for the generation mode
`of the two-sided rotation.
`
`One main drawback of the CORDIC algorithm is the need of correcting the seal-
`ing factor K that arises during the iterations (17). For example, if we use Volder’s
`CORDIC sequence
`
`(26)
`
`{S(i)}={0,1,2,3,"',p—l,p},
`
`
`
`
`with n 2 p + l CORDIC iterations for a mantissa accuracy of 2‘”, the scaling factoris
`K z 1.64676. Compensating this undesired scaling effect with a minimum number of
`computations is of particular importance.
`
`Clearly, multiplying x" and y" in Table l by K‘1 will degrade the algorithm per-
`formance substantially. Most of the scaling correction issues are based on shift-add oper-
`ations. For a two-sided rotation that is implemented by four plane rotations, each matrix
`
`element undergoes two plane rotations so that the total scaling factor to be corrected
`
`is K2. In this case, Cavallaro and Luk [16] have pointed out that there is a simple
`systematic approach for scaling correction when using the CORDIC sequence (26).
`
`They proposed to use [12/41 scaling iterations of the type x <— x — 2‘ij with j e] =
`{1, 3, 5,
`-
`-
`-
`, Zip/41 — l} and one shift operation 2“. The remaining scaling erroris
`
`TABLE 1
`
`CORDIC trigonometric functions (a = i1).
`
`
`
`
`
`
`x" = K(xo cos 2.; — cyo sin 20)
`y,, = K(exo sin 20 + yo cos 20)
`
`x" = K sign (X0) VX%+y%
`2,, = 20 + e arctan (yo/x0)
`
`23
`
`
`
`23
`
`
`
`
`
`REDUCED SVD COMPUTATIONS AND HOMOGENEOUS SVD ARRAY
`
`719
`
`bounded by 2'” ‘ 1,1
`
`(27)
`
`1-2—'1‘[(1—2—21)-K2
`jEJ
`
`
`
`‘1—2—'1‘[(1—2—21‘)-fi(1+2—2")<21“.
`
`[:0
`
`jeJ
`
`
`
`This approach, however, fails in the TPR method. Here, each matrix element un-
`dergoes only one plane rotation. The scaling factor to be corrected is thus K rather than
`K2. In order to solve this more difficult problem, different techniques have been developed
`in the literature. Haviland and Tuszynski [17] used similar scaling iterations as Cavallaro
`and Luk. Ahmed [18] repeated some CORDIC iterations to force K to a power of the
`machine radix. Delosme [19] combined both methods of Haviland, Tuszynski, and
`' Ahmed for minimizing the number of computations. Deprettere, Dewilde, and Udo [20]
`
`suggested the double-shift concept.
`We designed a computer program [2]] for a systematic search of CORDIC
`
`sequences. We allow shifts parameters S(i) (i = 0,
`l,
`, n — l) with differences
`S(i + l) — S(i) e {0, 1, 2} to provide more optimization freedom. For an efficient
`
`scaling correction, we require that the scaling factor K be corrected by a sequence of m,
`shift-add operations,
`
`24“” Ti (1+ HOW—Tm) ‘ K: 1 + AK
`i=1
`
`(T(j)int€gers,
`
`11(1') = 1‘1)-
`
`5
`
`(nus-o'-
`
`
`(23)
`
`These additional scaling iterations are parametrized by the set of signed integers
`{T(0), n( 1)T( 1), ~-- , n(nk)T(nk)}. The total number of iterations is L = n + nk.
`
`In (28), AK denotes the remaining relative scaling error after the scaling correction.
`
`. We emphasize that this is a systematic error with a constant sign. By contrast, the other
`two types of CORDIC errors, the angular error due to the limited angle resolution and
`
`the rounding error, are of statistical nature because they may be positive or negative.
`
`The scaling error is thus more critical with respect to error accumulation when repeated
`
`CORDIC operations on the same data have to be computed as in SVD applications.
`
`R