throbber
United States Patent 19
`McPherson et al.
`
`54 METHOD AND SYSTEM OF ROUNDING
`FOR DIVISION OR SQUARE ROOT:
`ELMINATING REMANDER
`CALCULATION
`
`(75) Inventors: Thomas Joseph McPherson; Eric
`Mark Schwarz, both of Gardiner, N.Y.
`73) Assignee: International Business Machines
`Corporation, Armonk, N.Y.
`
`21
`22
`51
`52
`58
`
`Appl. No.: 614,561
`Mar 13, 1996
`Fled:
`...................... G06F 7/38
`Int. Cl.
`... 364/748.03; 364,748.05
`U.S. C. ...........
`Field of Search ........................ 364/748.03, 748.05.
`364f715.04
`
`56
`
`References Cited
`U.S. PATENT DOCUMENTS
`4,878,190 10/1989 Darley et al..
`5,212,661
`5/1993 Taniguchi.
`5,341,321
`8/1994 Karp et al.,
`5,373,461 12/1994 Bearden et al. ................. 364/715.04
`5,511,016 4/1996 Béchade .........
`... 364/48.03
`5,627,773 5/1997 Wolrich et al. ........ 364/748.05
`OTHER PUBLICATIONS
`"Enterprise Systems Architecture/390 Principles of Opera
`tion". Order No. SA22-7201-02, available through IBM
`branch offices, 1990.
`Goldberg. D, "IEEE Standard for Binary Floating-Point
`Arithmetic ANSI/IEEE Std. 754-1985'. The Institute of
`Electrical and Electronic Engineers, Inc., New York, Aug.
`1985.
`Goldschmidt, RE, "Applications of Division by Conver
`gence", Master's thesis, M.I.T., Jun. 1964.
`
`USOO5764555A
`Patent Number:
`11
`45 Date of Patent:
`
`5,764,555
`Jun. 9, 1998
`
`Anderson, SF et al., “The IBM System/360 Model 91:
`Floating-Point Execution Unit". IBM Journal of Research
`and Development, 11(1): 34-53, Jan. 1967.
`Waser, S et al. "Introduction to Arithmetic for Digital
`System Designers", CBS College Publishing, New York,
`1982, Chapter 5.
`Flynn, MJ, "On Division by Functional Iteration", IEEE
`Transactions on Computers, vol. C-19, No. 8, Aug. 1970,
`pp. 702-706.
`Dao-Trong Set al., “A single-chip IBM System/390 float
`ing-point processor in CMOS". IBM Journal of Research
`and Development, vol. 36, No. 4, Jul. 1992, pp. 733-749.
`Markstein, PW. “Computation of elementary function on the
`IBM RISC System/6000 processor". IBM Journal of
`Research and Development, vol. 34, No. 1. Jan. 1990, pp.
`111-119.
`Markstein. PWet al., "Adaptation of Floating Point Division
`to Fixed Point Arithmetic'. IBM Technical Disclosure Bul
`letin, vol. 36, No. 06B, Jun. 1993, pp. 529-533.
`Primary Examiner Tan V. Mai
`Attorney, Agent, or Firm-Lynn L. Augspurger
`57
`ABSTRACT
`A method and system which provides exactly rounded
`division and square root results for a designated rounding
`mode independently of a remainder, or equivalent calcula
`tion of the relationship between the remainder and Zero, for
`predetermined combinations of the rounding mode and the
`guard digit of an estimate that has several more bits of
`precision than the exactly rounded result, and has an error
`tolerance magnitude less than the weight of the least sig
`nificant bit of the estimate. The estimate is generated in
`accordance with a quadratically converging division or
`square root algorithm. The method and system is described
`in connection with IEEE 754-1985 and IBM S/390 binary
`floating point architectures.
`
`21 Claims, 9 Drawing Sheets
`
`-10
`
`A-02
`
`q' (BIS)
`
`TERENE of Nik Ts-00
`- K=
`(-N-
`
`RUCE -
`
`I
`
`-- sis
`
`
`
`FR
`
`For
`FOR RED
`Road grisle
`C
`E St.
`NAREST SAME
`a-be-tyws
`SS
`TRUNCATE
`
`4
`
`ARD
`
`CACUAE
`RENER
`
`LzLabs GmbH. Ex. 1021-1
`
`

`

`U.S. Patent
`
`Jun. 9, 1998
`
`Sheet 1 of 9
`
`5,764,555
`
`GENERATE N+G BT ESTMATE:
`Qit
`WHERE
`
`
`
`150
`
`FIG. 1A
`
`EXACTLY ROUND Qll
`
`160
`
`
`
`www.000
`
`www.100
`
`ROUND TO NEARES
`xxx.000
`
`xxx. 100
`
`yyy.000
`
`202-17
`
`TRUNCATE
`
`INCREMENT
`--213 --216
`p--O O--6 o--6
`re 214
`-217
`210-1
`p-of-free
`
`
`
`212-1
`
`...
`
`224-N
`
`REMAINDER
`COMPARISON
`CASE
`
`
`
`
`
`
`
`
`
`LzLabs GmbH. Ex. 1021-2
`
`

`

`U.S. Patent
`
`Jun. 9, 1998
`
`Sheet 2 of 9
`
`5,764,555
`
`DETERMINE Q' (N + k BITS-100
`IQ-Q' <= 2* (-N-j)
`
`INCREMENT Q' BY 2* (-(N+G+1))
`
`101
`
`102
`
`TRUNCATE Q' TO N+G BITS
`-2* (-N-G-1)-2* (-N-j) <= Q-Q" <= +2*(-N-G-1) +2*(-N-j)
`GIVEN j > G+1
`-2' (-N-G) < Q-Q" < +2*(-N-G)
`
`Q' (N BITS)
`
`FOR
`ROUND
`
`NEAREST
`
`FOR
`115
`FOR
`ROUND TO
`INFINITY INNITYOPOSETRUNCATEINCREMENTDECREMENT
`SIGN, ZERO, OR
`TRUNCATE
`
`
`
`
`
`106
`
`NCREMENT
`INCREMENT
`FIG.1B
`
`DECREMENT
`
`116
`
`MULTIPLEXOR
`
`LzLabs GmbH. Ex. 1021-3
`
`

`

`U.S. Patent
`
`Jun. 9, 1998
`
`Sheet 3 of 9
`
`5,764,555
`
`
`
`00Z ----
`
`
`
`000 KKK 99°** 000xxx
`
`
`
`
`
`
`
`
`
`
`
`
`000, MMM
`
`INEWEHONIBIWON[^>]]
`
`A-Z09
`
`A-100
`
`LzLabs GmbH. Ex. 1021-4
`
`

`

`U.S. Patent
`
`Jun. 9, 1998
`
`Sheet 4 of 9
`
`5,764,555
`
`
`
`NOSI HWdW00 + –
`
`
`
`
`
`
`
`
`
`?}, 001* 05@wxx 00"
`
`|NEWB}}030
`
`
`
`000 MMM
`
`LzLabs GmbH. Ex. 1021-5
`
`

`

`U.S. Patent
`
`Jun. 9, 1998
`
`Sheet 5 of 9
`
`5,764,555
`
`www.000
`
`www.100
`
`ROUND TO NEAREST
`xxx.000
`xxx. 100
`
`yyy.000
`
`yyy. 100
`
`ASSUME 21 CYCLES
`FOR APPROX Q
`ASSUME 7 CYCLES TO
`CALCULATER
`21+7*(1/8)=21,875 CYCLES
`
`21+7*(2/8)=22.75 CYCLES
`
`o-oooo--o
`
`
`
`CASE
`
`AGARWAL. E.T A.
`O--O 0--O 0--0
`APPLICATION
`
`- REMANDER
`CASES
`CALCULATE REMANDER
`FOR HALFWAY CASE
`
`- -
`
`o--o SCHWARZ PRIOR APPLICATIONS
`o-H-o
`21+7*(1/2)=24.5 CYCLES
`
`FG.5
`
`LzLabs GmbH. Ex. 1021-6
`
`

`

`U.S. Patent
`
`Jun. 9, 1998
`
`Sheet 6 of 9
`
`5,764,555
`
`ACCURATE TO
`
`REGISTER
`
`N+G+1
`REGISTER
`53
`
`FIG.6
`
`TRUNCATE TO N--G BTS
`
`BIT
`N
`
`N--G
`REGISTER
`54
`
`W Q
`
`-(N+G+1)
`
`2
`
`Q'
`
`0.
`
`
`
`Q"
`O
`
`LzLabs GmbH. Ex. 1021-7
`
`

`

`U.S. Patent
`
`Jun. 9, 1998
`
`Sheet 7 of 9
`
`5,764,555
`
`S/390
`ROUNDING
`NEAREST
`MODE
`OR
`REGISTER
`IEEE
`(ONE AND
`ONLY ONE NEAREST
`SELECTED)
`
`IEEE
`SAME
`SIGN
`INFINITY
`
`
`
`IEEE
`OPPOSITE ZERO
`SIGN
`INFINITY
`
`
`
`S/390
`TRUNCATE
`
`
`
`GUARD
`DIGIT
`REGISTER G BIS
`G
`
`
`
`77
`
`FAST INCREMENT
`FIG.7
`
`FAST TRUNCATE
`
`LzLabs GmbH. Ex. 1021-8
`
`

`

`U.S. Patent
`
`Jun. 9, 1998
`
`Sheet 8 of 9
`
`5,764,555
`
`FOR DIVISION:
`FOR SQUARE ROOT:
`
`Q"
`Q"
`
`B
`A
`Q" D
`
`WHERE Q". APPROX A/B
`WHERE Q". APPROX SQRT(D)
`
`MULTIPLER
`N+1 BY
`N-1 BITS
`
`
`
`80
`
`87
`
`FIG.8
`
`CALC REM
`arr
`76
`
`
`
`
`
`ENABLE
`
`COMPARATOR
`
`81
`
`REM K= 0 REM > = 0
`
`RM1
`
`RM2
`
`RM3
`
`REM (= 0 is
`ags
`
`
`
`ill.
`NE
`
`In
`
`
`
`NLE
`
`DELAY
`BUFFER
`
`96
`
`DELAYED
`TRUNCATE
`
`DELAYED
`INCREMENT
`FIG.9
`
`130
`DELAYED
`DECREMENT
`
`LzLabs GmbH. Ex. 1021-9
`
`

`

`U.S. Patent
`
`Jun. 9, 1998
`
`Sheet 9 of 9
`
`5,764,555
`
`74
`
`DELAYED DECREMENT
`FAST INCREMENT
`FAST TRUNCATE
`DELAYED TRUNCATE
`DELAYED INCREMENT
`130
`126
`72
`128
`OR -120
`OR -122
`TRUNCATE
`INCREMENT
`
`DECREMENT
`
`TRUNC(Q") MOST SIGNIFICANT N BITS
`
`“-1 ULP'
`
`'-1 ULP'
`
`N ADD 7
`
`NADD /
`138
`
`MULTIPLEXOR
`
`142
`
`140
`
`(*
`
`FIG.10
`
`134
`
`FINISH
`
`LzLabs GmbH. Ex. 1021-10
`
`

`

`5,764,555
`
`1
`METHOD AND SYSTEM OF ROUNDING
`FOR DIVISION OR SQUARE ROOT:
`ELMINATING REMANDER
`CALCULATON
`
`2
`E. M. Schwarz, "Method and System of Rounding for
`Quadratically Converging Division and Square Root." Ser.
`No. 08/414,867, filed with U.S. Patent Office Mar. 31, 1995.
`E. M. Schwarz, "Method and System of Rounding for
`Quadratically Converging Division and Square Root," Ser.
`No. 08/463,378, filed with U.S. Patent Office Jun. 5, 1995.
`R. C. Agarwal, A. A. Bjorksten, and F. G. Gustavson,
`"Method and System for Performing Floating-Point
`Operations.” Ser. No. 08/354.402, U.S. Pat. No. 5,563,818,
`Oct. 8, 1996.
`Computer arithmetic operations of division and square
`root as defined by various architectures such as in "Enter
`prise Systems Architecture/390 Principles of Operation" or
`in "IEEE standard for binary floating-point arithmetic,
`ANSI/IEEE Std 754-1985", cited above, define an exactly
`rounded result. Specifically, the Systems Architecture/390
`reference prescribes that the result for the floating point
`division operation be equal to taking an infinitely precise
`quotient and truncating it to the precision specified (short: 24
`bits, long:56 bits, or extended: 112 bits), and for square root
`the infinitely precise result is rounded to the nearest machine
`representable number. The IEEE 754 standard, which
`includes a single (24 bit), double (53 bit) and double
`extended format (63 or greater bits), specifies that the results
`of any arithmetic operation including divide and square root
`need to be rounded to any of four rounding modes. These
`modes are called round to nearest even, round to zero, round
`to positive infinity, and round to negative infinity. Round to
`nearest even selects the closest machine representable num
`ber to the result, and if there are two such numbers then the
`even numberis selected as the result. This moderemoves the
`bias that is associated with the conventional round to nearest
`mode, which "rounds up” in the halfway case. Round to
`zero discards the fractional bits that don't fit in the machine
`representation, and is often termed truncation. Round to
`positive infinity rounds to the next larger machine repre
`sentable number, while round to negative infinity rounds to
`the next smaller representative number. All these modes are
`defined to have a result equal to taking an infinitely precise
`result and rounding it to desired precision (i.e., exactly
`rounded).
`Some arithmetic processing apparatus use convergence
`type division or square root operations, such as the quadrati
`cally converging Newton-Raphson algorithm and Gold
`schmidt algorithm. Details of these algorithms are given in
`text books on computer arithmetic such as the book by
`Waser and Flynn. The first detailed work on Goldschmidt's
`algorithmis Goldschmidt's Master's Thesis cited above, and
`a discussion of its implementation on the IBM 360 and 91
`is given in Anderson et al. For convergence-type division or
`square root operations there are several well-known methods
`of calculating an exactly rounded result. Two of the most
`common methods are rounding by calculating to twice the
`desired accuracy and rounding by calculating the remainder
`for all cases.
`Specifically, in Waser and Flynn's book, cited above, it is
`suggested that the result be calculated to twice the needed
`precision to perform exact rounding. In terms of latency, this
`results in the negative effect of requiring an additional
`iteration of the algorithm. An implementation of the
`Newton-Raphson algorithm which follows this method of
`rounding is the RS/6000 as described by Markstein in the
`IBM Journal of Research and Development. Also, U.S. Pat.
`No. 5,341.321 shows a mechanism for calculating Newton
`Raphson algorithm to twice the accuracy with one additional
`iteration but using the same width dataflow.
`
`10
`
`15
`
`25
`
`35
`
`FIELD OF THE INVENTON
`This invention is related to computers and computer
`systems and particularly to a method and system for com
`puter arithmetic operations and processing
`CROSS REFERENCE TO RELATED
`APPLICATTON
`The present application claims priority from the following
`co-pending patent application:
`U.S. Ser. No. 08/463,378 filed Jun. 5, 1995 by E. Schwarz
`entitled "Method and System of Rounding for Qua
`dratically Coverging Division or Square Root"; and
`U.S. Ser. No. 08/414,867 filed Mar. 31, 1995 by E.
`Schwarz entitled "Method and System of Rounding for
`Quadratically Converging Division or Square Root".
`These co-pending applications and the present application
`are owned by one and the same assignee. International
`Business Machines Corporation of Armonk, N.Y.
`BACKGROUND OF THE INVENTION
`During the ensuing description the following references,
`which are herein incorporated by reference, are cited.
`30
`"Enterprise Systems Architecture/390 Principles of
`Operation.” Order No. SA22-7201-0, available through IBM
`branch offices, 1990.
`"IEEE standard for binary floating-point arithmetic,
`ANSI/IEEE Std 754-1985. The Institute of Electrical and
`Electronic Engineers, Inc., New York, August 1985.
`R. E. Goldschmidt. "Applications of division by
`convergence.” Master's thesis, M.I.T.. June 1964.
`S. F. Anderson, J. G. Earle, R. E. Goldschmidt, and D. M.
`Powers. “The IBM system/360 model 91: floating-point
`execution unit,” IBM Journal of Research and Development,
`11(1):34-53, January 1967.
`S. Waser and M. J. Flynn. Introduction to Arithmetic for
`Digital Systems Designers, CBS College Publishing. New
`York, 1982.
`45
`M. J. Flynn. "On division by functional iteration," IEEE
`Trans. Comput., C.-19(8):702-706, August 1970.
`S. Dao-Trong and K. Helwig, "A single-chipBM system
`390 floating-point processor in CMOS." IBM Journal of
`Research and Development, 36(4):733-749, July 1992.
`P. Markstein, "Computation of elementary functions on
`the IBM RISC system/6000 processor." IBM Journal of
`Research and Development, 34(1):111-119, January 1990.
`P. Markstein and A. K. Spencer, "Adaption of Floating
`55
`Point Division Algorithm to Fixed Point Arithmetic.” IBM
`Tech. Discl. Bulletin, 36(06B):529-533, June 1993.
`A. H. Karp, P. Marstein, and D. Brzezinski, "Floating
`Point Arithmetic Unit using Modified Newton-Raphson
`Technique for Division and Square Root," U.S. Pat. No.
`5,341.321, Aug. 23, 1994.
`T. Taniguchi, "Apparatus for Performing Floating Point
`Arithmetic Operation and Rounding the Result thereof."
`U.S. Pat. No. 5.212,661, May 18, 1993.
`H. M. Darley et al., "Floating Point/Integer Processor
`with Divide and Square Root Functions U.S. Pat. No.
`4.878,190, Oct. 31, 1989.
`
`65
`
`50
`
`LzLabs GmbH. Ex. 1021-11
`
`

`

`5,764,555
`
`O
`
`15
`
`20
`
`30
`
`3
`For the Goldschmidt algorithm it is difficult to get twice
`the number of bits correct. The Goldschmidt algorithm is not
`self-correcting; a truncation error in an iteration will cause
`error in the next iteration, unlike the Newton-Raphson
`algorithm. Yet, compared to the Newton-Raphson algorithm,
`the Goldschmidt algorithm provides certain advantages in
`terms of computational efficiency because of the possibility
`for parallel calculation of intermediate results during each
`iteration. To avoid truncation errors, a straight-forward,
`"brute force" method of implementing exact rounding by
`calculating to twice the desired accuracy for the Gold
`schmidt algorithm would require the dataflow to accommo
`date twice the width operands, which corresponds to an
`unreasonable amount of hardware. Also, even if providing
`the necessary additional hardware were not deemed
`unreasonable, this method of calculating to twice the desired
`accuracy is not computationally efficient since it inherently
`requires at least one additional iteration of the Goldschmidt
`algorithm.
`Other researchers have considered calculating the remain
`der to determine which direction to round based on a
`comparison of the remainder with zero. Markstein and
`Spencer, in an IBM Technical Disclosure Bulletin, use the
`Goldschmidt algorithm to produce an approximation within
`a unit in last place (ulp) of the desired accuracy, and then
`compute the remainder, which is then used to determine
`whether to increment or truncate the approximation to
`compute an exactly truncated result. Darley et al. in U.S. Pat.
`No. 4.878.190 describe a Newton-Raphson algorithm for
`division and square root for the IEEE 754 architecture which
`includes computing an equivalent to a remainder and testing
`its relation to zero, wherein a reduced number of bits is
`needed in the comparison. Also, Taniguchi in U.S. Pat No.
`5.212,661 discloses arounding algorithm for square root and
`divide which considers the approximate result to be gener
`ated using any convergence algorithm, and wherein an
`equivalent operation to a remainder calculation is performed
`followed by a comparison to zero. As in Darley et al.,
`Taniguchi uses a reduced number of bits in the comparison.
`Note that this type of rounding algorithm requires the
`calculation of the remainder, or an equivalent operation, for
`all cases and rounding is based on comparison of the
`remainder with zero.
`Recently a few researchers have explored using extra
`precision in the approximate solution to eliminate the
`remainder comparison for some of the combinations of
`guard bit values and selected rounding modes. This is the
`technique used by the present invention. Schwarz in two
`prior U.S. patent applications claimed the use of a strict error
`tolerance and the use of one guard bit to reduce the remain
`der comparison to only half the combinations of guard bits
`and rounding modes. Agarwal, Bjorksten, and Gustavson
`have also claimed multiple guard bits with a weaker error
`tolerance that reduced the remainder comparison to be
`needed in 22 cases to produce an exactly rounded
`solution, where G is the number of guard bits. The present
`invention discloses a mechanism for creating an approxi
`mate solution with G guard bits and a mechanism for
`rounding which requires a remainder comparison in only
`2 cases to produce an exactly rounded solution. Thus, the
`present invention reduces the remainder comparison of
`Agarwal et al. invention by one half.
`In addition, several patents illustrate techniques for per
`forming a floating point division or square root operation.
`U.S. Pat No. 4999,801, issued Mar. 12, 1991 to Akira
`Katsuno, deals with the Newton-Raphson algorithm for
`division and square root but does not discuss calculating an
`
`35
`
`45
`
`50
`
`55
`
`65
`
`4
`exactly rounded result. U.S. Pat. No. 5.132.925, issued Jul.
`21, 1992 to T. H. Kehl et al., is directed to a non-restoring
`division algorithm. Non restoring algorithms are not qua
`dratically converging, and calculate a remainder as part of
`their basic iteration. U.S. Pat. No. 5.305,247, issued Apr. 19.
`1994 to B. L. Lindsely, discloses a Newton-Raphson algo
`rithm for division and square root that includes a range
`reduction technique to speed up convergence, but does not
`discuss exactly rounding the intermediate results to those
`dictated by the IEEE 754 standard or IBM S/390 architec
`ture.
`There remains, therefore, a need for further improvements
`in implementing floating point division and square root
`operations to provide an exactly rounded result in a desig
`nated rounding mode.
`SUMMARY OF THE INVENTION
`The present invention overcomes the above, and other
`limitations of the prior art by a method and system which
`provides exactly rounded division and square root results for
`a designated rounding mode according to a quadratically
`converging operation, and which requires no additional
`iterations of the quadratically converging operation, and
`which does not require determining a relationship of the
`remainder to zero in all cases for performing a rounding
`operation.
`An embodiment of the present invention, employed in
`accordance with a floating point arithmetic unit, provides an
`N-bit exactly rounded result for a designated rounding mode
`by a method that includes the steps of generating an approxi
`mate result having N+G bits and an error tolerance of less
`than plus or minus the least significant (N+Gth) bit, N being
`a positive, nonzero integer, G being a positive, non zero
`integer, the N+1th bit through and including the N+Gth bit
`referred to as the guard bits or guard digit; and providing an
`exactly rounded result based on the value of the guard digit
`and the designated rounding mode, independently of a
`relationship of a remainder to zero, for predetermined com
`binations of the designated rounding mode and the guard
`digit. For other predetermined combinations of the rounding
`mode and the guard digit, a relationship of a remainder to
`Zero is determined, and the exactly rounded result is pro
`vided according to the designated rounding mode and the
`remainder relationship to zero. Generally, for any rounding
`mode, the exactly rounded result will be one of the follow
`ing: the N most significant bits of the approximate result,
`referred to as the truncated approximate result; or the
`truncated approximate result incremented by one unit in the
`least significant bit; or the truncated approximate result
`decremented by one unit in the least significant bit. Also in
`accordance with the present invention, an apparatus that
`provides an N-bit exactly rounded result in a designated
`rounding mode for floating point division or square root
`operations comprises: means for providing an N-G bit
`approximate result having an error tolerance of less than
`plus or minus the least significant (N-Gth) bit; the least
`significant Gbits are referred to as the guard digit; means for
`detecting the designated rounding mode; means for gener
`ating a remainder output signal indicative of a relationship
`between zero and a remainder representative of a difference
`between an infinitely precise result and an approximate
`result to the division or square root operation; means for
`providing an incremented result by incrementing the N most
`significant bits of the approximate result by 1 unit in the least
`significant bit; means for providing a decremented result by
`decrementing the N most significant bits of the approximate
`result by 1 unit in the least significant bit; and means for
`
`LzLabs GmbH. Ex. 1021-12
`
`

`

`5,764,555
`
`10
`
`5
`
`5
`selecting any one of the incremented result, decremented
`result, or the N most significant bits of the approximate
`result as an output representing the exactly rounded result,
`wherein for predetermined combinations of the guard digit
`and the designated rounding mode the exactly rounded result
`is selected independently of the remainder output signal, and
`wherein. for other predetermined combinations of the guard
`digit and the designated rounding mode the exactly rounded
`result is provided according to the designated rounding
`mode and the remainder output signal.
`A feature of the present invention is that a method and
`apparatus is provided for arithmetic processing of divide and
`square root intermediate results to perform exact rounding of
`these results.
`Another feature of the present invention is that an exactly
`rounded result is provided for the Goldschmidt algorithm
`without requiring the dataflow to accommodate operands
`having twice the width of the desired accuracy.
`A further feature of the present invention is that exactly
`rounded division and square root results which are compli
`ant with an architecturally defined rounding mode are pro
`vided with elimination of extra cycles of a quadratically
`converging algorithm.
`Yet another feature of the present invention is that exactly
`rounded division and square root results for designated
`rounding modes are provided in accordance with a method
`and system that achieves reduced processing time at little or
`no cost in terms of design complexity or chip real estate.
`Yet a further feature of the present invention is that it
`provides exactly rounded division and square root results for
`a designated rounding mode according to a quadratically
`converging operation without requiring additional iterations
`of the quadratically converging operation.
`Still another feature of the present invention is that it
`provides exactly rounded division and square root results for
`a designated rounding mode according to a quadratically
`converging operation without requiring that a relationship
`between a remainder and zero be calculated for all cases of
`the rounding operation.
`BRIEF DESCRIPTION OF THE ORAWNGS
`Additional aspects, features, and advantages of the inven
`tion will be understood and will become more readily
`45
`apparent when the invention is considered in the light of the
`following description made in conjunction with the accom
`panying drawings, wherein:
`FIG. 1A shows the basic steps of implementing a pre
`ferred embodiment of the present invention of providing, for
`a specified rounding mode, an exactly rounded result of a
`division or square root operation;
`FIG. 1B shows schematically an overview of a preferred
`embodiment represented in functional block format;
`FIG.2 shows a numberline diagram of the range that the
`infinitely precise quotient, denoted by Q, can be in relation
`to the N+G bit approximation, denoted by Q" for the round
`to nearest mode with three guard bits, in accordance with
`practicing the present invention;
`FIG.3 shows a numberline diagram of the range that the
`infinitely precise quotient, denoted by Q, can be in relation
`to the N+G bit approximation, denoted by Q"for the round
`to infinity with the same sign mode with three guard bits, in
`accordance with practicing the present invention;
`FIG. 4 shows a number line diagram of the range that the
`infinitely precise quotient, denoted by Q, can be in relation
`to the N+G bit approximation, denoted by Q" for the round
`
`50
`
`6
`to infinity opposite sign, round to zero, or truncate mode
`with three guard bits, in accordance with practicing the
`present invention;
`FIG. S shows a number line diagram of the range that the
`infinitely precise quotient, denoted by Q, can be in relation
`to the N+G bit approximation, denoted by Q" for the round
`to nearest mode with three guard bits, in accordance with
`practicing the present invention, and in accordance to prior
`art of Agarwal et al. and prior art of Schwarz;
`FIG. 6 illustrates circuitry involved in generating a pre
`liminary rounded result (i.e. estimate Q") from an approxi
`mation Q generated as the result of a quadratically converg
`ing algorithm, in accordance with an embodiment of the
`present invention;
`FIG. 7 illustrates circuitry for generating, according to the
`selected rounding mode and the value of the guard digit,
`signals used for selection an appropriate exactly rounded
`result independently of a remainder indicative signal (i.e. a
`signal indicative of a relationship between a remainder and
`zero), and a signal indicative of whether a remainder indica
`tive signal must be generated, in accordance with an
`embodiment of the present invention;
`FIG. 8 shows circuitry for conditionally generating a
`signal representing a relationship between Zero and a
`remainder (i.e. a remainder indicative signal), in accordance
`with an embodiment of the present invention;
`FIG. 9 illustrates circuitry for generating signals used for
`selecting an appropriate exactly rounded result according to
`outputs from the remainder circuitry shown in FIG. 8 and to
`the rounding mode, in accordance with an embodiment of
`the present invention; and
`FIG. 10 depicts circuitry for selecting an exactly rounded
`result for the given rounding mode, in accordance with an
`embodiment of the present invention.
`DETALED DESCRIPTION OF THE
`INVENTION
`In FIG. 1A, there are shown the basic steps of imple
`menting a preferred embodiment of the present invention of
`providing, for a specified rounding mode, an exactly
`rounded result of a division or square root operation wherein
`the exactly rounded result has a desired precision of N bits.
`(Throughout the description, for a word having m bits, the
`least significant bit is referred to as the "mth bit" or the "last
`bit".) First, in step 150, an N+G bit estimate of the exact
`resultis generated having an error tolerance of less than plus
`or minus the weight of the N+Gth least significant bit (i.e.,
`2", where the most significant bit is weighted 2'). For
`instance, in the IEEE 754 standard, 53 bits is the required
`precision for long and thus, a 54 bit approximation, with an
`error tolerance of less than plus or minus 2, would be
`needed to generate an exactly rounded result according to
`the presently disclosed invention. Such an estimate may be
`provided directly as the result of a quadratically converging
`algorithm; however, more generally, a quadratically con
`verging algorithm provides apreliminary result with several
`extra bits of precision compared to N, the desired number of
`bits, prior to rounding.
`Thus, in accordance with practicing the present invention,
`it is useful to illustrate, by way of example, a description of
`a technique used to reduce a high precision approximation
`into an N+G bit approximation with strict error tolerances.
`Assume the actual (infinitely precise) result is called Q
`and that an exactly rounded machine representable number
`(to any of 5 types of rounding modes) is to be calculated.
`
`25
`
`30
`
`35
`
`55
`
`LzLabs GmbH. Ex. 1021-13
`
`

`

`5,764,555
`
`7
`Assume Q is the needed answer which has N bits of
`accuracy and that the results have been normalized, QC1.0
`and for hex normalization is greater than or equal to /16 and
`for binary normalization is greater than or equal to /2.
`Assume a preliminary answer Q is available which has N+k
`bits with an error of plus or minus 2
`where k2jZ0. That
`
`1S
`
`8
`place (i.e., the least significant bit of tuncCQ")); or, trunc
`(Q") decremented by one unit in the last place. In some
`cases (defined by the selected rounding mode and the value
`of the guard digit), which of these values is selected as the
`exactly rounded result may be determined based solely on
`the rounding mode and the value of the guard digit-without
`determining a relationship between Zero and a remainder, in
`other cases, the relationship between zero and a remainder
`must be determined to decide which value to select.
`For a given rounding mode and guard bit value, which
`value is selected as the exactly rounded result, and whether
`a relationship between zero and a remainder must first be
`determined, may be better understood with reference to
`FIGS. 2-4, which illustrate relative number lines for differ
`ent cases. In each of FIGS. 2-4, a horizontal numberline 200
`is pictured with machine represented numbers with long
`vertical lines, halfway points between machine represent
`able numbers with medium vertical lines, and other points
`which represent the precision of points with up to three
`guard bits of additional precision using small vertical lines.
`xxx is used to denote the last few (least significant) bits of
`Q" through the Nth bit, and three bits are attached to
`indicate the value of the guard bits of Q". yyy is used to
`indicate the next higher machine representable number and
`www is used to represent the next machine representable
`number smaller than xxx. Halfway points are represented
`with the guard bits equal to 100 (i.e. the halfway point
`between www and xxx is represented by www.100). Below
`the numberline 200 are action lines which denote the range
`of numbers which are represented by a machine represent
`able number in the chosen rounding mode. The action lines
`have a filled circle at its end point to represent inclusion of
`the end point, and an empty circle to indicate exclusion of
`the end point into the range considered. This notation also
`applies to the next set of eight lines which are representa
`tions of the range of Q given a Q" with a given set of guard
`bits. For the case of three guard bits there are eight different
`values of guard bits which are depicted by lines 210 through
`217. Line 210 represents the range of Q given a Q" with
`guard bits equal to "000". Line 211 represents the range of
`Q given a Q" with guard bits equal to "000". Line 212
`represents the range of Q given a Q" with guard bits equal
`to "010". Line 213 represents the range of Q given a Q" with
`guard bits equal to "011". Line 214 represents the range of
`Q given a Q" with guard bits equal to "100". Line 215
`represents the range of Q given a Q" with guard bits equal
`to "101". Line 216 represents the range of Q given a Q" with
`guard bits equal to "110". Line 217 represents the range of
`Q given a Q" with guard bits equal to "111". For instance,
`line 210 indicates for Q" with guardbits equal to "000" the
`range of Q is between but not including www.111 and
`xxx.001.
`Round to Nearest and Round to Nearest Even Modes
`FIG. 2 shows a number line for the case where the
`selected mode is either round to nearest or round to nearest
`even. Thus, the action lines 201, 202, 203 show the appro
`priate Q, machine representable solution, given any Q
`between www and yyy. For example, action line 202 shows
`that for the value of actual infinitely precise solution, Q, is
`greater than or equal to www.100 and less than but not
`including XXX.100, which is mathematically expressed by
`the following www.100s.O<xxx. 100, that the exactly
`rounded solution Q is equal to xxx. This is equivalent to
`truncating Q". Also action line 203 shows that for
`xxx.100s.OKyyy. 100 that Q equals yyy which is truncated
`Q" incremented by 1 ulp. Given these actions needed for
`exact rounding it is straightforward to determine the round
`
`2-Nis QQ's-2-
`Q needs to be transformed into an N+G bit number with
`an accuracy of plus or minus the last bit, which is weighted
`2. To do this, first Q is incremented by more than
`2
`but less than 2". Assume Q is incremented by
`2^*, the result called Q":
`
`10
`
`5
`
`20
`
`25
`
`35
`
`45
`
`SO
`
`55
`
`Q" is an overestimate of Q. The next step is to truncate
`this overestimate to N-G bits which reduces the estimate by
`some amount

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket