`
`:19}
`
`[1 l] Patent Number:
`
`5,101,431
`
`Even
`
`[45] Date of Patent: Mar. 31, 1992
`
`||||l|||||||||III|||||Illllllllllllll|||||||||||l|||||||| ||||||||||1||||||l
`usmswmm
`
`[54] SYSTOLIC ARRAY FOR MODULAR
`MULTIPLICATION
`
`[75]
`
`Inventor:
`
`Shimun Even, Haifa, Israel
`
`[73] Assignee: Bell Communications Research, Inc,
`Livingston, NJ.
`
`[21] Appl. No.: 628,570
`
`[22] Filed:
`
`Dec. 14, 1990
`
`_
`
`[51]
`
`Int. 01.5 ......................... H041. 9/39; GOtSF 7/52;
`6061: 7/?2
`[52} us. Cl. ........................................ 380/30; 330/23;
`364/746; 364/746]
`[53] Field of Search ............ 364/200. 900, 224, 224.3,
`364/2608], 931.01, 931.02, 937.18, 739, 744,
`746, 746.1
`
`{56]
`
`References Cited
`PUBLICATIONS
`
`“A Survey of Hardware Implementations of RSA“, E.
`F. Brickell, presented at CRYP'I‘O ’89, Santa Barbara.
`Calif., Aug. 1989.
`the Motorola DSP
`”A Cryptographic Library for
`56000", S. R. Dusse et 81., EuroCrypt filo—Abstracts,
`
`May 21-24. 1990, Scantic on Arhus, Denmark. pp.
`230—244.
`
`"VICTOR, An Efficient RSA Hardware Implementa-
`tion", H. Orup et al.. EurOCrypt 90—Abstracts, May
`21—24. 1990, Scanticon Arhus, Denmark, pp. 245—252.
`“A One—Dimensional Real Time Iterative Multiplier",
`A. J. Atrubin, IEEE Trans. on Electronic Computers.
`vol. 14, pp. 394—399. 1965.
`"Modular Multiplication Without Trial Division", P. L.
`Montgomery, Math. of Computation, vol. 44, pp.
`519—521, 1935.
`
`Primary Examiner—Bernarr E. Gregory
`Attorney. Agent, or Firm—Leonard Charles Suchyra
`
`[57}
`
`ABSTRACT
`
`A systolic array (10) comprising a sequence of identical
`cells is utilized to perform modular reduction in linear
`time. Each cell receives a binary number at a first input
`and an n-bit modulus at a second input and performs
`binary addition to form a binary number output. The
`systolic array (10) for performing modular reduction
`may be combined with Atrubin‘s array (60) to perform
`modular multiplication.
`
`10 Claims, 3 Drawing Sheets
`
`
`
`
`
`14-1
`
`13-1
`
`14"?
`
`18-2
`
`1-1-3
`
`18-3
`
`(cid:51)(cid:72)(cid:87)(cid:76)(cid:87)(cid:76)(cid:82)(cid:81)(cid:72)(cid:85)(cid:3)(cid:48)(cid:76)(cid:70)(cid:85)(cid:82)(cid:86)(cid:82)(cid:73)(cid:87)(cid:3)(cid:38)(cid:82)(cid:85)(cid:83)(cid:82)(cid:85)(cid:68)(cid:87)(cid:76)(cid:82)(cid:81)(cid:3)(cid:16)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:25)(cid:21)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:20)
`Petitioner Microsoft Corporation - EX. 1062, p. 1
`
`
`
`US. Patent
`
`Mar. 31, 1992
`
`Sheet 1 of 3
`
`5,101,431
`
`mic“ msmfi
`
`“an“
`
`ma:
`
`mahu<¢hm=m
`
`(cid:51)(cid:72)(cid:87)(cid:76)(cid:87)(cid:76)(cid:82)(cid:81)(cid:72)(cid:85)(cid:3)(cid:48)(cid:76)(cid:70)(cid:85)(cid:82)(cid:86)(cid:82)(cid:73)(cid:87)(cid:3)(cid:38)(cid:82)(cid:85)(cid:83)(cid:82)(cid:85)(cid:68)(cid:87)(cid:76)(cid:82)(cid:81)(cid:3)(cid:16)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:25)(cid:21)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:21)
`Petitioner Microsoft Corporation - EX. 1062, p. 2
`
`
`
`
`
`
`US. Patent
`
`Mar. 31, 1992
`
`Sheet 2 of3
`
`5,101,431
`
`I.-_______:I_‘_3;?.____x_<i___1
`
`20-
`
`TIMING
`
`;
`
`:
`g
`!
`i
`
`34
`
`
`
`30
`
`31
`
`.21—1
`
`TIMING
`
`:
`!
`i
`i
`
`33
`
`p...-2"?" H
`
`'l'l I... H _U
`
`H cl: I—l
`
`‘11 r... H "U
`
`FIG. 3
`
`so
`
`70
`
`
`
`ATHUBIN'S
`MODULAR
`HEDUCTIDN
`ARRAY
`CIRCUIT
`
`
`-—
`
`30
`
`(cid:51)(cid:72)(cid:87)(cid:76)(cid:87)(cid:76)(cid:82)(cid:81)(cid:72)(cid:85)(cid:3)(cid:48)(cid:76)(cid:70)(cid:85)(cid:82)(cid:86)(cid:82)(cid:73)(cid:87)(cid:3)(cid:38)(cid:82)(cid:85)(cid:83)(cid:82)(cid:85)(cid:68)(cid:87)(cid:76)(cid:82)(cid:81)(cid:3)(cid:16)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:25)(cid:21)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:22)
`Petitioner Microsoft Corporation - EX. 1062, p. 3
`
`
`
`US. Patent
`
`Mar. 31, 1992
`
`Sheet 3 of 3
`
`5,101,431
`
`FIG. 4
`
`
` BO
`
`HULTIPLIEH
`
`[a2]1
`
`
`
`I35)
`
`[351")
`
`
`
`__.._.mImu:[.0..__.._..
`
`80
`
`80
`
`BO
`
`80
`
`HULTIPLIEH
`
`BO
`
`_
`
`[38]
`
`.—
`
`(313}
`
`a19 (mole1(SE)
`
`MULTIPLIEH
`
`(cid:51)(cid:72)(cid:87)(cid:76)(cid:87)(cid:76)(cid:82)(cid:81)(cid:72)(cid:85)(cid:3)(cid:48)(cid:76)(cid:70)(cid:85)(cid:82)(cid:86)(cid:82)(cid:73)(cid:87)(cid:3)(cid:38)(cid:82)(cid:85)(cid:83)(cid:82)(cid:85)(cid:68)(cid:87)(cid:76)(cid:82)(cid:81)(cid:3)(cid:16)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:25)(cid:21)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:23)
`Petitioner Microsoft Corporation - EX. 1062, p. 4
`
`
`
`5,101,431
`
`1
`
`SYSTOLIC ARRAY FOR MODULAR
`MULTIPLICATION
`
`FIELD OF THE INVENTION
`
`The present invention relates to a systolic array for
`performing modular reduction. The inventive array is
`especially useful in cryptographic systems where re-
`peated modular multiplication is utilized.
`BACKGROUND OF THE INVENTION
`
`A public key cryptography system is made up of a
`collection of users, each having his own encryption and
`decryption keys. In one such system, known as the RSA
`system, the encryption key comprises two integers N
`and e and the decryption key is single integer d. The
`integers N, d, and e are typically very large integers
`whose binary representations typically involve several
`hundred bits. Each user makes his encryption key avail-
`able to the other users while keeping his decryption key
`secret.
`
`In this system, N is an integer which is the product of
`two carefully selected large primes p and q and e is an
`integer such that
`its greatest common divisor with
`(p— l){q— l) is one [god {e,(p— l)(q— l)]=[. Finally, d
`is calculated by solving the linear congruence
`
`de=l (modtp—lltq—ll)
`
`If a user A wishes to communicate a numerically
`coded message M to another user B he calculates the
`ciphertext K = M‘Tmod N)(0 E K < N, 0% M < N) where
`e and N form the encryption key for user B. To deter-
`mine the plaintext M from the ciphertext K, B uses the
`decryption key d to calculate
`
`M=Kdtmodm (0§M<M
`
`Thus, both the encryption and decryption operations
`involve repeated modular multiplications where the
`modulus N has a representation involving hundreds of
`bits.
`As a result, great effort has been expended in the field
`of cryptography to find fast and inexpensive ways to do
`modular multiplication (see, e.g., E.F. Brickell “A Sur-
`vey of Hardware Implementations of RSA” presented
`at CRYPTO'BQ, Santa Barbara, Calif, August 1939;
`SR. Dusse et al, “A Cryptographic Library for the
`Motorola DSP 56000" EuroCrypt 90—Abstracts, May
`21—24, 1990, Scanticon Arhus, Denmark, pp. 213-217;
`H. Orup et a1, “VICTOR, An Efficient RSA Hardware
`Implementation” Eurocrypt 90—-Abstracts, May 21—24
`1990 Scanticon, Arhus, Denmark, pp. 219—227).
`A systolic array of cells for performing ordinary (i.e.
`not modular) multiplication of large integers has been
`proposed by Atrubin (see, AJ. Atrubin, “A One—
`Dimeusional Real Time Iterative Multiplier", IEEE
`Trans. on Electronic Computers, Vol. 14, 1965, pp.
`394—399}. Two positive integers to be multiplied are
`represented in binary. They are fed serially to the first
`cell of the array, least significant bit first. The product is
`supplied by the first cell, least significant bit first, with-
`out delay. The time required to obtain the product is
`linear with the length of the product. The structure of
`each cell in Atrubin's array is very simple and the array
`utilizes no long distance communications, i.e., each cell
`communicates only with its neighbors. Thus, a very
`high clock rate is possible.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`55
`
`65
`
`2
`It is an object of the present invention to provide a
`systolic array of cells which can be utilized in combina-
`tion With Atrubin’s array to perform repetitive modular
`multiplications of the type utilized in the above-
`described public key cryptographic systems.
`As is shown below the inventive systolic array uti-
`lizes the modular reduction system proposed by Mont-
`gomery (see, P.L. Montgomery, “Modular Multiplica-
`tion Without Trial D‘ivision", Math. of Computation,
`Vol. 44. 1985, pp. 519—521).
`The modular reduction system of Montgomery may
`be understood as follows. Let the modulus N be an odd
`
`integer. Let n be the number of bits in the binary repre-
`sentation ofN i.e. 2"*1<N <2". Let R=2”. Let i be the
`image of the integer where i=xR(mod N}. A binary
`representation of an image has at most :1 bits and it may
`exceed N, but it is nonuegative.
`In the Montgomery modular multiplication system,
`modular operations are done with images. For example,
`suppose one wants to multiply two numbers x and y to
`obtain a product z=xy(rnod N). In the Montgomery
`system this is done by multiplying i and i to obtain
`M=xy=xyR1(mod 1%):le2 (Mod N)=Efi[Mod N).
`Note that M has at most 2n bits and z is non-negative
`and has at most it hits:
`In order to obtain 2 from M it is necessary to divide
`M by R and in order to obtain 2 from 5. it is necessary to
`divide again by R. This modular division by R is known
`as modular reduction.
`
`Thus, it is a further object of the present invention to
`provide a systolic array of cells to perform the above-
`described modular reduction operation and to utilize
`the systolic array to perform repeated modular multipli-
`cations in a cryptographic system.
`
`SUMMARY OF THE INVENTION
`
`In accordance with the present invention, the compu-
`tation MR- l (mod N) is performed using the following
`algorithm.
`begin
`MEL—M
`for i=1 to I: do
`begin
`thMih 1 + NMn'" I
`Shift-right Mi, one bit
`end
`if M"2" then M'h—Mfl—N
`end
`
`reduced,
`be
`to
`number
`the
`where MD-M is
`MH=M°R- I(mod N), and Mo" is the least significant bit
`of the binary number Mi.
`An array for carrying out this algorithm to perform a
`modular reduction comprises a sequence of cells Cl,
`i=1, .
`.
`. , it. Each of the cells C5 except the first cell C1
`has a first input for serially receiving a binary number
`MF-I from the immediately preceding cell (3*-1 in the I
`sequence. The first cell CI has a first input for serially
`receiving the binary number M=M°to be reduced from
`an external source. Each cell in the sequence also has a
`second input for serially receiving an n-bit modulus N.
`A serial adder in each cell serially performs the binary
`operation l\1"=l\rl"—l +NMgl—1 and each cell includes
`means for afl‘ecting a one bit shift right operation on the
`binary number Ml. Each cell except the last cell in the
`sequence has an output for serially transmitting M5 to
`the first input of the next cell in the sequence. The last
`cell has an output for serially outputting a binary num-
`ber M’3 which is equal to Moll“l (mod N).
`
`(cid:51)(cid:72)(cid:87)(cid:76)(cid:87)(cid:76)(cid:82)(cid:81)(cid:72)(cid:85)(cid:3)(cid:48)(cid:76)(cid:70)(cid:85)(cid:82)(cid:86)(cid:82)(cid:73)(cid:87)(cid:3)(cid:38)(cid:82)(cid:85)(cid:83)(cid:82)(cid:85)(cid:68)(cid:87)(cid:76)(cid:82)(cid:81)(cid:3)(cid:16)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:25)(cid:21)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:24)
`Petitioner Microsoft Corporation - EX. 1062, p. 5
`
`
`
`5,101,431
`
`3
`The output of the last cell is connected to a subtrac-
`tion circuit for subtracting N from M” if M"§2”.
`In short. the present invention is a systolic array with
`a very simple cell structure for performing modular
`reduction in linear time. The systolic array may be
`combined with Atrubin’s array to perform modular
`multiplication in linear time.
`BRIEF DESCRIPTION OF THE DRAWING
`
`FIG. 1 illustrates a systolic array for performing a
`modular reduction, in accordance with an illustrative
`embodiment of the present invention.
`FIG. 2 is a block diagram which illustrates one of the
`cells in the array of FIG. 1, in accordance with an illus-
`trative embodiment of the present invention.
`FIG. 3 illustrates a circuit for performing modular
`multiplication which utilizes the systolic array of FIG.
`1 in accordance with an illustrative embodiment of the
`
`present invention.
`FIG. ti shows how repetitive multiplications are per-
`formed using the array of FIG. 1.
`DETAILED DESCRIPTION OF THE
`INVENTION
`
`FIG. 1 illustrates a circuit 1 for performing modular
`reduction. The circuit 1 comprises a systolic array 10.
`The systolic array 10 comprises a sequence of cells C1,
`C2. .
`.
`.
`, C". Each cell Cf has two inputs 12-i and 14-1.
`Thus the cell C1 has the inputs 12-1 and 14-1 and the
`Cell C" has the inputs 12-n and 14-n. Each cell Ct except
`the last cell C" also has two outputs 16~i and lS-i. Thus,
`the cell C 1 has the outputs 16-1 and 1.8-1. The last cell
`C" has one output which is designated 16-h. The inputs
`12 and 14 of each cell except the first cell are connected
`to the corresponding outputs 16 and 18 of the preceding
`cell in the array.
`Each cell C has an input Zfl-i and an output ZI-i by
`which a timing signal is passed from cell to cell.
`Each cell in the array 10 except the first cell receives
`all its inputs from the preceding cell in the array and
`each cell except the last cell transmits all its outputs to
`the next succeeding cell in the array. Each input l4-i of
`the cell Cl serially receives the modulus N least signif-
`cant bit first. Each bit in the binary representation of the
`modulus N is delayed by two clock cycles in each cell
`and is then outputted at the output 18-i for transmission
`to the next cell. The input 12-1 of the first cell C1 seri-
`ally receives. least significant bit first, the binary num-
`ber M=M° to be reduced. The remaining cells Cl seri-
`ally receive at their inputs 12-i the binary number M'"'1
`from the output l6—i—1 of the preceding cell. Each cell
`includes a serial adder for performing the operation
`Mi=Mi-'+NMot'—1, where Moi—l is the least signif-
`cant bit of Mir 1, and means for affecting a one bit shift-
`right operation on M". The binary number Ml is serially
`outputted at the output 16-i. The output 16-n of the last
`cell C“ is connected to a subtractor circuit 25. The sub-
`tractor 25 performs the operation M"«—M"—N if
`Mflglfl. The output of the subtractor is MR- 1{mod N).
`A cell C5 is shown in greater detail in FIG. 2.
`A timing signal enters the cell Ci of FIG. 2 at input
`mi and leaves Via the output 21-i. There are two delay
`flip-flops 3t] and 31 in the path of the timing signal. The
`flip-flops 30 and 31 are delay flip-flops so that the output
`of each flip-flop at time t+ I is equal to its input at time
`t. Thus, there is a delay of two time units in the path of
`the timing signal.
`
`10
`
`I5
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`4
`As seen by the input 20-] of the cell C1 (see FIG. 1).
`the timing signal comprises a “one" pulse followed by
`Zn—l “zero“ pulses. Because the “one" pulse is delayed
`by two time units in each cell. the one pulse arrives at
`the input ZO-i of the cell C at time 2i—1.
`The binary number Mi—l is received serially at the
`input 12-i, is delayed One time unit by the delay flip-flop
`36 and is connected to an input of the exclusive-OR gate
`38.
`
`When the one pulse arrives at the input 204 of the cell
`C" of FIG. 2, the bit value of M"-I which arrives at the
`input 12-i is held by the sample and hold circuit 34.
`The modulus N is received serially at the input Ill-i
`and is outputted serially at the output 18-i. There are
`two delay flip-flops 40 and 42 in the path between the
`modulus input I4—i and the modulus output IB-i. The
`output of the flip-flop 40 is connected to one input of the
`AND-gate 46. The second itiput of the AND-gate 4-6 is
`connected to the sample and hold 34. The output of the
`AND-gate 46 is connected to an input of the exclusive
`OR-gate 38 and to an input of the AT LEAST TWO
`Gate 48.
`
`The AT LEAST TWO gate 48 has three inputs, the
`first of which is connected to the delay flip-flop 50, the
`second of which is connected to the AND-gate 46, and
`the third of which is connected to the flip-flop 36. The
`output of the AT LEAST TWO gate 48 is a “one"
`when at least two of the three inputs are “one". The
`exclusive-OR gate 38 has three inputs, the first of which
`is connected to the flip-flop 36, the second of which is
`connected to the AND—gate 46, and the third of which
`is connected to the flip-flop 50. The output of the exclu-
`sive-OR gate 38 is a "one" when one of the inputs is
`“one" and the other two inputs are “zero“. In the cell
`Ci of FIG. 2, all flip-flaps are reset to produce a “zero"
`output at time 1:1.
`The exclusive-OR gate 38 and the AT LEAST TWO
`gate 48 in combination with the flip-flop 50 form a serial
`adder
`for
`serially performing the addition Mi:-
`Mi—] +NMolr '. The binary number Ml is transmitted
`serially from the output of the exclusive-0R gate 38 to
`the cell output 16-i. The carry bit of the serial addition
`operation is generated at the output of the AT LEAS
`TWO gate 4-8 and is delayed for one time unit by the
`flip-flop 5|]. The serial adder formed by the gates 38 and
`48 adds the numbers hill—1 and N which appears at the
`inputs 12-i and Iii-i serially, least significant bit first, as
`follows. If the first bit of Mi—l is a “one", then starting
`from time 2i (Le. one time unit after the timing pulse
`arrives) the AND-gate 46 is enabled. In this case the
`data bits comprising the modulus N pass through the
`enabled AND-gate and are added serially to the bits
`comprising the number Mi. If the first bit of hill-1 is a
`“zero“, then the AND-gate 46 remains disabled and the
`data bit of Mir-1 goes through the cell unchanged, be-
`cause the disabled AND-gate blocks the bit from the
`modulus N from entering the adder. In either case. the
`first output bit of Mt appears on the output 16-i at time
`2i and is always “zero”. This zero is ignored by the next
`succeeding cell in the array thus affecting a shift-right
`Operation on the output of 0'. As indicated above. the
`bits comprising the modulus N are delayed in the cell
`Ci by two time units through use of the flip~fl0ps 40 and
`42 so that the least significant bit of N appears at the
`output 18-i at the same time the least significant bit of
`Mt appears at the output l6-i, i.e., after the right-shift is
`affected.
`
`(cid:51)(cid:72)(cid:87)(cid:76)(cid:87)(cid:76)(cid:82)(cid:81)(cid:72)(cid:85)(cid:3)(cid:48)(cid:76)(cid:70)(cid:85)(cid:82)(cid:86)(cid:82)(cid:73)(cid:87)(cid:3)(cid:38)(cid:82)(cid:85)(cid:83)(cid:82)(cid:85)(cid:68)(cid:87)(cid:76)(cid:82)(cid:81)(cid:3)(cid:16)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:25)(cid:21)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:25)
`Petitioner Microsoft Corporation - EX. 1062, p. 6
`
`
`
`5,101,431
`
`5
`As indicated previously, to carry out a modular re-
`duction the circuit 1 of FIG. 1 implements the algo-
`rithm:
`begin
`Mot—M
`for i=1 to it do
`
`begin
`MihMi—l +NMO’I’ I
`Shift-right Mi one bit
`end
`if M'iEZ" then M"--—M"-—N
`end
`
`The cell C" of FIG. 1 performs the iii" iteration of the
`inner loop of the algorithm. The remaining subtraction
`is carried by the subtraction circuit 25 of FIG. 1.
`As indicated previously the circuit for performing
`modular reduction is especially useful for performing
`repetitive modular multiplications. Consider for exam-
`ple the repetitive multiplication a'9 (mod N). To per-
`form this repetitive multiplication in accordance with
`the present invention, a'9 (mod N) may be represented
`as
`
`a'9tmodm=«(ta)1)2)=a)3a (mod-NJ
`This sequence of multiplications can be done using
`the systolic array of the present invention as follows. At
`the time the modulus N is chosen, the quantity R=R2
`(mod N) is determined using a conventional modular
`multiplication technique. As shown in FIG. 3,
`the
`image 5 is obtained by first using Atrubin’s array 60 to
`form the product:
`
`all =aTt2 (mod Nj=afi (mod N)
`
`Then the modular reduction circuit 70 of the present
`invention is used to obtain
`
`(tram -1 (mad m=e
`
`In FIG. 3, the Atrbuin’s array 60 and the modular
`reduction circuit 70 may be viewed as forming a single
`multiplier circuit 8|].
`FIG. 4 shows the sequence of multiplications per-
`formed by the multiplier circuit 80 to go from a to a“-i
`(mod N).
`Thus the inventive modular reduction circuit may be
`utilized to easily perform the repetitive modular multi-
`plications of the type utilized in public key crypto-
`graphic systems.
`Finally,
`the above-described embodiments of the
`invention are intended to be illustrative only. Numerous
`alternative embodiments may be devised by those
`skilled in the art without departing from the spirit and
`scape of the following claims.
`I claim:
`1. A circuit for performing a modular reduction com-
`prising
`a sequence ofconnected cells 0", i=1, .
`each of said cells Ci comprising:
`a first input for serially receiving a binary number
`Mi-l and a second input for serially receiving an
`n-bit modulus N,
`
`.
`
`.
`
`, 11
`
`6
`adding means for serially performing binary addition
`to form the binary number Mia—Mi—1+NMoi—l,
`wherein Moi-l is the least significant bit of Mi— 5,
`means for causing a one-bit shift-right operation on
`said binary number Mi, and
`an output for serially outputting said binary number
`M’.
`wherein the first input of the first cell C' in said se-
`quence receives a binary number M0, wherein the
`output of cell 0, i=1, .
`.
`.
`, 11—1,
`is the binary
`number Mi and is connected to said first input of
`the next adjacent cell, and wherein the output of
`said last cell C" in said sequence outputs a binary
`number M" which is equal to Molt-‘1 ‘ (mod N},
`where 11:2".
`2. The circuit of claim 1 wherein said circuit further
`includes subtraction means connected to said output of
`said last cell for subtracting N from M" if MHEZH.
`3. The circuit of claim 1 wherein the output of each
`cell Ci in said sequence except the last cell is connected
`to the first input of the next succeeding cell in said
`sequence.
`4. The circuit of claim I wherein said means for caus-
`ing a oneabit shift-right operation comprises means for
`insuring that the first bit of M" is a “zero".
`5. The circuit of claim 1 wherein said first input of
`said first cell C' in said sequence receives said binary
`number M0 from a systolic array for performing ordi-
`nary multiplication.
`6. A circuit for performing modular reduction com-
`prising
`a first systolic array for performing ordinary multipli-
`cation, and
`a second systolic array for performing a modular
`reduction on a product produced by said first sys-
`tolic array,
`said second systolic array comprising a sequence of
`cells C', i= 1, .
`.
`. , n, each of said cells Cl'encept the
`first cell having a first input for serially receiving a
`binary number M"—l from the preceding cell in-the
`sequence, each of said cells having a second input
`for receiving an n-bit modulus N, and each of said
`cells having serial adder means for performing the
`addition Mi—1+NM0"“1, where Moi *1 is the least
`significant bit in Mi" ', to form the binary number
`Mi, wherein the first cell in said sequence has an
`input for receiving a binary number M9 to be mad-
`ularly reduced, each of said cells has an output, and
`each of said cells, except the last in said sequence,
`has its output cemented to said first input of the
`next adjacent cell.
`7. The circuit of claim 6 wherein said binary number
`M0 is received at the input of the first cell from the first
`systolic array.
`8. The circuit of claim 6 wherein said first systolic
`array is Atrubin's array.
`9. The circuit of claim 6 wherein each cell Ci in said
`second systolic array includes means for insuring that
`the first bit of Mf is a “zero“.
`Ill. The circuit of claim 6 wherein a subtractor circuit
`is connected to an output the last cell in said second
`systolic array for subtracting N from M" if Mflézn.
`a
`c
`a
`a
`at
`
`5
`
`IO
`
`IS
`
`20
`
`25
`
`30
`
`35
`
`45
`
`50
`
`55
`
`50
`
`65
`
`(cid:51)(cid:72)(cid:87)(cid:76)(cid:87)(cid:76)(cid:82)(cid:81)(cid:72)(cid:85)(cid:3)(cid:48)(cid:76)(cid:70)(cid:85)(cid:82)(cid:86)(cid:82)(cid:73)(cid:87)(cid:3)(cid:38)(cid:82)(cid:85)(cid:83)(cid:82)(cid:85)(cid:68)(cid:87)(cid:76)(cid:82)(cid:81)(cid:3)(cid:16)(cid:3)(cid:40)(cid:91)(cid:17)(cid:3)(cid:20)(cid:19)(cid:25)(cid:21)(cid:15)(cid:3)(cid:83)(cid:17)(cid:3)(cid:26)
`Petitioner Microsoft Corporation - EX. 1062, p. 7
`
`