throbber
TO THE ASSISTANT COMMISSIONER FOR PATENTS
`
`Transmitted herewith for filing under 35 U.S.C. 111 and 37 C.F.R. 1.53 is the patent application of:
`Lee, R.
`For: A METHOD AND SYSTEM FOR PERFORMING PERMUTATIONS WITH BIT PERMUTATION
`INSTRUCTIONS
`
`EL847628152US
`
`Enclosed are:
`l8l Certificate of Mailing with Express Mail Mailing Label No.
`l8l 27
`sheets of drawings.
`(cid:143) A certified copy of a
`(cid:143)
`l8l Declaration
`l8l Power of Attorney
`(cid:143)
`Information Disclosure Statement
`(cid:143) Preliminary Amendment
`l8l Not needed
`Verified Statement(s) to Establish Small Entity Status Under 37 C.F.R. 1.9 and 1.27.
`(cid:143) Other:
`
`Signed.
`
`l8l Unsigned.
`
`application.
`
`For
`
`#Filed
`
`#Allowed
`
`#Extra
`
`CLAIMS AS FILED
`
`Total Claims
`
`lndep. Claims
`
`77
`
`23
`
`-20 =
`
`- 3 =
`
`57
`
`20
`
`X
`
`X
`
`Multiple Dependent Claims (check if applicable)
`
`(cid:143)
`
`Rate
`
`$9.00
`
`$40.00
`
`Fee
`
`$513.00
`
`$800.00
`
`$0.00
`
`BASIC FEE
`
`$355.00
`
`TOT AL FILING FEE
`
`$1,668.00
`
`l8l A check in the amount of
`$1,668.00
`to cover the filing fee is enclosed.
`l8l The Commissioner is hereby authorized to charge and credit Deposit Account No.
`as described below. A duplicate copy of this sheet is enclosed.
`(cid:143) Charge the amount of
`as filing fee.
`l8l Credit any overpayment.
`l8l Charge any additional filing fees required under 37 C.F.R. 1.16 and 1.17.
`D Charge the issue fee set in 37 C.F.R. 1.18 at the mailing of the Notice of Allowance,
`pursuant to 37 C.F.R. 1.311(b).
`
`11
`
`13-2165
`
`Dated: May 7, 2001
`
`cc:
`
`Diane Dunn McKay
`Reg. No. 34,586
`Mathews, Collins, Shepherd & Gould, P.A.
`100 Thanet Circle, Suite 306
`Princeton, NJ 08540
`(609) 924-8555 Telephone
`(609) 924-3036 Facsimile
`
`P01 SMALUREV06
`
`Oracle-1002 p. 1
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104 us
`
`EXPRESS MAIL CERTIFICATE
`
`"Express Mail" Mailing Label Number: EL847628152US
`
`Express Mail Corporate Account Number: X079384
`
`Date of Deposit:
`
`May 7, 2001
`
`Type of Documents:
`
`1.
`2.
`3.
`4.
`5.
`6.
`7.
`
`Acknowledgment Post Card;
`"Express Mail" Certificate;
`Patent Application Transmittal Letter (Small Entity)
`Specification (34 pages), Claims (19 pages) & Abstract (1 page);
`27 Sheets of Drawings;
`Declaration and Power of Attorney (unsigned);
`Check in the amount of $1,668.00.
`
`I hereby certify that this paper or fee is being deposited with the United States Postal
`Service "Express Mail Post Office to Addressee" service under 3 7 CFR 1.10 on the date
`indicated above and is addressed to the Assistant Commissioner of Patents, Washington,
`D.C. 20231; BOX: PATENT APPLICATION.
`
`Diane Dunn McKay
`(Typed or printed name of person mailing paper or fee)
`
`Oracle-1002 p. 2
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`A Method and System for Performing Permutations
`With Bit Permutation Instructions
`
`4759-104
`
`5
`
`Background of the Invention
`
`1. Field of the Invention
`
`The present invention relates to a method and system for performing
`
`10
`
`permutations of a sequence of bits in a programmable processor.
`
`2. Description of the Related Art
`
`The need for secure information processing has increased with the increasing use
`
`15
`
`of the public internet and wireless communications in e-commerce, e-business and
`
`personal use. Typical use of the internet is not secure. Secure information processing
`
`typically includes authentication of users and host machines, confidentiality of messages
`
`sent over public networks, and assurances that messages, programs and data have not
`
`been maliciously changed. Conventional solutions have provided security functions by
`
`20
`
`using different security protocols employing different cryptographic algorithms, such as
`
`public key, symmetric key and hash algorithms.
`
`For encrypting large amounts of data symmetric key cryptography algorithms
`have been used, see Bruce Schneier, "Applied Cryptography", 2nd Ed., John Wiley &
`Sons, Inc., 1996. These algorithms use the same secret key to encrypt and decrypt a
`
`25
`
`given message, and encryption and decryption have the same computational complexity.
`
`In symmetric key algorithms, the cryptographic techniques of "confusion" and
`
`"diffusion" are synergistically employed. "Confusion" obscures the relationship between
`
`the plaintext ( original message) and the ciphertext ( encrypted message), for example,
`through substitution of arbitrary bits for bits in the plaintext. "Diffusion" spread the
`
`30
`
`redundancy of the plaintext over the ciphertext, for example through permutation of the
`bits of the plaintext block. Such bit-level permutations have the drawback of being slow
`
`1
`
`Oracle-1002 p. 3
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`when implemented with conventional instructions available in microprocessors and other
`
`programmable processors.
`
`Bit-level permutations are particularly difficult for processors, and have been
`
`5
`
`avoided in the design of new cryptography algorithms, where it is desired to have fast
`
`software implementations, for example in the Advanced Encryption Standard, as
`described in NIST, "Announcing Request for Candidate Algorithm Nominations for the
`
`Advanced Encryption Standard (AES)", http:/ /csrc.nist.gov/encryption/aes/pre(cid:173)
`roundl/aes 9709.htm, Since conventional microprocessors are word-oriented, performing
`bit-level permutations is difficult and tedious. Every bit has to be extracted from the
`
`source register, moved to its new location in the destination register, and combined with
`the bits that have already been moved. This requires 4 instructions per bit (mask
`
`generation, AND, SHIFT, OR), and 4n instructions to perform an arbitrary permutation
`of n bits. Conventional microprocessors, for example Precision Architecture (PA-RISC)
`have been described to provide more powerful bit-manipulation capabilities using
`EXTRACT and DEPOSIT instructions, which can essentially perform the four operations
`required for each bit in 2 instructions (EXTRACT, DEPOSIT), resulting in 2n
`instructions for any arbitrary permutation of n bits, see Ruby Lee, "Precision
`Architecture", IEEE Computer, Vol. 22, No. 1, pp. 78-91, Jan. 1989. Accordingly, an
`arbitrary 64-bit permutation could take 128 or 256 instructions on this type of
`conventional microprocessor. Pre-defined permutations with some regular patterns have
`been implemented in fewer instructions, for example, the permutations in DES, as
`described in Bruce Schneier, "Applied Cryptography", 2nd Ed., John Wiley & Sons, Inc.,
`
`1996.
`
`Conventional techniques have also used table lookup methods to implement fixed
`permutations. To achieve a fixed permutation of n input bits with one table lookup, a
`table with 2n entries is used with each entry being n bits. For a 64-bit permutation, this
`type of table lookup would use 267 bytes, which is clearly infeasible. Alternatively, the
`table can be broken up into smaller tables, and several table lookup operations could be
`
`10
`
`15
`
`20
`
`25
`
`30
`
`2
`
`Oracle-1002 p. 4
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`used. For example, a 64-bit permutation could be implemented by permuting 8
`consecutive bits at a time, then combining these 8 intermediate permutations into a final
`permutation. This method requires 8 tables, each with 256 entries, each entry being 64
`bits. Each entry has zeros in all positions, except the 8 bit positions to which the selected
`8 bits in the source are permuted. After the eight table lookups done by 8 LOAD
`instructions, the results are combined with 7 OR instructions to get the final permutation.
`In addition, 8 instructions are needed to extract the index for the LOAD instruction, for a
`total of 23 instructions. The memory requirement is 8*256*8=16 kilobytes for eight
`tables. Although 23 instructions is less than the 128 or 256 instructions used in the
`previous method, the actual execution time can be much longer due to cache miss
`penalties or memory access latencies. For example, if half of the 8 Load instructions
`miss in the cache, and each cache miss takes 50 cycles to fetch the missing cache line
`from main memory, the actual execution time is more than 4*50=200 cycles.
`Accordingly, this method can be longer than the previously described 128 cycles using
`EXTRACT and DEPOSIT. This method also has the drawback of a memory requirement
`of 16 kilobytes for the tables.
`
`5
`
`10
`
`15
`
`20
`
`Permutations are a requirement for fast processing of digital multimedia
`information, using subword-parallel instructions, more commonly known as multimedia
`instructions, as described in Ruby Lee, "Accelerating Multimedia with Enhanced Micro(cid:173)
`processors", IEEE Micro, Vol. 15, No. 2, pp.22-32, April 1995, and Ruby Lee, "Subword
`Parallelism in MAX-2", IEEE Micro, Vol. 16, No. 4, pp.51-59, August 1996. The MAX-
`2 general-purpose PERMUTE instructions can do any permutation, with and without
`repetitions, of the subwords packed in a 64-bit register. However, it is only defined for
`16-bit subwords. MIX and MUX instructions have been implemented in the IA-64
`architectures, which are extensions to the MIX and PERMUTE instructions ofMAX-2,
`see Intel Corporation, "IA-64 Application Developer's Architecture Guide", Intel
`Corporation, May, 1999. The IA-64 uses MUX instruction, which is a fully general
`permute instruction for 16-bit subwords, with five new permute byte variants. A
`30 VPERM instruction has been used in an AltiVec extension to the Power PC™ available
`
`25
`
`3
`
`Oracle-1002 p. 5
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`from IBM Corporation, Armonk, NY, see Motorola Corporation, "'AltiVec Extensions to
`PowerPC' Instruction Set Architecture Specification", Motorola Corporation, May 1998.
`The Altivec VPERM instruction extends the general permutation capabilities ofMAX-2's
`PERMUTE instruction to 8-bit subwords selected from two 128-bit source registers, into
`a single 128-bit destination register. Since there are 32 such subwords from which 16 are
`selected, this requires 16*lg32=80 bits for specifying the desired permutation. This
`means that VPERM has to use another 128-bit register to hold the permutation control
`bits, making it a very expensive instruction with three source registers and one
`destination register, all 128 bits wide.
`
`It is desirable to provide significantly faster and more economical ways to
`perform arbitrary permutations of n bits, without any need for table storage, which can be
`used for encrypting large amounts of data for confidentiality or privacy.
`
`5
`
`10
`
`15
`
`SUMMARY OF THE INVENTION
`
`;,
`
`The present invention provides permutation instructions which can be used in
`software executed in a programmable processor for solving permutation problems in
`cryptography, multimedia and other applications. For fast cryptography, bit-level
`permutations are used, whereas for multimedia, permutations on subwords of typically 8
`bits or 16 bits are used. Permutation instructions of the present invention can be used to
`provide any arbitrary permutation of sixty-four 1-bit subwords in a 64-bit processor, i.e.,
`a processor with 64-bit words, registers and datapaths, for use in fast cryptography. The
`permutation instructions of the present invention can also be used for permuting
`subwords greater than 1 bit in size, for use in fast multimedia processing. For example,
`in addition to being able to permute sixty-four 1-bit subwords in a register, the
`permutation instructions and underlying functional unit can permute thirty-two 2-bit
`subwords, sixteen 4-bit subwords, eight 8-bit subwords, four 16-bit subwords, or two 32-
`bit subwords. The permutation instructions of the present invention can be added as new
`instructions to the Instruction Set Architecture of a conventional microprocessor, or they
`
`20
`
`25
`
`30
`
`4
`
`Oracle-1002 p. 6
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`can be used in the design of new processors or coprocessors to be efficient for both
`cryptography and multimedia software.
`
`5
`
`10
`
`15
`
`20
`
`25
`
`A PPERM instruction is defined to perform permutations by a sequence of
`instructions with each sequence specifying the position in the source for each bit in the
`destination. In the PPERM instruction bits in the destination register that change are
`updated and bits in the destination register that do not change are set to zero.
`Alternatively, a PPERM3R instruction is defined to perform permutations. The
`PPERM3R instruction is similar to the PPERM instruction except that the bits from the
`destination register which do not change are copied unchanged, rather than set to zero as
`in PPERM. Accordingly, the PPERM3R instruction uses three source registers because
`the destination register is also a source register since the unchanged bits are held in the
`destination register. For every one of n bits to be changed in the final permutation, 1 gn
`bits can be used in the PPERM instruction or the PPERM3R instruction to specify which
`bit in the source register should replace the bit to be changed in the destination register.
`
`In an alternate embodiment, a GRP instruction is defined to perform
`permutations. The GRP instruction divides the initial sequence in the source register into
`two groups depending on configuration bits. The first group is concatenated with the
`second group to form the result of one GRP instruction, which is also an intermediate bit
`sequence toward the final permutation. The total number of GRP instructions for a
`permutation of n bits is up to lgn.
`
`In an embodiment of the present invention, multibit subwords are permuted with
`the GRP instruction. In a further embodiment of the invention, the method and system
`are scaled for performing permutations of 2n bits in which subwords are packed into two
`or more registers. In this embodiment, at most 2lgn+4 instructions are used to permute
`2n bits using n-bit words.
`
`30
`
`For a better understanding of the present invention, reference may be made to the
`accompanying drawings.
`
`5
`
`~
`
`Oracle-1002 p. 7
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`Brief Description of the Drawings
`
`5
`
`10
`
`15
`
`Fig. 1 is a schematic diagram of a system for implementing permutation instructions in
`accordance with an embodiment of the present invention.
`Fig. 2 is a flow diagram of a method for determining a permutation instruction
`sequence to achieve a desired permutation in accordance with an embodiment of the present
`invention.
`Fig. 3A is a schematic diagram of operation of a PPERM instruction.
`Fig. 3B is a schematic diagram of an example extracting 8 bits from multiple registers
`with PPERM instructions with "otherreg" bit.
`Fig. 4A illustrates a schematic diagram of an alternate system with three source
`registers.
`Fig. 4B is a schematic diagram of a circuit for implementing a 64-bit PPERM
`instruction.
`Fig. 5 is a schematic diagram of an operation of a GRP instruction.
`Fig. 6 is a schematic diagram of a method for determining a sequence of GRP
`instruction and control bits for the GRP instructions.
`Fig. 7 is an example of determining a GRP instruction sequence for an 8-bit
`permutation.
`Fig. SA is a schematic diagram of a unit for serial implementation of a GRP
`operation with one control signal.
`Fig. 8B is a circuit diagram for a unit for a GRP operation with one control signal.
`Fig. 8C is a schematic diagram of an alternate unit for serial implementation of a
`GRP operation with two control signals.
`Fig. SD is a circuit diagram for a unit for a GRP operation with two control
`signals.
`Fig. 9A is an example of a circuit for serial implementation of a first step of a
`30 GRP operation using half of a GRP function unit.
`Fig. 9B is an example of a whole 8-bit GRP function unit.
`
`20
`
`25
`
`6
`
`Oracle-1002 p. 8
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`Fig. 10 is a circuit diagram of a serial implementation of a GRP operation.
`Fig. 11 is a schematic diagram of principle for a parallel scheme for a GRP
`
`operation.
`Fig. 12 is a schematic diagram of a unit for parallel implementation of a GRP
`operation.
`Fig. 13 is a circuit diagram for extracting z bits in an 8-bit group by combining z
`bits in two 4-bit groups.
`Fig. 14 is a circuit diagram of addition of two one-hot encoded numbers.
`Fig. 15 is a circuit diagram of a circuit to generate a final result of an n bit GRP
`operation, wherein n=4.
`Fig. 16 is a schematic diagram of a parallel GRP implementation.
`Fig. 17 is a schematic diagram of a module for generating select signals.
`Fig. 18A is a flow diagram of a method for 2n-bit permutation in accordance with
`an embodiment of the present invention.
`Fig. 18B is a schematic diagram of the method shown in 18A.
`Fig. 19 is a graph of the number of instructions for encryption and key generation
`in DES.
`Fig. 20 is a circuit diagram for extracting z bits in an 8-bit group by combining z
`bits in two 4-bit groups and making non-z bits zero.
`Fig. 21 is a schematic diagram of an embodiment of a parallel implementation of
`8-bit GRP operation.
`Fig. 22 is a schematic diagram of an embodiment of a parallel implementation of
`64-bit GRP operation.
`Fig. 23 is a schematic diagram of an embodiment of a module for generating
`select signals.
`
`5
`
`10
`
`15
`
`20
`
`25
`
`Detailed Description
`
`Reference will now be made in greater detail to a preferred embodiment of the
`invention, an example of which is illustrated in the accompanying drawings. Wherever
`
`30
`
`=
`
`7
`
`Oracle-1002 p. 9
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`possible, the same reference numerals will be used throughout the drawings and the
`description to refer to the same or like parts.
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`Fig. 1 illustrates a schematic diagram of a system 10 for implementing efficient
`permutation instructions in accordance with the teachings of the present invention. Register
`file 12 includes source register 1 la, source register 11 b and destination register l lc. System
`10 can provide bit-level permutations of all n bits of any register in register file 12. The same
`solution can be applied to different subword sizes of i bits, for i=0, 1, 2, ... , m, where n=2m
`bits. For a fixed word size of n bits, and 1-bit subwords, there are n subwords to be permuted.
`Source register values to be permuted 13 from register l la and configuration bits 15 from
`source register 11 b are applied over data paths to permutation functional unit 14. Permutation
`function unit 14 generates permutation result 16. Permutation result 16 can be an intermediate
`result if additional permutations are performed by permutation functional unit 14. For other
`instructions, arithmetic logic unit (ALU) 17 and shifter 18 receive source register values 13
`from source register 1 la and source register values 15 from source register I lb and generate
`respective ALU result 20 and shifter result 21 over a data path to destination register I le.
`System 10 can be implemented in any programmable processor, for example, a conventional
`microprocessor, digital signal processor (DSP), cryptographic processor, multimedia processor,
`media processor, programmable system-on-a-chip (SOC), and can be used in developing
`processors or coprocessors for providing cryptography, multimedia and other operations.
`
`Fig. 2 is a flow diagram of a method of determining permutation instruction sequences
`for permutations 22. The determined permutation instruction sequences can be executed in
`permutation functional unit 14. In block 23, bit positions in a source sequence of bits are
`defined for a group of bits in a destination register. In block 24, a permutation instruction is
`determined with the bit positions to assemble bits from the source sequence of bits. In block
`25, the permutation instruction is performed. The assembled bits are inserted into a destination
`register as determined by the bit positions. Blocks 23-25 can be conditionally repeated for
`every non-overlapping group of bits in the destination, in block 26. After the final permutation
`is determined, the desired permutation of the source register is determined in block
`27.Alternatively, all block 23 can be performed for all bits in the destination register and block
`
`8
`
`Oracle-1002 p. 10
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`24 is performed for all bit positions determined in block 23. Thereafter when the permutation
`needs to be performed, block 25 is performed. All permutation instructions can be performed
`in block 25 in parallel.
`
`4759-104
`
`A PPERM instruction can be used as the permutation instruction described above
`for dynamically specified permutations of n subwords. Each PPERM instruction defines
`a subset of bits which subsets can be combined to define the desired permutation. The
`source positions for k bits can be specified with one instruction. PPERM instructions can
`be defined as follows:
`
`5
`
`10
`
`PPERM,x
`
`Rl, R2, R3
`
`wherein Rl and R2 are the source registers and R3 is a destination register. Rl contains
`the bits to be permuted. R2 contains configuration bits. x specifies which k bits in R3
`15 will change. In R3, only k bits specified by x are updated, the other bits are set to zero.
`klgn bits in R2 can be used to specify where to extract the k consecutive bits to be
`changed in R3.
`
`20
`
`25
`
`Fig. 3A illustrates a schematic diagram of an example operation of the PPERM
`instruction. The PPERM instruction is PPERM, 1 Rl, R2, R3 wherein R2 is
`0x020E160820252C33 in hexadecimal notation. This is the same as the decimal values
`shown in R2 as (2, 14, 22, 8, 32, 37, 44, 51). Configuration bits ofregister R2 can be
`applied with source bits to be permuted of register RI to a 64-8 crossbar 30. 64-8
`crossbar 30 assembles bits from RI according to configuration bits ofR2 into assembled
`bits 32. Assembled bits 32 are inserted in R3 at byte I as determined by x. In this
`embodiment, 8 bits are permuted each time, and 56 bits are set to zero.
`In order to store the position information in one register, the following inequality
`should hold
`klgn::,; n
`
`(1)
`
`30
`
`Therefore,
`
`9
`
`Oracle-1002 p. 11
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`ks..!!:_
`lgn
`
`4759-104
`
`(2)
`
`Approximately nllgn bits can be specified with one instruction. In total, n/k
`PPERM instructions which is approximately equivalent to lg n PPERM instructions are
`used for an n-bit permutation. For example, when n=64, k=8 is selected. Eight PPERM
`instructions for a 64-bit permutation are used, and seven OR instructions to merge these
`results to get the desired permutation. For every one of the k bits to be copied in the final
`permutation, lgn bits are used to specify which bit in the source register should be copied.
`
`The PPERM instruction is scalable to multiple n bits wherein subwords are packed in
`more than one register. To allow PPERM to permute bits from more than one source register,
`an extra bit (denoted "otherreg") is used to select each bit in the source register. Accordingly,
`different PPERM instructions can pick bits from more than one register. In this embodiment,
`for n=64 bits, each index into the source register is (lgn+ 1 )=7 bits. If the "otherreg" bit=0, then
`the remaining 6-bit index selects a bit in the source register to place in the destination register,
`as described above. If the "otherreg" bit=l, the corresponding bit in the destination register is
`forced to zero. The pseudo code for the operation performed by PPERM instructions on 64-bit
`architecture is shown in Table 1.
`
`Instruction
`PPERM,x Rl, R2, R3
`
`Pseudo code
`
`O;
`R3 [0 .. n-1]
`<: k; i ++}
`0; i
`for (i
`otherreg R2[i*(lg(n)+l)];
`j = R2[i*(lg(n}+l}+l .. ( (i+l) * (lg(n} +1} -1];
`0)
`if (otherreg
`R3[x*k+i]= Rl[j];
`
`Table 1
`
`To permute 2n bits, two source registers must be used, and two destination registers are
`produced. For each destination register, 8 PPERM instructions are used on each source
`register, requiring a total of 16 PPERM instructions and 15 OR instructions to combine the
`results into one destination register. The same must be repeated to produce the other
`destination register. Hence, a total of 2(16+ 15)=62 instructions are needed to permute 2n bits.
`
`5
`
`10
`
`15
`
`20
`
`25
`
`10
`
`Oracle-1002 p. 12
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`The PPERM instruction with the "otherreg" bit can permute more than 2n bits stored in
`multiple registers. In Fig 3B, the 8 bits to be collected in register Rl can come from 5 different
`source registers RI 1, Rl2, R13, Rl4 and R15. In this example the bits collected are: bit 2 from
`RI 1, bit 14 from R12, bit 22 from R13, bit 8 from R14, and bit 32, bit 37, bit 44 and bit 51
`from R15. Registers R21, R22, R23, R24 and R25 are used for storing configuration bits. 8
`configuration bits are used to select one data bit from one ofregisters RI 1, R12, R13, R14 or
`Rl 5. The lower 6 bits are the position of selected bits in source registers, shown as the lower
`rwo of numbers in the configuration registers in Fig. 3B. The most significant bit is the
`"otherreg" bit, shown as the upper row of numbers in Fig. 3B. The PPERM instructions shown
`in Table 1.1 can be used to collect the desired 8 bits from the 5 data registers RI 1, R12, Rl3,
`R14 and RlS. In each of instructions 1, 2, 3 and 4, only one bit is extracted to RI, R2, R3, and
`R4 because only one index in each of the configuration bits has a O "otherreg" bit. For
`example, in R21, only 02 has O in "otherreg" bit. In R22, only OE has O in "otherreg" bit.
`Instruction 5 puts 4 bits in RS because 4 indices have O in their "otherreg" bit. Thereafter, the
`desired 8 bits are merged in Rl with OR instructions.
`
`1:
`2:
`3:
`4:
`5:
`6:
`7:
`8:
`9:
`
`PPERM,2 Rll, R21, Rl
`R12, R22, R2
`PPERM,2
`R13, R23, R3
`PPERM,2
`R14, R24, R4
`PPERM,2
`R15, R25, RS
`PPERM,2
`Rl, R2, Rl
`OR
`R3, R4, R3
`OR
`Rl, RS, Rl
`OR
`Rl, R3, Rl
`OR
`
`0x02 OE 16 08 20 25 2C
`;R21=0x02 80 80 80 80 80 80
`;R22=0x80 OE 80 80 80 80 80
`;R23=0x80 80 16 80 80 80 80
`;R24=0x80 80 80 08 80 80 80
`;R2S=0x80 80 80 80 20 25 2C
`
`33
`80
`80
`80
`80
`33
`
`Table 1.1
`
`In an alternate embodiment, the number of configuration registers are reduced. An
`additional parameter can be used in the PPERM instruction such that the PPERM instruction
`can be defined as:
`
`11
`
`~
`
`5
`
`10
`
`15
`
`20
`
`25
`
`Oracle-1002 p. 13
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`PPERM, x, regid
`
`Rl, R2, R3
`
`wherein Rl and R2 are the source registers and R3 is a destination register. Rl
`contains a subset of the bits to be permuted. x specifies which k bits in R3 are changed
`by copying bits from the source register. Regid specifies which subset of bits are stored
`in Rl. The configuration register R2 contains the index of the bit to be copied, and a
`srcid field, for each of the k bits. In R3, a bit is copied if it is one of the k bits specified
`by x and its "srcid" is equal to the "regid" encoded in the instruction. Otherwise, this bit
`in R3 is set to zero. "regid" and "srcid" can be any reasonable size, but both "regid" and
`"scrid" must contain the same number of bits. If regid and srcid have m bits, k(lgn+m)
`bits in R2 are used to specify where to extract the k bits and from which register. If m=O,
`the 11PPERM,x,regid" instruction is reduced back to the above-described "PPERM,x"
`instruction.
`
`For example, the PPERM instructions shown in Table 1.2 can be used for reducing the
`number of configuration registers used in the previous example, shown in Table 1.1. 8 bits are
`used to specify the location for each selected bit. The lower 6 bits are the bit position in the
`source register and higher 2 bits are "srcid". 2 configuration registers are used. Instruction 1
`grabs one bit because only 02' s srcid 0 matches the regid=0 of instruction 1. Instruction 2
`grabs one bit because only 4E's srcid 1 matches the regid=l of instruction 2. Instruction 3
`grabs one bit because only D6's srcid 3 matches the regid=3 of instruction 3. Instruction 4
`grabs one bit because only 48's srcid 1 matches the regid=l of instruction 4. Notice that
`instruction 4 uses a different configuration register 25 than the configuration register 21 used
`by the first 3 instructions. This allows more than 4 source registers to be used to supply bits to
`be permuted. Instruction 5 grabs 4 bits because the srcid 0 of bit 20, bit 25, bit 2C and bit 33
`all match the regid=0 of instruction 5.
`
`In this embodiment, only one configuration register is needed, if the data bits to be
`permuted are stored in at most 4 source registers. This is because 8 data bits are permuted in
`one PPERM instruction, each data bit requiring 6 configuration bits to specify its positional
`location in a source register, so only 2 configuration bits are left to specify a source register. If
`
`12
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`~
`
`Oracle-1002 p. 14
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`two configuration registers are used, then the data bits to be permuted can be stored in at most
`
`8 registers, and so forth.
`
`4759-104
`
`1:
`
`PPERM,2,0
`
`Rll, R21, Rl
`
`2:
`
`PPERM,2,1
`
`R12, R21, R2
`
`3:
`
`PPERM,2,3
`
`R13, R21, R3
`
`4:
`
`PPERM,2,l
`
`R14, R25, R4
`
`PPERM,2,0
`
`RlS, R25, RS
`
`OR
`OR
`OR
`OR
`
`Rl, R2, Rl
`R3, R4, R3
`Rl, RS, Rl
`Rl, R3, Rl
`
`5:
`
`6:
`7:
`8:
`9:
`
`5
`
`0x02 OE 16 08 20 25 2C 33
`
`;R21=0x02 4E D6 80 80 80 80 80
`;srcid:0 1
`2
`2
`2
`2
`2
`3
`
`;R21=0X02 4E D6 80 80 80 80 80
`l
`2
`2
`2
`2
`;srcid:0
`3
`2
`
`;R21=0X02 4E D6 80 80 80 80 80
`3
`2
`;srcid:0 1
`2
`2
`2
`2
`
`;R25=0x80 80 80 48 20 25 2C 33
`;srcid:2
`2
`2
`0
`0
`1
`0
`0
`
`;R25=0X80 80 80 48 20 25 2C 33
`1
`0
`;srcid:2 2
`0
`0
`0
`2
`
`Table 1.2
`
`Fig. 4A illustrates a schematic diagram of an alternate embodiment of system 100 with
`
`3 source registers rather than two. Register file 112 includes source register 11 la, source
`
`register 11 lb, source register 11 ld and destination register 11 lc. System 100 can provide bit-
`
`10
`
`level permutations of all n bits of any register in register file 112. Source register values to be
`
`permuted 113 from register 11 la and configuration bits 115 from source register 11 lb and
`
`intermediate result 122 from source register 11 ld are applied over data paths to permutation
`
`functional unit 114. Permutation function unit 114 generates permutation result 116 which is
`
`written back to destination register 11 lc. PPERM3R instructions are defined on system 100 to
`
`15
`
`perform any n-bit permutation.
`
`The PPERM3R instruction can be defined as follows:
`
`PPERM3R,x RI, R2, R3
`
`RI and R2 are source registers, and R3 is a destination register. Rl contains the bits to be
`
`permuted. R2 contains the configuration bits. x specifies which k bits in R3 will change.
`
`20
`
`In R3, only k bits specified by x are updated, the other bits are copied from R3
`
`13
`
`Oracle-1002 p. 15
`Oracle v. Teleputers
`IPR2021-00078
`
`

`

`4759-104
`
`unchanged. The R3 destination register must also be a source register because the
`unchanged bits must be copied from R3 (used as a source register) to R3 (used as a
`destination register). The PPERM3R instruction is similar to the PPPERM instruction
`described above, except in the PPERM3R instructions three source registers are used.
`
`PPERM3R does not use the OR instructions used to accumulate the intermediate
`results produced by each PPERM instruction. For example, if 8 PPERM instructions are
`performed to permute 64 bits, then 7 OR instructions are used to accumulate the final
`permuted result, as described previously. To achieve the same permutation, 8
`PPERM3R instructions are used, since the partial result can be accumulated with each
`PPERM3R instruction. Accordingly, system 100 for PPERM3R requires 3 source
`registers, whereas system 10 for PPERM3 requires only 2 source registers.
`
`The following codes in Table 2 give an example of PPERM3R instruction which
`can be used to do an initial permutation in the data encryption standard (DES). All
`registers are 64 bits in width. Rl is the source and R2 is the target register. RIO through
`Rl 7 are registers containing permutation configuration bits. 6 of 8 bits are used to
`represent the position of each bit in the source register. Each PPERM3R instruction
`produces an intermediate state with 8 bits permuted. 8 instructions are required to
`permute all 64 bits. For example, the first byte in Rl0, Ox39 (in hexadecimal notation),
`indicates that the first bit in the target register R2 is bit 57 (in decimal notation) in the
`source register Rl.
`
`PPERM3R,O Rl, RlO, R2
`PPERM3R,1 Rl, Rll, R2
`PPERM3R,2 Rl, Rl2, R2
`PPERM3R,3 Rl, Rl3, R2
`PPERM3R,4 Rl, R14, R2
`PPERM3R,5 Rl, R15, R2
`PPERM3R,6 Rl, Rl6, R2
`PPERM3R,7 Rl, R17, R2
`
`Ox3931292119110901
`RlO
`R

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket