throbber
United States Patent (19)
`Liu et al.
`
`54 METHOD FORSCHEDULING A FLAG
`GENERATING INSTRUCTION AND A
`SUBSEQUENT INSTRUCTION BY
`EXECUTING THE FLAG GENERATING
`INSTRUCTION IN A MICROPROCESSOR
`
`(75)
`
`56)
`
`Inventors: Kin-Yip Liu, Millbrae; Ken
`Shoemaker, Los Altos Hills; Gary
`Hammond, Campbell; Anand Pai, San
`Jose; Krishna Yellamilli, Cupertino, all
`of Calif.
`Assignee: Intel Corporation, Santa Clara, Calif.
`Notice:
`This patent issued on a continued pros
`ecution application filed under 37 CFR
`1.53(d), and is subject to the twenty year
`patent term provisions of 35 U.S.C.
`154(a)(2).
`Appl. No.: 08/699,910
`Filed:
`Aug. 20, 1996
`Int. Cl." ............................... G06F 9/40; G06F 9/38
`U.S. Cl. .......................... 712/214; 712/215; 712/216;
`712/221; 712/223; 712/23; 712/32; 712/41;
`712/245
`Field of Search ..................................... 395/376-394,
`395/561, 564, 566, 567, 800.23, 800.41,
`800.42, 800.32; 712/200, 214-216, 220-221,
`223, 226, 245, 23, 32, 41
`References Cited
`U.S. PATENT DOCUMENTS
`
`1/1973 Batcher ................................... 708/210
`3,711,692
`4,161,784 7/1979 Cushing et al.
`... 712/222
`4,189,716 2/1980 Krambeck .....
`... 708/210
`4,371,951
`2/1983 Kort et al. .
`395/800.02
`4,393,468 7/1983 New ..............
`... 708/518
`4,418,383 11/1983 Doyle et al.
`... 710/127
`4,435,753 3/1984 Rizzi .............
`... 395/709
`4,486,848 12/1984 Kaminski ..
`... 708/210
`4,498,177 2/1985 Larson ..........
`... 708/210
`4,567,574
`1/1986 Saadé et al. ..
`... 395/709
`4,630,192 12/1986 Wassel et al. ........................... 708/210
`(List continued on next page.)
`
`addf;2X = X2 - X3
`
`prodf32 AFLAGS = X1, ALFAGS
`
`
`
`
`
`
`
`
`
`
`
`
`
`X1 {32} = carry out of bit 3
`X133} = carry out of bit 6
`X1 {34} = carry out of bit 7
`X1 {35} = carry Out of bit 14
`X1 {36} = carry Out of bit 15
`X1 {37} = carry Out of bit 30
`X1 (38} = carry out of bit 31
`
`aux = location of Auxiliary Carry in AFLAGS
`carry = location of Carry in AFLAGS
`Overf= location of Overflow in AFLAGS
`parity = locatin Parity in AFLAGS
`sign = location of Sign in AFLAGS
`Zero = location of Zero in AFLAGS
`
`
`
`US006049864A
`Patent Number:
`11
`(45) Date of Patent:
`
`6,049,864
`*Apr. 11, 2000
`
`FOREIGN PATENT DOCUMENTS
`318957A3 6/1989 European Pat. Off. .......... G06F 5/OO
`OTHER PUBLICATIONS
`MC88110 Second Generation Risc Microprocessor User's
`Manual, Motorola, pp. 1-23, date unknown.
`MC88110 Errata to Second Generation Risc Microproces
`sor User's Manual, Motorola, pp. 1-11, Sep. 22, 1992.
`MC88110 User Manual Index, Motorola, date unknown.
`MC8811OPRG Programmer's Reference Guide, Motorola,
`pp. 1-4, Dec. 1992.
`MC88110 User's Manual, Section 2 Programming Model,
`Motorola, pp. 1-20, date unknown.
`MC88110 User's Manual, Section 3 Addressing Modes and
`Instruction Set Summary, Motorola, pp. 1-32, date
`unknown.
`MC88110 User's Manual, Section 5 Graphics Unit Imple
`mentation, Motorola, pp. 1-26, date unknown.
`MC88110 User's Manual Padd Pixel Add, Section 10, pp.
`62-71, date unknown.
`(List continued on next page.)
`Primary Examiner Zarni Maung
`ASSistant Examiner Bharat Barot
`Attorney, Agent, or Firm Blakely, Sokoloff, Taylor &
`Zafman LLP
`ABSTRACT
`57
`A method for Scheduling a flag generating instruction and a
`Subsequent instruction. The Subsequent instruction has a
`data dependency on the flag generating instruction. The flag
`generating instruction is translated into first and Second
`instructions. The Subsequent instruction is translated into at
`least a third instruction. The first instruction, when executed,
`generates a result and intermediate flag generation data. The
`Second instruction, when executed, generates a plurality of
`flags. The first instruction is scheduled to execute before the
`Second and third instructions. The Second instruction is
`Scheduled to execute before the third instruction if the third
`instruction has a data dependency on the Second instruction,
`otherwise the third instruction may be Scheduled to execute
`before the Second instruction.
`
`19 Claims, 4 Drawing Sheets
`
`Translate the flag generating instruction
`into at least first and second
`instructions.
`
`Translate the subsequent instruction
`into at least a third instruction.
`
`
`
`
`
`Schedule the first instruction to
`execute.
`
`
`
`
`
`
`
`Does
`the third
`instruction have a data
`dependency On the Second
`instruction
`
`Schedule the Second
`instruction to execute
`before the third
`instruction.
`
`
`
`Schedule the third
`instruction to execute
`before the second
`instruction.
`
`LzLabs GmbH. Ex. 1007-1
`
`

`

`6,049,864
`Page 2
`
`
`
`U.S. PATENT DOCUMENTS
`
`7/1996 Ashkenzai ............................... 708/210
`5,541,865
`5,555,432 9/1996 Hinton et al. ...................... 395/800.23
`5,590,359 12/1996 Sharangpani.
`... 395/800.32
`4,656,582 4/1987 Chaitin et al. .......................... 395/707
`5,613,080 3/1997 Ray et al. ............................... 395/390
`4,656,583 4/1987 Auslander et al.
`395/709
`5,632,023 5/1997 White et al. ............................ 395/394
`4,707,800 11/1987 Montrone et al.
`708/714
`4,710,872 12/1987 Scarborough .
`395/707
`OTHER PUBLICATIONS
`4,773,007 9/1988 Kanada et al. .
`395/704
`it. H1. Milit al. .
`3,. SPARC Technology, UltraSPARC Multimedia Capabilities
`4,785,421 11/1988 Takahashi et al.
`708/205 CE Srps s RTime Video and Advanced
`4,821,181 4/1989 Iwasawa et al. ..
`34.5/500
`irri
`•s
`4,833,606 5/1989 Iwasawa et al. ........................
`7,
`Case, TriMedia Processors Aimed at Future Multimedia
`4.885,684 12/1989 Austin et al. ........................... 70/10s
`Embedded App, pp. 128 (1994).
`4,901.270 2/1990 Galbi et al. ...
`708/708
`Gwennap, HPS PA 7100LC Uses New Instructions to
`4,965,724 10/1990 Utsumi et al. .
`305,707
`Eliminate Decoder Chip, pp. 16-17 (1994).
`4,989,168
`1/1991 Kuroda et al. .
`708/216
`U.S. Patent Application entitled, “Method and Apparatus
`5,201,056 4/1993 Daniel et al. ..
`712/221
`Using Novel Operations in a Processor Inventors: Peleg et
`5,321,820 6/1994 Nakajima ...
`395/586
`al. Filed: Dec. 30, 1993.
`5,339,447 8/1994 Balmer .................................... 708/210
`U.S. Patent Application entitled, “Apparatus for Providing a
`5,404.552 4/1995 Ikenaga .............................. 395/800.41
`Two-Instruction Implementation of a Population Count
`5,418,736 5/1995 Widigen et al. ........................ 708/785
`Function” Inventors: Dreyer et al., Filed: Dec. 29, 1995.
`
`LzLabs GmbH. Ex. 1007-2
`
`

`

`U.S. Patent
`
`Apr. 11, 2000
`
`Sheet 1 of 4
`
`6,049,864
`
`
`
`
`
`Jo pºppé ale slºquinú þaufils üb?, sinôpôleinuð?pu002 SI MOJAMO
`
`
`
`
`
`
`
`
`
`TO?I: || V | Z | S || 0 ||
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`| 037 || Z |
`
`LzLabs GmbH. Ex. 1007-3
`
`

`

`
` SOVAVUlUBISJoUOReIO|=ubisSoydiyulAueAlelixnyJouoyeso]=xne
`
`({o:z}.x)sedsey=Awedsgydyuaaa
`
`SOYTIUlMO|J9AQJOUOIEIO]=J4aA0
`(0=={O:LE}LX)=O18Z'SOVTAV
`
`{Ze}iXvi{selLx=Haao'sovidy
`SOWTV‘LX=SUV1dVc&spoud
`SOVTAYulAuedJouole90]=Aued
`SOV140Ul0497JOUOIE90|=0192
`
`{getLX=Aueosoyldy
`
`SOVTVUlAqueguljeso|=Aqued
`{LE}LX=ubis'SOWTdy
`{Ze}LX=xne'SsoyTy
`
`uolonysu|Guryessuay6e|4
`EX+cX=1Xo€Iddv
`
`uononssujJuanbasqns
`{o:retex+{o:Le}ex={0
`pLUdJoInoAueo={o¢}1x2UGJonoAueo={pe}LXQIqJonoAued={ee}Lx
`GLUdJoynoAureos={9¢}LX
`OFUgJoInoAueo={7¢}1X
`L¢ug'soJnoAueo={9c}1X
`€ygJonoAueo={ze}1x
`GX+OX=Ixceippe
`
`
`
`
`
`
`
`¢Old
`
`
`
`eOl
`
`‘LEFLX
`
`U.S. Patent
`U.S. Patent
`
`Apr. 11, 2000
`
`Sheet 2 of 4
`
`6,049,864
`6,049,864
`
`
`
`‘BJM
`
`LzLabs GmbH. Ex. 1007-4
`
`LzLabs GmbH. Ex. 1007-4
`
`
`
`
`
`
`
`
`

`

`U.S. Patent
`
`Apr. 11, 2000
`
`Sheet 3 of 4
`
`6,049,864
`
`FIG. 4
`
`Format: prodf AFLAGS =X1, mode, Opsize
`Operation: switch (opsize) {
`Case 8-bit
`top bit p0S = 7
`Carry poS = 34;
`break,
`Case 16-bit
`top bit p0S = 15,
`Carry poS = 36;
`break,
`Case 32-bit:
`top bit p0S =31;
`Carry pós = 38;
`break)
`Switch (mode) {
`case ADDF:/*OO/
`AFLAGS.aux=X1{32};
`AFLAGS.Carry =X1(carry pos});
`AFLAGS.Overf=X1(carry pos}^X1(carry pos-1};
`break,
`Case SUBF:/*O1*/
`AFLAGS.aux = -(X1(32});
`AFLAGS.Carry = -(X1(carry pos});
`AFLAGS.Overf=X1(carry pos} ^X1(carry pos-1};
`break,
`Case shift1:/*10*/
`/* left shifts/rotates;
`AFLAGS.Carry =X1{top bit poS + 1};
`AFLAGS.Overf= AFLAGS.Carry
`X1 (top bit pos};
`break,
`Case shiftr/11/
`/* rightshifts/rotates */
`AFLAGS.carry =X1(top bit p0s + 1};
`AFLAGS.Overf=X1{top bit pos-1} ^X1 (top bit p0s),
`break;
`
`AFLAGS.parity = has even parity (X17:0});
`AFLAGS.sign =X1(top bit p0S};
`AFLAGS.Zero = (X1(top bit p0S:0} = = 0);
`
`LzLabs GmbH. Ex. 1007-5
`
`

`

`U.S. Patent
`
`Apr. 11, 2000
`
`Sheet 4 of 4
`
`6,049,864
`
`Translate the flag generating instruction
`into at least first and SeCOnd
`instructionS.
`
`
`
`Translate the Subsequent instruction
`into at least a third instruction.
`
`Schedule the first instruction to
`eXeCute.
`
`
`
`
`
`DOeS
`the third
`instruction have a data
`dependency On the second
`instruction
`?
`
`
`
`
`
`
`
`
`
`
`
`
`
`Schedule the SeCOnd
`instruction to eXecute
`before the third
`instruction.
`
`
`
`Schedule the third
`instruction to eXecute
`before the SeCOnd
`instruction.
`
`F.G. 5
`
`LzLabs GmbH. Ex. 1007-6
`
`

`

`6,049,864
`
`1
`METHOD FOR SCHEDULING A FLAG
`GENERATING INSTRUCTION AND A
`SUBSEQUENT INSTRUCTION BY
`EXECUTING THE FLAG GENERATING
`INSTRUCTION IN AMCROPROCESSOR
`
`2
`A flag is a bit that may indicate the condition of a
`microprocessor. A flag may also control the operation of the
`microprocessor. Flags are typically grouped together into a
`flag register. For example, a microprocessor could group
`arithmetic flags into the AFLAGS register as shown in FIG.
`1(a). Descriptions of these arithmetic flags are shown in
`FIG. 1(b).
`The arithmetic portion of the above CISC arithmetic
`instruction efficiently translates into a Single RISC instruc
`tion or at least a relatively small number of RISC instruc
`tions. However, the flag generation portion conventionally
`translates into many RISC instructions. Numerous instruc
`tions are required because of the complexity of flag genera
`tion. AS an example, the previously discussed ADD instruc
`tion could modify the contents of the sign, Zero, carry,
`auxiliary carry, parity, and Overflow flags. As a result of the
`large number of RISC instructions necessary to generate the
`above flags, the performance of the microprocessor
`decreases dramatically. Thus, translating CISC flag gener
`ating arithmetic instructions into RISC instructions is not an
`optimal Solution.
`There is a need for a more efficient method to execute flag
`generating arithmetic and logic instructions. Such a method
`should avoid the expense, complexity, and Slow Speed of
`CISC microprocessors. In addition, such a method should
`take advantage of RISC performance enhancements while
`avoiding generating a large number of RISC instructions
`that decrease performance.
`2. SUMMARY OF THE INVENTION
`The invention relates to a method, performed by a
`microprocessor, for Scheduling a flag generating instruction
`and a Subsequent instruction. The Subsequent instruction has
`a data dependency on the flag generating instruction. The
`flag generating instruction is translated into first and Second
`instructions. The Subsequent instruction is translated into at
`least a third instruction.
`The first instruction, when executed, generates a result
`and intermediate flag generation data. The Second
`instruction, when executed, generates a plurality of flags.
`The first instruction is scheduled to execute before the
`Second and third instructions. The Second instruction is
`Scheduled to execute before the third instruction if the third
`instruction has a data dependency on the Second instruction,
`otherwise the third instruction may be Scheduled to execute
`before the Second instruction.
`
`50
`
`55
`
`60
`
`65
`
`3. BRIEF DESCRIPTION OF THE DRAWINGS
`FIG. 1(a) is a diagram of an AFLAGS register.
`FIG. 1(b) is a table containing a description of flags
`contained in the AFLAGS register of FIG. 1(a).
`FIG. 2 is a diagram of a plurality of computer instructions.
`FIG. 3 is a high level diagram of one embodiment of a
`method to translate a flag generating instruction.
`FIG. 4 presents pseudo code for a prodf instruction.
`FIG. 5 is a high level diagram of a method for executing
`the flag generating and Subsequent instructions of FIG. 2.
`4. DETAILED DESCRIPTION OF SPECIFIC
`EMBODIMENTS
`Referring to FIG. 2, a plurality of instructions is shown.
`The plurality of instructions are typically CISC instructions,
`although they may be RISC instructions.
`The plurality of instructions include at least one flag
`generating instruction and a Subsequent instruction. The flag
`
`15
`
`25
`
`35
`
`40
`
`45
`
`BACKGROUND OF THE INVENTION
`This invention relates to a method for executing a flag
`generating instruction and a Subsequent instruction. This
`method is particularly useful for rapidly executing flag
`generating CISC instructions Such as Intel x86 instructions.
`When many computer languages that are in widespread
`use today were developed, memory was a very expensive
`commodity. In addition, during the development of the early
`microprocessors, instructions took a long time to fetch when
`compared with the time needed to execute. Thus, the lan
`guage developerS needed to ensure that instructions could be
`Stored compactly. As a result, they developed computer
`languages that contained a very rich Vocabulary of computer
`operations. A microprocessor that executes these operations
`is known as a complex instruction set computer (CISC).
`While CISC microprocessors can rapidly handle CISC
`instructions, Such microprocessors are often more complex,
`expensive, and often slower than a microprocessor designed
`to handle Simpler instructions.
`Today, because memory is relatively inexpensive, modem
`computer languages emphasize Speed of execution rather
`than compactness of code. Thus, more modern reduced
`instruction set computer (RISC) languages utilize a smaller
`number of Simple instructions that may be rapidly handled
`by a RISC microprocessor. One of the primary reasons that
`RISC instructions may be handled faster than CISC instruc
`tions is that RISC instructions are easier to schedule and
`execute in parallel than CISC instructions. A microprocessor
`that can execute instructions in parallel is known as a
`SuperScalar microprocessor.
`Most microprocessor instructions generate at least one
`result that is utilized by Subsequent instructions. If a Subse
`quent instruction utilizes a result produced by a previous
`instruction, the Subsequent instruction is said to have a data
`dependency on the previous instruction. Thus, if a Super
`Scalar microprocessor attempts to execute the two instruc
`tions at the same time, the execution of the Subsequent
`instruction must wait until the previous instruction has
`produced its result. Typically, any instruction must be
`delayed until all of its inputs have been produced. This
`requirement places a Significant performance limitation on
`SuperScalar microprocessors.
`The above described execution, known as out-of-order
`execution, is a conventional method to Significantly increase
`the performance of modem microprocessors. It is known in
`the art that RISC instructions can be more easily executed
`out-of-order than their CISC counterparts.
`In an effort to ensure compatibility with older CISC
`Software, the vast majority of computer programmerS Still
`use CISC instructions today. However, in an effort to take
`advantage of the RISC performance enhancements, Some
`microprocessor designers break down lengthy CISC instruc
`tions into Simpler operations that more closely resemble
`RISC instructions.
`When a CISC arithmetic instruction, Such as ADD
`X1=X2+X3 executes, it generates a result equal to X2+X3.
`X2 and X3 are known as operands. In addition, the arith
`metic instruction generates Several flags. CISC logical
`instructions such as AND, OR, and XOR, also generate
`Similar flags.
`
`LzLabs GmbH. Ex. 1007-7
`
`

`

`3
`generating instruction may be any instruction that generates
`at least one flag. Examples of flag generating arithmetic
`instructions include add, Subtract, Shift left, and shift right.
`Examples of flag generating logical instructions include
`instructions that perform the logical operations AND, OR,
`and XOR.
`The Subsequent instruction has a data dependency on the
`flag generating instruction. Thus, the Subsequent instruction
`may take the result and/or a flag generated by the flag
`generating instruction as an input.
`While only two instructions are shown in FIG. 2, it is to
`be understood that the two instructions are part of a larger
`computer program consisting of numerous instructions. In
`addition, even though the Subsequent instruction is shown as
`immediately following the flag generating instruction, there
`may be additional instructions between the flag generating
`instruction and the Subsequent instruction.
`4.1 Translating the flag generating instruction
`Referring to FIG. 3, a method to execute the instructions
`shown in FIG. 2 is presented. A flag generating instruction,
`such as a 32 bit add instruction (ADD32), is translated into
`a first instruction (addf32) and a second instruction
`(prodf32). The first and second instructions are typically
`RISC instructions, however, under Some circumstances,
`they may be CISC instructions. (In an effort to facilitate
`readability, all CISC instructions will be upper case and all
`RISC instructions will be lower case.) While the terms
`“first and “second” are used to describe the above two
`instructions, there may be additional instructions generated
`from the flag generating instruction. In addition, there may
`be additional instructions generated between the generation
`of the first and Second instructions.
`Referring again to FIG.3, ADD32X1=X2+X3 is intended
`to place the sum of the two 32 bit operands X2 and X3 into
`register X1. Further, ADD32 is intended to update some or
`all of the 6 arithmetic flags in the AFLAGS register of FIG.
`1(a). ADD32 X1=X2+X3 may be broken down into two
`RISC instructions. Addf32 generates the arithmetic result
`X2+X3. In addition, addf32 produces intermediate flag
`generation data Such as carry out bits or other bits that may
`be utilized in producing flags. Prodf32 produces the arith
`metic flags using the intermediate flag generation data
`produced by addf32.
`4.1.1 addf32
`As shown in FIG. 3, addf32 places the arithmetic result
`X2--X3 into the lower 32 bits of X1. Note that both X2 and
`X3 are 32 bit numbers. However, the registers that hold X1,
`X2, and X3 may be greater than 32 bits in length. A length
`of 64 bits may be optimal. If 64 bit registers are used, then
`the upper 32 bits may be used to Store intermediate flag
`generation data and flags while the lower 32 bits may be
`used to Store the above arithmetic result. Referring again to
`FIG. 3, the various carry out bits may be placed into the
`higher order bits of X1. Alternatively, the carry out bits may
`be placed in any memory Storage location, Such as another
`general purpose register. However, placing the arithmetic
`result, intermediate flag generation data, and even AFLAGS
`in a Single register may be useful in certain circumstances
`Such as data dependency checking.
`4.1.2 prodf32
`As shown in FIG. 3, the prodf32 instruction produces six
`arithmetic flags by performing conventional operations on
`the intermediate flag generation data. Because prodf32 takes
`the intermediate flag generation data as an input, prodf32 is
`data dependent on add32. In the embodiment shown in FIG.
`3, prodf32 sets the Auxiliary Carry flag to the value of
`X1{32}. X132 contains the value of the carry out of bit 3.
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6,049,864
`
`15
`
`25
`
`4
`Similarly, prodf32 sets the Carry flag to the value of X1:38,
`the carry out of bit 31. The Overflow flag is set to the result
`of X1{38 XOR X1{37. X1{37 contains the value of the
`carry out of bit 30. The Parity flag is set if the first 8 bits in
`X1 have even parity. Similarly, the Zero flag is set if the first
`32 bits of X1 are equal to zero.
`AS is known by those skilled in the art, flag generation
`depends on both the operand Size and the flag generating
`instruction. Thus, different versions of prodf may be needed
`for the differing operand Sizes and differing flag generating
`instructions. FIG. 4 presents C pseudo code for a prodf
`instruction that Supports operand sizes of 8, 16, and 32 bits.
`This C pseudo code also Supports Several different flag
`generating instructions.
`There may be circumstances when a designer would
`desire to keep Some of the flags unchanged. For example, in
`an effort to maintain compatibility with certain Intel x86
`CISC instructions, some instructions may be allowed to
`modify only a Subset of the flags. In this case, flags that the
`instruction should not modify could be masked. Thus, prodf
`could take an AFLAGS mask as an additional input. Prodf
`could then generate a temporary AFLAGS, check the given
`AFLAGS mask, and merge the temporary and original
`AFLAGS together to form the final AFLAGS.
`4.2 Translating the Subsequent instruction
`Referring to FIG. 5, after the flag generating instruction is
`translated into first and Second instructions, the Subsequent
`instruction is translated into one or more instructions. Typi
`cally these instructions are RISC instructions. One of these
`RISC instructions will be referred to as a third instruction.
`Recall that the Subsequent instruction is data dependent on
`the flag generating instruction. Thus, the third instruction is
`data dependent on either the first and/or the Second instruc
`tions.
`4.3 Scheduling the first instruction
`AS discussed previously, the Second instruction takes the
`result generated by the first instruction as an input. Similarly,
`the third instruction takes the result of the first instruction
`and/or a flag generated by the Second instruction as an input.
`Thus, because the Second and third instructions are data
`dependent on at least the first instruction, the first instruction
`is Scheduled to execute first by the microprocessor instruc
`tion Scheduler.
`4.4 Scheduling the Second and third instructions
`Again referring to FIG. 5, the third instruction may or may
`not take the flags generated by the Second instruction as an
`input. If the third instruction does take the flags as an input,
`then the third instruction is data dependent on the Second
`instruction. Thus, the Second instruction is Scheduled to
`execute prior to the third instruction. However, if the third
`instruction is not data dependent on the Second instruction,
`then an increase in performance may be obtained. In this
`case, the third instruction may be Scheduled to execute
`before the Second instruction.
`4.5 Storage of certain flag combinations
`When determining a conditional branch, a microprocessor
`may often use certain combinations of Some of the bits in
`AFLAGS. Some examples of these combinations are “below
`or equal” (Carry OR Zero), “less” (Sign XOR Overflow),
`“less or equal” (Sign XOR Overflow) OR Zero), and the
`complements of these cases. In an effort to increase micro
`processor performance, the values for each of these combi
`nations may be produced by prodf and Stored in the
`AFLAGS register. If the combinations are so stored, then
`many branch decisions may be made by testing only one bit.
`5. REMARKS
`One of the advantages of the invention is that it provides
`an efficient method for executing flag generating instruc
`
`LzLabs GmbH. Ex. 1007-8
`
`

`

`S
`tions. By breaking Such instructions into a Small number of
`instructions, Subsequent instructions may be expedited if
`they do not take flags as an input. In addition, by breaking
`Such instructions into only a Small number of instructions,
`the invention avoids the large number of instructions that are
`generated by conventional CISC to RISC translations.
`Another advantage of the invention is that it allows Some
`arithmetic instructions to be executed in a single clock cycle
`in a modem microprocessor. For example, it may not be
`possible to execute a CISC ADD32 instruction in a single
`clock cycle of a modern microprocessor. However, it would
`be possible to execute a RISC add32 in a single instruction
`cycle.
`Another advantage of breaking CISC flag generating
`arithmetic instructions into RISC instructions, is that it
`simplifies the design effort. It is known by those skilled in
`the art that designing two Separate modules often requires
`less effort than designing a Single complex module.
`Still another primary advantage of this invention is that it
`decreases the complexity of a microprocessor. CISC micro
`processors arc inherently complex. However, by breaking
`down CISC flag generating instructions into more manage
`able RISC instructions, CISC instructions can be efficiently
`executed using RISC performance enhancements.
`It will be appreciated by those of ordinary skill having the
`benefit of this disclosure that the illustrative embodiments
`described above are capable of numerous variations without
`departing from the Scope and Spirit of the invention. For
`example, while an ADD32 instruction has been discussed as
`an example of a flag generating instruction, other ADD
`instructions that operate on different length operands are
`also flag generating instructions. There are many other flag
`generating instructions, Such as but not limited to Subtract,
`left shift, right shift, and logical instructions. Accordingly,
`the exclusive rights Sought to be patented are as described in
`the claims below.
`What is claimed is:
`1. A method, performed by a microprocessor, for execut
`ing a flag generating instruction, the method comprising:
`(a) translating the flag generating instruction into at least
`a first instruction and a Second instruction;
`(b) the first instruction generating intermediate flaggen
`eration data upon execution of the first instruction; and
`(c) the Second instruction, utilizing the intermediate flag
`generation data, generating a plurality of flags upon
`execution of the Second instruction.
`2. The method of claim 1, further comprising Scheduling
`the first instruction for execution before the Second instruc
`tion.
`3. The method of claim 1 wherein the flag generating
`instruction comprises a CISC instruction and the first and
`Second instructions comprise RISC instructions.
`4. The method of claim 1 wherein the flag generating
`instruction comprises an Intel x86 instruction.
`5. The method of claim 1 wherein the flag generating
`instruction comprises an arithmetic instruction.
`6. The method of claim 1 wherein the flag generating
`instruction comprises a logical instruction.
`7. The method of claim 1 wherein the microprocessor
`comprises a SuperScalar out-of-order microprocessor.
`8. A method, performed by a microprocessor, for Sched
`uling a flag generating instruction and a Subsequent
`instruction, the Subsequent instruction having a data depen
`dency on the flag generating instruction, the method com
`prising:
`
`6
`(a) translating the flag generating instruction into a first
`instruction and a Second instruction, the first
`instruction, when executed, generating intermediate
`flag generation data, the Second instruction, when
`executed, utilizing the intermediate flag generation
`data, generating a plurality of flags;
`(b) translating the Subsequent instruction into at least a
`third instruction;
`(c) Scheduling the first instruction to execute before the
`Second and third instructions, and
`(d) Scheduling the Second instruction to execute before the
`third instruction if the third instruction has a data
`dependency on the Second instruction.
`9. The method of claim 8 wherein the flag generating
`instruction and the Subsequent instruction comprise CISC
`instructions and the first, Second, and third instructions
`comprise RISC instructions.
`10. The method of claim 8 wherein the flag generating
`instruction comprises an arithmetic instruction.
`11. The method of claim 8 wherein the flag generating
`instruction comprises a logical instruction.
`12. The method of claim 8 wherein the flag generating
`instruction and the Subsequent instruction comprise Intel
`x86 instructions.
`13. The method of claim 8 wherein the microprocessor is
`a SuperScalar out-of-order microprocessor.
`14. A method, performed by a microprocessor, for Sched
`uling a flag generating instruction and a Subsequent
`instruction, the Subsequent instruction having a data depen
`dency on the flag generating instruction, the method com
`prising:
`(a) translating the flag generating instruction into a first
`instruction and a Second instruction, the first
`instruction, when executed, generating intermediate
`flag generation data, the Second instruction, when
`executed, utilizing the intermediate flag generation
`data, generating a plurality of flags;
`(b) translating the Subsequent instruction into at least a
`third instruction;
`(c) Scheduling the first instruction to execute before the
`Second and third instructions,
`(d) Scheduling the Second instruction to execute before the
`third instruction if the third instruction has a data
`dependency on the Second instruction; and
`(e) otherwise, Scheduling the third instruction to execute
`before the Second instruction.
`15. The method of claim 14 wherein the flag generating
`instruction and the Subsequent instruction comprise CISC
`instructions and the first, Second, and third instructions
`comprise RISC instructions.
`16. The method of claim 14 wherein the flag generating
`instruction comprises an arithmetic instruction.
`17. The method of claim 14 wherein the flag generating
`instruction comprises a logical instruction.
`18. The method of claim 14 wherein the flag generating
`instruction and the Subsequent instruction comprise Intel
`x86 instructions.
`19. The method of claim 14 wherein the microprocessor
`is a SuperScalar out-of-order microprocessor.
`
`k
`
`k
`
`k
`
`k
`
`k
`
`6,049,864
`
`1O
`
`15
`
`25
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`LzLabs GmbH. Ex. 1007-9
`
`

`

`UNITED STATES PATENT AND TRADEMARK OFFICE
`CERTIFICATE OF CORRECTION
`PATENT NO. :6,049,864
`DATED
`: April 11, 2000
`INVENTOR(S) : Liu et al.
`It is certified that error appears in the above-identified patent and that said Letters Patent is hereby
`corrected as shown below:
`
`In column 1, at line 26, delete "modem" and insert-modern--.
`
`Attest:
`
`Signed and Sealed this
`Fifteenth Day of May, 2001
`Zaaé, f-34.
`
`NICHOLAS P. GODC
`
`Attesting Officer
`
`Acting Director of the United States Patent and Trade mark Office
`
`LzLabs GmbH. Ex. 1007-10
`
`

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket