`IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
`WILHIIIIIIIIIII-III II, I ,w
`
`.....
`
`l
`
`" "
`
`,
`
`_
`
`I
`
`_wwmmmwmmwma
`
`JI'I'}
`
`TH1*} UNITED STATES OF AMERICAG
`4 6 v
`i?
`
`'3 4’
`‘g,
`94‘?
`UNITED STATES DEPARTMENT OF COMMERCE
`,g
`
`A ‘9
`United States Patent and Trademark Office
`‘
`
`
`
`4??
`=§2gi§
`
`I#
`
`3:
`
`_,
`
`§§§‘
`‘
`
`October 17, 2018
`
`THIS IS TO CERTIFY THAT ANNEXED IS A TRUE COPY FROM THE
`RECORDS OF THIS OFFICE OF THE FILE WRAPPER AND CONTENTS
`OF:
`
`APPLICATION NUMBER: 09/608,126
`
`FILING DATE: June 30, 2000
`PATENT NUMBER: 6,839,751
`
`ISSUE DATE: January 04, 2005
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`By Authority of the
`'fii
`Under Secretary of Commerce for Intellectual Property
`6‘3 \.
`and Director of the United States Patent and Trademark Office 3.
`.
`: I 1,:
`..
`‘2 w, :
`
`.MONTG MER
`
`Certifying Officer
`
`I’M“;
`E I Ii
`.j
`E
`
`PART (3) OF (3) PART(S)
`
`
`
`-';~_'~'3'-":""~”'.‘““"""""”a“; nl‘f 'lg‘,‘!“r "may it:
`
`“"""’"_"'
`
`:: V- "
`
`.
`
`NOAC Ex. 1018 Page 696
`
`
`
`.
`a“ “w.
`United States Patent
`[19:
`5,507,028
`Liu
`Apr. 9, 1996
`[45] Date of Patent:
`
`/
`
`lllllllllllllllllllllllllllllllllllll||I||II|l|llIlIlllllllllllllllllllllll
`
`U8005507028A
`
`[11] Patent Number:
`
`f
`
`’
`
`[54] HISTORY BASED BRANCH PREDICTION
`ACCESSED VIA A HISTORY BASED
`EARLIER INSTRUCTION ADDRESS
`
`Inventor: Lishing Liu,Pleasantville, NY.
`[75]
`[73] Assignee:
`International Business Machines
`Corporation, Armonk, NY.
`
`[21] Appl. No.: 370,342
`
`[22]
`
`Filed:
`
`Jan. 9, 1995
`
`Related US. Application Data
`
`[63]
`
`Continuation of Ser. No. 860,631, Mar. 30, 1992, aban-
`daned.
`
`................ 606F938
`......
`.
`Int. Cl.‘5 ...........t .
`[51]
`
`[52] US. Cl. ................. 395/37 , 64/2613; 364/2611;
`364/2618; 364IDIG. 1
`[58] Field of Search .......................................... 395/375
`
`[56]
`
`Referenca Cited
`U.S. PATENT DOCUMENTS
`
`5/1989 Cronse ct a]. ................... 395/375
`4,831,517
`111991 Hanatani el al.
`..
`4.984.154
`
`12/1992 Shibuya
`5.168.557
`5,175,827 12/1992 Morisads .....
`
`5.237.666
`8/1993 Suzuki et a]. ......................... 395/375
`
`Primary Examiner—Kevin J. Teska
`Assistant Examiner—Ayni Mohamed
`Attorney, Agent, or Finn—Ronald L. Drumheller
`[57]
`ABSTRACT
`
`An improved history table is disclosed in which at least
`some of the entries are stored and accessed based upon the
`address of an instruction which historically pmeds the
`branch instruction itself. The access address may be used to
`determine the location of the entry in the table and/or may
`be stored in whole or in part in the entry itself. Furthermore,
`the improved history table may be of any known type
`including but not limited to branch history table types and
`decode history table types. The entries in the improved
`history table preferably are stored and accessed by the
`address of the preceeding taken branch target and preferably
`contain a number indicative of the number of instructions
`between the access address and the address of the branch
`instruction or its target.
`
`4,763,245
`
`8/1988 EmmaetaL .. 395/375
`
`8 Claims, 9 Drawing Sheets
`
`
` INSTR.ADDR.
`
`
`BRANCH
`DUTCEIME
`
`
`
`READ ADDR.
`
`
`42
`
` HISTURY
`TABLE
`WRITE ADDR.
`
`
`.53
`
`
` CDMPARE
`MATCH
`
`
`FINAL PREDICTIUN
`RESULT 11
`
`r
`
`_‘
`
`_
`
`“VII/Ht
`
`NOAC EX. 1018 Page 697
`
`NOAC Ex. 1018 Page 697
`
`
`
`US. Patent
`
`Apr. 9, 1996
`
`Sheet 2 of 9
`
`5,507,028
`
`FIG. 2
`
`FETCH UNIT
`
`
`
`13
`
` INSTRUCTION
`
`
`
`PREDICTION
`
`11
`
`
`
`
`
`
`
`NT”Mfuwv.“-_V,_._.,.._,,..A.“m—..-..
`
`
`v«~wr—mmmu
`
`
`
` 31
`
`INSTRUCTION
`
` 32
`
`
`
`INSTRUCTION
`
`
` 33
`INSTRUCTION
`
`
`
`
`18
`
`
`
`88
`
`
`
`2
`
`20
`
`BRANCH
`PREDICTOR
`
`
`
`10
`
`BRANCH
`OUTCOME
`
`
`
`18
`
`37
`
`87
`
`.
`
`INSTRUCTION -
`
`INSTRUCTION
`ISSUE
`
`NOAC EX. 1018 Page 698
`
`NOAC Ex. 1018 Page 698
`
`
`
`US. Patent
`
`0
`
`Apr. 9, 1996
`
`Sheet 2 of 9
`
`5,507,028
`
`FIG. 2
`
`a
`
`1
`
`ADDRESS
`
`
`
`
`
`22'
`
`
`
`
`
`INSTRUCTIUN
`FETCH UNIT
`
`
`
`13
`
`PREDICTIUN
`
`11
`
`
`
`
`2
`
`
`
`80
`
`ADDRESS
`
`BRANCH
`PREDICTDR
`
`
`
`
`10
`
`
`
`
`
`
`
`BRANCH
`DUTCDME
`
`12
`
`27
`
`ADDRESS
`
`31
`
`INSTRUCTIUN
`
`32
`
`INSTRUCTIUN
`
`33
`
`INSTRUCTION
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`37
`
`‘
`
`INSTRUCTION
`
`INSTRUCTIDN
`ISSUE
`
`
`
`NOAC EX. 1018 Page 699
`
`NOAC Ex. 1018 Page 699
`
`
`
`. M.“
`
`
`
`
`
`INS TR.ADDR.
`
`
`
`MATCH
`FIELD
`
`HISTORY
`
`HISTORY
`
`
` 5
`
` READ ADDR.
`
`
`
`
`54
`
`NO
`MATCH h
`
`FINAL PREDICTION RESULT 11
`
`
`
`mailed'S'Il
`
`9661‘6'Jdv
`
`6JOEiaaqs
`
`SZO‘LOS‘S
`
`NOAC EX. 1018 Page 700
`
`NOAC Ex. 1018 Page 700
`
`
`
`US. Patent
`
`IUZ¢mm
`
`UZDUHDD
`
`
`
`>NEFMHI10.7.2).
`
`.mnn¢.m._.wZH
`
`
`734..;§.,
`
`
`rill..2.cg,‘,(xxEb:IllqhflynEluiraxiafiA5::(5:5A..Acxlr”.z»1flu§§Efir.9
`
`A
`
`w
`
`S
`
`9
`
`8
`
`,3
`
`wan?aqua
`
`mum:
`
`5.,a:_5.
`
`
`
`Ma53mm7,2958qu.22E
`
`%N?
`
`
`
`mmBun:mmizmo
`
`.mE4VmManiMEN.)
`
`VNQAGEXTIOB Page 701
`
`NOAC Ex. 1018 Page 701
`
`
`
`
`
`”MM3'
`,.‘n'rwmrw
`
`
`
`mmmsu-Wmmm—»-L”~
`
`‘~”:1‘3‘.
`
`:mun:mxtz-nrmmwm‘.hm“mung"LPsuing»~,
`
`~~:mms-nmrnxu.*x’H.~
`
`US. Patent
`
`Apr. 9, 1996
`
`Sheet sot 9
`
`5,507,028
`
`FIG. 4
`
`MATCH
`FIELD
`
`HISTORY
`
`OTHERS
`
`FIG. 6
`
`MATCH
`FIELD
`
`HISTORY
`
`OTHERS
`
`———
`
`FIG. 7
`
`MATOH
`FIELD
`
`HISTORY
`
`OTHERS
`
`-——
`
`MATCH
`
`FIELD
`
`F IG' 8
`
`HISTORY
`
`OTHERS
`
`F I G. 9
`
`MATCH
`FIELD
`
`SSA1
`
`HISTORY
`
`OTHERS
`
`ENDA2 : SSA2 -
`
`FIG. TO
`
`MATCH
`FIELD
`
`BR_ADDR
`
`HISTORY
`
`OTHERS
`
`BR_OUTCOME -
`
`NOAC EX. 1018 Page 702
`
`NOAC Ex. 1018 Page 702
`
`
`
`139.2
`
`US. Patent
`
`Apr. 9, 1996
`
`Sheet 6 of 9
`
`5,507,023
`
`
`
`ZDHFanmmm:32:
`
`.3HJDWME
`
`.mnn¢ELK)
`
`.Naeqem”.I
`
`10255
`
`auntiigflkf
`
`.mmm¢.~_l_.wzH
`
`l;L
`
`NOAC EX. 1018 Page 703
`
`NOAC Ex. 1018 Page 703
`
`
`
`
`Iiri‘k.
`
`US. Patent
`
`Apr. 9, 1996
`
`Sheet 7 of 9
`
`5,507,028
`
`IUZ¢mm
`
`wimp—bu
`
`>~=u._.wHI
`
`“mu—ms.
`
`. ¢m
`
`
`
`mm05D;mm¢mzno
`
`:uhci
`
`6.:9.5
`
`\LrIJ
`
`
`
`zquoHnumm46.2:
`
`.3hqawwm
`
`
`
`.mnncn¢mm
`
`NOAC EX. 1018 Page 704
`
`
`
`
`
`‘ :1
`
`v4LJI?Lsxavhx‘cuiiigfafidfif‘k3.):,tiriurlqwgggigififl¢‘
`
`
`
`Frk%.§i§§x$i‘3
`
`NOAC Ex. 1018 Page 704
`
`
`
`
`US. Patent
`
`7
`
`Apr. 9, 1996
`
`Sheet 8 of 9
`
`5,507,028
`
`gr
`I:
`
`
`
`F I6. 'I 3
`
`.
`
`MATCH
`FIELD
`M__ADDFI
`
`HISTORY
`OTHERS
`BFLOUTCOME -
`
`FIG. I4
`
`80
`BR TARGET
`
`INSTR ADDR
`
`81
`
`
`
`83
`
`
`
`MUX CONTROL
`
`
`
`CURRENT
`INSTRUCTION
`ADDRESS
`
`FIG. I5-
`
`80
`
`81
`
`INSTR ADDR
`
`INCR
`
`
`
`
`83
`
`BR TARGET
`
`
`
`ADDRESS
`FOR
`PREDICTION
`
`MUX CONTROL
`
`
`CURRENT
`INSTRUCTION
`ADDRESS
`
`NOACVEX. 1018 Page 705
`
`NOAC Ex. 1018 Page 705
`
`
`
`US. Patent
`
`Apr. 9, 1996
`
`Sheet 9 of 9
`
`5,507,028
`
`
`
`$99.19..
`
`
`
`
`
`(I:U:gagging-JIf!)
`
`
`
`m..03.
`
`wmmIFO
`
`
`
`N>EOFQIF>m0._.w_I
`
`l3mm"$523
`
`wimm.HNIKGZMA“(mm
`
`gL.0.“—
`
`mmmzho
`
`
`
`N>m0._.m_IF>m0hm_I
`
`I0._.<2
`
`DAN—n.
`
`3mm“gnawm<mm"NEE
`
`5(mm.
`
`NOAC EX. 1018 Page 706
`
`NOAC Ex. 1018 Page 706
`
`
`
`
`5,507,028
`
`1
`HISTORY BASED BRANCH PREDICTION
`ACCESSED VIA A HISTORY BASED
`EARLIER INSTRUCTION ADDRESS
`
`This is a continuation of application Ser. No. 07/860,631,
`filed Mar. 30, 1992, now abandoned.
`
`FIELD OF THE. INVENTION
`
`This invention generally relates to control of instruction
`flow in a computer system and more particularly to the
`prediction of outcome of branch instructions using a history
`based branch prediction table.
`
`BACKGROUND OF THE INVENTION
`
`5
`
`10
`
`15
`
`2
`outcome. If the choice proves wrong the conditionally
`initiated instructions are abandoned and the W391 mm-
`tion is fetched. The cycles devotedto the conditional instruc-
`tions are then lost as well as the cycles to fetch the correct
`target inslmction. However, the latter is often avoided in the
`prior art by prefetching the target at the time the branch is
`decoded.
`
`A more sophisticated strategy is embodied in U.S. Pat.
`No. 3,,559183 to Susscnguth, which patent is assigned to the
`assignee of the present invention. 11'1s based on the obser—
`vation that the outcome of most branches, considered indi-
`vidually, tends to repeat. In this strategy, a history table of
`taken branches is constructed, which is known as a Branch
`History Table (Bl-1T). Etch entry in the table consists of the
`address of a taken branch followed by the target address of
`the branch. The table is a hardware construct and so it has
`a predetermined size. When the table is full, making a new
`entry requires displacing an older entry. This can be accom-
`plished by a Least-Recently-Used (LRU) policy as in
`caches. When a branch'rs resolved as taken during execu-
`tion, the history information associated with the branch'rs
`inserwd into or updated'to the BET. Branch prediction and
`instruction prefetching me accomplished through constant
`search for the next taken branches in the history table. Upon
`final resolution/execution of a branch, any incorrect history
`information associated with the branch will be reset/updated
`properly. The major benefit of a BI-lT is to allow a separate
`branch processing unit to prefetch instructions into the
`instruction butter (I-Bufl’er) ahead of the instruction decode
`stage. Such instruction prefctching into the I-bufier past
`predicted takrm branches is possible due to the recording of
`target addressee fortakenbranchcs in the BET. U.S. Pat. No.
`4,679,141 to Pomerene er al, which patent is assigned to the
`assignee of the present invention, improves the BET design
`by recording more history information in a hierarchical
`manner.
`
`U.S. Pat. No. 4,477,872 to Losq et al, which patent is
`assigned to the assignee of the present invention, proposes
`a decode time branch prediction mechanism called a Decode
`History Table (DHT). The DHT mechanism improves the
`decode time static branch prediction methods of U.S. Pat.
`No. 3,325,785, to Stephens, by employing a hardware table
`to record simple histories of conditional branches. In the
`simplest form a DHT consists of a bit vector of fixed length.
`For each conditional branch instruction executed, a bit
`position in the DHT is derived through a fixed hashing
`algorithm, andrhe corresponding bitinthe DHTrccords the
`outcome of the execution, indicating whether the branch was
`taken or not taken. Similar to U.S. Pat. No. 3,325,785, the
`BET method allows overlap processing on a conditional
`basis past the decode of a conditional branch instruction if
`the branch is predicted. based on the DHT history, as not
`taken.
`
`20
`
`30
`
`35
`
`45
`
`In typical pipelined processors, the processing of each
`instruction is divided into successive stages, with each stage
`of an instruction processing being handled by a specialized
`unit in a single cycle. Each successively earlier stage in the
`pipeline of stages is ideally handling simultaneously the
`successively next instruction. However, when a conditional
`branch instruction is encountered, there are several cycles of
`delay between the decoding of the branch instruction and its
`final execution/resolution, so it is not immediately known
`which instruction will be the next successive instruction. It
`is wasteful of the computer resource, however, to wait for
`the resolution of an instruction before starting with the
`processing of a next instruction. Therefore, itis recognized
`that it is advantageous to provide a mechanism for predict-
`ing the outcome of a conditional branch instruction in
`advance of its actual execution in order to provisionally
`begin processing instructions which will need to be pro-
`cessed if the prediction is correct When the prediction is
`correct, the computer system can function without a delay in
`processing time. There is a time penalty only when a correct
`prediction cannot be attained ahead of time.
`Throughout this application,
`the following terms and
`conventions will be used and shall have the indicated
`meaning. A branch instruction tests a condition specified by
`the instruction If the condition is true, then the branch is
`taken, that is, following instruction execution begins at the
`target address specified by the branch instruction. If the
`condition is false, the branch is not taken and instruction
`execution continues with the instruction sequentially fol-
`lowing the branch instruction. There may be branches that
`are unconditionally taken all the time. Such unconditional
`branches may simply be viewed as a special form of
`branches when appropriate.
`A number of patents are directed to branch prediction
`mechanisms. For example, U.S. Pat. No. 4,370,711 to Smith
`discloses a branch predictor for predicting in advance the
`result of a conditional branch instruction in a computm'
`system. The principle upon which the system is based is that
`a conditional branch instruction is likely to be decided in the
`same way as the instruction’s most recent executions.
`A simple strategy for handling branches is to suspend
`pipeline overlap until the branch is fully completed (i.e.,
`resolved as taken or not taken). Iftaken, the target instruc-
`tion is fetched from the memory. U.S. Pat. No. 3,325,785 to
`Stephens sets forth a static branch prediction mechanism. An
`improved method of this type is to perform static branch
`prediction by making a fixed choice based on the type of
`branch and statistical experience as to whether the branch
`will be taken. When the choice indicates that the branch is
`predicted to be not taken, normal overlap processing is
`continued on a conditional basis pending the actual branch
`
`55
`
`65
`
`The common technique for the above cited branch pre-
`diction methods that are based on the dynamic ltistories of
`branches is to first record the previous outcome of branch
`instructions in a history based dymmic table and to then use
`such recorded histories for predicting the outcome of sub-
`sequently encountered branch instructions. Etch branch
`recorded in such ahistory based table is recorded with either
`implicit or explicit information about the address of the
`recorded branch instruction so that the addresses of later
`encountered instructions can be correlated against
`the
`recorded information (i.e., by using the address of the
`instruction which is potentially a taken branch instruction in
`order to access the table for historical branch information).
`In order for branches to be predicted, the history table is
`
`NOAC EX. 1018 Page 707
`
`NOAC Ex. 1018 Page 707
`
`
`
`.u-wur’fim.4mac-Wt»~‘
`
`dukvflnu
`
`5,507,028
`
`3
`checked for a relevant entry by correlating the address of the
`instruction to be predicted against the implicitly or explicitly
`recorded address information of recorded branch instruc-
`tions. In the DHT method, the bit position in the history
`vector is derived through hashing from the address of the
`conditional branch. In the BET approach. an instruction is
`predicted to be a conditional branch which is taken if there
`is a match of the instruction address with a taken branch
`address found in an entry in the BHT and the target address
`recorded in this found entry is predicted as the current
`branch target address.
`Numerous variations and improvements have been pro—
`posed in implementing aBHT. For example, in US. Pat. No.
`4,679,141 to Pomcrene et a1, a technique is described for
`recording histories by block (e.g., doubleword) addresses
`instead of by individual branch instruction addresses. This
`technique otters advantages in reducing cache fetch traflic
`and the possibility of identifying the outcome of multiple
`branches within a single block. However, through more
`complex tags at each BHT entry, the block recording tech-
`nique still conceptually operates as in conventional BHT
`methods in terms of identifying taken branch history by
`matching the addresses of the currently concerned instruc-
`tions against the recorded addresses (or more precisely
`matching a portion of each such address) of the branch
`instructions recorded in the block.
`US. Pat. No. 3,940,741 to Horikoshi et al sets forth an
`information processing device for processing instructions,
`including branches. A cache-like route memory is provided
`for storing branch target addresses of a plurality of taken
`branch instructions and the branch target instructions (code)
`themselves in corresponding relationship to the branch tar-
`get addrmses. The route memory is referenced by the target
`address of a branch instruction, and the branch target
`instruction at the corresponding branch target address is read
`out. The Horikoshi et al patent utilizes the target address of
`the branch, which is known upon decoding of the branch, to
`retrieve target
`instruction code for decode if the target
`address isrecordedas aprevious targetforatakenbranchin
`the route memory. Such a mechanism generally requires
`some delay beforethe access to the route memory due to the
`address formation for the branch target.
`In practical implementrnions for branch prediction based
`on histories, timing is ofien found to be a mitical factor.
`History table access generally involvesaddress calculations
`and slower array, lookup operations. In order to efliciently
`search constantly for potentially taken branches in BHT type
`implementations also involves complexity in the recording
`of history entries. Considering all of the tasks which need to
`be accomplished in order to make a prediction and to utilize
`it to advantage, it is desirable for practical reasons to be able
`to start the prediction process with respect to a particular
`instruction of interest as far in advance as possible and also
`to achieve the prediction as far in advance as possible.
`Neverthelem, there is no known art that ofl‘ers the capability
`of either making a prediction decision or even initiating the
`prediction decision process with respect to an instruction
`which is potentially a taken branch instruction prior to
`identifying the address of that instruction of interest. It
`would be desirable to be able to predict instruction branches
`even earlier than the point where the address of an instruc-
`tion is brow which has the potential of being ataken branch
`instruction, because it would ofi‘er an opportunity to imple-
`ment and use branch prediction with simpler and less costly
`circuits.
`
`SUMMARY OF THE INVENTION
`
`It is therefore an object of the present invention to provide
`an alternative approach to the prediction of branch outcome
`
`4
`based on histories in order to achieve an earlier prediction.
`It is also an object to initiate the process of predicting the
`outcome -of an instruction which is potentially a taken
`branch instruction prior to the time when the address of that
`instruction is known
`A further object is to predict 3 taken branch without first
`deter-ruining the address of the branch instruction.
`It is another object to provide a history based branch
`prediction table which can be maintained and accessed using
`an instruction address which historically proceeds the branch
`instruction.
`
`Another object is to record in a history based branch
`prediction table a number indicative of the number of
`instructions by which such a preeeeding instruction histori-
`cally proceeds the recorded branch instruction.
`It is a further object to record the address of such a
`proceeding instruction in a history based branch prediction
`table.
`
`10
`
`15
`
`These and further objects have been achieved by this.
`invention with an improved history table in which at least
`some of the entries are stored and accessed based upon the
`address of an instruction which historically proceeds the
`branch instruction itself. The access address may be used to
`determine the location of the entry in the table and/or may
`be stored in whole orin part in the entry itself. Furthermore,
`the improved history table may be of any known type
`including but not limited to branch history table types and
`decode history table types. The entries in the improved
`history table preferably are stored and accessed by the
`. address of the proceeding taken branch target and preferably
`contain a number indicative of the number of instructions
`between the access address and the address of the branch
`instruction or its target.
`
`30
`
`35
`
`45
`
`50
`
`55
`
`Theory ofOperation
`
`Instructions are executed in a computer according to a
`certain logical sequence. Each instruction resides in the
`memory at a specific address. Two successive instructions in
`the memory may be executed in sequential or non-sequential
`orduing. When two instructions are sequentially adjacent to
`each other the address ofthe second instruction is exactly the
`address of the first instruction incremented by the length of
`the first instruction code. Non-sequential flow of instructions
`may occur during execution by various causes, among
`which branch instruction exeartions are the most frequent.
`Instructions in a typical computer system may be classified
`into two categories: breaker and non-breaker. A non-breaker
`instruction is the type that will cause the sequentially
`following instruction to be executed next, tmless an excep-
`tion condition (c.g., divide-by—zero) occurs. A breaker
`instruction is the type that can cause natural jumps to a
`non-sequentially related instruction to be executed next.
`Taken branch instructions are typical breaker type instruc-
`tions. For the simplicity of description of the basic concept
`of the present invention only branch instructions will be
`considered for breakers.
`
`Without an exception condition, the instruction stream
`executed is a sequence of sequential blocks (S-blocks), with
`each sequential block consisting of zero or more non-
`breakers followed by a branch instruction at the end. The
`branch instruction of an S-block may be taken or not taken
`during execution. Similarly. the instruction stream executed
`is a sequence of sequential, segments (S—segments), with
`each sequential segment consisting of one or more succes-
`sive S—blocks such that only the last S-block has a taken
`
`NOAC EX. 1018 Page 708
`
`NOAC Ex. 1018 Page 708
`
`
`
`5,507,028
`
`5
`branch at the end. FIG. 1 depicts such a pattern of instruction
`flow during program execution. Only two S—segments (881
`and 882) are illustrated. SSl consists of two S—blocks, 81311
`and 8812. The branch instruction B11 at the end of 8811 is
`not taken, while the branch instruction B12 of SB12 is taken
`with the beginning of $82 as the branch target. Similarly
`SSZ consists of three suwessive S-blocks 8321, 8822 and
`SB23. The ending branches 321 and B22 for SB21 and
`.8322, respectively, are not taken. The ending branch B3 of
`81323 is taken with the beginning of another S-segment as
`the target-
`Considcring any conventional history based branch pre-
`dictiOn method, the instruction stream is constantly exam-
`ined for the next branch to be predicted Upon locating such
`a branch instruction the branch predictor uses the branch
`address (in whatever representation) to look for an appli—
`cable history in the history table. For example, when con-
`ventional BHT design is applied to the S-segment Sll in
`FIG. 1, the S—segrnent 811 is scanned through for a taken
`' branch in the history table. IfBlZ is the taken branch in the
`history table the S-segment 811 can be rather long and
`involves multiple cycles for the scan, even when block
`recording technique is used. Only upon locating the taken
`branch 312 in the history table prediction can then be can'ied
`out Such conventional approach often leads to complex
`encoding of the history table in order to locate the relevant
`taken branches in a timely manner.
`The present invention is based on the observation that the
`main purpose of conventional branch prediction methods is
`to resolve the flow of the instruction stream early in order to
`bring instructions into the instruction buffer soon enough for
`decoding. Branch addresses have been used for history table
`look-up for the natural reason that branches are normally the
`cause of non-sequential instruction flow. I have observed
`that predicting the outcome of 8 taken branch can be done
`without first locating the branch. For example, it is possible
`to create all S-segment history table (SSHT) to rword the
`flow patterns among successive S-segments. With an SSHT.
`the instruction flow prediction can be achieved without
`locating and predicting individual branches. This observa-
`tion has been further generalized into the concept of pre-
`dicting instruction flow (as dominated by taken branches)
`based on addresses of instructions prior to the conccmed
`branches. Use of such a technique for early resolution of
`branch predictions not only allows more timely resolution of
`instruction flow control but also offers flexibility for simpler
`implementations.
`I
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`10
`
`30
`
`35
`
`45
`
`50
`
`55
`
`The foregoing and other objects, aspects and advantages
`of the'invention will be better understood from the following
`detailed description of a preferred embodiment of theinven-
`tion with reference to the drawings,in which.
`FIG. 1is a block diagram showing the instruction flow
`Patterns between successive sequential blocks and sequen-
`tial segments;
`FIG 2 is a block diagram showing the structm'e of60
`Pipelining with branch prediction;
`FIG. 3is a block diagram showing the gents-a1 structure
`Of a history-based branch predictor,
`FIG. 4 shows the format of history table entries for
`prediction based on S-segments;
`FIG. 5 is a block diagram showing a structure for the
`Motor of S-segment flow:
`
`65
`
`
`
`6
`FIG. 6 shows an aim-native format of history table entries
`for S—segment flow prediction, with length information
`recorded for the first S—segment;
`FIG. 7 shows an alternative format of history table entries
`for S-segment flow prediction, with end address recorded for
`the first S-segment;
`FIG. 8 sbows an alternative format of history table entries
`for S—segment flow prediction. with length information
`recorded for the second S-segment;
`FIG. 9 shows an alternative format of history table entries
`for S-segrnent flow prediction, with end address recorded for
`the second S-segment;
`FIG. 10 shows the format of history table entries for a
`conventional DHT type branch prediction mechanism;
`FIG. 11 is a block diagram showing a branch predictor in
`which an address adder is used to form the precise branch
`address;
`
`FIG. 12 is a block diagram showing an alternative design
`of the predictor shown in FIG. 11 which bypasses the
`address adder for history table amess;
`FIG. 13 shows an alternative to the history table entry
`format shown in FIG. 10 with a more general address at the
`MATCH FIELD;
`FIG. 14 shows an alternative method of address genera-
`tion from the method illustrated in FIG. 11;
`FIG. 15 illustrates the method of carrying out branch
`prediction always based on the address of the previous
`instruction processed;
`FIG. 16 shows a generalization of the history table entry
`format shown in FIG. 8, with the successive flow between
`more than two S—segments recorded; and
`FIG. 17 shows a generalization of the history table entry
`format shown in FIG. 9, with the sucmssive flow between
`more than two S—segments recorded.
`
`DETAILED DESCRIPIION OF A PREFERRED
`EMBODIMENT OF THE INVENTION
`
`Refm'ing now to the drawings, and more particularly to
`FIG. 2.
`there is illustramd a typical computer pipeline
`structure with a branch prediction mechanism. The branch
`predictor 10 performs the branch predicfiou and branch
`history management operations. Instructions 31—37 are pro-
`cessed according to a certain logical sequence for a particu-
`lar pipeline design. Addresses 21—27 represent the corre-
`sponding addresses of instruction stream 31-37. The branch
`predictor 10 receives an input instruction address on line 20
`from address 23 in the instruction address pipeline. The
`prediction output 11 from the branch predictor 10 is nor—
`mally used to determine the subsequent instructions to be
`fetched into instruction buifer. The branch predictor 10 also
`receives from input signal line 12 informan'on that allows it
`to determine the correctness of prediction and the adjust—
`ment of history information. The issuing of an instruction 37
`for actual processing (e.g., for decode or for a later step) is
`' usually one or more cycles past the branch prediction stage
`in the pipeline.
`The function of the branch predictor 10 is highly depen-
`dent upon the particular branch prediction design employed.
`For instance. with respect to the DHT mechanism employed
`in IBM 3090 systems, the branch predictor 10 is activated
`during thedccode stage ofaninstructioninthepipeline with
`the output 11 being as simple as a single bit indicating the
`predicted outcome as taken or not taken for the instruction
`being decoded. In a more sophisticamd BH'I‘ mechanism, the
`
`NOAC EX. 1018 Page 709
`
`NOAC Ex. 1018 Page 709
`
`
`
`5,507,028
`
`7
`'ctor 10 can be activated for an instruction
`$3221 $3115 earlier than the actual. decode of the instruc-
`lion with the prediction output 11 indicating whether the
`instruction is a taken branch or not and the target Instruction
`address if the instruction is guessed as a taken branch
`Although not a necessity,
`the output 11 of the branch
`‘ctor is normally passed to an instruction fetch unit 13
`for prefetching the predicted subsequent instructions. The
`branch predictor 10 in prior art branch prediction schemes
`for early branch resolution conceptually. guesses the out-
`come Of a branch instruction 3 according to the branch
`address 23 input from line 20.
`Referring now to FIG. 3, which is a block diagram
`representation of the branch predictor 10 with altistory table
`43 milizcd for branch prediction An instruction address 40
`from a register is the input instruction address for which
`prediction steps are to be taken. For simplicity of description
`of the invention, the history table 43 may be a 1-dimensional
`table with a fixed number of entries. The selection of an
`entry in the history table 43 for a given instruction address
`40 is via proper hashing of cenain address bits, Each urtry
`in the history table 43 generally consists of two portions:
`MATCH FIELD and HISTORY. The MATCH FIELD por-
`tion is used for matching with input address bits in order to
`determine whether the history information recorded in the
`HISTORY portion of the entry is applicable. In the simplest
`casetheMATCHFIELDportioncanbenullwiththe
`address match results considered always successful. The
`address queue 42 is used to hold address indicators for
`instructions with unfinished branch prediction status, The
`left portion of each entry of the address queue 42 contains
`address identifiers that are compatible with the MATCH
`FIELD in the history table 43. The right porn'on of each
`entry contains a tag allowing the identification of the entry
`selection for the associated instruction address. The address
`queue 42 is implemnted as a first-in-first—out queue, The
`write address 52 signal is used for updating the history table
`43. Upon update to the history table 43 the left portion ofthe
`oldest entry in the address queue 42 supplies the new value
`51 fortheMKI‘CHFlELDJ‘heresultqueueMisusedfor
`queuing predictions that have not yet been verified as
`correct,
`
`The compare logic 45 is used for determining a match
`between the MATCH FIELD value 54 of the selected entry
`of the history table 43 and the con'espondence in the
`instruction address 40. The output HISTORY value 55 at the
`selected .try of the history table 43 is input to both the
`result queue 44 and logic 46. The logic 46 simply readjusts
`the HISTORY value received into a proper format. When a
`match is found at the compare logic 45, MATCH line 64 is
`raised (with value 1) and the AND logic 48 provides the
`prediction result to the 0R logic 49 via line 67. If a match
`is not found at the compare logic 45, the N0 MATCH line
`63 is raised instead and causes a default guess result to be
`passed to the 0R logic 49 via line 66. The output 11 from the
`branch predictor consists of two portions 11a and 11b. The
`output line 11a from the OR logic 49 gives the final result
`of the prediction based on the contents of the instruction
`address 40. The compare lmit 45 provides the MATCH
`signal to the output line 1117. In certain implementations the
`output line 11b can be eliminated. For example, in the IBM
`3090 implementation of a DHT, the MATCH FIELD is null
`and the MATCH signal 11b is conceptually on always and
`hence the compare lmit 45 and the output line 11b can be
`ignored. The correction logic 41 takes as input the actual
`branch outcome 12 from the branch execution unit and via
`signal line a the earlier prediction from the first-in-first—out
`
`10
`
`15
`
`35
`
`45
`
`55
`
`65
`
`8
`result from queue 44. By matching the branch prediction
`with the actual branch outcome the correction logic 41
`performs the function of verifying correctness of active
`branch predictions and triggering necessary adjustments to
`the HISTORY portion of a relevant entry in the history table
`43. The two queues 42 and 44 are controlled such that the
`history table 43 can be updated consistently with the corre-
`sponding prediction verified by the correction unit 41. In an
`implementation that allows only one access (Read or Write)
`per cycle to the history table 43, priority needs to be
`provided between concurrent accesses. For example, update
`to the history table 43 due to signal 56 from correction logic
`41 may delay the read access for a new prediction. Such
`implementation choices are not critical to the present inven-
`tion and will not be discussed in further detail. The MATCH
`signal from the compare unit. 45 is also sent to the result
`queue 44 via signal line 68. It is assumed that each entry of
`the result queue 44 has a history hit bit (EH-bit) with the
`value (0 or 1) rec