`
`filelog-depot-r400-devel-parts_lib-src-sp.txt
`Change 11107 on 2001/12/03 by pmitchel@pmitchel_r400_win_marlboro
`
`my block dirs to gfx
`
`Change 10478 on 2001/11/21 by askende@andi_r400
`
`further update of the I/O definition
`
`Change 9918 on 2001/11/14 by askende@andi_r400
`
`first time check-in
`
`Change 8480 on 2001/10/25 by askende@andi_r400
`
`inserted into source control by Andi S.
`
`Change 6887 on 2001/09/25 by askende@andi_r400_devel
`
`more changes
`
`Change 6810 on 2001/09/21 by askende@andi_r400_devel
`
`newly added files
`
`Change 5440 on 2001/08/16 by askende@andi_r400_devel
`
`adding source code into source control
`
`Change 5002 on 2001/08/02 by pmitchel@pmitchel_test_client
`
`directory creation
`
`Page 1
`
`ATI 2107
`LG v. ATI
`IPR2015-00325
`
`ATI Ex. 2112
`IPR2023-00922
`Page 1 of 638
`
`
`
`
`//depot/r400/devel/parts_lib/src/gfx/sq/ais/sq_alu_instr_seq.v
`#83 change 132649 edit on 2003/11/18 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- alu_instr_seq timing fixes for constant store read: first the register stage
`on the offset was moved after the sum2 adder; then the initdonebits signal
`was changed from a combinational ACS state machine output to a registered
`one-bit state machine output to help the path to the new sum2 register
`- thread buff status read timing fix - moved the status read back one cycle by
`sending the unregistered, rotated request vector to the arbiter and registering
`the winner out of the arbiter; the output of the status read mux was
`then registered
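
The status read fix in the entry above mentions a rotated request vector feeding an arbiter. The log does not describe the arbiter's internals, so the following is only a rough behavioral sketch of one common technique: rotate the requests by a pointer, pick the lowest set bit, then map the index back. The function names, the 4-bit width used in the usage note, and the lowest-bit-wins rule are all assumptions made for illustration, not taken from the RTL.

```python
def rotate_right(vec, amount, width):
    """Rotate a width-bit integer right by amount bits."""
    amount %= width
    mask = (1 << width) - 1
    return ((vec >> amount) | (vec << (width - amount))) & mask


def round_robin_pick(requests, pointer, width):
    """Return the index of the first requester at or after pointer.

    requests is a bit vector (bit i set means requester i is asking).
    Returns None when nobody is requesting. Hypothetical helper, not
    the arbiter from the changelog.
    """
    rotated = rotate_right(requests, pointer, width)
    if rotated == 0:
        return None
    # Lowest set bit of the rotated vector is the winner's offset.
    offset = (rotated & -rotated).bit_length() - 1
    return (offset + pointer) % width
```

With requesters 0 and 3 active (`0b1001`) and the pointer at 1, the search starts at requester 1 and wraps forward, so requester 3 wins; with the pointer at 0, requester 0 wins.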
`
`#82 change 130763 edit on 2003/11/07 by llefebvr@llefebvr_r400_linux_marlboro
`(ktext)
`
`Reverting timing fix that broke r400sq_const_index_04.cpp test.
`
`#81 change 129723 edit on 2003/11/01 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- fixed pix ctl output buffer overwrite bug
`- backed timing fix out of status reg and pix thread buff
`
`#80 change 129066 edit on 2003/10/28 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- added vtx input optimization for autocount on and continued off
`- fixed initialization problem for vtx autocount
`- made pix thread buff timing fixes: reduced load on status read
`data bit 19, which is the event bit, and also tried to reduce
`in the status register the load on pop_thread (part of the same path)
`- backed out a timing fix in alu_instr_seq that was causing a mova
`test to fail
`- fixed the AUTOCOUNTSIZE definition
`
`#79 change 128209 edit on 2003/10/23 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- timing fixes for constant store read address
`
`#78 change 126234 edit on 2003/10/10 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- added export arbiter module that will limit the number of color buffer export
`threads to one every 4 clocks
`- hooked up the export blocker outputs and commented out the previous export
`blocking code
`- added export alloc arbiter inputs to expallocctl module so that the bufavail
`counter will be updated by the export allocs
`- added logic to support the export arbiter to the vertex and pixel thread buffers
`- added logic to support the export arbiter to the thread arbiter
`- separated the export alloc request out of the alu request logic in the status
`register, and added an output for the export alloc request
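
The entry above says the export arbiter limits color buffer export threads to one every 4 clocks. As a behavioral illustration only, the sketch below models that kind of rate limit with a cooldown counter. The class name, the `clock` method, and the choice to grant immediately when idle are assumptions; the log does not say how the module is implemented.

```python
class ExportRateLimiter:
    """Grant at most one request per `interval` clocks (hypothetical model)."""

    def __init__(self, interval=4):
        self.interval = interval
        self.cooldown = 0  # clocks left before another grant is allowed

    def clock(self, request):
        """Advance one clock and return True if the request is granted."""
        granted = bool(request) and self.cooldown == 0
        if granted:
            # Block the next interval - 1 clocks after a grant.
            self.cooldown = self.interval - 1
        elif self.cooldown > 0:
            self.cooldown -= 1
        return granted
```

If a request is held high on every clock, this model grants on clocks 0, 4, 8 and so on, which matches the "one every 4 clocks" wording.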
`
`#77 change 122520 edit on 2003/09/22 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`timing fixes - added registers for vs and ps base and size after the
`context register read mux
`
`#76 change 118589 edit on 2003/08/28 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- fix for loop index clamping and constant address generation (both index and offset
`relative)
`- changed the connection of the real time bit such that it now goes directly from the
`AIQ to the AIS output mux (and not thru the AIS)
`
`- sq_tests.simpleregindexing tests now pass
`
`#75 change 115595 edit on 2003/08/08 by dougd@dougd_r400_linux_marlboro (ktext)
`
`fixed the path for the real time bit down the alu pipeline
`to reach the constant and instruction stores.
`
`#74 change 115159 edit on 2003/08/06 by rramsey@rramsey_crayola_linux_orl
`
`(ktext)
`
`Change sq_alu_instr_seq so gpr_rd_en is not asserted when reading constants
`Changes to thread_arb, ctlflowseg, and status_reg to get mem exports flowing
`
`#73 change 111736 edit on 2003/07/17 by mmang@mmang_crayola_linux_orl (ktext)
`
`Added sp->sx export arbitration between multiple simd engines.
`Added register after instrstart OR of multiple simd engines by
`taking unregistered signal out of sq_ais_output.
`
`#72 change 110640 edit on 2003/07/12 by mmantor@mmantor_crayola_linux_orl
`
`(ktext)
`
`<1. Enlarge export memories for performance fill rate (emulator, sq, sx, rb, ferret,
`gc, tb_sqsp, tb_sx)
`2. Fix Sx diff engine (interpolators) for shift bug with added guard bit
`3. Fix compile/src code problem with s-blocks memories
`4. Added the sx to tbh_sqsp by default, can still disable by macro
`5. Added mode to tbh_sqsp and tbh_sx to run interfaces at max rate
`6. Initialized state in vc to allow cp surface synchronizer micro code to
`invalidate tc/vc
`7. Added test signals to sc.v, sc_b.v, sq, sp, spi, sx and testbenches
`THIS CHANGE REQUIRES THE RELEASE OF SC, SC_B, SQ, SPI, SP, SX, RB,
`src/chip/chip**.tree files,
`parts_lib/sim/test/gc/vcs_top.ini, gc/tb_sqsp/tb_sx updates and the emulator
`together
`>
`
`#71 change 109679 edit on 2003/07/08 by llefebvr@llefebvrr400emu_montreal
`
`(ktext)
`
`Fixed r400sp_mova_tests.cpp TESTCASE=mova512.
`
`The PVPS detection was rightly disabled during the waterfall but wasn't re-enabled for
`the following instructions of the clause. I used the waterfalldone signal to re-enable
`the PVPS detection after the waterfalling.
`
`#70 change 109466 edit on 2003/07/07 by dougd@dougd_r400_linux_marlboro (ktext)
`
`fixed error in bit width of ais realtime
`
`#69 change 109126 edit on 2003/07/03 by dougd@dougd_r400_linux_marlboro (ktext)
`
`pipelined the Real Time bit from the pix thread buffer down through
`both arbiters, the ve, tex and alu instruction pipelines to the alu,
`tex and cfc constant stores to enable reading the real time constants.
`
`#68 change 108760 edit on 2003/07/01 by llefebvr@llefebvr_r400_linux_marlboro
`(ktext)
`
`Fixed r400sq_const_index_03.cpp. Now works on the SQSP testbench. Still has issues on
`the GC because of bad ferret/cp ring buffer synchronization.
`
`Fixed:
`
`1) Bad clamping of the address register in the SP
`2) Bad error handling of an out of range address in the SQ.
`
`#67 change 108222 edit on 2003/06/27 by smoss@smoss_crayola_linux_orl_regress
`(ktext)
`
`I have too many i's
`
`#66 change 108188 edit on 2003/06/26 by mmang@mmang_crayola_linux_orl (ktext)
`
`For pixel quads, enable all pixels of a quad when any pixel is hit
`for gpr write enables and constant address waterfalling sequencing.
`Another update will fix constant address register writing.
`
`
`#65 change 107757 edit on 2003/06/25 by mmantor@mmantor_crayola_linux_orl
`
`(ktext)
`
`<
`1. sq_alu_instr_seq.v -- Use the Queue pop signal to qualify last_in_clause
`and last_in_shader out of the queue.
`2. sqtargetinstrfetch.v - Fixed a bug in the target_instructfetch
`write to the queue to prevent dropping last_in_shader and last_in_clause
`if the queue is full when first trying to send instruction.
`>
`
`#64 change 107389 edit on 2003/06/22 by mmang@mmang_crayola_linux_orl (ktext)
`
`1. made change sp_vector.v to grab pred/kill results
`a clock sooner since Vic added a register delay to
`spscalarlut.bvrl. May have to change back later.
`2. Took away register delay in sq_ais_output to account
`for extra register needed for muxing and registering
`both simd engines for SQSXsp signals.
`3. In sq_alu_instr_seq.v, backed out Laurent's previous
`fix for constant waterfalling and made different change
`where ism registers are loaded based on aisstart
`instead of ais_rtr. With waterfalling, the ais_rtr
`does not happen early enough for ism registers to be
`available for AIS state machine.
`4. In sqexport_alloc.v, added connections for second simd
`engine to handle sx export allocation and deallocation.
`5. In sq.v, added muxing between simd0 and simd1
`sq_ais_output for SQSX signals.
`6. In sq_exp_alloc_ctrl.v, added simd1 connections for
`SX export control logic.
`7. In sqpixthread_buff.v and sq_vtx_thread_buff.v, added
`A) Simd1 logic for ALU memory write (register delayed
`simd1 information to avoid overlap with simd0)
`B) Appropriate read mux for simd0/simd1 for control
`flow memory (based on status simd num).
`C) Added simd1 status register write data connections.
`8. In sq_status_reg.v, added connections and muxing for second
`simd engine status bits write.
`9. Added a variety of connections for simd1 to tbh_sqsp.v.
`10. Added delay pipe for thread_id and thread_type for simd1
`in order to correctly track sp to sx interface.
`(tbtrk_spsx.v)
`11. Fixed bug in sx related to using correct export id during
`free done process of pixel to rb buffers
`(sx_export_control_common.v)
`
`#63 change 105592 edit on 2003/06/11 by llefebvr@llefebvr_r400_linux_marlboro
`(ktext)
`
`Added storage element in the SQ to store the valid addresses of the mova so that they
`can be restored at any instruction that uses the address register. The way it was
`currently would only work if the use of the address was directly following the MOVA
`instruction. This fixes r400sq_const_index_02.cpp.
`
`#62 change 105465 edit on 2003/06/10 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- timing fix in pixthreadbuff
`- VC interface is connected to vc instruction seq
`- TP_SQfetch stall replaced by TP_SQdec (but not tested at GC level)
`- SQTPgprwraddr and SQTPclause removed from top level
`(and tbh updated)
`- fetch arbitration for VC and TP updated
`- recoded a few lines in gpr alloc to see if it will help timing
`
`#61 change 101908 edit on 2003/05/21 by mmang@mmang_crayola_linux_orl (ktext)
`
`Fixed bug in waterfalling by grabbing register input of donebits
`instead of registered value when performing initdonebits operation.
`
`#60 change 99346 edit on 2003/05/06 by mmang@mmang_crayola_linux_orl (ktext)
`
`Fixed bug (I created) related to initializing the constant address
`register valids at the beginning of a clause. I used ais_init_pred
`which in some cases was too late. Created new ais initconstaddr
`that is 3 clocks sooner.
`
`#59 change 98773 edit on 2003/05/02 by mmang@mmang_crayola_linux_orl (ktext)
`
`1. Added constant address register valids to validate the
`address register data. The valid is set when address register
`is written. If valid is not set, sequencer will not waterfall
`those vertices or pixels. This disables waterfalling for
`predicated off writes and improperly initialized constant
`address registers.
`2. Fixed bug in sq_alu_instr_seq for phase 3 snooping of
`constant address registers bus. Previously, this snooping
`did not account for predication of those registers.
`3. Fixed bug where aisloaddonebits was not hooked up. This
`signal disables previous vector/scalar management which needs
`to be turned off during constant waterfalling. With bug, pvps
`logic went unknown which caused unknowns to eventually
`propagate in and out of the gprs.
`4. Fixed bug where non-optimized offset was not being determined
`properly. non_optoffset is determined by a priority encoder
`of p0_done, p1_done, p2_done, and p3_done.
`5. With advent of constant address register valids, created
`waterfallactive_g to properly init and avoid re-initing of
`different pixel and vertex done bits.
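
The entry above states that non_optoffset comes from a priority encoder over four done bits. The log gives neither the priority order nor whether the encoder selects on done or not-done, so the sketch below simply assumes the lowest-numbered phase that is not yet done wins. The function name and return convention are hypothetical.

```python
def first_pending_phase(p0_done, p1_done, p2_done, p3_done):
    """Return the index of the lowest-numbered phase that is not done.

    Assumed behavior only: phase 0 has the highest priority and a
    cleared done bit means the phase still needs service. Returns
    None when every phase is done.
    """
    for index, done in enumerate((p0_done, p1_done, p2_done, p3_done)):
        if not done:
            return index
    return None
```

A fixed-order scan like this is the software equivalent of a hardware priority encoder: earlier inputs mask later ones.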
`
`
`#58 change 96946 edit on 2003/04/22 by viviana@viviana_crayola2_syn (ktext)
`
`Added donevector to sensitivity list at line 902.
`Removed `SQ_SRCB_PHASE from sensitivity list at line 1018.
`Added isrthread_typeq to sensitivity list at line 1233.
`
`
`//depot/r400/devel/parts_lib/src/gfx/sq/sq.v
`#311 change 132649 edit on 2003/11/18 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- alu_instr_seq timing fixes for constant store read: first the register stage
`on the offset was moved after the sum2 adder; then the initdonebits signal
`was changed from a combinational ACS state machine output to a registered
`one-bit state machine output to help the path to the new sum2 register
`- thread buff status read timing fix - moved the status read back one cycle by
`sending the unregistered, rotated request vector to the arbiter and registering
`the winner out of the arbiter; the output of the status read mux was
`then registered
`
`#310 change 130421 edit on 2003/11/06 by bhankins@bhankins_xenos_linux_orl (ktext)
`
`- sq-sx thread id added to sq output and into and through the sx
`- updated sx-rb trackers to use sq-sx thread id
`- removed obsolete code from sx
`- fixed sx bug where an ea from one export to memory was resetting the valid bits
`for the other export to memory
`
`#309 change 129444 edit on 2003/10/30 by llefebvr@llefebvr_r400_linux_marlboro
`(ktext)
`
`Fixing dangling wires in the sq related to performance module.
`Fixing shader due to Kill opcode assembler change.
`Fixing tracker problem in the TBSQSP when autocount vtx is on.
`
`#308 change 129259 edit on 2003/10/29 by danh@danh_xenos_linux_orl (ktext)
`
`- spi_interpctl Iv buffer changed from one 16x200 memory to two 16x100 memories.
`- added additional SQSP_interp_qd[0:1] prim_sela signals to improve spi input
`timing.
`
`#307 change 129213 edit on 2003/10/29 by llefebvr@llefebvr_r400_linux_marlboro
`(ktext)
`
`Added VC_PERFACTUALSTARVED performance counter in the SQ.
`
`#306 change 129066 edit on 2003/10/28 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- added vtx input optimization for autocount on and continued off
`- fixed initialization problem for vtx autocount
`- made pix thread buff timing fixes: reduced load on status read
`data bit 19, which is the event bit, and also tried to reduce
`in the status register the load on pop_thread (part of the same path)
`- backed out a timing fix in alu_instr_seq that was causing a mova
`test to fail
`- fixed the AUTOCOUNTSIZE definition
`
`#305 change 128816 edit on 2003/10/27 by llefebvr@llefebvr_r400_linux_marlboro
`(ktext)
`
`Adding VC performance counters in the SQ.
`Removed the SX->RB warnings on non-initialized GPR channels.
`
`#304 change 128647 edit on 2003/10/27 by rramsey@rramsey_xenos_linux_orl (ktext)
`
`Change ais so PS src sel gets priority over PV
`Add predicated jumps and calls to cfs
`Fix fetchtype connection in sq and tex_instr_seq
`
`#303 change 128601 edit on 2003/10/27 by mmantor@mmantor_xenos_linux_orl (ktext)
`
`<Enable SQ use of 128 locations in export memory instead of 112 locations. Also added
`counters in sq arbiter to give priority to instruction pipe that has the fewest
`instructions when both control flow machines are available. This changelist requires
`both an emulator and hardware rtl code updates>
`
`#302 change 128393 edit on 2003/10/24 by llefebvr@llefebvr_r400_linux_marlboro
`(ktext)
`
`This should fix the instruction count being off. The bad machine (cfs) was used to
`determine the thread type and hence some pixel shader instructions were counted as
`vertex ones and vice versa.
`
`#301 change 128365 edit on 2003/10/24 by mearl@mearl_xenos_linux_orl
`
`(ktext)
`
`Added 2 primitive interpolation in SQ and SPI. Fixed a bug in sx_parametercache. Fixed
`synthesis bugs in sc.
`
`#300 change 127895 edit on 2003/10/22 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- timing fixes for gpr alloc
`
`#299 change 126823 edit on 2003/10/15 by rramsey@rramsey_xenos_linux_orl (ktext)
`
`Add sqvc tracker to gc testbench when running with orlando trackers
`Rework some of the alu/tex constant logic to get rid of the bug that
`was allowing threads to start processing before all of the constants for
`their context had been loaded.
`
`#298 change 126796 edit on 2003/10/15 by vromaker@vromaker_r400_linux_marlboro
`
`(ktext)
`
`- hooked up the new alu_arb_policy and tx_cache_sel register bits (but
`temporarily tied the tx_cachesel input to the vtx thread buff low
`since it is being incorrectly set to 1 by Primlib)
`
`#297 change 126450 edit on 2003/10/13 by donaldl@donaldl_xenos_linux_orl
`
`(ktext)
`
`Delayed SQSXsp_simd_id an extra clock to line up for redundancy use.
`
`#296 change 126234 edit on 2003/10/10 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- added export arbiter module that will limit the number of color buffer export
`threads to one every 4 clocks
`- hooked up the export blocker outputs and commented out the previous export
`blocking code
`- added export alloc arbiter inputs to expallocctl module so that the bufavail
`counter will be updated by the export allocs
`- added logic to support the export arbiter to the vertex and pixel thread buffers
`- added logic to support the export arbiter to the thread arbiter
`- separated the export alloc request out of the alu request logic in the status
`register, and added an output for the export alloc request
`
`#295 change 125660 edit on 2003/10/08 by rramsey@rramsey_xenos_linux_orl (ktext)
`
`Fix compile warnings for sq (several missing ports)
`Fix compile warning in sxparametercaches
`Fix SQ_SP_fetch_simd_sel so it lines up with the data coming out of the GPRs
`
`#294 change 125278 edit on 2003/10/07 by dougd@dougd_r400_linux_marlboro (ktext)
`
`Added a new state register, vce_fifo_depths_ll_reqfifo_depth to
`sqrbbm_interface.v and wired it up to the compare logic for
`veminicount_q in sqfetchark.v.
`
`Corrected a typo in sqvtx_ctl.v that affected synthesis.
`
`#293 change 124792 edit on 2003/10/03 by dougd@dougd_r400_linux_marlboro (ktext)
`
`Removed all references to SIMD1DISABLE in sq.v and sqrbbminterface.v.
`
`Added 32 new performance counters: many are for SIMD2 and SIMD3 but
`other existing counters were expanded to differentiate between vertex
`and pixel counts. There are now 95 performance counters in the sq.
`
`#292 change 124203 edit on 2003/10/01 by dougd@dougd_r400_linux_marlboro (ktext)
`
`The four existing SYNCSTALL counters were separated into
`(8) pix and vtx stall counters.
`The two ALU INSTRUCTION ISSUED counters were made to increment
`by 1,2,3 or 4.
`The two CF INSTRUCTION ISSUED counters were made to increment
`by 1,2,3,4,5 or 6.
`
`Added `ifdef's to sqperfmonwrapper for SIMD1, SIMD2, SIMD3.
`
`perfmon event window:
`An enable for the performance counters is generated by events received
`from the VGT and/or SC which create a window of time when the counters
`will be active. All of the perf counters are now controlled by this enable.
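
The perfmon event window described above gates every counter with a single enable that events open and close. The following is a small behavioral sketch of that idea, not the sqperfmonwrapper logic: the event names `window_start` and `window_stop`, the class, and the counter name in the usage note are invented for illustration, since the log does not name the VGT/SC events. The `increment` argument reflects the multi-value increments (1 to 4, or 1 to 6) mentioned in the same entry.

```python
class PerfCounterWindow:
    """Counters that only advance while an event-driven window is open."""

    def __init__(self, names):
        self.enabled = False
        self.counts = {name: 0 for name in names}

    def on_event(self, event):
        # Hypothetical event names; the real events come from the VGT/SC.
        if event == "window_start":
            self.enabled = True
        elif event == "window_stop":
            self.enabled = False

    def tick(self, name, increment=1):
        """Add increment to a counter, but only inside the window."""
        if self.enabled:
            self.counts[name] += increment
```

For example, ticks issued to a counter such as `alu_instr_issued` before `window_start` or after `window_stop` are ignored, so only activity inside the window is accumulated.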
`
`#291 change 123952 edit on 2003/09/30 by mmantor@mmantor_xenos_linux_orl (ktext)
`
`<added changes for 2 prim interpolation to the spi and sq and all top level
`interconnects, and sqsxsp_simd_id for redundancy control, and all changes to test
`bench as well as some ncverilog error messages.
`Some other misc top level clean up>
`
`#290 change 123918 edit on 2003/09/29 by rramsey@rramsey_xenos_linux_orl (ktext)
`
`Change tp_sqsp dump to use FMT_32_32_32_32_FLOAT
`Remove a monitor from tbhtrksc for now since it is broken for ONEPPC
`Need to register the if inputs to aiq since they are put in the fifo
`one clk after the transfer
`Fix the exec sm so it is 4 clks even when switching clauses
`Remove one clk of latency on tpdec from fetcharb
`Fix the strap bits in sq.v so the tp and ve cfs and if machines get
`two read cycles out of 8 when we have two instruction stores
`Change the tp_sq dec input and force the tp_sp format in tbhsqsp
`Fix the tif so its state machine is 4 clks between clauses and change
`it so 0 count execs can be merged into the instruction ahead of them
`Fix the tex_instr_seq for the case where tp_dec happens on the same
`clk the fes state machine kicks off (instr were getting dropped)
`Check in Scott's vgt change to clamp vtx_reuse based on good pipes
`
`#289 change 123260 edit on 2003/09/25 by mmang@mmang_xenos_linux_orl
`
`(ktext)
`
`1. For Vivian E., added new simd memories and star patch in/out wires.
`2. In vertex thread buffer, fixed bug in simd3 alu state registers.
`3. In pixel thread buffer, fixed bug in simd2/3 cf state read data.
`4. Adjusted simd id bus width for sq to tp tracker.
`5. In sq.v, added vertex shader and pixel shader constant base and
`size connections to simd2/3 alu instruction sequencers.
`
`#288 change 123113 edit on 2003/09/24 by llefebvr@llefebvr_r400_linux_marlboro
`
`(ktext)
`
`Fixed the autocount pixel timing by removing 5 pipeline registers in the SQ control
`path. Also fixed the counter's width back to 17 bits (from 19) in both the vertex and
`pixel path such that when it hits the SP it is of the correct 23 bits width (17 bits
`count + 2 bits phase + 4 bits index). This fixes r400vgt_multi_pass pixshader01 at
`the sqspsx testbench level.
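
The width arithmetic in the entry above (17 bits count + 2 bits phase + 4 bits index = 23 bits) can be checked with a small packing sketch. The field order used here (count in the high bits, then phase, then index) is an assumption, since the log only gives the widths; the function names are hypothetical.

```python
COUNT_BITS = 17
PHASE_BITS = 2
INDEX_BITS = 4
TOTAL_BITS = COUNT_BITS + PHASE_BITS + INDEX_BITS  # 23


def pack_autocount(count, phase, index):
    """Pack the three fields into one 23-bit word (assumed field order)."""
    if not 0 <= count < (1 << COUNT_BITS):
        raise ValueError("count does not fit in 17 bits")
    if not 0 <= phase < (1 << PHASE_BITS):
        raise ValueError("phase does not fit in 2 bits")
    if not 0 <= index < (1 << INDEX_BITS):
        raise ValueError("index does not fit in 4 bits")
    return (count << (PHASE_BITS + INDEX_BITS)) | (phase << INDEX_BITS) | index


def unpack_autocount(word):
    """Split a packed word back into (count, phase, index)."""
    index = word & ((1 << INDEX_BITS) - 1)
    phase = (word >> INDEX_BITS) & ((1 << PHASE_BITS) - 1)
    count = word >> (PHASE_BITS + INDEX_BITS)
    return count, phase, index
```

With a 19-bit counter the packed word would need 25 bits, which is consistent with the entry's note that the counter width had to return to 17 bits to reach the expected 23.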
`
`#287 change 123076 edit on 2003/09/24 by donaldl@donaldl_xenos_linux_orl (ktext)
`
`Connected ROM block redundancy signals.
`Added sq export address buffer support.
`
`#286 change 122683 edit on 2003/09/23 by mearl@mearl_crayola_linux_orl
`
`(ktext)
`
`One primitive per clock changes in the back of the SC and front of the SQ. Right now,
`the ONEPRIMPERCLOCK define in header.v and SCSQinterface.v are needed for this change.
`Will update this to ONEPPC, since this already exists in header.v. Also, the sim.cfg
`file does not have an ifdef, so is hardcoded to one prim per clock.
`
`#285 change 122402 edit on 2003/09/20 by mmang@mmang_crayola_linux_orl
`
`(ktext)
`
`1. Added simd2 and simd3 to code.
`2. Added simd2 to synthesized code.
`3. In sq.blk and sqrbbminterface, added
`DBREADMEMORY, DBWENMEMORY2, and DBWENMEMORY3
`to SQMISCDEBUG register.
`4. In header.v, turned on SIMD2_PRESENT.
`5. In scpacker.v, turned on SIMD2 but don't use it
`with SIMD2_PRESENTTEMP.
`6. In sqaluconstmem.v, sqaluconst_top.v, sq_cfc.v,
`and sq_instruction_store.v, hooked up DBWENMEMORY2
`and DBWENMEMORY3 to appropriate SIMD2/3 memories.
`7. In sqexportalloc.v, handle position/main export id
`and parameter cache thread base for simd2/3. Be able
`to handle one type down simd0/1 and a different type
`down simd2/3 on the same clock.
`8. In sqpixctl.v and sqvtxctl.v, multiple simd
`gpr_alloc blocks return different acks, gpr bases,
`and gpr maxes.
`9. In sq_exp_alloc_ctrl.v, handle position/main export
`buffer management. Be able to handle one type down
`simd0/1 and a different type down simd2/3 on the same
`clock.
`10. In sqpixthreadbuff.v and sqvtxpixthreadbuff.v,
`added muxing and memories to handle status bits, cfs
`state, and alu state. Simd2 mirrors simd0, while
`simd3 mirrors simd1.
`11. In sq_status_reg.v, added simd2/3 arb requests and
`status bit writing from simd2/3.
`12. In tbh_sqsp.v, fixed some bugs related to pspv_wr_en,
`predoverride, const_addr, and constvalid hook ups.
`13. In tbhtrk_spsx.v, SIMDPRESENT conditional delaying
`and management of thread_id and threadtype for
`tracker.
`14. In tbtrk_sqpixrs_input.v and tbtrk_sqvtx_rs_input.v,
`temporary kludge to hook up bOblpredicate instead of
`predicate.
`15. In tbhtrksqspvec_gpr.v, added simd2/3 tracking of
`gpr_int_wen interface.
`16. In sqtexinstrqueue.v, get gpr_max from appropriate
`simd data.<enter description here>
`
`15.
`
`#284 change 121348 edit on 2003/09/15 by dougd@dougd_r400_linux_marlboro (ktext)
`
`1. corrected the trigger events for VTX_SWAP_IN, VTX_SWAP_OUT,
`PIX_SWAP_IN, PIX_SWAP_OUT, CONSTANTS_USED_SIMD0 and
`CONSTANTS_USED_SIMD0.
`2. made event counters for these use multibit increment values
`3. added "+incdir+$PARTSLIB/src/gfx/sp" to vcs_top.ini to pick up
`spdefines.v included in sq_ais_output.v
`
`#283 change 121065 edit on 2003/09/12 by donaldl@donaldl_crayola_linux_orl (ktext)
`
`Registered ROM_ENRSP and ROMPIPESEL[3:0].
`
`#282 change 120910 edit on 2003/09/12 by donaldl@donaldl_crayola_linux_orl
`
`(ktext)
`
`Removed SP to SQ kill_type and kill_valid signals and added them internally
`in the SQ. Done to save some gates and also to avoid having to add
`redundancy logic to them.
`
`#281 change 120592 edit on 2003/09/10 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`changed SQhs_bclk, TST_SQrfstar_wrck, TST_SQhs_star_wrck so they
`are defined without the [0:0] range
`
`#280 change 120510 edit on 2003/09/10 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`fix for SQVC_simdid typo
`
`#279 change 120426 edit on 2003/09/10 by donaldl@donaldl_crayola_linux_orl (ktext)
`
`Added redundancy logic.
`
`#278 change 120190 edit on 2003/09/09 by dougd@dougd_r400_linux_marlboro (ktext)
`
`changed SQRBevent to SQRBevent_pulse and declared as output from sq.v
`
`#277 change 120087 edit on 2003/09/08 by dougd@dougd_r400_linux_marlboro (ktext)
`
`Fixed 2 bugs in Real Time address logic in aluconst.
`Added correct default value for INSTBASEVTX in sqrbbm_interface.v
`Fixed bug in Real Time write data buffer in sqinstructionstore.v
`Added missing input/output declarations for SIMD2 & SIMD3 signals to sqaluconst_top.v
`Clean up missing SIMD2, SIMD3 wire declarations in sq.v for the aluconst,
`is and cfc
`
`#276 change 119736 edit on 2003/09/05 by danh@danh_crayola_linux_orl (ktext)
`
`removed SQSP_interpmode, SQSPinterpbuffswap, added SQSP_interp_simdid for
`Redundant SP
`
`#275 change 119294 edit on 2003/09/03 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- instantiation of sq export blocker at sq top level
`- thread buffer timing fix related to status read/export count update
`
`#274 change 119127 edit on 2003/09/02 by dougd@dougd_r400_linux_marlboro (ktext)
`
`Added the extra memories and their support to the instruction and
`constant stores to support 4 SIMD's. These memories and their
`required wiring and control are instantiated with `ifdef and use
`the SIMDnPRESENT macros defined in header.v
`Removed the use of SIMD1 macro.
`
`#273 change 118589 edit on 2003/08/28 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- fix for loop index clamping and constant address generation (both index and offset
`relative)
`- changed the connection of the real time bit such that it now goes directly from the
`AIQ to the AIS output mux (and not thru the AIS)
`- sq_tests.simpleregindexing tests now pass
`
`#272 change 118128 edit on 2003/08/26 by dclifton@dcliftonr400 (ktext)
`
`Added definable # of simd's to sp.
`
`#271 change 117706 edit on 2003/08/22 by mmantor@mmantor_crayola_linux_orl
`
`(ktext)
`
`
`<added new ports and/or expanded to two bits to vgt, sq, and pa for simd_id with
`modifications to their test benches and added ifdefs with bad pipe signals to input of vgt,
`replaced SIMD1 macro with SIMD1PRESENT macro in the sc files>
`
`#270 change 117504 edit on 2003/08/21 by mmang@mmang_crayola_linux_orl
`
`(ktext)
`
`1. Increased simdid wires to 2 bits throughout SQ. SQ external
`interfaces are still only 1 bit.
`2. Made SQ simd 1 blocks conditional based on SIMD1PRESENT in
`header.v. Realigned some code in anticipation of SIMD2 and SIMD3.
`
`#269 change 116380 edit on 2003/08/13 by mmang@mmang_crayola_linux_orl
`
`(ktext)
`
`1. Added separate gpr allocation/deallocation
`management for multiple simds (sq_gpr_alloc,
`sqexitsm, sqpixthreadbuff, sqstatus reg,
`sqvtxthreadbuff, sqpixctl, and sqvtx_ctl)
`2. Made thread_arb poll cfs rtr on a 4 clock
`interval in order to ensure the arbiters
`stayed in phase between simds.
`3. Created new interface signal between
`thread_arb and exportalloc to lock exportid
`and parameter cache base for each simd. In
`addition, created registers for these values
`for each simd in order to ensure they got
`allocated in order.
`4. In ais output, used simd to mask pixctl gpr
`writes to different simds.
`5. In tbhsqsp, added simdid and gpr write address
`to texture latency fifo to help trackers and
`read inject return files.
`6. In tex_instr_queue, grab appropriate gpr_max
`based on simd id.
`
`#268 change 115728 edit on 2003/08/10 by rramsey@rramsey_crayola_linux_orl
`
`(ktext)
`
`Change SQ to hold off popping the RBBM skid fifo while map copies are in
`progress. This fixes the problem where gfxcopy writes were being missed
`if they were less than 8 clks apart.
`Get rid of extra write into RBBM skid fifo for reads, and instead zero out
`we and re out of fifo if it's empty. The fifo was overflowing if the filling
`entry was a read, since one additional entry was getting pushed.
`sxsppedata tracker now ignores 4f5eaddf (unwritten pe locations)
`Fix a problem in the sqsp testbench that was causing rbbm writes to be dropped
`if the sq exerted back pressure.
`
`
`#267 change 115620 edit on 2003/08/08 by dougd@dougd_r400_linux_marlboro (ktext)
`
`1. change all hs virage memories & files to have subword size in name
`2. added diagnostic write enable from rbbm interface register to the modules
`with extra memories to support multiple SIMDs
`
`#266 change 115241 edit on 2003/08/06 by dougd@dougd_r400_linux_marlboro (ktext)
`
`1. corrected the connections to sqperfmonwrapper to enable the
`ALU active counters.
`2. changed a few 1 bit vector declarations ([0:0]) to scalar
`on SQ outputs because it caused errors in synthesis.
`
`#265 change 114305 edit on 2003/07/31 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`cleaned up the path of ismstate down through the
`instruction pipelines and removed the defparams used in the
`multiple instantiations of several modules.
`
`#264 change 113286 edit on 2003/07/25 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- a few more fixes for SQ_VC/TP interfaces; the sq mini-regress now passes
`with the VC turned on
`
`#263 change 112073 edit on 2003/07/21 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- fix for SQVC interface
`- TPSQdec was hooked up to the interface counter
`- timing fix in vtx thread buffer
`- simd_num connected thru ptr buff and pix ctl to pix thread buff
`- performance fix in pix ctl
`
`#262 change 111905 edit on 2003/07/18 by ygiang@ygiangr400pv2_marlboro (ktext)
`
`added: new perf counters for sq hardware
`
`#261 change 111736 edit on 2003/07/17 by mmang@mmang_crayola_linux_orl
`
`(ktext)
`
`Added sp->sx export arbitration between multiple simd engines.
`Added register after instrstart OR of multiple simd engines by
`taking unregistered signal out of sq_ais_output.
`
`#260 change 111419 edit on 2003/07/16 by rramsey@rramsey_crayola_linux_orl (ktext)
`
`Connect TST_awt_enable to ve_skidbuf and wire it up to the top level
`
`#259 change 111008 edit on 2003/07/14 by dougd@dougd_r400_linux_marlboro (ktext)
`
`added logic to support programmable memory size for texconst and
`aluconst stores.
`
`#258 change 110640 edit on 2003/07/12 by mmantor@mmantor_crayola_linux_orl (ktext)
`
`<1. Enlarge export memories for performance fill rate (emulator, sq, sx, rb, ferret,
`gc, tb_sqsp, tb_sx)
`2. Fix Sx diff engine (interpolators) for shift bug with added guard bit
`3. Fix compile/src code problem with s-blocks memories
`4. Added the sx to tbh_sqsp by default, can still disable by macro
`5. Added mode to tbh_sqsp and tbh_sx to run interfaces at max rate
`6. Initialized state in vc to allow cp surface synchronizer micro code to
`invalidate tc/vc
`7. Added test signals to sc.v, sc_b.v, sq, sp, spi, sx and testbenches
`THIS CHANGE REQUIRES THE RELEASE OF SC, SC_B, SQ, SPI, SP, SX, RB,
`src/chip/chip**.tree files,
`parts_lib/sim/test/gc/vcs_top.ini, gc/tb_sqsp/tb_sx updates and the emulator
`together
`>
`
`#257 change 110177 edit on 2003/07/10 by rramsey@rramsey_crayola_linux_orl
`
`(ktext)
`
`Changes to get simdid piped down the vertex side and into the thread
`buffer. Also only write the active simd's gprs and mux pipedisable bits.
`The memory in sqveskidbuf increased by 1 bit, so this will require
`a new memory to be checked in before running without USEBEHAVEMEM.
`
`#256 change 110083 edit on 2003/07/09 by dougd@dougd_r400_linux_marlboro (ktext)
`
`added data output mux to select between the two memories (SIMD1, SIMD0)
`for RBBM diagnostic reads. The mux is controlled by a rbbm register bit
`in the SQDEBUGMISC register.
`
`#255 change 110066 edit on 2003/07/09 by vromaker@vromaker_r400_linux_marlboro
`(ktext)
`
`- fixed a bug in tex instr seq related to