(19) United States
`(12) Reissued Patent
`Morikawa et al.
`
`US00RE43145E
`
`(10) Patent Number: US RE43,145 E
`(45) Date of Reissued Patent: *Jan. 24, 2012
`
`(54) PROCESSOR WHICH CAN FAVORABLY
`EXECUTE A ROUNDING PROCESS
`COMPOSED OF POSITIVE CONVERSION
`AND SATURATED CALCULATION
`PROCESSING
`
`(75) Inventors: Toru Morikawa, Kadoma (JP); Nobuo Higaki, Kadoma (JP);
`Akira Miyoshi, Kadoma (JP); Keizo Sumida, Kadoma (JP)
`
`(73) Assignee: Panasonic Corporation, Osaka (JP)
`
`( * ) Notice: This patent is subject to a terminal disclaimer.
`
`(21) Appl.No.: 11/016,920
`
`(22) Filed: Dec. 21, 2004
`
`Related U.S. Patent Documents
`
`Reissue of:
`(64) Patent No.: 6,237,084
`     Issued: May 22, 2001
`     Appl. No.: 09/399,577
`     Filed: Sep. 20, 1999
`U.S. Applications:
`(62) Division of application No. 10/366,502, filed on Feb.
`13, 2003, now Pat. No. Re. 39,121, which is a division
`of application No. 08/980,676, filed on Dec. 1, 1997,
`now Pat. No. 5,974,540.
`
`(30) Foreign Application Priority Data
`
`Nov. 29, 1996 (JP) ........ 8-320423
`
`(51) Int. Cl.
`     G06F 9/302 (2006.01)
`(52) U.S. Cl. ........ 712/221; 708/551; 708/552
`
`(58) Field of Classification Search ................ .. 708/550,
`708/551, 552, 203, 204, 208
`See application file for complete search history.
`
`
`Primary Examiner — Richard Ellis
`(74) Attorney, Agent, or Firm — McDermott Will & Emery
`LLP
`
`(57) ABSTRACT
`A processor which executes positive conversion processing,
`which converts coded data into uncoded data, and saturation
`calculation processing, which rounds a value to an appropri-
`ate number of bits, at high speed. When a positive conversion
`saturation calculation instruction “MCSST D1” is decoded,
`the sum-product result register 6 outputs its held value to the
`path P1. The comparator 22 compares the magnitude of the
`held value of the sum-product result register 6 with the coded
`32-bit integer “0x0000_00FF”. The polarity judging unit 23
`judges whether the eighth bit of the value held by the sum-
`product result register 6 is “ON”. The multiplexer 24 outputs
`one of the maximum value “0x0000_00FF” generated by the
`constant generator 21, the zero value “0x0000_0000” gener-
`ated by the zero generator 25, and the held value of the
`sum-product result register 6 to the data bus 18.
`
`44 Claims, 17 Drawing Sheets
`
`
`
`
`PETITIONER EXHIBIT 1025-0001
`
`

`

`
`

`

`U.S. Patent
`
`Jan. 24, 2012
`
`Sheet 1 of 17
`
`US RE43,145 E
`
`FIG. 1 (PRIOR ART): ARITHMETIC LOGIC UNIT and SUM-PRODUCT RESULT REGISTER
`
`

`

`FIG. 2 (drawing): representation of the multiplication of matrices composed of N*N elements
`
`

`

`FIG. 3 (drawing): construction of the processor of the first embodiment
`
`
`

`

`FIG. 4 (drawing): construction of the operation execution apparatus 14
`
`

`

`FIG. 5 (drawing): instruction sequence composing the matrix multiplication subroutine
`
`

`

`FIG. 6
`
`MACCB INSTRUCTION
`
`MULTIPLIER READ ADDRESS INDICATION
`  11: MCR
`  00: REGISTER D0
`  01: REGISTER D1
`  10: REGISTER D2
`
`MULTIPLICAND READ ADDRESS INDICATION
`  11: MCR
`  00: REGISTER D0
`  01: REGISTER D1
`  10: REGISTER D2
`
`INDICATION OF CONTENT OF ELEMENTAL OPERATION
`  1: MULTIPLICATION
`  0: NONE
`
`INDICATION OF CALCULATED CONTENT OF ALGEBRAIC SUM
`  1: ADDITION
`  0: NONE
`
`INDICATION OF STORAGE ADDRESS FOR SUM-PRODUCT RESULT
`  1: MCR
`  0: NONE
`
`

`

`FIG. 7
`
`MCSST INSTRUCTION
`
`POSITIVE CONVERSION SATURATION CALCULATION WIDTH INDICATION
`  00: 24bit POSITIVE CONVERSION
`  01: 16bit POSITIVE CONVERSION
`  11: 8bit POSITIVE CONVERSION
`
`STORAGE ADDRESS INDICATED
`  00: REGISTER D0
`  01: REGISTER D1
`  10: REGISTER D2
`  11: REGISTER D3
`
`

`

`FIG. 8A (drawing): CODE BIT, SUM-PRODUCT RESULT (bit positions 8, 9, 16, 24, and 32 marked), MATRIX MULTIPLICATION RESULT Hij
`
`FIG. 8B (drawing): SUM-PRODUCT RESULT, with axis values 32767 (7FFF) and -32767
`
`

`

`FIG. 9
`
`Column headings: LOGIC VALUE X, LOGIC VALUE Y, SELECTED INPUT VALUE
`(the logic value entries are not legible)
`
`SELECTED INPUT VALUE, in row order:
`  0x0000_00FF
`  0x0000_0000
`  0x0000_0000
`  STORED VALUE OF SUM-PRODUCT RESULT REGISTER
`
`

`

`FIG. 10
`
`EXAMPLE OPERATION: D0 x D1 (0x7f x 0x70)
`  REGISTER STORED VALUE: D0
`  OUTPUT OF LOWER-ORDER 32 BITS (MSB: 0)
`  POSITIVE CONVERSION SATURATION CALCULATION CIRCUIT:
`    0x00003790 > 0x000000ff, so the 32-bit output is 0x000000ff
`  REGISTER STORED VALUE: D1 = 0x000000ff
`  MEMORY STORED VALUE: 0xff
`
`

`

`FIG. 11
`
`EXAMPLE OPERATION: D0 x D1 (0x7f x 0x80)
`  CODE EXTENSION CIRCUIT: 32-bit values 0x0000007f and 0xffffff80
`  64-bit value: 0xffffffffffffc080
`  OUTPUT OF LOWER-ORDER 32 BITS: 0xffffc080
`  POSITIVE CONVERSION SATURATION CALCULATION CIRCUIT:
`    MSB: 1, so the 32-bit output is 0x00000000
`  REGISTER STORED VALUE: D1
`  MEMORY STORED VALUE: 0x00
`
`
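The data flow of FIG. 10 and FIG. 11 can be traced in software. The Python sketch below is illustrative only and is not taken from the patent: the function names `sign_extend_8_to_32` and `multiply_and_round` are hypothetical, and the sketch assumes that both 8-bit operands are code-extended (sign-extended) to 32 bits, that the multiplier produces a 64-bit product, and that the lower-order 32 bits are then rounded to an uncoded 8-bit value.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def sign_extend_8_to_32(byte):
    """Code-extend an 8-bit pattern to a 32-bit pattern (assumed behaviour)."""
    byte &= 0xFF
    return (byte | 0xFFFF_FF00) if (byte & 0x80) else byte


def to_signed(pattern, bits):
    """Interpret a bit pattern of the given width as a two's complement value."""
    pattern &= (1 << bits) - 1
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


def multiply_and_round(d0_byte, d1_byte):
    """Return (extended D0, extended D1, 64-bit product, low 32 bits, rounded)."""
    a = sign_extend_8_to_32(d0_byte)
    b = sign_extend_8_to_32(d1_byte)
    product64 = (to_signed(a, 32) * to_signed(b, 32)) & MASK64
    low32 = product64 & MASK32
    if low32 >> 31:          # MSB is 1: negative value, positive conversion gives zero
        rounded = 0x0000_0000
    elif low32 > 0xFF:       # exceeds the 8-bit range: saturate to the maximum
        rounded = 0x0000_00FF
    else:                    # already an uncoded 8-bit value
        rounded = low32
    return a, b, product64, low32, rounded


# FIG. 10 operands: low 32 bits are 0x00003790, which saturates to 0xff.
print([hex(v) for v in multiply_and_round(0x7F, 0x70)])
# FIG. 11 operands: low 32 bits are 0xffffc080, whose MSB is 1, giving zero.
print([hex(v) for v in multiply_and_round(0x7F, 0x80)])
```

Under these assumptions the intermediate values agree with the ones printed in the two figures (0x00003790 and 0xffffc080 for the lower-order 32 bits, 0x000000ff and 0x00000000 for the results).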

`

`FIG. 12A (drawing): pipeline processing, with rows INSTRUCTION FETCH STAGE, INSTRUCTION DECODING STAGE, EXECUTION STAGE, REGISTER WRITE STAGE
`
`

`

`FIG. 12B (drawing): execution according to pipeline processing of a matrix multiplication subroutine
`
`

`

`FIG. 13
`
`POSITIVE CONVERSION SATURATION CALCULATION WIDTH INDICATION
`  00: 24bit POSITIVE CONVERSION
`  01: 16bit POSITIVE CONVERSION
`  11: 8bit POSITIVE CONVERSION
`
`STORAGE ADDRESS INDICATED
`  11: MCR
`  00: REGISTER D0
`  01: REGISTER D1
`  10: REGISTER D2
`
`READ ADDRESS INDICATION
`  11: MCR
`  00: REGISTER D0
`  01: REGISTER D1
`  10: REGISTER D2
`
`

`

`FIG. 14 (drawing): internal construction of the operation execution apparatus 14
`
`

`

`FIG. 15 (drawing): internal construction of the operation execution apparatus 14 in the second embodiment
`
`
`
`

`

`FIG. 16
`
`MULBSST INSTRUCTION
`
`MULTIPLIER READ ADDRESS INDICATION
`  11: MCR
`  00: REGISTER D0
`  01: REGISTER D1
`  10: REGISTER D2
`
`MULTIPLICAND READ ADDRESS INDICATION
`  11: MCR
`  00: REGISTER D0
`  01: REGISTER D1
`  10: REGISTER D2
`
`POSITIVE CONVERSION SATURATION CALCULATION WIDTH INDICATION
`  01: 24bit POSITIVE VALUE
`  10: 16bit POSITIVE VALUE
`  11: 8bit POSITIVE VALUE
`
`CALCULATION CONTENT INDICATION
`  1: MULTIPLICATION
`  0: NONE
`
`

`

`PROCESSOR WHICH CAN FAVORABLY
`EXECUTE A ROUNDING PROCESS
`COMPOSED OF POSITIVE CONVERSION
`AND SATURATED CALCULATION
`PROCESSING
`
`Matter enclosed in heavy brackets [ ] appears in the
`original patent but forms no part of this reissue specifica-
`tion; matter printed in italics indicates the additions
`made by reissue.
`
`More than one reissue application has been filed for the
`reissue of U.S. Pat. No. 6,237,084. The reissue applications
`are application Ser. Nos. 10/366,502 (reissued as RE39,121
`on Jun. 6, 2006) and 11/016,920 (this application), all of
`which are divisional reissues of U.S. Pat. No. 6,237,084. This
`application is a divisional reissue of application Ser. No.
`10/366,502 filed Feb. 13, 2003, which is a reissue of Ser. No.
`09/399,577 filed on Sep. 20, 1999, now U.S. Pat. No. 6,237,084,
`which is a divisional of application Ser. No. 08/980,676
`filed Dec. 1, 1997, now U.S. Pat. No. 5,974,540.
`This is a divisional application of U.S. Ser. No. 08/980,676
`now U.S. Pat. No. 5,974,540 filed Dec. 1, 1997.
`
`BACKGROUND OF THE INVENTION
`
`1. Field of the Invention
`
`The present invention relates to a processor that performs
`processing according to instruction sequences that are stored
`in a ROM or the like.
`
`2. Background of the Invention
`In recent years, there has been a visible increase in the use
`of application software that can interactively reproduce vari-
`ous kinds of data, such as video data, still image data, and
`audio data, that have been compressed according to tech-
`niques such as frame encoding, field encoding, or motion
`compensation. As such software has been developed, there
`has been increasing demand for multimedia-oriented proces-
`sors that can efficiently execute the software. These multime-
`dia-oriented processors are processors designed with a spe-
`cial architecture to facilitate programming, such as the
`compression and decompression of video and audio data. The
`high-speed processing required for handling video data is the
`matrix multiplication of compressed data that has N*N
`matrix elements with coefficient data that also has N*N
`matrix elements. Representative examples of compressed
`data that has N*N matrix elements are the luminescence
`block composed of 16*16 luminescence elements, the blue
`color difference block (Cb block) composed of 8*8 color
`difference elements, and the red color difference block (Cr
`block) composed of 8*8 color difference elements used in
`MPEG (Moving Pictures Experts Group) techniques. The
`matrix multiplication for compressed data referred to here is
`performed very frequently when executing the approxima-
`tion calculations for an inverse DCT (Discrete Cosine Trans-
`form) in image compression methods such as MPEG and
`JPEG (Joint Photographic Experts Group).
`The following is a description of conventional multimedia-
`oriented processors that can perform high-speed matrix mul-
`tiplication. The basic architecture of conventional multime-
`dia-oriented processors is provided with a sum-product result
`register (hereinafter simply referred to as an MCR register) as
`hardware, and is provided with an instruction set that includes
`a “MOV MCR, **” transfer instruction for transferring a sum-
`product value.
`
`An example of the hardware construction of a conventional
`multimedia-oriented processor is shown in FIG. 1. As shown
`in FIG. 1, the arithmetic logic unit (hereinafter, “ALU”) 61
`performs the multiplication of an element Fij that forms part
`of the compressed data and an element Gji that forms part of
`the coefficient matrix in accordance with a multiplication
`instruction. The ALU 61 also reads the sum-product value
`stored in the sum-product result register 62, adds the multi-
`plication result of Gji*Fij to the read sum-product value, and
`has the result of this addition stored in the sum-product result
`register 62. By repeating the above calculation, a sum-prod-
`uct value is accumulated in the sum-product result register 62.
`Once the multiplication has been performed a predetermined
`number of times, the programmer issues a sum-product value
`transfer instruction. By issuing a transfer instruction, the
`accumulated value in the sum-product result register 62 is
`transferred to the general registers, and is used as the matrix
`multiplication result for one row and one column. By per-
`forming N*N iterations of the above processing, the matrix
`multiplication of N*N compressed data and an N*N coeffi-
`cient matrix can be completed.
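The accumulation described above can be sketched in Python. This is a minimal illustration, not the patent's implementation: the function name `matrix_multiply_mac` and the local variable `mcr` (standing in for the sum-product result register 62) are hypothetical, and the index convention, in which each result element is a row of the coefficient matrix times a column of the compressed data, is an assumption.

```python
def matrix_multiply_mac(g, f):
    """Multiply coefficient matrix g by data matrix f (N x N lists of ints)
    using one multiply-accumulate step per element pair."""
    n = len(g)
    h = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            mcr = 0  # stands in for the sum-product result register
            for k in range(n):
                # one multiplication instruction: multiply, add to the
                # accumulated value, and store the sum back
                mcr += g[i][k] * f[k][j]
            # the transfer instruction: copy the accumulated value out
            h[i][j] = mcr
    return h


g = [[1, 2], [3, 4]]
f = [[5, 6], [7, 8]]
print(matrix_multiply_mac(g, f))  # [[19, 22], [43, 50]]
```

Each pass of the two outer loops ends with one transfer, so an N*N result needs N*N transfers, which is where the amendment of each sum-product value discussed next comes in.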
`When a conventional multimedia-oriented processor is
`used, however, positive correction saturation operations for
`amending the sum-product value pose many difficulties for
`programmers.
`Positive conversion processing refers to the conversion of a
`sum-product value that is a negative value into either zero or
`a positive value. Normally, compressed data is expressed as a
`coded relative value that reflects the relation of the present
`value to the preceding and succeeding values. As a result,
`there are many cases when the sum of products for each
`element in the compressed data and the corresponding coef-
`ficients is a negative value. Most reproduction-related hard-
`ware, such as displays and speakers, however, is only able to
`process uncoded data, so that when the sum-product values
`are to be reproduced, it is first necessary to perform positive
`conversion processing.
`Saturation calculation processing refers to processing that
`sets all values that exceed a given range (or, in other words,
`which are “saturated”) at a predetermined value. This is to
`say, when an element that includes an erroneous bit generated
`during transfer is used in a sum-product calculation as part of
`the sum-product processing for compressed data, there is an
`increase in the probability of the sum-product value exceed-
`ing a value that can be expressed by the stated number of bits.
`Since most reproduction-related hardware is only physically
`capable of reproducing uncoded data with a fixed valid num-
`ber of bits, such as eight bits, saturation processing is required
`to convert the sum-product value into a value that can be
`expressed using the valid number of bits.
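Taken together, the two operations amount to clamping a coded (signed) value into the uncoded 8-bit range. A minimal Python sketch follows; the function name `to_uncoded_8bit` is hypothetical, and the 8-bit output width is the example width mentioned in the text rather than a fixed requirement.

```python
def to_uncoded_8bit(sum_product):
    """Clamp a coded (signed) sum-product value into the range 0..255."""
    if sum_product < 0:      # positive conversion: negative values become zero
        return 0
    if sum_product > 255:    # saturation: values above the range become the maximum
        return 255
    return sum_product       # values already in range pass through unchanged


print([to_uncoded_8bit(v) for v in (-16256, 0, 127, 255, 14224)])
# [0, 0, 127, 255, 255]
```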
`It has been conventional practice to perform this kind of
`positive value conversion processing and saturation calcula-
`tion processing by converting the sum-product value using a
`subroutine that corrects the sum-product value. An example
`of a subroutine that corrects the sum-product value is
`explained below. In this example, the register width and the
`calculation width of the calculation unit are 32 bits, with the
`width of the MCR being 32 bits, and the sum-product value
`being expressed as a coded 16-bit integer. The data that can be
`handled by the reproduction-related hardware needs to be
`expressed using uncoded 8-bit integers. This subroutine is set
`as using the data register D0 for storing the calculation result.
`Each instruction is expressed using two operands, with the
`left and right operands being respectively called the first and
`the second operands. The second operand is used both to
`indicate the transfer address of a transfer instruction and the
`storage address of an arithmetical instruction.
`
`
`

`

`Instruction 1: MOV MCR,D0
`Instruction 2: CMP 0xFFFF_8000,D0
`Instruction 3: BCC CARRY
`Instruction 4: MOV 0x0000_0000,D0
`Instruction 5: BRA END
`CARRY:
`Instruction 6: CMP 0x0000_00FF,D0
`Instruction 7: BCS END
`Instruction 8: MOV 0x0000_00FF,D0
`END: (end of positive conversion saturation calculation
`processing)
`Describing the above instructions in order, Instruction 1,
`“MOV MCR,D0”, transfers the stored value of the MCR
`register into the data register D0.
`Instruction 2, “CMP 0xFFFF_8000,D0”, compares the value in the data register
`with the immediate “0xFFFF_8000”, where “0x” shows that
`the value is given in hexadecimal. This comparison is per-
`formed by subtracting the immediate “0xFFFF_8000” given
`in the first operand from the stored value of the data register
`D0 given in the second operand.
`The sixteenth bit of the immediate “0xFFFF_8000” in
`Instruction 2 is the code bit used for a 16-bit coded integer, so
`that when the stored value of the data register D0 is greater
`than the immediate “0xFFFF_8000”, this shows that the
`value stored in the MCR is a negative number.
`On the other hand, when the stored value of the D0 register
`is less than “0xFFFF_8000”, this shows that the value stored
`by the MCR is a positive number. If this number is a positive
`number, a carry is performed and the carry flag in the flag
`register is set.
`The letter “B” in the “BCC” in Instruction 3 stands for
`
`“Branch”, while the letters “CC” stand for “Carry Clear”.
`When the comparison in Instruction 2 finds that the stored
`value of the register D0 is less than the immediate
`“0xFFFF_8000”, a branch is performed to Instruction 6 which has the
`label “CARRY”. Conversely, when the comparison in
`Instruction 2 finds that the stored value of the register D0 is
`greater than the immediate “0xFFFF_8000”, Instruction 4,
`“MOV 0x0000_0000,D0” transfers the value zero into the
`register D0, amending the sum-product value to zero. After
`this amendment, the unconditional branch “BRA END” in
`Instruction 5 is performed to transfer the processing to the
`“END” label, thereby completing the positive conversion
`processing.
`The processing described above is performed when the
`stored value of the register D0 is negative. The following is a
`description of the processing performed when the stored
`value of the register D0 is less than the immediate
`“0xFFFF_8000”.
`In such a case, Instruction 6, “CMP 0x0000_00FF,D0”,
`compares the stored value of the register
`D0 with the immediate “0x0000_00FF”. This comparison is
`performed by subtracting the immediate “0x0000_00FF”
`given in the first operand from the stored value of the data
`register D0 given in the second operand. When the stored
`value of the D0 register is smaller than the immediate
`“0x0000_00FF”, a carry is performed and the carry flag in
`the flag register is set.
`The letters “CS” in Instruction 7, “BCS END”, stand for
`“Carry Set”, so that when the carry flag is set, a branch is
`performed to the label “END” from Instruction 7.
`When the carry flag is not set, no branch is performed in
`Instruction 7 and processing advances to Instruction 8, “MOV
`0x0000_00FF,D0”, where the immediate “0x0000_00FF”
`is transferred into the register D0 to amend the calculation
`result to “0x0000_00FF”, thereby completing the saturation
`calculation processing.
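The control flow of the eight-instruction subroutine can be followed with a small Python model. This is a sketch under stated assumptions rather than an emulator of any real instruction set: the function name `amend_sum_product` is hypothetical, the MCR value is assumed to be a 32-bit pattern holding a sign-extended 16-bit integer, the comparisons are treated as unsigned, and the boundary pattern 0xFFFF_8000 itself is assumed to count as negative. The model works on values only and does not reproduce the carry flag.

```python
MASK32 = 0xFFFF_FFFF


def amend_sum_product(mcr):
    """Return (amended value, list of instruction numbers executed)."""
    trace = [1]                # Instruction 1: MOV MCR,D0
    d0 = mcr & MASK32
    trace += [2, 3]            # Instruction 2: CMP 0xFFFF_8000,D0; Instruction 3: BCC CARRY
    if d0 >= 0xFFFF_8000:      # negative sum-product value
        trace += [4, 5]        # Instruction 4: MOV 0x0000_0000,D0; Instruction 5: BRA END
        return 0x0000_0000, trace
    trace += [6, 7]            # Instruction 6: CMP 0x0000_00FF,D0; Instruction 7: BCS END
    if d0 < 0x0000_00FF:       # in range: Instruction 7 branches to END
        return d0, trace
    trace.append(8)            # Instruction 8: MOV 0x0000_00FF,D0
    return 0x0000_00FF, trace


for pattern in (0xFFFF_C080, 0x0000_007F, 0x0000_3790):
    value, trace = amend_sum_product(pattern)
    print(hex(pattern), hex(value), trace)
```

In this model every path executes two of the three branch instructions, whichever case applies, which is the property the following paragraphs identify as a problem for pipelined execution.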
`
`
`The problem with the sum-product value amendment pro-
`cess described above lies in the considerable increase in code
`size caused by the insertion of the above eight instructions for
`one amendment of a sum-product value. When the program is
`written into a ROM to embed the software into the informa-
`tion processing apparatus, the required amount of installed
`ROM will need to be increased by an amount equal to
`this increase in code size, leading to an increase in manufac-
`turing cost. A large number of manufacturers of domestic
`appliances such as digital video players, electronic note-
`books, and word processors seek to improve on their rivals’
`products by using their own decompression processing pro-
`grams, although the installation of such decompression pro-
`cessing programs presently has the drawback of increasing
`costs by increasing the required amount of ROM, making
`such installation problematic.
`There is also the problem that since eight instructions need
`to be executed to correct one sum-product value, there is a
`large increase in processing time. When, as shown in FIG. 2,
`an approximation calculation for an inverse DCT is per-
`formed by multiplying compressed data Fij (where i, j = 1, 2,
`3, 4, 5 ... 8) composed of 8*8 elements with a coefficient matrix
`Gji (where i, j = 1, 2, 3, 4, 5 ... 8) also composed of 8*8 elements
`to produce the multiplication result matrix Hij (where i, j = 1,
`2, 3, 4, 5 ... 8), the calculation of the matrix multiplication
`result element H11 requires the sum-product processing of
`the multiplication results of one column of compressed data
`elements F11, F21, F31, F41, F51, F61, F71, F81 by one row
`of coefficient data elements G11, G12, G13, G14, G15, G16,
`G17, G18. The result is then subjected to positive conversion
`saturation calculation processing. Following this, the calcu-
`lation of the matrix multiplication result element H12
`requires the sum-product processing of the multiplication
`results of the column of compressed data elements F12, F22,
`F32, F42, F52, F62, F72, F82 by one row of coefficient data
`elements G11, G12, G13, G14, G15, G16, G17, G18, with the
`sum-product result then being subjected to positive conver-
`sion saturation calculation processing.
`The same sum-product processing and positive conversion
`saturation calculation processing is required to obtain the
`other matrix multiplication result elements H21, H31, H41,
`H51, H61, H71, H81, ..., and since there are 64 elements in
`the coefficient matrix Gij (where i, j = 1, 2, 3, 4, 5 ... 8), the
`sum-product value amending subroutine for positive conver-
`sion saturation calculation processing needs to be performed
`64 times. This sum-product value amending subroutine
`includes branch instructions (as Instructions 3, 5, and 7), so
`that when this sum-product value amending subroutine is
`executed, branches will occur regardless of whether negative
`values or saturation occur, so that the 64 iterations of the
`subroutine will not be performed smoothly. When attempts
`are made to improve the processing speed of the sum-product
`operation by introducing pipeline processing to the processor,
`the execution of the stated three branch instructions will result
`in a noticeable drop in processing efficiency.
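The overhead for one 8*8 block can be estimated with simple arithmetic. The Python sketch below is a rough illustration only: the per-path instruction lists are an assumption, counted by hand from the eight-instruction listing (Instructions 1 to 8), and the names `EXECUTED` and `totals` are hypothetical.

```python
N = 8                 # matrix dimension in the FIG. 2 example
ELEMENTS = N * N      # result elements H11 .. H88

# Instructions executed on each path through the listing
# (assumption: read off the listing, one entry per possible outcome).
EXECUTED = {
    "negative": [1, 2, 3, 4, 5],
    "in range": [1, 2, 3, 6, 7],
    "saturated": [1, 2, 3, 6, 7, 8],
}
BRANCHES = {3, 5, 7}  # BCC, BRA, and BCS in the listing


def totals():
    """Return (fewest, most) amendment instructions and the fewest branch
    instructions executed while amending one 8*8 block."""
    lengths = [len(path) for path in EXECUTED.values()]
    fewest = ELEMENTS * min(lengths)
    most = ELEMENTS * max(lengths)
    branches = ELEMENTS * min(len(BRANCHES & set(p)) for p in EXECUTED.values())
    return fewest, most, branches


print(totals())  # (320, 384, 128)
```

On this count, amending one block costs between 320 and 384 executed instructions, of which at least 128 are branches, compared with 64 instructions if each amendment were a single instruction.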
`In order to increase the speed of the matrix multiplication,
`it is possible to install a specialized circuit for performing
`matrix multiplication. However, if all of the matrix multipli-
`cations are performed by a specialized circuit, there would be
`a vast increase in hardware, and the processor characteristic
`known as versatility, whereby the processor executes a variety
`of processes in accordance with the program written by the
`programmer, is lost. If the versatility of the processor is lost,
`there is the risk that the processor will not be able to respond
`
`to programmers’ wishes, and so will not, for example, be able
`to execute an original decompression processing program.
`
`SUMMARY OF THE INVENTION
`
`It is a primary object of the present invention to provide a
`processor that can perform a rounding process made up of a
`positive conversion process and a saturation calculation pro-
`cess at high speed, while minimizing the increase in code size
`caused by the rounding process.
`The stated object can be achieved by a processor that suc-
`cessively decodes and executes instructions in an instruction
`sequence, the instruction sequence including instructions that
`indicate a storage address of a value used in an operation, the
`processor including: a detecting unit for detecting whether a
`next instruction to be decoded includes an operation content
`indication showing that the next instruction is a correction
`instruction and, if present, reading the operation content indi-
`cation; and a rounding unit for rounding, when the detecting
`unit has detected an operation content indication showing that
`the next instruction is a correction instruction, a coded m-bit
`integer stored at a storage address indicated by the instruction
`to a value expressed as an uncoded s-bit integer (where s<m).
`With the stated construction, the processing for rounding
`values is performed once each time a correction instruction is
`detected out of the instruction sequence, so that the rounding
`process can be executed by the programmer writing only one
`instruction.
`
`As the rounding process is performed according to one
`correction instruction, the execution time for one execution of
`the rounding process is extremely short. When the rounding
`of calculated values is required very often, such as when
`decompressing data, there will not be a significant increase in
`the time taken by the decompression processing.
`Since the rounding process can be performed by simply
`executing a correction instruction, when the processor
`attempts to perform a sum-products operation at high speed
`through pipeline processing, there will be no confusion in the
`pipeline. Accordingly, the code size of the instruction
`sequence can be reduced and the execution of the instruction
`sequence made faster by adding a small amount of hardware
`to the processor.
`The stated object can also be achieved by a processor that
`successively decodes and executes instructions in an instruc-
`tion sequence, the instruction sequence including instructions
`that indicate a storage address of a value to be used in an
`operation, the processor including: a first detecting unit for
`detecting whether a next instruction to be decoded includes an
`indication showing that the instruction has a calculation per-
`formed; a second detecting unit for detecting whether the next
`instruction to be decoded includes an indication showing that
`calculation is to be performed and that rounding is to be
`performed on a calculation result; a calculating unit for per-
`forming, when the first detecting unit detects that the next
`instruction includes an indication showing that the instruction
`has a calculation performed, a calculation using an m-bit
`integer in accordance with the indication; and a rounding unit
`for rounding, when the second detecting unit has detected that
`the next instruction to be decoded includes an indication
`showing that rounding is to be performed, a calculation result
`of a calculation that uses an m-bit integer to a value expressed
`as an uncoded s-bit integer (where s<m).
`With the stated construction, correction instructions for
`performing a rounding process of a coded calculation result
`are provided, so that the two processes composed of a calcu-
`lation process and a rounding process can be performed in a
`single step. As a result, positive conversion saturation calcu-
`lation processing is performed in the same step as the calcu-
`lation processing, so that the effective number of steps taken
`by the positive conversion saturation calculation processing is
`zero.
`
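The rounding of a coded m-bit integer to an uncoded s-bit integer (where s<m) can be written as a single function of the register content. The Python sketch below is an illustration under assumptions, not the claimed circuit: the function name `round_coded_to_uncoded` is hypothetical, the register content is modelled as an unsigned bit pattern whose most significant bit is the code (sign) bit, and the default widths m=32 and s=8 are chosen to match the 32-bit register and 8-bit output used as examples in the text.

```python
def round_coded_to_uncoded(raw, m=32, s=8):
    """Round an m-bit two's complement pattern to an uncoded s-bit value."""
    if not 0 < s < m:
        raise ValueError("s must satisfy 0 < s < m")
    raw &= (1 << m) - 1              # keep only the m register bits
    if (raw >> (m - 1)) & 1:         # code bit set: negative, so convert to zero
        return 0
    limit = (1 << s) - 1             # largest uncoded s-bit value
    return limit if raw > limit else raw


print(hex(round_coded_to_uncoded(0xFFFF_C080)))        # 0x0
print(hex(round_coded_to_uncoded(0x0000_3790)))        # 0xff
print(hex(round_coded_to_uncoded(0x0001_2345, s=16)))  # 0xffff
```

Because the function contains no data-dependent sequence of separate instructions, it corresponds to the single-step behaviour the summary describes; widths of 24, 16, and 8 bits can be selected through the `s` parameter.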
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`These and other objects, advantages and features of the
`invention will become apparent from the following descrip-
`tion thereof taken in conjunction with the accompanying
`drawings which illustrate a specific embodiment of the inven-
`tion. In the drawings:
`FIG. 1 shows a conventional construction composed of an
`ALU 61 and a sum-product result register 62;
`FIG. 2 gives a representation of multiplication of matrices
`composed of N*N elements;
`FIG. 3 shows the construction of the processor of the first
`embodiment of the present invention;
`FIG. 4 shows the construction of the operation execution
`apparatus 14 in the present embodiment;
`FIG. 5 shows an instruction sequence composing the
`matrix multiplication subroutine in the present embodiment;
`FIG. 6 shows the instruction format of a sum-product func-
`tion multiplication instruction “MACCB D1,D1” in the
`present embodiment;
`FIG. 7 shows the instruction format of a positive conversion
`saturation calculation instruction “MCSST” in the
`present embodiment;
`FIG. 8A shows the 32-bit expressions that are the multi-
`plier, the multiplicand, the sum-product value, and the matrix
`multiplication result element;
`FIG. 8B shows how the sum-product value is converted by
`the positive conversion saturation calculation circuit 3;
`FIG. 9 is a truth value table showing the relation of the
`combination of the output values of the constant generator 21
`and the zero generator 25 with the output of the multiplexer
`24;
`FIG. 10 shows the flow of data when performing an 8*8 bit
`multiplication using a 32*32 bit multiplication/sum-product
`unit;
`FIG. 11 shows the flow of data when performing an 8*8 bit
`multiplication using a 32*32 bit multiplication/sum-product
`unit;
`FIG. 12A shows an example of the pipeline processing
`performed by the processor shown in FIG. 3;
`FIG. 12B shows the execution according to pipeline pro-
`cessing of a matrix multiplication subroutine inside the pro-
`cessor shown in FIG. 3;
`FIG. 13 shows the instruction format of a positive conversion
`saturation calculation instruction “MCSST” in the
`applied example in the first embodiment;
`FIG. 14 shows the internal construction of the operation
`execution apparatus 14 in the first embodiment;
`FIG. 15 shows the internal construction of the operation
`execution apparatus 14 in the second embodiment; and
`FIG. 16 shows the instruction format of a positive conver-
`sion saturation calculation multiplication instruction “MulB-
`SST Dm,Dn”.
`
`DESCRIPTION OF THE PREFERRED
`EMBODIMENTS
`
`First Embodiment
`
`The following is an explanation of the first embodiment of
`the present invention with reference to the drawings. FIG. 3
`shows the internal construction of the processor in the first
`embodiment of the present invention, which can be seen to be
`composed of a ROM 11, an instruction fetch circuit 12, a
`decoder 13, an operation execution apparatus 14, an address
`bus 17, and a data bus 18, with the address bus 17 and the data
`bus 18 being connected to the RAM 10.
`The RAM 10 stores the compressed data Fij (i,j=1,2,3,4, . . . 8
