`Dye
`
`US006173381B1
(10) Patent No.: US 6,173,381 B1
(45) Date of Patent: *Jan. 9, 2001
`
`(54) MEMORY CONTROLLER INCLUDING
`EMBEDDED DATA COMPRESSION AND
`DECOMPRESSION ENGINES
`
`(75) Inventor: Thomas A. Dye, Austin, TX (US)
`(73) Assignee: Interactive Silicon, Inc., Austin, TX
`(US)
`
(*) Notice: This patent issued on a continued prosecution application
filed under 37 CFR 1.53(d), and is subject to the twenty year patent
term provisions of 35 U.S.C. 154(a)(2).

Under 35 U.S.C. 154(b), the term of this patent shall be extended for
0 days.

This patent is subject to a terminal disclaimer.

`(21) Appl. No.: 08/916,464
(22) Filed: Aug. 8, 1997

Related U.S. Application Data

(60) Continuation of application No. 08/463,106, filed on Jun. 5, 1995,
now abandoned, which is a division of application No. 08/340,667,
filed on Nov. 16, 1994, now Pat. No. 6,002,411.

(51) Int. Cl.7 ........ G06F 13/00
(52) U.S. Cl. ... 345/521; 382/232
(58) Field of Search ........ 395/341, 888, 395/133, 159; 711/203, 170,
160, 133, 134, 136, 155, 159, 165; 709/247; 345/521, 202, 509; 710/68;
714/763, 764; 382/232
`
(56) References Cited

U.S. PATENT DOCUMENTS

4,008,460 * 2/1977 Bryant et al. ........ 395/463
4,688,108 * 8/1987 Cotton et al. ........ 358/261.1
4,881,075 * 11/1989 Weng ........ 341/87
4,929,946 * 5/1990 O'Brien et al. ........ 341/87

(List continued on next page.)

* cited by examiner
`Primary Examiner—Eddie P. Chan
`Assistant Examiner—Hong Kim
`(74) Attorney, Agent, or Firm—Conley, Rose & Tayon, PC;
`Jeffrey C. Hood
(57) ABSTRACT

An integrated memory controller (IMC) which includes data compression
and decompression engines for improved performance. The memory
controller (IMC) of the present invention preferably sits on the main
CPU bus or a high speed system peripheral bus such as the PCI bus and
couples to system memory. The IMC preferably uses a lossless data
compression and decompression scheme. Data transfers to and from the
integrated memory controller of the present invention can thus be in
either of two formats, these being compressed or normal
(non-compressed). The IMC also preferably includes microcode for
specific decompression of particular data formats such as digital video
and digital audio. Compressed data from system I/O peripherals such as
the hard drive, floppy drive, or local area network (LAN) are
decompressed in the IMC and stored into system memory or saved in the
system memory in compressed format. Thus, data can be saved in either a
normal or compressed format, retrieved from the system memory for CPU
usage in a normal or compressed format, or transmitted and stored on a
medium in a normal or compressed format. Internal memory mapping allows
for format definition spaces which define the format of the data and
the data type to be read or written. Software overrides may be placed
in applications software in systems that desire to control data
decompression at the software application level. The integrated data
compression and decompression capabilities of the IMC remove system
bottlenecks and increase performance. This allows lower cost systems
due to smaller data storage requirements and reduced bandwidth
requirements. This also increases system bandwidth and hence increases
system performance. Thus the IMC of the present invention is a
significant advance over the operation of current memory controllers.
`
`97 Claims, 19 Drawing Sheets
`
`
`
`
`
[Representative drawing. Labeled elements: CPU 102, cache 104, bus 106,
disk 120, IMC 140 with bus I/F, compression logic 302 and decompression
logic 304, and system memory 110 holding normal or compressed data.]
`
`Realtime 2019
`Page 1 of 37
`
`
`
`US 6,173,381 B1
`Page 2
`
U.S. PATENT DOCUMENTS

5,237,460 * 8/1993 Miller et al. ........ 395/888
5,247,638 * 9/1993 O'Brien et al. ........ 395/888
5,247,646 * 9/1993 Osterlund et al. ........ 395/888
5,353,425 * 10/1994 Matamy et al. ........ 711/144
5,357,614 * 10/1994 Pattisam et al. ........ 395/250
5,396,343 * 3/1995 Hanselman ........ 358/426
5,420,696 * 5/1995 Wegeng et al. ........ 358/468
5,455,577 * 10/1995 Slivka et al. ........ 341/51
5,479,587 * 12/1995 Campbell et al. ........ 358/1.17
5,483,622 * 1/1996 Zimmerman et al. ........ 358/1.15
5,504,842 * 4/1996 Gentile ........ 358/1.15
5,548,742 * 8/1996 Wang et al. ........ 711/128
5,559,978 * 9/1996 Spilo ........ 711/203
5,563,595 10/1996 Strohacker ........ 341/106
5,584,008 * 12/1996 Shimada et al. ........ 711/114
5,602,976 * 2/1997 Cooper et al. ........ 358/1.17
5,606,428 * 2/1997 Hanselman ........ 358/404
5,652,878 * 7/1997 Craft ........ 707/1
5,696,912 * 12/1997 Bicevskis et al. ........ 395/308
5,696,926 * 12/1997 Culbert et al. ........ 711/203
5,699,539 * 12/1997 Garber et al. ........ 711/2
5,708,763 * 1/1998 Peltzer ........ 395/115
5,812,817 * 9/1998 Hovis et al. ........ 711/173
5,828,877 10/1998 Pearce et al. ........ 395/670
`
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 1 of 19    US 6,173,381 B1

[FIG. 1 (PRIOR ART): data flow in a prior art computer system.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 2 of 19    US 6,173,381 B1

[FIG. 2: data flow in a computer system including the IMC. Labeled
elements: CPU, cache 104, system bus 106, IMC 140, system memory 110,
video display, audio DAC, boot device, controller, I/O bus 118,
keyboard, mouse.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 3 of 19    US 6,173,381 B1

[FIG. 3: block diagram of a computer system including an IMC. Drawing
text is not legible in this copy.]
`
`
`
`
`
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 4 of 19    US 6,173,381 B1

[FIG. 3A: alternate embodiment of the computer system of FIG. 3.
Drawing text is not legible in this copy.]
`
`
`
`
`
`
`
`
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 5 of 19    US 6,173,381 B1

[FIG. 3B: alternate embodiment of the computer system of FIG. 3.
Drawing text is not legible in this copy.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 6 of 19    US 6,173,381 B1

[FIG. 3C: alternate embodiment of the computer system of FIG. 3.
Drawing text is not legible in this copy.]
`
`
`
`
`
`
`
`
`
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 7 of 19    US 6,173,381 B1

[FIG. 3D: computer system including the IMC and using a prior art
architecture. Drawing text is not legible in this copy.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 8 of 19    US 6,173,381 B1

[FIG. 4: the IMC interfacing to system memory and a video display
monitor. Labeled elements: PCI bus, BIOS ROM 146, system memory, red and
green outputs to video, audio DAC 144.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 9 of 19    US 6,173,381 B1

[FIG. 5: IMC block diagram. Legible labels: engine 210, graphics engine
212, memory controller #1 221 (I/F 1), memory controller #2 222 (I/F 2),
instruction storage/decode 230, window assembler, audio shifter (audio
L/R), display storage buffer 244, display memory shifter 246, red DAC
250, green DAC 252, blue DAC 254. Other numerals shown: 204, 206, 214,
216, 220.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 10 of 19    US 6,173,381 B1

[FIG. 6: compression/decompression logic comprised in the IMC 140.
Drawing text is not legible in this copy.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 11 of 19    US 6,173,381 B1

[FIG. 6A: alternate embodiment with separate compression and
decompression engines. Drawing text is not legible in this copy.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 12 of 19    US 6,173,381 B1

[FIG. 7: Normal or compressed data transfer, no modification by IMC.
Labeled elements: normal or compressed data, system memory 110,
decompression logic 304.]

[FIG. 8: Memory to memory decompression. Labeled elements: bus I/F,
normal data, system memory 110, decompression logic 304.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 13 of 19    US 6,173,381 B1

[FIG. 9: Memory decompression to CPU or disk. Labeled elements: CPU
102, cache 104, bus 106, disk 120, bus I/F, decompression logic, I/O,
normal data, compressed data.]

[FIG. 10: Decompression from disk or CPU to memory. Labeled elements:
bus, system memory 110, decompression logic 304, normal data,
compressed data.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 14 of 19    US 6,173,381 B1

[FIG. 11: Decompression of disk to CPU. Labeled elements: system memory
110, logic 302, decompression logic 304, normal data, compressed data.]

[FIG. 12: Memory to memory compression. Labeled elements: bus I/F,
system memory, decompression logic 304.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 15 of 19    US 6,173,381 B1

[FIG. 13: Compression from memory to CPU or disk. Labeled elements:
system memory 110, decompression logic 304, compressed data.]

[FIG. 14: Compression from CPU or disk to memory. Labeled elements:
bus, decompression logic 304, normal data, compressed data.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 16 of 19    US 6,173,381 B1

[FIG. 15: operation of the IMC in compressing normal data obtained from
the CPU that is stored in compressed form on the hard disk 120. Drawing
text is not legible in this copy.]
`
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 17 of 19    US 6,173,381 B1

[FIG. 16: flowchart]
502  CPU requests data from the memory controller.
504  Data resides in main memory in a normal format?
       Yes: 506  Memory controller transfers requested data to CPU. End.
       No:  continue to the next decision.
504  Data resides in main memory in a compressed format?
       No:  510  Obtain requested data from disk.
       Yes: 522  Determine LRU data in main memory.
            524  Compress LRU data and store in main memory (or to disk).
            526  Decompress requested data and store in main memory.
            528  Provide requested data to CPU. End.
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 18 of 19    US 6,173,381 B1

[FIG. 17: memory mapping registers.
Address column: 0000xxxx, 0001xxxx, 0002xxxx, 0003xxxx, 0004xxxx,
0008xxxx.
Mapping Registers column: Compress Reads, Decompress Reads, Compress
Writes, Decompress Writes, Normal.]
`
`
`
U.S. Patent    Jan. 9, 2001    Sheet 19 of 19    US 6,173,381 B1

[FIG. 18: read and write operations for an address space shown in FIG.
17. Drawing text is not legible in this copy.]
`
`
`
`MEMORY CONTROLLER INCLUDING
`EMBEDDED DATA COMPRESSION AND
`DECOMPRESSION ENGINES
`
This is a continuation of application Ser. No. 08/463,106, now
abandoned, titled “Memory Controller Including Embedded Data
Compression and Decompression Engines” and filed Jun. 5, 1995, whose
inventor is Thomas A. Dye, which is a divisional of application Ser.
No. 08/340,667, now U.S. Pat. No. 6,002,411, titled “Integrated Video
and Memory Controller with Data Processing and Graphical Processing
Capabilities” and filed Nov. 16, 1994, whose inventor is Thomas A. Dye.
`
`FIELD OF THE INVENTION
`
The present invention relates to computer system architectures, and
more particularly to an integrated memory and graphics controller which
includes an embedded data compression and decompression engine for
increased system bandwidth and efficiency.
`
`DESCRIPTION OF THE RELATED ART
`
Since their introduction in 1981, the architecture of personal computer
systems has remained substantially unchanged. The current state of the
art in computer system architectures includes a central processing unit
(CPU) which couples to a memory controller interface that in turn
couples to system memory. The computer system also includes a separate
graphical interface for coupling to the video display. In addition, the
computer system includes input/output (I/O) control logic for various
I/O devices, including a keyboard, mouse, floppy drive, hard drive, etc.

In general, the operation of a modern computer architecture is as
follows. Programs and data are read from a respective I/O device such as
a floppy disk or hard drive by the operating system, and the programs
and data are temporarily stored in system memory. Once a user program
has been transferred into the system memory, the CPU begins execution of
the program by reading code and data from the system memory through the
memory controller. The application code and data are presumed to produce
a specified result when manipulated by the system CPU. The code and data
are processed by the CPU and data is provided to one or more of the
various output devices. The computer system may include several output
devices, including a video display, audio (speakers), printer, etc. In
most systems, the video display is the primary output device.

Graphical output data generated by the CPU is written to a graphical
interface device for presentation on the display monitor. The graphical
interface device may simply be a video graphics array (VGA) card, or the
system may include a dedicated video processor or video acceleration
card including separate video RAM (VRAM). In a computer system including
a separate, dedicated video processor, the video processor includes
graphics capabilities to reduce the workload of the main CPU. Modern
prior art personal computer systems typically include a local bus video
system based on either the peripheral component interconnect (PCI) bus
or the VESA (Video Electronics Standards Association) VL bus, or perhaps
a proprietary local bus standard. The video subsystem is generally
positioned on a local bus near the CPU to provide increased performance.

Therefore, in summary, program code and data are first read from the
hard disk to the system memory. The program code and data are then read
by the CPU from system memory, the data is processed by the CPU, and
graphical data is written to the video RAM in the graphical interface
device for presentation on the display monitor. The CPU typically reads
data from system memory across the system bus and then writes the
processed data or graphical data back to the I/O bus or local bus where
the graphical interface device is situated. The graphical interface
device in turn generates the appropriate video signals to drive the
display monitor. It is noted that this operation requires the data to
make two passes across the system bus and/or the I/O subsystem bus. In
addition, the program which manipulates the data must also be
transferred across the system bus from the main memory. Further, two
separate memory subsystems are required, the system memory and the
dedicated video memory, and video data is constantly being transferred
from the system memory to the video memory frame buffer. FIG. 1
illustrates the data transfer paths in a typical computer system using
prior art technology.

Computer systems are being called upon to perform larger and more
complex tasks that require increased computing power. In addition,
modern software applications require computer systems with increased
graphics capabilities. Modern software applications typically include
graphical user interfaces (GUIs) which place increased burdens on the
graphics capabilities of the computer system. Further, the increased
prevalence of multimedia applications also demands computer systems with
more powerful graphics capabilities. Therefore, a new computer system
and method are desired which provide increased system performance, and
in particular increased video and/or graphics performance, beyond that
possible using prior art computer system architectures.
`
`SUMMARY OF THE INVENTION
`
The present invention comprises an integrated memory controller (IMC)
which includes data compression/decompression engines for improved
performance. The memory controller (IMC) of the present invention
preferably sits on the main CPU bus or a high speed system peripheral
bus such as the PCI bus. The IMC includes one or more symmetric memory
ports for connecting to system memory. The IMC also includes video
outputs to directly drive the video display monitor as well as an audio
interface for digital audio delivery to an external stereo
digital-to-analog converter (DAC).

The IMC transfers data between the system bus and system memory and also
transfers data between the system memory and the video display output.
Therefore, the IMC architecture of the present invention eliminates the
need for a separate graphics subsystem. The IMC also improves overall
system performance and response using main system memory for graphical
information and storage. The IMC system level architecture reduces data
bandwidth requirements for graphical display since the host CPU is not
required to move data between main memory and the graphics subsystem as
in conventional computers, but rather the graphical data resides in the
same subsystem as the main memory. Therefore, for graphical output, the
host CPU or DMA master is not limited by the available bus bandwidth,
thus improving overall system throughput.

The integrated memory controller of the preferred embodiment includes a
bus interface unit which couples through FIFO buffers to an execution
engine. The execution engine includes a compression/decompression engine
according to the present invention as well as a texture mapping engine
according to the present invention. In the preferred embodiment the
compression/decompression engine comprises a single engine which
performs both compression and decompression. In an alternate embodiment,
the execution engine includes separate compression and decompression
engines.

The execution engine in turn couples to a graphics engine which couples
through FIFO buffers to one or more symmetrical memory control units.
The graphics engine is similar in function to graphics processors in
conventional computer systems and includes line and triangle rendering
operations as well as span line interpolators. An instruction
storage/decode block, which stores instructions for the graphics engine
and memory compression/decompression engines, is coupled to the bus
interface logic. A Window Assembler is coupled to the one or more memory
control units. The Window Assembler in turn couples to a display storage
buffer and then to a display memory shifter. The display memory shifter
couples to separate digital to analog converters (DACs) which provide
the RGB signals and the synchronization signal outputs to the display
monitor. The window assembler includes a novel display list-based method
of assembling pixel data on the screen during screen refresh, thereby
improving system performance. In addition, a novel antialiasing method
is applied to the video data as the data is transferred from system
memory to the display screen. The internal graphics pipeline of the IMC
is optimized for high end 2D and 3D graphical display operations, as
well as audio operations, and all data is subject to operation within
the execution engine and/or the graphics engine as it travels through
the data path of the IMC.

As mentioned above, according to the present invention the execution
engine of the IMC includes a compression/decompression engine for
compressing and decompressing data within the system. The IMC preferably
uses a lossless data compression and decompression scheme. Data
transfers to and from the integrated memory controller of the present
invention can thus be in either of two formats, these being compressed
or normal (non-compressed). The execution engine also preferably
includes microcode for specific decompression of particular data formats
such as digital video and digital audio. Compressed data from system I/O
peripherals such as the hard drive, floppy drive, or local area network
(LAN) are decompressed in the IMC and stored into system memory or saved
in the system memory in compressed format. Thus, data can be saved in
either a normal or compressed format, retrieved from the system memory
for CPU usage in a normal or compressed format, or transmitted and
stored on a medium in a normal or compressed format. Internal memory
mapping allows for format definition spaces which define the format of
the data and the data type to be read or written. Graphics operations
are achieved preferably by either a graphics high level drawing
protocol, which can be either a compressed or normal data type, or by
direct display of pixel information, also in a compressed or normal
format. Software overrides may be placed in applications software in
systems that desire to control data decompression at the software
application level. In this manner, an additional protocol within the
operating system software for data compression and decompression is not
required.

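The format definition spaces described above can be pictured with a small software model. The following is only a sketch under stated assumptions, not the IMC's actual mechanism: the names `Op`, `FORMAT_MAP`, `lookup`, and `MappedMemory` are hypothetical, the pairing of address windows with operations is invented for illustration, the windows are assumed to alias one backing store, and `zlib` merely stands in for whatever lossless scheme the hardware would use.

```python
import zlib
from enum import Enum


class Op(Enum):
    NORMAL = "normal"
    COMPRESS_READS = "compress reads"
    DECOMPRESS_READS = "decompress reads"
    COMPRESS_WRITES = "compress writes"
    DECOMPRESS_WRITES = "decompress writes"


# Hypothetical pairing of 64 KiB address windows with operations.
# The window is selected by the upper 16 bits of a 32-bit address.
FORMAT_MAP = {
    0x0000: Op.NORMAL,
    0x0001: Op.COMPRESS_READS,
    0x0002: Op.DECOMPRESS_READS,
    0x0003: Op.COMPRESS_WRITES,
    0x0004: Op.DECOMPRESS_WRITES,
}


def lookup(addr):
    """Return the operation for the window containing addr (default: normal)."""
    return FORMAT_MAP.get(addr >> 16, Op.NORMAL)


class MappedMemory:
    """Toy memory in which every window aliases one backing store (an assumption)."""

    def __init__(self):
        self._store = {}  # offset within window -> stored bytes

    def write(self, addr, data):
        op = lookup(addr)
        if op is Op.COMPRESS_WRITES:
            data = zlib.compress(data)
        elif op is Op.DECOMPRESS_WRITES:
            data = zlib.decompress(data)
        self._store[addr & 0xFFFF] = data

    def read(self, addr):
        data = self._store[addr & 0xFFFF]
        op = lookup(addr)
        if op is Op.COMPRESS_READS:
            return zlib.compress(data)
        if op is Op.DECOMPRESS_READS:
            return zlib.decompress(data)
        return data


mem = MappedMemory()
payload = b"pixel data " * 50
mem.write(0x0003_0010, payload)    # write through the compress-writes window
stored = mem.read(0x0000_0010)     # normal window: bytes as stored (compressed)
restored = mem.read(0x0002_0010)   # decompress-reads window: original bytes
```

In this model the application chooses the data format purely by the address it uses, which is one way to read the statement that no additional operating system protocol is required.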
The compression/decompression engine in the IMC is also preferably used
to cache least recently used (LRU) data in the main memory. Thus, on CPU
memory management misses which occur during translation from a virtual
address to a physical address, the compression/decompression engine
compresses the LRU block of system memory and stores this compressed LRU
block in system memory. Thus the LRU data is effectively cached in a
compressed format in the system memory. As a result of the miss, if the
address points to a previously compressed block cached in the system
memory, the compressed block is now decompressed and tagged as the most
recently used (MRU) block. After being decompressed, this MRU block is
now accessible to the CPU.
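
The LRU caching behavior described above can be sketched in software. This is a toy model under stated assumptions rather than the patented hardware: the class `CompressedLRUMemory`, its methods, and the synthetic "disk" contents are hypothetical, and `zlib` stands in for the unspecified lossless scheme. On a miss, the least recently used uncompressed block is compressed and kept in memory; a later access to that block decompresses it and makes it the most recently used block without touching the disk.

```python
import zlib
from collections import OrderedDict


class CompressedLRUMemory:
    """Toy model: a few uncompressed 'active' blocks plus a compressed pool."""

    def __init__(self, active_capacity):
        if active_capacity < 1:
            raise ValueError("active_capacity must be at least 1")
        self.active_capacity = active_capacity
        self.active = OrderedDict()  # block id -> raw bytes, LRU first
        self.compressed = {}         # block id -> compressed bytes
        self.disk_reads = 0

    def _load_from_disk(self, block_id):
        # Stand-in for a slow disk transfer; contents are synthetic.
        self.disk_reads += 1
        return f"block-{block_id}-".encode() * 64

    def _make_room(self):
        # Compress the LRU block(s) and keep them in memory instead of
        # discarding them.
        while len(self.active) >= self.active_capacity:
            lru_id, raw = self.active.popitem(last=False)
            self.compressed[lru_id] = zlib.compress(raw)

    def read(self, block_id):
        if block_id in self.active:
            # Hit on an uncompressed block: mark it most recently used.
            self.active.move_to_end(block_id)
            return self.active[block_id]
        self._make_room()
        if block_id in self.compressed:
            # Block was cached in compressed form: decompress, no disk access.
            raw = zlib.decompress(self.compressed.pop(block_id))
        else:
            raw = self._load_from_disk(block_id)
        self.active[block_id] = raw  # inserted last, i.e. tagged as MRU
        return raw


mem = CompressedLRUMemory(active_capacity=2)
first = mem.read(1)
mem.read(2)
mem.read(3)            # evicts block 1 into the compressed pool
again = mem.read(1)    # served by decompression, not from disk
```

The design choice worth noting is that eviction and reload both stay inside memory; only a block that has never been loaded costs a disk read in this model.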
`
The use of the compression/decompression engine to cache LRU data in
compressed format in the system memory greatly improves system
performance, in many instances by as much as a factor of 10, since
transfers to and from disk generally have a maximum transfer rate of 10
Mbytes/second, whereas the decompression engine can perform at over 100
Mbytes/second.
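
The factor of 10 follows directly from the two rates quoted above. A short worked calculation, assuming a hypothetical 4 Mbyte block (the block size cancels out of the ratio) and ignoring every other cost such as seek time or bus contention:

```python
DISK_RATE_MBYTES_S = 10.0     # maximum disk transfer rate stated in the text
DECOMP_RATE_MBYTES_S = 100.0  # decompression engine rate stated in the text


def transfer_time_s(size_mbytes, rate_mbytes_s):
    """Time to move size_mbytes at a sustained rate of rate_mbytes_s."""
    return size_mbytes / rate_mbytes_s


block_mbytes = 4.0  # hypothetical block size, chosen only for illustration
disk_time = transfer_time_s(block_mbytes, DISK_RATE_MBYTES_S)
decomp_time = transfer_time_s(block_mbytes, DECOMP_RATE_MBYTES_S)
speedup = disk_time / decomp_time

print(f"disk: {disk_time:.2f} s, decompress: {decomp_time:.2f} s, "
      f"ratio: {speedup:.0f}x")
```

Because the text gives the decompression rate as "over 100 Mbytes/second", the ratio computed here is best read as a lower bound for this simplified model.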

The integrated data compression and decompression capabilities of the
IMC remove system bottlenecks and increase performance. This allows
lower cost systems due to smaller data storage requirements and reduced
bandwidth requirements. This also increases system bandwidth and hence
increases system performance. Thus the IMC of the present invention is a
significant advance over the operation of current memory controllers.

BRIEF DESCRIPTION OF THE DRAWINGS
`
`A better understanding of the present invention can be
`obtained when the following detailed description of the
`preferred embodiment is considered in conjunction with the
`following drawings, in which:
FIG. 1 is a prior art diagram illustrating data flow in a prior art
computer system;
FIG. 2 is a block diagram illustrating data flow in a computer system
including an integrated memory controller (IMC) according to the present
invention;
FIG. 3 illustrates a block diagram of a computer system including an IMC
according to the present invention;
FIG. 3A illustrates an alternate embodiment of the computer system of
FIG. 3 including memory control and graphics/audio blocks coupled to the
system memory;
FIG. 3B illustrates an alternate embodiment of the computer system of
FIG. 3 including two IMCs coupled to the system memory;
FIG. 3C illustrates an alternate embodiment of the computer system of
FIG. 3 including a first IMC coupled to the cache bridge which couples
to system memory and a second IMC coupled to the PCI bus which couples
to system memory;
FIG. 3D illustrates a computer system including the IMC and using a
prior art architecture where the IMC couples to the PCI bus and uses a
separate frame buffer memory for video data;
FIG. 4 is a block diagram illustrating the IMC interfacing to system
memory and a video display monitor;
FIG. 5 is a block diagram illustrating the internal architecture of the
integrated memory controller (IMC) of the present invention;
FIG. 6 illustrates the compression/decompression logic comprised in the
IMC 140 according to the present invention;
FIG. 6A illustrates an alternate embodiment including separate
compression and decompression engines comprised in the IMC 140 according
to the present invention;
FIG. 7 illustrates normal or compressed data transfers in a computer
system incorporating the IMC where the IMC does not modify data during
the transfer;
FIG. 8 illustrates a memory-to-memory decompression operation performed
by the IMC according to the present invention;
FIG. 9 illustrates a memory decompression operation performed by the IMC
on data being transferred to the CPU or to a hard disk according to the
present invention;
FIG. 10 illustrates decompression of data received from the hard disk or
CPU that is transferred in normal format to system memory according to
the present invention;
FIG. 11 illustrates operation of the IMC decompressing data retrieved
from the hard disk that is provided in normal format to the CPU;
FIG. 12 illustrates a memory-to-memory compression operation performed
by the IMC according to the present invention;
FIG. 13 illustrates operation of the IMC 140 compressing data retrieved
from the system memory and providing the compressed data to either the
CPU or hard disk;
FIG. 14 illustrates compression of data in a normal format received from
the CPU or hard disk that is stored in compressed form in the system
memory;
FIG. 15 illustrates operation of the IMC in compressing normal data
obtained from the CPU that is stored in compressed form on the hard disk
120;
FIG. 16 is a flowchart diagram illustrating operation of a computer
system where least recently used data in the system memory is cached in
a compressed format to the system memory using the
compression/decompression engine of the present invention;
FIG. 17 illustrates memory mapping registers which delineate compression
and decompression operations for selected memory address spaces; and
FIG. 18 illustrates read and write operations for an address space shown
in FIG. 17.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Incorporation by Reference

U.S. patent application Ser. No. 08/340,667 titled “Integrated Video and
Memory Controller with Data Processing and Graphical Processing
Capabilities” and filed Nov. 16, 1994, is hereby incorporated by
reference in its entirety.

Prior Art Computer System Architecture

FIG. 1 illustrates a block diagram of a prior art computer system
architecture. As shown, prior art computer architectures typically
include a CPU 102 coupled to a cache system 104. The CPU 102 and cache
system 104 are coupled to the system bus 106. A memory controller 108 is
coupled to the system bus 106 and the memory controller 108 in turn
couples to system memory 110. In FIG. 1, graphics adapter 112 is shown
coupled to the system bus 106. However, it is noted that in modern
computer systems the graphics adapter 112 is typically coupled to a
separate local expansion bus such as the peripheral component
interconnect (PCI) bus or the VESA VL bus. Prior art computer systems
also typically include bridge logic coupled between the CPU 102 and the
memory controller 108 wherein the bridge logic couples to the local
expansion bus where the graphics adapter 112 is situated. For example,
in systems which include a PCI bus, the system typically includes a
host/PCI/cache bridge which integrates the cache logic 104, host
interface logic, and PCI interface logic. The graphics adapter 112
couples to frame buffer memory 114 which stores the video data that is
actually displayed on the display monitor. Modern prior art computer
systems typically include between 1 and 4 Megabytes of video memory. An
I/O subsystem controller 116 is shown coupled to the system bus 106. In
computer systems which include a PCI bus, the I/O subsystem controller
116 typically is coupled to the PCI bus. The I/O subsystem controller
116 couples to an input/output (I/O) bus 118. Various peripheral I/O
devices are generally coupled to the I/O bus 118, including a hard disk
120, keyboard 122, mouse 124, and audio digital-to-analog converter
(DAC) 144.

Prior art computer system architectures generally operate as follows.
First, programs and data are generally stored on the hard disk 120. If a
software compression application is being used, data may be stored on
the hard disk 120 in compressed format. At the direction of the CPU 102,
the programs and data are transferred from the hard disk 120 through the
I/O subsystem controller 116 to system memory 110 via the memory
controller 108. If the data being read from the hard disk 120 is stored
in compressed format, the data is decompressed by software executing on
the CPU 102 prior to being transferred to system memory 110. Thus
software compression applications require the compressed data to be
transferred from the hard disk 120 to the CPU 102 prior to storage in
the system memory 110.

The CPU 102 accesses programs and data stored in the system memory 110
through the memory controller 108 and the system bus 106. In processing
the program code and data, the CPU 102 generates graphical data or
graphical instructions that are then provided over the system bus 106
and generally the PCI bus (not shown) to the graphics adapter 112. The
graphics adapter 112 receives graphical instructions or pixel data from
the CPU 102 and generates pixel data that is stored in the frame buffer
memory 114. The graphics adapter 112 generates the necessary video
signals to drive the video display monitor (not shown) to display the
pixel data that is stored in the frame buffer memory 114. When a window
on the screen is updated or changed, the above process repeats whereby
the CPU 102 reads data across the system bus 106 from the system memory
110 and then transfers data back across the system bus 106 and local
expansion bus to the graphics adapter 112 and frame buffer memory 114.

When the computer system desires to store or cache data on the hard disk
120 in a compressed format, the data is read by the CPU 102 and
compressed by the software compression application. The compressed data
is then stored on the hard disk 120. If compressed data stored in system
memory 110 must be decompressed, the CPU 102 is required to read the
compressed data, decompress the data and write the decompressed data
back to system memory 110.
`
Computer Architecture of the Present Invention

Referring now to FIG. 2, a block diagram illustrating the computer
architecture of a system incorporating the present invention is shown.
Elements in FIG. 2 that are similar or identical to those in FIG. 1
have the same reference numerals for convenience. As shown, the
computer system of the present invention includes a CPU 102 preferably
coupled to a cache system 104. The CPU 102 may include a first level
cache system and the cache 104 may comprise a second level cache.
Alternatively, the cache system 104 may be a first level cache system or
may be omitted as desired. The CPU 102 and cache system 104 are coupled
to a system bus 106. The CPU 102 and cache system 104 are also directly
coupled through the system bus 106 to an integrated memory controller
(IMC) 140 according to the present invention. The integrated memory
controller (IMC) 140 includes a compression/decompression engine for
greatly increasing the performance of the computer system. It is noted
that the IMC 140 can be used as the controller for main system memory
110 or can be used to control other memory subsystems as desired. The
IMC 140 may also be used as the graphics controller in computer systems
using prior art architectures having separate memory and video
subsystems.
`