`United States Patent
`5,467,087
`Nov. 14, 1995
`[45] Date of Patent:
`Chu
`
`[11] Patent Number:
`
`QC1ANAAA
`
`US005467087A
`
`[54] HIGH SPEED LOSSLESS DATA
`COMPRESSION SYSTEM
`
`[75]
`
`Inventor: Ke-Chiang Chu, Saratoga, Calif.
`
`[73] Assignee: Apple Computer, Inc., Cupertino,
`Calif.
`
`[21] Appl. No.: 992,972
`
`[22] Filed:
`
`Dec. 18, 1992
`
`5, Sep. 1978, pp. 530-536,J. Ziv, A. Lempel, “Compression
`of Individual Sequences via Variable-Rate Coding”.
`TEEETransactions on Information Theory vol. II, 21, No. 2,
`Mar. 1975, pp. 194-203. Peter Elias, “Universal Codeword
`Sets and Representations of the Integers”.
`Communications of the ACM,Apr. 1989 Vol. 32 No.4, pp.
`490-504, Edward R. Riala and Daniel H. Greene, “Data
`Compression with Finite Windows”.
`Timothy C. Bell, John G. Cleary, Ian H. Witten, Text
`Compression, Prentice Hall, Englewood Cliffs, New Jersey,
`1990, pp. 206-243.
`[ST] Ute Choeciccscccssssesssssssseessssneseeeceee HO03M 7/30
`System Enhancement Associates, ARC File Archive Utility
`
`[52]US.Cheeeeccececscssssccneecseecccesuescuseennennerens 341/51; 341/86
`Version 5.1, copyright 1985, 1986, p. 2.
`[58] Field of Search 0....0......cccssscsescsees 341/51, 65, 67,
`341/87, 95, 106, 107
`
`Primary Examiner—Howard L. Williams
`Attorney, Agent, or Firm—Irene Y. Hu
`
`[57]
`
`ABSTRACT
`
`A data compression process and system that identifies the
`data type of an input data stream and thenselects in response
`to the identified data type at least one data compression
`method from a set of data compression methods that pro-
`vides an optimal compression ratio for that particular data
`type, thus maximizing the compression ratio for that input
`data stream. Moreover, the data compression process also
`provides meansto alter the rate of compression during data
`compression for added flexibility and data compression
`efficiency. Furthermore, a system memory allocation process
`is also provided to allow system or user control over the
`amount of system memory to be allocated for the memory
`intensive data compression process. System memory allo-
`cation process estimates the memory requirement to com-
`press the input data stream, and allocates only that amount
`of system memory as needed by the data compression for
`memory allocation efficiency.
`
`[56]
`
`References Cited
`U.S. PATENT DOCUMENTS
`
`7/1968 Wernikoff ou...cesscssesesssereees 341/51
`3,394,352
`4,558,302 12/1985 Welch............
`.» 341/65
`5,010,345
`4/1991 Nagysce
`. 341/51
`5,045,852
`9/1991 Mitchell et al.
`w 341/67
`5,184,126
`2/1993 Nagy..........
`5,220,417
`6/1993 Sugiura ......cscsscscsscssesersesoeeeese 358/515
`5,247,638
`9/1993 O’BION .....creecccesscesernreseesensaees 341/87
`
`. 340/347
`
`
`...
`
`FOREIGN PATENT DOCUMENTS
`
`2-262766 10/1990
`
`Japan ......ccsssecsereneensceeees HOAN 1/41
`
`OTHER PUBLICATIONS
`
`IEEE, Computer, Jun. 1984, pp. 8-19, Terry A. Welch,
`Sperry
`Research
`Center,
`“A
`Technique
`_—for
`High-Performance Data Compression”.
`IEEE,Transactions On Information Theory, vol. I 23, No.
`3, May 1977, pp. 337-343 J. Ziv, A. Lempel, “A Universal
`Algorithm for Sequential Data Compression”.
`JEEE,Transactions On Information Theory, vol. I 24, No.
`
`IDENTIFY INPUT DATA
`TYPE AND GENERATE
`DATA TYPEiD SIGNAL
`108
`
`COMPRESS DATA
`IN ACCORDANCE
`WITH IDENTIHED
`DATA TYPE
`
`RETRIEVE DATA
`TYPE INFORMATION
`OF COMPRESSED
`DATA STREAM
`
`US Patent No. 9,054,728
`
`34 Claims, 8 Drawing Sheets
`
`Commvault Ex. 1025
`Commvault v. Realtime
`
`Page1
`
`Page 1
`
`Commvault Ex. 1025
`Commvault v. Realtime
`US Patent No. 9,054,728
`
`
`
`U.S. Patent
`
`Nov. 14, 1995
`
`Sheet 1 of 8
`
`5,467,087
`
`10
`
`13
`14 C
`11
`12
`SEB
`$9cio
`J
`
`15
`
`16
`
`(2
`12
`22
`faa|falFf—-poolFf
`
`A4
`
`Aa’
`
`FAG. I riorarn
`
`10
`
`13
`
`14
`—
`
`MOONFZ
`<>
`
`
`
`FIG. 2 erioraan
`
`Page 2
`
`Page 2
`
`
`
`U.S. Patent
`
`Nov. 14, 1995
`
`Sheet 2 of 8
`
`5,467,087
`
`Sv
`
`
`
`SNIGOONVWssNH
`
`NOISSSudWO9
`
`VLvd
`
`ONIGOO271
`
`(LUVHOIHd)
`
`< “
`
`Di
`
`Page 3
`
`NOISSSYdWO9
`
`vivd
`
`Page 3
`
`
`
`US. Patent
`
`Nov. 14, 1995
`
`Sheet 3 of 8
`
`5,467,087
`
`IDENTIFY INPUT DATA
`TYPE AND GENERATE
`DATA TYPE ID SIGNAL
`
`106
`
`112
`
`DECOMPRESS DATA
`IN ACCORDANCE
`WITH IDENTIFIED
`DATA TYPE
`
`COMPRESS DATA
`IN ACCORDANCE
`WITH IDENTIFIED
`DATA TYPE
`
`108
`
`RETRIEVE DATA
`TYPE INFORMATION
`OF COMPRESSED
`DATA STREAM
`
`110}
`
`Page 4
`
`Page 4
`
`
`
`U.S. Patent
`
`Nov. 14, 1995
`
`Sheet 4 of 8
`
`IDENTIFY INPUT DATA
`
`114
`
`SELECT Prax
`AND Lax
`
`4115
`
`GENERATE DATA TYPE
`ID SIGNAL AND
`PROVIDE SIGNAL TO
`COMPRESSION PHASE
`
`116
`
`COMPRESSION METHOD 5,467,087
`
`SELECT AT LEAST ONE
`DATA COMPRESSION
`METHOD FROM A SET
`OF DATA COMPRESSION
`METHODS ACCORDING
`TO INPUT DATA TYPE
`SIGNAL AND CONTROL
`
`RATE SIGNAL 4
`
`COMPRESS INPUT
`DATA STREAM
`USING SELECTED
`
`Page 5
`
`Page 5
`
`
`
`U.S. Patent
`
`Nov. 14, 1995
`
`Sheet 5 of 8
`
`5,467,087
`
`ESTIMATE
`MEMORY
`REQUIREMENT
`
`
`ALLOCATE
`MEMORY
`
`
`FOR DATA
`COMPRESSION
`
`128
`
` MONITOR
`
`COMPRESSION RATE
`CONTROL SIGNAL
`
`126
`
`
`
`
` DOES
`RATE CONTROLSIGNAL
`INDICATE TO SPEED UP
`COMPRESSION
`
`
`
`INCREASE THE SPEED
`
`OF COMPRESSION BY
`
`SWITCHING LZ METHOD OR
`
`
`DECREASEPrax Lax
`
`130
`
`
`
`
`
`
`
`
`COMPRESS DATA
`USING A SELECTED
`
`
`LZ-TYPE COMPRESSION
`
`METHOD
`a
`
`1
`
`
`
`COMPRESS DATA
`USING A SELECTED
`
`
`HUFFMAN CODE
`METHOD
`
`
`
`FIG. 6
`
`Page 6
`
`Page 6
`
`
`
`US. Patent
`
`Nov. 14, 1995
`
`Sheet 6 of 8
`
`5,467,087
`
`186
`
`SELECT A HUFFMAN
`LOOKUP TABLE
`ASSOCIATED WITH THE
`IDENTIFIED DATA TYPE
`
`132
`
`DECOMPRESS DATA
`USING SELECTED
`LOOKUP TABLE TO
`GENERATE A SET OF
`(p'L’)
`134
`
`DECOMPRESS THE
`p' L' TO GENERATE
`EXPANDED DATA
`
`FIG. 7
`
`Page 7
`
`Page 7
`
`
`
`U.S. Patent
`
`Nov. 14, 1995
`
`Sheet 7 of 8
`
`5,467,087
`
`00¢)
`
`Z0L
`
`NOISSSYdNOD
`
`WSLSAS
`
`VLvd
`
`
`
`NOISSSYdWO9-3Yd
`
`WALSAS
`
`viva
`
`Page 8
`
`Page 8
`
`
`
`
`
`U.S. Patent
`
`Nov. 14, 1995
`
`Sheet 8 of 8
`
`5,467,087
`
`00€)
`
`bbb
`
`NOISSSddWO03d
`
`WALSAS
`
`vLVvd
`
`
`
`NOISSSYdWOOSC“Sud
`
`WALSAS
`
`VLvd
`
`Page 9
`
`G “
`
`SDE
`
`Page 9
`
`
`
`
`
`
`
`1
`HIGH SPEED LOSSLESS DATA
`COMPRESSION SYSTEM
`
`5,467,087
`
`This patent application relates to copending and concur-
`rently filed patent application having the following patent
`application serial number andfiling date: Ser. No. 07/993,
`181, filed Dec. 18, 1992. This patent application and this
`copending patent application are commonly owned at the
`time of filing of this patent application.
`
`2
`(“FTP”), the FTP process queries the user to supply the data
`type information of the input data stream. Other prior art
`data manipulation processes include requiring a user to set
`a data type modebit, or to assume a particular data type of
`the input data stream. Assuming a particular data type is an
`inefficient method of manipulating data. If an electronic data
`manipulation process always assumes the data type to be 8
`bits, when in reality the input data type comprise 7 bits, the
`data type assumption by the process then results in a
`substantial waste of system memory to reserve an additional
`bit for each data symbol in the input data stream. Thus,it
`would be desirable to provide a method to automatically
`detect the data type of an input data stream.
`Additionally,
`typical prior art data compression tech-
`niquesare classified either as a statistical or a dictionary type
`of data compression method. A statistical
`type of data
`compression is based on single symbol coding. Single
`symbol coding is accomplished by assigning to each pos-
`sible data symbol in the input data stream a probability for
`Having an efficient data compression system is increas-
`the appearance of that symbol. Examples ofthis type of data
`ingly significant as electronics manufacturers compete with
`compression method are the Huffman code method and the
`each other for compactness and improved performance in
`widely published variations of this code. With the Huffman
`their electronic products. In particular, an increasing market
`coding method, a symbol having a greater probability of
`demand for a variety of portable electronic products has
`appearance is encoded with a short binary string, while a
`resulted in requiring a substantial reduction to the system
`symbol having a lowerprobability of appearance in the input
`real estate available for electronic data storage and data
`data stream is encoded with a longer binary string.
`manipulation in the designs of these products. Thus, with
`less electronic memory available, having an efficient data
`A dictionary type data compression method associates
`compression method is even morecritical in the designs of
`groups of consecutive characters, as in phrases, to a dictio-
`portable electronics, if these devices are to achieve the
`nary of indices. The dictionary type data compression meth-
`comparable operation of a larger electronic system.
`ods are also commonly referred to as a “codebook” or a
`“macro coding” approach. The various coding schemes in
`A variety of data compression techniques are known. The
`the Ziv-Lempel (“LZ”) family of data compression tech-
`performances of each of these various data compression
`niquesare all examplesof the dictionary type of data coding
`techniques are measured by the compression ratio, which is
`method. In the LZ family of data compression methods, a
`the length of an uncompressed input data stream to the
`typical LZ-type compression methodprocesses an input data
`length of its corresponding compressed data stream follow-
`stream by checking first if each current data string encoun-
`ing data compression. The compression ratio for each data
`tered in the input data stream matches a data string already
`compression technique, however, also varies depending on
`stored in the output data buffer. If no match of the current
`the data type of the input data stream. Some data compres-
`data string to previously stored data strings is detected, the
`sion techniques have a higher compression ratio for ASCII
`current data string is stored into the output buffer.
`If,
`type input data than for binary data type, while other data
`however, a match is detected betweenthe current datastring
`compression techniques result in a lower compression ratio
`and a datastring already stored in a memory location of the
`for ASCII data type andahigher ratio for binary data type.
`output data buffer, a pointer indicating that memory location
`Thus, for each data type, one or more data compression
`is stored into the output buffer instead of the data string.
`techniques can be identified which will provide an optimal
`data compression ratio according to that data type, while
`Shown in FIGS. 1 and 2 are two examples of LZ data
`other data compression techniques producing a lower com-
`compression methods. The LZ-1 compression method
`pression ratio for that particular data type should be avoided.
`shown in FIG. 1 processes an uncompressed input data
`stream 10 to generate a compressed data outputstream 20 by
`A variety of data types are knownandused by the industry
`comparing an uncompressed portion 13 of input data stream
`to encode characters, punctuation marks, and other symbols
`10 to data in a history buffer 11 of already processed input
`found in texts and communication protocols. Known data
`data. If a matching data string 12 is located in history buffer
`types include ASCII standard format, binary standard for-
`11 for current data string 14, data string 14 is encoded in
`mat, and unicode standard format. Although ASCH standard
`compressed data stream 20 as a pointer (p,, 1,) 24, corre-
`comprises a set of 8-bit binary numbers, only 7 ofthese bits
`sponding to an offset p, 15 and a data length 1, 16. The
`are typically used to represent an actual data symbol, while
`55
`shorter length data of pointer(p,, 1,) 24 thus replaces longer
`binary standard format encodes one data symbolin8bits.
`data string 14 in output compressed data stream 20.
`Unicode represents each data symbolwith twobytes, or a set
`of 16-bit binary numbers. The first byte, or the first 8-bit
`History buffer 11 is considered to compriseno dataat the
`prefix,
`indicates a data characteristic information of the
`time prior to data compression of input data stream 10. As
`16-bit data symbol. For example, the first byte might indi-
`the compression process progresses, history buffer 11
`cate that the 16-bit data symbol is a Kanji character.
`expands within a given system memory reserve according to
`how much of input data stream 10 has been processed until
`However, despite the variety of data types that are com-
`monly used in the industry, prior art data manipulation
`history buffer 11 reaches the maximum system memory
`allocation available for data compression. Thus, in the case
`processesdo not include automatic detection of the data type
`of an input data stream. Most prior art data manipulation
`where no matching string is found, as in the case for data
`string 12 during the initial data compression stage of input
`processes rely on the user or another source external to the
`data stream 10, unmatched string 12 is stored into output
`data manipulation process itself to supply such data type
`data stream 20 in the form ofa literal length header (LL,) 22
`information. For example,
`in a file transfer program
`
`FIELD OF INVENTION
`
`This invention relates to electronic data manipulation
`processes. More specifically, this invention relates to elec-
`tronic data compression systems used with a computer.
`
`BACKGROUND
`
`10
`
`15
`
`20
`
`25
`
`30
`
`45
`
`50
`
`60
`
`65
`
`Page 10
`
`Page 10
`
`
`
`5,467,087
`
`4
`FIG. 2 illustrates an example of the prior art LZ-2 data
`compression;
`FIG.3 illustrates an example of a prior art data compres-
`sion system;
`FIG. 4 howsa block diagram of one embodiment of a
`lossless data compression and decompression process taught
`in accordance with the principles of this invention;
`FIG.5 illustrates a detailed block diagram of one embodi-
`ment of the data compression process of FIG. 4;
`FIG.6 illustrates a detailed block diagram of one embodi-
`ment of the input data compression process of FIG. 5;
`FIG.7 illustrates a detailed block diagram of one embodi-
`ment of the data decompression process shown of FIG.4;
`FIG. 8 shows a block diagram of one embodiment of a
`lossless data compression system constructed in accordance
`with the principles of this invention; and
`FIG.9 illustrates a block diagram of one embodiment of
`a data decompression system constructed in accordancewith
`the principles of this invention.
`DETAILED DESCRIPTION
`
`10
`
`15
`
`20
`
`3
`followed by data string 12 duplicated from original data
`stream 10. Literal length header 22 encodes the number of
`characters, n, in unmatched string 12 that follows literal
`length header 22. This encoded information is recovered
`during data decompression to notify the decompression
`process of the number of data characters following literal
`length header 22, corresponding to the original input data
`that need not be expanded.
`The LZ-2 data compression methodof FIG. 2 searches for
`matching currentdata string 14 in a dictionary 30 ofindices.
`Dictionary 30 comprises a limited buffer length and data
`strings from input data stream 10. If a matching datastring
`12 is located in dictionary 30 for current data string 14,
`current data string 14 is then encoded in the output data
`stream with index 32 corresponding to the location of data
`string 12 in dictionary 30. Because the LZ-1 methodof FIG.
`1 searches for a matching data string character by character
`through the history buffer, the time required to compress
`input data stream 10 is substantially greater when using the
`LZ-1 method of FIG. 1 than with the LZ-2 method of FIG.
`2. However,
`the LZ-1 method provides a greater data
`compression ratio than the LZ-2 method.
`Data decompression is the conversion of a stream of
`FIG. 4 shows one embodiment of a high speed lossless
`compressed data backto its original expanded form. Decom-
`25
`data compression and decompression process 100 of the
`pression is typically accomplished with a lookup table, if the
`present invention. Data compression process 102 comprises
`data was compressed usingastatistical or a Huffman type
`two phases: a data pre-compression phase 106 and a com-
`coding scheme. If the data was compressed using a dictio-
`pression phase 108. Similarly, data decompression process
`nary type data compression method, such as the LZ-1
`104 also comprises two phases: a data type retrieval phase
`method (as explained above with reference to FIG. 1),
`110 and a decompression phase 112. During data compres-
`original data stream 10 is reconstructed by replacing each
`sion process 102, data pre-compression phase 106 first
`pointer(p, 1) encountered in compressed data stream 20 with
`receives an uncompressed input data stream 101 and iden-
`the data string in the history buffer locatedat offset p. If the
`tifies the data type of the input data stream. Data pre-
`data was compressed with an LZ-2 data compression
`compression phase 106 also generates a data type identifi-
`scheme (as explained above with reference to FIG. 2), the
`cation signal. Compression phase 108 then selects a data
`dictionary generated during data compression is used to
`compression method from a set of data compression meth-
`retrieve the indexed datastrings.
`ods according to the data type identification signal.
`FIG. 3 illustrates a typical prior art data compression
`Itis envisioned as within the scopeofthe principles taught
`system. Data compression system 40 receives an input
`in accordance with this invention that the set of data com-
`uncompressed data stream 10 and processes data stream 10
`pression methods can include a variety of data compression
`through a first data compression phase 42 using a first
`methods, such as the data compression methods from the
`predefined data compression technique. Alternatively, prior
`LZ-type family of data compression methods, the Huffman
`art data compression system 40 may also provide a second
`code family of data compression methods,or other such data
`data compression phase 44 using a second data compression
`compression methods, including the Arithmetic code, or
`technique also predefined by the design of data compression
`combinations of such data compression methods. In the
`system 40. Prior art data compression systems thus use the
`preferred embodimentof this invention, the set of compres-
`same data compression techniques incorporated by the data
`sion methods comprises a combination of LZ-type/Huff-
`compression system design regardless of the data type
`man-type compression methods. For example,if the input
`encountered in the input data stream. Because each data
`data type is identified as ASCH,the data type identification
`compression technique typically provides a different com-
`signal from pre-compression phase 106 indicates to com-
`pression ratio for different data types, prior art compression
`pression process 108 to select an LZ-1 and a Huffman type
`systems are unable to maximize the data compressionratio
`H, combination of compression methodsthat is designed to
`whenencountering a variety of input data types in the input
`provide an optimal compression ratio for ASCII type data.
`data stream. There is therefore a need to provide an efficient
`According to the selected combination of compression
`and flexible data compression system that maximizes the
`methods, compression process 108 then compresses the
`data compression ratios according to the input data type
`input data stream first with the LZ-1 data compression
`detected. Moreover, prior art data compression systems also
`method to generate a first set of compressed data. Thefirst
`do not maximize the usage of the CPU, such as to provide
`set of compressed data is then processed with the Huffman
`normal rate of data compression during the CPU’s idle time,
`type compression method, H,, to provide a second set of
`butincreasing the rate of data compression when the CPUis
`compressed data. Likewise for other data types, one or more
`preparing to process anothertask. It is therefore also desir-
`LZ-type/Huffman type combinations of compression meth-
`able to have a data compression system that provides
`ods, which provide an optimal data compression ratio for
`controlling means to increase or decrease the system’s rate
`one or more particular data types, can be includedin the set
`of data compression.
`of compression methods used during compression process
`BRIEF DESCRIPTION OF DRAWINGS
`108.
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`FIG. 1 illustrates an example of the prior art LZ-1 data
`compression;
`
`FIG.5 illustrates a detailed block diagram of the preferred
`embodiment of data compression process 102 of FIG. 4.
`
`Page 11
`
`Page 11
`
`
`
`5,467,087
`
`6
`5
`type signal generation process 116 generates a type identi-
`During data pre-compression 106, the data type of input data
`fication signal to be provided to compression phase 108.
`stream 101 is first identified with data type identification
`Compression method selection process 118 selects an LZ-
`process 114. In the preferred embodiment, data type iden-
`type/Huffman-type combination compression method in
`tification process 114 detects the input data type as either
`accordance with the data type identification signal, and input
`ASCI,binary, or unicode by analyzing a predefined number
`data compression process 120 compresses the input data
`of bytes of input data stream 101.
`stream according to the selected data compression method.
`Typically, in the ASCII format, a data symbol is encoded
`In the preferred embodiment of this invention, selection
`in only 7 bits out of a set of 8 bits, while the binary format
`process 118 selects, in accordance with the data type iden-
`uses all 8 bits to represent a data symbol. Consequently, a
`tification signal, a compression method havingafirst data
`byte of ASCH data corresponds to a decimal equivalent
`compression phase comprising an LZ-1 compression and a
`value in the range of 0-127, while a byte of data in binary
`second data compression phase comprising a Huffman-type
`format represents a decimal equivalent value in the range of
`compression or Arithmetic code compression.
`0-255. Thus, data type identification process 114 detects
`FIG.6 illustrates a detailed block diagram of one embadi-
`whether each byte of input data stream 101 corresponds to
`ment of compress input data process 120 of FIG. 5. In the
`a decimal equivalent value of greater than 127. If the current
`preferred embodiment of compress input data process 120,
`data byte correspondsto a decimal equivalent value greater
`a system memory allocation process 140 is provided to
`than 127, than the data type of input data stream 101 is
`allow the system or the user control of the amount of system
`identified as binary. If the current data byte corresponds to
`memory to be allocated for data compression. Memory
`a decimal equivalent value of less than 127, data type
`allocation process 140 estimates the memory requirement
`identification process 114 continues to check the next byte of
`necessary to compress input data stream 101 and then
`input data until the end of the input data stream. A consistent
`allocates that estimated amount of system memory for data
`pattern of data bytes, each comprising a decimal equivalent
`compression process 120. In the preferred embodiment,
`of less than 127, indicates that the input data type is ASCIL.
`memory allocation process 114 estimates the memory
`In the preferred embodiment, data type identification
`requirement in accordanceto the identified input data type,
`process 114 also detects for unicode format data type by
`a selected compression ratio, a selected speed of data
`comparing the first bytes of a predefined numberof pairs of
`compression, a selected p,,,,. and l,,,, value, or a selected
`bytes in input data stream 101. A typical data symbol! in
`combination ofthese features. As data compression process
`120 progresses and more system memory is needed to
`unicode is represented by a pair of bytes, with the first byte
`always Indicating the data characteristic (e.g., Kanji char-
`complete data compression of the input data stream,
`acter type) of the data encodedin the pair of bytes. Thus,if
`memory allocation process 140 then allocates additional
`the first bytes of the predefined numberofpairs of data bytes
`increments of system memory to data compression process
`matches, then the data type of input data stream 101 is
`120. Alternatively, it is also envisioned as within the scope
`identified as unicode.
`of the principles taught by this invention to have memory
`allocation process 140 provide an initial memory allocation
`of a predefined range of system memory for data compres-
`sion process 120 without first estimating a memory alloca-
`tion requirement. Memory allocation process 140 then sub-
`sequently provides additional increments of system memory
`during the compression process as is needed.
`Once the initial system memory is allocated, first data
`compression process 122 commences data compression of
`the input data stream using the LZ-type data compression
`method to generate a first set of compressed data. During a
`second compression process 124,the first set of compressed
`data is compressed using the Huffman-type code compres-
`sion method.
`
`20
`
`40
`
`45
`
`Oncethe datatypeis identified, data pre-compression 106
`also preferably encodes is data type information in any
`knownstandard used in the industry as means for denoting
`the data type of a data stream. The typical standard used for
`denoting a data type of a data stream is to encode the data
`type information in a header located at the beginning of an
`output data buffer. The data type information is then
`retrieved by decoding the header during data decompression
`to identify the data type of the compressed data stream being
`decompressed.
`In the preferred embodiment of the present invention,
`during data pre-compression phase 106, a p,,,, and l,,,,
`value are also selected which provide an optimal LZ-1 data
`compression ratio according to the identified data type.
`Table 1 illustrates a range ofp,,,,, and 1,,,,, values for ASCII
`and binary type data that may be used with the LZ-1 data
`compression method. Selecting a lower p,,, typically
`increases the rate of data compression, while typically
`decreasing the compression ratio. Similarly, selecting a
`lower1,,,,, also typically increases the rate of data compres-
`sion, since a shorter character length | requires less search
`time. Selecting a lowerl,,,,, also typically results in a lower
`compression ratio. Thus, varying the p,,,, and the l,,..
`parameters typically produces a different compression time
`and a different compression ratio.
`,
`
`TABLE1
`
`Data Type
`
`Pmax Range
`
`linax Range
`
`ASCIL
`Binary
`
`2K-8K bytes
`16K-32K bytes
`
`16-2048 bytes
`16-256 bytes
`
`Asshownin FIG.5, once p,,,,, and 1,,,,, are selected, data
`
`50
`
`55
`
`60
`
`65
`
`In an alternative embodiment of input data compression
`process 120 also shown in FIG. 6, a compression rate control
`signal 103 is provided to data compression process 120.
`Data compression rate adjustment process 130 adjusts the
`values Of lax Pmax OF both 1,,,, and p,,,, to increase or
`decrease the compression process speed in response to data
`compression rate control signal 103.
`In an alternative
`embodiment of data compression rate adjustment process
`130, adjustment process 130 indicates to LZ-type compres-
`sion process 122 whether to use LZ-1 or LZ-2 compression
`in accordance with data compressionrate control signal 103
`to adjust the compression time for compressing data. Thus,
`data compression rate adjustment process 130 provides data
`compression process 100 with the flexibility to adjust the
`compression speed during data compression.This flexibility
`provides compression system 100 means to maximize the
`CPU’s idle time to do data compression and means to
`increase the data compression speed when the CPU is in
`preparation to begin another process.
`FIG.7 illustrates an example of a detailed block diagram
`
`Page 12
`
`Page 12
`
`
`
`5,467,087
`
`7
`of the preferred embodiment of data decompression process
`112 of FIG.4. Once the data type information of compressed
`data stream 107 is retrieved by decoding the header of
`compressed data stream 107, lookup table selection process
`132 selects a corresponding Huffman code lookup table that
`is associated with that data type. A first data decompression
`process 134 then processes the compressed data using the
`selected lookup table to generate a first set of decompressed
`data. A second decompression process 136 then processes
`the first set of decompressed data using the selected LZ type
`decompression codebook to provide as output an expanded
`original data stream.It is also envisioned as within the scope
`of the principles taught by this invention that other such data
`decompression algorithms may be substituted during data
`decompression process 112 to decompress compressed data
`stream 107, if another compression algorithm was selected
`during data compression process 102 in response to the
`particular data type of the original data stream.
`FIG. 8 illustrates the preferred embodiment of a data
`compression system 200 constructed in accordance with the
`principles of this invention. Data pre-compression system
`202 receives an input data stream 101 andidentifiesits data
`type. Data pre-compression system 202 also generates a data
`type identification signal 105 in response to the identified
`data type of input data stream 101. Data compression system
`204, which also receives input data stream 101, is coupled
`to data pre-compression system 202 to receive data type
`identification signal 105. Data compression system 204 thus
`compresses input data stream 101 in accordance with the
`identified input data type. In one embodimentof this inven-
`tion, data compression system 204 selects in response to data
`type identification signal 105 at least one data compression
`method from a set of data compression method. Data com-
`pression system 204 then processes input data stream 101
`according to the selected data compression method to gen-
`erate a compressed data output stream 107. In another
`embodimentofthis invention, data compression system 204
`receives a data compression rate contro] signal 103 and
`adjusts the selected data compression method in response to
`compression rate control signal 103.
`In the preferred embodiment of data compression system
`200, data pre-compression system 202 comprises data type
`identification and data type signal generation process 106
`and data compression system 204 comprises compression
`data process 108 as explained with reference to FIG. 4.
`FIG. 9 illustrates one embodiment of a data decompres-
`sion system 300 constructed in accordance with the prin-
`ciples of this invention. Data pre-decompression system 302
`receives a compressed data stream 107 andidentifies its data
`type. In response to the identified data type, data pre-
`decompression system 302 generates a compressed data type
`identification signal 109. Data decompression system 304,
`which also receives compressed data stream 107, is coupled
`to data pre-decompression system 302 to receive com-
`pressed data type identification signal 109. Data decompres-
`sion system 304 selects at least one data decompression
`method from a set of data decompression methods in
`response to compressed data type identification signal 109.
`Compressed data stream 107 is then decompressed by data
`decompression system 304 using the selected decompres-
`sion method to generate as output expanded original data
`stream 111.
`
`In the preferred embodiment of data decompression sys-
`tem 300, data pre-decompression system 302 preferably
`comprises data type retrieval process 110, while data decom-
`pression phase 304 comprises data decompression process
`112 as explained with reference to FIG. 4.
`
`10
`
`15
`
`20
`
`35
`
`40
`
`45
`
`50
`
`355
`
`60
`
`65
`
`8
`Data compression and decompression process 100 that
`identifies the data type of a data stream and then selects
`according to the identified data type at
`least one data
`compression method, which provides optimal data compres-
`sion ratio for that identified data type, thus maximizes the
`compression ratio of the input data stream. Moreover, data
`compression process 100 also provides meansto control the
`memory allocation for the data compression process and
`meansto alter the rate of compression during data compres-
`sion process. Each of these features provides an added
`flexibility that maximizes data compression efficiency.
`I claim:
`1, An electronic data compression process for compress-
`ing at least oneset of input data, the at least one set of input
`data being of a specific data type of a plurality of data types,
`the electronic data compression process comprises the steps
`of:
`
`identifying the specific data type of the set of input data;
`selecting at
`least one data compression method in
`response to the identified data type;
`compressing the set of input data with the selected atleast
`one data compression method; and
`receiving a data compression rate contro] indicator for
`varying the data compressionrate.
`2. The electronic data compression process of claim 1
`wherein the step of compressing the set of input data
`comprises adjusting the compression rate contro! indicator.
`3. An electronic data compression process for compress-
`ing