throbber
United States Patent
`
`[19]
`
`[11] Patent Number:
`
`5,973,630
`
`Heath
`
`[45] Date of Patent:
`
`Oct. 26, 1999
`
`US005973630A
`
`[54] DATA COMPRESSION FOR USE WITH A
`COMMUNICATIONS CHANNEL
`
`[57]
`
`ABSTRACT
`
`[75]
`
`Inventor: Robert Jefi' Heath, San Diego, Calif.
`
`[73] Assjgnee; Hughes E]ectmnic5 Corporation, E1
`Segundo, Cafif.
`
`21 A 1. N .2 09 258 782
`]
`pp
`0
`/
`’
`[
`[22]
`Filed;
`Mar, 1, 1999
`
`Related U_s_ Application Data
`
`[62] Division of application No. 08/982,864, Dec. 2, 1997.
`[51]
`Int. cm ...................................................... H03M 7/00
`[52] U.S. Cl.
`................................................. 341/87; 341/51
`[58] Field of Search .............................. .. 341/87, 51, 106,
`341/67, 95, 50
`
`[56]
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`5,473,326
`
`12/1995 Harrington et al.
`
`.................... .. 341/51
`
`Primary Examiner—Brian Young
`Attorney, Agent, or Firm—John T. Whelan; Michael W.
`Sales
`
`A method of compressing data involves receiving a symbol,
`and 3 Subsequent 5YH1b01; determining in 3 C0mPre55i0n
`dictionary whether the symbol has a valid extension pointer;
`using, in the event the symbol does have a valid extension
`pointer, the valid extension pointer to access string extension
`symbols; determining, in the event the symbol does have a
`valid extension pointer, Whether the string extension sym-
`bols equal the at least one subsequent symbol; determining
`in the compression dictionary, in the event the string exten-
`sion symbols do not equal
`the at least one subsequent
`symbol, Whether the symbol has a valid parallel extension;
`repeating,
`in the event
`the symbol has a valid parallel
`extension, the using step; repeating, in the event the string
`33331101;h:yg;§g§;s1g;;;;g;1;;g;§r jL:e:;:n‘;‘;j 3135:1133;
`extensién pointer; inserting’ in the event the Symbol does not
`have a valid extension pointer or in the event the symbol
`does not have a valid parallel extension, a code Word
`indicative of a longest string found into a compressed data
`stream; determining Whether the longest string was a single
`symbol; extending, in the event the longest string was a
`single symbol, the longest string by one symbol; extending,
`in the event the longest string was not a single symbol, the
`longest string by a plurality of symbols; inserting a string
`extension signalling code word if the string is extended by
`multiple symbols and transmitting the compressed data
`stream through the communications channel.
`
`20 Claims, 9 Drawing Sheets
`
`74
`
`/6‘
`
`DECOMPRESSION
`COMPRESSION
`
`
`DICTIONARY
`DICTIONARY
`
`(ADAPTIVE)
`(ADAPTIVE)
`24
`
`
`517
`
`
`
`ORACLE 1 104
`
`

`
`U.S. Patent
`
`5,973,630
`
`9SM.1..............p...................................4.mHmmmmHw_E>_E<9:
`
`
`
`
`
`___HHHH
`
`Nu
`
`
`
`._.:nH._.:OHmommmmmzoomo_ommmmEs_oo_mommmmazoo_E.<oSn_z_ E>_E<9:_H>m<zo:o_o>m<zo_.S_oHHzo_mmmEs_oomozo_mwmEzooH__mH__H1,HI__IHmI_mm_<55HM:_m<53
`
`
`

`
`U.S. Patent
`
`Oct. 26, 1999
`
`Sheet 2 of 9
`
`5,973,630
`
`COMPRESSION DICTIONARY
`LOCATION
`
`FIRST 256 ENTRIES OF
`COMPRESSION DICTIONARY ONLY
`HAVE AN EXTENSION POINTER
`
`101h
`102h
`
`106h
`107h
`108h
`
`3ffh
`
`INVALID
`INVALID
`
`0 23456
`
`9 A B C D E F10
`
`1213141516
`
`“COMPUTER-°COMPU
`
`IIIII
`
`oI
`
`'~2*+1;g:w
`
`COMPRESSED
`DATA STREAM
`
`c 0 M P u T E R --[100h][10Dh]
`
`-- [man]
`
`F/G.
`
`.3’
`
`

`
`U.S. Patent
`
`Oct. 26, 1999
`
`Sheet 3 of 9
`
`5,973,630
`
`01234557 9ABCDEF101213141516
`
`COMPRESSED
`DATA STREAM c 0 M P u T E R
`
`, ,
`
`[100h][10Dh]
`
`, ,
`
`[10311]
`
`CM”
`';:R
`O
`1
`1
`1111
`}£%%%%%Z%
`
`--COMPUTER--COMPU
`1
`‘%
`
`

`
`U.S. Patent
`
`Oct. 26, 1999
`
`Sheet 4 of 9
`
`5,973,630
`
`600
`
`COMPRESSOR
`START
`
`602
`
`9
`
`GET NEXT
`SYMBOL TO
`PROCESS
`
`6'04
`
`USE POINTER TO
`DOES SYMBOL
`ACCESS STRING
`OR STRING HAVE A
`EXTENSION
`VALID EXTENSION
`SYMBOLS
`POINTER?
`
`
`608
`
`
`
`
`
`
`
`
`NO
`
`
`LONGEST EXISTING
`DO ALL STRING
`EXTENSION
`3
`RI
`MA¥G‘I'3OI;"oLSIND,NI(%s
`SYMBOLS
`
`
`
`EQUAL THE
`cone wono IS A
`
`SYMBOLS
`SINGLE SYMBOL
`
`FOLLOWING THE
`oI=I
`IMPLIED BY THE
`
`SYMBOL
`LAST STRING
` EXTENSION ENTRY
`BEING PROCESSED?
`
`
`
`614
`
`
`
`
`
`IS THERE A
`VALID PARALLEL
`EXTENSION
`POINTER?
`
`
`
`
`
`
`
`
`
`PLACE CODE WORD
`REPRESENTING
`LONGEST STRING
`(OR SINGLE
`SYMBOL)
`INTO
`THE COMPRESSED
`OUTPUT
`
`
`
`
`
`
`
`YES
`
`F/G. 6/I
`
`616‘
`
`
`
` WAS THE
`LONGEST STRING
`FOUND A SINGLE
`SYMBOL?
`
`
`
`NO
`
`

`
`U.S. Patent
`
`Oct. 26, 1999
`
`Sheet 5 of 9
`
`5,973,630
`
`644
`
`
`
`ARE THERE MORE
`THAN THE MAXIMUM
`
`STRING LENGTH OF
`REPEATED SYMBOLS,
`EQUAL TO THE FIRST
`SYMBOL OF THE
`LONGEST STRING,
`FOLLOWING THE
`
`LONGEST STRING
`
`FOUND?
`
`NO
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`YES
`
`
`
`
`
`
`
`
`
`PLACE A RUN
`LENGTH ENCODING
`
`SIGNALLING CODE
`
`WORD FOLLOWED BY
`THE NUMBER OF
`
`REPEATED
`
`SYMBOLS INTO
`
`THE COMPRESSED
`OUTPUT
`
`
`
`6'20
`
`
`IS LONGEST
`STRING EQUAL TO YES
`THE MAXIMUM
`
`STRING LENGTH?
`NO
`
`622
`
`USING THE LONGEST
`STRINGS EXTENSION
`ENTRY. COMPARE
`THE SYMBOLS AT
`THE LOCATION
`POINTER, PLUS THE
`COUNT PLUS ONE,
`TO THE SYMBOLS
`STARTING WITH THE
`NEXT SYMBOL
`TO BE PROCESSED
`
`
`
`6'24
`
`
`
`DO AT LEAST
`2 SYMBOLS
`
`MATCH?
`
`
`YES
`
`
`
`6'26‘
`
`
`
`
`
`CREATE A MULTI
`SYMBOL STRING
`EXTENSION ENTRY
`
`AND EXTEND THE
`
`STRING BY THE
`NUMBER OF
`SYMBOLS
`THAT MATCH
`
`
`
`
`
`
`
`
`
`F/G.
`
`63
`
`
`
`
`
`
`
`
`CREATE A ONE
`SYMBOL STRING
`EXTENSION ENTRY
`AND EXTEND THE
`
`STRING BY ONE
`
`SYMBOL, THE
`
`
`PLACE A MULTI
`NEXT SYMBOL
`SYMBOL
`TO PROCESS
`
`EXTENSION STRING
`
`SIGNALLING CODE
`WORD INTO THE
`COMPRESSED
`OUTPUT
`
`

`
`U.S. Patent
`
`Oct. 26, 1999
`
`Sheet 6 of 9
`
`5,973,630
`
`630
`
`LINK THE NEWLY CREATED STRING
`EXTENSION TO THE LONGEST STRING
`BY PLACING THE CREATED EXTENSION
`ENTRIES IMPLIED CODE WORD INTO
`THE EXTENSION POINTER OF THE
`LAST STRING EXTENSION OF THE
`
`LONGEST STRING
`
`6'32
`
`PLACE THE POINTER TO THE NEXT
`SYMBOL TO PROCESS INTO THE
`LOCATION POINTER OF THE NEWLY
`
`CREATED STRING EXTENSION ENTRY
`
`PLACE THE NUMBER OF EXTENSION
`SYMBOLS MINUS ONE INTO THE
`COUNT OF THE NEWLY CREATED
`
`STRING EXTENSION ENTRY
`
`634
`
`PLACE AN ILLEGAL VALUE INTO THE
`EXTENSION AND PARALLEL POINTERS
`OF THE NEWLY CREATED STRING
`EXTENSION ENTRY
`
`636
`
`O
`
`638
`
`
`BYPASS LONGEST STRING AND ANY
`OTHER SYMBOLS COMPRESSED
`BY THE CODE WORDS PLACED INTO
`
`
`THE COMPRESSED OUTPUT AND
`POINT TO THE NEXT
`
`
`UNPROCESSED SYMBOL
`
`
`
`640
`
` CARE THERE
`
`
`MORE SYMBOLS TO
`PROCESS?
`
`
`~°
`
`FIG. 50
`
`YES
`
`9
`
`

`
`U.S. Patent
`
`Oct. 26, 1999
`
`Sheet 7 of 9
`
`5,973,630
`
`700
`
`START
`
`702
`GET NEXT CODE
`wonu TO
`PROCESS
`
`6
`
`704
`
`705-
`
`DOES CODE
`ORR §ElPc§L%3ENT YES
`SYMBOL?
`
`PLACE THE SYMBOL INTO
`THE NEXT UNCOMPRESSED
`LOCATION IN THE OUTPUT
`
`
`
`
`
`USE THIS CODE WORD AS AN
`INDEX INTO THE DICTIONARY
`AND GET THE POINTER,
`IN THE
`PREVIOUSLY UNCOMPRESSED
`OUTPUT. TO THE FIRST
`SYMBOL OF THIS CODE
`WORDS STRING (WHICH IS
`AT THE LOCATION POINTER
`MINUS DEPTH PLUS ONE).
`COPY DEPTH NUMBER OF
`SYMBOLS FROM THAT
`POINTER TO THE NEXT
`UNCOMPRESSED LOCATION
`IN THE OUTPUT
`
`714
`
`0
`
`712
`
`USE THE PREVIOUS CODE WORD
`PROCESSED AS AN INDEX INTO
`THE DICTIONARY AND GET THE
`
`
`POINTER,
`IN THE PREVIOUSLY
`THE CODE
`
`UNCOMPRESSED OUTPUT, TO
`WORD IS NOT IN
`
`
`THE FIRST SYMBOL OF THE
`THE DICTIONARY
`BUT IS IT EQUAL YES PREVIOUS CODE WORDS STRING
`
`
`TO THE NEXT
`(WHICH IS AT LOCATION
`SEQUENTIAL
`POINTER MINUS DEPTH PLUS
`
`
`CODE WORD
`ONE). COPY DEPTH NUMBER OF
`TO BE BUILT?
`SYMBOLS FROM THAT
` POINTER TO THE NEXT
`NO
`UNCOMPRESSED LOCATION
`IN THE OUTPUT
`
`
` IS THE CODE
`WORD IN THE
`DICTIONARY
`(l.E.
`IS IT LESS
`THAN THE NEXT
`SEQUENTIAL
`CODE wono
`TO BE BUILT)?
`
`YES
`
`NO
`
`
`
`
`
`
`
`
`
`

`
`U.S. Patent
`
`Oct. 26, 1999
`
`Sheet 8 of 9
`
`5,973,630
`
`730
`
`YES
`
` IS THE
`
`DEPTH OF THE
`PREVIOUS CODE
`
`
`WORD PROCESSED
`EQUAL TO THE
`
`
`‘ MAXIMUM STRING
`
`LENGTH?
`
`
`
`NO
`
`732
`
`716‘
`
`THE CODE WORD IS NOT IN THE
`DICTIONARY BUT IT IS GREATER
`THAN THE NEXT SEQUENTIAL
`CODE WORD TO BE BUILT.
`USING THE PREVIOUS CODE WORD
`PROCESSED AS AN INDEX INTO
`THE DICTIONARY GET THE
`LOCATION POINTER AND DEPTH
`AND CREATE A "FROM POINTER“
`TO THE LAST SYMBOL. PLUS
`ONE. OF THE PREVIOUS CODE
`WORDS STRING. SUBTRACT THIS
`CODE WORD FROM THE NEXT
`SEQUENTIAL CODE WORD TO BE
`BUILT TO GET THE NUMBER OF
`SYMBOLS BY WHICH THE
`PREVIOUS CODE WORD IS
`EXTENDED..COPY THAT NUMBER OF
`SYMBOLS FROM THE “FROM
`POINTER‘-IN THE PREVIOUSLY
`UNCOMPRESSED OUTPUT TO THE
`NEXT UNCOMPRESSED LOCATION
`IN THE OUTPUT.
`
`734
`
`736
`
`FIG. 7B
`
`COPY THE FIRST
`SYMBOL OF THE
`PREVIOUS CODE WORDS
`STRING TO THE NEXT
`UNCOMPRESSED LOCATION.
`IN THE OUTPUT.
`
`
`
`
`
`
`
`
`
`
`
`
`THIS IS A RUN LENGTH
`ENCODING CODE WORD.
`GET THE NEXT CODE
`WORD WHICH IS THE
`NUMBER OF REPEATED
`SYMBOLS AND
`REPEAT THE FIRST
`SYMBOL OF THE
`PREVIOUS CODE WORDS
`STRING THAT NUMBER
`OF TIMES INTO THE
`NEXT UNCOMPRESSED
`LOCATION IN THE
`OUTPUT
`
`
`
`
`
`
`
`UPDATE THE NEXT
`UNCOMPRESSED
`LOCATION POINTER BY
`
`
`THE NUMBER OF
`
`
`SYMBOLS PLACED
`
`
`INTO THE OUTPUT
`
`
`WHILE PROCESSING THE
`
`
`CODE WORD AND RUN
`
`
`LENGTH CODE WORDS
`
`

`
`U.S. Patent
`
`Oct. 26, 1999
`
`Sheet 9 of 9
`
`5,973,630
`
`778
`
`
`
`
`UPDATE THE NEXT UNCOMPRESSED
`LOCATION POINTER BY THE NUMBER
`OF SYMBOLS PLACED INTO THE
`OUTPUT WHILE PROCESSING THE
`CODE WORD
`
`
`
`
`
`
`BUILD AN EXTENSION STRING TO THIS
`CODE WORD STRING USING THE NEXT
`SEQUENTIAL DICTIONARY ENTRY
`
`720
`
`722
`
` PLACE THE POINTER TO THE LAST
`
`
`SYMBOL PLACED OR COPIED INTO
`THE NEXT UNCOMPRESSED LOCATION
`INTO THE LOCATION POINTER OF THE
`
`
`NEWLY CREATEéDN1S|;I'IY’IING EXTENSION
`
`726'
`
`ADD THE NUMBER OF STRING
`EXTENSION SYMBOLS, OR ONE IF
`A SINGLE SYMBOL CODE WORD,
`TO THE DEPTH OF THE PREVIOUS
`CODE WORD AND PLACE THE RESULT
`INTO DEPTH OF THE NEWLY CREATED
`STRING EXTENSION ENTRY.
`
`
`
`
`
`
`
`
`
`
`
`YES
`
`
`726
`
`ARE THERE
`
`MORE CODE WORDS TO
`
`PROCESS?
`
`
`
`
`
`FIG. 70
`
`”°
`
`9
`
`

`
`5,973,630
`
`1
`DATA COMPRESSION FOR USE WITH A
`COMMUNICATIONS CHANNEL
`
`This application is a division of application Ser. No.
`08/982,864 filed Dec. 2, 1997.
`BACKGROUND OF THE INVENTION
`
`The present invention relates to data compression (i.e.,
`creation of compressed data from uncompressed data) and
`decompression (i.e., recovery of the uncompressed data
`from the compressed data).
`Data compression systems are known in the prior art that
`compress a stream of digital data signals (uncompressed
`bits) into compressed digital data signals (compressed bits),
`which require less bandwidth (fewer bits) than the original
`digital data signals, and that decompress the compressed
`digital data signals back into the original data signals or a
`close approximation thereof. Lossless data compression
`systems decompress the compressed digital data signals
`back into the original data signals exactly. Thus, lossless
`data compression refers to any process that converts data
`into an alternative data form that requires less bandwidth,
`i.e., has fewer bits, than the data converted in a process that
`is reversible so that the original data can be recovered.
`Accordingly, the objective of data compression systems is
`to effect a savings in an amount of storage required to hold
`the data or the amount of time (or bandwidth) required to
`transmit the data. By decreasing required space for data
`storage or required time (or bandwidth)
`for data
`transmission, data compression results in a monetary and
`resource savings.
`Acompression ratio is defined as the ratio of the length of
`the data in the alternative data form (compressed data) to the
`length of the data originally (original data). Thus defined, the
`smaller the compression ratio, the greater will be the savings
`in storage, time, or bandwidth.
`If physical devices such as magnetic disks or magnetic
`tape are utilized to store the data, then a smaller space is
`required on the device for storing the compressed data than
`would be required for storing the original data, thereby, e.g.,
`utilizing fewer disks or tapes for storage. If telephone lines,
`satellite links or other communications channels are utilized
`
`for transmitting digital information, then lower costs, i.e.,
`shorter transmission times and/or smaller bandwidths, result
`when compressed data is employed instead of original data.
`Data compression systems can be made particularly effec-
`tive if the original data contains redundancies such as having
`symbols or strings of symbols appearing with high fre-
`quency. (In fact redundancies in the original data is a
`requirement for lossless data compression.) A data compres-
`sion system operating on original data containing redundan-
`cies may, for example, transform multiple instances of a
`symbol, or transform a string of symbols, in the original data
`into a more concise form, such as a special symbol or group
`of symbols indicating multiple occurrences of the symbol, or
`indicating the string of symbols, and thereafter translate or
`decompress the concise form back into the multiple
`instances of the symbol, or back into the string of symbols.
`For example, it may be desirable to transmit the contents
`of a daily newspaper via a satellite link or other communi-
`cations link to a remote location for printing. Appropriate
`sensors within a data compression system may convert the
`contents of the newspaper into a data stream of serially
`occurring characters for transmission via the satellite link. If
`the millions of bits comprising the contents of the daily
`newspaper were compressed before transmission and
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`decompressed at the receiver, a significant amount, e.g.,
`such as 50% or more, of transmission time (or bandwidth)
`could be saved.
`
`As a further example, when an extensive database such as
`an airline reservation database or a banking system database
`is stored for archival or backup purposes, a significant
`amount of storage space, such as 50% or more, can be saved
`if the database files are compressed prior to storage and
`decompressed when they are retrieved from storage.
`To be of practical and general utility, a digital data
`compression system should satisfy certain criteria.
`Specifically, one criterion is that the system should provide
`high performance,
`i.e., compression/decompression rates,
`for both compression and decompression with respect to the
`data rates in the communications channel being utilized, be
`it a data bus, a wired network, a wireless network or the like.
`In other words, data transmission rates seen by a sender of
`uncompressed data and a receiver of the uncompressed data
`should not be reduced as a result of compression/ decom-
`pression overhead. In fact, effective data rates achieved, may
`be significantly increased over slow communications
`channels, because more original data can be transmitted per
`unit time, if the original data is compressed preceding and
`following transmission, because there is less compressed
`data to transmit that there would have been original data.
`The rate at which data can be compressed (i.e.,
`the
`compression rate) is the rate at which the original data can
`be converted into compressed data typically specified in
`millions of bytes per second (megabytes/sec). The rate at
`which data can be decompressed (i.e., the decompression
`rate) is the rate at which compressed data can be converted
`back into original data. High compression rates and high
`decompression rates are necessary to maintain,
`i.e., not
`degrade, data rates achieved in present day disk, tape and
`communication systems, which typically exceed one
`megabyte/sec. Thus, practical data compression systems
`must typically have compression and decompression rates
`matching or exceeding some application-dependent
`threshold, e.g., one megabyte/sec.
`The performance of prior art data compression systems is
`typically limited by the speed of the random access memo-
`ries (RAM) and the like utilized to store statistical data and
`guide the compression and decompression processes. High
`performance compression rates and decompression rates for
`a data compression system can thus be characterized by a
`number of cycles (read and write operations) required per
`input character into or out of the data compression system.
`Fewer memory cycles per input character leads to higher
`performance compression rates and decompression rates.
`Another important criterion in the design of a data com-
`pression and decompression system is compression effec-
`tiveness. Compression effectiveness is characterized by the
`compression ratio of the system, i.e. a smaller compression
`ratio indicates greater compression effectiveness. However,
`in order for data to be compressible using a lossless data
`compression system, the data to be compressed must contain
`redundancies. As a result, the compression ratio, or com-
`pression effectiveness, in a lossless data compression system
`(and to a lesser degree in a lossy data compression system)
`is a function of the degree of redundancy in the data being
`compressed. The compression effectiveness of any data
`compression system is also affected by how effectively the
`data compression system exploits, for data compression
`purposes, the particular forms of redundancy in the original
`data.
`
`In typical computer stored data, e.g., arrays of integers,
`text, programs or the like, redundancy occurs both in the
`
`

`
`5,973,630
`
`5
`
`10
`
`15
`
`20
`
`25
`
`30
`
`3
`repetitive use of individual symbology, e.g., digits, bytes or
`characters, and in frequent recurrence of symbol sequences,
`such as common words, blank record fields, and the like. An
`effective data compression system should respond to both
`types of redundancy.
`A further criterion important in the design of data com-
`pression and decompression systems is that of adaptability.
`Many prior art data compression procedures require prior
`knowledge, or the statistics, of the data being compressed.
`Some prior art procedures adapt to the statistics of the data
`as it is received, i.e., adaptive data compression systems, and
`others do not, i.e., non-adaptive data compressions systems.
`Where prior art procedures do not adapt to the statistics of
`the data as it
`is received, compression effectiveness is
`reduced, but where such procedures do adapt
`to the
`statistics, an inordinate degree of complexity is required in
`the data compression system. An adaptive data compression
`system may be utilized over a wide range of information
`types, which is typically the requirement in general purpose
`computer facilities while a non-adaptive data compression
`system operates optimally only on data types for which the
`non-adaptive data compression system is optimized. Thus, it
`is desirable that the data compression system achieves small
`compression ratios without prior knowledge of the data
`statistics, i.e., that the data compression system is adaptive.
`Many data compression systems currently available are
`generally not adaptable and so cannot be utilized to achieve
`small compression ratios over a wide range of data types.
`General purpose data compression procedures are known
`in the prior art that either are or may be rendered adaptive,
`two relevant procedures being the Huffman method and the
`Tunstall method. The Huifman method is widely known and
`used, reference thereto being had in an article by D. A.
`Huffman entitled “A Method for the Construction of Mini-
`mum Redundancy Codes”, Proceedings IRE, 40:10, pp.
`1098-1100 (September 1952). Further reference to the Huff-
`man procedure may be had in an article by R. Gallagher
`entitled “Variations on a Theme by Huifman”, IEEE Infor-
`mation Theory Transactions, IT-2426, (November 1978).
`Adaptive Huffman coding maps fixed length sequences of
`symbols into variable length binary words. Adaptive Huff-
`man coding suffers from the limitation that it is not ef[ica-
`cious when redundancy exists in input symbol sequences
`which are longer than the fixed sequence length the proce-
`dure can interpret. In practical implementations of the Huff-
`man procedure, the input sequence lengths rarely exceed 12
`bits due to RAM costs and, therefore, the procedure gener-
`ally does not achieve small compression ratios. Additionally,
`the adaptive Huffman procedure is complex and often
`requires an inordinately large number of memory cycles for
`each input symbol. Thus, the adaptive Huffman procedure
`tends to be undesirably cumbersome costly and slow thereby
`rendering the process unsuitable for most practical present
`day installations.
`Reference to the Tunstall procedure may be had in the
`doctoral
`thesis of B. T. Tunstall entitled “Synthesis of 55
`Noiseless Compression Codes”, Georgia Institute of
`Technology,
`(September 1967). The Tunstall procedure
`maps variable length input system sequences into fixed
`length binary output words. Although no adaptive version of
`the Tunstall procedure is described in the prior art, an
`adaptive version could be derived which, however, would be
`complex and unsuitable for high performance implementa-
`tions. Neither the Huffman nor the Tunstall procedure has
`the ability to encode increasingly longer combinations of
`source symbols.
`A further adaptive data compression system that over-
`comes some of the disadvantages of the prior art is that
`
`35
`
`40
`
`45
`
`50
`
`60
`
`65
`
`4
`disclosed in U.S. Pat. No. 4,464,650 for APPARATUS AND
`METHOD FOR COMPRESSING DATA AND RESTOR-
`
`ING THE COMPRESSED DATA, issued Aug. 7, 1984 to
`Cohen. The procedure of Cohen parses the stream of input
`data symbols into adaptively growing sequences of symbols.
`The procedure unfortunately, however, suffers from the
`disadvantages of requiring numerous RAM cycles per input
`character and utilizing time consuming and complex math-
`ematical procedures such as multiplication and division to
`effect compression and decompression. These disadvantages
`tend to render the Cohen procedure unsuitable for numerous
`economical high performance implementations.
`An even further adaptive data compression system that
`overcomes some of the disadvantages of the prior art is that
`disclosed in U.S. Pat. No. 4,558,302 for HIGH SPEED
`DATA COMPRESSION AND DECOMPRESSION APPA-
`RATUS AND METHOD, issued Dec. 10, 1985, to Welch.
`The procedure of Welch compresses an input stream of data
`symbols by storing,
`in a string table, strings of symbols
`encountered in an input stream. The Welch procedure next
`searches the input stream to determine the longest match to
`a stored string of symbols. Each stored string of symbols
`includes a prefix string and an extension character that is a
`last character in the string of symbols. The prefix string
`includes all but the extension character.
`
`When a longest match between the input data stream and
`the stored strings of symbols is determined, the code signal
`for the longest match is transmitted as the compressed code
`signal for the encountered string of symbols and an exten-
`sion character is stored in the string table. The prefix string
`of the extension character is the longest match,
`i.e., the
`longest stored string of symbols located in the search. The
`extension character of the extended string is the next input
`data character signal following the longest match.
`Searching through the string table and entering extension
`characters into the string table is effected by a limited
`searching hashing procedure. Unfortunately, even the
`improved data compression system of Welch suffers from
`less than optimal compression effectiveness, and less than
`optimal performance. As a result, the Welch procedure, like
`the Cohen procedure, is unsuitable for many high perfor-
`mance implementations.
`SUMMARY OF THE INVENTION
`
`The present invention advantageously improves upon the
`above-described approaches by providing a lossless data
`compression (i.e., creation of compressed data from uncom-
`pressed data) and decompression (i.e.,
`recovery of the
`uncompressed data from the compressed data) approach that
`improves on heretofore known data compression and
`decompression approaches.
`In one embodiment, the invention can be characterized as
`a method of compressing data for
`transmission over a
`communications channel. The method involves receiving a
`symbol, and at least one subsequent symbol; determining in
`a compression dictionary whether the symbol has a valid
`extension pointer; using, in the event the symbol does have
`a valid extension pointer,
`the valid extension pointer to
`access string extension symbols; determining, in the event
`the symbol does have a valid extension pointer, whether the
`string extension symbols equal the at least one subsequent
`symbol; determining in the compression dictionary, in the
`event the string extension symbols do not equal the at least
`one subsequent symbol, whether the symbol has a valid
`parallel pointer; repeating, in the event the symbol has a
`valid parallel pointer, the using step; repeating, in the event
`
`

`
`5,973,630
`
`5
`the string extension symbols equal the at least one subse-
`quent symbol, the determining of whether the symbol has a
`valid extension pointer; inserting, in the event the symbol
`does not have a valid extension pointer or in the event the
`symbol does not have a valid parallel pointer, a code word
`indicative of a longest string found into a compressed data
`stream; determining whether the longest string found was a
`single symbol; extending, in the event the longest string
`found was a single symbol, the longest string by one symbol;
`extending, in the event the longest string was not a single
`symbol,
`the longest string by one or more symbols;
`inserting, in the event the longest string found is extended by
`two or more symbols, a string extension signalling code
`word into the compressed data stream; and transmitting the
`compressed data stream through the communications chan-
`nel.
`In another embodiment, the invention can be character-
`ized as a method for decompressing data received over a
`communications channel. The method has steps of receiving
`a code word; determining whether the code word represents
`a single character; placing,
`in the event
`the code word
`represents a single character, the code word into an output
`data stream; determining, in the event the code word rep-
`resents more than a single character, whether the code word
`is in a dictionary; placing, in the event the code word is in
`the dictionary, a string defined by the code word into the
`output data stream; determining, in the event the code word
`is not in the dictionary, whether the code word is a next code
`word to be built; placing, in the event the code word is a next
`code word to be built, a string into the output data stream,
`the string being copied from a dictionary entry indicated by
`a previous code word processed; placing, in the event the
`code word is a next code word to be built, a first symbol of
`the string into the output data stream; and placing, in the
`event the code word is not in the dictionary and is not the
`next code word to be built, an extension string into the
`output data stream, the extension string being copied from
`the output data stream at a symbol following a last symbol
`of a dictionary entry indicated by a previous code word
`processed.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`The above and other aspects, features and advantages of
`the present invention will be more apparent from the fol-
`lowing more particular description thereof, presented in
`conjunction with the following drawings wherein:
`FIG. 1 is block diagram illustrating a data compression
`system in accordance with one embodiment of the present
`invention;
`representation of a compression
`FIG. 2 is a tabular
`dictionary generated by the data compression system of FIG.
`1;
`
`FIG. 3 is a tabular representation of exemplary input (or
`original) data suitable for compression with the data com-
`pression system of FIG. 1;
`FIG. 4 is a tabular representation of exemplary com-
`pressed data generated by the data compression system of
`FIG. 1 in response to the input data of FIG. 3;
`FIG. 5 is a tabular representation of a decompression
`dictionary generated by the data compression system of FIG.
`1;
`
`FIGS. 6A, 6B and 6C are is flow charts illustrating steps
`traversed by the data compression system of FIG. 1 in order
`to compress an input data stream; and
`FIGS. 7A, 7B and 7C are flow charts illustrating steps
`traversed by the data compression system of FIG. 1 in order
`to decompress a compressed data stream.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`Corresponding reference characters indicate correspond-
`ing components throughout the several views of the draw-
`ings.
`
`DETAILED DESCRIPTION OF THE
`PREFERRED EMBODIMENTS
`
`The following description of the presently contemplated
`best mode of practicing the invention is not to be taken in a
`limiting sense, but
`is made merely for the purpose of
`describing the general principles of the invention. The scope
`of the invention should be determined with reference to the
`claims.
`
`Referring to FIG. 1, a block diagram is shown of func-
`tional components of a data compression system in 10
`combination with an input data stream 12, a data channel 14
`and an output data stream 16,
`in accordance with one
`embodiment of the present invention. Shown is the input
`data stream 12, a compressor 18, a compression dictionary
`20, the data channel 14 carrying compressed data, a decom-
`pressor 22, a decompression dictionary 24 and the output
`data stream 16. The compressor 18, the compression dictio-
`nary 20, the decompressor 22 and the decompression dic-
`tionary 24 together make up the data compression system
`10. Advantageously, the data compression 10 system may be
`implemented using a general purpose computer or a special
`purpose computer (or other processor-and-memory- con-
`taining system, such as a satellite transmitter and receiver,
`cellular base station and cellular telephone, or the like) and
`appropriate software subsystems.
`In accordance with the data compression system 10 of the
`present embodiment, the first, for example, 256 code words
`of, for example, 1024 possible assignable code words are
`reserved for 256 possible hexadecimal character code rep-
`resentations of an 8 bit byte character. For example, the first
`256 code words may be assigned to extended ASCII
`(American Standard Code for Information Interchange)
`symbols, EBCDIC (Extended Binary Coded Decimal for
`Interchange Code) symbols, or the like. Thus, in accordance
`with the present embodiment, 768 code words, of the 1024
`possible assignable code words, are available for assignment
`as dictionary entries to recurring strings of bytes (i.e.,
`redundancies) built as character patterns are encountered
`within data to be compressed, i.e., during compression of the
`input data stream 12, or code words are processed within
`already compressed data, i.e., during decompression of the
`compressed data stream 14. These 768 code words are built
`and stored in the compression dictionary 20 during com-
`pression and built and stored in the decompression dictio-
`nary 24 during decompression.
`The present embodiment stops building dictionary entries
`and extending strings when the 1024th code word is built,
`however, this is a matter of design choice and can be adapted
`to particular applications of the present embodiment. With
`particular sizes of data streams, optimal performance may be
`achievable by, for example, employing 2048, 4096 or more
`code words. The code words, however many, along with
`their assigned character bytes or recurring strings of bytes
`(or, more accurately,
`their assigned pointers to assigned
`character bytes or recurring strings of bytes within the input
`data stream 12) make up the compression dictionary 20,
`which is used to translate data to be compressed (original
`data, i.e., the input data stream) into compressed data (i.e.,
`the compressed data stream), and the decompression dictio-
`nary 24, which is used to translate the compressed data back
`into the original data (i.e., the output data stream).
`The data compression system employs structure for the
`compression dictionary 20 that is different from the structure
`
`

`
`5,973,630
`
`7
`of the decompression dictionary 24. The compression dic-
`tionary 20 is defined according to the following elements
`illustrated in FIG. 2:
`
`EXTENSION POINTER (100)—A pointer to a compres-
`sion dictionary entry that defines an “extension string” to a
`current string. There are 1024 extension pointers each asso-
`ciated with a compression dictionary entry, and each com-
`pression dictionary entry is associated with a code word. The
`first 256 compression dictionary entries, which are associ-
`ated with the first 256 code words, are reserved for the 256
`hexadecimal representations of an 8 bit byte, and only have
`an extension pointer, i.e., they do not have a parallel pointer
`(see below). The remaining 768 compression dictionary
`entries, associated with the remaining 756 code words, have
`an extension pointer and a parallel pointer.
`PARALLEL POINTER (102)—A pointer to a compres-
`sion dictionary entry that defines an “extension string” to a
`previous string, a parallel string, i.e., a string that starts with
`the same character or characters as the current string and,
`thus, has the same previous string entry as the current string.
`There are 768 parallel pointers, associated with code words
`256 through 1023.
`LOCATION POINTER (104)—A pointer, into a previ-
`ously compressed area of the input dat

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket