United States Patent [19]          US005379036A
Storer
[11] Patent Number:  5,379,036
[45] Date of Patent: Jan. 3, 1995

[57] ABSTRACT
A data compression system utilizes multiple encoding/decoding processing units arranged in a massively parallel systolic pipe array for compressing a stream of input characters received at an input end of the pipe. The systolic pipe executes data compression by textual substitution, converting the uncompressed stream of input characters into a compressed set of pointers representative of the input. The processors store a dictionary of strings of previously-read input characters in association with corresponding pointers. The processors match portions of the input stream with the stored strings, and transmit the corresponding pointers in place of the strings. In order to execute decoding or decompression of the stream of pointers, the processors maintain a dictionary identical to that generated during encoding. The processors match received pointers with stored strings of data and transmit the corresponding data strings in place of the pointers. Novel methods for matching, update, and deletion in a massively parallel systolic pipe structure enhance processing speed.
[54] METHOD AND APPARATUS FOR DATA COMPRESSION

[76] Inventor:   James A. Storer, 89 S. Great Rd., Lincoln, Mass. 01773

[21] Appl. No.:  861,572

[22] Filed:      Apr. 1, 1992

[51] Int. Cl.5 ............................ H03M 7/30
[52] U.S. Cl. .................................. 341/51
[58] Field of Search ........................... 341/51
[56] References Cited

U.S. PATENT DOCUMENTS

4,558,302  12/1985  Welch ................ 341/51
4,814,746   3/1989  Miller et al. ........ 341/95
4,847,619   7/1989  Kato et al. .......... 341/51
4,876,541  10/1989  Storer ............... 341/51
4,881,075  11/1989  Weng ................. 341/87
5,175,543  12/1992  ..................... 341/51

OTHER PUBLICATIONS

Gonzalez-Smith et al., "Parallel Algorithms for Data Compression," Journal of the Association for Computing Machinery, Vol. 32, No. 2, Apr. 1985, pp. 344-373.

Primary Examiner—Howard L. Williams

23 Claims, 6 Drawing Sheets
[Representative front-page drawing: detail of a pipe processor showing its
1-BIT REGISTERS (FULL A, FULL B, MARK A, . . . )]
U.S. Patent          Jan. 3, 1995          Sheet 1 of 6          5,379,036

(101)  INITIALIZE THE LOCAL DICTIONARY D TO HAVE ONE ENTRY
       FOR EACH CHARACTER OF THE INPUT ALPHABET

(102)  REPEAT FOREVER:

       (GET THE CURRENT MATCH STRING S:)
       USE A MATCH OPERATION MX TO READ S FROM THE INPUT.
       TRANSMIT (LOG2 |D|) BITS FOR THE INDEX OF S.

       (UPDATE D:)
       (IF D IS FULL, THEN FIRST EMPLOY A DELETION
       OPERATION DX TO MAKE SPACE.)
       ADD EACH OF THE STRINGS SPECIFIED
       BY AN UPDATE OPERATION UX TO D.

FIG. 1
U.S. Patent          Jan. 3, 1995          Sheet 2 of 6          5,379,036

(201)  INITIALIZE THE LOCAL DICTIONARY D
       BY PERFORMING STEP 1 OF THE ENCODING OPERATION.

(202)  REPEAT FOREVER:

       (GET THE CURRENT MATCH STRING S:)
       RECEIVE (LOG2 |D|) BITS FOR THE INDEX OF S.
       RETRIEVE S FROM D AND OUTPUT THE CHARACTERS OF S.

       (UPDATE D:)
       PERFORM STEP 2b OF THE ENCODING ALGORITHM.

FIG. 2
U.S. Patent          Jan. 3, 1995          Sheet 3 of 6          5,379,036

FIG. 3  [drawing: systolic pipe 10 of processors 10.1, 10.2, . . . , 10.N
        arranged in a "snake" configuration, each processor connected only
        to its left and right neighbors]
U.S. Patent          Jan. 3, 1995          Sheet 4 of 6          5,379,036

FIG. 4  [drawing: one processor of the systolic pipe, showing its 1-bit
        registers (FULL A, FULL B, MARK A, MARK B, RESET, MODE, FLUSH,
        LEADER, ACTIVE, DUMMY, STOP) and the VDD, RESET, MARK, LEADER,
        STOP, 12-BIT DATA, NIL, and GND lines on each side]
U.S. Patent          Jan. 3, 1995          Sheet 5 of 6          5,379,036

FIG. 5  [drawing: complete data compression system with two processor
        arrays and controller logic for executing the SWAP operation]
U.S. Patent          Jan. 3, 1995          Sheet 6 of 6          5,379,036

INDEXING SCHEME

- This is one possible indexing scheme. The indentations reflect the symmetry
  of the numbering sequence in the upper four bits.

    chip01[801...87e] --> chip02[881...8fe] -->
    chip03[901...97e] --> chip04[981...9fe] -->
    chip05[a01...a7e] --> chip06[a81...afe] -->
    chip07[b01...b7e] --> chip08[b81...bfe] -->
    chip09[c01...c7e] --> chip10[c81...cfe] -->
    chip11[d01...d7e] --> chip12[d81...dfe] -->
    chip13[e01...e7e] --> chip14[e81...efe] -->
    chip15[f01...f7e] --> chip16[f81...ffe] -->
    chip17[101...17e] --> chip18[181...1fe] -->
    chip19[201...27e] --> chip20[281...2fe] -->
    chip21[301...37e] --> chip22[381...3fe] -->
    chip23[401...47e] --> chip24[481...4fe] -->
    chip25[501...57e] --> chip26[581...5fe] -->
    chip27[601...67e] --> chip28[681...6fe] -->
    chip29[701...77e] --> chip30[781...7fe]

- During encode, chips 15 and 16 need f(hex) as their upper four bits, but
  during decode they need 0(hex) as their upper four bits. This is to be
  handled by the external control logic.

FIG. 6
METHOD AND APPARATUS FOR DATA COMPRESSION
BACKGROUND OF THE INVENTION

This invention relates generally to systems for digital data processing, and, more particularly, relates to highly parallel methods and apparatus for dynamic, high speed, on-line compression and decompression of digital data.

Data compression methods, most of which exploit the redundancy that is characteristic of data streams, are becoming an essential element of many high speed data communications and storage systems. In communications systems, a transmitter can compress data before transmitting it, and a receiver can decompress the data after receiving it, thus increasing the effective data rate of the communications channel. In storage applications, data can be compressed before storage and decompressed after retrieval, thereby increasing the effective capacity of the storage device.

Particular applications of data compression include magnetic and optical disk interfaces, satellite communications, computer network communications, interconnections between computers and attached devices, tape drive interfaces, and write-once media, where storage space reduction is critical.

Compression of data streams can be lossless or lossy. Lossless data compression transforms a body of data into a smaller body of data from which it is possible to exactly and uniquely recover the original data. Lossy data compression transforms a body of data into a smaller one from which an acceptable approximation of the original (as defined by a selected fidelity criterion) can be constructed.

Lossless compression is appropriate for applications in which it is unacceptable to lose even a single bit of information. These include transmission or storage of textual data, such as printed human language, programming language source code or object code, database information, numerical data, and electronic mail. Lossless compression is also used for devices such as disk controllers that must provide exact preservation and retrieval of uncorrupted data.

Lossy compression is useful in specialized applications such as the transmission and storage of digitally sampled analog data, including speech, music, images, and video. Lossy compression ratios are typically much higher than those attainable by purely lossless compression, depending upon the nature of the data and the degree of fidelity required. For example, digitized speech that has been sampled 8,000 times per second, with 8 bits per sample, typically compresses by less than a factor of 2 with any lossless algorithm. However, by first quantizing each sample, which preserves acceptable quality for many applications, compression ratios exceeding 20 to 1 can be achieved. In contrast, typical lossless compression ratios are 3 to 1 for English text, 5 to 1 for programming language source code, and 10 to 1 for spreadsheets or other easily compressible data. The difference is even greater for highly compressible sources such as video. In such cases, quantization and dithering techniques may be applied in combination with otherwise lossless compression to achieve high ratios of lossy compression.
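To put the speech example in concrete terms: 8,000 samples per second at 8 bits per sample is 64,000 bits per second of raw data; a lossless factor of 2 brings this to roughly 32,000 bits per second, while a lossy ratio of 20 to 1 brings it to roughly 3,200 bits per second.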
Among the most powerful approaches to lossless data compression are textual substitution methods, in which frequently-appearing data strings are replaced by shorter indexes or pointers stored in correspondence with the data strings in a dictionary. Typically, an encoder module and a decoder module maintain identical copies of a dictionary containing data strings that have appeared in the input stream. The encoder finds matches between portions of the input stream and the previously-encountered strings stored in the dictionary. The encoder then transmits, in place of the matched string, the dictionary index or pointer corresponding to the string.

The encoder can also update the dictionary with an entry based on the current match and the current contents of the dictionary. If insufficient space is available in the dictionary, space is created by deleting strings from the dictionary.
The decoder, operating in a converse manner, receives at its input a set of indexes, retrieves each corresponding dictionary entry as a "current match", and updates its dictionary. Because the encoder and decoder work in a "lock-step" fashion to maintain identical dictionaries, no additional communication is necessary between the encoder and decoder.

Thus, the input to the coder of an on-line textual substitution data compressor is a stream of bytes or characters, and the output is a sequence of pointers. Conversely, the input to the decoder is a stream of pointers and the output is a stream of bytes or characters.

A common example of a textual substitution method is the publicly available COMPRESS command of the UNIX system, which implements the method developed by Lempel and Ziv. See J. Ziv, A. Lempel, "A Universal Algorithm for Sequential Data Compression," IEEE Transactions on Information Theory, Vol. IT-23, No. 5, pp. 337-343, 1977; and J. Ziv, A. Lempel, "Compression of Individual Sequences Via Variable Rate Coding," IEEE Transactions on Information Theory, Vol. IT-24, No. 5, pp. 530-536, 1978.

Textual substitution methods are generally superior to conventional methods such as Huffman coding. For example, the COMPRESS command of the UNIX system easily out-performs the COMPACT command of UNIX, which is based on Huffman coding.
Further examples of data compression methods and apparatus are disclosed in the following:

U.S. Pat. No. 4,876,541.

U.S. Pat. No. 4,814,746.

S. Henriques, N. Ranganathan, "A Parallel Architecture for Data Compression," Proceedings of the IEEE Symposium on Parallel and Distributed Processing, pp. 260-266, December 1990.

R. Zito-Wolf, "A Broadcast/Reduce Architecture for High-Speed Data Compression," Proceedings of the IEEE Symposium on Parallel and Distributed Processing, pp. 174-181, December 1990.

S. Bunton, G. Borriello, "Practical Dictionary Management for Hardware Data Compression," Apr. 2, 1990.

R. Zito-Wolf, "A Systolic Architecture for Sliding-Window Data Compression," Proceedings of the IEEE Workshop on VLSI Signal Processing, pp. 339-351, 1990.

C. Thomborson, B. Wei, "Systolic Implementations of a Move-to-Front Text Compressor," Journal of the Association for Computing Machinery, pp. 283-290, 1989.

J. Storer, Data Compression, Computer Science Press, pp. 163-166, 1988.
J. Storer, "Textual Substitution Techniques for Data Compression," Combinatorial Algorithms on Words, Springer-Verlag (Apostolico and Galil, ed.), pp. 111-130, 1985.

V. Miller, M. Wegman, "Variations on a Theme by Ziv and Lempel," Combinatorial Algorithms on Words, Springer-Verlag (Apostolico and Galil, ed.), pp. 131-140, 1985.

M. Gonzalez-Smith, J. Storer, "Parallel Algorithms for Data Compression," Journal of the Association for Computing Machinery, Vol. 32, No. 2, pp. 344-373, April 1985.
U.S. Pat. No. 4,876,541 discloses a data compression system including an encoder for compressing an input stream and a decoder for decompressing the compressed data. The encoder and decoder maintain dictionaries for storing strings of frequently appearing characters. Each string is stored in association with a corresponding pointer. The encoder matches portions of the input stream with strings stored in the encoder dictionary, and transmits the corresponding pointers in place of the strings, thereby providing data compression. The decoder decompresses the compressed data in a converse manner. The system utilizes selected matching, dictionary update, and deletion methods to maintain processing efficiency.
U.S. Pat. No. 4,814,746 discloses a data compression method including the steps of establishing a dictionary of strings of frequently appearing characters, determining a longest string of the input stream that matches a string in the dictionary, transmitting the pointer associated with that string in place of the string, and adding a new string to the dictionary. The new string is a concatenation or linking of a previous matched string and the current matched string. The method also includes the step of deleting a least recently used string from the dictionary if the dictionary is full.
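A minimal software sketch of the general mechanism just described (longest match, pointer transmission, concatenation of the previous and current matches, least-recently-used deletion) is given below, assuming a byte alphabet. It is offered only to make the sequence of steps concrete; the dictionary size, the pointer bookkeeping, and the names are illustrative choices of the sketch, not details taken from the cited patent.

    from collections import OrderedDict

    DICT_MAX = 4096     # illustrative limit on added (multi-character) entries

    def encode(data: bytes):
        """Greedy textual-substitution encoder: emits one pointer per match."""
        d = OrderedDict((bytes([c]), c) for c in range(256))   # seed with the alphabet
        next_ptr, free_ptrs = 256, []
        out, prev, i = [], b"", 0
        while i < len(data):
            # Determine the longest dictionary string matching the input at i.
            match, j = data[i:i + 1], i + 2
            while j <= len(data) and data[i:j] in d:
                match, j = data[i:j], j + 1
            out.append(d[match])          # transmit the pointer, not the string
            d.move_to_end(match)          # mark the entry as recently used
            new_entry = prev + match      # previous matched string + current match
            if new_entry not in d:
                if len(d) - 256 >= DICT_MAX:
                    # Dictionary full: delete a least recently used added entry.
                    victim = next(k for k in d if len(k) > 1)
                    free_ptrs.append(d.pop(victim))
                if free_ptrs:
                    ptr = free_ptrs.pop()
                else:
                    ptr, next_ptr = next_ptr, next_ptr + 1
                d[new_entry] = ptr
            prev, i = match, i + len(match)
        return out

A matching decoder would recover the characters by maintaining the same dictionary, applying the identical update and deletion rules to the pointers it receives.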
Henriques et al. discloses a systolic array of processors for executing sliding window data compression in accordance with the Ziv and Lempel method. Data compression is divided into parsing and coding. During parsing, the input string of symbols is split into substrings. During coding, each substring is sequentially coded into a fixed length code.
Zito-Wolf ("A Broadcast/Reduce Architecture for High-Speed Data Compression") discusses a data compression system utilizing a sliding window dictionary and a combination of a systolic array and pipelined trees for broadcast of input characters and reduction of results. In this system, for every character position of the input stream, the processor computes a pair (length, offset) identifying the longest matched string ending at that character.
Thomborson et al. discloses systolic implementations of data compression utilizing move-to-front encoding. A move-to-front encoder finds the current ordinal position of a symbol in the input stream, transmits that ordinal position, and moves the symbol to the front of the list. A characteristic of this encoder is that more-frequently occurring input symbols will be at the front of the list.
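For reference, the move-to-front rule itself can be expressed in a few lines of sequential code. This sketch illustrates only the coding rule described above and says nothing about the systolic implementation discussed by Thomborson et al.

    def mtf_encode(data: bytes):
        """Emit each symbol's current position in the list, then move it to the
        front, so frequently occurring symbols produce small output values."""
        symbols = list(range(256))                  # initial symbol ordering
        out = []
        for b in data:
            pos = symbols.index(b)                  # current ordinal position
            out.append(pos)
            symbols.insert(0, symbols.pop(pos))     # move the symbol to the front
        return out

    def mtf_decode(codes):
        symbols = list(range(256))
        data = bytearray()
        for pos in codes:
            b = symbols[pos]
            data.append(b)
            symbols.insert(0, symbols.pop(pos))
        return bytes(data)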
Storer (1988) discusses parallel processing implementations utilizing the dynamic dictionary model of data compression. An ID heuristic is implemented for updating the dictionary, and a SWAP heuristic is executed for deleting entries from the dictionary to create space for new entries. The SWAP heuristic is implemented by doubling the storage elements and adding a controller to each end of the pipe, for switching input/output lines appropriately as the dictionaries are switched. A systolic pipe implementation utilizes a "bottom up" matching technique. In particular, as a stream of characters or pointers flows through the systolic pipe, longer matched strings are constructed from pairs of smaller matched strings. Each processor supports a FLAG bit that can be set during data compression to designate a "learning" processor in the pipe. The learning processor is the first processor along the pipe that contains a dictionary entry.
Storer (1985) discusses off-line and on-line data compression methods utilizing textual substitution in which pointers are transmitted in place of character strings. An on-line implementation described therein utilizes an encoder and decoder, each having a fixed amount of local memory forming a local dictionary. Another implementation utilizes a systolic array of processing elements utilizing a sliding dictionary to match strings in the input stream. The sliding dictionary stores the last N characters processed in the systolic pipeline, one character per processing element. The dictionary is updated by sliding old characters to the left in order to bring in new characters.
Miller et al. (1985) discusses enhancements to the data compression methods of Ziv and Lempel, including a dictionary replacement strategy that enables the use of a fixed size dictionary, methods for using a larger class of strings that can be added to the dictionary, and techniques that avoid uncompressed output.
Gonzalez-Smith et al. discloses systolic arrays for parallel implementation of data compression by textual substitution. The systems described therein implement a sliding dictionary method, in which characters being read in are compared with each of the elements of a dictionary that spans the N characters preceding the current character.
Bunton et al. discusses data compression methods utilizing a dictionary tagging technique for deleting selected entries from the data compression dictionary. The TAG dictionary management scheme disclosed therein employs a structure known as a trie data structure with tagged nodes.
R. Zito-Wolf ("A Systolic Architecture for Sliding-Window Data Compression") discusses a systolic pipe implementation of textual substitution data compression utilizing a sliding window. The systolic-array architecture codes substrings of the input by reference to identical sequences occurring in a bounded window of preceding characters of the input, wherein the contents of the window form a dictionary of referenceable strings.
These publications accordingly disclose various systems for data compression. However, one deficiency shared by conventional data compression methods and systems, such as those described above, is the tradeoff between the benefits of data compression and the computational costs of the encoding and subsequent decoding. There exists a need for data compression systems that provide high compression ratios and high speed operation, without necessitating complex encoding and decoding modules.

It is thus an object of the invention to provide lossless data compression systems utilizing massively parallel architectures to compress and decompress data streams at high rates, and with high compression ratios.
It is a further object of the invention to provide such methods and apparatus that can be embodied in relatively simple, low-cost digital hardware.

A further object of the invention is to provide data compression methods and apparatus that dynamically adapt to changing data streams.

Other general and specific objects of the invention will in part be obvious and will in part appear hereinafter.
SUMMARY OF THE INVENTION
The foregoing objects are attained by the invention, one aspect of which provides a data compression system utilizing multiple encoding/decoding processing units arranged in a systolic pipe array for compressing a stream of input data received at an input end of the pipe.

The systolic pipe executes data compression by textual substitution, converting the uncompressed stream of input data into a compressed set of pointers representative of the input. The processors store a dictionary of previously-read strings of input data in association with corresponding pointers. The processors match portions of the input stream with the stored strings, and transmit the corresponding pointers in place of the strings.

In order to execute decoding or decompression of the stream of pointers, the processors maintain a decoding dictionary identical to that generated during encoding. The processors match the pointers with stored strings of data, and transmit the corresponding data strings in place of the pointers.

A further aspect of the invention utilizes selected matching, update, and deletion methods to increase processing efficiency. The matching methods can include a modified "greedy" heuristic; the updating feature can include an implementation of a modified "identity" ("ID") heuristic; and the deletion methods include a "swap" heuristic.

One practice of the invention utilizes a "bottom up" approach to finding the longest possible match between the input stream and a stored string, in which successively longer match strings are constructed from shorter matches.

The invention will next be described in connection with certain illustrated embodiments; however, it will be clear to those skilled in the art that various modifications, additions and subtractions can be made without departing from the spirit or scope of the claims.
DESCRIPTION OF THE DRAWINGS
For a fuller understanding of the nature and objects of the invention, reference should be made to the following detailed description and the accompanying drawings, in which:

FIG. 1 is a flow chart of the overall data compression/encoding method of the invention;

FIG. 2 is a flow chart of the overall data decompression/decoding method of the invention;

FIG. 3 is a schematic diagram of a systolic pipe for data compression in accord with the invention;

FIG. 4 is a schematic diagram of a processor in the systolic pipe of FIG. 3;

FIG. 5 is a schematic diagram of a complete data compression system constructed in accordance with the invention, including two processor arrays and controller logic for executing a SWAP operation; and

FIG. 6 is a schematic diagram of an indexing sequence utilized in one practice of the invention.
DESCRIPTION OF ILLUSTRATED EMBODIMENTS

Overview: FIGS. 1-3 depict the overall structure and function of a data compression system constructed in accord with the invention. In particular, FIGS. 1 and 2 are flow charts of encoding and decoding operations utilizing on-line dynamic textual substitution. The encoding operation depicted in FIG. 1 reads a stream of characters and writes a compressed stream of corresponding bits, while the decoding operation depicted in FIG. 2 receives the compressed stream of bits and generates a stream of corresponding characters.
FIG. 3 shows the systolic pipe structure of a data compression system constructed in accord with the invention, for executing the method steps depicted in FIGS. 1 and 2. A systolic pipe is a linear array of processors in which each processor communicates only with left and right adjacent processors. As indicated in FIGS. 1-3, the invention provides a massively parallel, systolic pipe multiprocessor system that compresses data by means of textual substitution.

Parallel implementations of dynamic textual substitution were previously thought to require data broadcast or other complex global communication. The invention, however, exploits the massively parallel nature of the systolic pipe to execute dynamic textual substitution at high speed, without complex interprocessor connections.
Encoding/Decoding: These advantages are provided by implementation of the steps depicted in FIGS. 1 and 2. As indicated therein, the encoder and decoder each have a fixed, finite quantity of local memory forming a local dictionary D. The two local dictionaries are initialized at a time T0 to be empty. During encoding and decoding operations, the dictionaries change dynamically, and are maintained by a novel variant of an identity ("ID") update technique, a SWAP deletion technique, and a "bottom up" parallel version of the "greedy" match technique. These methods are discussed in detail hereinafter.
The illustrated methods and apparatus constitute a real-time, on-line data compression system. In an on-line system, neither the sender nor the receiver can "see" all of the data at once. Data are constantly passed from the sender through the encoder, then through the decoder, and on to the receiver. The encoding and decoding operations shown in FIGS. 1 and 2 satisfy the requirements of the following definition: an on-line compression/decompression method is real-time if there exists a constant k (which is independent of the data being processed or the size of the dictionary) such that for every k units of time, exactly one new character is read by the encoder and exactly one character is written by the decoder.
The only exception to this rule is the allowance of a small "lag" between the encoder and decoder. Thus, there exists another constant L (that is independent of the data being processed) such that (i) during the first L units of time the decoder produces no characters, and (ii) given an input of finite length, L time units may elapse between the time that the encoder has finished reading characters, and the time that the decoder is finished transmitting characters.

Further explanation of the invention is simplified by assuming that the input data stream is a sequence of 8-bit characters. The set of all possible characters, i.e., the integers 0 through 255, is referred to herein as the input alphabet S. In addition, data handling in compressed form is assumed to be noiseless, such that no bits are added, lost, or changed. While this assumption is made for convenience, those skilled in the art will recognize that known error detection and correction techniques can be utilized to ensure that no bits are corrupted or lost.
The input data stream is then encoded by executing the following steps, as depicted in FIG. 1:

(101) Initialize the local dictionary D to have one entry for each character of the input alphabet.

(102) Repeat forever:

(a) (Get the current match string s:) Use a match operation MX to read s from the input. Transmit ⌈log2 |D|⌉ bits for the index of s.

(b) (Update D:) Add each of the strings specified by an update operation UX to D (if D is full, then first employ a deletion operation DX to make space).
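Rendered as sequential software, with the match, update, and deletion operations passed in as functions, the loop of FIG. 1 looks roughly as follows. This sketch is meant only to make the control flow concrete; the function signatures and the index-reuse bookkeeping are assumptions of the sketch, not part of the method as claimed, and it says nothing about the systolic hardware described later.

    from math import ceil, log2

    def encode(read_match, transmit, UX, DX, alphabet, d_max):
        """Sequential rendering of FIG. 1.  read_match plays the role of MX: given
        the dictionary D it consumes a matched string s from the input and returns
        (s, index), or (None, None) when the input is exhausted."""
        # Step (101): one dictionary entry per character of the input alphabet.
        D = {i: c for i, c in enumerate(alphabet)}
        # Step (102): repeat forever.
        while True:
            # (a) Get the current match string s; transmit ceil(log2 |D|) bits
            #     for the index of s.
            s, index = read_match(D)
            if s is None:
                break
            transmit(index, ceil(log2(len(D))))
            # (b) Add each string specified by the update operation UX to D; if D
            #     is full, first employ the deletion operation DX to make space.
            for new_string in UX(D, s):
                if len(D) >= d_max:
                    slot = DX(D)             # DX names an entry to delete
                    del D[slot]
                else:
                    slot = len(D)            # next unused index (keys stay 0..|D|-1)
                D[slot] = new_string
        return D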
The compressed stream can then be decoded by executing the following steps, as depicted in FIG. 2:

(201) Initialize the local dictionary D by performing Step 1 of the encoding operation.

(202) Repeat forever:

(a) (Get the current match string s:) Receive ⌈log2 |D|⌉ bits for the index of s. Retrieve s from D and output the characters of s.

(b) (Update D:) Perform step 2b of the encoding algorithm.
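The decoder mirrors the same loop. The sketch below makes the same assumptions as the encoding sketch above; receive_index is assumed to return the next pointer from the compressed stream (or None at the end of the stream), and because step (b) applies the same UX and DX, the decoder's copy of D evolves exactly as the encoder's does.

    from math import ceil, log2

    def decode(receive_index, write_output, UX, DX, alphabet, d_max):
        """Sequential rendering of FIG. 2, maintaining a dictionary identical to
        the encoder's."""
        # Step (201): initialize D exactly as in step (101) of the encoder.
        D = {i: c for i, c in enumerate(alphabet)}
        # Step (202): repeat forever.
        while True:
            # (a) Receive ceil(log2 |D|) bits for the index of s, retrieve s from D,
            #     and output the characters of s.
            index = receive_index(ceil(log2(len(D))))
            if index is None:
                break
            s = D[index]
            write_output(s)
            # (b) Perform step (b) of the encoding algorithm with the same UX and DX.
            for new_string in UX(D, s):
                if len(D) >= d_max:
                    slot = DX(D)
                    del D[slot]
                else:
                    slot = len(D)
                D[slot] = new_string
        return D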
The encoding and decoding operations of FIGS. 1 and 2 are executed in lock-step to maintain identical copies of the constantly changing dictionary D. The encoder repeatedly finds a match between the incoming characters of the input stream and the dictionary, deletes these characters from the input stream, transmits the index of the corresponding dictionary entry, and updates the dictionary. The dictionary update operation, described hereinafter, generates an update string that is a function of the current contents of the dictionary and the most recent match. If insufficient space is available in the dictionary for updating, a deletion operation must be performed, as discussed below.

Similarly, the decoder repeatedly receives an index, retrieves the corresponding dictionary entry as the "current match", and then executes the same operation as the encoder to update its dictionary.
In accordance with the invention, the number of entries in the dictionary D can be varied. The term Dmax is used herein to denote the maximum number of entries that the dictionary D may contain. A typical value is Dmax = 4,096. Experimental simulations indicate that this value permits the system to achieve most of the compression attainable by the underlying encoding/decoding method, while providing a reasonable limit on the number of entries in the dictionary. In one practice of the invention, the local dictionary D utilizes a data structure in which each dictionary entry occupies a constant amount of data space, independent of the actual length of the corresponding string.
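With Dmax = 4,096, each transmitted pointer therefore occupies ⌈log2 4,096⌉ = 12 bits, which is consistent with the 12-bit DATA registers shown for each processor in FIG. 4; every dictionary entry, however long the string it represents, is referenced by a fixed-width 12-bit index.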
MX, UX, DX: In executing the steps depicted in FIGS. 1 and 2, the invention performs the following operations:

The match operation, MX: This operation removes from the input stream a string "s" that is contained within the dictionary D. Since the characters of S are implicitly in D and can never be deleted, there is always at least one such s.

The update operation, UX: This operation takes the local dictionary D and returns a set of strings that should be added to the dictionary if space can be found for them.

The deletion operation DX: This operation takes the local dictionary D and returns a string of D that may be deleted.
In accordance with the invention, the following methods are utilized for the matching, updating, and deleting tasks:

The match operation MX provides an approximate determination of the longest possible match string, utilizing a modified "greedy" match heuristic (which takes the longest possible match between the input stream and a string in D) in conjunction with a "bottom up" technique, in which longer match strings are constructed from shorter matches.
The update operation UX employs a modified version of an identity (ID) technique. A typical ID operation adds the previous match concatenated with the current match. In conventional data compression systems, the ID technique required complex data structures that were characterized by undesirable worst-case properties. However, by employing a modified ID technique in conjunction with the massively parallel architecture depicted in FIG. 3, the invention avoids the need for complex data structures. The modified ID method is described hereinafter.
In one practice of the invention, the deletion operation DX employs a SWAP technique that utilizes primary and auxiliary dictionaries. When the primary dictionary first fills to capacity, an auxiliary dictionary is started, but compression based on the primary dictionary is continued. From this point on, each time the auxiliary dictionary becomes full, the roles of the primary and auxiliary dictionaries are reversed, and the auxiliary dictionary is reset to be empty. A structure for providing this function is depicted in FIG. 5, using two identical copies of the systolic pipe architecture (10, 10'), in conjunction with conventional control logic elements (34, 36, 38) for switching the roles of the primary and auxiliary dictionaries.
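The role switching can be modeled in a few lines of software. The sketch below tracks only which dictionary is primary and which is auxiliary; it is an illustrative model of the SWAP behavior described above, not a description of the control logic elements 34, 36, and 38 of FIG. 5.

    class SwapDictionaries:
        """Illustrative model of the SWAP deletion heuristic."""

        def __init__(self, capacity):
            self.capacity = capacity
            self.primary = {}         # dictionary currently used for compression
            self.auxiliary = None     # started only once the primary first fills

        def add(self, index, entry):
            """Record a new dictionary entry produced by the update operation."""
            if self.auxiliary is None:
                self.primary[index] = entry
                if len(self.primary) >= self.capacity:
                    self.auxiliary = {}          # primary full: start the auxiliary
                return
            self.auxiliary[index] = entry
            if len(self.auxiliary) >= self.capacity:
                # Auxiliary full: reverse the roles and reset the new auxiliary.
                self.primary, self.auxiliary = self.auxiliary, {}

        def lookup(self, index):
            return self.primary.get(index)       # matching is done against the primary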
Those skilled in the art will appreciate that the invention can be implemented using other DX methods. Alternatives for DX include the known FREEZE method, in which the dictionary remains unchanged after it fills to capacity, and the conventional LRU (least recently used) method, in which the processors delete the least recently matched dictionary string. The FREEZE heuristic, however, can be unstable if the data characteristics change after the dictionary fills. Moreover, the SWAP method described above in connection with FIG. 5 is better adapted for parallel implementations than is an LRU technique.
Systolic Pipe: As indicated in FIG. 3, a preferred embodiment of the invention executes the above-described method steps using a systolic pipe 10. The pipe 10 includes a linear array of processors 10.1, 10.2, 10.3, . . . , 10.N, each connected only to left and right adjacent processors. The "snake" arrangement depicted in FIG. 3 can implement textual substitution methods at much higher data rates than are achievable in serial structures.
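The data flow through such a pipe can be modeled sequentially in software. In the sketch below each cell stands for one processor and holds a single dictionary entry as a pair of pointers (the storage arrangement described below in connection with FIG. 4); each cell replaces adjacent occurrences of its stored pair with its own index before handing the stream to the next cell. The per-cell scan is an assumption of the sketch that collapses the pipelining into discrete passes; it is offered only to clarify the encoding data flow, not as a description of the hardware.

    def pipe_encode(stream, cells, alphabet_size=256):
        """Sequential simulation of the systolic pipe.  `stream` is the input as a
        list of pointers (initially one pointer per character); `cells` is a list
        of (left, right) pointer pairs, one dictionary entry per processor, where
        cell i defines dictionary index alphabet_size + i."""
        for i, stored_pair in enumerate(cells):
            index = alphabet_size + i
            reduced = []
            for p in stream:
                if reduced and (reduced[-1], p) == stored_pair:
                    reduced[-1] = index      # matched pair replaced by the cell's index
                else:
                    reduced.append(p)
            stream = reduced
        return stream

For example, with cells = [(97, 98), (256, 99)], the character stream [97, 98, 99] for "abc" reduces first to [256, 99] and then to the single pointer [257].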
A simple example of a systolic pipe is an automobile assembly line, which may produce a new automobile every twenty minutes even though each car is in the assembly line for a day. Although each station in the automobile assembly line performs a different task, the stations are at least conceptually identical, if we view them all as taking as input a partially built car, performing an elementary operation (such as welding), and then transmitting a partially completed car to the next station.
The processors are numbered so that if i < j, then processor i is to the left of processor j and occurs earlier in the pipe than processor j.
As indicated in FIG. 4, a one-bit LEADER register is provided in register module 22 of each processor in the pipe. The LEADER BIT is initially ONE for the leftmost processor and ZERO for all others. It is always the case that at most one processor is the LEADER PROCESSOR, that the processors to its left contain dictionary entries, and that processors to its right are empty.
As the data stream passes through the processors to the left of the LEADER PROCESSOR, it is encoded. In particular, whenever a pair of pointers enters a processor and matches the pair of pointers stored in the storage registers of that processor (see FIG. 4), that pair of pointers is removed from the data stream and replaced by a single pointer, the index of that processor. Specifically, upon detection of a match, the index stored at index register 34 is pushed down into the register 28. Data passes unchanged through the processors to the right of the LEADER PROCESSOR. When a pair of pointers enters the LEADER PROCESSOR, they represent adjacent substrings of the original data, referred to as the "previous" match and the "current" match. The LEADER PROCESSOR can simply adopt this pair of pointers as its . . .