`
(19) World Intellectual Property Organization
International Bureau

(10) International Publication Number: WO 01/63772 A1

(43) International Publication Date: 30 August 2001 (30.08.2001)

(51) International Patent Classification 7: H03M 7/34, H04N 1/415

(21) International Application Number: PCT/US01/05722

(22) International Filing Date: 22 February 2001 (22.02.2001)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data: 09/513,309, 25 February 2000 (25.02.2000), US

(71) Applicant: PHYSICAL OPTICS CORPORATION [US/US]; 20600 Gramercy Place, Building 100, Torrance, CA 90501-1821 (US).

(72) Inventors: TERNOVSKIY, Igor, V.; 5700 Ravensbur Drive, #209, Rancho Palos Verdes, CA 90275 (US). DEVIVYE, Aleksandr, A.; 6517 Kester Avenue, #4, Van Nuys, CA 91411 (US). ROTENBERG, Joseph; 719 Price Drive, Burbank, CA 91504 (US). LIN, Freddie; 800 Calle De Arboles, Redondo Beach, CA 90277 (US).

(74) Agents: NILLES, Andrew, J. et al.; Nilles & Nilles, S.C., Firstar Center, Suite 2000, 777 East Wisconsin Avenue, Milwaukee, WI 53202 (US).

(81) Designated States (national): AE, AG, AL, AM, AT, AU, AZ, BA, BB, BG, BR, BY, BZ, CA, CH, CN, CR, CU, CZ, DE, DK, DM, DZ, EE, ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX, MZ, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, TM, TR, TT, TZ, UA, UG, UZ, VN, YU, ZA, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE, TR), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG).

Published: with international search report

For two-letter codes and other abbreviations, refer to the "Guidance Notes on Codes and Abbreviations" appearing at the beginning of each regular issue of the PCT Gazette.

(54) Title: METHOD AND APPARATUS FOR OPTIMIZED LOSSLESS COMPRESSION USING A PLURALITY OF CODERS
`
`
`
`30
`
`x
`32
`
`
`[__[ARTTAMETIC||TEWPELZV[HUFFMAN[|a)
`
`
`
`
`LOSSLESS
`LOSSLESS
`
`
`CODER
`
`CODER
`
`(Ar,8)
`(n,8)
`
`
`
`
`
`
`
`
`
`
`LOSSLESS
`CODER
`(n,m)
`
`
`
`LOSSLESS
`CODER
`(LZ,8)
`
`LOSSLESS
`CODER
`(Ar,10)
`
`
`
`
`LOSSLESS
`CODER
`(Ar,M)
`
`
`
`O01/63772Al
`
(57) Abstract: A method of lossless compression of a stream of data first includes using a plurality of lossless coders to compress a test portion of the data stream (30). Once the test portion is compressed, the method determines a performance characteristic(s) associated with each of the lossless coders (32). Then the method selects one of the lossless coders based on the performance characteristic(s) and encodes a first portion of the data stream with the selected coder. Thereafter, the method includes repeating the using, determining, selecting and encoding steps for another test portion and a second portion of the data stream. Notably, the repeating step may include selecting a different one of the lossless coders.
`
`
METHOD AND APPARATUS FOR OPTIMIZED LOSSLESS
COMPRESSION USING A PLURALITY OF CODERS
`
BACKGROUND OF THE INVENTION

1. Field of the Invention
`
The present invention is directed to data compression techniques and, more particularly, to a method and apparatus for selecting among different types of lossless compression coders to optimize system performance.
`
`2.
`
`Description of the Related Art
`
Data compression operates to minimize the number of bits used to store or transmit information and encompasses a wide array of software and hardware compression techniques. Notably, depending on the type of data to be compressed and any number of other factors, particular compression techniques can provide markedly superior performance in terms of compression ratio and coding speed.
`
Generally, data compression includes taking a stream of symbols or phrases and converting them into codes that are smaller (in bit length) than the original data. Known compression techniques and algorithms can be divided into two major families, lossy and lossless. Lossy data compression can be used to greatly increase data compression ratios; however, increased compression comes at the expense of a certain loss in accuracy. As a result, lossy compression typically is implemented only in those instances in which some data loss is acceptable. For example, lossy compression is used effectively when applied to digitized voice signals and graphics images. Lossless compression, on the other hand, is a family of data compression that utilizes techniques designed to generate an exact duplicate of the input data stream after a compression/decompression cycle. This type of compression is necessary when storing database records, word processing files, etc., where loss of information is absolutely unacceptable. The present invention is directed to lossless data compression.
`
Some lossless compression algorithms use information theory to generate variable length codes when given a probability table for a given set of symbols. The decision to output a certain code for a particular symbol or set of symbols (i.e., message) is based on a model. The model is a set of rules used to process input messages and, in response, determine which codes to output. An algorithm or program uses the model to analyze the symbols (e.g., determine a probability associated with each symbol) and then outputs an appropriate code based on that processing. There are any number of ways to model data, all of which can use the same coding technique to produce their output. In general, to compress data efficiently, a model should be selected that predicts symbols or phrases with high probabilities, because symbols or messages that have a high probability have a low information content and therefore require fewer bits to encode. The next step is to encode the symbols using a particular lossless coder.
`
Conventionally, lossless compression coders can be grouped according to whether they implement statistical modeling or dictionary-based modeling. Statistical modeling reads and encodes a single symbol at a time using the probability of the character's appearance, while dictionary-based modeling uses a single code to replace strings of symbols. Notably, in dictionary-based modeling, the model is significantly more important than in statistical-based modeling because problems associated with encoding every symbol are significantly reduced.
`
One form of statistical data compression is known as Shannon-Fano (S-F) coding. S-F coding was developed to provide variable-length bit coding so as to allow coding symbols with exactly (or a close approximation to) the number of bits of information that the message or symbol contains. S-F coding relies on knowing the probability of each symbol's appearance in a message. After the probabilities are determined, a table of codes is constructed, with each code having a different number of bits (advantageously, symbols with low probabilities have more bits). One problem with a coding technique such as this is that it creates variable length codes that have an integral number of bits, even though the information to be coded likely will require a non-integral number of bits.
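The S-F construction can be sketched as follows (the symbol frequencies are illustrative, not taken from the specification): the probability-sorted symbol list is split recursively into two groups of roughly equal total frequency, and each split appends one code bit.

```python
def shannon_fano(freqs):
    """Shannon-Fano sketch: derive variable-length binary codes from symbol frequencies."""
    codes = {}

    def split(symbols, prefix):
        if len(symbols) == 1:
            codes[symbols[0][0]] = prefix or "0"   # a one-symbol alphabet still needs one bit
            return
        total = sum(f for _, f in symbols)
        running, cut = 0, len(symbols) - 1
        for i in range(len(symbols) - 1):
            running += symbols[i][1]
            if running >= total / 2:
                cut = i + 1
                break
        split(symbols[:cut], prefix + "0")         # high-probability group gets a 0
        split(symbols[cut:], prefix + "1")         # low-probability group gets a 1

    ordered = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
    split(ordered, "")
    return codes

# Low-probability symbols receive longer codes, as described above.
print(shannon_fano({"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}))
```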
`
Another type of coding, Huffman coding, is similar to S-F coding in that it creates variable length codes that are an integral number of bits, but it utilizes a completely different algorithm. Generally, S-F and Huffman codings are close in performance, but Huffman coding, it has been determined, always at least equals the efficiency of S-F coding, so it is therefore preferred, especially since both algorithms take a similar amount of processing power. Although Huffman coding is relatively easy to implement and economical for both coding and decoding, it is inefficient due to its use of an integral number of bits per code, as in S-F coding. If a particular symbol is determined to have an information content (i.e., entropy) of 1.5 bits, a Huffman coder will generate a code having a bit count that is either one or two bits. Generally, if a statistical method could assign a 90% probability to a given symbol, the optimal code size would be 0.15 bits; however, Huffman or S-F coding likely would assign a one bit code to the symbol, which is six times larger than necessary.
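To make the integral-bit limitation concrete, the sketch below (with assumed, illustrative probabilities) compares the ideal code length, -log2(p), with the whole number of bits a Huffman or S-F prefix code must spend.

```python
import math

# Illustrative symbol probabilities (assumed values, not from the specification).
probabilities = {"A": 0.90, "B": 0.05, "C": 0.03, "D": 0.02}

for symbol, p in probabilities.items():
    ideal_bits = -math.log2(p)                    # information content of the symbol
    prefix_bits = max(1, math.ceil(ideal_bits))   # a prefix code must spend whole bits
    print(f"{symbol}: p={p:.2f}  ideal={ideal_bits:.2f} bits  prefix code >= {prefix_bits} bit(s)")

# For p = 0.90 the ideal length is about 0.15 bits, yet the shortest possible prefix
# code is 1 bit: the "six times larger than necessary" case described in the text.
```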
`
`10
`
`In view of this problem associated with utilizing an integral numberofbits,
`
`arithmetic coding was developed. Arithmetic coding replaces a stream of input
`
`symbols with a single floating point output number, and bypassesthe step of
`
`15
`
`replacing an input symbol with a specific code. Because an arithmetic codeis not
`- limited to being optimal only when the symbolprobabilities are integral powers of
`one-half (which is most often not the case), it attains the theoretical entropy of the
`symbolto be coded, thus maximizing compressionefficiency for any knownsource.
`In other words, if the entropy of a given character is 1.5 bits, arithmetic coding uses
`
`1.5 bits to encode the symbol, an impossibility for Huffman and Shannon-Fano
`
`coding. Although arithmetic coding is extremely efficient, it consumes rather large
`
`20
`
`amounts of computing resources, both in terms of CPU power and memory. This is
`
`due to the fact that sophisticated models that demanda significant amount of
`
`memory mustbe built, and that the algorithm itself requires a significant amount of
`
`computational operations.
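The interval-narrowing idea behind arithmetic coding can be illustrated with the simplified sketch below (a fixed two-symbol model and floating point arithmetic are assumed; a practical coder would use integer renormalization and an adaptive model).

```python
# Minimal arithmetic-coding sketch: encode a message as a single fraction in [0, 1)
# by repeatedly narrowing an interval.  Toy model with assumed probabilities.
model = {"A": (0.0, 0.9), "B": (0.9, 1.0)}   # cumulative probability ranges

def encode(message):
    low, high = 0.0, 1.0
    for symbol in message:
        sym_low, sym_high = model[symbol]
        span = high - low
        high = low + span * sym_high
        low = low + span * sym_low
    return (low + high) / 2.0    # any number inside [low, high) identifies the message

def decode(code, length):
    out = []
    for _ in range(length):
        for symbol, (sym_low, sym_high) in model.items():
            if sym_low <= code < sym_high:
                out.append(symbol)
                code = (code - sym_low) / (sym_high - sym_low)
                break
    return "".join(out)

msg = "AAABAA"
x = encode(msg)
assert decode(x, len(msg)) == msg
```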
`
In an alternative to the above types of lossless coding, known as substitutional or dictionary-based coding, dictionary-based compression algorithms replace occurrences of particular phrases (i.e., groups of bytes) in a data stream with a reference to a previous occurrence of those phrases. Unlike the above systems that achieve compression by encoding symbols into bit strings that use fewer bits than the original symbols, dictionary-based algorithms do not encode single symbols. Rather, dictionary-based compression techniques encode variable length strings of symbols as single "tokens." It is these tokens that form an index to a phrase dictionary. Because the tokens are smaller than the phrases they replace, compression occurs. Two main classes of dictionary-based compression schemes are known as the LZ77 and LZ78 compression algorithms of the Lempel-Ziv family of compression coders. Notably, dictionary-based coding is utilized extensively in desktop general purpose compression and has been implemented by Compuserve Information Service to encode bit-mapped graphical images. For example, the GIF format uses an LZW variant to compress repeated sequences and screen images. Although dictionary-based compression techniques are very popular forms of compression, the disadvantage of such algorithms is that a more sophisticated data structure is needed to handle the dictionary.
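The dictionary idea can be illustrated with the compact LZ78-style sketch below (a simplification, not the exact LZ77, LZ78 or LZW procedures named above): strings already seen are replaced by a (token, next-symbol) pair.

```python
def lz78_encode(data):
    """LZ78-style sketch: emit (dictionary index, next symbol) pairs."""
    dictionary = {"": 0}            # phrase -> token index
    phrase, output = "", []
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch            # keep extending the longest known phrase
        else:
            output.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        output.append((dictionary[phrase], ""))
    return output

def lz78_decode(pairs):
    phrases, out = [""], []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        out.append(phrase)
        phrases.append(phrase)
    return "".join(out)

sample = "ABABABABAABABAB"
assert lz78_decode(lz78_encode(sample)) == sample
```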
`
Overall, as communication mediums such as the internet expand, data compression will continue to be extremely important to the efficient communication of data, with different compression algorithms providing particular advantages in particular arenas. There are many types of data compression methods that are being implemented in the art, including those described above as well as others. In addition, there are many variants associated with each type of known compression algorithm, and many improvements have been developed. Again, depending on any number of factors associated with the system and the type of data being compressed, each may be preferred to provide optimum data encoding.
`
Because different ones of the known coding techniques provide unique benefits depending upon various operational factors, including the data to be encoded, a lossless compression system that selectively encodes data with different types of coders was desired. The telecommunications industry, in particular, is in need of a system which implements different types of coders, especially when the incoming data is received from multiple sources that provide different types of unknown data, i.e., when different portions of the data stream would be optimally compressed with different coding techniques.
`
`
SUMMARY OF THE INVENTION
`
The present invention is directed to a method and apparatus that determines which of a number of embedded coding schemes will optimally compress different portions of an incoming data stream. The method of the preferred embodiment is designed to accommodate a data stream characterized by having different packets of information (e.g., from sources unknown to the encoders), each of which may have different associated statistics.
`
According to a first aspect of the preferred embodiment, a method of lossless compression of a stream of data includes providing a plurality of lossless coders. The method then includes selecting one of the lossless coders to compress the stream of data, and thereafter encoding the data stream with the selected lossless coder.
`
According to another aspect of the preferred embodiment, a method of lossless compression of a stream of data includes using a plurality of lossless coders to compress a test portion of the data stream. Once the test portion is compressed, the method determines a performance characteristic associated with each of the lossless coders. Then the method includes selecting one of the lossless coders based on the determining step and encoding a first portion of the data stream with the selected coder. Thereafter, the method includes repeating the using, determining, selecting and encoding steps for another test portion and a second portion of the data stream. Notably, the repeating step may include selecting a different one of the lossless coders.
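A minimal sketch of this test-compress, select and encode cycle is given below (illustrative only: the zlib and bz2 calls merely stand in for the lossless coders, and the scoring rule that trades compression ratio against coding time is an assumption, not a limitation of the method).

```python
import time
import zlib, bz2
from typing import Callable, Dict, Iterable, Tuple

# Stand-in coders; real Huffman, arithmetic and Lempel-Ziv coders would be plugged in here.
CODERS: Dict[str, Callable[[bytes], bytes]] = {
    "LZ-style":  lambda block: zlib.compress(block, 9),
    "BWT-style": lambda block: bz2.compress(block, 9),
}

def select_coder(test_block: bytes) -> str:
    """Compress the test portion with every coder and keep the best performer."""
    best_name, best_score = None, None
    for name, compress in CODERS.items():
        start = time.perf_counter()
        compressed = compress(test_block)
        elapsed = time.perf_counter() - start
        ratio = len(test_block) / max(1, len(compressed))
        score = ratio / (1.0 + elapsed)        # assumed trade-off of ratio against coding time
        if best_score is None or score > best_score:
            best_name, best_score = name, score
    return best_name

def encode_stream(portions: Iterable[bytes], test_size: int = 8192) -> Iterable[Tuple[str, bytes]]:
    """Re-run the test/select cycle for each portion, so the chosen coder may change mid-stream."""
    for portion in portions:
        name = select_coder(portion[:test_size])
        yield name, CODERS[name](portion)
```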
`
According to a further aspect of the preferred embodiment, each of the lossless coders uses (1) a compression technique, and (2) a number of bits per word determined by the selecting step, in the encoding step. And the compression technique is one of Arithmetic coding, Huffman coding and LZ coding.
`
According to yet another aspect of the preferred embodiment, an apparatus for lossless data compression includes an interface to receive a stream of data. In addition, the apparatus includes a plurality of lossless coders and a processor. In operation, each lossless coder separately compresses a test portion of the data stream and, in response, the processor determines a performance characteristic associated with each of the lossless coders, and then selects, based on the performance characteristics, one of the lossless coders to encode at least a first portion of the data stream.
`
According to a still further aspect of the preferred embodiment, the performance characteristic includes at least one of compression ratio and duration of the compression of the test portion for a corresponding lossless coder. Moreover, the encoder includes a plurality of processors, each of the lossless coders corresponds to one of the processors, and the lossless coders compress the same test portion in parallel.
`
`10
`
`These and other objects, advantages, and features of the invention will
`
`becomeapparentto those skilled in the art from the detailed description and the
`
`accompanying drawings.
`
`It should be understood, however, that the detailed
`
`description and accompanying drawings, while indicating preferred embodiments of
`
`15
`
`the present invention, are given by wayofillustration and not of limitation. Many
`changes and modifications may be made within the scope of the present invention
`without departing from the spirit thereof, and the invention includesall such
`
`modifications.
`
BRIEF DESCRIPTION OF THE DRAWINGS

Preferred exemplary embodiments of the invention are illustrated in the accompanying drawings, in which like reference numerals represent like parts throughout, and in which:

FIG. 1 is a flow diagram showing the general operation of a method of the preferred embodiment;

FIG. 1A is a chart showing an array of lossless coders used in the method shown in FIG. 1;

FIG. 2 is a generic block diagram showing an encoding/decoding system of the preferred embodiment; and

FIG. 3 is a schematic diagram showing the data stream as it is encoded/decoded by the system shown in FIG. 2.
`
`
`DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
`
Referring to Fig. 1, a method 10 includes, after initialization and start-up at Step 12, inputting data to the system at Step 14. The data input at Step 14 may be synchronous or asynchronous data. Notably, the data stream may be received from unspecified sources such as sensors that monitor temperature, pressure, etc. of a subject (e.g., telemetric data gathered in military applications) and that continuously transmit readings to the system encoder (described below) of the preferred embodiment. Unspecified data necessarily implies that the statistics associated with the data are random and, therefore, unlike known systems that perform compression with a single type of encoder based on knowledge regarding the statistics of the data, the preferred embodiment is able to efficiently code a data stream comprised of different types of data. Other types of applications where this type of random data may originate from multiple sources include hospital monitoring applications, chemical factories, nuclear plants, and others.
`
`15
`
`Asthe data is continuously input to the system,it is transmitted to a division
`
`on communication block where method 10, at Step 16, processes the data by
`
`dividing or framing the data for further communication thereof. The division on
`
`communication block is implemented by method 10 in conventional fashion. Next,
`at Step 18, the data is pre-processed which mayinclude generating a histogram
`indicating the statistics associated with the data framed in Step 16.
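For illustration, a byte-frequency histogram of the kind Step 18 might produce can be computed as in the following sketch (the statistics actually gathered are not limited to this):

```python
from collections import Counter

def byte_histogram(frame: bytes):
    """Count byte occurrences in one framed block and return relative frequencies."""
    counts = Counter(frame)
    total = len(frame) or 1
    return {byte: count / total for byte, count in counts.items()}

# A highly repetitive frame yields a skewed histogram, which tends to favor
# statistical coders such as Huffman or arithmetic coding.
print(byte_histogram(b"AAAAAAAABBC"))
```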
`
`20
`
`Once the data has been pre-processed at Step 18, method 10 adds
`
`synchronization and header codes at Step 20, as required to further process and
`
`identify the data bits in the data stream. Upon completion of Step 20, the data is
`
`transmitted to a plurality of coders that provide lossless compression.
`
`In particular,
`
`25
`
`at Step 22, method 10 codesa test portion of the data stream with a plurality of
`
`lossless coders and determines system performancecriteria associated with each of
`
`the coders. The coders used to code the portion of the data in Step 22 are shownat
`
`32 in chart 30 of Fig. 1A. The columnsof the charts indicate different types of
`
`lossless coding techniques/algorithms which may include Huffman coding,
`
`30
`
`Arithmetic coding, Lempel-Ziv coding, as well as variants of these and other known
`
`NetApp; Rackspace
`
`Exhibit1017
`
`Page8
`
`NetApp; Rackspace Exhibit 1017 Page 8
`
`
`
`WO 01/63772
`
`PCT/US01/05722
`
`coding techniques. Notably, the method also compares the output of the coding
`
`techniques with the data stream without encoding/compressing because, in certain
`
`circumstances, uncompressed data may be optimum.
`
In general, the columns comprise lossless coding techniques, and the rows comprise different designations for the number of bits per word, bpw 1-m, that may be used to encode the data. For instance, the bits per word associated with the interface may be set to eight bits, ten bits, etc. As a result, at Step 22, method 10 codes a portion of the data with n x m lossless coders. Preferably, Step 22 is performed for a test period of time or amount of data to determine which of the lossless coders achieves optimum system performance. Thereafter, the data is encoded (described below).
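The n x m arrangement of candidate coders can be pictured as a simple cross-product, as in the sketch below (the technique list, the zlib/lzma stand-ins and the bits-per-word values are illustrative assumptions):

```python
import lzma, zlib
from itertools import product

# Stand-in compression techniques for the columns of chart 30; real Huffman,
# arithmetic and LZ coders would be plugged in here.
TECHNIQUES = {
    "LZ":   lambda data: zlib.compress(data, 9),
    "LZMA": lambda data: lzma.compress(data),
    "RAW":  lambda data: data,            # the "no compression" alternative compared above
}
BITS_PER_WORD = [8, 10, 12]               # rows of chart 30 (values are illustrative)

def build_coder_grid():
    """Return the n x m list of (technique, bits-per-word) coder candidates of chart 30."""
    return [
        {"name": f"({tech},{bpw})", "bpw": bpw, "compress": fn}
        for (tech, fn), bpw in product(TECHNIQUES.items(), BITS_PER_WORD)
    ]

print([coder["name"] for coder in build_coder_grid()])   # e.g. (LZ,8), (LZ,10), ... (RAW,12)
```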
`
The test compression performed by coders 32 in Step 22 preferably is conducted in parallel to quickly compile data corresponding to each of the lossless coders. Parallel coding of the test data is possible due to the fact that computing power has become so inexpensive that the benefits (e.g., in terms of encoding speed) greatly outweigh the costs. Nevertheless, in an alternative embodiment, each of coders 32 shown in Fig. 1A may code the test data sequentially over a designated period of time to produce the corresponding performance data. Although not preferred, such a sequential test may be performed when computing power is at a premium.
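A parallel test pass over the candidates might be sketched as follows (a thread pool is used purely for illustration; the plural-processor embodiment, or a process pool, would serve the same purpose):

```python
import time, zlib, bz2
from concurrent.futures import ThreadPoolExecutor

def run_trial(name, compress, test_block):
    """Compress the shared test block with one candidate coder and report its statistics."""
    start = time.perf_counter()
    compressed = compress(test_block)
    elapsed = time.perf_counter() - start
    return name, len(test_block) / max(1, len(compressed)), elapsed

def evaluate_in_parallel(coders, test_block):
    # The thread pool stands in for the plural processors of the preferred embodiment.
    with ThreadPoolExecutor(max_workers=len(coders)) as pool:
        futures = [pool.submit(run_trial, name, fn, test_block) for name, fn in coders]
        return [f.result() for f in futures]    # [(name, compression ratio, seconds), ...]

coders = [("LZ-style", lambda d: zlib.compress(d, 9)), ("BWT-style", lambda d: bz2.compress(d, 9))]
print(evaluate_in_parallel(coders, b"test portion " * 1000))
```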
`
Turning to Table 1 below, the performance criteria generated in Step 22 for nine different lossless coders (three different word lengths x three different coding techniques) are shown. Note first that the input bit rate is set at a predetermined value, while the output bit rate, although preferably set based on the transmission medium employed, may be continuously updated based on feedback information regarding the lossless coder employed. Optimally, the output bit rate will be made as small as possible. As shown in Table 1, after compressing a test amount of data, an output file size in bytes, a compression ratio, and a time to encode are each determined for a designated speed input (kbit/second) and speed output (kbit/second). For example, for an input file having 304,180,992 bytes and when using eight bits per word, Huffman coding achieves a compression ratio of 1.8272, Lempel-Ziv coding achieves a ratio of 2.505, and arithmetic coding achieves a ratio of 2.7724. In addition, the time to encode the test data for each of these algorithms is 128 seconds, 522 seconds, and 1,582 seconds, respectively. Once the performance criteria are generated for each lossless encoder, method 10 executes Step 24 to select one of the coders to code (i.e., compress) the data for a predetermined amount of time or for a particular amount of data.
`
`10
`
`15
`
`Notably, the selection made in Step 24 typically is not based solely on
`compressionratio realized but rather the selection is made based on a combination
`of overall processing time and compression ratio performance characteristics. For
`example, in Table 1, arithmetic coding, for eight bits per word, achieves a
`compressionratio of 2.7724 which is greater than the compression ratio achieved
`
`for Lempel-ZIV coding, 2.505. However, arithmetic coding takes more than fifteen
`minutes longer to encode than the Lempel-Ziv lossless coder.
`In this case, method
`10 likely would select the Lempel-Ziv coder in Step 24. However, if the
`
`performance achieved byall n x m lossless coders does not satisfy a threshold level,
`method 10 may decide to send the data uncompressed. This decision dependson,
`
`amongother things, user requirements.
`
The input clock rate indicated in Table 1 is dependent upon both the media over which the data is transmitted (the internet, for example) and the type of coding algorithm implemented. The time performance criteria are generated according to the following equation:

t_overall = t_c + t_processing    (Eq. 1)

In Equation 1, t_processing includes the time duration associated with compressing the data, system delay, etc. Further, t_c is the time to transmit the data; it equals the size of the file divided by the compression ratio and by the output speed (i.e., the bit rate), and reflects the time savings achieved by compressing the data. The compression ratio (CR) is equal to the input file size divided by the output file size.
[Table 1: for each of 8, 10 and 12 bits per word, the Input Speed (kbit/second), Output Speed (kbit/second), Output File Size (byte), Compression Ratio and Time (second) obtained with the Huffman, Lempel-Ziv and Arithmetic coders; for 8 bits per word the compression ratios are 1.8272, 2.505 and 2.7724 and the encoding times are 128, 522 and 1,582 seconds, respectively.]
`
Once a coder (n,m) 32 (Fig. 1A) is selected in Step 24, method 10 encodes the data with the selected coder for, preferably, a predetermined amount of time. Thereafter, the program returns to Step 22 to code a new test portion of the data stream and select an optimum coder for encoding the next portion of the data stream. This operation may require implementing a different lossless coder shown in chart 30.
`
Turning to FIG. 2, a system 40 for performing method 10 includes an encoder 41 having an input interface 42 which includes a clock input C1 and a data input D1 for receiving a data stream 43 that is either synchronous or asynchronous. Interface 42 is coupled to a digital signal processor (DSP) chip 46 via input and output data-control-synchronous input/output lines 44. Notably, DSP 46 preferably performs Steps 16 and 18 of method 10 shown in FIG. 1 to frame the data and prepare it for compression. The output of DSP 46 is coupled to computer 50 via a PCI bus 48 that communicates the framed data to the computer. Computer 50, preferably, adds appropriate header codes to the data stream to indicate different packets of data and operates to encode/compress the test data with each lossless coder shown in chart 30. As noted above, computer 50 may comprise a plurality of processors, each capable of encoding/compressing data for a corresponding one of the lossless coders implemented from the grid 30 in Fig. 1A. Alternatively, a single computer 50 could be used to implement the test compression for each of the lossless coders 32 in a sequential fashion for a predetermined period of time.
`
Computer 50 may also be used to add header codes to the data to ensure that the file will be decompressed correctly. The compressed data is then transmitted via PCI bus 48 to the DSP chip 46 to divide the data as necessary for the specific communication system implemented. This process may involve buffering the data by inserting empty blocks and/or deleting existing blocks. Thereafter, particular synchronization codes may be added and the data stream is transmitted along the input/output lines back to interface 42. The particular interface code settings include designating the number of bits/word, the number of words per frame, synchronizing codes, a control sum, etc. Interface 42 then outputs the data stream on line D2, so that it may be transmitted over a medium 52 such as the internet. Preferably, the output clock rate C2 is set by the operator and is dependent upon the type of medium 52 implemented.
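A possible frame header carrying these interface settings is sketched below (the field layout, field widths and the simple additive control sum are illustrative assumptions; the specification does not prescribe a particular format):

```python
import struct

def control_sum(payload: bytes) -> int:
    """Illustrative additive control sum over the payload, reduced to 16 bits."""
    return sum(payload) & 0xFFFF

def build_frame(payload: bytes, bits_per_word: int, words_per_frame: int, sync: int = 0xA5A5) -> bytes:
    """Prepend a header carrying a sync code, bits/word, words/frame and a control sum."""
    header = struct.pack(">HBHH", sync, bits_per_word, words_per_frame, control_sum(payload))
    return header + payload

def check_frame(frame: bytes) -> bool:
    sync, bpw, wpf, csum = struct.unpack(">HBHH", frame[:7])
    return csum == control_sum(frame[7:])        # decoder-side control sum check (CSC)

frame = build_frame(b"compressed block", bits_per_word=8, words_per_frame=1024)
assert check_frame(frame)
```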
`
Next, the decoder 53 of system 40 includes an interface 54 having a data input D3 for receiving the compressed data from medium 52 at a clock rate C3 that corresponds to clock rate C2 output from interface 42. Notably, clocks C2 and C3 are optional. Interface 54 transmits the compressed data stream via data-control-synchronous lines 56, while the POC synchronization added by encoder 41 is deleted. Thereafter, a DSP chip 58 detects the header code(s) and removes empty blocks from the data stream. The data processed by DSP chip 58 is then transmitted to a computer 62 via a PCI bus 60. Computer 62 decompresses the data and, preferably, implements conventional control sum check (CSC) comparison techniques. Additional error detection or error correction coders may also be implemented by computer 62. A Reed-Solomon error correction coder is standard for communication networks and is preferably included. Notably, the above-described processing operations may be performed either by computer(s) 50, 62 or DSP chips 46, 58, but the preferred implementation has been described. The data is then transmitted back to interface 54 and onto data line D4 at a clock rate C4 = C1.
`
A representation of the method steps described in FIG. 1 and performed by the apparatus shown in Fig. 2 is shown schematically in FIG. 3 for a telemetric data stream. The arrow labeled A on the right side of FIG. 3 indicates the encoding process, while the arrow B along the left side of the data shown in FIG. 3 indicates the decoding process. More particularly, data stream 43 is input to interface 42 (FIG. 2) and then framed in a preassigned fashion into packet portions 64, 66 (preferably several kilobytes, e.g., two 8k portions), preferably by DSP 46. Thereafter, portions 64, 66 are compressed into, for example, a 4.5k block 68 and a 4.3k block 70. Then, a header 73, 75 is added to the blocks of the packets (with the interface information described above) to create blocks 72, 74, respectively. Then, the blocks are buffered to construct a buffered and compressed packet 76, which may be divided again if necessary to create stream 78. Then, POC synchronization is added by DSP chip 46 and the new data stream 80 may be transmitted to decoder 53 via, for example, the internet 52 (FIG. 2), where it is decoded as described above.
`
Many changes and modifications may be made within the scope of the present invention without departing from the spirit thereof. Other changes and modifications falling within the scope of the invention will become apparent from the appended claims.
`
`
What is claimed is:

1. A method of lossless compression of a stream of data, the method comprising the steps of:
providing a plurality of different types of lossless coders;
selecting one of the lossless coders to compress the data stream; and
encoding the data stream with the selected lossless coder.

2. The method of claim 1, further comprising the step of, prior to said selecting step, individually compressing at least a portion of the data stream with each of the lossless coders; and
wherein said selecting step is performed based on said compressing step.

3. The method of claim 1, wherein said selecting step is based on a performance characteristic associated with said compressing step.

4. The method of claim 3, wherein the performance characteristic includes at least one of compression ratio and duration of said compressing step for a corresponding lossless coder.

5. The method of claim 1, wherein at least one of the lossless coders uses statistical modeling.

6. The method of claim 5, wherein at least another of the lossless coders uses dictionary-based modeling.

7. The method of claim 2, wherein the lossless coders perform said compressing step in parallel.

8. The method of claim 2, wherein the lossless coders perform said compressing step sequentially.
`
`
`9.
`
`The method of claim 1, wherein the lossless coders are defined in part by a
`
`numberofbits per word used in said encoding step.
`
`A methodof lossless compression of a stream of data, the method
`10.
`comprising the stepsof:
`
`using a plurality of different types of lossless coders to compressa test
`
`portion of the data stream;
`
`determining a performance characteristic associated with each of the lossless
`
`coders in responseto said using step;
`
`10
`
`selecting one of the lossless coders based on said determiningstep;
`encoding a first portion of the data stream with the selected coder; and
`
`repeating said using, determining, selecting and encoding steps for another
`
`test portion and a second portion of the data stream.
`
`11.|The method of claim 10, wherein said repeating step includesselecting a
`15
`
`different one of the lossless coders.
`
`12.|The method of claim 10, wherei