US006195024B1
(12) United States Patent
Fallon

(10) Patent No.: US 6,195,024 B1
(45) Date of Patent: Feb. 27, 2001

(54) CONTENT INDEPENDENT DATA COMPRESSION METHOD AND SYSTEM

(75) Inventor: James J. Fallon, Bronxville, NY (US)

(73) Assignee: Realtime Data, LLC, New York, NY (US)

(*) Notice: Subject to any disclaimer, the term of this patent is
extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(74) Attorney, Agent, or Firm: Frank V. DeRosa; F. Chau
& Associates, LLP

(57) ABSTRACT

Systems and methods for providing content independent
lossless data compression and decompression. A data
compression system includes a plurality of encoders that are
configured to simultaneously or sequentially compress data
independent of the data content. The results of the various
encoders are compared to determine if compression is
achieved and to determine which encoder yields the highest
lossless compression ratio. The encoded data with the
highest lossless compression ratio is then selected for
subsequent data processing, storage, or transmittal. A
compression identification descriptor may be appended to the
encoded data with the highest compression ratio to enable
subsequent decompression and data interpretation.
Furthermore, a timer may be added to measure the time
elapsed during the encoding process against an a
priori-specified time limit. When the time limit expires,
only the data output from those encoders that have completed
the encoding process are compared. The encoded data with the
highest compression ratio is selected for data processing,
storage, or transmittal. The imposed time limit ensures that
the real-time or pseudo real-time nature of the data encoding
is preserved. Buffering the output from each encoder allows
additional encoders to be sequentially applied to the output
of the previous encoder, yielding a more optimal lossless
data compression ratio.

(21) Appl. No.: 09/210,491
(22) Filed: Dec. 11, 1998

(51) Int. Cl. ............ H03M 7/34; H03M 7/00
(52) U.S. Cl. ............ 341/51; 341/79
(58) Field of Search ..... 341/51, 79, 67; 709/231, 219, 236, 250;
                           358/1.1; 712/32; 711/208

(56) References Cited

U.S. PATENT DOCUMENTS

4,872,009      10/1989  Tsukiyama et al.
4,929,946       5/1990  O'Brien et al.
5,045,852       9/1991  Mitchell et al.
5,097,261       3/1992  Langdon, Jr. et al.
5,175,543  *   12/1992  Lantz ............ 341/51

(List continued on next page.)

Primary Examiner: Patrick Wamsley

34 Claims, 16 Drawing Sheets
`
`
`
`
`
[Representative drawing (block diagram): data stream, encoders E1, E2, E3 ... En, buffer/counter 1 ... n, compression ratio comparison, compression type description, encoded data. The remaining drawing text is not recoverable from the scan.]
`
`Commvault Ex. 1013
`Commvault v. Realtime
`
`US Patent No. 9,054,728
`
`
`

`

US 6,195,024 B1
Page 2

U.S. PATENT DOCUMENTS

5,212,742             3/1993  Normile et al.
[number illegible]    [date illegible]  [name illegible] et al.
[number illegible]    [date illegible]  Seroussi et al.
5,243,348             9/1993  Jackson
5,270,832            12/1993  Balkanski et al.
5,379,036             1/1995  Storer
5,381,145             1/1995  Allen et al.
5,394,534             2/1995  Kulakowski et al.
5,412,384  *          5/1995  Chang et al. ............ 341/79
5,461,679            10/1995  Normile et al.
5,467,087            11/1995  Chu
5,471,206            11/1995  Allen et al.
5,479,587            12/1995  Campbell et al.
5,486,826             1/1996  Remillard
5,495,244             2/1996  Je-Chang et al.
5,533,051             7/1996  James
5,583,500            12/1996  Allen et al.
5,627,534             5/1997  Craft
5,654,703             8/1997  Clark, II
5,668,737             9/1997  Iler
5,717,393             2/1998  Nakano et al.
5,717,394             2/1998  Schwartz et al.
5,729,228             3/1998  Franaszek et al.
5,748,904             5/1998  Huang et al.
5,771,340             6/1998  Nakazato et al.
5,784,572             7/1998  Rostoker et al.
5,799,110             8/1998  Israelsen et al.
5,805,932             9/1998  Kawashima et al.
5,809,176             9/1998  Yajima
5,818,368            10/1998  Langley
5,818,530            10/1998  Canfield et al.
5,819,215            10/1998  Dobson et al.
5,825,424            10/1998  Canfield et al.
5,847,762            12/1998  Canfield et al.
5,861,824             1/1999  Ryu et al.
5,917,438             6/1999  Ando
5,964,842            10/1999  Packard
5,991,515            11/1999  Fall et al.
6,031,939             2/2000  Gilbert et al.

* cited by examiner
`
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 1 of 16    US 6,195,024 B1

[FIG. 1 (PRIOR ART): block/flow diagram. Compression (1): input data stream; identify input data type and generate data type identification signal (2); compress data in accordance with identified data type (3); compressed data stream. Decompression: compressed data stream; retrieve data type information of compressed data stream; decompress data in accordance with identified data type.]
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 2 of 16    US 6,195,024 B1

[FIG. 2: block diagram of a content independent data compression system. The drawing text is rotated in the scan; legible labels: data stream, input data block buffer, counter, encoders E1, E2, E3 ... En, buffer/counter 1, 2, 3 ... n, compression ratio determination/comparison, compression type description, encoded data stream with descriptor.]
`
`
`
`
`
`
`
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 3 of 16    US 6,195,024 B1

[FIG. 3a: flow diagram (legible reference numerals 300 to 314; connector B). Receive initial data block from input data stream; count size of data block; buffer data block; compress data block with enabled encoders; buffer encoded data block output from each encoder; count size of encoded data blocks; calculate compression ratios; compare compression ratios with threshold limit.]
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 4 of 16    US 6,195,024 B1

[FIG. 3b: flow diagram (legible reference numerals 316, 318, 320, 330). Decision: is the compression ratio of at least one encoded data block greater than the threshold? If no: append null descriptor to unencoded input data block; output unencoded data block with null descriptor. If yes: select encoded data block with greatest compression ratio; append corresponding descriptor; output encoded data block with descriptor. Decision: more data blocks in input stream? If yes: receive next data block from input stream. If no: terminate data compression process.]
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 5 of 16    US 6,195,024 B1

[FIG. 4: block diagram of a content independent data compression system having an enhanced metric for selecting an optimal encoding technique. The drawing text is rotated in the scan; legible labels: data stream, input data block buffer, counter, encoders E1, E2, E3 ... En, buffer/counter 1, 2, 3 ... n, compression ratio determination/comparison, figure of merit determination, encoder desirability factors, compression type description, encoded data stream with descriptor.]
`
`
`
`
`
`
`
`
`
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 6 of 16    US 6,195,024 B1

[FIG. 5a: flow diagram (legible reference numerals 500 to 516; connector B). Receive initial data block from input data stream; count size of data block; buffer data block; compress data block with enabled encoders; append corresponding desirability factors to encoded data blocks; buffer encoded data block output from each encoder; count size of encoded data blocks; calculate compression ratios; compare compression ratios with threshold limit.]
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 7 of 16    US 6,195,024 B1

[FIG. 5b: flow diagram (legible reference numerals 518, 520, 522, 534; connectors A and B). Decision: is the compression ratio of at least one encoded data block greater than the threshold? If no: append null descriptor to unencoded input data block; output unencoded data block with null descriptor. If yes: calculate figure of merit for each encoded data block which exceeds the threshold; select encoded data block with greatest figure of merit; append corresponding descriptor; output encoded data block with descriptor. Decision: more data blocks in input stream? If yes: receive next data block from input stream. If no: terminate data compression process.]
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 8 of 16    US 6,195,024 B1

[FIG. 6: block diagram of a content independent data compression system having an a priori specified timer. The drawing text is rotated in the scan; legible labels: data stream, input data block buffer, counter, encoders E1, E2, E3 ... En, buffer/counter 1, 2, 3 ... n, user-specified time, compression ratio determination/comparison, compression type description, encoded data stream with descriptor.]
`
`
`
`
`
`
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 9 of 16    US 6,195,024 B1

[FIG. 7a: flow diagram (legible reference numerals 700 to 710 and 720 to 724; connector B). Input initial data block from input data stream; count size of data block; buffer data block; initialize timer; begin compressing data block with encoders; decisions: time expired? encoding complete?; stop encoding process; buffer encoded data block for each encoder that completed encoding process within time limit; count size of encoded data blocks; calculate compression ratios; compare compression ratios with threshold limit.]
`
`

`

`
`
U.S. Patent    Feb. 27, 2001    Sheet 10 of 16    US 6,195,024 B1

[FIG. 7b: flow diagram (legible reference numerals 726, 728, 730, 740). Decision: is the compression ratio of at least one encoded data block greater than the threshold? If no: append null descriptor to unencoded input data block; output unencoded data block with null descriptor. If yes: select encoded data block with greatest compression ratio; append corresponding descriptor; output encoded data block with descriptor. Decision: more data blocks in input stream? If yes: receive next data block from input stream. If no: terminate data compression process.]
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 11 of 16    US 6,195,024 B1

[FIG. 8: block diagram of a content independent data compression system having an a priori specified timer and an enhanced metric for selecting an optimal encoding technique. The drawing text is rotated in the scan; legible labels: input data stream, input data block buffer, counter, encoders E1, E2, E3 ... En, buffer/counters, timer, user-specified time, compression ratio determination/comparison, figure of merit determination, encoder desirability factors, compression type description, encoded data stream with descriptor.]
`
`
`
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 12 of 16    US 6,195,024 B1

[FIG. 9: block diagram of a content independent data compression system having an encoding architecture comprising a plurality of sets of serially-cascaded encoders. The drawing text is rotated and largely unrecoverable in the scan; legible labels: input data stream, input data block, counter, compression ratio determination/comparison, figure of merit determination, encoder desirability factors, compression type description, user-specified timer, encoded data stream with descriptor.]
`
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 13 of 16    US 6,195,024 B1

[FIG. 10a: flow diagram (legible reference numerals 100 to 126; connectors A and B). Receive initial data block from input data stream; count size of data block; buffer data block; initialize timer; apply input data block to first encoding stage in cascaded encoder paths; decision: time expired?; buffer encoded data block output from completed encoding stage; apply output of completed encoding stage to next encoding stage in cascade path; stop encoding process; select buffered output of last encoding stage in encoder cascade that completed encoding process within time limit; count size of encoded data blocks; calculate compression ratios; compare compression ratios with threshold limit.]
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 14 of 16    US 6,195,024 B1

[FIG. 10b: flow diagram (legible reference numerals 128 to 144; connector B). Decision: is the compression ratio of at least one encoded data block greater than the threshold? If no: append null descriptor to unencoded input data block; output unencoded data block with null descriptor. If yes: calculate figure of merit for each encoded data block which exceeds the threshold; select encoded data block with greatest figure of merit; append corresponding descriptor; output encoded data block with descriptor. Decision: more data blocks in input stream? If yes: receive next data block from input stream. If no: terminate data compression process.]
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 15 of 16    US 6,195,024 B1

[FIG. 11: the drawing text is rotated in the scan; legible labels: input data stream, input data block buffer, descriptor extraction, decoders D1, D2, D3 ... Dn, data with null descriptor, output data buffer, output data stream; legible reference numerals 1100, 1102, 1104, 1106.]
`
`
`
`

`

U.S. Patent    Feb. 27, 2001    Sheet 16 of 16    US 6,195,024 B1

[FIG. 12: flow diagram (legible reference numerals 1200, 1202, 1204, 1218). Receive initial data block from input data stream; buffer data block; extract data compression type descriptor; decision: is the data compression descriptor null? (box text only partly legible). If yes: output undecoded data block. If no: select decoder(s) corresponding to descriptor; decode data block using selected decoder(s); output decoded data block. Decision: more data blocks in input stream? If yes: receive next data block in input stream. If no: terminate decoding process.]
`
`

`

CONTENT INDEPENDENT DATA
COMPRESSION METHOD AND SYSTEM

BACKGROUND

1. Technical Field

The present invention relates generally to data compression
and decompression and, more particularly, to systems and
methods for providing content independent lossless data
compression and decompression.

2. Description of the Related Art

Information may be represented in a variety of manners.
Discrete information such as text and numbers is easily
represented in digital data. This type of data representation
is known as symbolic digital data. Symbolic digital data is
thus an absolute representation of data such as a letter,
figure, character, mark, machine code, or drawing.

Continuous information such as speech, music, audio,
images and video, frequently exists in the natural world as
analog information. As is well-known to those skilled in the
art, recent advances in very large scale integration (VLSI)
digital computer technology have enabled both discrete and
analog information to be represented with digital data.
Continuous information represented as digital data is often
referred to as diffuse data. Diffuse digital data is thus a
representation of data that is of low information density and
is typically not easily recognizable to humans in its native
form.

There are many advantages associated with digital data
representation. For instance, digital data is more readily
processed, stored, and transmitted due to its inherently high
noise immunity. In addition, the inclusion of redundancy in
digital data representation enables error detection and/or
correction. Error detection and/or correction capabilities are
dependent upon the amount and type of data redundancy,
available error detection and correction processing, and
extent of data corruption.

One outcome of digital data representation is the continuing
need for increased capacity in data processing, storage,
and transmittal. This is especially true for diffuse data where
increases in fidelity and resolution create exponentially
greater quantities of data. Data compression is widely used
to reduce the amount of data required to process, transmit,
or store a given quantity of information. In general, there are
two types of data compression techniques that may be
utilized either separately or jointly to encode/decode data:
lossless and lossy data compression.

Lossy data compression techniques provide for an inexact
representation of the original uncompressed data such that
the decoded (or reconstructed) data differs from the original
unencoded/uncompressed data. Lossy data compression is
also known as irreversible or noisy compression. Entropy is
defined as the quantity of information in a given set of data.
Thus, one obvious advantage of lossy data compression is
that the compression ratios can be larger than the entropy
limit, all at the expense of information content. Many lossy
data compression techniques seek to exploit various traits
within the human senses to eliminate otherwise imperceptible
data. For example, lossy data compression of visual
imagery might seek to delete information content in excess
of the display resolution or contrast ratio.

On the other hand, lossless data compression techniques
provide an exact representation of the original uncompressed
data. Simply stated, the decoded (or reconstructed)
data is identical to the original unencoded/uncompressed
data. Lossless data compression is also known as reversible
or noiseless compression. Thus, lossless data compression
has, as its current limit, a minimum representation defined
by the entropy of a given data set.
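
The entropy limit mentioned above can be illustrated with a small calculation. The following is a minimal sketch, assuming a zeroth-order model in which every byte is treated as an independent symbol; the function names `byte_entropy` and `entropy_bound_bytes` are hypothetical and do not come from the patent. Encoders that exploit longer-range structure can produce output smaller than this simple estimate, so the figure is only a bound for the assumed model.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Zeroth-order Shannon entropy in bits per byte (bytes assumed independent)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = -sum(p * log2(p)) over the observed byte values
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_bound_bytes(data: bytes) -> float:
    """Smallest encoded size, in bytes, permitted by the zeroth-order model."""
    return byte_entropy(data) * len(data) / 8
```

A block made of a single repeated byte has zero entropy under this model, while a block in which all 256 byte values occur equally often has 8 bits per byte and cannot be shortened by a symbol-by-symbol code.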

There are various problems associated with the use of
lossless compression techniques. One fundamental problem
encountered with most lossless data compression techniques
is their content sensitive behavior. This is often referred to
as data dependency. Data dependency implies that the
compression ratio achieved is highly contingent upon the content
of the data being compressed. For example, database files
often have large unused fields and high data redundancies,
offering the opportunity to losslessly compress data at ratios
of 5 to 1 or more. In contrast, concise software programs
have little to no data redundancy and, typically, will not
losslessly compress better than 2 to 1.
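
Data dependency can be observed with any general-purpose encoder. The sketch below assumes Python's standard `zlib` module as the single lossless technique and uses two synthetic inputs that are illustrative only: padded, repetitive records standing in for a database file, and pseudo-random bytes standing in for data with no exploitable redundancy. The compression ratio is taken here as uncompressed size divided by compressed size.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Uncompressed size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data))


rng = random.Random(0)  # fixed seed so the example is repeatable

# 64 records of 64 bytes each: a short field followed by unused padding.
sparse_records = (b"name=ACME" + b"\x00" * 55) * 64

# 4096 pseudo-random bytes: no redundancy for the encoder to exploit.
noise = bytes(rng.getrandbits(8) for _ in range(4096))

ratio_sparse = compression_ratio(sparse_records)
ratio_noise = compression_ratio(noise)
```

The same encoder yields a ratio far above 5 to 1 on the padded records and a ratio slightly below 1 to 1 on the pseudo-random block, which is the content sensitivity described above.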

Another problem with lossless compression is that there
are significant variations in the compression ratio obtained
when using a single lossless data compression technique for
data streams having different data content and data size. This
process is known as natural variation.

A further problem is that negative compression may occur
when certain data compression techniques act upon many
types of highly compressed data. Highly compressed data
appears random and many data compression techniques will
substantially expand, not compress this type of data.
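
Negative compression is easy to reproduce. The sketch below assumes `zlib` as the encoder and a pseudo-random block as a stand-in for highly compressed data; both choices are illustrative. `zlib` limits the expansion to a few bytes of framing overhead, whereas other techniques may expand such data considerably more.

```python
import random
import zlib

rng = random.Random(1)  # fixed seed so the example is repeatable
random_like = bytes(rng.getrandbits(8) for _ in range(4096))

once = zlib.compress(random_like)   # first pass cannot shrink random-looking data
twice = zlib.compress(once)         # compressing compressed output grows it again

growth_first_pass = len(once) - len(random_like)
growth_second_pass = len(twice) - len(once)
```

Both growth figures come out positive, so each pass makes the block larger rather than smaller.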

For a given application, there are many factors which
govern the applicability of various data compression
techniques. These factors include compression ratio, encoding
and decoding processing requirements, encoding and decoding
time delays, compatibility with existing standards, and
implementation complexity and cost, along with the
adaptability and robustness to variations in input data. A direct
relationship exists in the current art between compression
ratio and the amount and complexity of processing required.
One of the limiting factors in most existing prior art lossless
data compression techniques is the rate at which the encoding
and decoding processes are performed. Hardware and
software implementation tradeoffs are often dictated by
encoder and decoder complexity along with cost.

Another problem associated with lossless compression
methods is determining the optimal compression technique
for a given set of input data and intended application. To
combat this problem, there are many conventional content
dependent techniques which may be utilized. For instance,
filetype descriptors are typically appended to file names to
describe the application programs that normally act upon the
data contained within the file. In this manner data types, data
structures, and formats within a given file may be ascertained.
Fundamental problems with this content dependent
technique are:
(1) the extremely large number of application programs,
some of which do not possess published or documented
file formats, data structures, or data type descriptors;
(2) the ability for any data compression supplier or
consortium to acquire, store, and access the vast
amounts of data required to identify known file descriptors
and associated data types, data structures, and
formats; and
(3) the rate at which new application programs are developed
and the need to update file format data descriptions
accordingly.

An alternative technique that approaches the problem of
selecting an appropriate lossless data compression technique
is disclosed in U.S. Pat. No. 5,467,087 to Chu entitled “High
Speed Lossless Data Compression System” (“Chu”). FIG. 1
illustrates an embodiment of this data compression and
decompression technique. Data compression 1 comprises
two phases, a data pre-compression phase 2 and a data
compression phase 3. Data decompression 4 of a compressed
input data stream is also comprised of two phases,
a data type retrieval phase 5 and a data decompression phase
6. During the data compression process 1, the data
pre-compressor 2 accepts an uncompressed data stream,
identifies the data type of the input stream, and generates a data
type identification signal. The data compressor 3 selects a
data compression method from a preselected set of methods
to compress the input data stream, with the intention of
producing the best available compression ratio for that
particular data type.
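
The two-phase, content dependent flow described for FIG. 1 can be sketched as follows. This is a toy illustration only and is not taken from Chu: the type test (`identify_data_type`), the two type labels, and the mapping `METHOD_FOR_TYPE` from type to standard-library encoder are all assumptions made for the example.

```python
import bz2
import zlib
from typing import Tuple

# Hypothetical table: one preselected method per identified data type.
METHOD_FOR_TYPE = {"ascii": bz2.compress, "binary": zlib.compress}


def identify_data_type(data: bytes) -> str:
    """Toy pre-compression phase: label the block 'ascii' or 'binary'."""
    return "ascii" if all(byte < 0x80 for byte in data) else "binary"


def compress_by_type(data: bytes) -> Tuple[str, bytes]:
    """Compression phase: apply the method preselected for the identified type."""
    data_type = identify_data_type(data)
    return data_type, METHOD_FOR_TYPE[data_type](data)
```

The sketch makes the dependency visible: the quality of the result rests entirely on how well the type test and the fixed table anticipate the data that actually arrives.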

There are several problems associated with the Chu
method. One such problem is the need to unambiguously
identify various data types. While these might include such
common data types as ASCII, binary, or unicode, there, in
fact, exists a broad universe of data types that fall outside the
three most common data types. Examples of these alternate
data types include: signed and unsigned integers of various
lengths, differing types and precision of floating point
numbers, pointers, other forms of character text, and a
multitude of user defined data types. Additionally, data types
may be interspersed or partially compressed, making data
type recognition difficult and/or impractical. Another problem
is that given a known data type, or mix of data types
within a specific set or subset of input data, it may be
difficult and/or impractical to predict which data encoding
technique yields the highest compression ratio.

Chu discloses an alternate embodiment wherein a data
compression rate control signal is provided to adjust specific
parameters of the selected encoding algorithm to adjust the
compression time for compressing data. One problem with
this technique is that the length of time to compress a given
set of input data may be difficult or impractical to predict.
Consequently, there is no guarantee that a given encoding
algorithm or set of encoding algorithms will perform for all
possible combinations of input data for a specific timing
constraint. Another problem is that, by altering the parameters
of the encoding process, it may be difficult and/or
impractical to predict the resultant compression ratio.

Other conventional techniques have been implemented to
address the aforementioned problems. For instance, U.S.
Pat. No. 5,243,341 to Seroussi et al. describes a class of
Lempel-Ziv lossless data compression algorithms that utilize
a memory based dictionary of finite size to facilitate the
compression and decompression of data. A second standby
dictionary is included comprised of those encoded data
entries that compress the greatest amount of input data.
When the current dictionary fills up and is reset, the standby
dictionary becomes the current dictionary, thereby maintaining
a reasonable data compression ratio and freeing up
memory for newly encoded data strings. Multiple dictionaries
are employed within the same encoding technique to
increase the lossless data compression ratio. This technique
demonstrates the prior art of using multiple dictionaries
within a single encoding process to aid in reducing the data
dependency of a single encoding technique. One problem
with this method is that it does not address the difficulties in
dealing with a wide variety of data types.

U.S. Pat. No. 5,717,393 to Nakano, et al. teaches a
plurality of code tables such as a high-usage code table and
a low-usage code table in an entropy encoding unit. A
block-sorted last character string from a block-sorting
transforming unit is transformed by the move-to-front
transforming unit into a move-to-front (MTF) code string. The entropy
encoding unit switches the code tables at a discontinuous
part of the MTF code string to perform entropy coding. This
technique increases the compression rate without extending
the block size. Nakano employs multiple code tables within
a single entropy encoding unit to increase the lossless data
compression ratio for a given block size, somewhat reducing
the data dependency of the encoding algorithm. Again, the
problem with this technique is that it does not address the
difficulties in dealing with a wide variety of data types.
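
For readers unfamiliar with the move-to-front (MTF) transform named above, the following is a minimal sketch of the textbook transform over byte values. It is not Nakano's implementation: the block-sorting stage and the switching between code tables are not shown, and the function names are hypothetical.

```python
from typing import List


def mtf_encode(data: bytes) -> List[int]:
    """Replace each byte by its current position in a recency list."""
    table = list(range(256))          # initial list: byte values in order
    out = []
    for byte in data:
        index = table.index(byte)     # position of the byte in the list
        out.append(index)
        table.insert(0, table.pop(index))  # move that byte to the front
    return out


def mtf_decode(indices: List[int]) -> bytes:
    """Invert mtf_encode by replaying the same list updates."""
    table = list(range(256))
    out = bytearray()
    for index in indices:
        byte = table.pop(index)
        out.append(byte)
        table.insert(0, byte)
    return bytes(out)
```

Runs of identical bytes, which block sorting tends to produce, turn into runs of zeros in the MTF code string, which a subsequent entropy coder can represent compactly.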

U.S. Pat. No. 5,809,176 to Yajima discloses a technique of
dividing a native or uncompressed image data into a
plurality of streams for subsequent encoding by a plurality of
identically functioning arithmetic encoders. This method
demonstrates the technique of employing multiple encoders
to reduce the time of encoding for a single method of
compression.

U.S. Pat. Nos. 5,583,500 and 5,471,206 to Allen, et al.
disclose systems for parallel decompression of a data stream
comprised of multiple code words. At least two code words
are decoded simultaneously to enhance the decoding process.
This technique demonstrates the prior art of utilizing
multiple decoders to expedite the data decompression process.

U.S. Pat. No. 5,627,534 to Craft teaches a two-stage
lossless compression process. A run length precompressed
output is post processed by a Lempel-Ziv dictionary sliding
window dictionary encoder that outputs a succession of
fixed length data units. This yields a relatively high-speed
compression technique that provides a good match between
the capabilities and idiosyncrasies of the two encoding
techniques. This technique demonstrates the prior art of
employing sequential lossless encoders to increase the data
compression ratio.
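
The idea of two sequential lossless stages can be sketched in a few lines. This is an illustration under stated assumptions, not Craft's method: the run-length format (count byte followed by value byte) is invented for the example, and Python's `zlib` (DEFLATE, which combines a sliding-window Lempel-Ziv stage with Huffman coding) stands in for the second-stage dictionary encoder.

```python
import zlib


def rle_encode(data: bytes) -> bytes:
    """Stage 1: encode runs as (count, value) byte pairs, count in 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Invert rle_encode by expanding each (count, value) pair."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        out += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(out)


def two_stage_encode(data: bytes) -> bytes:
    """Stage 2 (zlib) is applied to the run-length output of stage 1."""
    return zlib.compress(rle_encode(data))


def two_stage_decode(blob: bytes) -> bytes:
    """Undo the stages in reverse order."""
    return rle_decode(zlib.decompress(blob))
```

Decoding must reverse the stages in the opposite order from encoding, which is why the decoder decompresses with `zlib` first and expands the runs second.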

U.S. Pat. No. 5,799,110 to Israelsen, et al. discloses an
adaptive threshold technique for achieving a constant bit rate
on a hierarchical adaptive multistage vector quantization. A
single compression technique is applied iteratively until the
residual is reduced below a prespecified threshold. The
threshold may be adapted to provide a constant bit rate
output. If the nth stage is reached without the residual being
less than the threshold, a smaller input vector is selected.

U.S. Pat. No. 5,819,215 to Dobson, et al. teaches a method
of applying either lossy or lossless compression to achieve
a desired subjective level of quality to the reconstructed
signal. In certain embodiments this technique utilizes a
combination of run-length and Huffman encoding to take
advantage of other local and global statistics. The tradeoffs
considered in the compression process are perceptible
distortion errors versus a fixed bit rate output.

SUMMARY OF THE INVENTION

The present invention is directed to systems and methods
for providing content independent lossless data compression
and decompression. In one aspect of the present invention,
a method for providing content independent lossless data
compression comprises the steps of:
(a) receiving as input a block of data from a stream of
data, the data stream comprising one of at least one data
block and a plurality of data blocks;
(b) counting the size of the input data block;
(c) encoding the input data block with a plurality of
lossless encoders to provide a plurality of encoded data
blocks;
(d) counting the size of each of the encoded data blocks;
(e) determining a lossless data compression ratio obtained
for each of the encoders by taking the ratio of the size
of the encoded data block output from the encoders to
the size of the input data block;
(f) comparing each of the determined compression ratios
with an a priori user specified compression threshold;
(g) selecting for output the input data block and appending
a null data type compression descriptor to the input
data block, if all of the encoder compression ratios fall
below the a priori specified compression threshold; and
(h) selecting for output the encoded data block having the
highest compression ratio and appending a corresponding
data type compression descriptor to the selected
encoded data block, if at least one of the compression
ratios exceeds the a priori specified compression threshold.
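
Steps (a) through (h) can be sketched as follows. This is a minimal illustration under several assumptions that do not come from the patent: three Python standard-library encoders stand in for the plurality of lossless encoders, the descriptor is a single byte placed in front of the block (the descriptor codes and `NULL_DESCRIPTOR` value are hypothetical), and the compression ratio is computed as input size divided by encoded size so that a higher ratio means better compression. Step (e) as worded states the ratio in the opposite orientation, so the orientation used here is a labeled assumption.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict

NULL_DESCRIPTOR = 0  # assumed code meaning "block is stored unencoded"

# Assumed descriptor codes for three stand-in lossless encoders.
ENCODERS: Dict[int, Callable[[bytes], bytes]] = {
    1: zlib.compress,
    2: bz2.compress,
    3: lzma.compress,
}


def compress_block(block: bytes, threshold: float = 1.0) -> bytes:
    """Encode one block with every encoder and keep the best result."""
    input_size = len(block)                              # step (b)
    best_descriptor, best_output, best_ratio = NULL_DESCRIPTOR, block, 0.0
    for descriptor, encode in ENCODERS.items():          # step (c)
        encoded = encode(block)
        ratio = input_size / len(encoded)                # steps (d), (e); assumed orientation
        if ratio > threshold and ratio > best_ratio:     # steps (f), (h)
            best_descriptor, best_output, best_ratio = descriptor, encoded, ratio
    # Step (g): if no encoder beat the threshold, the defaults above leave
    # the original block paired with the null descriptor.
    return bytes([best_descriptor]) + best_output
```

With the default threshold of 1.0, a block that no encoder can shrink is passed through unchanged behind the null descriptor, so the output is never more than one byte larger than the input.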
`
In another aspect of the present invention, a timer is
preferably added to measure the time elapsed during the
encoding process against an a priori-specified time limit.
When the time limit expires, only the data output from those
encoders that have completed the present encoding cycle are
compared to determine the encoded data with the highest
compression ratio. The time limit ensures that the real-time
or pseudo real-time nature of the data encoding is preserved.
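
The time-limited selection can be sketched with a thread pool. Everything specific here is an assumption for illustration: the descriptor codes, the `slow_encoder` stand-in for an encoder that misses the deadline, and the use of `concurrent.futures`. The sketch picks the smallest completed output, which is equivalent to the highest compression ratio for a fixed input block, and omits the threshold comparison for brevity. Python threads cannot be forcibly stopped, so an encoder that misses the deadline is simply ignored rather than halted.

```python
import bz2
import time
import zlib
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Tuple


def slow_encoder(block: bytes) -> bytes:
    """Hypothetical encoder that is too slow for a tight time limit."""
    time.sleep(0.5)
    return zlib.compress(block, 9)


# Assumed descriptor codes; 0 is reserved for "unencoded".
ENCODERS = {1: zlib.compress, 2: bz2.compress, 3: slow_encoder}


def compress_with_deadline(block: bytes, time_limit_s: float) -> Tuple[int, bytes]:
    """Run all encoders concurrently; compare only those done within the limit."""
    pool = ThreadPoolExecutor(max_workers=len(ENCODERS))
    futures = {pool.submit(enc, block): desc for desc, enc in ENCODERS.items()}
    done, _late = wait(futures, timeout=time_limit_s)
    pool.shutdown(wait=False)          # do not block on encoders that are late
    best = (0, block)                  # fall back to the unencoded block
    for future in done:
        encoded = future.result()
        if len(encoded) < len(best[1]):
            best = (futures[future], encoded)
    return best
```

For example, with a limit of a quarter of a second the half-second encoder never takes part in the comparison, so the result comes from one of the two encoders that finished in time.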
In another aspect of the present invention, the results from
each encoder are buffered to allow additional encoders to be
sequentially applied to the output of the previous encoder,
yielding a more optimal lossless data compression ratio.
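
Buffering each stage's output so that a further encoder can be applied to it might look like the sketch below. The function names, the choice of `zlib` followed by `bz2` as the cascade, and the policy of keeping the smallest buffered result are assumptions made for the example.

```python
import bz2
import zlib
from typing import Callable, Sequence, Tuple

Stage = Callable[[bytes], bytes]


def cascade_encode(block: bytes, stages: Sequence[Stage]) -> Tuple[int, bytes]:
    """Apply stages in sequence, buffering every intermediate output.

    Returns (depth, data), where depth is the number of stages applied
    to produce the smallest buffered result (0 means the raw block).
    """
    buffers = [block]                       # buffers[k] = output after k stages
    for encode in stages:
        buffers.append(encode(buffers[-1]))
    best_depth = min(range(len(buffers)), key=lambda k: len(buffers[k]))
    return best_depth, buffers[best_depth]


def cascade_decode(depth: int, data: bytes, inverse_stages: Sequence[Stage]) -> bytes:
    """Undo the first `depth` stages in reverse order."""
    for decode in reversed(inverse_stages[:depth]):
        data = decode(data)
    return data
```

Because every intermediate output is retained, a later stage that expands the data costs nothing: the earlier, smaller buffer is still available for selection.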
In another aspect of the present invention, a method for
providing content independent lossless data decompression
includes the steps of receiving as input a block of data from
a stream of data, extracting an encoding type descriptor from
the input data block, decoding the input data block with one
or more of a plurality of available decoders in accordance
with the extracted encoding type descriptor, and outputting
the decoded data block. An input data block having a null
descriptor type extracted therefrom is output without being
decoded.
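
The decompression steps can be sketched as a descriptor lookup. The layout assumed here (a one-byte descriptor in front of the payload), the descriptor codes, and the three standard-library decoders are illustrative assumptions, not details given in the text.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict

NULL_DESCRIPTOR = 0  # assumed code meaning "payload was never encoded"

# Assumed descriptor codes mapped to stand-in decoders.
DECODERS: Dict[int, Callable[[bytes], bytes]] = {
    1: zlib.decompress,
    2: bz2.decompress,
    3: lzma.decompress,
}


def decompress_block(block: bytes) -> bytes:
    """Extract the descriptor, then decode or pass the payload through."""
    if not block:
        raise ValueError("block must contain at least a descriptor byte")
    descriptor, payload = block[0], block[1:]
    if descriptor == NULL_DESCRIPTOR:
        return payload                      # null descriptor: output undecoded
    return DECODERS[descriptor](payload)    # decode with the matching decoder
```

An unknown descriptor raises `KeyError` in this sketch; a production decoder would need an explicit policy for that case.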
`Advantageously, the present invention employsa plurality
`of encoders applying a plurality of compression techniques
`on an input data stream so as to achieve maximum com-
`pression in accordance with the real-time or pseudo real-
`time data rate constraint. Thus, the outputbit rate is not fixed
`and the amount, if any, of permissible data quality degra-
`dation is not adaptable, but is user or data specified.
`The present invention is realized due to recent improve-
`ments in processing speed,inclusive of dedicated analog and
`digital hardware circuits, central processing units, (and any
`hybrid combinations thereof), which, coupled with reduc-
`tions in cost, are enabling of new content independent data
`compression and decompression solutions.
`These and other aspects, features and advantages of the
`present invention will become apparent from the following
`detailed description of preferred embodiments, which is to
`be read in connection with the accompanying drawings.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a block/flow diagram of a content dependent
`high-speed lossless data compression and decompression
`system/method according to the prior art;
`FIG. 2 is a block diagram of a content independent data
`compression system according to one embodiment of the
`present invention;
`FIGS. 3a and 3b comprise a flow diagram of a data
`compression method according to one aspect of the present
`invention which illustrates the operation of the data com-
`pression system of FIG. 2;
`FIG. 4 is a block diagram of a content independent data
`compression system according to another embodiment of the
`present invention having an enhanced metric for selecting an
`optimal encoding technique;
`FIGS. 5a and 5b comprise a flow diagram of a data
`compression method according to another aspect of the
`present invention which illustrates the operation of the data
`compression system of FIG. 4;
`FIG. 6 is a block diagram of a content independent data
`compression system according to another embodiment of the
`present invention having an a priori specified timer that
`provides real-time or pseudo real-time of output data;
`FIGS. 7a and 7b comprise a flow diagram of a data
`compression method according to another aspect of the
`present invention which illustrates the operation of the data
`compression system of FIG. 6;
`FIG. 8 is a block diagram of a content independent data
`compression system according to another embodiment hav-
`ing an a priori specified timer that provides real-time or
`pseudo real-time of output data and an enhanced metric for
`selecting an optimal encoding technique;
`FIG. 9 is a block diagram of a content independent data
`compression system according to another embodiment of the
`present invention having an encoding architecture compris-
`ing a plurality of sets of serially-cascaded encoders;
`FIGS. 10a and 10b comprise a flow diagram of a data
`compression method according to another aspect of the
`present invention which illustrates the operation of the data
`compression system of FIG. 9;
`FIG. 11 is a block diagram of a content independent data
`decompression system according to one embodiment of the
`present invention; and
`FIG. 12 is a flow diagram of a data decompression method
`according to one aspect of the present invention which
`illustrates the operation of the data compression system of
`FIG.
