(12) United States Patent
Fallon

(10) Patent No.: US 6,195,024 B1
(45) Date of Patent: Feb. 27, 2001

(54) CONTENT INDEPENDENT DATA COMPRESSION METHOD AND SYSTEM

(75) Inventor: James J. Fallon, Bronxville, NY (US)

(73) Assignee: Realtime Data, LLC, New York, NY (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/210,491

(22) Filed: Dec. 11, 1998

(51) Int. Cl.7: H03M 7/34; H03M 7/00
(52) U.S. Cl.: 341/51; 341/79
(58) Field of Search: 341/51, 79, 67; 709/231, 219, 236, 250; 358/1.1; 712/32; 711/208

(56) References Cited

U.S. PATENT DOCUMENTS

4,872,009    10/1989  Tsukiyama et al.
4,929,946    5/1990   O'Brien et al.
5,045,852    9/1991   Mitchell et al.
5,097,261    3/1992   Langdon, Jr. et al.
5,175,543 *  12/1992  Lantz ... 341/51

(List continued on next page.)

(74) Attorney, Agent, or Firm: Frank V. DeRosa; F. Chau & Associates, LLP

(57) ABSTRACT

Systems and methods for providing content independent lossless data
compression and decompression. A data compression system includes a
plurality of encoders that are configured to simultaneously or
sequentially compress data independent of the data content. The results
of the various encoders are compared to determine if compression is
achieved and to determine which encoder yields the highest lossless
compression ratio. The encoded data with the highest lossless
compression ratio is then selected for subsequent data processing,
storage, or transmittal. A compression identification descriptor may be
appended to the encoded data with the highest compression ratio to
enable subsequent decompression and data interpretation. Furthermore, a
timer may be added to measure the time elapsed during the encoding
process against an a priori-specified time limit. When the time limit
expires, only the data output from those encoders that have completed
the encoding process are compared. The encoded data with the highest
compression ratio is selected for data processing, storage, or
transmittal. The imposed time limit ensures that the real-time or pseudo
real-time nature of the data encoding is preserved. Buffering the output
from each encoder allows additional encoders to be sequentially applied
to the output of the previous encoder, yielding a more optimal lossless
data compression ratio.
`
`Primary Examiner—Patrick Wamsley
`
`34 Claims, 16 Drawing Sheets
`
[Front-page drawing: block diagram. Recoverable labels: data stream; input data buffer; counter; encoders E1, E2, E3 ... En, each with a buffer/counter 1, 2, 3 ... n; compression ratio determination/comparison; compression type description; encoded data stream w/ descriptor. Legible reference numerals: 30, 40.]
`NetApp; Rackspace Exhibit 1004 Page 1
`
`

`

U.S. PATENT DOCUMENTS (continued)

5,212,742    5/1993   Normile et al.
[number and date illegible]  [name illegible] et al.
5,243,341    [date illegible]  Seroussi et al.
5,243,348    9/1993   Jackson
5,270,832    12/1993  Balkanski et al.
5,379,036    1/1995   Storer
5,381,145    1/1995   Allen et al.
5,394,534    2/1995   Kulakowski et al.
5,412,384    5/1995   Chang et al. ... 341/79
5,461,679    10/1995  Normile et al.
5,467,087    11/1995  Chu
5,471,206    11/1995  Allen et al.
5,479,587    12/1995  Campbell et al.
5,486,826    1/1996   Remillard
5,495,244    2/1996   Je-Chang et al.
5,533,051    7/1996   James
5,583,500    12/1996  Allen et al.
5,627,534    5/1997   Craft
5,654,703    8/1997   Clark, II
5,668,737    9/1997   Tler
5,717,393    2/1998   Nakano et al.
5,717,394    2/1998   Schwartz et al.
5,720,228    3/1998   Franaszek et al.
5,748,904    5/1998   Huang et al.
5,771,340    6/1998   Nakazato et al.
5,784,572    7/1998   Rostoker et al.
5,799,110    8/1998   Israelsen et al.
5,805,932    9/1998   Kawashima et al.
5,809,176    9/1998   Yajima
5,818,368    10/1998  Langley
5,818,530    10/1998  Canfield et al.
5,819,215    10/1998  Dobson et al.
5,825,424    10/1998  Canfield et al.
5,847,762    12/1998  Canfield et al.
5,861,824    1/1999   Ryu et al.
5,917,438    6/1999   Ando
5,964,842    10/1999  Packard
5,991,515    11/1999  Fall et al.
6,031,939    2/2000   Gilbert et al.

* cited by examiner

[Sheet 1 of 16. FIG. 1 (PRIOR ART): block/flow diagram. Compression side: input data stream; identify input data type and generate data type identification signal; data type ID signal; compress data in accordance with identified data type; compressed data stream. Decompression side: retrieve data type information of compressed data stream; decompress data in accordance with identified data type.]

[Sheet 2 of 16. FIG. 2: block diagram; labels are rotated in the scan. Recoverable labels: input data stream; data block; counter; encoders E1, E2, E3 ... En; buffer/counters 1, 2, 3 ... n; compression ratio determination/comparison; compression type description; encoded data stream w/ descriptor.]

[Sheet 3 of 16. FIG. 3a: flow diagram. Box labels: receive initial data block from input data stream; count size of data block; buffer data block; compress data block with enabled encoders; buffer encoded data block output from each encoder; count size of encoded data blocks; calculate compression ratios; compare compression ratios with threshold limit; connector A. Legible reference numerals: 300, 302, 304, 308, 310, 312, 314.]

[Sheet 4 of 16. FIG. 3b: flow diagram. Box labels: is compression ratio of at least one encoded data block greater than threshold?; append null descriptor to unencoded input data block; output unencoded data block with null descriptor; select encoded data block with greatest compression ratio; append corresponding descriptor; output encoded data block with descriptor; more data blocks in input stream?; receive next data block from input stream; terminate data compression process. Legible reference numerals: 316, 330.]

[Sheet 5 of 16. FIG. 4: block diagram; labels are rotated in the scan. Recoverable labels: input data stream; data block buffer; counter; encoders E1, E2, E3 ... En; buffer/counters 1, 2, 3 ... n; encoder desirability factors; figure of merit determination; compression type description; encoded data stream w/ descriptor.]

[Sheet 6 of 16. FIG. 5a: flow diagram. Box labels: connector B; receive initial data block from input data stream; count size of data block; buffer data block; compress data block with enabled encoders; append corresponding desirability factors to encoded data blocks; buffer encoded data block output from each encoder; count size of encoded data blocks; calculate compression ratios; compare compression ratios with threshold limit. Legible reference numerals: 500, 502, 504, 508, 510, 516, and two read by the scan as 912 and 914 (likely 512 and 514).]

[Sheet 7 of 16. FIG. 5b: flow diagram. Box labels: connector A; is compression ratio of at least one encoded data block greater than threshold?; append null descriptor to unencoded input data block; output unencoded data block with null descriptor; calculate figure of merit for each encoded data block which exceeds threshold; select encoded data block with greatest figure of merit; append corresponding descriptor; output encoded data block with descriptor; more data blocks in input stream? (YES branch legible); receive next data block from input stream; terminate data compression process; connector B. Legible reference numerals: 518, 520, 522, and one read by the scan as 934 (likely 534).]

[Sheet 8 of 16. Apparently FIG. 6: block diagram; labels are rotated and partly fused in the scan. Recoverable labels: input data stream; data block buffer; counter; encoders E1, E2, E3; buffer/counters; compression ratio determination/comparison; compression type description; encoded data stream w/ descriptor; timer with user-specified time.]

[Sheet 9 of 16. FIG. 7a: flow diagram. Box labels: input initial data block from input data stream; count size of data block; buffer data block; initialize timer; begin compressing data block with encoders; time expired?; encoding complete?; stop encoding process; buffer encoded data block output from each encoder; buffer encoded data block for each encoder that completed encoding process within time limit; count size of encoded data blocks; calculate compression ratios; compare compression ratios with threshold limit; connector B. Legible reference numerals: 700, 702, 704, 706, 708, 710, 720, 722, 724.]

[Sheet 10 of 16. FIG. 7b: flow diagram. Box labels: is compression ratio of at least one encoded data block greater than threshold?; append null descriptor to unencoded input data block; output unencoded data block with null descriptor; select encoded data block with greatest compression ratio; append corresponding descriptor; output encoded data block with descriptor; more data blocks in input stream?; receive next data block from input stream; terminate data compression process. Legible reference numerals: 726, 740.]

[Sheet 11 of 16. Apparently FIG. 8: block diagram; labels are rotated and partly fused in the scan. Recoverable labels: input data stream; data block buffer; counter; encoders; buffer/counters; compression ratio determination/comparison; encoder desirability factors; figure of merit determination; compression type description; timer with user-specified time; encoded data stream w/ descriptor.]

[Sheet 12 of 16. Block diagram; figure number not legible, and labels are rotated and heavily fused in the scan. Recoverable labels: input data stream; data block; compression ratio determination; figure of merit determination/comparison; buffer/counter; encoder desirability factors; compression type description; timer with user-specified time; encoded data stream w/ descriptor.]

[Sheet 13 of 16. FIG. 10a: flow diagram. Box labels: connector B; receive initial data block from input data stream; count size of data block; buffer data block; initialize timer; apply input data block to first encoding stage in cascaded encoder paths; time expired?; apply output of completed encoding stage to next encoding stage in cascade path; buffer encoded data block output from completed encoding stage; stop encoding process; select buffered output of last encoding stage in encoder cascade that completed encoding process within time limit; count size of encoded data blocks; calculate compression ratios; compare compression ratios with threshold limit; connector A. Legible reference numerals: 100, 102, 104, 106, 108, 116, 122, 124, 126, and one read by the scan as 7110 (likely 110).]

[Sheet 14 of 16. FIG. 10b: flow diagram. Box labels: is compression ratio of at least one encoded data block greater than threshold?; append null descriptor to unencoded input data block; output unencoded data block with null descriptor; calculate figure of merit for each encoded data block which exceeds threshold; select encoded data block with greatest figure of merit; append corresponding descriptor; output encoded data block with descriptor; more data blocks in input stream? (YES branch legible); receive next data block from input stream; terminate data compression process; connector B. Legible reference numerals: 128, 130, 132, 134, 136, 138, 140, 144.]

[Sheet 15 of 16. FIG. 11: block diagram; labels are rotated in the scan. Recoverable labels: input data stream; data block buffer; descriptor extraction; decoders D1, D2, D3 ... Dn; data w/ null descriptor; output data buffer; output data stream. Legible reference numerals: 1100, 1102, 1104, 1106.]

[Sheet 16 of 16. FIG. 12: flow diagram. Box labels: receive initial data block from input data stream; buffer data block; extract data compression type descriptor; is data compression type descriptor null?; output undecoded data block; select decoder(s) corresponding to descriptor; decode data block using selected decoder(s); output decoded data block; more data blocks in input stream?; receive next data block in input stream; terminate decoding process (NO branch legible). Legible reference numerals: 1200, 1202, 1204, 1208, 1218.]

`CONTENT INDEPENDENT DATA
`COMPRESSION METHOD AND SYSTEM
`
`BACKGROUND
`
`1. Technical Field
`
The present invention relates generally to data compression and
decompression and, more particularly, to systems and methods for
providing content independent lossless data compression and
decompression.
`2. Description of the Related Art
`Information may be represented in a variety of manners.
`Discrete information such as text and numbers are easily
`represented in digital data. This type of data representation
`is known as symbolic digital data. Symbolic digital data is
`thus an absolute representation of data such as a letter,
`figure, character, mark, machine code, or drawing.
Continuous information such as speech, music, audio, images and video
frequently exists in the natural world as analog information. As is
well-known to those skilled in the art, recent advances in very large
scale integration (VLSI) digital computer technology have enabled both
discrete and analog information to be represented with digital data.
Continuous information represented as digital data is often referred to
as diffuse data. Diffuse digital data is thus a representation of data
that is of low information density and is typically not easily
recognizable to humans in its native form.
`
`There are many advantages associated with digital data
`representation. For instance, digital data is more readily
`processed, stored, and transmitted due to its inherently high
`noise immunity. In addition, the inclusion of redundancy in
`digital data representation enables error detection and/or
`correction. Error detection and/or correction capabilities are
`dependent upon the amount and type of data redundancy,
`available error detection and correction processing, and
`extent of data corruption.
One outcome of digital data representation is the continuing need for
increased capacity in data processing, storage, and transmittal. This is
especially true for diffuse data where increases in fidelity and
resolution create exponentially greater quantities of data. Data
compression is widely used to reduce the amount of data required to
process, transmit, or store a given quantity of information. In general,
there are two types of data compression techniques that may be utilized
either separately or jointly to encode/decode data: lossless and lossy
data compression.
Lossy data compression techniques provide for an inexact representation
of the original uncompressed data such that the decoded (or
reconstructed) data differs from the original unencoded/uncompressed
data. Lossy data compression is also known as irreversible or noisy
compression. Entropy is defined as the quantity of information in a
given set of data. Thus, one obvious advantage of lossy data compression
is that the compression ratios can be larger than the entropy limit, all
at the expense of information content. Many lossy data compression
techniques seek to exploit various traits within the human senses to
eliminate otherwise imperceptible data. For example, lossy data
compression of visual imagery might seek to delete information content
in excess of the display resolution or contrast ratio.
On the other hand, lossless data compression techniques provide an exact
representation of the original uncompressed data. Simply stated, the
decoded (or reconstructed) data is identical to the original
unencoded/uncompressed data. Lossless data compression is also known as
reversible or noiseless compression. Thus, lossless data compression
has, as its current limit, a minimum representation defined by the
entropy of a given data set.
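
As a rough illustration of the entropy limit described above, the following sketch estimates the zeroth-order Shannon entropy of a block in bits per byte. It is a simplification under a stated assumption: bytes are treated as independent symbols, so the figure is only an estimate of the true information content. The function name `byte_entropy_bits` is hypothetical and does not come from the patent.

```python
import math
from collections import Counter


def byte_entropy_bits(block: bytes) -> float:
    """Estimate entropy in bits per byte, treating bytes as independent symbols."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    # Shannon entropy written as sum(p * log2(1/p)) over the observed byte values.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    uniform = bytes(range(256)) * 4   # every byte value equally likely
    constant = b"A" * 1024            # a single repeated symbol
    print(byte_entropy_bits(uniform))
    print(byte_entropy_bits(constant))
```

Under this model, a block whose estimate is close to 8 bits per byte leaves a lossless encoder little room, while a low estimate suggests substantial redundancy.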
There are various problems associated with the use of lossless
compression techniques. One fundamental problem encountered with most
lossless data compression techniques is their content sensitive
behavior. This is often referred to as data dependency. Data dependency
implies that the compression ratio achieved is highly contingent upon
the content of the data being compressed. For example, database files
often have large unused fields and high data redundancies, offering the
opportunity to losslessly compress data at ratios of 5 to 1 or more. In
contrast, concise software programs have little to no data redundancy
and, typically, will not losslessly compress better than 2 to 1.
Another problem with lossless compression is that there are significant
variations in the compression ratio obtained when using a single
lossless data compression technique for data streams having different
data content and data size. This process is known as natural variation.
`A further problem is that negative compression may occur
`when certain data compression techniques act upon many
`types of highly compressed data. Highly compressed data
`appears random and many data compression techniques will
`substantially expand, not compress this type of data.
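
The data dependency and negative compression described above can be observed with a short experiment. This sketch uses Python's standard `zlib` (DEFLATE) as a stand-in for "a single lossless technique"; the sample inputs and the helper name `compression_ratio` are illustrative assumptions, and seeded pseudo-random bytes stand in for already-compressed data.

```python
import random
import zlib


def compression_ratio(block: bytes) -> float:
    """Return input size divided by DEFLATE-encoded size (above 1.0 means the block shrank)."""
    return len(block) / len(zlib.compress(block))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    redundant = b"name=;addr=;" * 512  # database-like, highly repetitive
    random_like = bytes(rng.getrandbits(8) for _ in range(6144))
    print(f"redundant:   {compression_ratio(redundant):.2f}")
    print(f"random-like: {compression_ratio(random_like):.2f}")  # below 1.0 means expansion
```

DEFLATE falls back to stored blocks on incompressible input, so the expansion in this experiment is small; other techniques may expand such data by more.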
For a given application, there are many factors which govern the
applicability of various data compression techniques. These factors
include compression ratio, encoding and decoding processing
requirements, encoding and decoding time delays, compatibility with
existing standards, and implementation complexity and cost, along with
the adaptability and robustness to variations in input data. A direct
relationship exists in the current art between compression ratio and the
amount and complexity of processing required. One of the limiting
factors in most existing prior art lossless data compression techniques
is the rate at which the encoding and decoding processes are performed.
Hardware and software implementation tradeoffs are often dictated by
encoder and decoder complexity along with cost.
Another problem associated with lossless compression methods is
determining the optimal compression technique for a given set of input
data and intended application. To combat this problem, there are many
conventional content dependent techniques which may be utilized. For
instance, filetype descriptors are typically appended to file names to
describe the application programs that normally act upon the data
contained within the file. In this manner data types, data structures,
and formats within a given file may be ascertained. Fundamental problems
with this content dependent technique are:
(1) the extremely large number of application programs, some of which
do not possess published or documented file formats, data structures,
or data type descriptors;
(2) the ability for any data compression supplier or consortium to
acquire, store, and access the vast amounts of data required to
identify known file descriptors and associated data types, data
structures, and formats; and
(3) the rate at which new application programs are developed and the
need to update file format data descriptions accordingly.
An alternative technique that approaches the problem of selecting an
appropriate lossless data compression technique is disclosed in U.S.
Pat. No. 5,467,087 to Chu entitled "High Speed Lossless Data Compression
System" ("Chu"). FIG. 1 illustrates an embodiment of this data
compression and decompression technique. Data compression 1 comprises
two phases, a data pre-compression phase 2 and a data compression phase
3. Data decompression 4 of a compressed input data stream is also
comprised of two phases, a data type retrieval phase 5 and a data
decompression phase 6. During the data compression process 1, the data
pre-compressor 2 accepts an uncompressed data stream, identifies the
data type of the input stream, and generates a data type identification
signal. The data compressor 3 selects a data compression method from a
preselected set of methods to compress the input data stream, with the
intention of producing the best available compression ratio for that
particular data type.
There are several problems associated with the Chu method. One such
problem is the need to unambiguously identify various data types. While
these might include such common data types as ASCII, binary, or Unicode,
there, in fact, exists a broad universe of data types that fall outside
the three most common data types. Examples of these alternate data types
include: signed and unsigned integers of various lengths, differing
types and precision of floating point numbers, pointers, other forms of
character text, and a multitude of user defined data types.
Additionally, data types may be interspersed or partially compressed,
making data type recognition difficult and/or impractical. Another
problem is that given a known data type, or mix of data types within a
specific set or subset of input data, it may be difficult and/or
impractical to predict which data encoding technique yields the highest
compression ratio.
Chu discloses an alternate embodiment wherein a data compression rate
control signal is provided to adjust specific parameters of the selected
encoding algorithm to adjust the compression time for compressing data.
One problem with this technique is that the length of time to compress a
given set of input data may be difficult or impractical to predict.
Consequently, there is no guarantee that a given encoding algorithm or
set of encoding algorithms will perform for all possible combinations of
input data for a specific timing constraint. Another problem is that, by
altering the parameters of the encoding process, it may be difficult
and/or impractical to predict the resultant compression ratio.
Other conventional techniques have been implemented to address the
aforementioned problems. For instance, U.S. Pat. No. 5,243,341 to
Seroussi et al. describes a class of Lempel-Ziv lossless data
compression algorithms that utilize a memory based dictionary of finite
size to facilitate the compression and decompression of data. A second
standby dictionary is included comprised of those encoded data entries
that compress the greatest amount of input data. When the current
dictionary fills up and is reset, the standby dictionary becomes the
current dictionary, thereby maintaining a reasonable data compression
ratio and freeing up memory for newly encoded data strings. Multiple
dictionaries are employed within the same encoding technique to increase
the lossless data compression ratio. This technique demonstrates the
prior art of using multiple dictionaries within a single encoding
process to aid in reducing the data dependency of a single encoding
technique. One problem with this method is that it does not address the
difficulties in dealing with a wide variety of data types.
U.S. Pat. No. 5,717,393 to Nakano, et al. teaches a plurality of code
tables, such as a high-usage code table and a low-usage code table, in
an entropy encoding unit. A block-sorted last character string from a
block-sorting transforming unit is transformed by the move-to-front
transforming unit into a move-to-front (MTF) code string. The entropy
encoding unit switches the code tables at a discontinuous part of the
MTF code string to perform entropy coding. This technique increases the
compression rate without extending the block size. Nakano employs
multiple code tables within a single entropy encoding unit to increase
the lossless data compression ratio for a given block size, somewhat
reducing the data dependency of the encoding algorithm. Again, the
problem with this technique is that it does not address the difficulties
in dealing with a wide variety of data types.
U.S. Pat. No. 5,809,176 to Yajima discloses a technique of dividing
native or uncompressed image data into a plurality of streams for
subsequent encoding by a plurality of identically functioning arithmetic
encoders. This method demonstrates the technique of employing multiple
encoders to reduce the time of encoding for a single method of
compression.
U.S. Pat. Nos. 5,583,500 and 5,471,206 to Allen, et al. disclose systems
for parallel decompression of a data stream comprised of multiple code
words. At least two code words are decoded simultaneously to enhance the
decoding process. This technique demonstrates the prior art of utilizing
multiple decoders to expedite the data decompression process.
`
U.S. Pat. No. 5,627,534 to Craft teaches a two-stage lossless
compression process. A run length precompressed output is post processed
by a Lempel-Ziv sliding window dictionary encoder that outputs a
succession of fixed length data units. This yields a relatively
high-speed compression technique that provides a good match between the
capabilities and idiosyncrasies of the two encoding techniques. This
technique demonstrates the prior art of employing sequential lossless
encoders to increase the data compression ratio.
U.S. Pat. No. 5,799,110 to Israelsen, et al. discloses an adaptive
threshold technique for achieving a constant bit rate on a hierarchical
adaptive multistage vector quantization. A single compression technique
is applied iteratively until the residual is reduced below a
prespecified threshold. The threshold may be adapted to provide a
constant bit rate output. If the nth stage is reached without the
residual being less than the threshold, a smaller input vector is
selected.
U.S. Pat. No. 5,819,215 to Dobson, et al. teaches a method of applying
either lossy or lossless compression to achieve a desired subjective
level of quality to the reconstructed signal. In certain embodiments
this technique utilizes a combination of run-length and Huffman encoding
to take advantage of other local and global statistics. The tradeoffs
considered in the compression process are perceptible distortion errors
versus a fixed bit rate output.
`SUMMARY OF THE INVENTION
`
The present invention is directed to systems and methods for providing
content independent lossless data compression and decompression. In one
aspect of the present invention, a method for providing content
independent lossless data compression comprises the steps of:
(a) receiving as input a block of data from a stream of data, the data
stream comprising one of at least one data block and a plurality of
data blocks;
(b) counting the size of the input data block;
(c) encoding the input data block with a plurality of lossless encoders
to provide a plurality of encoded data blocks;
(d) counting the size of each of the encoded data blocks;
(e) determining a lossless data compression ratio obtained for each of
the encoders by taking the ratio of the size of the encoded data block
output from the encoders to the size of the input data block;
(f) comparing each of the determined compression ratios with an a
priori user specified compression threshold;
(g) selecting for output the input data block and appending a null data
type compression descriptor to the input data block, if all of the
encoder compression ratios fall below the a priori specified
compression threshold; and
(h) selecting for output the encoded data block having the highest
compression ratio and appending a corresponding data type compression
descriptor to the selected encoded data block, if at least one of the
compression ratios exceeds the a priori specified compression
threshold.
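
Steps (a) through (h) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the encoders are Python's standard `zlib`, `bz2`, and `lzma` modules; the descriptor values 0 to 3 are hypothetical; the descriptor is written as a one-byte prefix for simplicity; and the ratio is computed as input size over encoded size so that a larger value means better compression, whereas step (e) words the ratio in the opposite order.

```python
import bz2
import lzma
import zlib
from typing import Callable, Dict, Iterable, Iterator, Tuple

NULL_DESCRIPTOR = 0  # hypothetical value marking an unencoded block

# Hypothetical descriptor values mapped to standard-library lossless encoders.
ENCODERS: Dict[int, Callable[[bytes], bytes]] = {
    1: zlib.compress,
    2: bz2.compress,
    3: lzma.compress,
}


def compress_block(block: bytes, threshold: float = 1.0) -> Tuple[int, bytes]:
    """Encode one block with every encoder and keep the best result."""
    input_size = len(block)  # step (b)
    best_descriptor, best_payload, best_ratio = NULL_DESCRIPTOR, block, 0.0
    for descriptor, encode in ENCODERS.items():
        encoded = encode(block)  # step (c)
        # Steps (d)-(e): assumed orientation, input size over encoded size.
        ratio = input_size / len(encoded)
        if ratio > threshold and ratio > best_ratio:  # steps (f) and (h)
            best_descriptor, best_payload, best_ratio = descriptor, encoded, ratio
    # Step (g): if no encoder beat the threshold, the block passes through unencoded.
    return best_descriptor, best_payload


def compress_stream(blocks: Iterable[bytes]) -> Iterator[bytes]:
    """Step (a): handle each block independently, prefixing a one-byte descriptor."""
    for block in blocks:
        descriptor, payload = compress_block(block)
        yield bytes([descriptor]) + payload
```

Because every block carries its own descriptor, the selection is made from the measured output sizes alone and never requires identifying the data type of the input.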
In another aspect of the present invention, a timer is preferably added
to measure the time elapsed during the encoding process against an a
priori-specified time limit. When the time limit expires, only the data
output from those encoders that have completed the present encoding
cycle are compared to determine the encoded data with the highest
compression ratio. The time limit ensures that the real-time or pseudo
real-time nature of the data encoding is preserved.
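
One way to picture the timer is to run the encoders concurrently and, when the limit expires, compare only the outputs that are already complete. The sketch below makes several assumptions: it uses threads from Python's `concurrent.futures` (the `cancel_futures` argument requires Python 3.9 or later), the same hypothetical descriptor values as a one-byte tag, and a ratio of input size over encoded size. A Python thread that is already running cannot be interrupted, so a late encoder finishes in the background and its output is ignored.

```python
import bz2
import lzma
import zlib
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Tuple

NULL_DESCRIPTOR = 0  # hypothetical marker for an unencoded block
ENCODERS = {1: zlib.compress, 2: bz2.compress, 3: lzma.compress}  # hypothetical values


def compress_block_timed(
    block: bytes, time_limit_s: float, threshold: float = 1.0
) -> Tuple[int, bytes]:
    """Run all encoders concurrently; compare only those finished within the limit."""
    pool = ThreadPoolExecutor(max_workers=len(ENCODERS))
    futures = {pool.submit(encode, block): d for d, encode in ENCODERS.items()}
    done, _not_done = wait(futures, timeout=time_limit_s)  # the timer
    # Do not block on stragglers; their results are simply never read.
    pool.shutdown(wait=False, cancel_futures=True)

    best_descriptor, best_payload, best_ratio = NULL_DESCRIPTOR, block, 0.0
    for future in done:
        encoded = future.result()
        ratio = len(block) / len(encoded)  # assumed orientation: larger is better
        if ratio > threshold and ratio > best_ratio:
            best_descriptor, best_payload, best_ratio = futures[future], encoded, ratio
    return best_descriptor, best_payload
```

With a very short limit, no encoder may finish in time, in which case this sketch returns the unencoded block with the null descriptor.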
In another aspect of the present invention, the results from each
encoder are buffered to allow additional encoders to be sequentially
applied to the output of the previous encoder, yielding a more optimal
lossless data compression ratio.
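
The buffering idea can be sketched as a cascade in which each stage consumes the buffered output of the previous one. This is an illustrative assumption-laden sketch: the stage identifiers are hypothetical, the stages are standard-library codecs, and the rule of keeping the smallest buffered result is a design choice made here for the example rather than something the text above prescribes.

```python
import bz2
import zlib
from typing import List, Sequence, Tuple

# Hypothetical stage identifiers; each stage pairs an encoder with its decoder.
STAGES = {
    1: (zlib.compress, zlib.decompress),
    2: (bz2.compress, bz2.decompress),
}


def encode_cascade(block: bytes, path: Sequence[int]) -> Tuple[List[int], bytes]:
    """Apply the stages in `path` in order, buffering each output; keep the smallest."""
    buffers = [block]  # buffers[i] holds the output after i stages
    for stage_id in path:
        encode, _ = STAGES[stage_id]
        buffers.append(encode(buffers[-1]))  # next encoder consumes the previous output
    # Pick the shortest buffered result; ties favor fewer stages.
    best = min(range(len(buffers)), key=lambda i: len(buffers[i]))
    return list(path[:best]), buffers[best]


def decode_cascade(applied: Sequence[int], payload: bytes) -> bytes:
    """Undo the applied stages in reverse order."""
    for stage_id in reversed(applied):
        _, decode = STAGES[stage_id]
        payload = decode(payload)
    return payload
```

The list of applied stages plays the role of the descriptor: the decoder needs it to know which decoders to run and in what order.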
`In another aspect of the present invention, a method for
`providing content independentlossless data decompression
`includes the steps of receiving as input a block of data from
`a stream ofdata, extracting an encoding type descriptor from
`the input data block, decoding the input data block with one
`or more of a plurality of available decoders in accordance
`with the extracted encoding type descriptor, and outputting
`the decoded data block. An input data block having a null
`descriptor type extracted therefrom is output without being
`decoded.
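The decompression steps above can be sketched as a descriptor lookup. The framing used here, a single leading descriptor byte followed by the payload, and the descriptor values themselves are assumptions made for illustration; the source does not define the layout. The sketch also maps each descriptor to exactly one decoder, which is a simplification of "one or more of a plurality of available decoders".

```python
import bz2
import lzma
import zlib

# Hypothetical descriptor values, chosen only for this sketch.
NULL_DESCRIPTOR = 0
DECODERS = {
    1: zlib.decompress,
    2: bz2.decompress,
    3: lzma.decompress,
}


def decode_block(framed: bytes) -> bytes:
    """Extract the descriptor from a framed block and decode the payload."""
    if not framed:
        raise ValueError("empty input block")
    descriptor, payload = framed[0], framed[1:]
    if descriptor == NULL_DESCRIPTOR:
        # A null descriptor means the block was never encoded.
        return payload
    try:
        decode = DECODERS[descriptor]
    except KeyError:
        raise ValueError(f"unknown descriptor {descriptor}") from None
    return decode(payload)
```

For example, `decode_block(bytes([1]) + zlib.compress(data))` would be expected to return `data`, and a block framed with the null descriptor is returned without being decoded.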
`Advantageously, the present invention employs a plurality
`of encoders applying a plurality of compression techniques
`on an input data stream so as to achieve maximum
`compression in accordance with the real-time or pseudo
`real-time data rate constraint. Thus, the output bit rate is not
`fixed and the amount, if any, of permissible data quality
`degradation is not adaptable, but is user or data specified.
`The present invention is realized due to recent improvements
`in processing speed, inclusive of dedicated analog and digital
`hardware circuits, central processing units (and any hybrid
`combinations thereof), which, coupled with reductions in
`cost, enable new content independent data compression
`and decompression solutions.
`These and other aspects, features and advantages of the
`present invention will become apparent from the following
`detailed description of preferred embodiments, which is to
`be read in connection with the accompanying drawings.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a block/flow diagram of a content dependent
`high-speed lossless data compression and decompression
`system/method according to the prior art;
`FIG. 2 is a block diagram of a content independent data
`compression system according to one embodiment of the
`present invention;
`FIGS. 3a and 3b comprise a flow diagram of a data
`compression method according to one aspect of the present
`invention which illustrates the operation of the data com-
`pression system of FIG. 2;
`FIG. 4 is a block diagram of a content independent data
`compression system according to another embodiment of the
`present invention having an enhanced metric for selecting an
`optimal encoding technique;
`
`FIGS. 5a and 5b comprise a flow diagram of a data
`compression method according to another aspect of the
`present invention which illustrates the operation of the data
`compression system of FIG. 4;
`FIG. 6 is a block diagram of a content independent data
`compression system according to another embodiment of the
`present invention having an a priori specified timer that
`provides real-time or pseudo real-time of output data;
`FIGS. 7a and 7b co
