throbber
US 6,195,024 B1
`(10) Patent N0.:
`(12) Unlted States Patent
`
`Fallon
`(45) Date of Patent:
`Feb. 27, 2001
`
`U5006195024B1
`
`(54) CONTENT INDEPENDENT DATA
`COMPRESSION METHOD AND SYSTEM
`
`(74) Attorney, Agent, or Firm—Frank V. DeRosa; F. Chau
`& Associates, LLP
`
`(75)
`
`Inventor:
`
`James J. Fallon, Bronxville, NY (US)
`
`(57)
`
`ABSTRACT
`
`(73) Assignee; Realtime Data, LLC, New York, NY
`(US)
`
`( * ) Notice:
`
`Subject to any diSClaimer, the term Of this
`PalenI l5 eXlended 0r adjllSled under 35
`U~S-C~ 154(b) by 0 daYS~
`
`(21) Appl. No.1 09/210,491
`(22)
`Filed:
`Dec. 11 1998
`’
`Int. Cl.7 ............................ .. H03M 7/34; H03M 7/00
`(51)
`(52) US. Cl.
`............................................... .. 341/51; 341/79
`(58) Field of Search ................................ .. 341/51, 79, 67;
`709/231, 219, 236, 250; 358/11; 712/32;
`711/208
`
`(56)
`
`References Cited
`Us. PATENT DOCUMENTS
`
`~
`
`10/1989 TSUklyama et al'
`498727009
`5/1990 0 Brlen et a1.
`.
`4,929,946
`9/1991 Mitchell et a1.
`.
`5 045 852
`.
`3/1992 Langdon, Jr. et a1.
`5:097:261
`5,175,543 * 12/1992 Lantz ................................... .. 341/51
`.
`.
`(L1st continued on next page.)
`
`Systems and methods for providing content independent
`lossless data compression and decompression. A data com-
`pression system includes a plurality of encoders that are
`configured to simultaneously or sequentially compress data
`independent of the data content. The results of the various
`encoders are compared to determine if compression is
`achieved and to determine Which encoder yields the highest
`lossless compression ratio. The encoded data With the high-
`est lossless compression ratio is then selected for subsequent
`data processing, storage, or transmittal. A compression iden-
`tification descriptor may be appended to the encoded data
`With the highest compression ratio to enable subsequent
`decompression and data interpretation. Furthermore, a timer
`may be added to measure the time elapsed during the
`encoding process against an a priori-specified time limit.
`When the time limit expires, only the data output from those
`encoders that have completed the encoding process are
`compared. The encoded data With the highest compression
`ratio is selected for data processing, storage, or transmittal.
`The imposed time limit ensures that the real-time or pseudo
`-
`.
`-
`-
`real-time nature of the data encoding is preserved. Buffering
`the output from each encoder allows additional encoders to
`be sequentially applied to the output 0fthe preViouS encoder,
`yielding a more optimal lossless data compression ratio.
`
`Primary Examiner—Patrick Wamsley
`
`34 Claims, 16 Drawing Sheets
`
`
`
`ENCODER E1
`
`
`
`
`BUFFER/
`COUNTER ‘1
`
`BUFFER]
`COUNTER 2
`
`
`ENCODER E2
`
`
`
`INPUT
`DATA
`BUFFER
`
`ENCODER E3
`
`BUFFER]
`
`COUNTER 3
`
`
`
`
`DA TA STREAM
`
`DATA
`BLOCK
`COUNTER
`
`ENCODED DA TA
`STREAM W/
`DESCRIPTOR
`
`
`
`COMPRESS'ON
`COMPRESSION
`
`
`DETERG—IFNCATION/
`TYPE
`DESCRIPTION
`COMPARISON
`
`
`
`
`
`
`
`
`ENCODER En
`
`
`
`
`30
`
`BUFFER/
`COUNTER n
`
`40
`
`NetApp; Rackspace
`
`Exhibit1006
`
`Page1
`
`NetApp; Rackspace Exhibit 1006 Page 1
`
`

`

`Us 6,195,024 B1
`
`Page 2
`
`.
`
`.
`
`.
`2/1998 Nakano et al.
`2/1998 Schwartz et al. .
`3/1998 Franaszek et al.
`5/1998 Huang et al.
`.
`6/1998 Nakazato et al.
`7/1998 Rostoker et al. .
`8/1998 Israelsen et al. .
`.
`9/1998 Kawashlma et al. .
`..
`9/1998 Yajrma.
`10/1998 Langley.
`10/1998 Canfield et al.
`10/1998 Dobson et al.
`10/1998 Canfield et al.
`12/1998 Canfield et al.
`1/1999 Ryu etal..
`6/1999 Ando .
`10/1999 Packard.
`11/1999 Fall et al. .
`1
`2/2000 Glb t
`t
`1
`er 6 ‘1‘ ‘
`
`.
`
`.
`
`.
`.
`
`U.S. PATENT DOCUMENTS
`.
`5/1993 Nomifle “ 9 ~
`3133: SDangl 6} a1~ ‘1
`/
`“011551 a a "
`9/1993 Jackson.
`.
`12/1993 Balkanskl et al. .
`1/1995 Storer.
`1/1995 Allen et al. .
`.
`.
`2/1995 Kulakowskr et al.
`5/1995 Chang et al.
`........................ .. 341/79
`.
`10/1995 Normlle et al. .
`11/1995 Chu.
`11/1995 Allen et al..
`12/1995 Campbell et al.
`.
`1/1996 Remrllard.
`2/1996 Je—Chang et al.
`7/1996 James .
`
`.
`
`.
`
`*
`
`12/1996 Allen et al..
`5/1997 Craft .
`8/1997 Clark, 11 .
`9/1997 Iler .
`
`53127742
`gjgiéagii
`3
`7
`5,243,348
`5,270,832
`5,379,036
`5,381,145
`5,394,534
`5,412,384
`5,461,679
`5,467,087
`5,471,206
`5,479,587
`5,486,826
`5,495,244
`5,533,051
`
`5,583,500
`5,627,534
`5,654,703
`5,668,737
`
`5,717,393
`5,717,394
`5,729,228
`5,748,904
`5,771,340
`5,784,572
`5,799,110
`5,805,932
`5,809,176
`5,818,368
`5 818 530
`’
`’
`5 819 215
`7
`7
`5,825,424
`5,847,762
`5,861,824
`5,917,438
`5,964,842
`5,991,515
`6031939
`7
`7
`
`* Cited by examiner
`
`NetApp; Rackspace
`
`Exhibit 1006
`
`Page 2
`
`NetApp; Rackspace Exhibit 1006 Page 2
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 1 0f 16
`
`US 6,195,024 B1
`
`'________.__________
`
`I I I I I
`
`r/
`
`1
`
`I
`I
`
`2
`
`3
`
`INPUT DATA STREAM
`
`IDENTIFY INPUT DATA TYPE AND
`GENERATE DATA TYPE IDENTIFICATION
`
`SIGNAL
`
`
`DATA TYPE
`ID SIGNAL
`
`COMPRESS DATA IN ACCORDANCE WITH
`IDENTIFIED DATA TYPE
`
`I | I I |
`
`I I
`
`I
`I
`
`COMPRESSED DATA STREAM
`
`RETRIEVE DATA TYPE
`
`INFORMATION OF COMPRESSED
`DATA STREAM
`
`WITH IDENTIFIED DATA TYPE
`
`DECOMPRESS DATA IN ACCORDANCE
`
`FIG. 1
`
`PRIOR ART
`
`NetApp; Rackspace
`
`Exhibit1006
`
`Page3
`
`NetApp; Rackspace Exhibit 1006 Page 3
`
`

`

`
`
`US 6,195,024 B1
`
`US. Patent
`
`beF
`
`[any
`
`61f02te6
`
`7,E:EEm2.E38805
`1zo_mmmmn__>_oommoEEowmQ
`ZOEwMKQEOO
`
`OF<m
`
`mm>k
`
`zoEEowmo
`\ZO_._.<Z__2mmFmQ
`
`ZOQK<1§OO
`
`N.OE
`
`\mwuqu
`
`VKMFZDOO
`
`\MMLLDm
`
`NmmFZDOO
`
`\mmuqu
`
`mmwkZDOO
`
`\mmmmDm
`
`CIMFZDOO
`
`ov
`
`cmmmooozm
`
`Nmmwooozm
`
`
`
`mmmwooozw
`
`HD&Z_
`
`<F<D
`
`mmuqu
`
`<H<D
`
`XOOJm
`
`KMHZDOO
`
`memooozm
`
`Efimmkm<H<Q
`
`NetApp; Rackspace
`
`Exhibit 1006
`
`Page 4
`
`
`
`NetApp; Rackspace Exhibit 1006 Page 4
`
`
`
`
`
`
`
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 3 0f 16
`
`US 6,195,024 B1
`
`RECEIVE INITIAL
`DATA BLOCK FROM
`INPUT DATA STREAM
`
`300
`
`B
`
`COUNT SIZE OF
`DATA BLOCK
`
`BUFFER DATA BLOCK
`
`COMPRESS DATA
`BLOCK WITH
`
`ENABLED ENCODERS
`
`
`
`BUFFER ENCODED
`DATA BLOCK OUTPUT
`FROM EACH
`ENCODER
`
`COUNT SIZE OF
`ENCODED DATA
`BLOCKS
`
`CALCULATE
`COMPRESSION
`RATIOS
`
`302
`
`304
`
`303
`
`310
`
`312
`
`314
`
`COMPARE
`
`THRESHOLD LIMIT
`
`COMPRESSION
`RATIOS WITH
`
`FIG. 3a
`
`NetApp; Rackspace
`
`Exhibit1006
`
`Page5
`
`NetApp; Rackspace Exhibit 1006 Page 5
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 4 0f 16
`
`US 6,195,024 B1
`
`316
`
`IS
`
`
`COMPRESSION
`
`RATIO OF AT LEAST ONE
`
`ENCODED DATA BLOCK
`GREATER THAN
`
`THRESHOLD?
`
`NO
`
`
`
`APPEND NULL
`
`
`SELECT ENCODED
`
`DESCRIPTOR TO
`DATA BLOCK WITH
`
`
`
`UNENCODED INPUT
`GREATEST
`
`DATA BLOCK
`COMPRESSION RATIO
`
`
`
`
`
`318
`
`320
`
`MORE
`
`DATA BLOCKS IN INPUT
`
`STREAM?
`
`
`
`TERMINATE DATA
`COMPRESSION
`
`PROCESS
`
`
`
`330
`
`RECEIVE NEXT DATA
`BLOCK FROM INPUT
`STREAM
`
`
`
`
`
`FIG. 3b
`
`NetApp; Rackspace
`
`Exhibit1006
`
`Page6
`
`APPEND
`
`CORRESPONDING
`DESCRIPTOR
`
`
`OUTPUT UNENCODED
`
`OUTPUT ENCODED
`
`DATA BLOCK WITH
`DATA BLOCK WITH
`
`NULL DESCRIPTOR
`DESCRIPTOR
`
`
`
`
`
`
`
`
`
`
`NetApp; Rackspace Exhibit 1006 Page 6
`
`

`

`
`
`US. Patent
`
`Feb. 27, 2001
`
`Sheet 5 0f 16
`
`US 6,195,024 B1
`
`EEmmmoozm
`
`§:EEmENEqummmmooozm
`
`Emuqu
`
`_‘NEHZDOO
`
`rmmmooozm
`
`
`
`Edwmrmtm.«Q.«\Q
`
`ZO_mwm_N_n=>_OO
`
`zOm_m<n__>_oo.
`
`zo_mmmmn_s_oo5a;E<o
`
`
`zoEEommommmpzaoo«mmqumszaoo
`
`
`
`mafi\zo:Mq_F__\<¢mEmoEEqu8mmooozm<53xogm
`
`
`
`ZO_._.<Z=2N_m._.m_DmeHo/E
`
`
`
`mOmmDGEmmooozm
`
`
`
`tmwfi>t45<m5mo
`
`v.®_n_
`
`EmELDm
`
`cmmPZDOO
`
`cmmmooozm
`
`NetApp; Rackspace
`
`Exhibit 1006
`
`Page 7
`
`
`
`NetApp; Rackspace Exhibit 1006 Page 7
`
`
`
`
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 6 0f 16
`
`US 6,195,024 B1
`
`B
`
`RECEIVE INITIAL
`DATA BLOCK FROM
`INPUT DATA STREAM
`
`BUFFER DATA BLOCK
`
`COUNT SIZE OF
`DATA BLOCK
`
`500
`
`502
`
`504
`
`COMPRESS DATA
`BLOCK WITH
`
`ENABLED ENCODERS
`
`
`APPEND CORRESPONDING
`
`
`DESIRABILITY FACTORS TO
`ENCODED DATA BLOCKS
`
`
`BUFFER ENCODED DATA
`BLOCK OUTPUT
`
`FROM EACH ENCODER
`
`508
`
`510
`
`COUNT SIZE OF
`ENCODED DATA
`BLOCKS
`
`RATIOS
`
`CALCULATE
`COMPRESSION
`
`512
`
`514
`
`COMPARE COMPRESSION
`
`RATIOS WITH THRESHOLD LIMIT
`
`516
`
`FIG. 5a
`
`NetApp; Rackspace
`
`Exhibit1006
`
`Page8
`
`NetApp; Rackspace Exhibit 1006 Page 8
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 7 0f 16
`
`US 6,195,024 B1
`
`A
`
`518
`IS
`
`
`COMPRESSION
`
`
`RATIO OF AT LEAST ONE
`ENCODED DATA BLOCK
`
`GREATER THAN
`
`THRESHOLD?
`
`
`
`520
`
`522
`
`
`
`
`
`
`APPEND NULL
`
`
`CALCULATE FIGURE OF
`
`DESCRIPTOR TO
`MERIT FOR EACH ENCODED
`
`
`
`UNENCODED INPUT
`DATA BLOCK WHICH EXCEED
`
`
`DATA BLOCK
`THRESHOLD
`
`
`
`
`SELECT ENCODED DATA
`
`BLOCK WITH GREATEST
`FIGURE OF MERIT
`
`APPEND
`OUTPUT UNENCODED
`
`
`DATA BLOCK WITH
`CORRESPONDING
`
`NULL DESCRIPTOR
`DESCRIPTOR
`
`
`
`
`OUTPUT ENCODED
`DATA BLOCK WITH
`DESCRIPTOR
`
`
`
`PROCESS
`
`TERMINATE DATA
`COMPRESSION
`
`MORE
`
`
`DATA BLOCKS IN
`
`INPUT STREAM?
`
`YES
`
`RECEIVE NEXT DATA
`BLOCK FROM INPUT
`STREAM
`
`534
`
`
`
`
`B
`
`FIG. 5b
`
`NetApp; Rackspace
`
`Exhibit1006
`
`Page9
`
`NetApp; Rackspace Exhibit 1006 Page 9
`
`

`

`
`
`US. Patent
`
`b
`
`2
`
`61f08t
`
`US 6,195,024 B1
`
`2.E53805
`
`Em“.qu
`
`eFE5300FEmmooozmEmmaE3
`
`0moEEommqNmmfizaoo
`7,§:EEmEm“.qummmmooozm
`
`
`
`
`zoammmgzooSag<55mm?\ZOfiMflEQMmEQEmtammm.mmooozm5.350.5mzo_mmmmas_oo
`
`
`
`
`20:95memE5300mm:qummfizpoo
`
`Szow_m<n_s_oo
`
`o.0_u_
`
`:EMFZDOO
`
`cEmmmamwmmooozm
`
`om
`
`ime
`
`
`
`NEE.sznzomn‘m
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page10
`
`
`
`NetApp; Rackspace Exhibit 1006 Page 10
`
`
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 9 0f 16
`
`US 6,195,024 B1
`
`700
`
`702
`
`E3
`
`704
`
`705
`
`708
`
`INPUT INITIAL
`DATA BLOCK FROM
`INPUT DATA STREAM
`
`COUNT SIZE OF
`
`DATA BLOCK
`
`71o
`
`TIME EXPIRED?
`
`
`No
`
`No
`
`
`
`
`COMPLETE?
`
`
`ENCODING
`
`
`
`
`
`BUFFER DATA BLOCK
`
`INITIALIZE TIMER
`
`STOP
`ENCODING
`PROCESS
`
`718
`BUFFER ENCODED
`
`
`BEGIN
`BUFFER
`DATA BLOCK FOR EACH
`
`
`COMPRESSING
`ENCODER THAT
`
`
`DATA BLOCK WITH
`FROM EACH
`COMPLETED ENCODING
`ENCODERS
`PROCESS
`
`
`
` ENCODER WITHIN TIME LIMIT
`
`
`
`ENCODED DATA
`BLOCKS
`
`720
`
`
`
` COUNT SIZE OF
`
` CALCULATE
`
`
`COMPRESSION
`RATIOS
`
`722
`
`
`
`COMPARE COMPRESSION
`
`
`RATIOS WITH THRESHOLD
`
`
`LIMIT
`
`
`
`724
`
`FIG. 7a
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page11
`
`NetApp; Rackspace Exhibit 1006 Page 11
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 10 0f 16
`
`US 6,195,024 B1
`
`726
`
`IS
`
`
`COMPRESSION
`
`
`RATIO OF AT LEAST ONE
`ENCODED DATA BLOCK
`
`GREATER THAN
`THRESHOLD?
`
`
`NO
`
`
`
`APPEND NULL
`SELECT ENCODED
`
`
`DATA BLOCK WITH
`DESCRIPTOR TO
`
`
`UNENCODED INPUT
`GREATEST
`
`
`
`DATA BLOCK
`COMPRESSION RATIO
`
`
`
`APPEND
`
`CORRESPONDING
`DESCRIPTOR
`
`728
`
`730
`
`
`
`
`OUTPUT UNENCODED
`
`OUTPUT ENCODED
`
`
`DATA BLOCK WITH
`DATA BLOCK WITH
`
`
`NULL DESCRIPTOR
`DESCRIPTOR
`
`
`
`
`
`
`
`MORE
`
`
`DATA BLOCKS IN INPUT
`
`STREAM?
`
`TERMINATE DATA
`
`COMPRESSION
`
`
`PROCESS
`
`
`
`740
`
`RECEIVE NEXT DATA
`
`
`BLOCK FROM INPUT
`
`STREAM
`
`
`FIG. 7b
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page12
`
`NetApp; Rackspace Exhibit 1006 Page 12
`
`

`

`
`
`US. Patent
`
`61f011teehS
`
`US 6,195,024 B1
`
`w...E:EEm
`
`b.«9&5me
`
`«R«GQMQOOZM
`
`
`zo_mmmmn__>_ooGEEnzo_mmmmn=>_oo1ZOmEEEOQm20:136me20:423ng2mat
`
`
`
`
`Emu.qu
`
`_.KMHZDOO
`
`EMHEDm.
`
`N15.2300
`
`EwnEDm
`
`mKMFZDOO
`
`EmuEDm
`
`:mmHZDOO
`
`
`
` {
`mmmmooozm
`Nmmmooozm
`cmmmooozw
`
`rm.mmooozm
`
`HDQZ_
`
`<._.<n_
`
`awn—“Em
`
`
`
`<._.<Q.512.
`
`XOOAm
`
`\mOmmwOOma
`
`ZMHZDOO
`
`EvflmfihE.«d
`
`w.9“.
`
`no#50:
`
`tam:
`
`ZO_._.<Z__>_mm_.m_D
`
`mmooozm
`
`>H:_m<m_wm_n_
`
`meHOdE
`
`mm=>=._.
`
`om
`
`-IWMD
`
`
`
`ME:QMIGMQW
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page13
`
`
`
`NetApp; Rackspace Exhibit 1006 Page 13
`
`
`
`
`

`

`
`
`US. Patent
`
`Feb. 27, 2001
`
`Sheet 12 0f 16
`
`US 6,195,024 B1
`
`ZO_mwm_Mn=>_OO
`
`O_._.<m
`
`ZOw_m<n__>_OO\ZO_._.<Z__>_mm_._.m_D
`
`“.0MMDOE
`
`Elm—S.
`
`ZO_.r<Z__>_mm_.rm_n_
`
`om
`
`ZO_mmmmm_>_OO
`
`m_n_>._.
`
`ZOrrnzmome
`
`«Q.«dQMQOOZM
`
`\\SV<<mwfim
`
`IOKQEOmMQ
`
`
`
`m.®_n_
`
`mmooozm
`
`>._._.__m<m__mm_n_
`
`mmOhO/E
`
`Oom
`
`mm=>=H
`
`om
`
`-mmmb
`
`
`
`NEE.QMEGMQM,
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page14
`
`.
`C
`
`FOE
`
`
`
`
`
`moi
`
`:.mmNdm.
`
`C
`
`NOE
`
`:.Nm.NMm.
`
`_._Nm_
`
`
`
`C
`
`EOE
`
`:_EmNEm.
`
`_\_Em
`
`
`
`
`
`cémN;m.
`
`firm.
`
`
`
`._.3n_Z_
`
`<._.<n_
`
`<F<D
`
`XOOJm
`
`mwnEDm
`
`EMHZDOO
`
`SEEKS.«R«d
`
`NetApp; Rackspace Exhibit 1006 Page 14
`
`
`
`
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 13 0f 16
`
`US 6,195,024 B1
`
`100
`
`102
`
`104
`
`106
`
`108
`
`RECEIVE INITIAL
`DATA BLOCK FROM
`INPUT DATA STREAM
`
`COUNT SIZE OF
`
`DATA BLOCK
`ENCODER PATHS
`
`BUFFER DATA
`BLOCK
`
`INITIALIZE TIMER
`
`APPLY INPUT DATA
`BLOCK TO FIRST
`ENCODING STAGE
`IN CASCADED
`
`TIME EXPIRED?
`
`(110
`
`NO
`
`APPLY OUTPUT
`OF COMPLETED
`ENCODING
`STAGE TO NEXT
`ENCODING
`STAGE IN
`
`I 16
`
`BUFFER
`ENCODED DATA
`
`112 YES
`
`BLOCK OUTPUT
`FROM
`
`COMPLETED
`ENCODING
`STAGE
`
`ENCODING
`STAGE
`COMPLETE?
`
`CASCADE PATH
`
`
`
`SELECT BUFFERED OUTPUT OF LAST
`
`
`ENCODING STAGE IN ENCODER
`CASCADE THAT COMPLETED ENCODING
`PROCESS WITHIN TIME LIMIT
`
`
`
`COUNT SIZE OF
`
`
`122
`ENCODED DATA
`BLOCKS
`
`STOP ENCODING
`PROCESS
`
`CALCULATE
`COMPRESSION
`RATIOS
`
`124
`
`
`
`COMPARE COMPRESSION
`RATIOS WITH THRESHOLD
`LIMIT
`
`126
`
`FIG. 103
`
`A
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page15
`
`NetApp; Rackspace Exhibit 1006 Page 15
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 14 0f 16
`
`US 6,195,024 B1
`
`128
`IS
`
`
`COMPRESSION
`
`
`RATIO OF AT LEAST ONE
`
`
`ENCODED DATA BLOCK
`
`GREATER THAN
`
`THRESHOLD?
`
`
`
`134
`
`136
`
`BLOCK WITH GREATEST
`FIGURE OF MERIT
`
`
`
`130
`
`132
`
`140
`
`DATA BLOCK WITH
`DESCRIPTOR
`
`
`
`MORE
`
`DATA BLOCKS IN
`
`INPUT STREAM?
`
`
`
`
`YES
`
`TERMINATE DATA
`COMPRESSION
`
`
`
`
`
`RECEIVE NEXT DATA
`BLOCK FROM INPUT
`STREAM
`
`144
`
`PROCESS
`
`3 FIG. 10b
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page16
`
`
`CALCULATE FIGURE OF
`APPEND NULL
`MERIT FOR EACH ENCODED
`DESCRIPTOR TO
`
`
`DATA BLOCK WHICH EXCEED
`UNENCODED INPUT
`
`THRESHOLD
`DATA BLOCK
`
` SELECT ENCODED DATA
`
`OUTPUT UNENCODED
`APPEND
`
`
`DATA BLOCK WITH
`CORRESPONDING
`138
`
`
`NULL DESCRIPTOR
`DESCRIPTOR
`
`
`
` OUTPUT ENCODED
`
`
`
`142
`
`\
`
`NetApp; Rackspace Exhibit 1006 Page 16
`
`

`

`
`
`US. Patent
`
`2b.W...
`
`2
`
`ehS
`
`51
`
`M
`
`
`
`W«RedRDQFDO
`
`79NDmmDOOwD
`
`quZ\\S«R«d
`
`IOhntmome
`
`_.DmeOOm—D
`
`a00:.
`
`mmHEDm
`
`pm:0EMDOOMD
`
`#0:
`
`
`
`
`
`Eqflmhm,<H<DFDR—.30mommoooma
`
`
`
`No:
`
`oo:
`
`KOFQEmeD
`
`2O_._.O<m._.Xm
`
`
`
`awn—“Emx00..m_
`
`
`<._.<n_._.Dn_z_
`33$me«R«d
`
`91,6SU
`
`1BM.
`
`0.s,:o:
`
`
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page17
`
`NetApp; Rackspace Exhibit 1006 Page 17
`
`
`
`

`

`US. Patent
`
`Feb. 27, 2001
`
`Sheet 16 0f 16
`
`US 6,195,024 B1
`
` 1200
`RECEIVE INITIAL
`
`DATA BLOCK FROM
`
`
`INPUT DATA STREAM
`BUFFER DATA BLOCK
`
`
`
` EXTRACT DATA
`
`COMPRESSION TYPE
`
`
`DESCRIPTOR
`
`
`
`
`1202
`
`1204
`
`ISDATA
`
`
`COMPRESSON
`
`
`TYPEDESCRPTO'
`
`
`NULL?
`
`
`
`DECODE DATA BLOCK USING
`
`
`SELECT DECODER(S)
`CORRESPONDING TO
`
`
`DESCRIPTOR
`1208
`
`OUTPUT
`RECEIVE NEXT
`
`
`
`UN DECODED
`DATA BLOCK IN
`
`
`INPUT STREAM
`DATA BLOCK
`
`
`
`SELECTED DECODER(S)
`
` OUTPUT DECODED
`
`
`DATA BLOCK
`
`
`
` MOREDATA
`
`BLOCKSININPUT
`
`
`STREAM?
`
`
`
`
`
`TERMINATE
`DECODING PROCESS
`
`1218
`
`FIG. 12
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page18
`
`NetApp; Rackspace Exhibit 1006 Page 18
`
`

`

`US 6,195,024 B1
`
`1
`CONTENT INDEPENDENT DATA
`COMPRESSION METHOD AND SYSTEM
`
`BACKGROUND
`
`1. Technical Field
`
`The present invention relates generally to data compres-
`sion and decompression and, more particularly, to systems
`and methods for providing content independent lossless data
`compression and decompression.
`2. Description of the Related Art
`Information may be represented in a variety of manners.
`Discrete information such as text and numbers are easily
`represented in digital data. This type of data representation
`is known as symbolic digital data. Symbolic digital data is
`thus an absolute representation of data such as a letter,
`figure, character, mark, machine code, or drawing.
`Continuous information such as speech, music, audio,
`images and video, frequently exists in the natural world as
`analog information. As is well-known to those skilled in the
`art, recent advances in very large scale integration (VLSI)
`digital computer technology have enabled both discrete and
`analog information to be represented with digital data.
`Continuous information represented as digital data is often
`referred to as diffuse data. Diffuse digital data is thus a
`representation of data that is of low information density and
`is typically not easily recognizable to humans in its native
`form.
`
`There are many advantages associated with digital data
`representation. For instance, digital data is more readily
`processed, stored, and transmitted due to its inherently high
`noise immunity. In addition, the inclusion of redundancy in
`digital data representation enables error detection and/or
`correction. Error detection and/or correction capabilities are
`dependent upon the amount and type of data redundancy,
`available error detection and correction processing, and
`extent of data corruption.
`One outcome of digital data representation is the continu-
`ing need for increased capacity in data processing, storage,
`and transmittal. This is especially true for diffuse data where
`increases in fidelity and resolution create exponentially
`greater quantities of data. Data compression is widely used
`to reduce the amount of data required to process, transmit,
`or store a given quantity of information. In general, there are
`two types of data compression techniques that may be
`utilized either separately or jointly to encode/decode data:
`lossless and lossy data compression.
`Lossy data compression techniques provide for an inexact
`representation of the original uncompressed data such that
`the decoded (or reconstructed) data differs from the original
`unencoded/uncompressed data. Lossy data compression is
`also known as irreversible or noisy compression. Entropy is
`defined as the quantity of information in a given set of data.
`Thus, one obvious advantage of lossy data compression is
`that the compression ratios can be larger than the entropy
`limit, all at the expense of information content. Many lossy
`data compression techniques seek to exploit various traits
`within the human senses to eliminate otherwise impercep-
`tible data. For example, lossy data compression of visual
`imagery might seek to delete information content in excess
`of the display resolution or contrast ratio.
`On the other hand, lossless data compression techniques
`provide an exact representation of the original uncom-
`pressed data. Simply stated, the decoded (or reconstructed)
`data is identical to the original unencoded/uncompressed
`data. Lossless data compression is also known as reversible
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`4-0
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`
`or noiseless compression. Thus, lossless data compression
`has, as its current limit, a minimum representation defined
`by the entropy of a given data set.
`There are various problems associated with the use of
`lossless compression techniques. One fundamental problem
`encountered with most lossless data compression techniques
`are their content sensitive behavior. This is often referred to
`
`as data dependency. Data dependency implies that the com-
`pression ratio achieved is highly contingent upon the content
`of the data being compressed. For example, database files
`often have large unused fields and high data redundancies,
`offering the opportunity to losslessly compress data at ratios
`of 5 to 1 or more. In contrast, concise software programs
`have little to no data redundancy and, typically, will not
`losslessly compress better than 2 to 1.
`Another problem with lossless compression is that there
`are significant variations in the compression ratio obtained
`when using a single lossless data compression technique for
`data streams having different data content and data size. This
`process is known as natural variation.
`Afurther problem is that negative compression may occur
`when certain data compression techniques act upon many
`types of highly compressed data. Highly compressed data
`appears random and many data compression techniques will
`substantially expand, not compress this type of data.
`For a given application, there are many factors which
`govern the applicability of various data compression tech-
`niques. These factors include compression ratio, encoding
`and decoding processing requirements, encoding and decod-
`ing time delays, compatibility with existing standards, and
`implementation complexity and cost, along with the adapt-
`ability and robustness to variations in input data. A direct
`relationship exists in the current art between compression
`ratio and the amount and complexity of processing required.
`One of the limiting factors in most existing prior art lossless
`data compression techniques is the rate at which the encod-
`ing and decoding processes are performed. Hardware and
`software implementation tradeoffs are often dictated by
`encoder and decoder complexity along with cost.
`Another problem associated with lossless compression
`methods is determining the optimal compression technique
`for a given set of input data and intended application. To
`combat this problem, there are many conventional content
`dependent techniques which may be utilized. For instance,
`filetype descriptors are typically appended to file names to
`describe the application programs that normally act upon the
`data contained within the file. In this manner data types, data
`structures, and formats within a given file may be ascer-
`tained. Fundamental problems with this content dependent
`technique are:
`(1) the extremely large number of application programs,
`some of which do not possess published or documented
`file formats, data structures, or data type descriptors;
`(2)
`the ability for any data compression supplier or
`consortium to acquire, store, and access the vast
`amounts of data required to identify known file descrip-
`tors and associated data types, data structures, and
`formats; and
`(3) the rate at which new application programs are devel-
`oped and the need to update file format data descrip-
`tions accordingly.
`An alternative technique that approaches the problem of
`selecting an appropriate lossless data compression technique
`is disclosed in US. Pat. No. 5,467,087 to Chu entitled “High
`Speed Lossless Data Compression System” (“Chu”). FIG. 1
`illustrates an embodiment of this data compression and
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page19
`
`NetApp; Rackspace Exhibit 1006 Page 19
`
`

`

`3
`
`4
`
`US 6,195,024 B1
`
`decompression technique. Data compression 1 comprises
`two phases, a data pre-compression phase 2 and a data
`compression phase 3. Data decompression 4 of a com-
`pressed input data stream is also comprised of two phases,
`a data type retrieval phase 5 and a data decompression phase
`6. During the data compression process 1, the data pre-
`compressor 2 accepts an uncompressed data stream, identi-
`fies the data type of the input stream, and generates a data
`type identification signal. The data compressor 3 selects a
`data compression method from a preselected set of methods
`to compress the input data stream, with the intention of
`producing the best available compression ratio for that
`particular data type.
`There are several problems associated with the Chu
`method. One such problem is the need to unambiguously
`identify various data types. While these might include such
`common data types as ASCII, binary, or unicode, there, in
`fact, exists a broad universe of data types that fall outside the
`three most common data types. Examples of these alternate
`data types include: signed and unsigned integers of various
`lengths, differing types and precision of floating point
`numbers, pointers, other forms of character text, and a
`multitude of user defined data types. Additionally, data types
`may be interspersed or partially compressed, making data
`type recognition difficult and/or impractical. Another prob-
`lem is that given a known data type, or mix of data types
`within a specific set or subset of input data,
`it may be
`difficult and/or impractical to predict which data encoding
`technique yields the highest compression ratio.
`Chu discloses an alternate embodiment wherein a data
`
`compression rate control signal is provided to adjust specific
`parameters of the selected encoding algorithm to adjust the
`compression time for compressing data. One problem with
`this technique is that the length of time to compress a given
`set of input data may be difficult or impractical to predict.
`Consequently, there is no guarantee that a given encoding
`algorithm or set of encoding algorithms will perform for all
`possible combinations of input data for a specific timing
`constraint. Another problem is that, by altering the param-
`eters of the encoding process,
`it may be difficult and/or
`impractical to predict the resultant compression ratio.
`Other conventional techniques have been implemented to
`address the aforementioned problems. For instance, US.
`Pat. No. 5,243,341 to Seroussi et al. describes a class of
`Lempel-Ziv lossless data compression algorithms that uti-
`lize a memory based dictionary of finite size to facilitate the
`compression and decompression of data. A second standby
`dictionary is included comprised of those encoded data
`entries that compress the greatest amount of input data.
`When the current dictionary fills up and is reset, the standby
`dictionary becomes the current dictionary, thereby maintain-
`ing a reasonable data compression ratio and freeing up
`memory for newly encoded data strings. Multiple dictionar-
`ies are employed within the same encoding technique to
`increase the lossless data compression ratio. This technique
`demonstrates the prior art of using multiple dictionaries
`within a single encoding process to aid in reducing the data
`dependency of a single encoding technique. One problem
`with this method is that it does not address the difficulties in
`
`dealing with a wide variety of data types.
`teaches a
`US. Pat. No. 5,717,393 to Nakano, et al.
`plurality of code tables such as a high-usage code table and
`a low-usage code table in an entropy encoding unit. A
`block-sorted last character string from a block-sorting trans-
`forming unit is the move-to-front transforming unit is trans-
`formed into a move-to-front (MTF) code string. The entropy
`encoding unit switches the code tables at a discontinuous
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`part of the MTF code string to perform entropy coding. This
`technique increases the compression rate without extending
`the block size. Nakano employs multiple code tables within
`a single entropy encoding unit to increase the lossless data
`compression ratio for a given block size, somewhat reducing
`the data dependency of the encoding algorithm. Again, the
`problem with this technique is that it does not address the
`difficulties in dealing with a wide variety of data types.
`US. Pat. No. 5,809,176 to Yajima discloses a technique of
`dividing a native or uncompressed image data into a plu-
`rality of streams for subsequent encoding by a plurality of
`identically functioning arithmetic encoders. This method
`demonstrates the technique of employing multiple encoders
`to reduce the time of encoding for a single method of
`compression.
`US. Pat. Nos. 5,583,500 and 5,471,206 to Allen, at al.
`disclose systems for parallel decompression of a data stream
`comprised of multiple code words. At least two code words
`are decoded simultaneously to enhance the decoding pro-
`cess. This technique demonstrates the prior art of utilizing
`multiple decoders to expedite the data decompression pro-
`cess.
`
`US. Pat. No. 5,627,534 to Craft teaches a two-stage
`lossless compression process. A run length precompressed
`output is post processed by a Lempel-Ziv dictionary sliding
`window dictionary encoder that outputs a succession of
`fixed length data units. This yields a relatively high-speed
`compression technique that provides a good match between
`the capabilities and idiosyncrasies of the two encoding
`techniques. This technique demonstrates the prior art of
`employing sequential lossless encoders to increase the data
`compression ratio.
`US. Pat. No. 5,799,110 to Israelsen, et al. discloses an
`adaptive threshold technique for achieving a constant bit rate
`on a hierarchical adaptive multistage vector quantization. A
`single compression technique is applied iteratively until the
`residual
`is reduced below a prespecified threshold. The
`threshold may be adapted to provide a constant bit rate
`output. If the nth stage is reached without the residual being
`less than the threshold, a smaller input vector is selected.
`US. Pat. No. 5,819,215 to Dobson, et al. teaches a method
`of applying either lossy or lossless compression to achieve
`a desired subjective level of quality to the reconstructed
`signal.
`In certain embodiments this technique utilizes a
`combination of run-length and Huffman encoding to take
`advantage of other local and global statistics. The tradeoffs
`considered in the compression process are perceptible dis-
`tortion errors versus a fixed bit rate output.
`SUMMARY OF THE INVENTION
`
`The present invention is directed to systems and methods
`for providing content independent lossless data compression
`and decompression. In one aspect of the present invention,
`a method for providing content independent lossless data
`compression comprises the steps of:
`(a) receiving as input a block of data from a stream of
`data, the data stream comprising one of at least one data
`block and a plurality of data blocks;
`(b) counting the size of the input data block;
`(c) encoding the input data block with a plurality of
`lossless encoders to provide a plurality of encoded data
`blocks;
`(d) counting the size of each of the encoded data blocks;
`(e) determining a lossless data compression ratio obtained
`for each of the encoders by taking the ratio of the size
`of the encoded data block output from the encoders to
`the size of the input data block;
`
`NetApp;Rackspace
`
`Exhibit1006
`
`Page 20
`
`NetApp; Rackspace Exhibit 1006 Page 20
`
`

`

`5
`
`6
`
`US 6,195,024 B1
`
`(f) comparing each of the determined compression ratios
`with an a priori user specified compression threshold;
`(g) selecting for output the input data block and append-
`ing a null data type compression descriptor to the input
`data block, if all of the encoder compression ratios fall
`below the a priori specified compression threshold; and
`(h) selecting for output the encoded data block having the
`highest compression ratio and appending a correspond-
`ing data type compression descriptor to the selected
`encoded data block, if at least one of the compression
`ratios exceed the a priori specified compression thresh-
`old.
`
`In another aspect of the present invention, a timer is
`preferably added to measure the time elapsed during the
`encoding process against an a priori-specified time limit.
`When the time limit expires, only the data output from those
`encoders that have completed the present encoding cycle are
`compared to determine the encoded data with the highest
`compression ratio. The time limit ensures that the real-time
`or pseudo real-time nature of the data encoding is preserved.
`In another aspect of the present invention, the results from
`each encoder are buffered to allow additional encoders to be
`
`sequentially applied to the output of the previous encoder,
`yielding a more optimal lossless data compression ratio.
`In another aspect of the present invention, a method for
`providing content independent lossless data decompression
`includes the steps of receiving as input a block of data from
`a stream of data, extracting an encoding type descriptor from
`the input data block, decoding the input data block with one
`or more of a plurality of available decoders in accordance
`with the extracted encoding type descriptor, and outputting
`the decoded data block. An input data block having a null
`descriptor type extracted therefrom is output without being
`decoded.
`
`Advantageously, the present invention employs a plurality
`of encoders applying a plurality of compression techniques
`on an input data stream so as to achieve maximum com-
`pression in accordance with the real-time or pseudo real-
`time data rate constraint. Thus, the output bit rate is not fixed
`and the amount, if any, of permissible data quality degra-
`dation is not adaptable, but is user or data specified.
`The present invention is realized due to recent improve-
`ments in processing speed, inclusive of dedicated analog and
`digital hardware circuits, central processing units, (and any
`hybrid combinations thereof), which, coupled with reduc-
`tions in cost, are enabling of new content independent data
`compression and decompression solutions.
`These and other aspects, features and advantages of the
`present invention will become apparent from the following
`detailed description of preferred embodiments, which is to
`be read in connection with the accompanying drawings.
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 is a block/flow diagram of a content dependent
`high-speed lossless data compression and decompression
`system/method according to the prior art;
`FIG. 2 is a block diagram of a content independent data
`compression system according to one embodiment of the
`present invention;
`FIGS. 3a and 3b comprise a flow diagram of a data
`compression method according to one aspect of the present
`invention which illustrates the operation of the data com-
`pression system of FIG. 2;
`FIG. 4 is a block diagram of a content independent data
`compression system according to another embodiment of the
`present invention having an enhanced metric for selecting an
`optimal encoding technique;
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`4-0
`
`4

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket