throbber

`SECOND EDITION
`
`1
`
`DISH 1015
`
`

`

` DATA COMPRESSION
`
`Techniques and Applications
`Hardware and Software Considerations
`
`Second Edition
`
`GILBERT HELD
`4-Degree Consul/(fig,
`Macon, Georgia, USA
`
`and
`
`THOMAS R. MARSHALL
`(software author)
`
`FENWIC‘K LIBRARY
`JGEORGE MA
`-
`FAIRFAX, Wasom umvsnsm
`
`John Wiley & Sons
`Chichester - New York - Brisbane - Toronto - Singapore
`
`
`
`2
`
`

`

`
`
`Copyright © 1983, 1987 by John Wiley & Sons Ltd.
`
`All rights reserved.
`
`No part of this book may be reproduced by any means, or
`transmitted, or translated into a machine language without the
`written permission of the publisher.
`
`Library of Congress Cataloging-in-Publication Data:
`Held, Gilbert, 1943—
`Data compression.
`Bibliography: p.
`Includes index.
`
`1. Data compression (Computer science) 1. Marshall,
`Thomas (Thomas R.)
`II. Title.
`QA76.9.D33H44 1987
`005.74’6
`ISBN 0 471 91280 8
`
`86-18942
`
`British Library Cataloguing in Publication Data:
`Held, Gilbert
`Data compression: techniques and applications hardware
`and software considerations. —
`2nd ed.
`
`1. Data compression (Computer science)
`I. Title
`II. Marshall, Thomas
`005.74’6
`QA76.9.D33
`
`ISBN 0 471 91280 8
`
`Typeset by Photo-Graphics, Honiton, Devon
`Printed in Great Britain
`
`3
`
`

`

` CHAPTER ONE
`
`Rationale and Utilization
`
`In the chronology of computer development, large—scale information transfer
`by remote computing and the development of massive information storage
`and retrieval systems have witnessed a tremendous growth. Concurrent with
`this growth, several problem areas have developed which can result in major,
`but unnecessary, economic expenditures.
`One problem is the so-called ‘run-away database”. Here the size of the
`database used by an organization for its information storage and retrieval
`programs becomes larger and larger, requiring additional disc drives for
`online systems and reels of magnetic tapes for those systems that can be
`processed in a batch environment.
`Accompanying the growth in the size of databases has been a large increase
`in the number of users and duration of usage by personnel at remote
`locations. These factors result in tremendous amounts of data being trans-
`ferred between computers and remote terminals. To provide transmission
`facilities for the required data transfers, communications lines and auxiliary
`devices, such as modems and multiplexers, have been continuously upgraded
`by many organizations to permit higher data—transfer capability.
`Although the obvious solutions to these problems of data storage and
`information transfer are to install additional storage devices and expand
`existing communications facilities, to do so requires an additional increase
`in an organization’s equipment and operating costs. One method that can
`be employed to alleviate a portion of data, storage and information transfer
`problems is through the representation of data by more efficient codes. If
`one examines an organization’s database or monitors a transmission line,
`there is an excellent chance that the individual characters that both make
`
`up the database and the transmission sequence could be encoded more
`effectively. Two techniques that can result in a more effective encoded data
`representation are logical and physical data compression.
`
`1.1 LOGICAL COMPRESSION
`
`When a database is designed, one of the first steps of the analyst is to obtain
`as much data reduction as possible. This data reduction results from the
`
`1
`
`
`
`4
`
`

`

`
`
` 2 e
`
`limination of redundant fields of information while representing the data
`elements in the remaining fields with as few logical indicators as is feasible.
`Although logical c0mpression is data dependent and the method employed
`can vary based upon the analyst’s foresight,
`the following two examples
`will illustrate the ease of implementation and benefits of this compression
`technique.
`One simple example of logical data compression is the occupational field
`on a personnel database. Suppose 30 alphanumeric positions are allocated
`to this field.
`If the field is fixed, occupations such as the 10-character
`occupational description ‘DISHWASHER’ have 20 blanks inserted into the
`remainder of the field. Then, 30 million characters of storage would be
`required for the occupational field of 1 million workers. Suppose at most
`there were 32 768 distinct occupations. Instead of indicating the occupational
`title, one could encode the equivalent 5-digit data code, eliminating 25
`character positions per field. The size of the field could be reduced further
`by allocating the binary value of 1 or more characters to the occupational
`code. As an example, an 8-bit character could represent 28 — 1 or 255
`distinct values or occupational codes. Linking two 8-bit characters through
`appropriate software would provide 216—1 or 65 535 distinct codes. This
`would reduce the field size from 30 to 2 characters, saving 28 million
`characters of storage. If our counting begins at zero instead of a conventional
`starting place of 1, an 8-bit character could represent 256 codes while a 16—
`bit character could be employed to represent 65 536 distinct values.
`A second example of logical compression is a date field. This type of
`field frequently occurs in databases. Normally,
`the numeric equivalents
`of the subfields representing day, month and year are used in place of
`longhand notation. Thus, 01 04 81 would represent 1 April 1981. While
`this
`logical compression results
`in 6 numeric characters of storage,
`additional data reduction can result from storing the date as a binary value.
`Since the day will never exceed 31, 5 bits would suffice to represent the date
`field. Similarly, 4 bits could be used to represent the month value while 7
`bits could represent 127 years, permitting a relative year ranging from 1900
`to 2027.
`
`Logical compression using numerical and binary representation is illus—
`trated in Figure 1.1 for the preceding date field example. It is interesting to
`note that employing binary representation reduces the date field to 16 binary
`digits or two 8-bit concatenated characters of storage. As discussed, many
`logical compression methods can be considered by an analyst during the
`database design process. Each method may result in a distinct degree of data
`storage reduction. Correspondingly, when logically compressed databases or
`portions of such databases are transmitted between locations, transmission
`time is reduced since fewer data characters are transmitted.
`
`While logical compression can be an effective tool in minimizing the size
`of a database, it only reduces transmission time when logically compressed
`data is transmitted. Thus,
`the transmission of inquiry and response data
`
`5
`
`

`

`
`
`
`Longhand
`Example
`
`DAY
`1
`
`MONTH
`APRIL
`
`YEAR
`1981
`
`Logical compression using numerical representation
`
`Example
`
`01
`
`04
`
`81
`
`Logical compression using binary representation
`
`Example
`
`00001
`
`0100
`
`1010001
`
`Figure 1.1 Logical compression methods. Logical com-
`pression can result from alphanumeric, numeric or
`binary representation of data in a shorthand notation
`
`which are typically encoded as separate and distinct entities in the appropriate
`bit representation of the code for each character is not normally affected.
`Similarly, the occurrence of repeating patterns and groups of characters
`which are normally contained in reports transmitted from computer systems
`to terminal devices would not be affected. For such situations, a reduction
`in data transmission time depends upon the physical compression of the data
`as it is encountered.
`
`1.2 PHYSICAL COMPRESSION
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Physical compression can be viewed as the process of reducing the quantity
`of data prior to it entering a transmission medium and the expansion of such
`data into its original format upon receipt at a distant location. Although
`both physical and logical compression can result in reduced transmission time,
`distinct application differences exist between the two techniques. Logical
`compression is normally used to represent databases more efficiently and
`does not consider the frequency of occurrence of characters or groups of
`characters. Physical compression takes advantage of the fact that when data
`is encoded as separate and distinct entities, the probabilities of occurrence
`of characters and groups of characters differ. Since frequently occurring
`characters are encoded into as many bits as those characters that only rarely
`occur, data reduction becomes possible by encoding frequently occurring
`characters into short bit codes while representing infrequently occurring
`characters by longer bit codes. Like logical compression, many physical
`compression techniques exist. Some techniques replace repeating strings of
`characters by a special compression indicator character and a quantity count
`character. Other techniques replace frequently occurring characters with a
`short binary code while infrequently encountered characters are replaced by
`longer binary codes. In Chapter 2, 10 distinct physical compression methods
`are covered in detail. For the remainder of this book we will focus our
`
`attention upon physical data compression.
`
`
`
`6
`
`

`

`1.3 COMPRESSION BENEFITS
`:-
`When data compression is used to reduce storage requirements, overall
`program execution time may be reduced. This is because the reduction in
`storage will result in a reduction of disc-access attempts, while the encoding
`and decoding required by the compression technique employed will result
`in additional program instructions being executed. Since the execution time
`of a group of program instructions is normally significantly less than the time
`required to access and transfer data to a peripheral device, overall program
`execution time may be reduced.
`With respect to the transmission of data, compression provides the network
`planner with several benefits in addition to the potential cost savings associ—
`ated with sending less data over the switched telephone network where the
`cost of the call is usually based upon its duration. First, compression can
`reduce the probability of transmission errors occurring since fewer characters
`are transmitted when data is compressed while the probability of an error
`occurring remains constant. Second, since compression increases efficiency,
`it may reduce or even eliminate extra workshifts. Finally, by converting text
`that is represented by a conventional code such as standard ASCII into a
`different code, compression algorithms may provide a level of security against
`illicit monitoring.
`For data communications, the transfer of compressed data over a medium
`results in an increase in the effective rate of information transfer, even
`though the actual data transfer rate expressed in bits per second remains the
`same. Data compression can be implemented on most existing hardware by
`software or through the use of a special hardware device that incorporates
`one or more compression techniques.
`In Figure 1.2, a basic data-compression block diagram is illustrated. Shown
`as a black box, compression and decompression may occur within the user’s
`processor to include personal computers, intelligent terminals or in a device
`foreign to the processor, such as a specialized communications component.
`Foremost among these components are data concentrators and statistical
`multiplexers.
`To examine in some detail a portion of the benefits that may result from
`the employment of one or more compression techniques requires a review
`of some fundamental compression terminology.
`
`1.4 TERMINOLOGY
`
`As illustrated in Figure 1.2, an original data stream is operated upon accord-
`ing to a particular algorithm to producea compressed data stream. This
`compression of the original data stream is sometimes referred to as an
`encoding process with the result that the compressed data stream is also
`called an encoded data stream. Reversing the process, the compressed data
`stream is decompressed to reproduce the original data stream. Since this
`
`7
`
`

`

`
`
`Doto
`
`block
`data-compression
`1.2 Basic
`Figure
`diagram. An original data stream operated upon
`according to one or more compression algorithms
`results in the generation of a compressed data
`stream
`
`
`
`Compressed
`Original
`compression
`
`data
`data
`Data
`
`
`
`decompression
`
`
`
`
`decompression process results in the decoding of the compressed data stream,
`the result is sometimes referred to as the decoded data stream. We will use
`
`the term original data stream and decoded data stream synonymously, as
`well as the terms compressed data stream and encoded data stream.
`The degree of data reduction obtained as a result of the compression
`process is known as the compression ratio. This ratio measures the quantity
`of compressed data in comparison to the quantity of original data, such that
`(Ruth and Krentzler, 1972):
`
`Length of original data string
`Com ression ratio =——.
`Length of compressed data string
`p
`
`From the above equation, it is obvious that the higher the compression
`ratio the more effective the compression technique employed. Another term
`used when talking about compression is the figure of merit, where:
`
`Length of compressed data string
`Figure Of merlt _ Length of original data string
`
`The figure of merit is the reciprocal of the compression ratio and must
`always be less than unity for the compression process to be effective. The
`fraction of data reduction is one minus the figure of merit. Thus, a com—
`pression technique that results in one character of compressed data for every
`three characters in the original data stream would have a compression ratio
`of 3, a figure of merit of 0.33 and the fraction of data reduction would be
`0.66.
`
`1.5 COMMUNICATIONS APPLICATIONS
`
`To obtain an overview of some of the communications benefits available
`
`through the incorporation of data compression, we can consider a typical
`data communications application. As illustrated in the top portion of Figure
`1.3, a remote batch terminal is connected to a central computer with trans-
`
`8
`
`

`

`
`
`N0 compress/an
`
`
`
`llHHilHiiHll'
`
`
`
`
`96 kbps transmission
`rate
`
`
`9.6 kbps information
`transfer rate
`
`Remote batch
`terminal
`
`Compress/on rat/'0 2
`
`
`
`
`
`9 6 kbps transmission
`rate
`Computer
`
`
`
`
`
` Remote batch
`terminal
`19.2 kbps information
`transfer rate
`
`Compressxbn rat/0 2
`
`
`4.8 kbps transmission
`rate
`
`
`
`Computer
`Remote batch
`
`terminal
`9.6 kbps information
`
`
`
`transfer rate
`
`
`
`Figure 1.3 Data compression affects the information transfer ratio (ITR), through the
`use of data compression, the methodology and structure of one’s data communications
`facility may be changed
`
`mission occurring at a 9.6 kbps data rate. Let us assume that the data to be
`transmitted has not been compressed. If through the programming of one
`or more compression algorithms or the installation of a hardware compression
`device a compression ratio of 2 is obtained, several alternatives may be
`available with respect to one’s data communications methodology. First, our
`data transmission time is reduced since the effective information transfer
`rate has increased to approximately 19.2 kbps as shown in the middle portion
`of Figure 1.3. Ignoring communications software overhead, the data trans-
`mission time is halved. Thus, one may now consider using the remote batch
`terminal for other remote processing applications or perhaps an expensive
`after-hours shift or portion of such a shift can be alleviated. In the lower
`portion of Figure 1.3 another user option is illustrated. Here the transmission
`rate may be reduced to 4800 bps. With a compression ratio of 2,
`this is
`equivalent to an information transfer rate of 9600 bps. By lowering the data
`transmission rate, more expensive 9600 bps modems may be replaced by
`4800 bps modems and line conditioning which is normally required when
`transmitting data at 9600 bps may be removed, resulting in an additional
`cost reduction.
`
`
`
`9
`
`

`

`No compression
`
`Compression
`
`
`
`Computer
`
`
`
`
`
` Compression performing modern
`
`Figure 1.4 Data compression on a multidrOp line reduces the flow of data on the
`line, permitting additional terminals to be serviced
`
`A second type of communications application that can benefit from the
`utilization of data compression is illustrated in Figure 1.4. A typical multidrop
`network is illustrated in the top portion of Figure 1.4, connecting terminals
`at diverse geographical locations via a common leased line to a computer
`site. Typically, the transmission activity of the terminals is the governing
`factor that limits the multidrop line to a maximum number of drops. In the
`bottom portion of Figure 1.4,
`it is assumed that compression performing
`modems were substituted for the conventional modems used in the original
`multidrop configuration. Since data compression on a multidrop line reduces
`the flow of data on the line, its utilization will normally enable additional
`drops to be added to the line prior to the occurrence of throughput delays
`that affect the response time of the terminals attached to each drop. In this
`particular example,
`it
`is assumed that the use of compression performing
`modems permitted an increase in the number of line drops from 4 to 6.
`
`1.6 DATA COMPRESSION AND INFORMATION TRANSFER
`
`When data is transmitted between terminals, a terminal and a computer or
`two computers, several delay factors may be encountered which cumulatively
`affect the information transfer rate. Data transmitted over a transmission
`
`medium must be converted into an acceptable format for that medium. When
`digital data is transmitted over analogue telephone lines, modems must be
`
`10
`
`10
`
`

`

`_£___
`
`8 e
`
`mployed to convert the digital pulses of the business machine into a modu-
`lated signal acceptable for transmission on the analogue telephone circuit.
`The time between the first bit entering the modem and the first modulated
`signal produced by the device is known as the modem’s internal delay time.
`Since two such devices are required for a point—to—point circuit, the total
`internaldelay time'encountered during a transmission sequence equals-twice
`the modem’s internal delay time. Such times can range from a few to 10 or
`more milliseconds (ms). The second delay encountered on a circuit
`is a
`function of the distance between points and is known as the circuit or
`propagation delay time. This is the time required for the signal
`to be
`propagated or transferred down the line to the distant end. Propagation
`delay time can be approximated by equating 1 millisecond for every 150
`circuit miles and adding 12 milliseconds to the total.
`Once data is received at the distant end it must be acted upon, resulting
`in a processing delay which is a function of the computer or terminal
`employed as well as the quantity of transmitted data which must be acted
`upon. Processing delay time can range from a few milliseconds where a
`simple error check is performed to determine if the transmitted data was
`received correctly to many seconds where a search of a database must occur
`in response to a transmitted query. Each time the direction of transmission
`changes in a typical half duplex protocol, control signals at the associated
`modem to computer and modem to terminal interface change. The time
`required to switch control signals to change the direction of transmission is
`known as line turnaround time and can result in delays up to 250 or more
`milliseconds, depending upon the transmission protocol employed. We can
`denote the effect of data compression by examining the transmission protocol
`commonly known as BISYNC communications and a few of its derivations.
`
`BISNYC communications
`
`One of the most commonly employed transmission protocols is the Binary
`Synchronous Communications (BISNYC) communications control structure.
`This line control structure was introduced in 1966 by International Business
`Machine Corporation and is used for'transmission by many medium-speed
`and high-speed devices to include terminal and computer systems. BISNYC
`provides a set of rules which govern the synchronous transmission of binary-
`coded data. While this protocol can be used with a variety of transmission
`codes, it is limited to the half dupiex transmission mode and requires the
`acknowledgement of the receipt of every block of transmitted data. In an
`evolutionary process, a number of synchronous protocols have been
`developed to supplement or Serve as a replacement to BISNYC, the most
`prominent being the high level data link control (HDLC) protocol defined
`by the International Standard Organization (ISO).
`The key difference between BISYNC and HDLC protocols is that BISYNC
`is a half duplex, character-oriented transmission control structure while
`
`11
`
`11
`
`

`

`
`
`HDLC is a bit-oriented, full duplex transmission control structure. We can
`investigate the efficiency of these basic transmission control structures and
`the effect of data compression upon their information transfer efficiency. To
`do so, an examination of some typical error control procedures is first
`required.
`
`Error control
`
`'
`
`I
`
`9
`
`The most commonly employed error-control procedure is known as auto-
`matic request for repeat (ARQ). In this type of control procedure, upon
`detection of an error a request is made by the receiving station to the sending
`station to retransmit the message. Two types of ARQ procedures have been
`developed: ‘stop and wait ARQ’ and ‘go back n ARQ’, which is sometimes
`called continuous ARQ.
`
`‘Stop and wait ARQ’ is a simple type of error-control procedure. Here
`the transmitting station stops at the end of each block and waits for a reply
`from the receiving terminal pertaining to the block’s accuracy (ACK) or
`error (NAK) prior to transmitting the next block. This type of error-control
`procedure is illustrated in Figure 1.5. Here the time between transmitted
`blocks is referred to as dead time which acts to reduce the effective data
`
`rate on the circuit. When the transmission mode is half duplex, the circuit
`must be turned around twice for each block transmitted, once to receive the
`reply (ACK or NAK) and once again to resume transmitting. These line
`turnarounds, as well as such factors as the propagation delay time, station
`message processing time and the modem internal delay time, all contribute
`to what is shown as the cumulative delay factors.
`When the ‘go back n ARQ’ type of error control procedure is employed,
`the dead time can be substantially reduced to the point where it may be
`
`I
`
`I
`
`: Dead time
`:
`__......—.—¢-—|-
`
`——|.-
`
`Transmit
`
`Block [+2
`
`Block [+1
`
`_ A
`
`CK
`1' + 1
`
`1 I I
`
`I
`
`IIIIIr
`
`II
`
`.—
`
`.1—
`
`ACK
`1'
`
`Curmlmiue
`deloy factors
`
`Receive
`
`Figure 1.5 Stop and wait ARQ. In this type of error control procedure, the receiver
`transmits an acknowledgement after each block. This can result in a significant
`amount of cumulative delay time between data blocks
`
`
`
`12
`
`12
`
`

`

`10
`
`Block
`[+3
`
`Block
`[+1
`
`Block
`[+2
`
`Block
`[+1
`
`Block
`[
`
`Primary channel —-
`
`AC_K
`1
`
`NAK
`(+1
`
`ACK
`[+2
`
`ACK
`[+1
`
`ACK
`[+3
`
`— Reverse channel
`
`Figure 1.6 Go back N ARQ. In a ‘go hack it ARQ’ error-
`control procedure,
`the transmitter continuously sends
`messages until the receiver detects an error. The receiver
`then transmits a negative acknowledgement on the reverse
`channel and the transmitter retransmits the block received
`in error. Some versions of this technique require blocks
`sent before the error indication was encountered to be
`retransmitted in addition to the block received in error
`
`insignificant. One way to implement this type of error control procedure is
`by the utilization of a simultaneous reverse channel for acknowledgement
`signalling as illustrated in Figure 1.6. In this type of operating mode, the
`receiving station sends back the ACK or NAK reSponse on the reverse
`channel for each transmitted block. If the primary channel operates at a
`much higher data rate than the reverse channel, many blocks may have been
`received prior to the transmitting station receiving the NAK in response to
`a block at the receiving station being found in error. The number of blocks
`one may go back to request a transmission, n, is a function of the block size
`and buffer area available in the business machines and terminals at
`the
`transmitting and receiving stations, the ratio of the data transfer rates of the
`primary and reverse channels and the processing time required to compute
`the block check character and transmit an acknowledgement. For the latter,
`this time is shown as small gaps between the ACK and NAK blocks in Figure
`1.6.
`
`Half duplex throughout mode]
`
`When a message block is transmitted in the BISYNC control structure, a
`number of control characters are contained in that block in addition to the
`message text. If the variable C is assigned to represent the number of control
`characters per block and the variable D is used to represent the number of
`data characters, then the total block length is C + D. If the data transfer
`rate expressed in bps is denoted as TR and the number of bits per character
`is denoted as BC, then the transmission time for one character is equal to
`BCITR which can be denoted as Tc. Since D + C characters are contained
`in a message block,
`the time required to transmit the block will become
`TC*(D + C). Once the block is received, it must be acknowledged. To do
`
`13
`
`13
`
`

`

`11
`
`so, the receiving station is required to first compute a block check character
`(BCC) and compare it with the transmitted BCC character appended to the
`end of the transmitted block. Although the BCC character is computed as
`the data is received, a comparison is performed after the entire block is
`received and only then can an acknowledgement be transmitted. The time
`to check the transmitted and computed BCC characters and form and trans-
`mit the acknowledgement is known as the processing and acknowledgement
`time (TPA)'
`When transmission is half duplex, the line turnaround time (TL) required
`to reverse the transmission direction of the line must be added. Normally,
`this time includes the request-to-send/clear-to-send (RTS/CTS) modem delay
`time as well as each of the modems’ internal delay time. For the acknowl-
`edgement to reach its destination, it must propagate down the circuit and
`this propagation delay time, denoted as Tp, must also be considered. If the
`acknowledgement message contains A characters then, when transmitted on
`the primary channel, A*BC/TR seconds are required to send the acknowl-
`edgement.
`Once the original transmitting station receives the acknowledgement it
`must determine if it is required to retransmit the previously sent message
`block. This time is similar to the processing and acknowledgement time
`previously discussed. To transmit either a new message block or repeat the
`previously sent mesasge block, the line must be turned around again and
`the message block will require time to propagate down the line to the
`receiving station. Thus,
`the total
`time to transmit a message block and
`receive an acknowledgement, denoted as TB, becomes:
`
`Since efficiency is the data-transfer rate divided by the theoretical data-
`transfer rate, the transmission control structure efficiency (ETCS) becomes:
`
`_ BC*D*(1—P)
`Eras —
`T33 TB
`
`(1.2)
`
`Here P is the probability that one or more bits in the block are in error,
`causing a retransmission to occur.
`Although the preceding is a measurement of the transmission control
`structure efficiency,
`it does not consider the data code efficiency which is
`the ratio of information bits to total bits per character. When the data code
`efficiency is included, we obtain a measurement of the information transfer
`efficiency. We can call this ratio the information transfer ratio (ITR) which
`Will provide us with a measurement of the protocol’s information transfer
`efficiency. This results in:
`
`s:
`.
`ITR =M
`Be
`
`14
`
`(1.3)
`
`
`
`
`
`14
`
`

`

` 12
`
`where:
`
`ITR = Information transfer ratio
`
`BIC = Information bits per character
`BC
`= Total bits per character
`D
`= Data characters per message block
`A
`= Characters in the acknowledgement message
`C
`= Control characters per message block
`TR
`2 Data transfer rate (bps)
`TC
`= Transmission time per character (BC/TR)
`TPA = Processing and acknowledgment time
`TL
`= Line turnaround time
`Tp
`= Propagation delay time
`p
`= Probability of one or more errors in block.
`
`the information transfer ratio provides us with a
`From the preceding,
`measurement of the efficiency of the transmission control structure without
`considering the effect of compression. When compression is considered we
`obtain a new term which we will denote as the effective information transfer
`ratio (EITR).
`When data is compressed, the original data stream will be reduced in
`size prior to transmission, the actual reduction being dependent upon the
`compression algorithms employed as well as the composition of the data
`acted upon. In general, we can assume that the compression ratio considers
`the number of characters in the compressed data stream to include special
`control characters required to indicate one or more compression algorithms.
`This reasonable assumption simplifies the effect of considering data com-
`pression when examining a particular protocol. As an example, consider a
`160-character data block compressed into 78 data characters plus 2 com—
`pression indicator characters. Here the compression ratio would be 160/
`(78 + 2) or 2. The effect upon the previously developed equation to compute
`the information transfer ratio would be to change D in the numerator to the
`non-compressed string length of 160 characters while D in the denominator
`would be the actual 78 compressed data characters plus the two additional
`special characters required to indicate data compression, resulting in a total
`of 80 characters. If the total number of control characters framing the data
`block is relatively small,
`the effective information transfer ratio can be
`approximated by multiplying the information transfer ratio by the com—
`pression ratio.
`
`Computation examples
`
`We will assume that our data transmission rate is 4800 bps and we will
`transmit information using a BISYNC transmission control structure employ-
`ing a ‘stop and wait ARQ’ error control procedure. Furthermore, let us
`assume the following parameters:
`
`15
`
`15
`
`

`

`
`
`13
`
`= 4 characters per acknowledgement
`A
`Brc = 8 bits per character
`.
`BC
`= 8 bits per character
`D
`= 80 data characters per block
`C
`z 10 control characters per block
`TR
`3 4800 bps
`TC
`= 8/4800 = 0.00166 seconds (5) per character
`TPA = 20 milliseconds = 0.02 s
`TL
`= 100 milliseconds = 0.10 s
`Tp
`= 30 milliseconds = 0.03 s
`P
`= 0.01
`
`Then:
`
`ITR =
`
`8 80 (1 0'01)
`4800*[0.00166(80+ 10) +2*(0.02+ 0.03+0.1}+4*8f4800)]
`
`= 0.2861.
`
`Since the transfer rate of information in bits (TRIB) is equal to the product
`of the data transfer rate and the information transfer ratio, we obtain:
`
`TRIB = ITR*TR = 0.2861‘4800 = 1373 bps.
`
`For the preceding example, approximately 28 per cent of the data transfer
`rate is effectively used.
`Let us now examine the effect of doubling the text size to 160 characters
`while the remaining parameters except P continue as before. Since the block
`size has doubled, P approximately doubles, resulting in the ITR becoming:
`
`8*160*(1-0.02)
`ITR _
`‘ 4800*[0.00166(160+10)+2*(0.02+0.03+0.1)+(4*8/4800)]
`= 0.4339.
`
`With an ITR of 0.4339 the TRIB now becomes:
`
`TRIB = ITR*TR = 0.4339*4800 = 2083 bps.
`
`Here, doubling the block size raises the percentage of the data transfer
`rate effectively used to 43.39 per cent.
`
`Compression effect
`
`Suppose one or more data-compression algorithms are employed which result
`in a compression ratio of 2. What effect would this have upon the effective
`information transfer ratio?
`The effective information transfer ratio (EITR) can be obtained by mod—
`ifying equation (1.1) as follows:
`
`16
`
`16
`
`

`

`
`
`Brciiflffl—P)
`EITR =m 1.4
`TR*[TC*(DE+C)+2*(TM+TL+T.)+(A*BCITRJ]
`
`(
`
`)
`
`where
`
`D1 = original data block size in characters prior to compression
`D2 = compressed data block size in characters to include special com-
`pression indication characters.
`
`If on the average 160 data characters are transmitted in a compressed
`format of 80 characters we obtain:
`
`
`anew (1—0.01)
`EITR _
`‘ 48m*[0.00166(80+10)+2*(0.02+0.03+0.1}+(4*8!48{}0)]
`= 0.5788.
`
`As previously discussed, the effective information transfer ratio can be
`approximated as follows:
`
`EITR = ITR*CR.
`
`(1.5)
`
`Substituting, we obtain:
`
`EITR 2 0.2861'2 2 0.5722.
`
`Since the transfer rate of information in bits (TRIB) is the product of the
`effective information transfer ratio and the operating data rate, we obtain:
`
`TRIB = 0.5788*4800 = 2778 bps.
`
`In Table 1.1, the reader will find a comparison of the variations in the
`ITR, EITR and TRIB when non-compressed and compressed data are
`transmitted for two different block sizes.
`
`is apparent that two methods can be employed to
`it
`From Table 1.1,
`increase one’s transmission efficiency. First, one may alter the protocol or

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket