McGRAW-HILL BOOK COMPANY
New York   St. Louis   San Francisco   Düsseldorf   Johannesburg   Kuala Lumpur
London   Mexico   Montreal   New Delhi   Panama   Paris   São Paulo
Singapore   Sydney   Tokyo   Toronto
A. BRUCE CARLSON
Associate Professor of Systems Engineering
Rensselaer Polytechnic Institute
Communication Systems
AN INTRODUCTION TO SIGNALS AND NOISE IN ELECTRICAL COMMUNICATION
SECOND EDITION
This book was set in Times New Roman. The editors were Kenneth J. Bowman and Michael Gardner; the cover was designed by Pencils Portfolio, Inc.; the production supervisor was Leroy A. Young. The drawings were done by J & R Services, Inc. Kingsport Press, Inc., was printer and binder.
Library of Congress Cataloging in Publication Data

Carlson, A. Bruce, date
    Communication systems.
    (McGraw-Hill electrical and electronic engineering series)
    Bibliography: p.
    1. Signal theory (Telecommunication)   2. Telecommunication.   I. Title.
TK5102.5.C3   1975   621.38'043   74-9841
ISBN 0-07-009957-X
COMMUNICATION SYSTEMS
An Introduction to Signals and Noise in Electrical Communication

Copyright © 1968, 1975 by McGraw-Hill, Inc. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

890KPKP 832
To the memory of my father,
ALBIN JOHN CARLSON
9
INFORMATION THEORY AND COMMUNICATION SYSTEMS
The past several chapters have dealt with electrical communication primarily in terms of signals, both desired and undesired. We have devised signal models, examined the effects of networks on signals, and analyzed modulation as a means of signal transmission. Although many rewards and much insight have been gained by this approach, signal theory alone is not sufficient for a complete understanding of our subject matter, particularly when it comes to the design of new and improved systems. What is needed is a more encompassing view of the communication process, a broader perspective leading to basic principles for system design and comparison; in short, a general theory of communication.

Prior to the 1940s a few steps were taken toward such a theory in the telegraphy investigations of Nyquist and Hartley. But then, shortly after World War II, Claude Shannon (1948) and Norbert Wiener (1949) set forth new concepts that had, and continue to have, major impact. Taken together, the ideas of Wiener and Shannon established the foundation of modern (statistical) communication theory. Both men were concerned with extracting information from a background of noise, and both applied statistical concepts to the problem.† There were, however, differences in emphasis.

† Both men are also famous for other accomplishments. Wiener founded the subject known as cybernetics, and Shannon first pointed out the relationship of Boolean algebra to switching-circuit design, in his master's thesis!
Wiener treated the case where the information-bearing signals are beyond the designer's control, in whole or in part, all the processing being at the receiving end. The problem then can be stated in this fashion: Given the set of possible signals, not of our choosing, plus the inevitable noise, how do we make the best estimate of the present and future values of the signal being received? Optimum solutions to this and similar problems are sought in the discipline known as detection theory.

Shannon's work is more nearly akin to what we think of as communication, where signal processing can take place at both transmitter and receiver. Shannon posed this problem: Given the set of possible messages a source may produce, not of our choosing, how shall the messages be represented so as best to convey the information over a given system with its inherent physical limitations? To handle this problem in quite general terms it is necessary to concentrate more on the information per se than on the signals, and Shannon's approach was soon rechristened information theory.
Information theory is a mathematical subject dealing with three basic concepts: the measure of information, the capacity of a communication channel to transfer information, and coding as a means of utilizing channels at full capacity. These concepts are tied together in what can be called the fundamental theorem of information theory, as follows.

Given an information source and a communication channel, there exists a coding technique such that the information can be transmitted over the channel at any rate less than the channel capacity and with arbitrarily small frequency of errors despite the presence of noise.
The surprising, almost astonishing aspect of this theorem is error-free transmission on a noisy channel, a condition achieved through the use of coding. In essence, coding is used to match the source and channel for maximum reliable information transfer, roughly analogous to impedance matching for maximum power transfer.

But the study of coding is, by and large, tangential to our immediate aims. Thus, with some reluctance, we limit this chapter primarily to the concepts of information measure and channel capacity, with emphasis on the latter. By so doing we shall eventually arrive at answers to these significant questions:

1  Precisely how do the fundamental physical limitations (i.e., bandwidth and noise) restrict information transmission?
2  Is there such a thing as an ideal communication system, and, if so, what are its characteristics?
3  How well do existing communication systems measure up to the ideal, and how can their performance be improved?

Answers to these questions are certainly germane to electrical communication.
They will be explored in some detail at the close of the chapter. But we must begin with information theory.
9.1  INFORMATION MEASURE: ENTROPY

The crux of information theory is the measure of information. Here we are using information as a technical term, not to be confused with its more conventional interpretations. In particular, the information of information theory has little to do with knowledge or meaning, concepts which defy precise definition, to say nothing of quantitative measurement. In the context of communication, information is simply that which is produced by the source for transfer to the user. This implies that before transmission, the information was not available at the destination; otherwise the transfer would be zero. Pursuing this line of reasoning, consider the following somewhat contrived situation.

A man is planning a trip to Chicago. To determine what clothes he should pack, he telephones the Chicago weather bureau and receives one of the following forecasts:
The sun will rise.
It will rain.
There will be a tornado.
Clearly, the amount of information gained from these messages is quite different. The first contains virtually no information, since we are reasonably sure in advance that the sun will rise; there is no uncertainty about this, and the call has been wasted. But the forecast of rain does provide information not previously available to the traveler, for rain is not an everyday occurrence. The third forecast contains even more information, tornadoes being relatively rare and unexpected events.

Note that the messages have been listed in order of decreasing likelihood and increasing information. The less likely the message, the more information it conveys to the user. We are thus inclined to say that information measure is related to uncertainty, the uncertainty of the user as to what the message will be. Moreover, the amount of information depends only on the message uncertainty, rather than its actual content or possible interpretations. Had the Chicago weather forecast been "The sun will rain tornadoes," it would convey information, being quite unlikely, but not much meaning.
Alternately, going to the transmitting end of a communication system, information measure is an indication of the freedom of choice exercised by the source in selecting a message. If the source can freely choose from many different messages, the user is highly uncertain as to which message will be selected. But if there is no choice at all, only one possible message, there is no uncertainty and hence no information.

Whether one prefers the uncertainty viewpoint or the freedom-of-choice interpretation, it is evident that the measure of information involves probabilities. Messages of high probability, indicating little uncertainty on the part of the user or little choice on the part of the source, convey a small amount of information, and vice versa. This notion is formalized by defining self-information in terms of probability.
Self-Information
Consider a source that produces various messages. Let one of the messages be designated A, and let $P_A$ be the probability that A is selected for transmission. Consistent with our discussion above, we write the self-information associated with A as
$$I_A = f(P_A)$$
where the function $f(\cdot)$ is to be determined. As a step toward finding $f(\cdot)$, intuitive reasoning suggests that the following requirements be imposed:
$$f(P_A) \geq 0 \qquad \text{where } 0 \leq P_A \leq 1 \qquad (1)$$
$$\lim_{P_A \to 1} f(P_A) = 0 \qquad (2)$$
$$f(P_A) > f(P_B) \qquad \text{for } P_A < P_B \qquad (3)$$
The student should have little trouble interpreting these requirements.
Many functions satisfy Eqs. (1) to (3). The final and deciding factor comes from considering the transmission of independent messages. When message A is delivered, the user receives $I_A$ units of information. If a second message B is also delivered, the total information received should be the sum of the self-informations, $I_A + I_B$. This summation rule is readily appreciated if we think of A and B as coming from different sources. But suppose both messages come from the same source; we can then speak of the compound message C = AB. If A and B are statistically independent, $P_C = P_A P_B$ and $I_C = f(P_A P_B)$. But the received information is still $I_C = I_A + I_B = f(P_A) + f(P_B)$, and therefore
$$f(P_A P_B) = f(P_A) + f(P_B) \qquad (4)$$
which is our final requirement for $f(\cdot)$.
There is one and only one function† satisfying the conditions (1) to (4), namely, the logarithmic function $f(\cdot) = -\log_b(\cdot)$, where b is the logarithmic base. Thus self-information is defined as
$$I_A \triangleq -\log_b P_A = \log_b \frac{1}{P_A} \qquad (5)$$

† See Ash (1965, chap. 1) for proof.
where b is unspecified for the moment. The minus sign in $-\log_b P_A$ is perhaps disturbing at first glance. But, since probabilities are bounded by $0 \leq P_A \leq 1$, the negative of the logarithm is positive, as desired. The alternate form $\log_b(1/P_A)$ helps avoid confusion on this score, and will be used throughout.

Specifying the logarithmic base b is equivalent to selecting the unit of information. While common or natural logarithms (b = 10 or b = e) seem obvious candidates, the standard convention of information theory is to take b = 2. The corresponding unit of information is termed the bit, a contraction for binary digit suggested by J. W. Tukey. Thus
$$I_A = \log_2 \frac{1}{P_A} \qquad \text{bits}$$
The reasoning behind this rather strange convention goes like this. Information is a measure of choice exercised by the source; the simplest possible choice is that between two equiprobable messages, i.e., an unbiased binary choice. The information unit is therefore normalized to this lowest-order situation, and 1 bit of information is the amount required or conveyed by the choice between two equally likely possibilities, i.e., if $P_A = P_B = 1/2$, then $I_A = I_B = \log_2 2 = 1$ bit.

Binary digits enter the picture simply because any two things can be represented by the two binary digits 0 and 1. Note, however, that one binary digit may convey more or less than 1 bit of information, depending on the probabilities. To prevent misinterpretation, binary digits as message elements are called binits in this chapter.

Since tables of base 2 logarithms are relatively uncommon, the following conversion relationship is needed:
$$\log_2 v = \log_2 10 \, \log_{10} v \approx 3.32 \log_{10} v \qquad (6)$$
Thus, if $P_A = 1/10$, $I_A = 3.32 \log_{10} 10 = 3.32$ bits. In the remainder of this chapter, all logarithms will be base 2 unless otherwise indicated.
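As a quick numerical illustration of Eqs. (5) and (6), the short Python sketch below computes self-information in bits; the probability values are arbitrary examples, not taken from the text.

    import math

    def self_information(p):
        """Self-information I = log2(1/p) in bits, Eq. (5) with b = 2."""
        if not 0.0 < p <= 1.0:
            raise ValueError("probability must lie in (0, 1]")
        return math.log2(1.0 / p)

    # A certain event carries no information; rarer events carry more.
    for p in (1.0, 0.5, 0.1):
        print(f"P = {p:.2f}  I = {self_information(p):.2f} bits")

    # Eq. (6): log2(v) = log2(10) * log10(v), roughly 3.32 * log10(v).
    print(3.32 * math.log10(10))   # ~3.32 bits, matching the P = 0.1 case above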
Example 9.1  The Information in a Picture

It has often been said that one picture is worth a thousand words. With a little stretching, information measure supports this old saying.

For analysis we decompose the picture into a number of discrete dots, or elements, each element having a brightness level ranging in steps from black to white. The standard television image, for instance, has about $500 \times 600 = 3 \times 10^5$ elements and eight easily distinguishable levels. Hence, there are $8 \times 8 \times \cdots = 8^{3 \times 10^5}$ possible pictures, each with probability $P = 8^{-(3 \times 10^5)}$ if selected at random. Therefore
$$I = \log 8^{3 \times 10^5} = 3 \times 10^5 \log 8 \approx 10^6 \text{ bits}$$
Alternately, assuming the levels to be equally likely, the information per element is log 8 = 3 bits, for a total of $3 \times 10^5 \times 3 \approx 10^6$ bits, as before.

But what about the thousand words? Suppose, for the sake of argument, that a vocabulary consists of 100,000 equally likely words. The probability of any one word is then $P = 10^{-5}$, so the information contained in 1,000 words is
$$I = 1{,}000 \log 10^5 = 10^3 \times 3.32 \log_{10} 10^5 \approx 2 \times 10^4 \text{ bits}$$
or substantially less than the information in one picture.

The validity of the above assumptions is of course open to question; the point of this example is the method, not the results.  ////
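The arithmetic of Example 9.1 is easy to check; the sketch below simply reproduces the picture and thousand-word figures under the same assumptions (3 × 10^5 elements, 8 levels, a 100,000-word vocabulary).

    import math

    elements = 3e5          # picture elements (about 500 x 600)
    levels = 8              # distinguishable brightness levels
    picture_bits = elements * math.log2(levels)       # 9e5, roughly 1e6 bits

    words = 1000
    vocabulary = 1e5        # equally likely words, P = 1e-5 each
    word_bits = words * math.log2(vocabulary)         # about 1.7e4, roughly 2e4 bits

    print(f"picture: {picture_bits:.3g} bits, 1,000 words: {word_bits:.3g} bits")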
`
Entropy and Information Rate

Self-information is defined in terms of the individual messages or symbols a source may produce. It is not, however, a useful description of the source relative to communication. A communication system is not designed around a particular message but rather all possible messages, i.e., what the source could produce as distinguished from what it does produce on a given occasion. Thus, although the instantaneous information flow from a source may be erratic, one must describe the source in terms of the average information produced. This average information is called the source entropy.

For a discrete source whose symbols are statistically independent, the entropy expression is easily formulated. Let m be the number of different symbols, i.e., an alphabet of size m. When the jth symbol is transmitted, it conveys $I_j = \log(1/P_j)$ bits of information. In a long message of $N \gg 1$ symbols, the jth symbol occurs about $N P_j$ times, and the total information in the message is approximately
$$N P_1 I_1 + N P_2 I_2 + \cdots + N P_m I_m = \sum_{j=1}^{m} N P_j I_j \qquad \text{bits}$$
which, when divided by N, yields the average information per symbol. We therefore define the entropy of a discrete source as
$$H \triangleq \sum_{j=1}^{m} P_j I_j = \sum_{j=1}^{m} P_j \log \frac{1}{P_j} \qquad \text{bits/symbol} \qquad (7)$$
It should be observed that Eq. (7) is an ensemble average. If the source is nonstationary, the symbol probabilities may change with time and the entropy is not very meaningful. We shall henceforth assume that information sources are ergodic, so that time and ensemble averages are identical.

The name entropy and its symbol H are borrowed from a similar equation in statistical mechanics.
Because of the mathematical similarity, various attempts have been made to relate communication entropy with thermodynamic entropy.† However, the attempted relationships seem to cause more confusion than illumination, and it is perhaps wiser to treat the two entropies as different things with the same name. For this reason the alternate designation comentropy has been suggested for communication entropy.

† See Brillouin (1956).

But what is the meaning of communication entropy as written in Eq. (7)? Simply this: although one cannot say which symbol the source will produce next, on the average we expect to get H bits of information per symbol, or NH bits in a message of N symbols, if N is large.

For a fixed alphabet size (fixed m) the entropy of a discrete source depends on the symbol probabilities but is bounded by
$$0 \leq H \leq \log m \qquad (8)$$
These extreme limits are readily interpreted and warrant further discussion. The lower limit, H = 0, implies that the source delivers no information (on the average), and hence there is no uncertainty about the message. We would expect this to correspond to a source that continually produces the same symbol; i.e., all symbol probabilities are zero, save for one symbol having P = 1. It is easily shown that H = 0 in this case. At the other extreme, the maximum entropy must correspond to maximum uncertainty or maximum freedom of choice. This implies that all symbols are equally likely; there is no bias, no preferred symbol. A little further thought reveals that $H = H_{\max} = \log m$ when all $P_j = 1/m$, which has particular significance for our later work.
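A small numerical check of Eqs. (7) and (8) may be helpful; the probability vectors below are arbitrary illustrations for an alphabet of m = 4, not values from the text.

    import math

    def entropy(probs):
        """Source entropy H = sum of P_j log2(1/P_j) in bits/symbol, Eq. (7)."""
        return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

    m = 4
    print(entropy([1.0, 0.0, 0.0, 0.0]))        # 0.0: one certain symbol, no information
    print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits/symbol, an intermediate case
    print(entropy([0.25] * m), math.log2(m))    # 2.0 = log2(m), the upper bound of Eq. (8)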
The variation of H between the limits of Eq. (8) is best illustrated by considering a binary source (m = 2). The symbol probabilities are then related and can be written as P and 1 - P. Thus
$$H = P \log \frac{1}{P} + (1 - P) \log \frac{1}{1 - P} \qquad (9)$$
which is plotted versus P in Fig. 9.1. Note the rather broad maximum centered at P = 0.5, the equally likely case, where H = log 2 = 1 bit.

[FIGURE 9.1  Entropy of a binary source versus the probability of one symbol.]
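The curve of Fig. 9.1 follows directly from Eq. (9); the sketch below evaluates the binary entropy at a few probabilities and shows the broad maximum of 1 bit at P = 0.5.

    import math

    def binary_entropy(p):
        """H(P) of Eq. (9) for a binary source; H(0) = H(1) = 0 by convention."""
        if p in (0.0, 1.0):
            return 0.0
        return p * math.log2(1.0 / p) + (1.0 - p) * math.log2(1.0 / (1.0 - p))

    for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
        print(f"P = {p:3.1f}  H = {binary_entropy(p):.3f} bits")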
Bringing the time element into the picture, suppose two sources have equal entropies but one is "faster" than the other, producing more symbols per unit time. In a given period, more information must be transferred from the faster source than from the slower, which obviously places greater demands on the communication system. Thus, for our purposes, the description of a source is not its entropy alone but its entropy rate, or information rate, in bits per second. The entropy rate of a discrete source is simply defined as
$$R \triangleq \frac{H}{\bar{\tau}} \qquad \text{bits/s} \qquad (10)$$
where $\bar{\tau}$ is the average symbol duration, namely,
$$\bar{\tau} = \sum_{j=1}^{m} P_j \tau_j \qquad (11)$$
so $1/\bar{\tau}$ equals the average number of symbols per unit time.
EXERCISE 9.1  Calculate the entropy rate of a telegraph source having $P_{\text{dot}} = 2/3$, $\tau_{\text{dot}} = 0.2$ s, $P_{\text{dash}} = 1/3$, $\tau_{\text{dash}} = 0.4$ s.  Ans.: R = 3.44 bits/s.
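As a check on Exercise 9.1, the following sketch evaluates Eqs. (10) and (11) for the stated dot and dash statistics.

    import math

    probs = {"dot": 2 / 3, "dash": 1 / 3}
    durations = {"dot": 0.2, "dash": 0.4}        # seconds per symbol

    H = sum(p * math.log2(1 / p) for p in probs.values())      # bits/symbol
    tau_bar = sum(probs[s] * durations[s] for s in probs)       # average duration, Eq. (11)
    R = H / tau_bar                                             # entropy rate, Eq. (10)
    print(f"H = {H:.3f} bits/symbol, tau_bar = {tau_bar:.3f} s, R = {R:.2f} bits/s")  # R ~ 3.44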
Example 9.2  Entropy and PCM

As an example of entropy applied to our earlier studies, consider a PCM system whose input is the continuous signal x(t) bandlimited in W = 50 Hz. Suppose x(t) is sampled at the Nyquist rate $f_s = 2W$, and let there be four quantum levels such that the quantized values have probabilities 1/2, 1/4, 1/8, and 1/8. Identifying each possible quantized value as a "symbol," the output of the quantizer then looks like a discrete source with m = 4 and
$$H = \tfrac{1}{2} \log 2 + \tfrac{1}{4} \log 4 + \tfrac{1}{8} \log 8 + \tfrac{1}{8} \log 8 = 1.75 \text{ bits/symbol}$$
Since the symbol rate is $1/\bar{\tau} = f_s = 100$,
$$R = 100 \times 1.75 = 175 \qquad \text{bits/s}$$
Thus, it should be possible to represent this same information by equiprobable binary digits (binits) generated at a rate of 175 binits/s.
To check this idea, suppose that the system is in fact binary PCM with the quantized samples transmitted as coded binary pulses. Labeling the quantum levels by the code numbers 0, 1, 2, and 3, let us assume the following coding:

    Code number    Probability    Binary code
         0             1/2            00
         1             1/4            01
         2             1/8            10
         3             1/8            11
Since there are two binary digits for each sample, the PCM signal has 200 pulses per second. Now we argued that 175 binary digits per second should be sufficient; yet 200 per second is required with this code.

The anomaly is quickly resolved by noting that the binary digits are not equally likely; in fact, $P_0 = 11/16$ and $P_1 = 5/16$, as the reader can verify. Clearly, the suggested code is not optimum. On the other hand, it is simple and reasonably efficient.

Pressing onward, the binit rate and probabilities given above suggest that the information rate at the encoder output is
$$R = 200 \left( \tfrac{11}{16} \log \tfrac{16}{11} + \tfrac{5}{16} \log \tfrac{16}{5} \right) = 200 \times 0.897 = 179 \text{ bits/s}$$
Again something is wrong, for the information rate into the encoder is only 175 bits/s. Surely direct encoding does not add information!

To explain the discrepancy we must recall that the entropy equation (7) is based on statistically independent symbols. And, while it may be true that successive quantized samples are independent, the successive binary pulses are not, because we have encoded in groups of two. For the case of dependent symbols, we must modify the measure of information and introduce conditional entropy.  ////
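The binit probabilities quoted in Example 9.2 follow directly from the code table; the sketch below verifies $P_0 = 11/16$, $P_1 = 5/16$, and the apparent 179 bits/s figure.

    import math

    code = {0: "00", 1: "01", 2: "10", 3: "11"}
    prob = {0: 1 / 2, 1: 1 / 4, 2: 1 / 8, 3: 1 / 8}

    # Expected fraction of 0s and 1s among the transmitted binits.
    binits_per_sample = 2
    p0 = sum(prob[k] * code[k].count("0") for k in code) / binits_per_sample
    p1 = 1 - p0
    print(p0, p1)                     # 0.6875 (= 11/16) and 0.3125 (= 5/16)

    # Treating the binits as if they were independent overstates the rate:
    binit_rate = 200                  # binits per second
    H_binit = p0 * math.log2(1 / p0) + p1 * math.log2(1 / p1)
    print(binit_rate * H_binit)       # ~179 bits/s, versus the true 175 bits/s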
Conditional Entropy and Redundancy

Discrete sources are often constrained by certain rules which limit the choice in selecting successive symbols. The resulting intersymbol influence reduces uncertainty and thereby reduces the amount of information produced. We account for this effect by using conditional probabilities and conditional entropy.

Written text, being governed by rules of spelling and grammar, is a good example of intersymbol influence. On a relative-frequency basis, the probability of U in printed English is $P_U = 0.02$; but if the previous letter is Q, the conditional probability of U given Q is $P(U \mid Q) \approx 1$, whereas in contrast $P(U \mid W) < 0.001$. The influence may well extend over several symbols, phrases, or even complete sentences, as illustrated by the fact that in most textbooks, including this one, $P(\text{THAT} \mid \text{IT CAN BE SHOWN}) \approx 1$.
The expression for conditional entropy therefore is formulated by considering the entire past history of the source, more precisely, all possible past histories. Thus, if j represents the next symbol (or group of symbols) and i represents the preceding sequence, the information conveyed by j given i is $\log [1/P(j \mid i)]$. Averaging over all j's and i's gives the conditional entropy
$$H_c = \sum_i \sum_j P_i \, P(j \mid i) \log \frac{1}{P(j \mid i)} \qquad (12)$$
In general $H_c \leq H$; the equality applies only when the symbols are independent and $P(j \mid i) = P_j$.
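A minimal sketch of Eq. (12), using a hypothetical two-symbol source with made-up transition probabilities (the numbers are illustrative only, not from the text), shows that intersymbol influence lowers the entropy, i.e., $H_c \leq H$.

    import math

    # Hypothetical first-order source: transition probabilities P(j | i) for symbols a, b.
    P_trans = {"a": {"a": 0.9, "b": 0.1},
               "b": {"a": 0.3, "b": 0.7}}
    # Corresponding stationary symbol probabilities P_i for this chain.
    P = {"a": 0.75, "b": 0.25}

    H = sum(P[i] * math.log2(1 / P[i]) for i in P)                 # ordinary entropy, Eq. (7)
    Hc = sum(P[i] * P_trans[i][j] * math.log2(1 / P_trans[i][j])   # conditional entropy, Eq. (12)
             for i in P for j in P_trans[i])
    print(f"H = {H:.3f} bits, Hc = {Hc:.3f} bits (Hc <= H)")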
A source producing dependent symbols is said to be redundant, meaning that symbols are generated which are not absolutely essential to convey the information. (Is it really necessary to explicitly write the U following every Q in English?) The redundancy of English text is estimated to be roughly 50 percent. This implies that in the long run, half the symbols are unnecessary: yu shld b abl t read ths evn tho sevrl ltrs r msng. The reader may wish to ponder the observation that without redundancy, abbreviation would be impossible, and any two-dimensional array of letters would form a valid crossword puzzle.

For very long passages of printed English, the conditional entropy may be as low as 0.5 to 1.0 bits/symbol because of contextual inferences. Thus, with suitable coding, printed English theoretically could be transmitted in binary form with an average of one binary digit per symbol. Contrast this with existing teletype systems that use five binary digits per character.

From the viewpoint of efficient communication, redundancy in a message is undesirable; the same information could be sent with fewer nonredundant (independent) symbols. Thus, coding to reduce intersymbol influence is a method of improving efficiency. On the other hand, redundancy is a definite aid in resolving ambiguities if the message is received with errors, a not uncommon phenomenon in telegraphy, for example. Indeed, coding for error protection is based on the insertion of redundant symbols.

Optimum transmission therefore entails coding to reduce the inefficient redundancy of the message, plus coding to add "efficient" redundancy for error control. Much has been done in the area of error-detecting and error-correcting codes, but coding to reduce message redundancy is far more difficult, and relatively little has been accomplished in this direction. (In retrospect, the bandwidth-reduction feature of differential PCM relies on the redundancy of the input signal.)
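To make the last point concrete, here is a minimal sketch of deliberately added redundancy for error control: a single even-parity binit appended to each block. The block length and bit patterns are arbitrary illustrations, not a scheme described in the text.

    def add_even_parity(bits: str) -> str:
        """Append one redundant binit so the total number of 1s is even."""
        return bits + ("1" if bits.count("1") % 2 else "0")

    def parity_ok(block: str) -> bool:
        """A single binit error in the block makes the parity check fail."""
        return block.count("1") % 2 == 0

    block = add_even_parity("1011")                 # -> "10111"
    corrupted = "00111"                             # first binit flipped in transit
    print(parity_ok(block), parity_ok(corrupted))   # True False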
Continuous Information Sources

Having discussed discrete sources at some length, the next logical step would be the definition of entropy for continuous sources, sources whose messages are continuously varying functions of time. Such a definition is possible but will not be presented here. For one reason, the mathematics gets rather complicated and tends to obscure physical interpretation; for another, the entropy of a continuous source turns out to be a relative measure instead of an absolute measure of information.

Fortunately, the goals of this chapter can be achieved by sticking to the discrete formulation, and most of our conclusions will apply to continuous sources with minor modification. But more important, we shall find that because of the fundamental physical limitations, communication is inherently a discrete process regardless of the source. This striking conclusion is one of Shannon's principal contributions to the theory of communication, but it was noted by Hartley as far back as 1928.
9.2  CHANNEL CAPACITY AND DISCRETE CHANNELS

We have seen that it is often convenient to treat the terminal equipment of a communication system as being perfect (noise-free, distortionless, etc.) and think of all undesired effects as taking place in the channel. The communication channel is therefore an abstraction, a model representing the vehicle of transmission plus all phenomena that tend to restrict transmission. The fact that there are fundamental physical limitations to information transfer by electrical means leads to the notion of channel capacity.

Just as entropy rate measures the amount of information produced by a source in a given time, capacity is a measure of the amount of information a channel can transfer per unit time. Channel capacity is symbolized by C, and its units are bits per second. Restating the fundamental theorem in terms of R and C, we have:

Given a channel of capacity C and a source having entropy rate R, then if $R \leq C$, there exists a coding technique such that the output of the source can be transmitted over the channel with an arbitrarily small frequency of errors, despite the presence of noise. If $R > C$, it is not possible to transmit without errors.
Although we shall attempt to make the theorem plausible, complete proof involves a great deal of coding theory and is omitted here. Instead we shall concentrate on aspects more pertinent to electrical communication.

Channel Capacity

The fundamental theorem implicitly defines channel capacity as the maximum rate at which the channel supplies reliable information to the destination. With this interpretation in mind we can formulate a general expression for capacity by means of the following argument.
Consider all the different messages of length T a source might produce. If the channel is noisy, it will be difficult to decide at the receiver which particular message was intended, and the goal of information transfer is partially defeated. But suppose we restrict the messages to only those that are "very different" from each other, such that the received message can be correctly identified with sufficiently small probability of error. Let M(T) be the number of these very different messages of length T.

Now, insofar as the destination or user is concerned, the source-plus-channel combination may be regarded as a new source generating messages at the receiving end. With the above message restriction, this equivalent source is discrete and has an alphabet of size M(T). Correspondingly, the maximum entropy produced by the equivalent source is log M(T), and the maximum entropy rate at the destination is (1/T) log M(T). Hence, letting $T \to \infty$ to ensure generality,
$$C = \lim_{T \to \infty} \frac{1}{T} \log M(T) \qquad \text{bits/s} \qquad (1)$$
an alternate definition for channel capacity. The following discussion of discrete channels shows that Eq. (1) is an intuitively meaningful definition.
Discrete Noiseless Channels

A discrete channel is one that transmits information by successively assuming various disjoint electrical states: voltage levels, instantaneous frequency, etc. Let $\mu$ be the number of possible states and r the signaling rate in states per unit time. If the signal-to-noise ratio is sufficiently large, the error probability at the receiver can be extremely small, so small that to all intents and purposes the channel is deemed to be noiseless. Under this assumption, any sequence of symbols will be correctly identified and the capacity calculation is straightforward.

A received message of length T will consist of rT symbols, each symbol being one of the $\mu$ possible states. The number of different messages is thus $M(T) = \mu^{rT}$ and hence
$$C = \lim_{T \to \infty} \frac{1}{T} \log \mu^{rT} = \lim_{T \to \infty} \frac{rT \log \mu}{T} = r \log \mu \qquad \text{bits/s} \qquad (2)$$
The capacity of a noiseless discrete channel is therefore proportional to the signaling rate and the logarithm of the number of states. For a binary channel ($\mu = 2$) the capacity is numerically equal to the signaling rate, that is, C = r.

According to Eq. (2), one can double channel capacity by doubling the signaling speed (which is certainly reasonable) or by squaring the number of states (somewhat more subtle to appreciate).
In regard to the latter, suppose two identical but independent channels are operated in parallel, so their combined capacity is clearly $2(r \log \mu) = 2r \log \mu = r \log \mu^2$. At the output we can say that we are receiving 2r symbols per second, each symbol being drawn from an alphabet of size $\mu$; or the output can be viewed as r compound symbols per second, each being drawn from an alphabet of size $\mu \times \mu = \mu^2$.
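Eq. (2) is easy to tabulate; the sketch below, with arbitrary example values of r and mu, confirms that doubling the signaling rate and squaring the number of states each double the capacity.

    import math

    def capacity(r, mu):
        """Noiseless discrete channel capacity C = r log2(mu) in bits/s, Eq. (2)."""
        return r * math.log2(mu)

    r, mu = 1000, 4                       # example: 1000 states/s, 4 states
    print(capacity(r, mu))                # 2000 bits/s
    print(capacity(2 * r, mu))            # 4000 bits/s (doubled signaling rate)
    print(capacity(r, mu ** 2))           # 4000 bits/s (squared number of states)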
The following example illustrates a coding technique that achieves R = C on a noiseless discrete channel.

Example 9.3

Consider the PCM source in Example 9.2. In theory a noiseless binary channel will suffice to convey the information if $r \geq 175$ binits/s. To minimize r we cannot use the previous code because it requires two binary digits per source symbol and r = 200. A more efficient code is as follows.

    Code number    Probability    Binary code
         0             1/2             0
         1             1/4             10
         2             1/8             110
         3             1/8             111
With this code, a message containing $N \gg 1$ source symbols requires the transmission of $N/2 + 2(N/4) + 3(N/8) + 3(N/8) = 1.75N$ channel symbols, that is, 1.75 binary digits per source symbol. The required signaling rate is then $r = 1.75/\bar{\tau} = 175$, and we have achieved transmission at R = C; the encoding has produced a perfect match between source and channel.

Although this example is admittedly a special case, there are two general hints about efficient encoding to be gained from it. First, the code is such that the channel symbols 0 and 1 are equally likely and statistically independent. (The skeptical reader should verify this.) Second, the source symbol with the highest probability is assigned the shortest code word, and so forth down the line to the least probable symbol, which gets the longest code. Systematic procedures for devising such codes are described in the literature.†
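A short sketch confirms the bookkeeping in Example 9.3: the average code-word length is 1.75 binits per source symbol, and the transmitted 0s and 1s come out equiprobable.

    code = {0: "0", 1: "10", 2: "110", 3: "111"}
    prob = {0: 1 / 2, 1: 1 / 4, 2: 1 / 8, 3: 1 / 8}

    # Average number of channel binits per source symbol.
    avg_len = sum(prob[k] * len(code[k]) for k in code)
    print(avg_len)                                     # 1.75

    # Fraction of 0s among the transmitted binits (1/2 indicates a good match).
    zeros = sum(prob[k] * code[k].count("0") for k in code)
    print(zeros / avg_len)                             # 0.5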
Somewhat parenthetically we might note that assigning shorter code words to the more probable symbols is just common sense. Over a century ago, long before Shannon, Samuel Morse constructed h
